ICCV 2023 Oral | How to conduct test-time training in the open world? A self-training method based on dynamic prototype expansion


Reprinted from: Heart of the Machine

This paper proposes a test-time training method for the open world for the first time.

Improving model generalization is an important foundation for deploying vision-based perception methods. Test-Time Training/Adaptation (TTT/TTA) generalizes a model to an unknown target-domain data distribution by adjusting the model's parameter weights during the test phase. Existing TTT/TTA methods usually focus on improving test-time training performance on target-domain data in the closed world.

However, in many application scenarios the target domain is easily contaminated by strong out-of-domain (strong OOD) data, such as data from irrelevant semantic categories. This scenario can also be called open-world test-time training (OWTTT). In this setting, existing TTT/TTA methods usually force strong OOD data into known categories, which ultimately interferes with the ability to distinguish weak out-of-domain (weak OOD) data such as noise-corrupted images.

Recently, a team from South China University of Technology and A*STAR proposed the open-world test-time training setting for the first time and introduced a method for it.


Reply "OWTTT" to the CVer WeChat public account to download the PDF and code of this paper.

  • Paper: https://arxiv.org/abs/2308.09942

  • Code: https://github.com/Yushu-Li/OWTTT

The paper first proposes an adaptive-threshold method for filtering strong OOD samples, which improves the robustness of self-training TTT in the open world. It then proposes characterizing strong OOD samples with dynamically expanded prototypes to improve the separation of weak and strong OOD data. Finally, self-training is constrained by distribution alignment.

The method achieves the best performance on 5 different OWTTT benchmarks and provides a new direction for subsequent TTT research toward more robust methods. The work has been accepted as an Oral paper at ICCV 2023.

Introduction

Test-Time Training (TTT) accesses target-domain data only during the inference phase and performs on-the-fly adaptation to distribution-shifted test data. TTT's success has been demonstrated on a number of artificially selected, synthetically corrupted target-domain datasets. However, the capability boundaries of existing TTT methods have not been fully explored.

To promote TTT applications in open scenarios, research focus has shifted to the scenarios where TTT methods may fail. Many efforts have been made to develop stable and robust TTT methods in more realistic open-world environments. In this work, we delve into a common but overlooked open-world scenario, where the target domain may contain test data drawn from significantly different distributions, such as semantic categories absent from the source domain, or simply random noise.

We call such test data strong out-of-distribution (strong OOD) data. What this work calls weak OOD data is test data with distribution shift only, such as common synthetic corruptions. The lack of existing work on this realistic setting motivates us to explore improving the robustness of open-world test-time training (OWTTT), where the test data is contaminated by strong OOD samples.

Figure 1: Evaluation results of existing TTT methods under the OWTTT setting

As shown in Figure 1, we first evaluate existing TTT methods under the OWTTT setting and find that methods based on both self-training and distribution alignment are affected by strong OOD samples. These results demonstrate that safe test-time training in the open world cannot be achieved by applying existing TTT techniques. We attribute their failure to the following two reasons.

  • It is difficult for TTT based on self-training to handle strong OOD samples because it must assign test samples to known categories. Although some low-confidence samples can be filtered out by applying the threshold adopted in semi-supervised learning, it is still not guaranteed to filter out all strong OOD samples.

  • Distribution-alignment-based methods are affected when strong OOD samples are included in the estimate of the target-domain distribution. Both global distribution alignment [1] and class-wise distribution alignment [2] can be disturbed, leading to inaccurate feature distribution alignment.

Considering the potential reasons for the failure of existing TTT methods, we propose a combination of two techniques to improve the robustness of open-world TTT under a self-training framework.

First, we build the TTT baseline on a self-training variant that clusters the target-domain data with the source-domain prototypes as cluster centers. To mitigate the impact of self-training on strong OOD samples with incorrect pseudo-labels, we design a hyperparameter-free method to reject strong OOD samples.

To further separate the features of weak and strong OOD samples, we allow the prototype pool to expand by enrolling isolated strong OOD samples. Self-training then lets strong OOD samples form tight clusters around the newly expanded strong OOD prototypes, which facilitates distribution alignment between the source and target domains. We further regularize self-training with global distribution alignment to reduce the risk of confirmation bias.

Finally, to synthesize open-world TTT scenarios, we adopt the CIFAR10-C, CIFAR100-C, ImageNet-C, VisDA-C, ImageNet-R, Tiny-ImageNet, MNIST, and SVHN datasets, using one dataset as weak OOD data and the others as strong OOD data to establish benchmarks. We refer to these as the open-world test-time training benchmarks and hope they encourage more future work on the robustness of test-time training in realistic scenarios.

Method

The method is introduced in four parts.

1) An overview of the open-world test-time training task setting.

2) How TTT is implemented through prototype clustering, and how prototypes are extended for open-world test-time training.

3) How target-domain data is used for dynamic prototype expansion.

4) How distribution alignment is combined with prototype clustering to enable robust open-world test-time training.


Figure 2: Method overview diagram

Task setting

The purpose of TTT is to adapt a source-domain pre-trained model to the target domain, which may have a distribution shift relative to the source domain. In standard closed-world TTT, the label spaces of the source and target domains are identical. In open-world TTT, however, the label space of the target domain contains that of the source domain; that is, the target domain has new, previously unseen semantic categories.

To avoid ambiguity among TTT definitions, we adopt the sequential test-time training (sTTT) protocol proposed in TTAC [2] for evaluation. Under sTTT, test samples arrive sequentially, and the model is updated after observing each small batch of test samples. The prediction for a test sample arriving at timestamp t is not affected by any test sample arriving at t+k (k > 0).

Prototype clustering

Inspired by works that use clustering in domain adaptation tasks [3, 4], we view test-time training as discovering the cluster structure of the target-domain data. Representative prototypes serve as cluster centers, and test samples are encouraged to embed near one of them. The prototype clustering objective is to minimize the negative log-likelihood of the cosine similarity between a sample and its cluster center, as in the following formula.

$$\mathcal{L}_{PC}(z_i) = -\log \frac{\exp\big(\cos(z_i,\mu_{\hat{y}_i})\big)}{\sum_{k=1}^{K}\exp\big(\cos(z_i,\mu_k)\big)}$$

where z_i is the test-sample feature, μ_k the k-th prototype (cluster center), and ŷ_i the pseudo-label.
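The prototype clustering step can be sketched in NumPy as follows (a minimal sketch; the function name, the temperature value, and pseudo-labeling by nearest prototype are our assumptions, not the paper's exact implementation):

```python
import numpy as np

def prototype_clustering_loss(z, prototypes, tau=0.1):
    """Negative log-likelihood over cosine similarities to cluster centers.

    z: (B, D) test-batch features; prototypes: (K, D) source-class prototypes.
    tau is an assumed temperature; pseudo-labels are the nearest prototypes.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    mu = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ mu.T / tau                      # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pseudo = log_prob.argmax(axis=1)             # pseudo-label = nearest prototype
    return float(-log_prob[np.arange(len(z)), pseudo].mean())
```

Minimizing this loss pulls each test feature toward its nearest prototype, tightening the cluster structure in the target domain.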

We develop a hyperparameter-free method to filter out strong OOD samples and avoid their negative impact on the model-weight updates. Specifically, we define a strong OOD score os for each test sample as its highest similarity to the source-domain prototypes, as in the following equation.

$$os_i = \max_{k\in\{1,\dots,K\}} \cos(z_i, \mu_k)$$

Figure 3: Strong OOD scores present a bimodal distribution

We observe that the strong OOD scores follow a bimodal distribution, as shown in Figure 3. Therefore, instead of specifying a fixed threshold, we define the optimal threshold as the value that best separates the two modes. Specifically, the problem can be formulated as dividing the scores into two clusters, with the optimal threshold minimizing the total within-cluster variance. The optimization below can be solved efficiently by exhaustively searching all possible thresholds from 0 to 1 in steps of 0.01.

$$\tau^* = \arg\min_{\tau\in\{0,\,0.01,\,\dots,\,1\}} \Big[ \sum_{os_i<\tau}\big(os_i-m_{<\tau}\big)^2 + \sum_{os_i\geq\tau}\big(os_i-m_{\geq\tau}\big)^2 \Big]$$

where m_{<τ} and m_{≥τ} are the mean scores of the two clusters.
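The score and the Otsu-style threshold search can be sketched as follows (hypothetical function names; we assume scores are searched over [0, 1] as the text describes):

```python
import numpy as np

def strong_ood_scores(z, prototypes):
    # os = highest cosine similarity between a test feature and any source prototype
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    mu = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (z @ mu.T).max(axis=1)

def optimal_threshold(scores, step=0.01):
    # Exhaustively search thresholds from 0 to 1 in steps of 0.01, keeping the
    # one that minimizes the summed within-cluster variance of the two groups.
    best_tau, best_obj = 0.5, np.inf
    for tau in np.arange(step, 1.0, step):
        low, high = scores[scores < tau], scores[scores >= tau]
        if len(low) == 0 or len(high) == 0:
            continue
        obj = len(low) * low.var() + len(high) * high.var()
        if obj < best_obj:
            best_obj, best_tau = obj, tau
    return best_tau
```

With a bimodal score distribution, the search lands between the two modes, so samples in the low-score mode can be rejected as strong OOD without tuning any hyperparameter.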

Dynamic prototype expansion

Expanding the strong OOD prototype pool requires evaluating test samples against both the source-domain prototypes and the strong OOD prototypes. Previous studies have investigated the similar problem of dynamically estimating the number of clusters from data. The deterministic hard-clustering algorithm DP-means [5] measures the distance of data points to known cluster centers and initializes a new cluster when the distance exceeds a threshold. DP-means is shown to be equivalent to optimizing the K-means objective with an additional penalty on the number of clusters, providing a feasible solution for dynamic prototype expansion.

To avoid estimating an additional hyperparameter, we first define an extended strong OOD score for each test sample as its distance to the nearest existing prototype, over both the source-domain prototypes and the strong OOD prototypes, as follows. Test samples whose score exceeds the threshold therefore initialize a new prototype. To avoid adding prototypes for nearby test samples, we repeat this prototype expansion process incrementally.

[Equation: the extended strong OOD score, computed over both the source-domain prototypes and the strong OOD prototypes]
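A minimal sketch of the incremental expansion loop (our simplification: best similarity below a threshold marks a sample as isolated, mirroring the distance-above-threshold rule in the text; the names and threshold value are assumptions):

```python
import numpy as np

def expand_prototypes(features, prototypes, sim_threshold=0.5):
    # Process test samples one by one; a sample whose best similarity to ALL
    # current prototypes (source + previously added strong OOD) is below the
    # threshold is treated as isolated and becomes a new strong OOD prototype.
    protos = [p / np.linalg.norm(p) for p in prototypes]
    for f in features:
        f = f / np.linalg.norm(f)
        best_sim = max(float(f @ p) for p in protos)
        if best_sim < sim_threshold:
            protos.append(f)           # dynamic expansion
    return np.stack(protos)
```

Because newly added prototypes immediately join the pool, a second sample near an earlier isolated one no longer spawns a redundant prototype, which is the point of processing samples incrementally.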

With additional strong OOD prototypes identified, we define the prototype clustering loss for test samples taking two factors into account. First, test samples classified into known classes should embed closer to their prototype and farther from the other prototypes, which defines the K-way classification task. Second, test samples classified as strong OOD should be far from all source-domain prototypes, which defines the (K+1)-th class. With these goals in mind, we define the prototype clustering loss as follows.

[Equation: prototype clustering loss over the expanded prototype pool, combining the K-way objective for known classes with the (K+1)-th rejection objective for strong OOD samples]
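One plausible way to realize the two goals in code is a (K+1)-way classification in which the (K+1)-th logit is the best similarity to any strong OOD prototype (a hedged sketch, not the paper's exact loss; the function name and temperature are our assumptions):

```python
import numpy as np

def open_world_pc_loss(z, src_protos, ood_protos, tau=0.1):
    # Classes 0..K-1 are the source prototypes; class K pools all strong OOD
    # prototypes, represented by the most similar one. Inliers are pulled toward
    # their prototype; strong OOD samples are pushed away from all source ones.
    norm = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    zn, mu_s, mu_o = norm(z), norm(src_protos), norm(ood_protos)
    sim_src = zn @ mu_s.T                               # (B, K)
    sim_ood = (zn @ mu_o.T).max(axis=1, keepdims=True)  # (B, 1)
    logits = np.concatenate([sim_src, sim_ood], axis=1) / tau
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pseudo = log_prob.argmax(axis=1)                    # (K+1)-way pseudo-labels
    return float(-log_prob[np.arange(len(z)), pseudo].mean())
```

Minimizing the (K+1)-way cross-entropy realizes both factors at once: a sample pseudo-labeled as a known class is pushed away from the OOD logit and the other source prototypes, while a sample pseudo-labeled as strong OOD is pushed away from every source prototype.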

Distribution alignment constraints

Self-training is known to be susceptible to erroneous pseudo-labels, and the situation worsens when the target domain contains strong OOD samples. To reduce the risk of failure, we further use distribution alignment [1] as a regularizer for self-training, as follows.

[Equation: global distribution alignment regularization term]
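Global distribution alignment in the spirit of [1] matches feature statistics of the target batch to precomputed source statistics; a minimal sketch (the squared-difference penalty is our assumed form, not necessarily the paper's exact regularizer):

```python
import numpy as np

def distribution_alignment_loss(target_feats, src_mean, src_cov):
    # Global alignment: penalize the gap between target-batch feature moments
    # and the source-domain mean/covariance computed offline.
    mu_t = target_feats.mean(axis=0)
    centered = target_feats - mu_t
    cov_t = centered.T @ centered / (len(target_feats) - 1)
    return float(((mu_t - src_mean) ** 2).sum() + ((cov_t - src_cov) ** 2).sum())
```

Because the penalty anchors the target feature distribution to the source statistics, it counteracts the drift that erroneous pseudo-labels would otherwise cause during self-training.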

Experiments

We evaluate on 5 different OWTTT benchmark datasets, including synthetically corrupted datasets and style-shifted datasets. The experiments use three evaluation metrics: the weak OOD classification accuracy ACC_S, the strong OOD classification accuracy ACC_N, and their harmonic mean ACC_H.
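The harmonic mean ACC_H rewards methods only when both accuracies are high:

```python
def acc_h(acc_s, acc_n):
    # Harmonic mean of weak OOD accuracy (ACC_S) and strong OOD accuracy (ACC_N);
    # it is dragged toward whichever of the two is lower.
    return 2 * acc_s * acc_n / (acc_s + acc_n)
```

For example, a model with ACC_S = 1.0 but ACC_N = 0.5 gets ACC_H ≈ 0.667, notably below the arithmetic mean of 0.75, so a method cannot score well by sacrificing strong OOD detection for weak OOD accuracy.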

Table 1: Performance of different methods on the CIFAR10-C dataset

Table 2: Performance of different methods on the CIFAR100-C dataset

Table 3: Performance of different methods on the ImageNet-C dataset

Table 4: Performance of different methods on the ImageNet-R dataset

Table 5: Performance of different methods on the VisDA-C dataset

As the tables above show, our method improves substantially over the current best methods on almost all datasets, effectively identifying strong OOD samples and reducing their impact on the classification of weak OOD samples. Our method achieves more robust TTT in open-world scenarios.

Summary

This paper first proposes the problem and setting of open-world test-time training (OWTTT), points out that existing methods encounter difficulties when the target-domain data contains strong OOD samples with semantic shift relative to the source-domain samples, and proposes a self-training method based on dynamic prototype expansion to solve these problems. We hope this work can provide a new direction for subsequent TTT research toward more robust methods.


References

[1] Yuejiang Liu, Parth Kothari, Bastien van Delft, Baptiste Bellot-Gurlet, Taylor Mordan, and Alexandre Alahi. Ttt++: When does self-supervised test-time training fail or thrive? In Advances in Neural Information Processing Systems, 2021.

[2] Yongyi Su, Xun Xu, and Kui Jia. Revisiting realistic test-time training: Sequential inference and adaptation by anchored clustering. In Advances in Neural Information Processing Systems, 2022.

[3] Hui Tang and Kui Jia. Discriminative adversarial domain adaptation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 5940–5947, 2020.

[4] Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, and Tatsuya Harada. Open set domain adaptation by backpropagation. In European Conference on Computer Vision, 2018.

[5] Brian Kulis and Michael I Jordan. Revisiting k-means: New algorithms via bayesian nonparametrics. In International Conference on Machine Learning, 2012.



Origin blog.csdn.net/amusi1994/article/details/132913825