[Semi-supervised learning] Match series.2

This article briefly introduces the Match-series methods in semi-supervised learning: CoMatch (ICCV2021), CRMatch (GCPR2021), Dash (ICML2021), UPS (ICLR2021), SimMatch (CVPR2022), Class-Aware Contrastive Semi-Supervised Learning (CVPR2022), and AdaMatch (ICLR2022).

Code: GitHub - microsoft/Semi-supervised-learning: A Unified Semi-Supervised Learning Codebase (NeurIPS'22)

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization, ICCV2021

Interpretation: CoMatch paper reading - Zhihu (zhihu.com)

【ICCV2021】CoMatch: Semi-supervised Learning with Contrastive Graph Regularization - Zhihu (zhihu.com)

Salesforce Research | CoMatch: Semi-supervised Learning Based on Contrastive Graph Regularization - Zhiyuan Community (baai.ac.cn)

Paper: https://arxiv.org/abs/2011.11183

Code: GitHub - salesforce/CoMatch: Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

Semi-supervised learning is a paradigm that effectively exploits unlabeled data to reduce the dependence on annotation. There are currently two mainstream research directions: (1) use a classifier to assign pseudo-labels to each unlabeled sample for training; (2) first perform unsupervised or self-supervised pre-training, then run supervised fine-tuning and self-training on the learned representations. However, self-training methods depend heavily on the quality of the classifier's predictions, which causes confirmation bias and accumulates prediction errors. On the other hand, self-supervised methods such as contrastive learning learn representations that are suboptimal for a specific classification task.

This paper proposes a new semi-supervised learning method, CoMatch, which unifies the current mainstream semi-supervised approaches and addresses their limitations. CoMatch jointly learns two representations of the training data: class probabilities and low-dimensional embeddings. The two representations interact and co-evolve: the embeddings impose a smoothness constraint on the class probabilities to improve the pseudo-labels, while the pseudo-labels regularize the structure of the embeddings via graph-based contrastive learning.


Given a batch of unlabeled images, the authors apply weak data augmentation and use the augmented images to generate memory-smoothed pseudo-labels. These pseudo-labels serve as targets for the class predictions on the strongly augmented images. The authors also construct a pseudo-label graph with self-loops that measures the similarity between samples, and use it to train an embedding graph so that images with similar pseudo-labels have similar embeddings.
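
Below is a minimal PyTorch-style sketch of the graph-based contrastive term described above. The function name, the threshold and temperature values, and the omission of the memory-smoothing queue are my own simplifications for illustration, not code from the official repository.

```python
import torch
import torch.nn.functional as F

def comatch_graph_contrastive(pseudo_labels, z_strong1, z_strong2, T=0.2, threshold=0.8):
    """Sketch of a CoMatch-style graph-based contrastive loss.

    pseudo_labels: (B, C) pseudo-label distributions from the weak view
    z_strong1, z_strong2: (B, D) projections of two strongly augmented views
    """
    z1 = F.normalize(z_strong1, dim=1)
    z2 = F.normalize(z_strong2, dim=1)

    # Pseudo-label graph: pairwise similarity of pseudo-labels,
    # thresholded and given self-loops, then row-normalized as the target.
    with torch.no_grad():
        Wq = pseudo_labels @ pseudo_labels.t()                 # (B, B)
        Wq = torch.where(Wq >= threshold, Wq, torch.zeros_like(Wq))
        Wq.fill_diagonal_(1.0)                                 # self-loops
        Wq = Wq / Wq.sum(dim=1, keepdim=True)

    # Embedding graph: similarity between embeddings of the two strong views.
    Wz = torch.exp(z1 @ z2.t() / T)
    Wz = Wz / Wz.sum(dim=1, keepdim=True)

    # Cross-entropy between the two graphs pulls together embeddings of
    # samples with similar pseudo-labels and pushes the others apart.
    return -(Wq * torch.log(Wz + 1e-8)).sum(dim=1).mean()
```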

CRMatch: Revisiting Consistency Regularization for Semi-Supervised Learning, GCPR2021

Paper: [2112.05825] Revisiting Consistency Regularization for Semi-Supervised Learning (arxiv.org)

Code: an implementation of CRMatch is included in the unified codebase linked above (GitHub - microsoft/Semi-supervised-learning)

This paper revisits consistency regularization, which is usually enforced by reducing the distance between features of different augmented views of an image. The authors find that encouraging equivariance by increasing the feature distance can further improve performance. The paper proposes an improved consistency-regularization framework, CRMatch, built on a FeatDistLoss technique that imposes consistency at the classifier level and equivariance at the feature level.

Figure: binary classification task. Stars are features of strongly augmented images, circles are features of weakly augmented images. Encouraging invariance by reducing the distance between features from different augmented views already gives good performance (left), but encouraging equivariant representations by increasing the distance makes the feature space more regular, leading to better generalization (right).
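
The following is a hedged sketch of how a FeatDistLoss-style objective could look: classifier-level consistency in the FixMatch style plus a feature-level term that pushes weak and strong features apart. Function names, the cosine-similarity distance, and the threshold are illustrative assumptions; the paper's exact loss differs in its details.

```python
import torch
import torch.nn.functional as F

def crmatch_style_loss(logits_weak, logits_strong, feat_weak, feat_strong,
                       lambda_dist=1.0, conf_threshold=0.95):
    """Sketch: classifier-level consistency + feature-level equivariance."""
    # Classifier-level consistency: a hard pseudo-label from the weak view
    # supervises the strong view, gated by a confidence threshold.
    probs = F.softmax(logits_weak.detach(), dim=1)
    conf, pseudo = probs.max(dim=1)
    mask = (conf >= conf_threshold).float()
    loss_cls = (F.cross_entropy(logits_strong, pseudo, reduction='none') * mask).mean()

    # Feature-level "equivariance": minimizing the cosine similarity pushes
    # the weak and strong features apart, which the paper reports works
    # better than pulling them together.
    fw = F.normalize(feat_weak, dim=1)
    fs = F.normalize(feat_strong, dim=1)
    loss_dist = (fw * fs).sum(dim=1).mean()

    return loss_cls + lambda_dist * loss_dist
```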

Dash: Semi-Supervised Learning with Dynamic Thresholding, ICML 2021

Interpretation: Paper notes --- "Dash" - Zhihu (zhihu.com)

Dash, an open source semi-supervised learning framework of DAMO Academy, refreshes multiple SOTAs - Zhihu (zhihu.com)

Paper: https://arxiv.org/abs/2109.00650

Code: GitHub - idstcv/Dash: TensorFlow implementation for Dash (github.com)

This paper proposes a general-purpose SSL algorithm with a dynamic threshold (Dash) that can dynamically select unlabeled data during training. Specifically, Dash first traverses the labeled data to obtain a threshold for unlabeled data selection. Then it selects the unlabeled data whose loss value is less than the threshold as the training data set. The threshold is gradually decreased during the optimization iterations. 
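A small sketch of this idea, assuming the initial threshold rho_init has already been estimated from the labeled data during warm-up; the constants C and gamma are illustrative hyperparameters (C >= 1, gamma > 1), and the function names are mine.

```python
import torch
import torch.nn.functional as F

def dash_threshold(rho_init, step, C=1.0001, gamma=1.27):
    """Dash-style dynamically decreasing threshold.
    rho_init: loss level estimated on labeled data after warm-up.
    The constants here are illustrative, not the paper's exact settings."""
    return C * (gamma ** (-step)) * rho_init

def select_unlabeled_loss(logits_strong, pseudo_labels, rho_t):
    """Keep only unlabeled samples whose pseudo-label loss is below the threshold."""
    loss = F.cross_entropy(logits_strong, pseudo_labels, reduction='none')
    mask = (loss < rho_t).float()
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```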

In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning, ICLR 2021

Interpretation: Pseudo-labels can still be used in this way? Semi-supervised masterpiece UPS (ICLR 21) revealed! - Zhihu (zhihu.com)

Interpretation of the paper: IN DEFENSE OF PSEUDO-LABELING - Zhihu (zhihu.com)

Paper: [2101.06329] In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning (arxiv.org)

Code: GitHub - nayeemrizve/ups: "In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning" by Mamshad Nayeem Rizve, Kevin Duarte, Yogesh S Rawat, Mubarak Shah (ICLR 2021) (github.com)

Pseudo-labels tend to be overconfident. If poor-quality pseudo-labels are used for training, a large number of noisy samples are introduced, which severely hurts model performance (this falls into the category of learning with noisy labels). The output of the network therefore needs to be corrected (calibration of neural networks). This paper borrows uncertainty estimation for deep networks (MC-Dropout, ICML 2016) and combines it with the softmax output probabilities as a two-pronged filter to screen out reliable pseudo-labeled samples. It also uses negative learning: even when we do not know which category a sample belongs to, we can be confident about categories it does not belong to (Negative Learning for Noisy Labels, ICCV 2019). Such negative pseudo-labels are more accurate than traditional positive pseudo-labels, so they reduce the label noise rate and help calibrate the model.

The Uncertainty-Aware Pseudo-label Selection (UPS) framework proposed in this paper combines uncertainty estimation with negative learning.

  • Pseudo-labeling method that takes both Positive & Negative Pseudo Label into account
  • Uncertainty-Based Pseudo-Label Selection

Generally speaking, it is divided into 3 steps:

  1. Train a model with only the labeled data;
  2. Use the trained model, combined with the UPS selection rule, to screen out samples for pseudo-labeling (see the sketch after this list);
  3. Train the model (randomly re-initialized) on the pseudo-labeled data together with the labeled data, then return to step 2 and repeat until the maximum number of iterations is reached. Re-initialization prevents errors from wrong pseudo-labeled samples from propagating across iterations.
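
Below is an illustrative sketch of step 2: MC-Dropout uncertainty plus confidence thresholds to pick positive and negative pseudo-labels. The threshold values and function names are assumptions for illustration, not the paper's exact settings.

```python
import torch

def mc_dropout_stats(model, x, passes=10):
    """Uncertainty via MC-Dropout: average several stochastic forward passes;
    the per-class standard deviation serves as the uncertainty."""
    model.train()  # keeps dropout active (simplification: BN layers are also affected)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(passes)])
    return probs.mean(0), probs.std(0)          # (B, C) mean, (B, C) uncertainty

def ups_select(mean_probs, uncertainty, tau_p=0.7, kappa_p=0.05,
               tau_n=0.05, kappa_n=0.005):
    """UPS-style selection masks (thresholds here are illustrative).
    Positive labels: confident and certain. Negative labels: classes that
    are confidently ruled out."""
    pos_mask = (mean_probs >= tau_p) & (uncertainty <= kappa_p)
    neg_mask = (mean_probs <= tau_n) & (uncertainty <= kappa_n)
    return pos_mask, neg_mask
```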

SimMatch: Semi-supervised Learning with Similarity Matching, CVPR2022

Interpretation: SimMatch paper sharing - __init__'s blog - CSDN Blog

Paper: https://arxiv.org/abs/2203.06915

Code: GitHub - mingkai-zheng/simmatch

Two common approaches in semi-supervised learning:

  • Pre-train on a large-scale dataset and fine-tune with a small amount of labeled data. The disadvantage is that it is hard to exploit the label information during pre-training.
  • Generate pseudo-labels from weak views or from the average prediction over multiple augmented views. The disadvantage is that when labeled data is very limited, the trained semantic classifier is unreliable and the generated pseudo-labels suffer from an "over-confidence" problem: the model fits pseudo-labels that are confident but wrong, resulting in performance degradation.

This paper proposes the SimMatch framework:

  1. First, the strongly augmented view and the weakly augmented view should have the same semantic similarity (predicted class distribution).
  2. Second, the strongly augmented view should have the same instance similarity (similarity to other instances) as the weakly augmented view, which encourages more intrinsic feature matching: the two views should share a similar similarity distribution.

 SimMatch performs label propagation and allows semantic similarity and instance similarity to interact with each other.

  • Use semantic similarity to calibrate instance similarity;
  • Adjust semantic similarity using instance similarity.

When the semantic similarity and the instance similarity are close, meaning the two distributions agree with each other's predictions, the resulting pseudo-labels have higher confidence and are therefore more reliable.
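
A rough sketch of this interaction, assuming a memory bank of labeled embeddings: the instance-similarity distribution over the bank is aggregated into a class distribution and blended with the semantic prediction. The convex combination used here is a simplification of the paper's unfolding/aggregation step, and all names and values are illustrative.

```python
import torch
import torch.nn.functional as F

def simmatch_pseudo_label(p_weak, z_weak, bank_emb, bank_labels, num_classes,
                          t=0.1, alpha=0.9):
    """Sketch of SimMatch-style interaction between semantic and instance similarity.
    p_weak: (B, C) classifier prediction of the weak view
    z_weak: (B, D) embedding of the weak view
    bank_emb: (K, D) embeddings of labeled samples (memory bank)
    bank_labels: (K,) their ground-truth labels
    """
    z = F.normalize(z_weak, dim=1)
    bank = F.normalize(bank_emb, dim=1)

    # Instance similarity: a distribution over the labeled memory bank.
    q = F.softmax(z @ bank.t() / t, dim=1)                     # (B, K)

    # Aggregate instance similarity into a class distribution
    # (instance similarity adjusts the semantic similarity).
    onehot = F.one_hot(bank_labels, num_classes).float()       # (K, C)
    q_class = q @ onehot                                       # (B, C)

    # Calibrated pseudo-label: a convex combination, as a simplification
    # of the paper's unfolding/aggregation step.
    pseudo = alpha * p_weak + (1 - alpha) * q_class
    return pseudo / pseudo.sum(dim=1, keepdim=True)
```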

 

Class-Aware Contrastive Semi-Supervised Learning, CVPR2022

Interpretation: Class-Aware Contrastive Semi-Supervised Learning paper reading notes - Remoa's blog - CSDN Blog

Paper: [2203.02261] Class-Aware Contrastive Semi-Supervised Learning (arxiv.org)

Code: https://github.com/TencentYoutuResearch/Classification-SemiCLS

Problems with existing pseudo-label-based semi-supervised learning methods:

  • Pseudo-label → Confirmation Bias exists
  • Out-of-distribution noise data → affects the discriminative ability of the model

Is there a general technique that can be applied on top of pseudo-label-based semi-supervised methods to improve them?

  • MixMatch[1] (NeurIPS 2019): data Mixup → prediction sharpening
  • FixMatch[2] (NeurIPS 2020): confidence threshold; weak augmentation → generate pseudo-labels → supervise the strongly augmented view

This paper proposes a general framework for alleviating Confirmation Bias:

  • For reliable in-distribution data: use supervised contrastive learning. In-distribution data refers to unlabeled data that contains no new categories, or whose class distribution is balanced.
  • For noisy out-of-distribution data: perform unsupervised contrastive learning on the features. Out-of-distribution data refers to unlabeled data that contains unknown classes, or whose class distribution is unbalanced.

For the noise in pseudo-labels: re-weight the samples.
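
As a sketch of the class-aware idea: confident samples sharing a pseudo-label are treated as contrastive positives, low-confidence (potentially out-of-distribution) samples keep only their own augmented view as a positive, and confidence doubles as a re-weighting term. The names and the threshold are my assumptions, not the official implementation.

```python
import torch

def class_aware_positive_mask(pseudo_labels, confidence, threshold=0.95):
    """Sketch of class-aware contrastive positives with confidence re-weighting.
    pseudo_labels: (B,) hard pseudo-labels; confidence: (B,) their confidences."""
    B = pseudo_labels.size(0)
    confident = confidence >= threshold                          # (B,)
    same_class = pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)  # (B, B)
    # Supervised contrastive positives: both samples confident and same pseudo-class.
    mask = same_class & confident.unsqueeze(0) & confident.unsqueeze(1)
    # Unsupervised fallback: every sample is always a positive with its own other view.
    mask = mask | torch.eye(B, dtype=torch.bool, device=pseudo_labels.device)
    return mask.float(), confidence                              # positives, per-sample weights
```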

AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation, ICLR2022

Interpretation: AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation - Zhihu (zhihu.com)

Paper: https://arxiv.org/abs/2106.04732

Code: GitHub - google-research/adamatch

GitHub - yizhe-ang/AdaMatch-PyTorch: Unofficial PyTorch Implementation of AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation

https://github.com/zysymu/AdaMatch-pytorch

This paper extends semi-supervised learning to domain adaptation problems, enabling models to be trained on one data distribution and tested on another. AdaMatch unifies the tasks of unsupervised domain adaptation (UDA), semi-supervised learning (SSL) and semi-supervised domain adaptation (SSDA). 

Three techniques: random logit interpolation, relative confidence thresholding, and an improved distribution alignment (borrowed from ReMixMatch); the algorithmic backbone comes from FixMatch.
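
A hedged sketch of two of these ingredients (random logit interpolation is omitted): distribution alignment rescales the target predictions by the source/target class-distribution ratio, and the confidence threshold is set relative to the source batch's mean confidence rather than fixed as in FixMatch. Function names and tau are illustrative.

```python
import torch
import torch.nn.functional as F

def adamatch_pseudo_labels(logits_source_weak, logits_target_weak, tau=0.9):
    """Sketch of AdaMatch-style distribution alignment + relative confidence threshold."""
    probs_s = F.softmax(logits_source_weak.detach(), dim=1)
    probs_t = F.softmax(logits_target_weak.detach(), dim=1)

    # Distribution alignment: rescale target predictions by the ratio of the
    # source and target class distributions, then renormalize.
    align = probs_s.mean(0) / probs_t.mean(0).clamp(min=1e-6)
    probs_t = probs_t * align
    probs_t = probs_t / probs_t.sum(dim=1, keepdim=True)

    # Relative confidence threshold: a fraction of the source batch's mean
    # confidence on weakly augmented data.
    threshold = tau * probs_s.max(dim=1).values.mean()
    conf, pseudo = probs_t.max(dim=1)
    mask = (conf >= threshold).float()
    return pseudo, mask
```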


Origin blog.csdn.net/m0_61899108/article/details/130523054