Semi-supervised semantic segmentation aims to learn a better segmentation model from as few labeled images as possible together with a large number of unlabeled images. The labeled images are generally handled as in fully supervised semantic segmentation, e.g., by computing a cross-entropy loss between the predictions and the manual annotations. The key question is how to use the unlabeled images.
This article briefly introduces the Match-series methods in semi-supervised learning: FreeMatch (ICLR 2023), SoftMatch (ICLR 2023), and UniMatch (CVPR 2023).
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning, ICLR 2023
Interpretation: FreeMatch paper reading - Zhihu (zhihu.com)
ICLR 2023 semi-supervised learning highest-scoring paper FreeMatch: adaptive threshold method - Zhihu
Paper: FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning | OpenReview
Existing methods may not effectively utilize unlabeled data, as they use either predefined/fixed thresholds or specialized heuristic thresholding schemes, which leads to poor performance and slow convergence. The paper first theoretically analyzes a simple binary classification model to gain intuition about the relationship between the ideal threshold and the model's learning state. Based on this analysis, FreeMatch is proposed to adjust the confidence threshold adaptively according to the model's learning state. A self-adaptive class-fairness regularization penalty is further introduced to encourage the model to make diverse predictions in the early training stage.
FreeMatch consists of two parts: self-adaptive thresholding and a self-adaptive fairness regularization penalty.
Self-adaptive thresholding comprises an adaptive global threshold and adaptive local (class-specific) thresholds. The local thresholds adjust the global threshold in a class-specific manner to account for intra-class diversity and possible class adjacency.
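The two thresholds can be sketched as follows in pure Python. This is a minimal sketch under our reading of the paper: the global threshold is an EMA of the batch-average max confidence, per-class statistics are tracked by EMA, and each class threshold is the global threshold scaled by the normalized class statistic. Toy probability lists stand in for real model outputs, and the function/variable names are ours, not the paper's code.

```python
# Sketch of FreeMatch self-adaptive thresholding (toy inputs, names are ours).
def update_thresholds(batch_probs, tau_g, p_local, m=0.999):
    """batch_probs: list of per-sample class-probability lists.
    tau_g: current global threshold; p_local: per-class EMA statistics."""
    B = len(batch_probs)
    C = len(batch_probs[0])
    # Global threshold: EMA of the batch-average max confidence.
    mean_max = sum(max(p) for p in batch_probs) / B
    tau_g = m * tau_g + (1 - m) * mean_max
    # Local statistics: per-class EMA of the mean predicted probability.
    for c in range(C):
        mean_c = sum(p[c] for p in batch_probs) / B
        p_local[c] = m * p_local[c] + (1 - m) * mean_c
    # Class-specific thresholds: scale the global threshold by the
    # max-normalized local statistics.
    max_p = max(p_local)
    tau_c = [tau_g * (p_local[c] / max_p) for c in range(C)]
    return tau_g, p_local, tau_c
```

A sample whose predicted class is `c` then contributes to the unsupervised loss only when its max confidence exceeds `tau_c[c]`, so under-learned classes get lower thresholds.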
The self-adaptive fairness regularization penalty does not use the uniform class prior commonly assumed in earlier work (real-world data rarely satisfy the class-balance condition); instead, it uses an exponential moving average (EMA) of the model's predictions as the estimated predictive distribution over the unlabeled data.
Significant performance improvement.
SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning, ICLR 2023
Interpretation: SoftMatch paper reading - Zhihu (zhihu.com)
Paper: SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning | OpenReview
Confidence thresholding is the mainstream way of using pseudo-labels. Too high a threshold discards many uncertain pseudo-labels, leading to imbalanced learning across categories and low utilization of pseudo-labels. Dynamic thresholds bring more pseudo-labels into training early on by lowering the threshold (per category or per sample), but the low early threshold inevitably introduces low-quality pseudo-labels.
The method's background is training the model with pseudo-labels. The core argument is that existing pseudo-label work uses hard thresholds to select high-confidence samples, which has two potential drawbacks: (1) a high threshold discards many low-confidence but actually correct pseudo-labels, reducing training efficiency (FreeMatch, from the same year, addresses this); (2) dynamically growing or class-specific thresholds do encourage the model to use more pseudo-labels, but inevitably introduce wrong pseudo-labels (supervision signals).
SoftMatch focuses on resolving the quantity-quality trade-off of pseudo-labels. It also aligns the marginal probabilities of different categories so that data from different categories are weighted as uniformly as possible.
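The trade-off is resolved by replacing the hard threshold with a soft, confidence-dependent sample weight. Below is a minimal sketch of this Gaussian weighting as we understand it: samples above the mean confidence `mu` get full weight, and samples below it decay smoothly instead of being dropped. In the paper `mu` and the variance `var` are tracked by EMA over batches; here they are simply passed in, and the function name is ours.

```python
import math

# Sketch of SoftMatch's truncated-Gaussian confidence weighting
# (mu/var would be EMA statistics of the batch confidences).
def softmatch_weight(confidence, mu, var, lambda_max=1.0):
    if confidence >= mu:
        return lambda_max                  # confident samples: full weight
    # Low-confidence samples are down-weighted smoothly, not discarded.
    return lambda_max * math.exp(-((confidence - mu) ** 2) / (2 * var))
```

Compared with a hard threshold, every pseudo-label participates in training (high quantity) while unreliable ones contribute only a small weight (controlled quality).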
The classification results are strong.
UniMatch: Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation, CVPR 2023
Interpretation: CVPR 2023 | UniMatch: Re-examining the strong and weak consistency in semi-supervised semantic segmentation - Zhihu (zhihu.com)
This paper revisits the weak-to-strong consistency approach in semi-supervised semantic segmentation. The authors find that FixMatch, the most basic method enforcing weak-to-strong consistency, already achieves considerable performance. Inspired by this, the paper further expands the perturbation space and uses dual-stream perturbation to explore the original perturbation space more fully.
Strong perturbations yield large performance gains. However, FixMatch applies strong perturbation only at the image level, and the paper expands FixMatch's perturbation space further:
- Add a training branch that applies strong perturbation in the feature space (dropout = 0.5) (UniPerb).
- Add another image-level strong perturbation branch for dual-stream perturbation (DusPerb).
Combining the two modules, UniPerb and DusPerb, yields UniMatch.
For unlabeled images, UniMatch runs four forward-propagation branches in total: a "clean" branch that generates pseudo-labels, a feature-level strong perturbation branch (acting on the features of the weakly augmented image), and two image-level strong perturbation branches (with no feature perturbation). The last three branches are used for training (the labeled-image training branch is omitted in the figure).
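The four branches can be sketched as follows. This is a structural sketch only, with toy pure-Python stubs standing in for the real network, augmentations, and dropout; the function names (`model`, `weak_aug`, `strong_aug`, `unimatch_forward`) are ours, not the paper's code.

```python
import random

# Toy stand-ins: x is a list of numbers, "model" returns 2-class scores.
def model(x, feature_dropout=False):
    feat = [v * 2 for v in x]                          # fake feature extractor
    if feature_dropout:                                # feature-level perturbation
        feat = [v if random.random() > 0.5 else 0.0 for v in feat]
    return [sum(feat), -sum(feat)]                     # fake 2-class head

def weak_aug(x):   return x                            # identity as weak aug
def strong_aug(x): return [v + random.gauss(0, 0.1) for v in x]

def argmax(scores): return max(range(len(scores)), key=lambda c: scores[c])

def unimatch_forward(x):
    x_w = weak_aug(x)
    pseudo = argmax(model(x_w))                        # clean branch -> pseudo-label
    preds = [
        model(x_w, feature_dropout=True),              # UniPerb: feature perturbation
        model(strong_aug(x)),                          # DusPerb: strong view 1
        model(strong_aug(x)),                          # DusPerb: strong view 2
    ]
    # Each of the three trained branches is supervised by the same pseudo-label.
    return pseudo, [argmax(p) for p in preds]
```

In the real method, the consistency loss is a cross-entropy between each perturbed branch's prediction and the clean branch's pseudo-label, applied per pixel.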
Significant performance improvement.