Contents
- MixMatch: A Holistic Approach to Semi-Supervised Learning(2019)
- CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features(2019)
- Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
- U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection
- Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
- CBAM: Convolutional Block Attention Module
- FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
- Res2Net: A New Multi-scale Backbone Architecture(2019)
- Barlow Twins: Self-Supervised Learning via Redundancy Reduction(2021)
- Emerging Properties in Self-Supervised Vision Transformers(2021)
- MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer(2022)
- Supervised Contrastive Learning(2020)
- RepVGG: Making VGG-style ConvNets Great Again(2021)
- Pay Attention to MLPs(2021)
- Dual Path Networks(2017)
- Visual Attention Network(2022)
- PVT v2: Improved Baselines with Pyramid Vision Transformer(2021)
- Swin Transformer V2: Scaling Up Capacity and Resolution
- MetaFormer Is Actually What You Need for Vision(2022)
- CvT: Introducing Convolutions to Vision Transformers(2021)
MixMatch: A Holistic Approach to Semi-Supervised Learning(2019)
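The key point is the sharpening formula, which lowers the entropy of the guessed label distribution for unlabeled data.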
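As a rough illustration, the sharpening step can be written as Sharpen(p, T)_i = p_i^(1/T) / sum_j p_j^(1/T). The sketch below uses NumPy instead of PyTorch for brevity; the function name `sharpen` and the example distribution are assumptions made for illustration only.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Raise each probability to the power 1/T and renormalize over classes."""
    p = np.asarray(p, dtype=np.float64)
    powered = p ** (1.0 / T)
    return powered / powered.sum(axis=-1, keepdims=True)

# Hypothetical averaged prediction for one unlabeled image.
guess = np.array([0.6, 0.3, 0.1])
sharpened = sharpen(guess, T=0.5)
print(sharpened)
```

A lower temperature T pushes the distribution closer to one-hot, while T = 1 leaves it unchanged.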
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features(2019)
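Two pictures are mixed: a patch cut from one image is pasted onto another, and the labels are mixed in proportion to the patch area.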
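A minimal NumPy sketch of this mixing, assuming images in height x width x channel layout and one-hot labels. The function name `cutmix` and its arguments are illustrative, not taken from any official implementation.

```python
import numpy as np

def cutmix(img_a, img_b, label_a, label_b, rng, alpha=1.0):
    """Paste a random box from img_b into img_a and mix the labels by area."""
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)
    # Box size chosen so that its area is roughly (1 - lam) of the image.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    # Recompute lam from the box that was actually pasted after clipping.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    label = lam * label_a + (1.0 - lam) * label_b
    return mixed, label, lam

rng = np.random.default_rng(0)
img_a = np.zeros((32, 32, 3))
img_b = np.ones((32, 32, 3))
mixed, label, lam = cutmix(img_a, img_b, np.array([1.0, 0.0]), np.array([0.0, 1.0]), rng)
```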
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
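The features are split into a low-frequency branch and a high-frequency branch; each branch is updated, and the two branches exchange information with each other.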
U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection
It is a nested U-Net: each stage of the outer U-shaped network is itself a small U-shaped block.
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
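It uses a mean-teacher-style model: the target network is a moving average of the online network, and the loss is computed between the two networks' outputs on two different augmented views.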
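The two ingredients can be sketched in NumPy as follows. The parameter lists, the default `tau`, and the function names are assumptions for illustration; a real training loop would apply the update to network weights after each optimizer step.

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.996):
    """Move each target parameter toward the online one: t <- tau*t + (1-tau)*o."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]

def byol_loss(online_pred, target_proj):
    """Squared distance between L2-normalized vectors, equal to 2 - 2*cosine."""
    p = online_pred / np.linalg.norm(online_pred, axis=-1, keepdims=True)
    z = target_proj / np.linalg.norm(target_proj, axis=-1, keepdims=True)
    return np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1))

# Toy parameters and outputs, purely illustrative.
new_target = ema_update([np.zeros(3)], [np.ones(3)], tau=0.9)
v = np.array([[1.0, 2.0, 3.0]])
loss_same = byol_loss(v, v)       # identical directions
loss_opposite = byol_loss(v, -v)  # opposite directions
```

No gradient flows into the target network; it only changes through the moving-average update.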
CBAM: Convolutional Block Attention Module
import math

import torch
import torch.nn as nn


class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True):
        super(GhostModule, self).__init__()
        self.oup = oup
        # The primary convolution produces only oup / ratio channels.
        init_channels = math.ceil(oup / ratio)
        # The remaining channels come from the cheap depthwise operation.
        new_channels = init_channels * (ratio - 1)

        self.primary_conv = nn.Sequential(
            nn.Conv2d(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.ReLU(inplace=True) if relu else nn.Sequential(),
        )

        # Depthwise convolution: groups=init_channels keeps it cheap.
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(new_channels),
            nn.ReLU(inplace=True) if relu else nn.Sequential(),
        )

    def forward(self, x):
        x1 = self.primary_conv(x)
        x2 = self.cheap_operation(x1)
        # Concatenate both parts and trim to exactly oup channels.
        out = torch.cat([x1, x2], dim=1)
        return out[:, :self.oup, :, :]
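It is easier to understand through the code. The essence of the module is to reduce the number of convolution parameters: a primary convolution produces only part of the output channels, and a cheap depthwise convolution generates the rest.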
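To make the saving concrete, the weight counts can be worked out by hand. The sketch below mirrors the channel arithmetic of the GhostModule above and compares it with a plain convolution of the same kernel size; the helper names and the 64-to-128-channel example are assumptions chosen for illustration, and BatchNorm parameters are ignored.

```python
import math

def conv_params(inp, oup, k, groups=1):
    """Number of weights in a bias-free 2D convolution."""
    return (inp // groups) * oup * k * k

def ghost_params(inp, oup, kernel_size=1, ratio=2, dw_size=3):
    """Convolution weights of a Ghost module with the same defaults as above."""
    init_channels = math.ceil(oup / ratio)
    new_channels = init_channels * (ratio - 1)
    primary = conv_params(inp, init_channels, kernel_size)
    cheap = conv_params(init_channels, new_channels, dw_size, groups=init_channels)
    return primary + cheap

plain = conv_params(64, 128, 3)               # 64 * 128 * 9 = 73728
ghost = ghost_params(64, 128, kernel_size=3)  # 36864 + 576 = 37440
print(plain, ghost)
```

With ratio=2 the module needs roughly half the weights of the plain convolution, because the depthwise part adds only 576 weights.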
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
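The loss is the cross-entropy between the predicted probability distributions of differently augmented versions of the same image: a confident prediction on the weakly augmented view serves as the pseudo-label for the strongly augmented view.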
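A small NumPy sketch of the unlabeled-data loss, assuming the model's logits for the weak and strong views are already available. The function names, the threshold value, and the toy logits are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fixmatch_unlabeled_loss(logits_weak, logits_strong, threshold=0.95):
    """Cross-entropy of strong-view predictions against confident weak-view pseudo-labels."""
    probs_weak = softmax(logits_weak)
    pseudo = probs_weak.argmax(axis=-1)
    # Only samples whose weak-view confidence passes the threshold contribute.
    mask = probs_weak.max(axis=-1) >= threshold
    log_probs_strong = np.log(softmax(logits_strong))
    ce = -log_probs_strong[np.arange(len(pseudo)), pseudo]
    return (ce * mask).mean()

logits_weak = np.array([[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]])   # confident, not confident
logits_strong = np.array([[10.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
loss = fixmatch_unlabeled_loss(logits_weak, logits_strong)
loss_no_threshold = fixmatch_unlabeled_loss(logits_weak, logits_strong, threshold=0.0)
```

In the toy batch only the first sample passes the threshold, so the disagreement on the second sample is ignored unless the threshold is lowered.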
Res2Net: A New Multi-scale Backbone Architecture(2019)
It adds multi-scale feature extraction inside a single residual block.
Barlow Twins: Self-Supervised Learning via Redundancy Reduction(2021)
In fact, it is relatively simple. Two different transformations of the same image go through the same network, and the loss pushes the cross-correlation matrix of the two outputs toward the identity matrix.
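The loss can be sketched in NumPy as below, assuming `z_a` and `z_b` are the embeddings of the two transformed batches. The function name, the weight `lam`, and the random test data are illustrative assumptions.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Push the cross-correlation matrix of two embedding batches toward identity."""
    n = z_a.shape[0]
    # Standardize each feature dimension over the batch.
    z_a = (z_a - z_a.mean(axis=0)) / z_a.std(axis=0)
    z_b = (z_b - z_b.mean(axis=0)) / z_b.std(axis=0)
    c = z_a.T @ z_b / n                                   # (d, d) cross-correlation
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)             # invariance term
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)   # redundancy-reduction term
    return on_diag + lam * off_diag

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
loss_same = barlow_twins_loss(z, z)                               # identical views
loss_unrelated = barlow_twins_loss(z, rng.normal(size=(256, 8)))  # unrelated embeddings
```

Identical embeddings give a loss near zero, while unrelated embeddings are penalized mainly through the diagonal term.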
Emerging Properties in Self-Supervised Vision Transformers(2021)
Different augmented versions of an image are fed to a student network and a teacher network; the teacher is a moving average of the student, and the loss is computed between their outputs.
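A hedged NumPy sketch of the pieces involved: the teacher's moving-average update, the centering of teacher outputs, and the cross-entropy between teacher and student distributions. The function names and the temperature and momentum defaults are assumptions for illustration.

```python
import numpy as np

def log_softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def dino_loss(student_out, teacher_out, center, tau_s=0.1, tau_t=0.04):
    """Cross-entropy from centered, sharpened teacher probabilities to student probabilities."""
    p_teacher = np.exp(log_softmax((teacher_out - center) / tau_t))
    log_p_student = log_softmax(student_out / tau_s)
    return np.mean(-np.sum(p_teacher * log_p_student, axis=-1))

def update_center(center, teacher_out, momentum=0.9):
    """Running mean of teacher outputs, used to center them."""
    return momentum * center + (1.0 - momentum) * teacher_out.mean(axis=0)

def ema_update(teacher_w, student_w, momentum=0.996):
    """Teacher weights follow a moving average of the student weights."""
    return momentum * teacher_w + (1.0 - momentum) * student_w

rng = np.random.default_rng(0)
student_out = rng.normal(size=(4, 6))   # student output for one view
teacher_out = rng.normal(size=(4, 6))   # teacher output for another view
center = np.zeros(6)
loss = dino_loss(student_out, teacher_out, center)
center = update_center(center, teacher_out)
```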
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer(2022)
It is equivalent to adding convolution to the Transformer to reduce the number of parameters.
Supervised Contrastive Learning(2020)
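Roughly speaking, self-supervised contrastive learning treats another picture of a dog as a negative example even when the anchor is also a dog; supervised contrastive learning solves this problem by using the labels to treat samples of the same class as positives.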
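A minimal NumPy sketch of a supervised contrastive loss in this spirit, where every other sample with the same label counts as a positive. The function name, the temperature, and the toy "dog/cat" features are illustrative assumptions.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """Average negative log-probability of same-class samples for each anchor."""
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    n = len(labels)
    logits = z @ z.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    not_self = ~np.eye(n, dtype=bool)
    # Softmax denominator runs over all samples except the anchor itself.
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0                                      # anchors with at least one positive
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / n_pos[valid]
    return -mean_log_prob_pos.mean()

# Two "dog" embeddings close together, two "cat" embeddings close together.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_true_labels = supcon_loss(features, np.array([0, 0, 1, 1]))
loss_wrong_labels = supcon_loss(features, np.array([0, 1, 0, 1]))
```

With the true labels the second dog is pulled toward the first instead of being pushed away, so the loss is much lower than with mismatched labels.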
RepVGG: Making VGG-style ConvNets Great Again(2021)
Just look at the network structure.
Pay Attention to MLPs(2021)
A bit like the idea of MLP-Mixer.
Dual Path Networks(2017)
I am not sure whether the two paths are based on splitting the channels.
Visual Attention Network(2022)
PVT v2: Improved Baselines with Pyramid Vision Transformer(2021)
Swin Transformer V2: Scaling Up Capacity and Resolution
The main change is in how attention is calculated from q, k and v.
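One of the changes described in the paper is scaled cosine attention, where the dot product between q and k is replaced by their cosine similarity divided by a temperature. The NumPy sketch below covers a single head; the fixed `tau`, the optional `bias` argument, and the function name are assumptions for illustration (in the paper the temperature is learnable).

```python
import numpy as np

def scaled_cosine_attention(q, k, v, tau=0.1, bias=None):
    """Attention with logits cos(q_i, k_j) / tau (+ optional position bias)."""
    q_n = q / np.linalg.norm(q, axis=-1, keepdims=True)
    k_n = k / np.linalg.norm(k, axis=-1, keepdims=True)
    logits = q_n @ k_n.T / max(tau, 0.01)   # keep the temperature above 0.01
    if bias is not None:
        logits = logits + bias
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(5, 8)) for _ in range(3))
out, weights = scaled_cosine_attention(q, k, v)
out_scaled, _ = scaled_cosine_attention(100.0 * q, k, v)
```

Because the similarity is a cosine, multiplying q by a large constant leaves the output unchanged, which is not true of plain dot-product attention.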
MetaFormer Is Actually What You Need for Vision(2022)
CvT: Introducing Convolutions to Vision Transformers(2021)
Some models use an MLP to generate tokens, and some use convolution. When using convolution, pay attention to the dimension transformation between feature maps and token sequences.
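The dimension transformation can be sketched with NumPy arrays, assuming a convolution output in (batch, channels, height, width) layout and a token sequence in (batch, tokens, channels) layout. The helper names are hypothetical.

```python
import numpy as np

def feature_map_to_tokens(x):
    """(B, C, H, W) -> (B, H*W, C): one token per spatial position."""
    b, c, h, w = x.shape
    return x.reshape(b, c, h * w).transpose(0, 2, 1)

def tokens_to_feature_map(tokens, h, w):
    """(B, H*W, C) -> (B, C, H, W): undo the flattening before the next convolution."""
    b, n, c = tokens.shape
    return tokens.transpose(0, 2, 1).reshape(b, c, h, w)

x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
tokens = feature_map_to_tokens(x)
restored = tokens_to_feature_map(tokens, 4, 5)
```

Token index 7 corresponds to row 1, column 2 of the 4 x 5 feature map, since positions are flattened row by row.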