Training Tricks for Deep Learning Detection Model Competitions
1. Data Augmentation
Offline augmentation: the dataset is processed directly ahead of training, so the number of samples becomes the augmentation factor × the original dataset size. This method is typically used when the dataset is small.
Online augmentation: after a batch is loaded, augmentations such as rotation, translation, and flipping are applied to that batch on the fly. Because some datasets cannot afford a linear growth in size, this method suits large datasets. Many machine-learning frameworks already support it and can offload the computation to the GPU.
Commonly used online augmentations:
- Spatial/geometric transformations: flipping (horizontal and vertical), random cropping, rotation, affine transformation, perspective transformation (four-point perspective), piecewise affine
- Pixel/color transformations: CoarseDropout, SimplexNoiseAlpha, FrequencyNoiseAlpha, ElasticTransformation
- HSV contrast transformation
- RGB color jitter
- Random erasing
- Superpixel method
- Edge detection
- Sharpening and embossing
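As a sketch of the online approach, a minimal batch augmentation in NumPy, assuming a random horizontal flip followed by a random crop (the function names and image sizes here are illustrative, not from the original text):

```python
import numpy as np

def random_flip(img, rng):
    """Horizontally flip the image with probability 0.5."""
    if rng.random() < 0.5:
        return img[:, ::-1]
    return img

def random_crop(img, crop_h, crop_w, rng):
    """Cut a random crop_h x crop_w window out of the image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32, 3))  # a fake batch of 4 RGB images
augmented = np.stack([random_crop(random_flip(img, rng), 28, 28, rng)
                      for img in batch])
print(augmented.shape)  # (4, 28, 28, 3)
```

In a real pipeline these transforms run inside the data loader, so each epoch sees a differently augmented copy of the batch.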
2. Training Strategies
Warmup: early in training the parameters are far from the target, so a large learning rate is generally desirable, but starting with a large learning rate easily causes instability. A learning-rate warmup phase addresses this: start with a small learning rate, then raise it to the normal value once training has stabilized.
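A minimal warmup schedule as a sketch, assuming linear warmup followed by cosine decay (a common pairing; the step counts and base rate are illustrative):

```python
import math

def lr_at_step(step, base_lr=0.1, warmup_steps=500, total_steps=10000):
    """Linear warmup from ~0 to base_lr, then cosine decay back to 0."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps  # warmup phase
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

# The learning rate ramps up during warmup, then decays smoothly.
print(lr_at_step(0), lr_at_step(499), lr_at_step(10000))
```

The decay half can be swapped for step decay or any other schedule; the key idea is only the small-to-large ramp at the start.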
Label smoothing (addressing the shortcomings of one-hot targets)
Problems caused by one-hot targets:
The loss function fits the predicted probabilities to the "true" probabilities, and fitting a hard one-hot target brings two problems:
1) The generalization ability of the model is not guaranteed, and overfitting becomes likely;
2) Pushing the true class toward probability 1 and all other classes toward probability 0 widens the gap between them as far as possible, which is hard to achieve with bounded gradients and makes the model trust its predictions too much.
Label smoothing improves the generalization ability of the model and prevents overfitting to a certain extent.
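A minimal label-smoothing sketch in NumPy. Note the smoothing mass can be split in more than one convention; here the true class gets 1 − ε and the remaining ε is shared uniformly among the other classes:

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Turn integer class labels into smoothed one-hot targets:
    true class -> 1 - eps, every other class -> eps / (num_classes - 1)."""
    targets = np.full((len(labels), num_classes), eps / (num_classes - 1))
    targets[np.arange(len(labels)), labels] = 1.0 - eps
    return targets

t = smooth_labels(np.array([2]), num_classes=5, eps=0.1)
print(t)  # [[0.025 0.025 0.9 0.025 0.025]]
```

Each row still sums to 1, so it remains a valid target distribution for cross-entropy.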
Multi-fold cross-validation (usually 5-fold)
Motivation: normally a validation set is held out to compute the metric, but every sample held out for validation is one less sample for training, and training on it could well improve the metric. K-fold cross-validation solves this nicely: every sample participates in training, and every sample is also validated in some fold.
Prediction methods:
1. Ensemble (fuse) the models trained on all K folds
2. Retrain the best model on the full dataset, then predict
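The fold-splitting step can be sketched in NumPy as follows (a minimal version of what `sklearn.model_selection.KFold` provides):

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

all_val = np.concatenate([val for _, val in kfold_indices(10, k=5)])
print(np.sort(all_val))  # [0 1 2 3 4 5 6 7 8 9]
```

Training one model per fold and averaging their predictions gives method 1 above; method 2 reuses the best fold's configuration on all the data.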
3. Inference Strategies
- NMS
1. NMS (Non-Maximum Suppression)
The same object may be covered by several boxes, but the goal is to keep only one optimal box per object, so non-maximum suppression is used to remove the redundant boxes. Suppression is an iterative traverse-and-eliminate process.
2. Soft-NMS
Rather than bluntly deleting every box whose IoU with the kept box exceeds the threshold, Soft-NMS lowers their confidence scores instead.
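The greedy NMS procedure described above can be sketched in NumPy (box format `[x1, y1, x2, y2]` assumed); a Soft-NMS variant would replace the hard deletion in the last line of the loop with a score decay:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that
    overlap it above the threshold, repeat on the remainder."""
    order = np.argsort(scores)[::-1]  # indices by descending score
    keep = []
    while order.size:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]: box 1 is suppressed by box 0
```

Production code would use a library implementation (e.g. `torchvision.ops.nms`), but the logic is exactly this loop.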
Offline training techniques
1. Data augmentation: increase the variability of the input images to make the detection model more robust.
Including:
A. Photometric: adjusting the brightness, contrast, hue, saturation, and noise of the image
B. Geometric: random scaling, cropping, flipping, and rotation
2. Object occlusion
- Random Erasing, CutOut: randomly select a rectangular region in the image and fill it with zeros or random values
- Hide-and-Seek, GridMask: randomly or uniformly select multiple rectangular regions in the image and fill them with zeros
- On feature maps: DropOut, DropConnect, DropBlock (applied during training, disabled during inference)
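A minimal CutOut-style sketch in NumPy, assuming zero-filling of a single square patch (patch size and image size are illustrative):

```python
import numpy as np

def cutout(img, size, rng):
    """Zero out a random size x size square, clipped to the image bounds."""
    img = img.copy()
    h, w = img.shape[:2]
    cy = rng.integers(0, h)  # random patch center
    cx = rng.integers(0, w)
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    img[y1:y2, x1:x2] = 0
    return img

rng = np.random.default_rng(0)
img = np.ones((32, 32, 3))
out = cutout(img, 8, rng)
print(out.shape, (out == 0).any())  # (32, 32, 3) True
```

GridMask differs only in placing several such patches on a regular grid rather than one at a random location.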
3. Multi-image augmentation
- MixUp: blends two images (and their labels) with a random weight
- CutMix: pastes a rectangular crop of one image into another, mixing the labels in proportion to the area
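A MixUp sketch in NumPy, assuming one-hot labels and a Beta-distributed mixing weight as in the original MixUp formulation:

```python
import numpy as np

def mixup(img_a, img_b, label_a, label_b, alpha=0.2, rng=None):
    """MixUp: convex combination of two images and of their one-hot labels."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)  # mixing weight in (0, 1)
    img = lam * img_a + (1 - lam) * img_b
    label = lam * label_a + (1 - lam) * label_b
    return img, label

rng = np.random.default_rng(0)
a, b = np.zeros((4, 4)), np.ones((4, 4))
la, lb = np.array([1.0, 0.0]), np.array([0.0, 1.0])
img, label = mixup(a, b, la, lb, rng=rng)
print(label.sum())  # 1.0 - the mixed label is still a valid distribution
```

CutMix keeps the same label-mixing idea but replaces the pixel blend with a hard rectangular paste, setting lam to the area fraction kept from the first image.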
4. GAN-based data augmentation
5. Imbalanced data distribution
- OHEM: online hard example mining
- S-OHEM: OHEM with sampling based on the loss distribution
- A-Fast-RCNN: hard examples generated by an adversarial network
- Focal Loss: re-weighting in the loss function
- GHM: gradient harmonizing mechanism for the loss function
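Focal loss is the simplest of these to write down: it scales the cross-entropy by (1 − p_t)^γ so that well-classified (easy) examples contribute almost nothing. A minimal binary version in NumPy, with the standard γ = 2, α = 0.25 defaults:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weight easy examples by (1 - p_t)^gamma."""
    p_t = np.where(y == 1, p, 1 - p)           # prob assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

p = np.array([0.9, 0.9])  # confident prediction...
y = np.array([1, 0])      # ...correct for the first, wrong for the second
loss = focal_loss(p, y)
print(loss)  # easy correct example ~0, confident mistake stays large
```

With γ = 0 and α = 0.5 it reduces (up to a constant) to ordinary cross-entropy.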
6. Relationships between categories
7. Bounding-box regression loss functions
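Item 7 refers to the choice of loss for box regression. As a sketch, the simplest IoU-based loss (1 − IoU), assuming boxes in `[x1, y1, x2, y2]` format; GIoU/DIoU/CIoU add penalty terms to this base:

```python
import numpy as np

def iou_loss(pred, target):
    """IoU loss = 1 - IoU between a predicted and a target box."""
    x1 = max(pred[0], target[0]); y1 = max(pred[1], target[1])
    x2 = min(pred[2], target[2]); y2 = min(pred[3], target[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    return 1.0 - inter / union

print(iou_loss([0, 0, 10, 10], [0, 0, 10, 10]))  # 0.0 for a perfect match
print(iou_loss([0, 0, 10, 10], [5, 5, 15, 15]))  # larger for partial overlap
```

Unlike per-coordinate L1/L2 losses, this directly optimizes the overlap metric the benchmark actually measures.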
Techniques that add a small cost at inference time in exchange for a larger accuracy gain:
1. Enlarging the receptive field
SPP (spatial pyramid pooling), RFB, ASPP
2. Attention
Channel attention (SE), spatial attention (SAM)
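A minimal channel-attention (SE-style) sketch in NumPy: squeeze each channel to one number by global average pooling, run a small two-layer bottleneck, and rescale the channels by the resulting weights. The weight matrices here are random stand-ins for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation: pool -> bottleneck -> per-channel rescale.
    feat: (H, W, C); w1: (C, C//r); w2: (C//r, C)."""
    squeeze = feat.mean(axis=(0, 1))                    # (C,) global avg pool
    excite = sigmoid(np.maximum(squeeze @ w1, 0) @ w2)  # channel weights in (0, 1)
    return feat * excite                                # broadcast over H, W

rng = np.random.default_rng(0)
feat = rng.random((8, 8, 16))
w1, w2 = rng.random((16, 4)), rng.random((4, 16))
out = se_block(feat, w1, w2)
print(out.shape)  # (8, 8, 16), same shape, channels re-weighted
```

Spatial attention (SAM) is the transpose of this idea: pool across channels instead, producing an (H, W) map that rescales each spatial location.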
3. Feature fusion
Skip connections, hyper-column, FPN
4. Activation function
Mish activation function:
Advantages of Mish:
- Unbounded above (positive values can reach any height), which avoids saturation caused by capping; in theory, the slight allowance for negative values permits better gradient flow than ReLU's hard zero boundary.
- As a smooth activation function, it lets information propagate deeper into the network, yielding better accuracy and generalization.
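Mish has a closed form, Mish(x) = x · tanh(softplus(x)), which makes the properties above easy to verify numerically:

```python
import numpy as np

def mish(x):
    """Mish(x) = x * tanh(softplus(x)): smooth, unbounded above,
    bounded below, and slightly negative for small negative inputs."""
    return x * np.tanh(np.log1p(np.exp(x)))  # log1p(exp(x)) = softplus(x)

x = np.array([-5.0, 0.0, 5.0])
print(mish(x))  # small negative value, exactly 0, then approximately x
```

For large positive x the output tracks x (unbounded above), while for large negative x it decays toward 0 instead of being clipped there, as the text describes.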
5. Post-processing method: NMS