Deep Learning Detection Model Competition Training Tricks


One: Data augmentation

Offline augmentation: the dataset is processed directly, so the amount of data becomes the augmentation factor × the size of the original dataset. This method is often used when the dataset is small.

Online augmentation: after a batch of data is obtained, the batch is augmented on the fly with operations such as rotation, translation, and flipping (with the annotations transformed accordingly). Because some datasets are too large to store a linearly grown number of augmented copies, this method is suitable for large datasets. Many machine learning frameworks already support this kind of data augmentation and can use the GPU to accelerate the computation.

Commonly used online augmentations (a minimal pipeline sketch follows this list):

  • Spatial/geometric transformations: flipping (horizontal and vertical), random cropping, rotation, affine transformation, perspective transformation (four-point perspective transform), piecewise affine transformation
  • Pixel/color transformations: CoarseDropout, SimplexNoiseAlpha, FrequencyNoiseAlpha, ElasticTransformation
  • HSV contrast transformation
  • RGB color perturbation
  • Random erasing
  • Superpixels
  • Edge detection
  • Sharpening and embossing
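
As a rough illustration, the sketch below builds an online augmentation pipeline with torchvision; the specific transforms, their parameters, and the FakeData stand-in dataset are assumptions chosen for the example, not prescriptions from the original post.

```python
# Minimal sketch of an online augmentation pipeline (assumed example).
# The transforms run every time a sample is fetched, so the dataset on disk never grows.
import torchvision.transforms as T
from torchvision.datasets import FakeData
from torch.utils.data import DataLoader

train_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),                                          # spatial: flip
    T.RandomRotation(degrees=15),                                           # spatial: rotation
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),                             # spatial: random crop
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05),  # pixel/color jitter
    T.ToTensor(),
])

# FakeData stands in for a real dataset here.
dataset = FakeData(size=128, image_size=(3, 256, 256), transform=train_transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for images, labels in loader:
    pass  # each batch arrives freshly augmented; feed it to the model as usual
```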

Two: Training strategies

Warmup: in the early stage of training the model is still far from the target, so a large learning rate is generally desirable, but a large learning rate can easily make training unstable. A learning-rate warmup phase can therefore be added: start with a smaller learning rate and, once training has stabilized, bring the learning rate back up to the normal schedule.
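
A minimal sketch of a linear warmup schedule is shown below; the warmup length and base learning rate are assumed values, and the toy model only exists to make the snippet runnable.

```python
# Minimal sketch of learning-rate warmup (assumed hyperparameters).
import torch

model = torch.nn.Linear(10, 2)                     # toy model so the snippet runs
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

base_lr = 0.1
warmup_iters = 500                                 # assumed: ramp up over the first 500 iterations

def lr_at(step):
    if step < warmup_iters:
        # Linear warmup: grow from ~0 to base_lr while training is still unstable.
        return base_lr * (step + 1) / warmup_iters
    # After warmup, hand over to the normal (larger) learning-rate schedule.
    return base_lr

for step in range(1000):
    for group in optimizer.param_groups:
        group["lr"] = lr_at(step)
    # ... forward pass, loss.backward(), optimizer.step() as usual
```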

Label smoothing (addresses the shortcomings of one-hot labels)

Problems caused by one-hot labels:
For the loss function, the predicted probabilities are fit to the true probabilities, and fitting a one-hot ground-truth distribution brings two problems:
1) The generalization ability of the model cannot be guaranteed, and over-fitting is easily caused;
2) The target of full probability for the true class and zero probability for every other class encourages the gap between the true class and the other classes to be as wide as possible. Since the gradients are bounded, this target is hard to reach and causes the model to place too much trust in its predictions.

Label smoothing increases the generalization ability of the model and prevents overfitting to a certain extent.
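
A minimal sketch of a label-smoothed cross-entropy loss, assuming the common formulation with a smoothing factor eps (0.1 here is an assumed value):

```python
# Minimal sketch of label smoothing (eps is an assumed hyperparameter).
import torch
import torch.nn.functional as F

def label_smoothing_ce(logits, target, eps=0.1):
    # Replace the one-hot target (1 for the true class, 0 elsewhere) with a softened
    # distribution: roughly (1 - eps) on the true class, eps spread over all classes.
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / num_classes)
    smooth.scatter_(1, target.unsqueeze(1), 1.0 - eps + eps / num_classes)
    return -(smooth * log_probs).sum(dim=-1).mean()

logits = torch.randn(4, 10)
target = torch.tensor([1, 3, 0, 7])
loss = label_smoothing_ce(logits, target)
# Recent PyTorch also exposes this directly: F.cross_entropy(logits, target, label_smoothing=0.1)
```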

K-fold cross-validation (usually 5-fold)

K-fold cross-validation solves the following problem: normally a validation set is held out to compute metrics, but holding out a validation set means one less chunk of data for training, and training on that extra data could very well improve the metric. K-fold cross-validation handles this nicely: every part of the dataset takes part in training and also receives a validation metric (a sketch follows the prediction methods below).

Prediction methods:

1. Fuse (ensemble) the models trained on all K folds
2. Retrain the best model on all of the data, then predict
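
As a rough sketch of both prediction methods, assuming scikit-learn's KFold and a LogisticRegression stand-in for the actual competition model:

```python
# Minimal sketch of 5-fold cross-validation plus fold-ensemble prediction (assumed example).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X = np.random.rand(200, 16)                 # stand-in training features
y = np.random.randint(0, 2, size=200)       # stand-in labels
X_test = np.random.rand(50, 16)             # stand-in test features

kf = KFold(n_splits=5, shuffle=True, random_state=42)
test_preds = []
for train_idx, val_idx in kf.split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    val_score = model.score(X[val_idx], y[val_idx])      # every fold yields a validation metric
    test_preds.append(model.predict_proba(X_test)[:, 1])

# Prediction method 1: fuse all trained folds by averaging their test predictions.
ensemble_pred = np.mean(test_preds, axis=0)
# Prediction method 2 would instead retrain the best setup on all of the data and predict once.
```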

Three: Inference strategies

  • NMS

1. NMS (non-maximum suppression): the same object may be covered by several boxes, and the goal is to keep only one optimal box per object, so non-maximum suppression is used to remove the redundant boxes. The suppression procedure is an iterative process of traversal and elimination.
2. Soft-NMS: instead of crudely deleting every box whose IoU with the kept box exceeds the threshold, lower its confidence score (a minimal sketch follows this list).
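
Below is a minimal single-class sketch contrasting hard NMS with the linear-decay variant of Soft-NMS; the thresholds and the simple NumPy implementation are assumptions for illustration.

```python
# Minimal sketch of Soft-NMS (linear decay) for one class; boxes are [x1, y1, x2, y2].
import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, iou_thresh=0.5, score_thresh=0.001):
    boxes, scores, keep = boxes.copy(), scores.copy(), []
    while len(boxes) > 0:
        best = scores.argmax()                 # iterate: pick the highest-scoring box
        keep.append(boxes[best])
        overlaps = iou(boxes[best], boxes)     # traverse: compare it with the rest
        # Hard NMS would delete boxes with IoU > threshold; Soft-NMS only lowers their scores.
        scores = scores * np.where(overlaps > iou_thresh, 1.0 - overlaps, 1.0)
        mask = np.ones(len(boxes), dtype=bool)
        mask[best] = False
        mask &= scores > score_thresh          # eliminate: drop boxes whose score has decayed away
        boxes, scores = boxes[mask], scores[mask]
    return np.array(keep)

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = soft_nms(boxes, scores)
```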

Offline training techniques

1. Data augmentation: increase the variability of the input images to make the detection model more robust

Including:
A. Photometric processing: adjusting the brightness, contrast, hue, saturation, and noise of the image
B. Geometric processing: random scaling, cropping, flipping, and rotation

2. Object occlusion

  • random erasing, CutOut

Randomly select a rectangular region in the image and fill it with random values or with zeros
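
A minimal sketch using torchvision's RandomErasing (the parameters and the stand-in image are assumptions):

```python
# Minimal sketch of random erasing / CutOut-style occlusion (assumed parameters).
import torchvision.transforms as T
from PIL import Image

img = Image.new("RGB", (224, 224), color=(128, 128, 128))   # stand-in image
augment = T.Compose([
    T.ToTensor(),
    # Pick a random rectangle and fill it with 0; value="random" would fill it with random values.
    T.RandomErasing(p=1.0, scale=(0.02, 0.2), ratio=(0.3, 3.3), value=0),
])
occluded = augment(img)   # tensor with one rectangular region zeroed out
```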

  • hide-and-seek, GridMask

Randomly or evenly select multiple rectangular regions in the image and fill them all with zeros

  • The same idea applied to feature maps

Dropout, DropConnect, DropBlock (used during training, disabled at inference)
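
A simplified DropBlock sketch is given below (the block size, drop probability, and the exact normalization are assumptions following the common formulation); like Dropout, it is only applied in training mode.

```python
# Minimal sketch of DropBlock on a feature map (assumed simplified implementation).
import torch
import torch.nn.functional as F

def drop_block(x, block_size=3, drop_prob=0.1, training=True):
    if not training or drop_prob == 0.0:
        return x                                   # disabled at inference time
    n, c, h, w = x.shape
    # Probability that a position becomes the centre of a dropped block.
    gamma = drop_prob * (h * w) / (block_size ** 2) / ((h - block_size + 1) * (w - block_size + 1))
    centres = (torch.rand(n, c, h, w, device=x.device) < gamma).float()
    # Expand each selected centre into a block_size x block_size square.
    block_mask = F.max_pool2d(centres, kernel_size=block_size, stride=1, padding=block_size // 2)
    keep_mask = 1.0 - block_mask
    # Rescale so the expected activation magnitude stays roughly the same.
    return x * keep_mask * keep_mask.numel() / keep_mask.sum().clamp(min=1.0)

features = torch.randn(2, 8, 16, 16)
out = drop_block(features, block_size=3, drop_prob=0.1, training=True)
```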

3. Augmentation using multiple images together

MixUp blends two images (and their labels) with a mixing weight, while CutMix cuts a rectangular region from one image, pastes it onto another, and mixes the labels in proportion to the area of the region (see the sketch below).
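
Below is a rough sketch of both techniques on a batch, assuming one-hot labels and the usual Beta-distributed mixing weight; the alpha values are assumptions.

```python
# Minimal sketch of MixUp and CutMix (assumed simplified versions, one-hot labels).
import numpy as np
import torch

def mixup(images, labels, alpha=0.2):
    # Blend two images pixel-wise and mix their labels with the same weight.
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    return lam * images + (1 - lam) * images[perm], lam * labels + (1 - lam) * labels[perm]

def cutmix(images, labels, alpha=1.0):
    # Cut a rectangle from a shuffled copy of the batch and paste it onto the originals;
    # labels are mixed in proportion to the pasted area.
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    _, _, h, w = images.shape
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm][:, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, lam * labels + (1 - lam) * labels[perm]

imgs = torch.randn(8, 3, 32, 32)
onehot = torch.eye(10)[torch.randint(0, 10, (8,))]
mixed_imgs, mixed_labels = cutmix(imgs, onehot)
```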

4. GANs used for data augmentation

5. Imbalanced data distribution

  1. OHEM: online hard example mining
  2. S-OHEM: online hard example mining with sampling based on the loss distribution
  3. A-Fast-RCNN: hard examples generated by an adversarial network
  4. Focal Loss: re-weighting the loss function (a sketch follows this list)
  5. GHM: a gradient harmonizing mechanism for the loss function
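
A minimal focal-loss sketch follows, assuming the standard binary (foreground/background) formulation with alpha and gamma hyperparameters:

```python
# Minimal sketch of focal loss (assumed standard binary formulation).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Ordinary BCE per element, then down-weight easy examples by (1 - p_t) ** gamma.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

logits = torch.randn(16, 1)                              # stand-in predictions
targets = torch.randint(0, 2, (16, 1)).float()           # stand-in labels
loss = focal_loss(logits, targets)
```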

6. Category relevance

7. Bounding-box regression loss function

Techniques that add a small amount of inference cost in exchange for a larger accuracy improvement:

1. Increase the receptive field

SPP, RFB, ASPP (spatial pyramid pooling style modules)
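
As a rough illustration of enlarging the receptive field, here is a YOLOv3/v4-style SPP block sketch; the kernel sizes 5/9/13 and channel count are assumptions.

```python
# Minimal sketch of an SPP block (assumed kernel sizes); max-pooling at several scales
# with stride 1 keeps the spatial size while enlarging the receptive field.
import torch
import torch.nn as nn

class SPP(nn.Module):
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        # Concatenate the original features with the pooled versions along the channel axis.
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

features = torch.randn(1, 256, 13, 13)
out = SPP()(features)      # shape: (1, 256 * 4, 13, 13)
```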

2. Attention

Channel attention mechanism (e.g. SE), spatial attention mechanism (e.g. SAM)
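
A minimal squeeze-and-excitation (SE) channel-attention sketch; the reduction ratio of 16 is an assumed value.

```python
# Minimal sketch of an SE (squeeze-and-excitation) channel-attention block.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))               # squeeze: global average pool per channel
        w = self.fc(w).view(n, c, 1, 1)      # excitation: per-channel weight in (0, 1)
        return x * w                         # rescale the feature map channel-wise

out = SEBlock(64)(torch.randn(2, 64, 16, 16))
```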

3. Feature fusion

skip connections, hyper-column, FPN

4. Activation function

Mish activation function:

Advantages of Mish:

  • Unbounded above (positive values can reach any height), which avoids the saturation caused by capping. In theory, the slight allowance for negative values permits better gradient flow, instead of a hard zero boundary as in ReLU.
  • A smooth activation function allows information to flow deeper into the neural network, resulting in better accuracy and generalization.
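
For reference, Mish is simply x · tanh(softplus(x)); a minimal sketch:

```python
# Minimal sketch of the Mish activation: x * tanh(softplus(x)).
import torch
import torch.nn.functional as F

def mish(x):
    # Smooth, unbounded above, and slightly negative for small negative inputs.
    return x * torch.tanh(F.softplus(x))

y = mish(torch.linspace(-5.0, 5.0, steps=11))
# Recent PyTorch versions also ship this as torch.nn.Mish / F.mish.
```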


5. Post-processing method: NMS

