A preliminary study of YOLOv4

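1. Improved data augmentation (Mosaic: the model still recognizes an object even after it "changes its vest", i.e. shows up in a different context)
2. Label smoothing (the target used to be strictly cat or strictly dog; now it is soft, mostly cat with a small share left for dog)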
3. IoU loss improvements. With the traditional IoU, if the two boxes do not intersect the IoU is 0 and the gradient vanishes. GIoU introduces C (the smallest box that encloses both the ground truth and the prediction box), which lets the prediction box move toward the ground-truth box even when there is no overlap. But when one box lies inside the other, GIoU equals IoU no matter where the inner box sits, so DIoU is introduced: it adds the squared distance between the two box centers, divided by the squared diagonal of C, to the formula, and that problem is solved. The CIoU finally used also adds the aspect ratio to the loss function (whether the ground-truth and predicted boxes have the same aspect ratio).
4. Soft-NMS: a box that overlaps too much with a higher-scoring box is not discarded outright; its score is reduced instead, and at the end only boxes whose score still passes the threshold are kept.
5. CSPNet: half of the channels (feature maps) go through the original block, and the other half bypass it; the two halves are then concatenated. The advantages: it is fast, and accuracy improves slightly.
6. CBAM adds an attention mechanism: each feature map gets a weight (its degree of importance), and the points/positions inside a feature map are also weighted by importance. The SAM used in V4 keeps only the positional (spatial) attention, because of the amount of computation.
7. Max pooling and average pooling are removed from the network; the features go directly into Convolution + SAM.
8. PAN: similar to Mask R-CNN. In addition to the top-down path, it adds a bottom-up path with shortcuts between layers. Features are concatenated instead of added.
9. Use Mish instead of ReLU (whose cutoff is too absolute); this increases the amount of computation and improves the results.
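
To make item 2 concrete, here is a minimal sketch of label smoothing in plain Python. The function name `smooth_labels` and the default `eps=0.1` are assumptions chosen for illustration; the post itself does not give a smoothing factor.

```python
def smooth_labels(one_hot, eps=0.1):
    """Mix a one-hot target with a uniform distribution over the classes.

    eps is a hypothetical smoothing factor; eps=0 returns the hard labels.
    """
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


# Two classes, [cat, dog]; the hard label says "cat".
soft = smooth_labels([1.0, 0.0], eps=0.1)
print(soft)
```

With two classes and `eps=0.1`, the hard "cat" target becomes roughly 0.95 cat and 0.05 dog, so the network is never pushed toward absolute certainty.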
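
The progression in item 3 (IoU, GIoU, DIoU, CIoU) can be sketched as one small function. This is an illustrative version, not the YOLOv4 code: boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive width and height, the name `iou_family` is made up here, and the corresponding loss is taken as 1 minus the metric.

```python
import math


def iou_family(a, b):
    """Return IoU, GIoU, DIoU and CIoU for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # C: the smallest box enclosing both boxes.
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    c_area = (cx2 - cx1) * (cy2 - cy1)

    # GIoU penalizes the part of C not covered by the union.
    giou = iou - (c_area - union) / c_area

    # DIoU penalizes squared center distance over squared diagonal of C.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    diag2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    diou = iou - rho2 / diag2

    # CIoU adds an aspect-ratio consistency term v, weighted by alpha.
    v = (4 / math.pi ** 2) * (math.atan((bx2 - bx1) / (by2 - by1))
                              - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    ciou = diou - alpha * v

    return {"iou": iou, "giou": giou, "diou": diou, "ciou": ciou}


# No overlap: IoU stays 0, but GIoU grows as the boxes get closer.
far = iou_family((0, 0, 2, 2), (3, 0, 5, 2))
near = iou_family((0, 0, 2, 2), (2.5, 0, 4.5, 2))
print(far["iou"], far["giou"], near["giou"])

# One box inside the other: GIoU equals IoU at any position, DIoU does not.
corner = iou_family((0, 0, 10, 10), (1, 1, 3, 3))
center = iou_family((0, 0, 10, 10), (4, 4, 6, 6))
print(corner["giou"], center["giou"], corner["diou"], center["diou"])
```

The two usage cases mirror the argument in the text: the first shows why GIoU still gives a useful signal without overlap, and the second shows the nested-box case where only DIoU (and CIoU) can tell the positions apart.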
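
For item 4, a rough sketch of Soft-NMS with Gaussian score decay follows. The helper names, `sigma=0.5`, and `score_thresh=0.001` are assumptions for illustration only; the post does not say which decay function or thresholds are used.

```python
import math


def box_iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes, then filter."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Take the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        # Reduce the others' scores according to their overlap with it.
        candidates = [(b, s * math.exp(-box_iou(box, b) ** 2 / sigma))
                      for b, s in candidates]
        # Only boxes whose reduced score is still high enough stay in play.
        candidates = [(b, s) for b, s in candidates if s >= score_thresh]
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 3))
```

In this toy input the second box overlaps the first heavily, so its score is lowered rather than being deleted; raising `score_thresh` decides whether it survives the final check.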
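
For item 9, Mish is defined as x * tanh(softplus(x)). The scalar sketch below compares it with ReLU; the function names are chosen here for illustration, and a real network would apply the same formula element-wise to tensors.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x):
    """ReLU for comparison: a hard cutoff at zero."""
    return max(0.0, x)


for x in (-3.0, -1.0, 0.0, 1.0, 5.0):
    print(x, relu(x), round(mish(x), 4))
```

Unlike ReLU, Mish lets small negative values pass through instead of cutting them to zero, at the price of evaluating `exp`, `log` and `tanh` for every activation.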

Origin blog.csdn.net/qq_41834780/article/details/110292918