Analysis of the YOLOv4 paper

Although the original author of YOLOv3 withdrew from further work on the YOLO algorithm, others did not stop improving it. In 2020, YOLOv4 was released and was endorsed by the original author. The YOLOv4 paper reports a large number of ablation experiments for comparison, applies many training strategies, and achieves a substantial accuracy improvement over YOLOv3.

Paper download address: https://arxiv.org/pdf/2004.10934

Paper code address: https://github.com/AlexeyAB/darknet


1. Introduction

First, let's look at the AP of YOLOv4 on the COCO dataset: it improves accuracy while maintaining speed. Compared with YOLOv3, the AP is increased by nearly 10%, which shows that YOLOv4 performs very well. Compared with EfficientDet, its accuracy is lower, but it is far faster and can reach real-time detection.

What improvements did YOLOv4 make to achieve such a clear gain? As before, we analyze the network architecture, the training strategy, and the data processing. In terms of network architecture, the backbone is CSPDarknet53, the neck uses PAN feature fusion, and the head is similar to YOLOv3, producing its output with a 1x1 convolution. The training strategy uses BoF (Bag of Freebies) and BoS (Bag of Specials). BoF refers to methods that improve the network without increasing its inference cost, for example data augmentation. BoS refers to changes to the network architecture, adding modules that slightly increase inference cost in exchange for better accuracy.

2. Network Architecture

        2.1 backbone:

Why use CSPDarknet53? The authors compared candidate backbones experimentally. CSPDarknet53 has advantages for detection: it has strong feature extraction ability, a large receptive field, and a large number of parameters, which gives the model strong learning capacity.

In addition, an SPP (spatial pyramid pooling) module is added on top of CSPDarknet53 to further enlarge the receptive field of the network and reduce information loss.
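As a rough illustration of how an SPP block enlarges the receptive field, the sketch below applies stride-1 max pooling with several kernel sizes and concatenates the results with the input along the channel axis. It is a minimal pure-Python sketch, not the Darknet implementation: the function names `max_pool_same` and `spp` are hypothetical, and the kernel sizes 5, 9, and 13 are assumed from the commonly described YOLOv4 configuration.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling on one 2D channel, keeping the spatial size."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Clipping the window at the borders is equivalent to -inf padding.
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled versions."""
    out = list(channels)  # identity branch, kept unchanged
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


# One 4x4 channel with a single strong activation at (1, 1).
fmap = [[0, 0, 0, 0],
        [0, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
pooled = spp([fmap])
print(len(pooled))       # 4 channels: identity + three pooled branches
print(pooled[1][3][3])   # 9: the 5x5 window at the corner now sees (1, 1)
```

Because the stride is 1 and the borders are padded, every branch keeps the input resolution, so the branches can be stacked channel-wise; each output position then carries context from progressively larger neighbourhoods.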


 

        2.2 neck and head:

In the neck, YOLOv4 does not use the FPN from YOLOv3 but instead uses PANet as the feature aggregation method. In the PANet figure, (a) is FPN, and (a) and (b) together form PAN. Compared with FPN, PAN adds one more bottom-up path of downsampling. The head is the same as in YOLOv3.

[Figure: PANet structure, where (a) is FPN and (b) is the added bottom-up path]

Note, however, that the PAN in YOLOv4 fuses features by concatenation (concat) rather than by element-wise addition (add).
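The difference between the two fusion styles can be sketched in a few lines. This is only an illustration under simplifying assumptions: each feature map is modelled as a list of channels, each channel as a flat list of values, and the helper names `fuse_add` and `fuse_concat` are hypothetical.

```python
def fuse_add(a, b):
    """Element-wise sum: both inputs need the same channel count."""
    assert len(a) == len(b), "addition needs matching channel counts"
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def fuse_concat(a, b):
    """Channel-wise concatenation: the channel counts add up."""
    return a + b


# Two feature maps with 2 channels of 4 values each.
top_down = [[1, 2, 3, 4], [5, 6, 7, 8]]
lateral = [[10, 20, 30, 40], [50, 60, 70, 80]]

added = fuse_add(top_down, lateral)
concatenated = fuse_concat(top_down, lateral)
print(len(added), len(concatenated))  # 2 4
```

Addition keeps the channel count fixed but mixes the two sources immediately, while concatenation keeps both sets of channels intact and leaves the mixing to the following convolution, at the cost of a wider input to that layer.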

Together, the backbone and the neck and head described above form the network structure of YOLOv4.

3. Training strategy

        3.1 BoF and BoS

The activation function is Mish, the IoU loss is CIoU, data augmentation uses Mosaic, regularization uses DropBlock, normalization uses CmBN, and label smoothing is applied, among others. Together these make up the series of improvement strategies used in YOLOv4.
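Two of the components listed above have compact closed forms, sketched below in plain Python. Mish is x * tanh(softplus(x)), and the CIoU loss adds a center-distance penalty and an aspect-ratio penalty to 1 - IoU. This is a scalar illustration, not the Darknet code: the function names and the (x1, y1, x2, y2) box format are assumptions, and boxes are assumed to have positive width and height.

```python
import math


def mish(x):
    """Mish activation: x * tanh(ln(1 + e^x)). Assumes moderate x (no overflow guard)."""
    return x * math.tanh(math.log1p(math.exp(x)))


def ciou_loss(box, gt):
    """CIoU loss for two boxes given as (x1, y1, x2, y2)."""
    w, h = box[2] - box[0], box[3] - box[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]

    # Plain IoU.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    iou = inter / (w * h + gw * gh - inter)

    # Squared distance between the two box centers.
    rho2 = ((box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0 for identical boxes
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # above 1: no overlap, distance penalty applies
```

Unlike plain IoU loss, the distance term still gives a useful signal when the boxes do not overlap at all, which is the main motivation for this family of losses.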

 

 

The authors further demonstrate the effectiveness of these strategies through the ablation experiments in the paper. YOLOv4 mainly contributes this combination of strategies, with improvements built on top of YOLOv3.

For more details, I recommend this author's blog post:

Detailed explanation of YOLOv4 network

Origin blog.csdn.net/weixin_44711102/article/details/127816150