Object detection: YOLOv4

Published at CVPR 2020; not written by the author of the original YOLO series.

Compared with YOLOv3 the improvement is relatively large, but compared with YOLOv3-SPP it is relatively small.

1. Network structure improvements over YOLOv3:

1. Introduce the CSP structure: Darknet53 -> CSPDarknet53

According to the authors, the CSP structure serves to:

1) Enhance the learning ability of the CNN;

2) Remove computational bottlenecks;

3) Reduce memory overhead.

The CSP module is as shown below:
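The data flow of a CSP-style stage can be sketched in a few lines. This is a simplified illustration of the idea (split the channels, transform only one part, then concatenate), not the CSPDarknet53 implementation; `csp_stage` and `block` are hypothetical names, and real networks may realize the split differently.

```python
def csp_stage(channels, block):
    """Split the channels, transform one half, and concatenate both halves.

    channels: list of per-channel feature maps (any objects).
    block:    hypothetical stand-in for the stack of residual blocks.
    """
    half = len(channels) // 2
    shortcut, main = channels[:half], channels[half:]
    # Only `main` goes through the expensive block; `shortcut` is passed along.
    return shortcut + block(main)


# Toy usage: each "channel" is a number and the block doubles it.
out = csp_stage([1, 2, 3, 4], lambda part: [2 * c for c in part])
```

Because only part of the channels passes through the block, the block itself does less work, which is the intuition behind the reduced computation and memory listed above.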

2. Introduce the SPP structure: it addresses the multi-scale problem, the same as in YOLOv3-SPP.
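As a rough sketch of what an SPP block does, the pure-Python code below max-pools each channel with several window sizes at stride 1 and concatenates the results with the input along the channel axis. The kernel sizes 5, 9 and 13 are an assumption taken from common YOLOv3-SPP configurations, and `max_pool_same` and `spp` are illustrative names.

```python
def max_pool_same(channel, k):
    """Stride-1 max pooling that keeps the spatial size (borders are clipped)."""
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [channel[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(max(window))
        out.append(row)
    return out


def spp(feature_map, kernels=(5, 9, 13)):
    """feature_map: list of channels, each a 2D list. Returns 4x the channels."""
    out = list(feature_map)  # the un-pooled input is kept as the first branch
    for k in kernels:
        out += [max_pool_same(c, k) for c in feature_map]
    return out
```

Each branch sees a different receptive field while the spatial size stays the same, so the branches can be concatenated directly.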

3. Introduce the PAN structure:

In the original PANet, features are fused by element-wise addition, whereas YOLOv4 concatenates them along the channel dimension instead.
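The difference between the two fusion styles shows up in the output shape. The sketch below uses nested lists as stand-in feature maps; `fuse_add` and `fuse_concat` are hypothetical helper names.

```python
def fuse_add(a, b):
    """Element-wise addition: both inputs need the same number of channels."""
    assert len(a) == len(b)
    return [[[pa + pb for pa, pb in zip(ra, rb)]
             for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def fuse_concat(a, b):
    """Channel concatenation: the channel counts add up."""
    return a + b


a = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]   # 2 channels, 2x2 each
b = [[[3, 3], [3, 3]], [[4, 4], [4, 4]]]   # 2 channels, 2x2 each
added = fuse_add(a, b)        # still 2 channels
stacked = fuse_concat(a, b)   # 4 channels
```

Addition keeps the channel count fixed, while concatenation preserves both inputs unchanged and leaves the mixing to the following convolution.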

4. The overall network structure of CSPDarknet53

2. Optimization strategy improvements over YOLOv3:

1. Eliminate grid sensitivity:

When the center point of the target falls exactly on a grid line, σ(tx) or σ(ty) must be 0 or 1, which requires tx or ty to be ±∞. A network usually cannot output such extreme values. The authors' fix is to multiply the sigmoid output by a scaling factor, usually 2, as shown in the figure below:

With a scaling factor of 2 the offset becomes 2σ(t) - 0.5, so its range expands to between -0.5 and 1.5. To cover the range from 0 to 1, tx and ty only need to lie between the dotted lines in the figure below, which are finite values.
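A small numeric check makes the effect concrete. The sketch below assumes the scaled decoding has the form b = scale * σ(t) - (scale - 1) / 2 + c, which yields the -0.5 to 1.5 range for scale = 2; `decode_center` is an illustrative name, not a function from any YOLO codebase.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_center(t, cell_index, scale=2.0):
    """Return the predicted center coordinate in grid units.

    scale=1.0 reproduces the YOLOv3 decoding  b = sigmoid(t) + c.
    scale=2.0 gives the scaled decoding       b = 2 * sigmoid(t) - 0.5 + c.
    """
    offset = scale * sigmoid(t) - 0.5 * (scale - 1.0)
    return offset + cell_index


# With scale=2, an offset of exactly 0 needs sigmoid(t) = 0.25, i.e. t = -ln(3).
t_on_grid_line = -math.log(3.0)
on_line = decode_center(t_on_grid_line, cell_index=3)
```

With `scale=1.0` the same offset of 0 would only be approached as t goes to -∞, which is the sensitivity problem described above.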

2. Mosaic data augmentation: the same as in YOLOv3-SPP. Several images are stitched together into one image, which is then fed to the network for training.
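A heavily simplified sketch of the stitching step is shown below. It assumes exactly four images of equal size tiled in a fixed 2x2 layout; actual Mosaic implementations typically also use random placement, scaling and cropping, which are omitted here. `mosaic_2x2` is a hypothetical name, and boxes are `(x1, y1, x2, y2)` tuples in pixels.

```python
def mosaic_2x2(images, boxes_per_image):
    """Tile four equally sized 2D images and shift their boxes accordingly."""
    h, w = len(images[0]), len(images[0][0])
    canvas = [[0] * (2 * w) for _ in range(2 * h)]
    merged_boxes = []
    # (dx, dy) of each tile: top-left, top-right, bottom-left, bottom-right
    offsets = [(0, 0), (w, 0), (0, h), (w, h)]
    for img, boxes, (dx, dy) in zip(images, boxes_per_image, offsets):
        for y in range(h):
            canvas[dy + y][dx:dx + w] = img[y]
        merged_boxes += [(x1 + dx, y1 + dy, x2 + dx, y2 + dy)
                         for x1, y1, x2, y2 in boxes]
    return canvas, merged_boxes


imgs = [[[v, v], [v, v]] for v in (1, 2, 3, 4)]       # four 2x2 "images"
boxes = [[(0, 0, 1, 1)], [], [], [(0, 0, 2, 2)]]
canvas, merged = mosaic_2x2(imgs, boxes)
```

The key point is that the labels have to be moved together with the pixels, so each box is offset by the position of its tile.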

3. Positive sample IOU threshold:

In the YOLOv3 method described previously, positive samples were assigned as follows. Each GT box is compared with every anchor template: the GT and the anchor template are aligned at their upper-left corners and their IoU is computed. Every anchor template whose IoU exceeds a threshold, for example IoU > 0.3, counts as a match. In the figure, only the second template satisfies the condition. Next, the GT is mapped onto the grid (that is, the prediction feature layer), and anchor template 2 of the grid cell that contains the GT center point becomes a positive sample. If the IoU between the GT and several anchor templates exceeds the threshold, all of those anchor templates in that grid cell are treated as positive samples. This increases the number of positive samples, and in practice it has been found to work better.
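The matching rule above can be sketched as follows. When two boxes share their upper-left corner, the IoU depends only on widths and heights. The anchor sizes in the usage example are made-up values in grid units, and the function names are illustrative.

```python
def anchor_iou(gt_wh, anchor_wh):
    """IoU of two boxes whose upper-left corners coincide."""
    gw, gh = gt_wh
    aw, ah = anchor_wh
    inter = min(gw, aw) * min(gh, ah)
    union = gw * gh + aw * ah - inter
    return inter / union


def positive_samples(gt_cx, gt_cy, gt_wh, anchors, iou_thresh=0.3):
    """Return (cell, anchor_index) pairs for one GT box.

    gt_cx, gt_cy: GT center in grid units; the containing cell is the
    integer part of each coordinate.
    """
    cell = (int(gt_cx), int(gt_cy))
    return [(cell, i) for i, a in enumerate(anchors)
            if anchor_iou(gt_wh, a) > iou_thresh]


anchors = [(1, 1), (4, 4), (16, 16)]   # hypothetical anchor templates
samples = positive_samples(2.3, 5.7, (4, 5), anchors)
```

Here only the second template passes the threshold (IoU 0.8), so the cell containing the GT center yields one positive sample; a lower threshold or more similar anchors would yield several.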

In YOLOv4, anchor template 2 (AT2) of all three grid cells in the figure below is treated as a positive sample. Because of the grid-sensitivity fix, each cell can now predict centers between -0.5 and 1.5 relative to its own upper-left corner, so the center point of the current GT also falls within the range of the grid cells above and to the left. This further increases the number of positive samples.

Five more cases:
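One way to enumerate the extra cells is sketched below. It assumes the rule follows directly from the -0.5 to 1.5 range: a neighboring cell can also predict the GT when the center lies within 0.5 of the shared border. `positive_cells` is a hypothetical helper, and implementations may treat the exact-0.5 and image-border cases differently.

```python
def positive_cells(gx, gy, grid_w, grid_h):
    """Return the (col, row) cells that may predict a GT centered at (gx, gy).

    gx, gy are in grid units. The containing cell is always included; one
    horizontal and one vertical neighbor are added when the center lies in
    the half of the cell closest to them.
    """
    col, row = int(gx), int(gy)
    fx, fy = gx - col, gy - row          # position inside the cell, in [0, 1)
    cells = [(col, row)]
    if fx < 0.5 and col > 0:
        cells.append((col - 1, row))     # left neighbor reaches up to col + 0.5
    elif fx > 0.5 and col < grid_w - 1:
        cells.append((col + 1, row))     # right neighbor reaches down to col + 0.5
    if fy < 0.5 and row > 0:
        cells.append((col, row - 1))     # upper neighbor
    elif fy > 0.5 and row < grid_h - 1:
        cells.append((col, row + 1))     # lower neighbor
    return cells


upper_left_case = positive_cells(2.3, 5.2, 8, 8)
lower_right_case = positive_cells(2.7, 5.8, 8, 8)
```

A center in the upper-left quarter of its cell adds the left and upper neighbors, matching the three-cell example above; other positions select other neighbors, and a center exactly in the middle of a cell adds none.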

4. Anchor template optimization

The anchor templates were re-optimized for a 512×512 input size. I do not know whether an even better set exists.

5. Using CIOU

Same as in YOLOV3-SPP.
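For reference, CIoU extends IoU with a center-distance term and an aspect-ratio term: CIoU = IoU - ρ²/c² - αv, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, v = (4/π²)(arctan(w_gt/h_gt) - arctan(w/h))² and α = v / (1 - IoU + v). The sketch below computes it for two `(x1, y1, x2, y2)` boxes; the function name and box format are choices made for this example.

```python
import math


def ciou(box_a, box_b):
    """Complete IoU of two (x1, y1, x2, y2) boxes with positive width/height."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    iou = inter / (aw * ah + bw * bh - inter)

    # Squared center distance over squared diagonal of the enclosing box
    rho2 = ((ax1 + ax2 - bx1 - bx2) / 2) ** 2 + ((ay1 + ay2 - by1 - by2) / 2) ** 2
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan(bw / bh) - math.atan(aw / ah)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return iou - rho2 / c2 - alpha * v


same = ciou((0, 0, 2, 2), (0, 0, 2, 2))
apart = ciou((0, 0, 1, 1), (2, 0, 3, 1))
```

The regression loss is then 1 - CIoU. Unlike plain IoU, the value still changes when the boxes do not overlap, so the loss keeps providing a signal in that case.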

Origin blog.csdn.net/wanchengkai/article/details/124512263