Foreword
I had not even finished reading the YOLOv7 and YOLOX papers carefully before YOLOv8 came out. This field moves too fast!
YOLOv8 is the next major update to YOLOv5, open-sourced by Ultralytics on January 10, 2023.
GitHub address: github.com/ultralytics/ultralytics
YOLOv8 is an update on YOLOv5, so this article mainly compares the differences between the two:
YOLOv5 architecture:
Architecture features:
1. Backbone
Built mainly on the CSP (Cross Stage Partial) idea of splitting the gradient flow; most of its blocks are CBS (Conv + BatchNorm + SiLU) and C3 modules.
2. PAN/FPN
A dual-stream (top-down plus bottom-up) FPN, again made up mostly of CBS and C3 modules.
3. Head
Coupled head + anchor-based. (The term "coupled head" was introduced by YOLOX as the counterpart of "decoupled head"; the difference between the two is explained later.)
4. Positive and negative sample assignment strategy
A static assignment strategy.
5. Loss
BCE loss is used for classification and CIoU loss for box regression. There is also a confidence (objectness) loss, again BCE, for whether an object is present.
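To make the two loss terms above concrete, here is a minimal pure-Python sketch of BCE for a single probability and of CIoU for a single pair of boxes. The function names `bce` and `ciou` and the `(x1, y1, x2, y2)` box format are choices made for this illustration; it follows the textbook CIoU definition and is not copied from the YOLOv5 code, which works on batched tensors.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def ciou(box1, box2, eps=1e-7):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box1
    X1, Y1, X2, Y2 = box2
    w1, h1 = x2 - x1, y2 - y1
    w2, h2 = X2 - X1, Y2 - Y1
    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, X2) - min(x1, X1)
    ch = max(y2, Y2) - min(y1, Y1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between the two box centers.
    rho2 = ((x1 + x2 - X1 - X2) ** 2 + (y1 + y2 - Y1 - Y2) ** 2) / 4
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(w2 / (h2 + eps)) - math.atan(w1 / (h1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v

# The regression loss is then 1 - CIoU.
loss = 1 - ciou((0, 0, 2, 2), (1, 1, 3, 3))
```

Identical boxes give a CIoU close to 1 (loss close to 0), while boxes that do not overlap give a negative CIoU, so the loss still carries a signal from the center-distance term.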
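Supplement: the difference between SPP and SPPF in YOLOv5.
Function: max pooling with 1×1, 5×5, 9×9, and 13×13 kernels (stride 1, with padding) all produce feature maps of the same size as the input (13×13 in this example), so the original size is unchanged. The purpose is to fuse local and global features.
Difference: SPPF reduces the amount of computation. SPP needs three MaxPool2d layers with different kernel sizes running in parallel, whereas SPPF needs only one 5×5 MaxPool2d, applied three times in series with each output fed into the next. This is a bit like the idea behind fast exponentiation: intermediate results are reused instead of recomputed.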
See details: https://zhuanlan.zhihu.com/p/584153158
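The equivalence behind SPPF can be checked with a small pure-Python sketch. The helpers `max_pool`, `spp`, and `sppf` are written for this illustration only; the real modules also have 1×1 convolutions before and after the pooling and concatenate the results along the channel axis, which is left out here.

```python
import random

def max_pool(x, k):
    """Stride-1 max pooling with 'same' padding on a 2-D list (padding acts as -inf)."""
    h, w, r = len(x), len(x[0]), k // 2
    return [[max(x[i][j]
                 for i in range(max(0, a - r), min(h, a + r + 1))
                 for j in range(max(0, b - r), min(w, b + r + 1)))
             for b in range(w)]
            for a in range(h)]

def spp(x):
    # SPP: three parallel pools with kernels 5, 9, and 13.
    return [x, max_pool(x, 5), max_pool(x, 9), max_pool(x, 13)]

def sppf(x):
    # SPPF: one 5x5 pool reused three times in series.
    y1 = max_pool(x, 5)
    y2 = max_pool(y1, 5)
    y3 = max_pool(y2, 5)
    return [x, y1, y2, y3]

random.seed(0)
feature_map = [[random.random() for _ in range(13)] for _ in range(13)]
same = spp(feature_map) == sppf(feature_map)
```

Two stacked 5×5 pools cover a 9×9 window and three cover a 13×13 window, so both functions return the same four maps. Counting naive window sizes, the serial version touches 3 × 25 = 75 values per output position, against 25 + 81 + 169 = 275 for the parallel version.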
YOLOv8 architecture:
Architecture features:
1. Backbone
Same: the CSP (gradient-flow splitting) idea, and the SPPF module is still used.
Different: the C3 module is replaced with the C2f module.
2. PAN-FPN
A dual-stream FPN, as in YOLOv5 (where most blocks are CBS and C3 modules).
Same: the PAN idea.
Different: the 1×1 CBS convolutions in the upsampling stage of the YOLOv5 PAN-FPN are removed, and the C3 modules are replaced with C2f modules.
3. Head
Decoupled head + anchor-free.
4. Positive and negative sample assignment strategy
Dynamic matching with TAL (Task Alignment Learning) is adopted.
5. Loss
Same: BCE loss is still used for classification.
Different: (1) the object confidence (objectness) loss is dropped; (2) the regression branch uses CIoU loss + DFL (Distribution Focal Loss).
For a description of DFL, see: https://zhuanlan.zhihu.com/p/147691786
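The TAL matching from item 4 can be sketched as follows. Task-aligned assignment ranks candidate predictions for a ground-truth box by an alignment metric t = s^alpha · u^beta, where s is the predicted score for the ground-truth class and u is the IoU with the ground-truth box, and keeps the top-k as positives. The function name `task_aligned_topk` is hypothetical, and the default values of `topk`, `alpha`, and `beta` are assumptions for illustration; check the Ultralytics source for the values actually used. Filtering candidates to those whose centers lie inside the ground-truth box is omitted.

```python
def task_aligned_topk(scores, ious, topk=10, alpha=0.5, beta=6.0):
    """Rank candidates for one ground-truth box by t = s**alpha * u**beta.

    scores: predicted probability of the ground-truth class, one per candidate.
    ious:   IoU between each candidate's predicted box and the ground-truth box.
    Returns the indices of the top-k candidates and the full list of metrics.
    """
    metric = [s ** alpha * u ** beta for s, u in zip(scores, ious)]
    order = sorted(range(len(metric)), key=lambda i: metric[i], reverse=True)
    return order[:topk], metric

# Three candidates: a confident but poorly localized one, and two well localized ones.
positives, metric = task_aligned_topk([0.9, 0.2, 0.6], [0.3, 0.9, 0.8], topk=2)
```

Because the metric depends on the current predictions, the set of positives changes as training progresses, which is what makes the assignment dynamic rather than static.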
The difference between the inference processes of the two
The inference process of YOLOv8 is almost the same as that of YOLOv5. The only difference is that the bbox, which is in the integral (distribution) representation used by Distribution Focal Loss, must first be decoded into a conventional 4-dimensional bbox. The subsequent computation is the same as in YOLOv5.
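A minimal sketch of that decoding step, assuming each of the four box sides (left, top, right, bottom) is predicted as logits over 16 discrete distance bins. The bin count, the function names `softmax` and `decode_dfl`, and the anchor-point and stride handling are all assumptions for this illustration rather than the Ultralytics implementation.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def decode_dfl(side_logits, anchor_xy, stride):
    """Turn four per-side bin distributions into an (x1, y1, x2, y2) box.

    side_logits: four lists of bin logits, ordered left, top, right, bottom.
    anchor_xy:   anchor point (cx, cy) in feature-map units.
    stride:      downsampling factor of this feature level.
    """
    # The "integral": each distance is the expected bin index under the softmax.
    left, top, right, bottom = [
        sum(i * p for i, p in enumerate(softmax(logits))) for logits in side_logits
    ]
    cx, cy = anchor_xy
    # Distances are measured from the anchor point, then scaled to image pixels.
    return [(cx - left) * stride, (cy - top) * stride,
            (cx + right) * stride, (cy + bottom) * stride]

# Uniform logits over 16 bins give an expected distance of 7.5 on every side.
box = decode_dfl([[0.0] * 16] * 4, anchor_xy=(10.5, 10.5), stride=8)
```

Once the box is in this ordinary 4-value form, the rest of the pipeline (score thresholding, NMS) proceeds as it does for YOLOv5.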
What is the difference between a coupled head and a decoupled head?
Reference:
https://www.cnblogs.com/chentiao/p/16420907.html
The difference:
With a coupled head, the network directly outputs a tensor of shape (1, 85, 80, 80).
With a decoupled head, the network splits into a regression branch and a classification branch, whose outputs are finally concatenated to give the same shape (1, 85, 80, 80).
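The shape bookkeeping can be written out as a small sketch. It assumes the 85 channels break down as 4 box values + 1 objectness score + 80 class scores (the COCO class count), in the YOLOX style; the function names are hypothetical, and only shapes are tracked, not real tensors.

```python
def coupled_head_shape(batch, height, width, num_classes=80):
    # One convolution predicts box, objectness, and classes together.
    return (batch, 4 + 1 + num_classes, height, width)

def decoupled_head_shapes(batch, height, width, num_classes=80):
    # Separate branches, each with its own convolutions.
    reg = (batch, 4, height, width)            # box regression
    obj = (batch, 1, height, width)            # objectness
    cls = (batch, num_classes, height, width)  # classification
    # Concatenating along the channel axis restores the coupled layout.
    out = (batch, reg[1] + obj[1] + cls[1], height, width)
    return {"reg": reg, "obj": obj, "cls": cls, "out": out}

coupled = coupled_head_shape(1, 80, 80)
decoupled = decoupled_head_shapes(1, 80, 80)
```

Both heads end at the same output shape; what differs is that the decoupled head gives the regression and classification tasks separate features before that final concatenation.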
Why use a decoupled head?
With a coupled head, the output channels carry the classification task and the regression task together, and these two tasks conflict. (The paper says there is a conflict, but I don't fully understand why; my own view is that the conflict comes from the loss functions.)
Experiments show that after switching to a decoupled head, not only does the model's accuracy improve, but the network also converges faster. In other words, a decoupled head has better expressive power.
The comparison curves for the coupled head and the decoupled head are as follows: