[Object detection] A summary of the general object detection framework

Link to the notes: personal summary of the object detection framework
(PS: I recommend reading the notes through the link, because I am lazy and don't want to lay them out again on the blog.)

The notes contain the following:
Summary of the object detection network framework (figure from the YOLOv4 paper)
As neural networks have grown deeper and gained more modules, object detection networks can be roughly divided into two-stage detectors (typified by the R-CNN series of papers) and one-stage detectors (typified by the YOLO series of papers). Each network can be broken down into the structure shown in the following figure:
Object detection framework diagram
Input: image, image pyramid, etc.
Backbone: extracts feature maps from the image. Common choices are VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53, etc.
Neck:
    Additional blocks: SPP, ASPP, RFB, SAM
    Path-aggregation (feature fusion) blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM
Head:
    Dense prediction (one-stage):
        RPN, SSD, YOLO, RetinaNet (anchor-based)
        CornerNet, CenterNet, MatrixNet, FCOS (anchor-free)
    Sparse prediction (two-stage):
        Faster R-CNN, R-FCN, Mask R-CNN (anchor-based)
        RepPoints (anchor-free)
The above is basically the overall structure of an object detector. Improvements to detection networks are generally made to one of these modules.
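As a rough illustration of how the stages compose, here is a minimal pure-Python sketch. Every name in it (Detector, avg_pool2, toy_backbone, toy_neck, toy_dense_head) is hypothetical and made up for this example. The toy functions only stand in for real networks: the backbone is plain 2x2 average pooling, the neck is an FPN-like top-down addition, and the head emits one raw score per cell in the spirit of one-stage dense prediction. It is not how any of the detectors listed here is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

FeatureMap = List[List[float]]                 # toy 2D feature map (rows of floats)
Prediction = Tuple[int, int, int, float]       # (level, row, col, score)


def avg_pool2(x: FeatureMap) -> FeatureMap:
    """Halve the spatial resolution by averaging each 2x2 block."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [
        [(x[2 * i][2 * j] + x[2 * i][2 * j + 1]
          + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
         for j in range(w)]
        for i in range(h)
    ]


def toy_backbone(image: FeatureMap) -> List[FeatureMap]:
    """Stand-in for a real backbone: returns feature maps at two scales."""
    fine = avg_pool2(image)
    coarse = avg_pool2(fine)
    return [fine, coarse]


def toy_neck(features: List[FeatureMap]) -> List[FeatureMap]:
    """FPN-like top-down fusion: nearest-neighbour upsample the coarse map
    and add it to the fine map."""
    fine, coarse = features
    fused = [
        [fine[i][j] + coarse[i // 2][j // 2] for j in range(len(fine[0]))]
        for i in range(len(fine))
    ]
    return [fused, coarse]


def toy_dense_head(features: List[FeatureMap]) -> List[Prediction]:
    """One-stage style dense prediction: one output per cell on every level."""
    return [
        (level, i, j, value)
        for level, fmap in enumerate(features)
        for i, row in enumerate(fmap)
        for j, value in enumerate(row)
    ]


@dataclass
class Detector:
    """A detector as three swappable stages: backbone -> neck -> head."""
    backbone: Callable[[FeatureMap], List[FeatureMap]]
    neck: Callable[[List[FeatureMap]], List[FeatureMap]]
    head: Callable[[List[FeatureMap]], List[Prediction]]

    def __call__(self, image: FeatureMap) -> List[Prediction]:
        return self.head(self.neck(self.backbone(image)))


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]      # 4x4 toy "image"
    detector = Detector(toy_backbone, toy_neck, toy_dense_head)
    for prediction in detector(image):
        print(prediction)
```

Swapping any one callable, for example replacing toy_neck with a different fusion scheme, leaves the other two stages untouched, which is the sense in which improvements are usually made module by module.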
Follow-up posts will go through each module in more detail.

Origin blog.csdn.net/Leomn_J/article/details/112462743