Just my own study notes: a condensed summary of the Fast RCNN paper.
Summary
- Fast RCNN builds on RCNN and improves it in two ways: it is faster and more accurate.
- With the VGG16 network, Fast RCNN trains 9 times faster than RCNN and is 213 times faster at test time.
Introduction
- Object detection is challenging because objects must be localized accurately, which means (1) generating many candidate boxes (proposals) and (2) refining their rough locations. Solutions to these problems tend to compromise speed, accuracy, or simplicity.
- Disadvantages of RCNN: training is a multi-stage pipeline, training is expensive in time and disk space, and detection is slow.
- Disadvantages of SPPnet: training is also multi-stage, and fine-tuning cannot update the convolutional layers that precede the spatial pyramid pooling, which limits accuracy.
- Advantages of Fast RCNN: higher mAP than RCNN and SPPnet; single-stage training; training can update all network layers; no disk storage is needed for caching features.
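- Processing steps: feed the entire image and its object proposals (RoIs) into the convolutional network -> RoI pooling on the resulting feature map -> fixed-length feature vector per RoI -> fully connected layers -> two outputs: softmax class probabilities and bbox regression (4 bounding-box values per class).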
- RoI pooling layer: uses max pooling to convert the features inside each RoI into a feature map of fixed size (e.g. 7×7), regardless of the size of the RoI.
- Initialization from pre-trained networks: the paper experiments with three networks pre-trained on ImageNet, each with 5 max pooling layers and between 5 and 13 convolutional layers.
- A pre-trained network undergoes three changes when it initializes Fast RCNN:
(1) The last max pooling layer is replaced by an RoI max pooling layer.
(2) The last fully connected layer and softmax are replaced by two sibling branches: a softmax classifier and a bounding-box regressor.
(3) The network takes two inputs: the images and the RoIs in those images.
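To make the RoI pooling step and the two-branch output concrete for myself, here is a minimal pure-Python sketch. It is not the paper's implementation. The assumptions are mine: a single-channel feature map, RoI coordinates already given in feature-map cells, random weights, and no intermediate fully connected layers. The names `roi_max_pool`, `linear`, `softmax`, and `detect_head` are hypothetical helpers invented for this illustration.

```python
import math
import random


def roi_max_pool(feature_map, roi, out_h=7, out_w=7):
    """Max-pool one RoI of a 2-D feature map into a fixed out_h x out_w grid.

    feature_map: list of rows (single channel, an assumption for brevity).
    roi: (r0, c0, r1, c1), inclusive corners in feature-map cells.
    """
    r0, c0, r1, c1 = roi
    h, w = r1 - r0 + 1, c1 - c0 + 1
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [floor(i*h/out_h), ceil((i+1)*h/out_h)),
        # which is never empty, even when the RoI is smaller than the grid.
        rs = r0 + (i * h) // out_h
        re = r0 + ((i + 1) * h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ((j + 1) * w + out_w - 1) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


def linear(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def detect_head(feature_vec, cls_w, cls_b, box_w, box_b, num_classes):
    """Two sibling branches on top of one RoI feature vector."""
    # Branch 1: probabilities over num_classes + 1 (index 0 = background).
    probs = softmax(linear(feature_vec, cls_w, cls_b))
    # Branch 2: 4 box values for each of the num_classes object classes.
    flat = linear(feature_vec, box_w, box_b)
    boxes = [flat[4 * k:4 * k + 4] for k in range(num_classes)]
    return probs, boxes


# Toy run: a 15x20 feature map, one RoI, K = 3 object classes.
rng = random.Random(0)
feature_map = [[rng.random() for _ in range(20)] for _ in range(15)]
pooled = roi_max_pool(feature_map, roi=(2, 3, 11, 18))
feature_vec = [v for row in pooled for v in row]  # 7 * 7 = 49 values

K = 3
n = len(feature_vec)
cls_w = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(K + 1)]
cls_b = [0.0] * (K + 1)
box_w = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(4 * K)]
box_b = [0.0] * (4 * K)

probs, boxes = detect_head(feature_vec, cls_w, cls_b, box_w, box_b, K)
print(len(probs), "class probabilities,", len(boxes), "boxes of 4 values")
```

Whatever the size of the RoI, `roi_max_pool` returns the same 7×7 grid, which is why the layers after it can have a fixed input size. In the real network the pooling is applied to every channel, and the pooled features pass through fully connected layers before the two branches.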