How YOLO works
- YOLO feeds the input image through a multi-layer convolutional network to extract image features, then directly regresses each target's box coordinates and class at the output layer. Finally, overlapping target boxes are removed with non-maximum suppression (NMS).
- Faster-RCNN differs: after extracting the feature map, it scans the map with candidate boxes to find target features, then passes these to two sub-networks. One regresses an offset and scale for the candidate box to obtain the target coordinates; the other classifies the target. Finally, overlapping target boxes are removed with NMS.
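Both detectors end with the same NMS step, which can be sketched as follows. This is a minimal greedy version, assuming boxes in (x1, y1, x2, y2) form and an IoU threshold of 0.5; the function names, threshold, and sample boxes are illustrative and not taken from either detector's code.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Here the second box is suppressed because it overlaps the higher-scoring first box, while the third box survives.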
YOLO advantages and disadvantages
Advantages:
- The network is simple and intuitive, and easy to implement.
- Fast detection and training speed.
- High accuracy on large objects.
Disadvantages:
- Because classification is output directly from the feature map, and the feature map has low resolution, the detection rate is low for highly overlapping objects and for small, dense targets.
YOLO V1
Network architecture:
- The network input is an image of size (448 * 448 * 3); after convolution and max pooling layers, it yields a (7 * 7 * 1024) feature map.
- This passes through a fully connected layer of 4096 units and is converted into a (7 * 7 * 30) result.
- The (7 * 7 * 30) result corresponds to 7 * 7 = 49 feature points, so at most 49 targets can be detected in an image. The 30 values per point comprise two parts: two target boxes, each with coordinates (x, y, w, h) and a confidence, plus 20 class scores, hence (1 + 4) * 2 + 20 = 30.
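The output arithmetic above can be checked with a short sketch. The per-cell memory layout used here (two boxes first, then class scores) is an assumption for illustration, and the tensor is random stand-in data rather than a real network output.

```python
import numpy as np

S, B, C = 7, 2, 20          # grid size, boxes per cell, classes (from the text)
depth = B * (4 + 1) + C     # (1 + 4) * 2 + 20

# Random stand-in for a network output; real values would come from the model.
rng = np.random.default_rng(0)
output = rng.random((S, S, depth))

# Assumed layout per cell: [box1 (x, y, w, h, conf), box2 (...), 20 class scores].
boxes = output[..., : B * 5].reshape(S, S, B, 5)
class_scores = output[..., B * 5 :]

# One prediction per cell: the more confident of the two boxes plus the top class.
best_box = boxes[..., 4].argmax(axis=-1)    # shape (7, 7)
best_class = class_scores.argmax(axis=-1)   # shape (7, 7)
max_targets = S * S                         # upper bound on detected targets
```

Because each cell contributes a single class decision, the 49-target ceiling follows directly from the grid size.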
Disadvantages:
- Low detection rate for densely packed targets.
- Target boxes are not precise; there is some deviation.
- Poor generalization to objects with large differences in aspect ratio.
YOLO V2
Improvements:
- Instead of regressing the target coordinates (x, y, w, h) directly, candidate box sizes are set in advance: each feature point on the feature map corresponds to 3 candidate boxes of different sizes and shapes, and the network regresses offsets relative to these candidate boxes. This handles targets with large aspect-ratio differences better, increases the precision of the target box, and speeds up training.
- The image feature extraction network is replaced with Darknet-19, improving training speed.
- The input image size is changed from (448 * 448 * 3) to (416 * 416 * 3).
- A (13 * 13) feature map is extracted. In addition, the earlier (26 * 26) feature map is converted into a (13 * 13) feature map and used for predicting smaller objects.
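Regressing offsets relative to a candidate box can be sketched as below. This follows the parameterization commonly given for YOLO V2 (sigmoid on the centre offsets, exponential on the size offsets), which the text above does not spell out, so treat it as an assumption; `decode_box` and the candidate box size are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(t, cell_xy, anchor_wh, grid_size=13):
    """Turn raw offsets t = (tx, ty, tw, th) into a box in [0, 1] image units.

    cell_xy is the feature point's column/row on the grid, and anchor_wh is the
    candidate box size measured in grid cells (both assumed conventions).
    """
    tx, ty, tw, th = t
    cx, cy = cell_xy
    pw, ph = anchor_wh
    bx = (sigmoid(tx) + cx) / grid_size   # centre stays inside its own cell
    by = (sigmoid(ty) + cy) / grid_size
    bw = pw * np.exp(tw) / grid_size      # candidate box scaled by exp(offset)
    bh = ph * np.exp(th) / grid_size
    return bx, by, bw, bh

# Zero offsets reproduce the candidate box centred in cell (6, 6).
box = decode_box((0.0, 0.0, 0.0, 0.0), cell_xy=(6, 6), anchor_wh=(3.0, 5.0))
```

With zero offsets the prediction is just the candidate box itself, which is why well-chosen candidate shapes give the network an easier starting point than regressing (x, y, w, h) from scratch.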
Network architecture:
- Darknet-19 ends in a classification layer with 1000 categories in total. When it is used as the YOLO feature extraction network, this classification end of the network is removed.
- A final convolution then outputs the target box coordinates and classification as a (75 * 13 * 13) result, where (1 + 4 + 20) * 3 = 75 for 3 candidate boxes per feature point.
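The conversion of the (26 * 26) feature map into a (13 * 13) one, listed among the improvements, can be pictured as a space-to-depth rearrangement: each 2 * 2 spatial block is stacked into the channel dimension. The sketch below assumes that interpretation; the channel count of 4 is a placeholder, and the channel ordering may differ from what Darknet actually uses.

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange (H, W, C) into (H/block, W/block, C * block * block)."""
    h, w, c = x.shape
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)   # group each block's pixels together
    return x.reshape(h // block, w // block, block * block * c)

# Hypothetical fine feature map with 4 channels.
fine = np.arange(26 * 26 * 4, dtype=float).reshape(26, 26, 4)
coarse = space_to_depth(fine)        # shape (13, 13, 16)
```

No values are discarded, so the fine-grained detail that helps with smaller objects is preserved and can be concatenated with the (13 * 13) map.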
Speed comparison: [figure not included]
YOLO V3
Improvements:
- With a 608x608 input, processing speed reaches about 20 FPS on a Pascal Titan X, with 57.9% mAP@0.5 on COCO test-dev.
- There are three input image sizes: (320 * 320), (416 * 416), (608 * 608). In that order, speed goes from fast to slow and accuracy from low to high.
- An FPN-style feature pyramid extracts features at 3 different scales, and each scale uses 3 candidate boxes of different sizes and shapes.
- The feature extraction network is changed from Darknet-19 to Darknet-53, increasing the depth of the network.
- For the classification of each box, Softmax is replaced by multiple independent logistic classifiers.
- The class loss uses binary cross-entropy.
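The switch from Softmax to independent logistic classifiers with binary cross-entropy can be illustrated with a small numeric sketch. The logits and the two-label target below are hypothetical values chosen to show the difference, not data from the model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over independent per-class predictions."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(probs) + (1 - targets) * np.log(1 - probs)))

logits = np.array([3.0, 2.5, -4.0])   # hypothetical raw scores for 3 classes
targets = np.array([1.0, 1.0, 0.0])   # hypothetical box carrying two labels

independent = sigmoid(logits)          # each class scored on its own
competing = softmax(logits)            # classes forced to share probability 1
loss = binary_cross_entropy(independent, targets)
```

With logistic outputs both of the first two classes can score above 0.9 at once, whereas Softmax makes them compete, which is a poor fit when labels overlap.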
Network architecture:
- Darknet-53 ends in a classification layer with 1000 categories in total. When it is used as the YOLO feature extraction network, this classification end of the network is removed.
- Darknet-53 reaches accuracy close to ResNet-101 or ResNet-152, but is faster.
- Complete network: [figure not included]
Speed comparison: [figure not included]
Conclusion:
- YOLOv3 gives good results on mAP@0.5 and on small-target AP, but performance degrades as the IOU threshold increases, indicating that YOLOv3's boxes do not fit the ground truth tightly.
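The effect of the IOU threshold can be seen with a single box pair. In this sketch the ground-truth and predicted boxes are made-up values in (x1, y1, x2, y2) form; the thresholds 0.5 and 0.75 are chosen as examples of a loose and a strict matching criterion.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

ground_truth = (0.0, 0.0, 100.0, 100.0)   # hypothetical labelled box
prediction = (15.0, 10.0, 110.0, 105.0)   # hypothetical, slightly shifted box

overlap = iou(ground_truth, prediction)
# A prediction counts as a correct detection only if its IoU meets the threshold.
matches = {t: overlap >= t for t in (0.5, 0.75)}
```

A box that is roughly right passes at 0.5 but fails at 0.75, so a detector whose boxes are consistently a little loose scores well at the loose threshold and worse at stricter ones.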
References:
https://www.cnblogs.com/pprp/p/10124591.html
https://blog.csdn.net/guleileo/article/details/80581858