[Note] Artificial Intelligence, Part VI: The Evolution of the YOLO Object Detection Algorithm

How YOLO works

  • YOLO feeds the input image through a multi-layer convolutional network to extract image features. The output layer then directly regresses the coordinates of each target box and the class it belongs to. Finally, NMS (non-maximum suppression) removes overlapping target boxes.
  • Faster-RCNN works differently. After extracting a feature map, it scans the feature map with candidate boxes to find target features, and then performs regression and classification with two sub-networks. One regresses an offset and scale to apply to the candidate box in order to obtain the target coordinates; the other classifies the target. Finally, NMS removes overlapping target boxes.
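Both pipelines end with the same NMS step, which can be sketched in a few lines. This is a minimal greedy version, assuming boxes are given as (x1, y1, x2, y2) corners and that 0.5 is the overlap threshold; the function names and sample boxes are illustrative and not taken from any YOLO codebase.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Illustrative data: the first two boxes cover the same object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.82, so it is suppressed and only the first and third boxes remain.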

 

YOLO advantages and disadvantages

Advantages:

  • The network is simple, intuitive, and easy to implement.
  • Detection and training are both fast.
  • Accuracy on large objects is high.

Disadvantages:

  • Because classification is output directly from the feature map, and the feature map has a low resolution, the detection rate is low for heavily overlapping objects and for small, densely packed objects.

 

YOLO V1

Network architecture:

  • The network takes a (448 * 448 * 3) input image and, after convolution and max pooling, produces a (7 * 7 * 1024) feature map.
  • The feature map then passes through a fully connected layer of 4096 units and is converted into a (7 * 7 * 30) result.
  • The (7 * 7 * 30) result corresponds to 7 * 7 = 49 feature points, so at most 49 targets can be detected in one image. The 30 values contain two kinds of information: two target boxes, each with coordinates (x, y, w, h) and a confidence, and 20 class scores. Therefore (1 + 4) * 2 + 20 = 30.
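The layout of one 30-value feature point can be made concrete with a small sketch. The ordering assumed here (two boxes of [x, y, w, h, confidence] followed by 20 class scores) is an illustrative choice, and the helper names are hypothetical. Scoring a detection as box confidence times class score follows the original YOLO paper.

```python
S, B, C = 7, 2, 20            # grid size, boxes per point, classes (from the text)
CELL_LEN = B * (4 + 1) + C    # (1 + 4) * 2 + 20 = 30
MAX_TARGETS = S * S           # one class prediction per point, so at most 49


def split_cell(cell):
    """Split one 30-value vector into boxes and class scores.

    Assumed layout (illustrative): [x, y, w, h, conf] * B, then C class scores.
    """
    assert len(cell) == CELL_LEN
    boxes = [tuple(cell[i * 5:(i + 1) * 5]) for i in range(B)]
    class_scores = cell[B * 5:]
    return boxes, class_scores


def best_detection(cell):
    """Return (box, class_index, score) for the most confident box of a point."""
    boxes, class_scores = split_cell(cell)
    box = max(boxes, key=lambda b: b[4])
    cls = max(range(C), key=lambda c: class_scores[c])
    return box[:4], cls, box[4] * class_scores[cls]


# Made-up values: the second box is more confident and class 7 scores highest.
cell = [0.5, 0.5, 0.2, 0.3, 0.1, 0.4, 0.6, 0.5, 0.8, 0.9] + [0.0] * C
cell[B * 5 + 7] = 1.0
box, cls, score = best_detection(cell)
```

Because both boxes of a feature point share the same 20 class scores, a point can report only one class, which is one reason densely packed objects are hard for V1.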

Disadvantages:

  • The detection rate for densely packed targets is low.
  • Target boxes are not precise; there is some localization error.
  • Generalization is poor for objects whose aspect ratios differ widely.

 

YOLO V2

Improvements:

  • Instead of regressing the target coordinates (x, y, w, h) directly, candidate box sizes are set in advance: each point of the feature map is assigned 3 candidate boxes of different sizes and shapes, and the network regresses an offset relative to each candidate box. This handles targets with large aspect-ratio differences better, increases the precision of the target box, and speeds up training.
  • The feature extraction network is replaced with Darknet-19, which improves training speed.
  • The input image size is changed from (448 * 448 * 3) to (416 * 416 * 3).
  • A (13 * 13) feature map is extracted. In addition, the earlier (26 * 26) feature map is converted into a (13 * 13) feature map and used to help predict smaller objects.
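The candidate-box idea in the first bullet can be sketched as follows. This follows the decoding used in the YOLOv2 paper, where a sigmoid keeps the box center inside its grid cell and the width and height scale the preset box exponentially. The function name and the candidate box size (2.0, 4.0) are hypothetical values chosen for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(t, cell_xy, anchor_wh, grid_size=13):
    """Turn predicted offsets into a box (cx, cy, w, h) normalized to [0, 1].

    t         : (tx, ty, tw, th) raw network outputs
    cell_xy   : (col, row) of the feature-map point
    anchor_wh : (pw, ph) preset candidate box size, in grid-cell units
    """
    tx, ty, tw, th = t
    col, row = cell_xy
    pw, ph = anchor_wh
    cx = (col + sigmoid(tx)) / grid_size   # center stays inside its cell
    cy = (row + sigmoid(ty)) / grid_size
    w = pw * math.exp(tw) / grid_size      # candidate box scaled by exp(offset)
    h = ph * math.exp(th) / grid_size
    return cx, cy, w, h


# Zero offsets give back the candidate box itself, centered in cell (6, 6).
box = decode_box((0.0, 0.0, 0.0, 0.0), cell_xy=(6, 6), anchor_wh=(2.0, 4.0))
```

Since the network only has to learn small corrections to a box that already has roughly the right shape, training tends to converge faster than when regressing raw coordinates.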

Network architecture:

  • Darknet-19 ends with a classification layer covering 1000 categories. When it serves as YOLO's feature extraction network, this final classification part is removed.

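  • A final convolution then outputs the target box coordinates and classification as a (75 * 13 * 13) tensor, since (1 + 4 + 20) * 3 = 75 for 3 candidate boxes per point. With the 5 candidate boxes per point used in the YOLOv2 paper, the output is (125 * 13 * 13).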
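The channel arithmetic for this output layer can be checked with a short sketch. It assumes every candidate box carries its own 4 coordinates, 1 confidence, and 20 class scores, and that the network downsamples by a factor of 32 (416 / 13). The helper names are hypothetical.

```python
def head_channels(num_classes, boxes_per_point):
    """Channels when each box has 4 coords, 1 confidence and its own class scores."""
    return (1 + 4 + num_classes) * boxes_per_point


def output_shape(input_size, stride, num_classes, boxes_per_point):
    """Shape of the detection output as (channels, grid, grid)."""
    grid = input_size // stride
    return (head_channels(num_classes, boxes_per_point), grid, grid)


# 3 candidate boxes per point, as described in this note.
shape_3 = output_shape(416, 32, num_classes=20, boxes_per_point=3)
# 5 candidate boxes per point, as in the YOLOv2 paper.
shape_5 = output_shape(416, 32, num_classes=20, boxes_per_point=5)
```

With 3 boxes per point, (1 + 4 + 20) * 3 evaluates to 75 channels; with 5 boxes it is 125.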

Speed comparison: [figure not available]

 

YOLO V3

Improvements:

  • On a Pascal Titan X, 608x608 images are processed at about 20 FPS, and mAP@0.5 on COCO test-dev reaches 57.9%.
  • Three input image sizes are available: (320 * 320), (416 * 416), and (608 * 608). In that order, speed goes from fast to slow and accuracy from low to high.
  • An FPN-style feature pyramid extracts feature maps at 3 different scales, and each scale uses 3 candidate boxes of different sizes and shapes.
  • The feature extraction network changes from Darknet-19 to Darknet-53, increasing the depth of the network.
  • For the classification of each box, Softmax is replaced with multiple independent logistic classifiers.
  • The class loss uses binary cross-entropy.
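The last two bullets can be illustrated with a small sketch comparing independent logistic outputs with Softmax, followed by the binary cross-entropy computed over the logistic outputs. The logits and labels are made-up values, and the function names are illustrative.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(logits):
    m = max(logits)                           # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_loss(logits, targets, eps=1e-12):
    """Mean binary cross-entropy over independent per-class logistic outputs."""
    losses = []
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)   # clamp to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(losses) / len(losses)


# Hypothetical logits for 3 classes where two labels apply at once.
logits = [4.0, 4.0, -4.0]
targets = [1, 1, 0]

independent = [sigmoid(z) for z in logits]   # each class is scored on its own
competing = softmax(logits)                  # classes share one unit of probability
loss = bce_loss(logits, targets)
```

With Softmax the two correct classes must split the probability mass and each ends up just below 0.5, while the logistic outputs can both be close to 1. This is why independent classifiers suit datasets with overlapping labels.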

Network architecture:

  • Darknet-53 ends with a classification layer covering 1000 categories. When it serves as YOLO's feature extraction network, this final classification part is removed.

  • Darknet-53 reaches accuracy close to ResNet-101 or ResNet-152, but is faster.

  • Complete network: [figure not available]

Speed comparison: [figure not available]

 

Conclusion:

  • YOLOv3 performs well on mAP@0.5 and on small-target AP, but its performance drops as the IoU threshold increases, which indicates that YOLOv3's boxes do not fit the ground truth tightly.
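The effect of the IoU threshold can be seen with a single box. In the sketch below, a hypothetical prediction is shifted 15 pixels from a 100 x 100 ground-truth box; it counts as a correct detection at a threshold of 0.5 but not at 0.75. The boxes and function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


truth = (0, 0, 100, 100)
pred = (15, 15, 115, 115)    # hypothetical prediction, shifted by 15 pixels
overlap = iou(pred, truth)   # 7225 / 12775, about 0.566
hits = {t: is_true_positive(pred, truth, t) for t in (0.5, 0.75)}
```

A detector whose boxes are roughly right but slightly loose therefore scores well at 0.5 and loses many detections at stricter thresholds.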

 


 

 

Origin blog.csdn.net/highlevels/article/details/103048844