Main points:
3 YOLOv3
3.1 Darknet-53 (backbone)
3.2 Prediction of object bounding boxes
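The predicted bounding box center is constrained to the current grid cell by passing the predicted center offsets through σ(x) = Sigmoid(x), which maps any real value into the range (0, 1).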
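The decoding step can be sketched as follows. This is a minimal illustration that assumes the standard YOLOv3 parameterization (center = Sigmoid(offset) + cell index, size = anchor size × exp(offset)). The function name `decode_box` and the argument names are hypothetical and chosen only for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function, output in [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw network outputs into a box, in grid-cell units.

    tx, ty, tw, th: raw predictions for one anchor (hypothetical names).
    cx, cy: column and row index of the grid cell (its top-left corner).
    pw, ph: width and height of the anchor (prior) box.
    """
    # Sigmoid keeps the center offset between 0 and 1, so the center
    # cannot leave the cell whose top-left corner is (cx, cy).
    bx = sigmoid(tx) + cx
    by = sigmoid(ty) + cy
    # Width and height scale the anchor by an exponential factor.
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh


if __name__ == "__main__":
    # Zero offsets place the center in the middle of cell (3, 4)
    # and leave the anchor size unchanged.
    print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=5.0))
```

However large or small the raw offsets are, the decoded center stays inside the cell, which is the purpose of the Sigmoid here.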
3.3 Matching of positive and negative samples
3.4 Calculation of losses
3.4.1 Confidence loss (Binary Cross Entropy)
Here, the target value is the IoU between the predicted bounding box and the ground-truth bounding box, c is the raw predicted value, the predicted confidence is obtained by passing c through the Sigmoid function, and N is the number of positive and negative samples.
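A minimal sketch of this confidence loss, assuming it is the usual binary cross entropy averaged over the N samples, with the IoU as the target. The names `confidence_loss`, `targets` and `logits` are illustrative, and the small `eps` clamp is an added safeguard against log(0), not something stated in the text.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def confidence_loss(targets, logits, eps=1e-12):
    """Binary cross entropy between IoU targets and predicted confidences.

    targets: IoU values in [0, 1], one per sample (positive and negative).
    logits:  raw predicted values c, one per sample.
    """
    n = len(targets)
    total = 0.0
    for o, c in zip(targets, logits):
        # Predicted confidence: c passed through the Sigmoid function.
        p = sigmoid(c)
        # Clamp to avoid log(0) (assumption added for numerical safety).
        p = min(max(p, eps), 1.0 - eps)
        total += o * math.log(p) + (1.0 - o) * math.log(1.0 - p)
    # Average over the N positive and negative samples.
    return -total / n


if __name__ == "__main__":
    # Two samples with an uninformative prediction (confidence 0.5 each).
    print(confidence_loss([1.0, 0.0], [0.0, 0.0]))
```

With both raw predictions at 0, each confidence is 0.5 and the loss equals ln 2. The loss falls as the confidences move toward their targets.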
3.4.2 Category loss (Binary Cross Entropy)
3.4.3 Class loss
3.4.4 Localization loss
3.5 YOLOv3 SPP
3.5.1 Mosaic Image Enhancement
3.5.2 SPP module
The SPP module fuses features at different scales.
Note: The SPP here is different from the SPP (Spatial Pyramid Pooling) structure in SPPnet.
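To illustrate the multi-scale fusion, here is a small pure-Python sketch. It assumes the configuration commonly used in YOLOv3-SPP: the input feature map is passed through three parallel max-pooling branches with kernel sizes 5, 9 and 13, stride 1 and "same" padding, and the results are concatenated with the input along the channel dimension. The function names and the nested-list data layout are hypothetical; a real implementation would use a tensor library.

```python
def max_pool_same(channel, k):
    """Stride-1 max pooling over one H x W channel, output size unchanged.

    Positions outside the map are ignored, which is equivalent to
    padding with negative infinity.
    """
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                channel[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp(feature_map, kernel_sizes=(5, 9, 13)):
    """Concatenate the input with its max-pooled versions along channels.

    feature_map: list of channels, each a list of rows (C x H x W).
    Returns a list with C * (1 + len(kernel_sizes)) channels.
    """
    # Branch 1: the input itself (copied so the caller's data is untouched).
    out = [[row[:] for row in ch] for ch in feature_map]
    # Remaining branches: one pooled copy of every channel per kernel size.
    for k in kernel_sizes:
        out.extend(max_pool_same(ch, k) for ch in feature_map)
    return out


if __name__ == "__main__":
    # One 4 x 4 channel holding the values 0..15.
    fmap = [[[r * 4 + c for c in range(4)] for r in range(4)]]
    fused = spp(fmap)
    print(len(fused), len(fused[0]), len(fused[0][0]))
```

Because every branch keeps the spatial size, the output has the same height and width as the input and four times as many channels. Each branch summarizes a different neighborhood size, which is how features of different scales end up fused.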