Object Detection: How Fast R-CNN Works

Fast R-CNN paper: https://arxiv.org/pdf/1504.08083.pdf

1 Overview:

Because R-CNN is very slow, the authors proposed an improved model, Fast R-CNN. Compared with R-CNN, Fast R-CNN speeds up both the processing of the selective-search proposals and the classification and regression training, so the whole pipeline runs faster.

What Fast R-CNN changes relative to R-CNN:

  • The three R-CNN modules (CNN, SVM, regression) are integrated into one network, which greatly reduces the amount of computation and speeds things up.
  • Selective-search regions are no longer cropped from the original image and passed through the CNN one by one. Instead, the image goes through the CNN once, and the selective-search candidate regions are cut out of the resulting feature map for classification and regression.
  • To handle images of different scales, the authors use the ROI Pooling algorithm, which pools each region of the feature map into a feature vector of fixed dimensions.
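
As a rough illustration of the second point, the sketch below maps one candidate box from image pixels onto feature-map cells. The function name `project_roi`, the stride of 16, and the floor/ceil rounding rule are assumptions made for this example; an actual implementation may round differently.

```python
def project_roi(box, stride=16):
    """Map an (x1, y1, x2, y2) box in image pixels to feature-map cells.

    `stride` is the total downsampling factor of the CNN (assumed to be 16).
    The returned box is end-exclusive: it covers cells fx1..fx2-1, fy1..fy2-1.
    """
    x1, y1, x2, y2 = box
    # Round the top-left corner down and the bottom-right corner up,
    # and force the projected region to cover at least one cell.
    fx1 = x1 // stride
    fy1 = y1 // stride
    fx2 = max(fx1 + 1, -(-x2 // stride))  # -(-a // b) is ceiling division
    fy2 = max(fy1 + 1, -(-y2 // stride))
    return fx1, fy1, fx2, fy2
```

For example, `project_roi((32, 48, 200, 120))` returns `(2, 3, 13, 8)`. The CNN therefore runs once per image, and each of the candidate regions only costs a cheap coordinate mapping plus a crop of the shared feature map.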

 

2 ROI Pooling principle

ROI Pooling paper: https://arxiv.org/pdf/1406.4729.pdf

ROI Pooling is closely related to Spatial Pyramid Pooling (SPP): it is the special case of SPP with a single pyramid level.

Since image sizes vary, images normally have to be scaled or stretched to a uniform size before being fed into the CNN, which indirectly degrades recognition accuracy. ROI Pooling instead turns feature maps of different scales into feature vectors of fixed dimensions, so it preserves the image's feature information and is also fast.
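
The idea can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the paper's implementation: it handles a single channel stored as a list of rows, takes the region in feature-map cells (end-exclusive), and uses a 2 x 2 output grid only to keep the example readable. The names `roi_max_pool` and `_ceil_div` are hypothetical.

```python
def _ceil_div(a, b):
    """Ceiling division for non-negative integers."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one region of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows (one channel).
    roi: (x1, y1, x2, y2) in feature-map cells, end-exclusive, at least 1x1.
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Bin rows: floor for the start, ceil for the end, at least one row.
        r0 = y1 + (i * h) // out_h
        r1 = y1 + max((i * h) // out_h + 1, _ceil_div((i + 1) * h, out_h))
        row = []
        for j in range(out_w):
            c0 = x1 + (j * w) // out_w
            c1 = x1 + max((j * w) // out_w + 1, _ceil_div((j + 1) * w, out_w))
            # Each output cell is the maximum over its bin.
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

On a 4 x 6 map holding the values 1 to 24 in row-major order, the region `(1, 0, 6, 3)` pools to `[[10, 12], [16, 18]]` and the differently shaped region `(0, 0, 2, 4)` pools to `[[7, 8], [19, 20]]`. Both results have the same 2 x 2 shape, which is what lets the following fully connected layers accept regions of any size.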

__________________________________________________________________

 

3 Steps

3.1 Pretrain a CNN on a classification task
3.2 Modify the CNN: delete the final flatten layer and the layers after it, and replace them with an ROI Pooling layer
3.3 Pass the image through the CNN to obtain the feature map, and use selective search to select about 2k candidate regions
3.4 The ROI Pooling layer is followed by several FC layers, which end in two output branches:

  • The first branch is a softmax layer that outputs k + 1 classes (the k object classes plus background)
  • The second branch is the regression, which outputs predicted box parameters for each of the k classes
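
The sketch below shows how the outputs of the two branches might be read for a single region. It is illustrative only: the function names, the choice of index 0 for background, and the layout of the regression output as 4 consecutive values per object class are assumptions of this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_head(cls_scores, box_params):
    """Combine the two branch outputs for one region.

    cls_scores: k + 1 raw scores; index 0 is assumed to be background.
    box_params: 4 * k regression outputs, 4 per object class (assumed layout).
    Returns (label, probability, box parameters or None for background).
    """
    k = len(cls_scores) - 1
    assert len(box_params) == 4 * k
    probs = softmax(cls_scores)
    label = max(range(k + 1), key=lambda c: probs[c])
    if label == 0:
        return label, probs[label], None  # background has no box
    start = 4 * (label - 1)
    return label, probs[label], box_params[start:start + 4]
```

With k = 2, `decode_head([0.1, 2.0, 0.5], [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])` picks class 1 and returns the first four box parameters, while a region whose highest score is at index 0 is treated as background and gets no box.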

 

4 Understanding the loss

 

Why use smooth L1?

For two reasons:

  • It is differentiable at zero
  • As the loss gets smaller, the gradient shrinks accordingly, which helps convergence
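
Both points can be checked numerically. The sketch below uses the smooth L1 definition from the Fast R-CNN paper (0.5 x^2 when |x| < 1, and |x| - 0.5 otherwise); the function names are chosen for this example.

```python
def smooth_l1(x):
    """Smooth L1 loss for one regression error x."""
    ax = abs(x)
    # Quadratic near zero, linear for large errors.
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def smooth_l1_grad(x):
    """Derivative of smooth_l1 with respect to x."""
    if abs(x) < 1.0:
        return x  # shrinks toward 0 as the error shrinks
    return 1.0 if x > 0 else -1.0  # constant magnitude for large errors
```

Here `smooth_l1_grad(0.0)` is `0.0`, so the derivative is well defined at zero, unlike plain L1. For small errors the gradient equals the error itself (`smooth_l1_grad(0.1)` is `0.1`), so updates become gentler near the optimum. For large errors the gradient magnitude stays at 1, unlike L2, where it grows with the error.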

 

5 Performance compared with R-CNN


Origin www.cnblogs.com/dxscode/p/11443752.html