[Deep learning] Object detection network structures: Mask R-CNN and FPN

Mask R-CNN improves on Faster R-CNN in the following ways:

1. Segmentation, detection, and classification are performed simultaneously.

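2. It replaces the RoI Pooling of Faster R-CNN with RoIAlign. RoI Pooling quantizes region coordinates to the feature-map grid, so the pooled features are offset from the corresponding pixels in the image. RoIAlign skips the quantization and samples the feature map by bilinear interpolation, which is more accurate. The change has little effect on classification but a large effect on segmentation.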

ROI ALIGN: https://www.cnblogs.com/wangyong/p/8523814.html

When the targets in the image are large, the two methods differ little. When the image contains many small targets, RoIAlign is preferred because it is more accurate.
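To see where the offset comes from, here is a minimal sketch in plain Python. It assumes a feature-map stride of 16 and a toy feature map whose value equals its x coordinate; the function names are hypothetical, and the quantization is modeled as a simple floor. Real RoIAlign averages several sampled points per bin, so this only illustrates a single sample.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def quantized_sample(feature, y, x):
    """RoI Pooling style (simplified): snap the coordinate to an integer cell."""
    return feature[int(math.floor(y))][int(math.floor(x))]

# Toy 8x8 feature map: each cell stores its own x coordinate.
feature = [[float(col) for col in range(8)] for _ in range(8)]

stride = 16                 # assumed downsampling factor of the backbone
x_image = 40                # x coordinate of an RoI edge, in image pixels
x_feat = x_image / stride   # fractional position on the feature map

aligned = bilinear_sample(feature, 3.0, x_feat)
pooled = quantized_sample(feature, 3.0, x_feat)
error_in_pixels = (aligned - pooled) * stride

print(aligned, pooled, error_in_pixels)
```

Under these assumptions the quantized sample is half a feature cell off, which maps back to 8 pixels in the image. That is negligible for a target several hundred pixels wide but substantial for one only a few dozen pixels wide.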

3. It adds a mask branch that decouples mask prediction from class prediction. The mask branch only segments: it predicts a binary mask for each class, and the class itself is predicted by a separate branch. This differs from FCN, where segmentation and classification happen together: the class is predicted while the segmentation is predicted, and the output channels hold the segmentation probabilities of the different categories.
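The difference between the two designs can be sketched in plain Python. The logits and class scores below are made-up values for one 2x2 RoI with three classes, and the function names are hypothetical. The sketch assumes a per-pixel sigmoid for the decoupled mask and a per-pixel softmax for the FCN-style output.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# mask_logits[k][i][j]: made-up mask logits for class k at pixel (i, j).
mask_logits = [
    [[2.0, -1.0], [0.5, -3.0]],
    [[3.0, 1.0], [-2.0, -0.5]],
    [[-1.0, 4.0], [1.5, 2.0]],
]
class_scores = [0.1, 0.7, 0.2]  # made-up output of the classification branch

def decoupled_mask(mask_logits, class_scores, threshold=0.5):
    """Decoupled: the class branch picks k, the mask branch supplies mask k."""
    k = max(range(len(class_scores)), key=lambda c: class_scores[c])
    mask = [[sigmoid(z) > threshold for z in row] for row in mask_logits[k]]
    return k, mask

def fcn_style_labels(mask_logits):
    """FCN style: every pixel picks its own class, so classes compete per pixel."""
    num_classes = len(mask_logits)
    labels = []
    for i in range(len(mask_logits[0])):
        row = []
        for j in range(len(mask_logits[0][0])):
            probs = softmax([mask_logits[c][i][j] for c in range(num_classes)])
            row.append(max(range(num_classes), key=lambda c: probs[c]))
        labels.append(row)
    return labels

k, mask = decoupled_mask(mask_logits, class_scores)
labels = fcn_style_labels(mask_logits)
print(k, mask)
print(labels)
```

In the decoupled version the whole RoI gets one class and one binary mask, while in the FCN-style version neighboring pixels of the same RoI can end up with different classes.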

 

 

FPN (Feature Pyramid Network) obtains strong semantic features at every scale by combining a bottom-up pathway with a top-down pathway, which improves the performance of object detection and instance segmentation on multiple datasets.
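The top-down merge can be sketched in plain Python as follows. This is a simplification under stated assumptions: the maps are single-channel, each level is half the resolution of the previous one, and the 1x1 lateral convolutions and 3x3 smoothing convolutions used in real FPN implementations are omitted. The names `c3`, `c4`, `c5` and the helper functions are illustrative only.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of a 2-D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add_maps(a, b):
    """Element-wise sum of two maps of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def top_down_merge(bottom_up):
    """bottom_up: maps ordered from finest to coarsest resolution.

    Starts from the coarsest map and repeatedly upsamples it and adds the
    next finer bottom-up map, so every output level carries coarse semantics.
    """
    merged = [bottom_up[-1]]
    for finer in reversed(bottom_up[:-1]):
        merged.insert(0, add_maps(finer, upsample_nearest_2x(merged[0])))
    return merged

# Toy bottom-up features at three resolutions (4x4, 2x2, 1x1).
c3 = [[1.0] * 4 for _ in range(4)]
c4 = [[10.0, 20.0], [30.0, 40.0]]
c5 = [[100.0]]

p3, p4, p5 = top_down_merge([c3, c4, c5])
print(p4)
print(p3[0])
```

The value 100.0 from the coarsest map shows up in every output level, which is the sense in which the finer maps gain the semantics of the deeper layers.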

FPN can be used as the front-end (backbone) network of Fast R-CNN, Faster R-CNN, and Mask R-CNN.

FPN: https://blog.csdn.net/WZZ18191171661/article/details/79494534

mAP: https://www.cnblogs.com/klitech/p/9242700.html

 

 


Origin blog.csdn.net/Sun7_She/article/details/90407103