[Neural Network] Instance Segmentation Mask RCNN

1. Overview

        Both instance segmentation and semantic segmentation label images at the pixel level. The difference is that instance segmentation also separates individual objects of the same class, whereas semantic segmentation only assigns each pixel a class and cannot tell those objects apart.

        [Figure: instance segmentation vs. semantic segmentation]
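The difference can be shown with two toy label maps. This is only a minimal sketch: the data and the helper name `distinct_labels` are made up for illustration and do not come from any Mask RCNN implementation.

```python
# Toy 3x6 image containing two separate "cat" objects (hypothetical data).
# Semantic segmentation: every pixel stores a class id (0 = background, 1 = cat).
semantic = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
# Instance segmentation: every pixel stores an object id (0 = background),
# so the two cats get different labels even though they share a class.
instance = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

def distinct_labels(label_map):
    """Return the sorted non-background labels present in a label map."""
    return sorted({v for row in label_map for v in row if v != 0})

print(distinct_labels(semantic))  # [1]     one class, objects not separated
print(distinct_labels(instance))  # [1, 2]  two individual objects
```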

2. Network structure

        Compared with Faster RCNN, Mask RCNN replaces RoI Pooling with RoI Align and adds an instance segmentation (Mask) branch in parallel with the existing Faster RCNN detection branch.

        1. Mask structure

                The Mask branch and the prediction branch do not share RoI Align: each uses its own RoI Align layer, and the RoI feature map fed to the Mask branch is 14*14. The structure is as follows:

                The output is 28*28*numclasses, i.e. one 28*28 mask is predicted for each class.
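The shapes above can be checked with a small sketch. It assumes the 14*14 map is upsampled to 28*28 by a 2*2 transposed convolution with stride 2, as in the original Mask RCNN paper; the post itself does not state this. The function names, the class-first layout `[num_classes][H][W]`, and the 0.5 threshold are also illustrative assumptions.

```python
def deconv_output_size(in_size, kernel=2, stride=2, padding=0):
    """Output side length of a transposed convolution (no output padding, dilation 1)."""
    return (in_size - 1) * stride - 2 * padding + kernel

def select_class_mask(mask_probs, class_id, threshold=0.5):
    """Pick one class channel from [num_classes][H][W] probabilities and binarize it."""
    channel = mask_probs[class_id]
    return [[1 if p >= threshold else 0 for p in row] for row in channel]

side = deconv_output_size(14)  # 14*14 RoI feature -> 28*28 mask
num_classes = 3                # hypothetical number of classes

# Dummy probabilities: channel k is filled with the constant k / num_classes.
mask_probs = [[[k / num_classes] * side for _ in range(side)]
              for k in range(num_classes)]

# Only the channel of the predicted class is kept as the final mask.
mask = select_class_mask(mask_probs, class_id=2)
print(side, len(mask), len(mask[0]))
```

Since one mask is predicted per class, the class chosen by the detection branch decides which of the numclasses channels is used.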

                During training, the boxes fed to the Mask branch are provided by the RPN; during prediction, they are provided by Fast RCNN. The reason is that during training the RPN may provide several predicted boxes for the same object, which is equivalent to data augmentation, whereas during prediction only one final box is obtained per object.
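The routing described above can be sketched as follows. The helper `rois_for_mask_branch` and the box values are hypothetical; real implementations also sample and filter the proposals, which is omitted here.

```python
def rois_for_mask_branch(training, rpn_proposals, detections):
    """Return the boxes the Mask branch should process (hypothetical helper)."""
    return rpn_proposals if training else detections

# Boxes are (x1, y1, x2, y2); the values are made up for illustration.
rpn_proposals = [(10, 12, 50, 60), (8, 10, 52, 58), (12, 14, 49, 61)]  # several boxes around one object
detections = [(10, 11, 51, 60)]                                        # one final box for that object

train_rois = rois_for_mask_branch(True, rpn_proposals, detections)
test_rois = rois_for_mask_branch(False, rpn_proposals, detections)
print(len(train_rois), len(test_rois))  # 3 1
```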

        2. RoI Align

                RoI Pooling computes coordinates relative to the upper-left corner and rounds them twice: once when the RoI is mapped onto the feature map and once when the RoI is divided into bins. These rounding errors affect the final result.
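A small arithmetic sketch of the double rounding follows. The box size (665 px), the stride (32) and the 7*7 output grid are assumed values chosen for illustration, and the floor-based rule is a simplified picture; actual RoI Pooling implementations differ in the exact rounding rule they use.

```python
import math

def roi_pool_quantization(box_size_px, stride, out_size):
    """Return (rounded feature extent, rounded bin size, exact bin size) along one axis."""
    # First rounding: project the box onto the feature map and floor it.
    feat_size = math.floor(box_size_px / stride)
    # Second rounding: split the floored extent into out_size bins and floor again.
    bin_size = math.floor(feat_size / out_size)
    # What the bin size would be with no rounding at all.
    exact_bin = box_size_px / stride / out_size
    return feat_size, bin_size, exact_bin

feat, q_bin, exact_bin = roi_pool_quantization(665, stride=32, out_size=7)
print(feat, q_bin, exact_bin)  # 20 2 2.96875

# Image pixels actually covered by the 7 rounded bins, versus the 665 px box.
covered_px = q_bin * 7 * 32
print(covered_px)  # 448
```

In this toy case the rounded bins cover only 448 of the 665 pixels along the axis, which is the kind of misalignment that hurts pixel-level mask prediction.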

                RoI Align computes the distance from each sampling point to the upper-left corner without rounding, which improves accuracy.

                Specifically, the feature value at each non-integer sampling point is computed by bilinear interpolation from its four neighboring grid points.
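The interpolation step can be sketched as below. This is a minimal pure-Python illustration, not the code of any particular library: the function names are hypothetical, and averaging 2*2 regularly spaced sample points per bin is an assumption taken from the original Mask RCNN paper.

```python
import math

def bilinear_sample(feature, y, x):
    """Sample a 2D feature map (list of rows) at fractional (y, x).

    Assumes 0 <= y <= H - 1 and 0 <= x <= W - 1.
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighboring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align_bin(feature, y_start, x_start, bin_h, bin_w, samples=2):
    """Average samples*samples bilinear samples placed regularly inside one bin."""
    total = 0.0
    for iy in range(samples):
        for ix in range(samples):
            y = y_start + (iy + 0.5) * bin_h / samples
            x = x_start + (ix + 0.5) * bin_w / samples
            total += bilinear_sample(feature, y, x)
    return total / (samples * samples)

feature = [
    [0.0, 1.0],
    [2.0, 3.0],
]
center = bilinear_sample(feature, 0.5, 0.5)             # 1.5, the mean of the four corners
bin_value = roi_align_bin(feature, 0.0, 0.0, 1.0, 1.0)  # 1.5 for this linear toy map
print(center, bin_value)
```

No coordinate is rounded anywhere: the bin boundaries and sample points stay fractional, and only the lookup of the four neighbors uses integer indices.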

3. Calculation of the loss function

         Loss = L_{rpn}+L_{fast\_rcnn}+L_{mask}

        Here, L_{rpn} and L_{fast\_rcnn} are computed in the same way as in Faster RCNN, and the loss of the Mask branch, L_{mask}, is computed as follows:

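                         BCELoss (binary cross-entropy) is computed per category: each class has its own mask channel, and for a given RoI only the channel of its ground-truth class contributes to L_{mask}.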
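A minimal sketch of this per-class mask loss follows. It assumes the mask outputs are already per-pixel sigmoid probabilities stored as `[num_classes][H][W]`; the function name `mask_bce_loss`, the `eps` clamp and the toy 2*2 masks are illustrative assumptions, not part of the original post.

```python
import math

def mask_bce_loss(mask_probs, gt_mask, gt_class, eps=1e-7):
    """Mean binary cross-entropy on the ground-truth class channel only."""
    pred = mask_probs[gt_class]  # channels of other classes are ignored
    total, count = 0.0, 0
    for pred_row, gt_row in zip(pred, gt_mask):
        for p, y in zip(pred_row, gt_row):
            p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count

# Two classes, 2x2 masks (real masks would be 28*28).
mask_probs = [
    [[0.5, 0.5], [0.5, 0.5]],  # class 0 channel, does not affect the loss below
    [[0.9, 0.1], [0.9, 0.1]],  # class 1 channel
]
gt_mask = [[1, 0], [1, 0]]

loss = mask_bce_loss(mask_probs, gt_mask, gt_class=1)
print(loss)  # about 0.105, i.e. -log(0.9)
```

Because only one channel enters the loss, the classes do not compete with each other inside the Mask branch; classification is left to the detection branch.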


Origin blog.csdn.net/weixin_37878740/article/details/129488655