Paper: Deformable ConvNets v2 reading notes

Disclaimer: This is an original article by the blogger. If you reproduce it, please include a link to the original post: https://blog.csdn.net/m0_37263345/article/details/90673378

First, the paper

Deformable ConvNets v2: More Deformable, Better Results

https://arxiv.org/abs/1811.11168

The code for DCNv2 will be released.

Second, notes on the paper

1. Background

A) Problems found in the first version (v1): the deformable convolutional network did learn to adapt its sampling region to objects of different shapes, but that region is not very accurate and generally extends beyond the object.

 

B) Errors may come from using features outside the candidate box. A candidate box may contain features from outside the object region, and a box that does not fully cover the object is simply a wrong box. Deformable convolution has the same problem. A likely reason is that the classification branch and the regression branch share some convolution layers, and since classification needs a larger receptive field, it may use features from outside the box. Proposed remedies are to let the classification and regression branches share fewer layers, or to combine the classification results of Faster R-CNN and R-CNN to get better results, because R-CNN is trained entirely on image patches cropped around the objects.

 

2. Innovations

The first two innovations follow the idea that the data affecting the network's final prediction should be limited to the relevant pixel region. This shows up in two places: the sampling positions, and the activation values (pixels).

A) Stack more deformable convolutions, replacing more convolution layers with deformable ones. All 3 × 3 convolutions in conv3, conv4, and conv5 become deformable convolutions, 12 layers in total. (v1 only replaced the three 3 × 3 convolution layers in res5.) The v1 paper used the PASCAL VOC dataset, on which the gain from this change is hard to see.

 

B) A modulated deformable convolution module, which learns not only the offsets but also a modulation of the feature amplitude at each offset position. The modulation mechanism learns a coefficient in the range 0 to 1 for each sampling position, which scales down the original feature value. In the most extreme case, where a position lies outside the object and is no help to detection, the coefficient can be set to zero. This mechanism acts as an automatic selection, made on top of the region already sampled by the deformable convolution.

The modulation is applied to both modules:

deformable convolution

deformable RoIpooling
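
To make the modulation concrete, here is a minimal pure-Python sketch of a modulated deformable convolution evaluated at a single output location, following the DCNv2 formulation y(p) = sum_k w_k * x(p + p_k + dp_k) * dm_k. The function names (`bilinear_sample`, `modulated_deform_conv_at`), the single-channel list-of-lists feature map, and the hand-supplied offsets and masks are all assumptions made for illustration. In the real network the offsets and modulation scalars are predicted by an extra convolution layer, which is not shown here.

```python
import math

def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2-D feature map (list of lists) at fractional (y, x).

    Positions that fall outside the map contribute zero.
    """
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value

# Regular 3x3 sampling grid p_k, as (dy, dx) relative to the output location.
GRID_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def modulated_deform_conv_at(feat, p, weights, offsets, masks):
    """Hypothetical helper: one channel, one output location p = (py, px).

    weights: 9 kernel weights w_k
    offsets: 9 learned offsets dp_k as (dy, dx), may be fractional
    masks:   9 modulation scalars dm_k, each expected in [0, 1]
    """
    py, px = p
    out = 0.0
    for (gy, gx), w_k, (oy, ox), m_k in zip(GRID_3X3, weights, offsets, masks):
        sampled = bilinear_sample(feat, py + gy + oy, px + gx + ox)
        out += w_k * sampled * m_k
    return out
```

With all masks set to 1 this reduces to the v1 deformable convolution, and with all offsets at zero as well it reduces to an ordinary 3 × 3 convolution. Setting a mask to 0 removes that sampling position from the result.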

 

C) A feature mimic loss is defined so that the features focus on the object region. It is applied only to positive boxes, since negative boxes may require more context information.

An R-CNN-like branch is added to the network. Its input is the image patch cropped from the original image according to the RPN results, and it is trained with its own ordinary (C+1)-way classification. The cosine distance between the features output by this branch and the features output by Faster R-CNN is then used as a loss, which forces the two features to be the same. This adds supervision that makes the box features focus on the object. The two branches share parameters. The R-CNN branch is used only during training, and inference uses only Faster R-CNN.
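
The cosine-distance loss described above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the names `cosine_similarity` and `feature_mimic_loss`, the plain-list feature vectors, and the `is_positive` flags are hypothetical, and the loss is written as the sum of (1 - cos) over positive boxes, which is how the mimic loss is usually stated for DCNv2.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def feature_mimic_loss(rcnn_feats, frcnn_feats, is_positive):
    """Sum of (1 - cos) between the two branches' features, positive boxes only.

    rcnn_feats:  per-box features from the R-CNN branch (cropped image input)
    frcnn_feats: per-box features from the Faster R-CNN branch
    is_positive: per-box flags; negative boxes are skipped because they
                 may need context from outside the box
    """
    loss = 0.0
    for f_rcnn, f_frcnn, positive in zip(rcnn_feats, frcnn_feats, is_positive):
        if positive:
            loss += 1.0 - cosine_similarity(f_rcnn, f_frcnn)
    return loss
```

The loss is zero when the two branches produce features pointing in the same direction for every positive box, regardless of their magnitudes, so it only constrains what the Faster R-CNN features attend to, not their scale.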

 

For more details, please refer to: https://blog.csdn.net/u014380165/article/details/88072737
