Paper information

Abhinav Shrivastava, Abhinav Gupta, Ross Girshick, Training Region-based Object Detectors with Online Hard Example Mining, CVPR 2016.

http://arxiv.org/abs/1604.03540

Introduction

To address the class imbalance problem, bootstrapping (now known as hard negative mining) was proposed at least 20 years ago. The idea is regaining popularity; it is an alternating algorithm that switches between the following two stages:

  1. Optimize the detection model on the current set of samples.
  2. With the optimized model fixed, select new false positives and add them to the training set.

Earlier detectors, from SVM-based methods to R-CNN and SPPnet, all used hard negative mining.

This paper observes that Fast R-CNN, the dominant detection model at the time, did not use bootstrapping; the authors believe the main reason is that the alternating procedure is hard to combine with a deep ConvNet optimized by SGD.

Concretely, bootstrapping alternates as follows (a minimal sketch follows the list):

  1. With the model fixed, find new hard samples and add them to the training set.
  2. With the training set fixed, train the model.
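As a rough illustration (not the paper's code), the loop below alternates the two stages; `model.fit` and `model.score` are hypothetical stand-ins for training a detector and scoring a candidate region:

```python
def bootstrap(model, candidate_pool, initial_set, rounds=5, score_thresh=0.5):
    """Alternating hard negative mining (hypothetical interface, for illustration)."""
    train_set = list(initial_set)
    for _ in range(rounds):
        # With the training set fixed, train the model.
        model.fit(train_set)
        # With the model fixed, collect new false positives.
        for region in candidate_pool:
            is_background = region["label"] == 0
            if is_background and model.score(region) > score_thresh:
                train_set.append(region)  # a confidently wrong background region
    return model, train_set
```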

The authors argue that it is exactly this freeze-then-train alternation that makes overall training slow. They therefore propose an algorithm that works with SGD, using the per-sample loss to draw training examples from a non-uniform, non-stationary distribution. The main contributions of the new method are the following:

  • It does not require the sampling heuristics that were popular at the time.
  • It yields a stable mAP improvement.
  • The more complex the backbone, the more pronounced the speed advantage.

Related work

Hard example mining.

The authors group existing hard example mining into two approaches:

  1. Used when optimizing SVMs.

    Easily classified examples are removed from the working set, so the set actually used for training is only a fraction of the full training set.

  2. Used when optimizing non-SVM models.

    Mainly shallow neural networks and boosted decision trees. Since these generally converge less easily, false positives are added back into the training set and training is repeated.

Overview of Fast R-CNN

The main reasons for choosing Fast R-CNN as the base detector are:

  1. It is a fast end-to-end system.
  2. Fast R-CNN was very popular, and many algorithms derive from it.
  3. Fast R-CNN can train the entire network, rather than having to keep part of the network fixed.
  4. It does not use an SVM, and the Fast R-CNN authors showed that whether or not an SVM is used has little impact on the results.

Training

The authors argue that the standard way of sampling background RoIs (keeping only those whose IoU with ground truth falls in [0.1, 0.5)) is suboptimal, because it discards infrequent but important, difficult background regions; they therefore remove the lower IoU threshold on background RoIs.

They also found that the 1:3 fg:bg ratio heuristic stops mattering once their algorithm is used, so they remove that constraint as well (see the sketch below).
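To illustrate the removed heuristic, here is a rough sketch of Fast R-CNN-style RoI labelling with the `bg_lo` lower bound; setting `bg_lo = 0` corresponds to the change above (the function name and array layout are mine, not from the released code):

```python
import numpy as np

def split_rois(max_overlaps, fg_thresh=0.5, bg_lo=0.1):
    """Split RoIs into fg/bg by their max IoU with any ground-truth box.

    Fast R-CNN keeps only background RoIs with IoU in [bg_lo, fg_thresh);
    OHEM sets bg_lo = 0 so that hard, low-overlap regions are not discarded.
    """
    fg = np.where(max_overlaps >= fg_thresh)[0]
    bg = np.where((max_overlaps < fg_thresh) & (max_overlaps >= bg_lo))[0]
    return fg, bg

overlaps = np.array([0.72, 0.05, 0.31, 0.48, 0.0])
print(split_rois(overlaps, bg_lo=0.1))  # the 0.05 and 0.0 RoIs are dropped
print(split_rois(overlaps, bg_lo=0.0))  # every low-IoU RoI is kept as background
```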

Their approach

Online hard example mining

In earlier mining algorithms, sampling is a separate stage: training resumes only once the number of mined examples reaches a certain threshold, which is another cause of slowness.

The algorithm in this paper runs the following phases in each iteration:

  1. A convolutional network generates the feature map for the whole image.
  2. Using the feature map, the RoI network runs a forward pass over all input RoIs.
  3. The RoIs are sorted by their loss, and the top $B/N$ per image are selected as "hard examples" (where $B$ is the RoI batch size and $N$ the number of images per batch).

Because the convolutional features are shared across RoIs and only a small subset of examples is selected for backpropagation, the extra forward pass over all RoIs adds only a small amount of computation (see the sketch below).
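A minimal sketch of the selection in step 3, assuming the per-RoI losses for one image have already been computed in step 2 (the names are illustrative):

```python
import numpy as np

def select_hard_examples(roi_losses, B=128, N=2):
    """Keep the B/N highest-loss RoIs of one image; only these are backpropagated."""
    num_hard = B // N
    order = np.argsort(-roi_losses)  # sort descending by loss
    return order[:num_hard]          # indices of the "hard examples"

losses = np.random.rand(2000)        # e.g. ~2000 candidate RoIs in one image
hard = select_hard_examples(losses)  # 64 indices for the backward pass
```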

In particular, the authors mention a caveat: overlapping boxes have highly correlated losses, so the loss of a near-duplicate region would effectively be counted multiple times. They resolve this by running NMS on the loss-sorted RoIs with a threshold of 0.7.
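A sketch of this deduplication, using the loss as the NMS score as the paper describes (the IoU helper and box layout are my own):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms_by_loss(boxes, losses, iou_thresh=0.7):
    """Greedy NMS ranked by loss: suppress overlapping RoIs so that
    near-duplicate boxes do not have their loss counted twice."""
    order = np.argsort(-losses)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thresh]
    return np.array(keep, dtype=int)
```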

The authors also explain why no fg:bg ratio is required: if one class starts being ignored, its loss rises, and it will then be mined more often, so the imbalance eventually corrects itself.

Implementation details

The authors propose two ways to apply OHEM to Fast R-CNN:

  1. Sort the RoIs by loss directly and set the loss of every non-hard example to zero. This is wasteful: memory must still be allocated for all RoIs, even though the zero-loss RoIs contribute nothing to the gradients.

  2. Build two copies of the RoI network, one of which is read-only. The read-only copy allocates memory only for the forward pass, while the standard copy allocates memory as usual. In each iteration, the read-only network first runs a forward pass and computes the loss for every input RoI; OHEM then filters the samples, and the selected hard examples are fed into the standard RoI network.

    (Figure: architecture of the two-network OHEM implementation.)

The two methods use roughly the same amount of memory, but the second is about twice as fast. A sketch of the second follows.
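In a modern framework, the read-only copy amounts to a gradient-free forward pass over the shared weights. A minimal PyTorch sketch, where `roi_net` and `per_roi_loss` are hypothetical stand-ins for the Fast R-CNN head and its per-RoI loss (the paper itself uses Caffe):

```python
import torch

def ohem_step(roi_net, roi_feats, targets, per_roi_loss, num_hard):
    # Read-only pass: forward every RoI without building the backward graph,
    # so memory is allocated only for the forward computation.
    with torch.no_grad():
        losses = per_roi_loss(roi_net(roi_feats), targets)  # shape (R,)
    hard = torch.topk(losses, num_hard).indices
    # Standard pass: forward and backward only for the selected hard RoIs.
    loss = per_roi_loss(roi_net(roi_feats[hard]), targets[hard]).mean()
    loss.backward()
    return loss.detach()
```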

Analyzing online hard example mining

Each batch in this paper contains only two images. The authors suspected this could cause a problem: similar images contain similar regions, whose similar losses would be double counted. To test this conjecture, they tried the extreme case of a single image per batch; the two experiments showed no large difference, which suggests that OHEM is robust in this respect.


The authors also tested whether training on hard examples alone is superior to training on all examples: the two settings reach similar mAP, but training only on the hard examples is undoubtedly faster.

They then applied some popular tricks to boost results, such as multi-scale training (M) and iterative bbox regression (B), and achieved further improvement.


Conclusion

It is 2:26 am and I am too sleepy to write more.

The most interesting ideas here are the read-only network and doing away with staged training.
