Understanding neural networks (vii) R-CNN

Region Proposal

Region proposals address the main problem of the sliding-window approach: having to test a huge number of windows.

Candidate region (Region Proposal)

A region proposal method identifies in advance the positions in the image where a target is likely to appear. Because it uses the texture, edge, and color information in the image, it can maintain a high recall while selecting far fewer windows (a few thousand, or even a few hundred).
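
To get a feel for the window counts involved, here is a minimal sketch that counts sliding-window positions. The image size, scales, aspect ratios, and stride are assumptions chosen for illustration and do not come from the original post.

```python
def count_sliding_windows(img_w, img_h, window_sizes, stride):
    """Count the window positions for every (width, height) window size."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w > img_w or win_h > img_h:
            continue  # this window does not fit inside the image
        positions_x = (img_w - win_w) // stride + 1
        positions_y = (img_h - win_h) // stride + 1
        total += positions_x * positions_y
    return total


# Hypothetical setup: a 500x375 image, three scales, three aspect ratios, stride 8.
scales = [64, 128, 256]
aspect_ratios = [(1, 1), (2, 1), (1, 2)]
window_sizes = [(s * aw, s * ah) for s in scales for aw, ah in aspect_ratios]

n_windows = count_sliding_windows(500, 375, window_sizes, stride=8)
print(n_windows, "sliding windows, compared with about 2000 region proposals")
```

Even this coarse grid yields several thousand windows, and a smaller stride or more scales would increase the count quickly, while a proposal method keeps the number of candidates roughly fixed.
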
R-CNN (Regions with CNN features) is a milestone in applying CNNs to object detection. Relying on the strong feature-extraction and classification performance of CNNs, it uses region proposals to recast object detection as classifying candidate regions.

Selecting candidate regions

Region proposal is a class of traditional region-extraction methods. It can be viewed as sliding windows of different widths and heights, where sliding the windows yields image regions that potentially contain a target. For the proposal method itself, see Selective Search; about 2k candidates are typically kept, and the details are not covered here. Each extracted proposal is then normalized to the standard input size of the CNN.

CNN feature extraction

This is the standard CNN process: convolution, pooling, and similar operations are applied to the input to produce an output of fixed dimension.
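
The link between a fixed input size and a fixed-dimension output can be checked with the usual output-size formula, (n + 2 * pad - kernel) // stride + 1. The sketch below assumes AlexNet-style layer settings (kernel, stride, padding) and 256 channels after the last convolution; the post itself does not list these, so treat them as an illustration.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial size after a convolution or pooling layer."""
    return (n + 2 * pad - kernel) // stride + 1


# Assumed AlexNet-style layers: (name, kernel, stride, padding).
LAYERS = [
    ("conv1", 11, 4, 0), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace(n):
    """Return the spatial size after each layer for an n x n input."""
    sizes = []
    for name, kernel, stride, pad in LAYERS:
        n = out_size(n, kernel, stride, pad)
        sizes.append((name, n))
    return sizes


final = trace(227)[-1][1]      # spatial size after pool5 for a 227x227 input
flattened = final * final * 256  # assumed 256 channels feeding the fully connected layers
print(trace(227), flattened)
```

A different input size gives a different flattened length, which the fully connected layers cannot accept. This is why every proposal has to be brought to the same size first.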

Classification and bounding-box regression

This actually comprises two sub-steps. The first classifies the feature vector output by the previous step (a classifier has to be trained on the features). The second obtains a precise target region through bounding-box regression. Since a real target usually produces several overlapping sub-regions, the aim is to precisely localize and merge the foreground targets that were classified, so that one object is not detected multiple times.
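
Merging overlapping detections is commonly done with greedy non-maximum suppression (NMS) based on intersection over union (IoU). The following is a minimal sketch of that idea; the box format (x1, y1, x2, y2), the threshold value, and the sample boxes are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: keep the best box, drop boxes that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one object plus one separate object.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In a multi-class detector this would normally be run separately for each class, using the class scores from the classifier.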

Figure: the steps involved in feature extraction.
The pipeline is divided into four steps:

  1. Input a test image.
  2. Use the selective search algorithm to extract about 2000 region proposals from the image.
  3. Scale (warp) each region proposal to 227x227 and feed it to the CNN; take the output of the fc7 layer as its feature.
  4. Feed the CNN feature extracted from each region proposal into SVMs for classification.
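
The warp in step 3 can be sketched as a crop followed by a resize that ignores the aspect ratio. The version below is a minimal pure-Python illustration using nearest-neighbor sampling on a grayscale image stored as a list of rows; the function name `warp_region`, the box convention, and the sample image are hypothetical, and a real pipeline would use an image library instead.

```python
def warp_region(image, box, out_size=227):
    """Crop box=(x1, y1, x2, y2), with x2 and y2 exclusive, from a 2D image
    and resize it to out_size x out_size by nearest-neighbor sampling."""
    x1, y1, x2, y2 = box
    crop_w, crop_h = x2 - x1, y2 - y1
    warped = []
    for out_y in range(out_size):
        src_y = y1 + out_y * crop_h // out_size  # nearest source row
        row = []
        for out_x in range(out_size):
            src_x = x1 + out_x * crop_w // out_size  # nearest source column
            row.append(image[src_y][src_x])
        warped.append(row)
    return warped


# Hypothetical 100x60 grayscale image and an 80x40 proposal.
image = [[(x + y) % 256 for x in range(100)] for y in range(60)]
warped = warp_region(image, (10, 5, 90, 45))
print(len(warped), len(warped[0]))
```

The 80x40 proposal is stretched to a square here, which is exactly the kind of distortion that warping introduces when the proposal's aspect ratio differs from the network input.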

R-CNN has several evident problems:
(1) Training is split into multiple stages and the steps are cumbersome: fine-tuning the network + training the SVMs + training the bounding-box regressor.
(2) The candidate regions of every image must be extracted in advance, so training is time consuming and takes up a lot of disk space: 5000 images generate several hundred GB of feature files.
(3) A conventional CNN requires a fixed-size input image, so each region is cropped or warped (normalized). This truncates or stretches the object and can lose information in the CNN input.
(4) Testing is slow: every candidate region has to go through the full CNN forward computation. Each region proposal is passed through the CNN separately, yet the thousands of regions overlap heavily, so the repeated feature extraction wastes a huge amount of computation. On a GPU, the VGG16 model needs 47 s to process one image.
(5) The SVMs and the regressor are added after the fact: the CNN features are not updated while the SVMs and the regressor are being trained.

Steps

  1. Apply selective search to the input image to select multiple high-quality proposed regions. These proposed regions are usually selected at multiple scales and have different shapes and sizes. Each proposed region is labeled with a category and a ground-truth bounding box.
  2. Select a pre-trained convolutional neural network and truncate it before the output layer. Warp each proposed region to the input size the network requires, and use forward computation to output the features extracted from the proposed region.
  3. Treat the features of each proposed region together with its labeled category as one sample, and train multiple support vector machines to classify the targets. Each support vector machine determines whether a sample belongs to one particular category.
  4. Treat the features of each proposed region together with its labeled bounding box as one sample, and train a linear regression model to predict the ground-truth bounding box.
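
For step 4, the regression model does not usually predict raw corner coordinates. As a sketch, the code below uses the offset parameterization described in the R-CNN paper: center shifts scaled by the proposal size, and log ratios of width and height. The helper names (`to_center`, `encode`, `decode`) and the sample boxes are hypothetical.

```python
import math


def to_center(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2.0, y1 + h / 2.0, w, h


def encode(proposal, ground_truth):
    """Regression targets (tx, ty, tw, th) that map the proposal to the ground truth."""
    px, py, pw, ph = to_center(proposal)
    gx, gy, gw, gh = to_center(ground_truth)
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def decode(proposal, targets):
    """Apply predicted (tx, ty, tw, th) to a proposal to get a refined box."""
    px, py, pw, ph = to_center(proposal)
    tx, ty, tw, th = targets
    cx, cy = px + tx * pw, py + ty * ph
    w, h = pw * math.exp(tw), ph * math.exp(th)
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


proposal = (50, 50, 150, 150)
ground_truth = (60, 40, 180, 160)
targets = encode(proposal, ground_truth)
refined = decode(proposal, targets)
print(targets, refined)
```

During training, `encode` would supply the targets for the linear regression model on the region features. At test time, `decode` would turn the model's predictions back into a box.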

Origin blog.csdn.net/u010095372/article/details/91147900