CVPR 2019 Paper Walkthrough | Few-Shot Domain-Adaptive Object Detection

Introduction

I have recently been looking at other directions in object detection. The broad directions still worth digging into start from the data: detecting hard samples, such as occluded objects or tiny faces, or algorithms for the case where data samples are scarce. Here I describe an object detection algorithm for domain adaptation with few-shot data: the paper "Few-shot Adaptive Faster R-CNN" from the National University of Singapore and Huawei Noah's Ark Lab, accepted at CVPR 2019. The concrete scenario it addresses is this: we have a car detector for ordinary, common scenes, but only a few car samples from very bad weather such as fog or storms. We can then use pairing-sampling. A source-domain sample, i.e. a car in an ordinary scene \(Car_s\), and a target-domain sample, i.e. a car in harsh weather \(Car_t\), form a pair \((Car_s, Car_t)\) that serves as a negative sample; a pair of two source-domain samples \((Car_s, Car_s)\) serves as a positive sample. Within a GAN architecture, the discriminator learns to tell positive and negative pairs apart, i.e. to distinguish source-domain samples from target-domain ones, while the generator tries to confuse the discriminator. This is the main idea of the algorithm: applying domain adaptation to object detection.

The paper's source code has not been fully released; I could only find the official repo: https://github.com/twangnh/FAFRCNN

A question to think about

Before going into the paper's network design and loss functions, we can keep one question in mind.

  1. In a GAN setup, using \(Car_s\) as positive samples and \(Car_t\) as negative samples should already let the discriminator distinguish source-domain samples from target-domain ones. Why does the paper compose the samples into pairs for training instead?

Algorithm design

Fig 1. Overall structure of the Few-shot Adaptive Faster R-CNN (FAFRCNN) network (the SMFR module is introduced later)

For the object detection task, the authors split domain adaptation into two levels:

    1. Image-level domain adaptation
    2. Instance-level domain adaptation

See Fig 2 below: the first row shows image-level domain shift and the third row shows instance-level domain shift. Image-level shift concerns the image as a whole, which is made up of individual pixels; instance-level shift concerns the object samples, here the cars.

Fig 2. The two images in the middle are from Cityscapes and Foggy Cityscapes. The first row shows image-level domain shift; the third row shows instance-level domain shift.

Image-level domain adaptation

Image-level domain adaptation (Image-level Adaptation) handles the shift between whole images of the two domains. For this the paper proposes the split pooling (SP) method, whose role is to place a grid at a random position. In practice it is simple: a grid cell has width w and height h; offsets sx and sy are generated at random, and the position of the grid is adjusted according to sx and sy.
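The grid placement can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' code: the function name `split_grid`, the offset ranges 0 <= sx < w and 0 <= sy < h, and the choice to keep only cells that lie fully inside the image are assumptions made here.

```python
import random


def split_grid(img_w, img_h, cell_w, cell_h, rng=None):
    """Place a grid of cell_w x cell_h cells at a random offset (sx, sy).

    Hypothetical helper: offset ranges and border handling are assumptions.
    """
    if rng is None:
        rng = random.Random()
    # Assumed: each offset is drawn uniformly within one cell.
    sx = rng.randrange(cell_w)
    sy = rng.randrange(cell_h)
    cells = []
    y = sy
    while y + cell_h <= img_h:
        x = sx
        while x + cell_w <= img_w:
            cells.append((x, y, x + cell_w, y + cell_h))  # (x1, y1, x2, y2)
            x += cell_w
        y += cell_h
    return (sx, sy), cells


# Example: a 1024 x 512 image with 256 x 256 cells.
(sx, sy), cells = split_grid(1024, 512, 256, 256, rng=random.Random(0))
```

Because the offsets are redrawn on every call, the same image yields differently aligned cells each time, which is what makes the grid "random".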

Fig 3. Grid selection

For the grid cells, the paper follows the way Faster R-CNN chooses anchor boxes, taking three scales and three aspect ratios. Correspondingly, the features \(f(x)\) extracted by split pooling come in three scales, large (l), medium (m) and small (s): \(sp_l(f(x)), sp_m(f(x)), sp_s(f(x))\).
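To make "three scales and three aspect ratios" concrete, here is a small sketch that enumerates the resulting cell sizes in the usual anchor-box style. The scale values 256, 160 and 96, the ratios 0.5, 1 and 2, the convention ratio = width / height, and the name `grid_cell_sizes` are all placeholders chosen for illustration; they are not taken from this post.

```python
import math


def grid_cell_sizes(scales, ratios):
    """Return {scale_name: [(w, h), ...]} with w * h close to scale ** 2.

    Assumed convention: ratio = w / h, as one common anchor-box definition.
    """
    sizes = {}
    for name, scale in scales.items():
        sizes[name] = [
            (round(scale * math.sqrt(r)), round(scale / math.sqrt(r)))
            for r in ratios
        ]
    return sizes


# Hypothetical values, for illustration only.
sizes = grid_cell_sizes({"l": 256, "m": 160, "s": 96}, (0.5, 1.0, 2.0))
for name, cells in sizes.items():
    print(name, cells)
```

Each scale name then corresponds to one split pooling branch, \(sp_l\), \(sp_m\) or \(sp_s\).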

After that, the generator and the discriminator can be trained adversarially. But since the target domain has only a few samples, the paper proposes a paired training scheme: source-source pairs \(G_{s1} = \{(g_s, g_s)\}\) and source-target pairs \(G_{s2} = \{(g_s, g_t)\}\). The discriminator judges which kind of pair a sample is; the generator, which here is the feature extractor, aims to confuse the discriminator.
\[ g_s \sim sp_k(f(X_s)), \quad g_t \sim sp_k(f(X_t)), \quad k \in \{l, m, s\} \]
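The pairing can be sketched as follows. This is a minimal illustration under assumptions: the function name `sample_pairs` is hypothetical, the strings stand in for split-pooled feature vectors, and uniform random sampling with replacement is a choice made here, not something stated in the post.

```python
import random


def sample_pairs(source_feats, target_feats, num_pairs, seed=0):
    """Return (pair, label) tuples: 1 = source-source, 0 = source-target."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        # Positive pair: both elements come from the source domain.
        pairs.append(((rng.choice(source_feats), rng.choice(source_feats)), 1))
        # Negative pair: one source element and one target element.
        pairs.append(((rng.choice(source_feats), rng.choice(target_feats)), 0))
    return pairs


source = ["s0", "s1", "s2", "s3"]  # stand-ins for source-domain features
target = ["t0"]                    # a single target-domain feature
pairs = sample_pairs(source, target, num_pairs=3)

# Number of distinct source-target pairs that pairing makes available.
num_distinct_negative = len(source) * len(target)
```

With n source features and m target features there are n × m distinct source-target pairs, so even a handful of target samples yields many negative training examples for the discriminator.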

\[ L_{sp_{sd}}=-\mathbb{E}_{x\sim{G_{s1}}}[logD^{sp_s}(x)]-\mathbb{E}_{x\sim{G_{s2}}}[log(1-D^{sp_s}(x))] \]

\[ L_{im_d} = L_{sp_{sd}} + L_{sp_{md}} + L_{sp_{ld}} \]

So the paper uses three GANs for image-level domain adaptation; I do not know how well this works in practice.
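As a numeric check of the discriminator losses, the sketch below evaluates one per-scale term and the three-scale sum. The function names are hypothetical, the discriminator outputs are made-up probabilities, and the small `eps` is only a numerical safeguard that does not appear in the formulas.

```python
import math


def pair_disc_loss(d_on_ss, d_on_st, eps=1e-12):
    """-E[log D(x)] over source-source pairs - E[log(1 - D(x))] over source-target pairs."""
    pos = -sum(math.log(p + eps) for p in d_on_ss) / len(d_on_ss)
    neg = -sum(math.log(1.0 - p + eps) for p in d_on_st) / len(d_on_st)
    return pos + neg


def image_level_disc_loss(outputs_per_scale):
    """Sum the per-scale losses; expects {"s": (ss, st), "m": (...), "l": (...)}."""
    return sum(pair_disc_loss(ss, st) for ss, st in outputs_per_scale.values())


# Made-up discriminator outputs D(x) in (0, 1), for illustration only.
outputs = {
    "s": ([0.9, 0.8], [0.2, 0.1]),
    "m": ([0.7, 0.6], [0.4, 0.3]),
    "l": ([0.5, 0.5], [0.5, 0.5]),
}
total = image_level_disc_loss(outputs)
```

A discriminator that outputs 0.5 everywhere gives 2 ln 2 per scale, and the loss drops as it separates the two kinds of pairs more confidently.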
This post will be re-edited in a week. To read the full text, see the article:

CVPR 2019 | Few-shot domain-adaptive object detection

Origin www.cnblogs.com/ManWingloeng/p/11617208.html