[Repost] Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd, explained (occluded pedestrian detection)

Paper: https://arxiv.org/pdf/1807.08407.pdf
The source code has not been released yet; if anyone finds it, please leave me a message.

I. Overview
This is another paper on improving the detection of occluded objects, here applied to pedestrian detection. It attacks the problem from two angles: the loss of the two-stage detector, and RoI Pooling, the detector's core operation. The rest of this post gives a short summary of each. First, a simple diagram of the network structure, which I will come back to in 2.2:

II. Occlusion-aware R-CNN
So that the RPN module extracts better, more accurate proposals, the authors design an aggregation loss (AggLoss). It pushes each proposal closer to its GT box and makes the proposals that belong to the same object as compact as possible. The overall RPN loss function is as follows:

Note: $i$ is the index of an anchor in the mini-batch; $p_i$ is the predicted probability that the $i$-th anchor is foreground (a pedestrian) rather than background; $t_i$ is the predicted bounding-box coordinates of the pedestrian for the $i$-th anchor; $p_i^*$ is the ground-truth class label of the $i$-th anchor, i.e. whether it is a pedestrian; $t_i^*$ is the coordinates of the GT pedestrian box that the $i$-th anchor is assigned to; $\alpha$ is a hyperparameter that balances the two loss terms.
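
Written out with the symbols from the note above, the RPN loss takes roughly the following form. This is a reconstruction from those definitions (a classification term plus an aggregation term weighted by $\alpha$), and the names $L_{rpn}$, $L_{cls}$ and $L_{agg}$ are labels chosen here, so check it against Eq. (1) of the paper:

```latex
L_{rpn}(\{p_i\}, \{t_i\}) = L_{cls}(\{p_i\}, \{p_i^*\}) + \alpha \, L_{agg}(\{p_i^*\}, \{t_i\}, \{t_i^*\})
```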

Here the classification loss is the log loss:
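
As a rough illustration of what a log loss over anchors computes, here is a minimal pure-Python sketch. The function name `log_loss`, the clipping constant `eps`, and the choice to average over all anchors in the batch are assumptions made for this example, not details taken from the paper.

```python
import math


def log_loss(probs, labels, eps=1e-12):
    """Mean binary log loss over anchors.

    probs  -- predicted foreground probabilities p_i, one per anchor
    labels -- ground-truth labels p_i^* (1 = pedestrian, 0 = background)
    eps    -- clipping constant to avoid log(0) (assumption for this sketch)
    """
    assert len(probs) == len(labels) and len(probs) > 0
    total = 0.0
    for p, y in zip(probs, labels):
        # Clip the probability so that log() is always defined.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    # Normalising by the number of anchors is an assumption of this sketch.
    return total / len(probs)
```

A confident correct prediction gives a small loss, e.g. `log_loss([0.9], [1])` is smaller than `log_loss([0.6], [1])`.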

2.1, Aggregation Loss (addressing occlusion from the loss side)
To reduce false detections between pedestrians standing close together, the authors force the proposals to lie near their real GT boxes and to be tightly distributed among themselves. They therefore propose AggLoss, which is used in both the RPN and the Fast R-CNN stage. This loss pulls each proposal toward its GT and minimizes the distance between proposals that belong to the same GT.

The first term in formula (3) is the regression loss, which pulls the proposals of each object closer to its true GT box; it uses the smooth L1 loss, given in formula (4) below. The second term is the compactness loss, which makes the proposals belonging to the same GT object tightly distributed.
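
The regression term can be sketched in a few lines of pure Python. The smooth L1 function below is the standard definition (quadratic below 1, linear above); the helper name `regression_loss`, the flat 4-number box format, and the normalisation by the number of positive anchors are assumptions made for this example.

```python
def smooth_l1(x):
    """Standard smooth L1: 0.5 * x^2 if |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def regression_loss(pred_boxes, gt_boxes, labels):
    """Smooth L1 between predicted and GT box coordinates.

    Only positive anchors (label 1) contribute, so background anchors
    are not pulled toward any box.
    """
    positives = [(t, g) for t, g, y in zip(pred_boxes, gt_boxes, labels) if y == 1]
    if not positives:
        return 0.0
    total = sum(smooth_l1(a - b) for t, g in positives for a, b in zip(t, g))
    # Dividing by the number of positive anchors is an assumption of this sketch.
    return total / len(positives)
```

In this sketch a positive anchor that is off by 0.5 in a single coordinate contributes 0.125, while negative anchors contribute nothing regardless of their predicted box.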

The compactness term $L_{com}$ tries to make the proposals that belong to the same GT as compactly distributed as possible, which reduces false detections between two people standing close together. It is defined as follows:
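
One way to read the compactness idea is: group the predicted boxes by the GT they are assigned to, average each group, and penalise the distance between that average and the GT box. The sketch below follows that reading in pure Python. The names `compactness_loss` and `gt_index`, the use of smooth L1 as the distance, the rule that only GTs with more than one assigned proposal contribute, and the normalisation by the number of such GTs are all assumptions of this example and should be checked against the paper.

```python
def smooth_l1(x):
    """Standard smooth L1: 0.5 * x^2 if |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def compactness_loss(pred_boxes, gt_index, gt_boxes):
    """Distance between each GT box and the mean of its assigned proposals.

    pred_boxes -- list of predicted boxes, each a list of 4 numbers
    gt_index   -- gt_index[i] is the index of the GT that proposal i belongs to
    gt_boxes   -- list of GT boxes, each a list of 4 numbers
    """
    # Group proposals by the GT they are assigned to.
    groups = {}
    for box, g in zip(pred_boxes, gt_index):
        groups.setdefault(g, []).append(box)

    total, count = 0.0, 0
    for g, boxes in groups.items():
        if len(boxes) < 2:
            # Assumption: a single proposal has nothing to be compact with.
            continue
        # Coordinate-wise mean of the proposals assigned to this GT.
        mean_box = [sum(coords) / len(boxes) for coords in zip(*boxes)]
        total += sum(smooth_l1(gt - m) for gt, m in zip(gt_boxes[g], mean_box))
        count += 1
    return total / count if count else 0.0
```

If two proposals straddle a GT box symmetrically, their mean coincides with the GT and the term is zero; the regression loss is what then pulls each individual proposal inward.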

2.2, Part Occlusion-aware RoI Pooling Unit (addressing occlusion from the RoI Pooling side)


As shown in Fig. 1, the pedestrian region is divided into five part regions, and an RoI pooling layer pools each of them into a feature map of fixed size H×W.


For each of the five part regions, an occlusion score between 0 and 1 is predicted to indicate whether that part is occluded; this can be seen as a simple part-level mask signal. Each of the five part features is then multiplied by its score, and the results are summed to obtain the final RoI feature.
However, the occluded part features are added directly to the global feature here. I do not see what that is supposed to mean; intuitively it does not make sense, and it does not feel very reasonable to me.
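
The weighting-and-summing step described above can be sketched as follows. Each feature is represented as a flat list of numbers (an H×W map flattened); the function name `fuse_part_features` and the optional `global_feature` argument are hypothetical and exist only for this illustration, which does not reproduce the authors' implementation.

```python
def fuse_part_features(part_features, scores, global_feature=None):
    """Weight each part feature by its predicted score and sum element-wise.

    part_features  -- list of part feature vectors, all of the same length
    scores         -- one score in [0, 1] per part
    global_feature -- optional whole-region feature added without weighting
    """
    assert len(part_features) == len(scores) and len(part_features) > 0
    size = len(part_features[0])
    fused = [0.0] * size
    for feat, s in zip(part_features, scores):
        assert len(feat) == size and 0.0 <= s <= 1.0
        for k, v in enumerate(feat):
            # A score near 0 suppresses the part; a score near 1 keeps it.
            fused[k] += s * v
    if global_feature is not None:
        assert len(global_feature) == size
        # The whole-region feature is added directly, as discussed in the text.
        fused = [f + g for f, g in zip(fused, global_feature)]
    return fused
```

With scores `[1.0, 0.0]` the second part is dropped entirely, which is what makes the score behave like a soft mask over parts.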
Accordingly, the authors define the Fast R-CNN loss as follows:

Finally, the authors ran experiments on several datasets. I will not go into the details here; if you are interested, go straight to the original paper.


Original: https://blog.csdn.net/gbyy42299/article/details/83986883


Origin blog.csdn.net/highlevels/article/details/98479201