[Tadeas] Image Detection

Object Detection

Bounding box(最小外截距项)和category label

Difference with the other task:

single task: classification <---> classification and localization
multi-task: object detection <---> instance segmentation

contest

imagenet large scale visual recognition challenge

URL: http://image-net.org/challenges/LSVRC/2017/index

object detection

2013-2017, 200 category, multi-category and bounding box each image

other task

image classification
scene classification
object localization
scene parsing

PASCAL VOC

URL： http://host.robots.ox.ac.uk/pascal/VOC/

eject Detection

Bounding box(最小外截距项)和category label

Difference with the other task:

single task: classification <---> classification and localization
multi-task: object detection <---> instance segmentation

contest

imagenet large scale visual recognition challenge

URL: http://image-net.org/challenges/LSVRC/2017/index

object detection

2013-2017, 200 category, multi-category and bounding box each image

other task

image classification
scene classification
object localization
scene parsing

PASCAL VOC

URL： http://host.robots.ox.ac.uk/pascal/VOC/

evolution：R-CNN --> SPP-Net --> Fast R-CNN --> Faster R-CNN --> YOLO --> SSD -->R-FCN

R-CNN

model structure:

Region proposal: devisity boxing with different position and size
Classification: CNN classificator, heavy calculation

$ $
R-CNN model

$ $ R-CNN in models

R-CNN trainning process

pre-train a imageNet CNN model --> Model1
fine-tune proposal regions with SS(selective search) -->Model2
-- Log loss
-- softmax layer --> (N+1) way
-- 32 positive samples （N classes）: IoU with Ground truth > 0.5
-- 96 negative samples (1 classes): IoU < 0.5

based on Model2, on its Fc7 train linear SVM classificator --> Model3
-- Hinge loss
-- each class with a SVM classficator (N classes)
-- postive samples: all Ground-truth areas
-- negative samples: IoU < 0.3 of SS of SS areas

with Model2, on Fc7, train Bounding box regression model
-- improve bounding box accuracy
-- a regression model for each class (N class)
--- bounding box in SS regression：P --> G
--- input:
----- bounding box ${P^i, G^i}_{i=1,...,N}$
----- center: $(x, y)$, width and height:$(w, h)$
----- $ P^i=(P^i_x, P^i_y, P^i_w, P^i_h), P^i=(P^i_x, P^i_y, P^i_w, P^i_h)$
----- feature for Conv5 of CNN: $\phi_5(P)$
--- IoU for P > 0.6
--- Squared loss:
----- ${{\rm{W}}_*} =\mathop {\arg \min }\limits_{{{{\rm{\hat w}}}_*}} {( {t_o^i - {{{\rm{\hat w}}}_o}{\phi_5}( {{P^i}} )} )^2} + \lambda ||{{\rm{\hat w}}_*}||^2$
----- where $t_x = (G_x-P_x)/{P_w}, t_y = (G_y-P_y)/{P_y}, t_w = log(G_w/{P_w}), t_h= log(G_h/{P_h})$
--- test process:
----- ${\hat G}_x = P_x d_x(P)+P_x, {\hat G}_y = P_y d_y(P)+P_y, {\hat G}_w = P_w exp(d_w(P)), d_*(P) = {\rm}^{\rm T}_* \phi_5 (P)$

explain for cost function: Ref

R-CNN test
-- SS (fast model) get about 2000 proposal areas/images
-- resize(expand and scaling) them into size of 227*227
-- Model2 to get 2 feature map: proposal areas subset and regressed Bbox
--- for each class
1). Fc7 feature --> SVM classificator --> class score
2). with IoU > 0.5, get non-redundant area subset

reorder all areas in decrease order with the score
remove redundant areas: IoU > 0.5 with the area of highest score
save the highest score area, the left areas to candidate set
3). conv5 --> Bounding box regression model --> regressed Bbox
4). with regressed Bboxs fine tune ares subset

performance evaluation

mAp@5 (mean Average Precision)
-- calculate AP for each class, then get mean value
AP is the area of Precision-Recall Curve
precision: TP/(TP+FP)
recall: TP/(TP+FN)
** True positive area: IoU >=0.5 with ground Truth
** False positive area: IoU < 0.5
** False negative area: unmarked Gound truth area:
** IoU = Intersection over Unit = (AnB)/(AUB)

defect：

too much time for training process
-- fine-tune (18) + feature extraction (63) + SUM/Box training (3)
too much time for test: a VGG16 47S
complex and multi-training processes

[Tadeas] Image Detection

Object Detection

Bounding box(最小外截距项)和category label

Difference with the other task:

contest

imagenet large scale visual recognition challenge

PASCAL VOC

Bounding box(最小外截距项)和category label

Difference with the other task:

contest

imagenet large scale visual recognition challenge

PASCAL VOC

R-CNN

猜你喜欢