Fine-grained classification

1. When is it used?

When images that already belong to the same broad class must be assigned to different subclasses.

Focus: widening the gap between subclasses inside the same class.

  • Semantic segmentation: a dense prediction task that infers a label for every pixel, so that each pixel inside an object is marked with the class of that object.
  • Instance segmentation: different objects of the same class receive separate labels. Part-based segmentation goes one step further and divides an already segmented class into its lower-level parts.
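As a minimal sketch of what "one label per pixel" means, assume some model has already produced a score map for each class. The `scores` array below is invented data, not the output of a real network, and the class meanings are hypothetical.

```python
import numpy as np

# Hypothetical per-class scores for a 2x3 image and 3 classes,
# shaped (num_classes, height, width). All values are made up.
scores = np.array([
    [[0.90, 0.20, 0.10],
     [0.80, 0.10, 0.10]],   # class 0 (e.g. background)
    [[0.05, 0.70, 0.20],
     [0.10, 0.60, 0.30]],   # class 1
    [[0.05, 0.10, 0.70],
     [0.10, 0.30, 0.60]],   # class 2
])

# Dense prediction: each pixel gets the class with the highest score.
label_map = scores.argmax(axis=0)
print(label_map)
```

The result has the same height and width as the image, which is what distinguishes dense prediction from assigning a single label to the whole image.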

 

 

2. Essence: how to detect the foreground object effectively and how to find the important local-region information.

 

3. Categories of methods

1. Fine-grained image classification models based on strong supervision:

To obtain better classification accuracy, these models use extra manual annotations in addition to the image-level category label, such as the object bounding box and part annotations (labeled part points).
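To make the difference between annotation levels concrete, here is a small sketch of how one training sample might be stored. The class name, field names, file names and coordinates are all hypothetical and do not come from any particular dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Point = Tuple[int, int]           # (x, y) in pixels


@dataclass
class FineGrainedSample:
    """One training image with its annotations (hypothetical layout)."""
    image_path: str
    label: str                                              # image-level category label
    bbox: Optional[Box] = None                              # object bounding box
    parts: Dict[str, Point] = field(default_factory=dict)   # part annotations

    def is_strongly_supervised(self) -> bool:
        # "Strong" here means at least one extra manual annotation is present.
        return self.bbox is not None or bool(self.parts)


strong = FineGrainedSample(
    image_path="bird_0001.jpg",          # made-up file name
    label="example_species",
    bbox=(30, 40, 210, 180),
    parts={"head": (60, 70), "torso": (130, 120)},
)
weak = FineGrainedSample(image_path="bird_0002.jpg", label="example_species")

print(strong.is_strongly_supervised(), weak.is_strongly_supervised())
```

A sample that carries only `label` corresponds to the image-level annotation that weakly supervised models rely on.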

                                    

 

Part-based R-CNN [1]:

Objective: detect the fine-grained object as a whole (e.g., a bird) together with its part regions (head, body and other parts).

Results: 1) Using the object bounding boxes and part annotations of the fine-grained images, three detection models are trained: one for the fine-grained object as a whole, one for the object's head, and one for the other body parts.

           2) Geometric constraints are imposed on the detection boxes returned by the three models, e.g., the head and torso must keep a plausible relative position and must not be offset too far from each other. This yields a good object/part detection result (right side of the figure).

           3) The resulting image patches are used as input to train one CNN each, so that each CNN learns features specific to the object or to a part. Finally, the fully connected layer features of the three CNNs are concatenated to form the feature representation of the whole fine-grained image.
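Steps 2 and 3 above can be pictured with the following sketch. The constraint rule (the part center must lie inside the object box enlarged by a small margin) is an assumption chosen for illustration, not the rule used in the paper, and the feature vectors are random stand-ins for real CNN outputs.

```python
import numpy as np


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2.0, (y0 + y1) / 2.0


def satisfies_constraint(obj_box, part_box, max_offset=0.1):
    """Assumed rule: the part center lies inside the object box
    enlarged by max_offset times the object's width and height."""
    x0, y0, x1, y1 = obj_box
    margin_x = max_offset * (x1 - x0)
    margin_y = max_offset * (y1 - y0)
    cx, cy = box_center(part_box)
    return (x0 - margin_x <= cx <= x1 + margin_x) and (y0 - margin_y <= cy <= y1 + margin_y)


# Made-up detections; head candidates are listed from highest to lowest score.
obj_box = (20, 30, 220, 200)
head_candidates = [(300, 10, 340, 50), (40, 40, 90, 90)]

# Step 2: keep the best-scoring head box that respects the constraint.
head_box = next(b for b in head_candidates if satisfies_constraint(obj_box, b))

# Step 3: concatenate the per-patch features (random placeholders here).
rng = np.random.default_rng(0)
f_object, f_head, f_torso = (rng.standard_normal(8) for _ in range(3))
image_feature = np.concatenate([f_object, f_head, f_torso])

print(head_box, image_feature.shape)
```

The top-scoring head candidate is rejected because it lies far outside the object box, which is the kind of implausible detection the geometric constraint is meant to filter out.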

 

Pose Normalized CNN [3]:

                         

Results: 1) Using the object bounding boxes and part annotations of the fine-grained images, three detection models are trained: one for the fine-grained object as a whole, one for the object's head, and one for the other body parts.

           2) Geometric constraints are imposed on the detection boxes returned by the three models, e.g., the head and torso must keep a plausible relative position and must not be offset too far from each other. This yields a good object/part detection result (right side of the figure).

           3) Pose Normalized CNN applies pose alignment to the part-level image patches. For image patches at different levels of the fine-grained image (different parts), it extracts convolutional features from different layers (for example, for the global feature it extracts the fc8 feature).

 

Mask-CNN [4]:

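Results: 1) The first model of this kind to be trained end to end.

       2) Relying only on the part annotations provided during training (no bounding box is needed, and no extra supervision is needed at test time), it achieved the highest fine-grained image classification accuracy reported at the time of writing.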

 

 

 

2. Fine-grained image classification models based on weak supervision:

Objective: use only image-level labels when training the model, with no additional part annotation information.

  • Two Level Attention Model [5]

 

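Notes: 1) Preprocessing model: a large number of candidate regions are generated from the input image and then filtered, keeping only the candidate regions that contain the foreground object.

          2) The candidate regions vary in size: some may contain the head, others only the feet. Spectral clustering is applied to the features of these regions to obtain k distinct clusters.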
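The clustering in step 2 can be illustrated with a small NumPy sketch of normalized spectral clustering. This is a generic textbook variant under stated assumptions (Gaussian affinity with a hand-picked `sigma`, k-means with farthest-point initialization), not the exact procedure of the paper, and `region_features` is made-up 2-D data standing in for real region features.

```python
import numpy as np


def spectral_clustering(features, k, sigma=1.0, n_iter=20):
    """Cluster the rows of `features` into k groups."""
    # 1) Gaussian affinity between every pair of feature vectors.
    sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)

    # 2) Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    inv_sqrt_deg = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(features)) - inv_sqrt_deg[:, None] * affinity * inv_sqrt_deg[None, :]

    # 3) Embed each sample with the k eigenvectors of smallest eigenvalue
    #    (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(laplacian)
    embed = eigvecs[:, :k]
    embed = embed / np.linalg.norm(embed, axis=1, keepdims=True)

    # 4) Plain k-means on the embedding, with farthest-point initialization.
    centers = [embed[0]]
    for _ in range(1, k):
        dist = np.min([((embed - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(embed[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((embed[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = embed[labels == j].mean(axis=0)
    return labels


# Made-up features: three regions resembling one part, three resembling another.
region_features = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.3],
])
labels = spectral_clustering(region_features, k=2)
print(labels)
```

Each resulting cluster groups regions with similar appearance, so a cluster can be read as a candidate part such as "head" or "feet".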

 

 

Bilinear CNN [7]:

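Network A: localizes the object, i.e., it does the object and local-region detection work of traditional algorithms.

Network B: extracts features at the object locations detected by network A.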

                         

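$B = (f_A, f_B, P, C)$, where $f_A$ and $f_B$ are two different feature functions, $P$ is the pooling operation, and $C$ is the classifier. The bilinear combination of the two features is computed at each location $l$.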
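The per-location formula itself is not reproduced in this text. In the Bilinear CNN paper it is the outer product of the two feature vectors at location $l$, $\mathrm{bilinear}(l, I, f_A, f_B) = f_A(l, I)^{T} f_B(l, I)$, and the pooling $P$ aggregates these products over all locations. The sketch below assumes sum pooling and uses small made-up feature arrays `fa` and `fb`; the function name `bilinear_pool` is hypothetical.

```python
import numpy as np


def bilinear_pool(fa, fb):
    """fa: (L, M) and fb: (L, N) features at the same L locations.
    Returns the sum-pooled bilinear descriptor of length M * N."""
    # Outer product of the two feature vectors at every location: (L, M, N).
    per_location = np.einsum("lm,ln->lmn", fa, fb)
    # Sum pooling over the L locations (an assumed choice for P).
    pooled = per_location.sum(axis=0)
    return pooled.reshape(-1)


# Made-up features at L = 3 locations, with M = 2 and N = 3 channels.
fa = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])
fb = np.array([[1.0, 0.0, 2.0], [2.0, 1.0, 0.0], [0.0, 1.0, 1.0]])

phi = bilinear_pool(fa, fb)
print(phi)
```

With sum pooling the whole operation reduces to the matrix product `fa.T @ fb`, which is how it is usually computed in practice. The resulting vector is what the classifier $C$ receives.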

 


Origin blog.csdn.net/weixin_38740463/article/details/91818112