An overview of approaches for improving Yolov5 small target detection performance

Table of contents

1. Introduction to small target detection

1.1 Definition of small targets

1.2 Difficulties

2. Solutions to the difficulties of small target detection

2.1 Attention improves the detection accuracy of small targets

2.1.1 Context Information CAM

2.1.2 ConvNeXt

2.1.3 ECVBlock

2.1.4 Generalized building blocks of multi-head context aggregation (Context Aggregation)

2.2 Additional detection head

2.3 Loss optimization

2.3.1 Wasserstein Distance Loss

3. To be continued

1. Introduction to small target detection

1.1 Definition of small targets

1) Taking COCO, a general-purpose dataset in the object detection field, as an example: a small object is one whose area is smaller than 32×32 pixels (a medium object is between 32×32 and 96×96 pixels, and a large object is larger than 96×96 pixels);
2) In practical applications, size is more often defined relative to the original image: multiply the width and height of the object's label box, divide by the product of the width and height of the whole image, and take the square root. If the result is less than 3%, the object is called a small target.
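Both definitions can be checked in a few lines. The sketch below is a minimal illustration and is not part of YOLOv5 or the COCO tooling: the function names are hypothetical, boxes are assumed to be given as pixel widths and heights, and the treatment of boxes that fall exactly on a threshold is an assumption.

```python
import math


def coco_size_class(box_w: float, box_h: float) -> str:
    """Classify a box by absolute area, following the COCO-style thresholds."""
    area = box_w * box_h
    if area < 32 * 32:
        return "small"
    if area <= 96 * 96:  # boundary handling here is an assumption
        return "medium"
    return "large"


def relative_size(box_w: float, box_h: float, img_w: float, img_h: float) -> float:
    """Square root of (box area / image area)."""
    return math.sqrt((box_w * box_h) / (img_w * img_h))


def is_small_relative(box_w, box_h, img_w, img_h, threshold: float = 0.03) -> bool:
    """True if the relative size is below the 3% threshold from the text."""
    return relative_size(box_w, box_h, img_w, img_h) < threshold


if __name__ == "__main__":
    print(coco_size_class(20, 20))              # small
    print(relative_size(32, 32, 640, 640))      # relative size of a 32x32 box in a 640x640 image
    print(is_small_relative(19, 19, 640, 640))  # True, since 19/640 is below 0.03
```

Note that the two definitions can disagree: a 20×20 box is small by the COCO rule in any image, while the relative rule depends on the image resolution.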

1.2 Difficulties

1) Few training samples contain small targets, which can bias the detection model toward medium and large targets;

2) Small objects cover a smaller area, so their locations lack diversity. We speculate that this makes the generalizability of small object detection difficult to verify.

2. Solutions to the difficulties of small target detection

The main approaches are data optimization (such as graffiti data augmentation and mosaic augmentation), network optimization, attention mechanisms, and loss optimization.

2.1 Attention improves the detection accuracy of small targets

2.1.1 Context Information CAM

Tiny objects are difficult to detect because of their low resolution and small size. According to the CAM paper, the main reasons for poor tiny object detection performance are the limitations of the network and the imbalance of the training dataset. The paper proposes a novel feature pyramid network that combines context augmentation and feature refinement. Features obtained by multi-scale dilated convolution are fused and injected into the feature pyramid network from top to bottom to supplement context information. In multi-scale feature fusion, channel and spatial feature refinement mechanisms are introduced to suppress the conflicts that fusion produces and to prevent tiny objects from being submerged in conflicting information. In addition, the paper proposes a data augmentation method called copy-reduce-paste, which increases the contribution of tiny objects to the loss during training and thus makes training more balanced.
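To make the idea of multi-scale dilated convolution concrete, here is a minimal sketch on a 1-D signal in plain Python. It is not the paper's module: the dilation rates (1, 3, 5), the averaging used to fuse the branches, and all function names are assumptions chosen for illustration. The point is that the same 3-tap kernel covers a wider context as the dilation rate grows.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input positions, covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, weights, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form)."""
    half = (len(weights) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def fuse_multi_rate(signal, weights, dilations=(1, 3, 5)):
    """Run one branch per dilation rate and fuse by averaging (illustrative choice)."""
    branches = [dilated_conv1d(signal, weights, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


if __name__ == "__main__":
    for d in (1, 3, 5):
        print(d, effective_kernel_size(3, d))  # 3, 7 and 11 input positions
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(dilated_conv1d(impulse, [1, 1, 1], 3))
    print(fuse_multi_rate(impulse, [1, 1, 1]))
```

In a real network each branch would be a 2-D convolution with learned weights, and the fusion step could be a sum, a concatenation, or a learned weighting rather than a plain average.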

Reference: "Tiny target detection based on Yolov5/Yolov7: context information CAM significantly improves tiny target accuracy", AI Little Monster's Blog, CSDN

2.1.2 ConvNeXt

Reference: "Accuracy booster: adding ConvNeXt to Yolov5/Yolov7 to improve small target detection, applicable to the whole YOLO series", AI Little Monster's Blog, CSDN

2.1.3 ECVBlock

Reference: "Small target detection with CFPNet's ECVBlock on YoloV5: plug and play, helps raise detection accuracy", AI Little Monster's Blog, CSDN

The proposed EVC is mainly composed of two blocks connected in parallel, in which a lightweight MLP is used to capture the global long-range dependencies (i.e., global information) of the top-level features.

2.1.4 Generalized building blocks of multi-head context aggregation (Context Aggregation)

Reference: "Yolov5 accuracy booster: an attention mechanism built on a generalized building block for multi-head context aggregation (Context Aggregation), which helps small target detection and raises accuracy sharply", AI Little Monster's Blog, CSDN

2.2 Additional detection head

YOLOv5 has three detection heads, so it can detect targets at multiple scales, but its ability to detect tiny targets may still be poor. Adding a detection head dedicated to tiny objects can therefore raise accuracy considerably, with a clear improvement in mAP.
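A quick calculation shows why an extra head can help. The default YOLOv5 heads predict on feature maps with strides 8, 16 and 32. The sketch below assumes the added head works at stride 4 (often called a P2 head); that stride, the 640-pixel input size, and the function names are assumptions for illustration, since the text above does not specify them.

```python
def head_grids(img_size: int, strides):
    """Grid side length of each detection head for a square input image."""
    return {s: img_size // s for s in strides}


def cells_spanned(obj_size: float, stride: int) -> float:
    """How many grid cells an object of the given side length spans at a stride."""
    return obj_size / stride


if __name__ == "__main__":
    default_strides = (8, 16, 32)        # the three standard YOLOv5 heads
    extended_strides = (4, 8, 16, 32)    # assumed extra head at stride 4
    print(head_grids(640, default_strides))
    print(head_grids(640, extended_strides))
    # An 8x8-pixel object fills one cell at stride 8 but two cells per side
    # at stride 4, so the finer map keeps more of its spatial detail.
    for s in extended_strides:
        print(s, cells_spanned(8, s))
```

The finer grid comes at a cost: a stride-4 map has four times as many cells as a stride-8 map, which increases computation and memory use.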

Reference: "Accuracy tips: tiny target detection based on Yolov5, an additional detection head improves small target detection accuracy", AI Little Monster's Blog, CSDN

2.3 Loss optimization

2.3.1 Wasserstein Distance Loss

Reference: "Yolov7/Yolov5 loss function improvement: Wasserstein Distance Loss, helping small targets gain accuracy", Programmer Sought

The work behind the Normalized Wasserstein Distance (NWD) makes three contributions:

1) It analyzes the sensitivity of IoU to position deviations of small objects and proposes NWD as a better metric for measuring the similarity between two bounding boxes;

2) It designs a strong tiny object detector by applying NWD to label assignment, NMS, and the loss function in anchor-based detectors;

3) The proposed NWD significantly improves the tiny object detection (TOD) performance of popular anchor-based detectors, raising the result of Faster R-CNN on the AI-TOD dataset from 11.1% to 17.6%.

The main advantages of the Wasserstein distance are:

  1. It can measure distribution similarity whether or not the small objects overlap;
  2. NWD is insensitive to object scale, which makes it better suited to measuring the similarity between small objects.
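The difference between IoU and NWD can be seen on a pair of tiny boxes. The sketch below is a minimal illustration under stated assumptions: boxes are in (cx, cy, w, h) format, each box is modeled as a 2-D Gaussian with mean (cx, cy) and standard deviations (w/2, h/2), and the normalizing constant `c` is a dataset-dependent hyperparameter whose value here (12.8) is only a placeholder. The function names are hypothetical and do not come from the YOLOv5 code base.

```python
import math


def iou_cxcywh(a, b) -> float:
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c: float = 12.8) -> float:
    """Normalized Wasserstein distance between two boxes modeled as Gaussians.

    `c` is an assumed normalizing constant; tune it to the dataset.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


def nwd_loss(a, b, c: float = 12.8) -> float:
    """A simple loss form: 1 - NWD."""
    return 1.0 - nwd(a, b, c)


if __name__ == "__main__":
    box = (10.0, 10.0, 6.0, 6.0)
    shifted = (11.0, 11.0, 6.0, 6.0)   # 1-pixel diagonal shift
    far = (20.0, 20.0, 6.0, 6.0)       # no overlap with `box`
    print(iou_cxcywh(box, shifted), nwd(box, shifted))
    print(iou_cxcywh(box, far), nwd(box, far))  # IoU is 0, NWD is still positive
```

For the non-overlapping pair, IoU gives no gradient signal about how far apart the boxes are, whereas NWD still decreases smoothly with distance, which matches the first advantage listed above.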

3. To be continued


Origin blog.csdn.net/m0_63774211/article/details/131455404