Summary of Target Detection

1. Classification of target detection methods

First, when prior knowledge of the target is available

In this case, there are two types of methods for detecting targets:
(1) Use prior knowledge of the target to train a set of weak classifiers, which then vote together to detect the target;
(2) Use the prior knowledge to find the best dividing line (decision boundary) between target and non-target.
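
A minimal sketch of idea (1), assuming each weak classifier is a hypothetical single-feature threshold test and the vote is an unweighted majority. The feature values, thresholds and function names are made up for illustration; boosting methods such as AdaBoost would learn the stumps and weight their votes instead.

```python
# Illustrative only: the features, thresholds and stumps below are invented.

def make_stump(feature_index, threshold):
    """Return a weak classifier that votes 1 (target) if one feature exceeds a threshold."""
    def stump(features):
        return 1 if features[feature_index] > threshold else 0
    return stump

def majority_vote(weak_classifiers, features):
    """Label the sample as target (1) when more than half of the weak classifiers vote 1."""
    votes = sum(clf(features) for clf in weak_classifiers)
    return 1 if 2 * votes > len(weak_classifiers) else 0

stumps = [make_stump(0, 0.5), make_stump(1, 0.3), make_stump(2, 0.8)]
print(majority_vote(stumps, [0.9, 0.4, 0.1]))  # two of three stumps vote "target"
print(majority_vote(stumps, [0.1, 0.2, 0.9]))  # only one stump votes "target"
```

Each stump alone is a poor detector; the point of the sketch is only that the combined vote decides.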

Second, when no prior knowledge of the target is available

In this case the target to be detected is not known in advance, so what counts as a target has to be defined in other ways:
(1) Detect salient targets in the scene: for example, use some features to express the probability that each pixel in the scene is salient, and then locate the salient target.
(2) Detect moving targets in the scene.

2. Classic foreground-background separation methods for target detection

2.1 Background subtraction method

When detecting a moving target against a static background, the current image is subtracted from a pre-stored background image, and a threshold is applied to the difference to find the moving region. This is a technique for recognizing dynamic targets.

Background subtraction is suitable when the background is known; the difficulty lies in automatically obtaining a static background model that stays valid over a long period.
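
The subtraction and thresholding step can be sketched as follows, assuming grayscale images stored as nested lists. The running-average update is one common way to maintain a background model and is an assumption here, not something the text prescribes; the function names and the `alpha` parameter are hypothetical.

```python
def background_subtraction(frame, background, threshold):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

def update_background(background, frame, alpha=0.05):
    """Running-average update (an assumed strategy): B <- (1 - alpha) * B + alpha * frame."""
    return [
        [(1 - alpha) * b + alpha * f for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 12, 200], [11, 180, 10]]
mask = background_subtraction(frame, background, threshold=30)
print(mask)  # small changes (noise) stay 0, large changes become 1
```

A small `alpha` makes the model adapt slowly, so brief foreground motion is not absorbed into the background.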

2.2 Frame difference method

Target detection and extraction use the difference between two or more consecutive frames of the video sequence. (This is better suited to dynamically changing scenes.)

During motion detection, this method uses temporal information: it compares several consecutive frames to obtain the grayscale difference of corresponding pixels. If the differences are all greater than a certain threshold T2, a moving target can be judged to be present at that position.
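
A minimal sketch of the rule above for three consecutive frames, assuming grayscale frames stored as nested lists. The function name is hypothetical, and `threshold` plays the role of T2.

```python
def three_frame_difference(prev_frame, curr_frame, next_frame, threshold):
    """Mark a pixel as moving (1) only if both consecutive differences exceed the threshold."""
    mask = []
    for p_row, c_row, n_row in zip(prev_frame, curr_frame, next_frame):
        mask.append([
            1 if abs(c - p) > threshold and abs(n - c) > threshold else 0
            for p, c, n in zip(p_row, c_row, n_row)
        ])
    return mask

# A bright pixel moves one column to the right in each frame.
f1 = [[200, 10, 10, 10]]
f2 = [[10, 200, 10, 10]]
f3 = [[10, 10, 200, 10]]
print(three_frame_difference(f1, f2, f3, threshold=50))
```

Requiring both differences to be large keeps only the target's position in the middle frame and suppresses the "ghost" left at its previous position.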

2.3 Optical flow field method

The method assumes that corresponding pixels keep the same gray level in two adjacent frames (brightness constancy) and uses this to estimate motion in the two-dimensional image.

It separates the relevant foreground target from the background well and can even detect parts of a moving target, which makes it suitable for detecting relatively moving targets while the camera itself is moving.

Disadvantages: the aperture problem and the non-uniqueness of the solution to the optical flow constraint equation; the optical flow field cannot accurately represent the actual motion field.
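
For reference, the gray-level constancy assumption and the constraint equation it leads to are usually written as follows. The symbols are not in the original and are assumptions of this note: I is the image intensity, (u, v) the flow at a pixel, and I_x, I_y, I_t the partial derivatives of I.

```latex
I(x, y, t) = I(x + u\,\Delta t,\; y + v\,\Delta t,\; t + \Delta t)
\quad\Longrightarrow\quad
I_x u + I_y v + I_t = 0
```

This is one equation in two unknowns (u, v) per pixel, which is where the non-uniqueness comes from: only the flow component along the image gradient is determined. Methods such as Lucas-Kanade and Horn-Schunck add a local or global smoothness assumption to pick a solution.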

3. Common deep learning methods for detecting small targets

1. Traditional image pyramid and multi-scale sliding window detection

  • Before deep learning became popular, the usual way to handle targets of different scales was to build an image pyramid of different resolutions from the original image and slide a classifier with a fixed input resolution over each level of the pyramid, detecting small targets at the bottom of the pyramid. Alternatively, only the original image is used and classifiers of different resolutions are run on it, so that small targets are detected by the classifier with the smaller window.
  • Evaluation: this approach is slow. Building the pyramid can be accelerated with separable convolution kernels or by simply resizing the image, but features still have to be extracted several times. The Feature Pyramid Network (FPN) later borrowed the idea: it fuses features taken from different layers, needs only one forward pass, and does not need to rescale the image. It is also used for small target detection.
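
The pyramid-plus-sliding-window procedure can be sketched as below, assuming a grayscale image stored as nested lists and 2x2 average pooling as the downsampling step. The function names, window size and stride are hypothetical, and the classifier that would score each window is left out.

```python
def downsample_2x(image):
    """Halve the resolution by averaging each 2x2 block (odd trailing rows/columns are dropped)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
            for c in range(w)
        ]
        for r in range(h)
    ]

def build_pyramid(image, min_size):
    """Return [original, 1/2, 1/4, ...] until the next level would be smaller than min_size."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample_2x(levels[-1]))
    return levels

def sliding_windows(image, window, stride):
    """Yield the (row, col) top-left corner of every window position on one level."""
    for r in range(0, len(image) - window + 1, stride):
        for c in range(0, len(image[0]) - window + 1, stride):
            yield r, c

image = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, min_size=2)
for level, img in enumerate(pyramid):
    count = len(list(sliding_windows(img, window=2, stride=1)))
    print(level, len(img), len(img[0]), count)
```

The same fixed-size window covers a larger part of the scene on each coarser level, so large targets are found on coarse levels and small targets on the full-resolution level. The cost noted in the evaluation is visible here: every level has to be scanned separately.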

2. Simple but reliable data augmentation

  • Copying a small target to several places in an image increases the number of anchor boxes matched to small targets, raises the training weight of small targets, and reduces the network's bias toward large targets.
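
A minimal copy-paste sketch, assuming an image stored as nested lists and boxes given as (top, left, height, width). The function name and the paste positions are hypothetical, and a real pipeline would also check that copies do not cover existing objects.

```python
def paste_copies(image, box, offsets):
    """Copy the patch inside box=(top, left, height, width) to each (row, col) offset.

    Returns the augmented image and the list of all boxes (original plus copies).
    Copies that would fall outside the image are skipped.
    """
    top, left, h, w = box
    patch = [row[left:left + w] for row in image[top:top + h]]
    out = [row[:] for row in image]  # leave the input image unmodified
    boxes = [box]
    for r0, c0 in offsets:
        if r0 < 0 or c0 < 0 or r0 + h > len(image) or c0 + w > len(image[0]):
            continue
        for dr in range(h):
            out[r0 + dr][c0:c0 + w] = patch[dr]
        boxes.append((r0, c0, h, w))
    return out, boxes

image = [[0] * 6 for _ in range(4)]
image[0][0] = image[0][1] = 9  # a 1x2 "small target"
augmented, boxes = paste_copies(image, (0, 0, 1, 2), [(2, 3), (3, 5)])
print(boxes)  # the second offset is skipped because it would leave the image
```

Every extra box in the returned list is one more ground-truth instance that anchors can match during training.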

3. Feature fusion FPN

  • Feature maps at different stages correspond to different receptive fields and express information at different levels of abstraction.
  • Shallow feature maps have a small receptive field and are better suited to detecting small targets (for a large target they only "see" part of it, so the effective information is insufficient);
  • Deep feature maps have a large receptive field and are suited to detecting large targets (for a small target they "see" too much background, so there is too much redundant noise).
  • Feature Pyramid Network (FPN): fuses feature maps from different stages to improve target detection performance.
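
The fusion step can be illustrated with a simplified top-down merge, assuming single-channel feature maps stored as nested lists where the deep map has half the resolution of the shallow one. This sketch keeps only nearest-neighbour upsampling and element-wise addition; the 1x1 lateral convolutions and the smoothing convolution that FPN applies around the addition are omitted, and the function names are hypothetical.

```python
def upsample_2x(feature_map):
    """Nearest-neighbour upsampling: repeat every value twice along both axes."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out

def fuse_top_down(shallow, deep):
    """Add the upsampled deep map to the shallow map element by element."""
    up = upsample_2x(deep)
    return [[s + u for s, u in zip(s_row, u_row)] for s_row, u_row in zip(shallow, up)]

shallow = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]  # fine, 4x4
deep = [[100, 200], [300, 400]]  # coarse, 2x2
print(fuse_top_down(shallow, deep))
```

The fused map keeps the resolution of the shallow map while carrying the more abstract information of the deep map, which is why the fused levels help with small targets.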

4. Appropriate training methods: SNIP, SNIPER, SAN

  • The model pre-training distribution should be as close as possible to the distribution of the test input.

5. Denser anchor sampling and matching strategies: S3FD, FaceBoxes

  • If the dataset is fixed, you can also increase the number of anchors assigned to small targets so that small targets are learned more thoroughly during training.
  • In addition, using a looser matching strategy (for example IoU > 0.4) for the anchors of small targets is also common.
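
The effect of a looser threshold can be sketched as follows, assuming boxes in (x1, y1, x2, y2) form. The anchor coordinates and the stricter value 0.5 used for comparison are made up; only the 0.4 figure comes from the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_anchors(anchors, gt_box, threshold):
    """Return indices of anchors whose IoU with the ground-truth box exceeds the threshold."""
    return [i for i, a in enumerate(anchors) if iou(a, gt_box) > threshold]

gt = (0, 0, 10, 10)
anchors = [(0, 0, 10, 10), (0, 0, 10, 15), (0, 0, 10, 22), (20, 20, 30, 30)]
print(match_anchors(anchors, gt, threshold=0.5))  # stricter: fewer positive anchors
print(match_anchors(anchors, gt, threshold=0.4))  # looser: one more anchor matches
```

Small targets overlap few anchors well, so lowering the threshold gives them more positive samples at the price of less precise matches.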

6. GAN-based methods: generate an enlarged version of the small target first, then detect

7. Using context information: Relation Network and PyramidBox

4. Choosing a target detector

  1. If you need to detect small objects and speed is not a requirement, prefer Faster R-CNN;
  2. If speed matters most, prefer YOLO;
  3. If you need balanced performance, prefer SSD.

Origin blog.csdn.net/libo1004/article/details/110881739