Object Detection from Scratch, Study Notes (1): Prerequisite Knowledge

Study notes for the Baidu PaddlePaddle "Zero-Basics Practice of Deep Learning" object detection series.

The goal of an object detection task is to predict both the category and the location of each target in an image.
The two core problems are:
1. How to generate candidate regions
2. How to extract image features

Development of object detection:

There are currently two main directions:

(1) Anchor-Free

Anchor-free methods fall into two groups: center-point-based methods and bounding-box-based methods.
1. Center-point-based methods treat each pixel as a possible center of a target, and predict the target's size and category from that center.
2. Bounding-box-based methods regress the object's bounding box directly from the feature map, without any predefined anchor boxes.

(2) Anchor-Based

Anchor-based methods use predefined anchor boxes (also called prior boxes) when predicting the target's position and size.


Object detection basics:

(1) Bounding box (bbox)

A bbox marks the location of a target object in the image and is usually paired with the target's category. Given an input image, the model predicts the category and position of each target and outputs the corresponding detection results.
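A bbox is typically stored in one of two formats: corner coordinates (x1, y1, x2, y2) or center plus size (xc, yc, w, h). A minimal sketch of the two formats and the conversion between them (function names and the `detection` dict are illustrative, not from any specific library):

```python
def xywh_to_xyxy(box):
    """Convert [x_center, y_center, width, height] to [x1, y1, x2, y2]."""
    xc, yc, w, h = box
    return [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]

def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x_center, y_center, width, height]."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]

# A detection result usually pairs a bbox with a category and a confidence score:
detection = {"bbox": [100.0, 150.0, 40.0, 60.0],  # xywh format, in pixels
             "label": "cat", "score": 0.92}
print(xywh_to_xyxy(detection["bbox"]))  # [80.0, 120.0, 120.0, 180.0]
```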

(2) Anchor box (Anchor Box)

An anchor box, also known as a prior box, is a technique used in object detection to generate candidate boxes. Anchor boxes are a set of predefined boxes with different sizes and aspect ratios, used to match the different objects in the input image. They are usually generated at every location of the input image (or feature map) so as to capture objects of different scales and shapes.
During training, the model is supervised by matching the predefined anchor boxes against the positions and sizes of the ground-truth objects, and learns to predict each object's category and location. During inference, the model recovers the object's position and size from its predictions relative to the anchor boxes.
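A minimal sketch of anchor generation at a single feature-map location, one anchor per (scale, aspect-ratio) pair; the convention here (area ≈ scale², ratio = w/h) is one common choice, and the function name is my own:

```python
import itertools

def generate_anchors(center_x, center_y, scales, ratios):
    """Generate anchor boxes (x1, y1, x2, y2) centered at one location,
    one per (scale, aspect-ratio) combination."""
    anchors = []
    for scale, ratio in itertools.product(scales, ratios):
        w = scale * ratio ** 0.5   # width  = scale * sqrt(ratio)
        h = scale / ratio ** 0.5   # height = scale / sqrt(ratio), so w*h = scale^2
        anchors.append((center_x - w / 2, center_y - h / 2,
                        center_x + w / 2, center_y + h / 2))
    return anchors

# 3 scales x 3 ratios = 9 anchors at this location (a common configuration)
anchors = generate_anchors(64, 64, scales=[32, 64, 128], ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 9
```

Repeating this at every feature-map location yields the full anchor grid that the detector classifies and refines.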

(3) Intersection over Union (IoU)

Intersection over Union (IoU) is a metric used to evaluate object detection algorithms. It measures how much the bounding box predicted by the model overlaps with the ground-truth bounding box.

The formula is: IoU = intersection area / union area.
The intersection area is the overlap between the predicted bbox and the ground-truth bbox; the union area is the sum of the two bbox areas minus their overlap.
The IoU threshold is usually set according to the task and its requirements. For example, in object detection it is commonly set to 0.5 or 0.7 to decide whether the model has successfully detected a target. Beyond evaluating model performance, IoU is also used to refine and filter bboxes inside some detection algorithms; for example, the non-maximum suppression (NMS) algorithm is based on IoU.
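The formula above can be sketched directly; boxes are assumed to be in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to 0 so non-overlapping boxes give zero intersection
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter  # sum of areas minus the overlap
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.142857...
```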

(4) Non-maximum suppression (NMS)

When multiple prediction boxes overlap heavily at similar positions, only the box with the highest score is kept, and the remaining boxes are discarded.
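A sketch of the greedy NMS procedure described above, using an IoU threshold to decide which overlapping boxes are "close" (a self-contained toy version; real frameworks provide optimized implementations):

```python
def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of boxes to keep."""
    # Process boxes from highest to lowest score
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes that overlap the kept box too much
        order = [i for i in order if _iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_thresh=0.5))  # [0, 2]: box 1 overlaps box 0 too much
```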


Origin blog.csdn.net/m0_63495706/article/details/130049529