Detailed explanation of the basic principles of YOLO

Introduction to YOLO

YOLO (You Only Look Once) is an object detection method. Earlier detection methods perform detection by repurposing classifiers. YOLO instead treats object detection as a regression problem: it spatially locates bounding boxes and predicts the class probabilities of each box. A single neural network predicts bounding boxes and class probabilities directly from the full image in a single evaluation. Because the entire detection process uses only one network, it can be optimized end-to-end directly on detection performance.
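To make the regression formulation concrete, here is a minimal sketch of how one grid cell's output could be decoded into boxes. It assumes the values from the original YOLO paper (a 7 x 7 grid, 2 boxes per cell, 20 classes, so 30 numbers per cell). The function name decode_cell and the ordering of values inside the cell vector are illustrative assumptions, not something this post specifies.

```python
# Assumed values from the original YOLO paper: S x S grid, B boxes per cell, C classes.
S, B, C = 7, 2, 20


def decode_cell(cell, row, col, img_w, img_h):
    """Turn one cell's flat prediction vector into scored pixel-space boxes.

    Assumed layout (hypothetical): B groups of (x, y, w, h, confidence),
    followed by C class probabilities shared by every box in the cell.
    """
    assert len(cell) == B * 5 + C
    class_probs = cell[B * 5:]
    best_class = max(range(C), key=lambda k: class_probs[k])
    boxes = []
    for b in range(B):
        x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
        # x, y are offsets inside the cell; w, h are fractions of the whole image.
        cx = (col + x) / S * img_w
        cy = (row + y) / S * img_h
        bw, bh = w * img_w, h * img_h
        # Class-specific score = box confidence * class probability.
        score = conf * class_probs[best_class]
        boxes.append((cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2,
                      best_class, score))
    return boxes


# Example: centre cell of a 448 x 448 image, one confident box of class 3.
example_cell = [0.5, 0.5, 0.2, 0.4, 0.9] + [0.0] * 5 + [0.0] * C
example_cell[B * 5 + 3] = 0.8
example_boxes = decode_cell(example_cell, row=3, col=3, img_w=448, img_h=448)
```

Everything the detector outputs is a set of plain numbers regressed by the network; no separate classifier or region proposal stage is involved.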

Before formally introducing YOLO, let’s look at a picture:

It can be seen that YOLO's most notable feature is its speed. In accuracy, it still lags behind current state-of-the-art detection systems: although it can quickly identify objects in images, it struggles to localize some objects precisely, especially small ones. What YOLO offers is true end-to-end object detection: features are extracted directly in the network and used to predict both the class and the location of each object.

YOLO structure

The overall structure consists of three parts: a GoogLeNet-style convolutional backbone, 4 additional convolutional layers, and 2 fully connected (FC) layers.
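As a rough sketch of how the last two parts fit together, the snippet below traces tensor shapes through the 4 convolutions and 2 FC layers. The concrete sizes are assumptions taken from the original YOLO paper rather than from this post: a 14 x 14 x 1024 backbone output for a 448 x 448 input, 3 x 3 kernels with the second convolution using stride 2, a 4096-unit first FC layer, and a 7 x 7 x 30 output. The padding of 1 and the helper names conv_out and head_shapes are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a square convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def head_shapes(backbone_size=14, channels=1024, S=7, B=2, C=20):
    """Shapes after each of the 4 conv layers and 2 FC layers (assumed sizes)."""
    shapes = []
    size = backbone_size
    # Four 3x3 convolutions; the second one downsamples with stride 2.
    for stride in (1, 2, 1, 1):
        size = conv_out(size, kernel=3, stride=stride, pad=1)
        shapes.append((size, size, channels))
    shapes.append((4096,))                 # first FC layer
    shapes.append((S * S * (B * 5 + C),))  # second FC layer, reshaped to S x S x 30
    return shapes


for shape in head_shapes():
    print(shape)
```

Under these assumptions the final FC layer has 1470 outputs, which is exactly 7 x 7 x 30 once reshaped into the detection grid.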

Origin blog.csdn.net/qq_41946216/article/details/132733387