Thoroughly get to know YOLO

YOLO Training and testing are conducted in a single network.
Training:
Network architecture:
the use of the 3x3 and 1x1 convolution GoogleNet in place of the original module Inception, pre-trained using data sets ImageNet good GoogleNet. Adding new convolutional layer, the parameters are initialized randomly.
Here Insert Picture Description
The output value of the tensor that does not make sense, and we through training, artificial and what is the value of a dimension representation. We defined as follows:
Here Insert Picture Description
wherein the training sample x, y, w, h are known, Pr (Object), IOU are also known.
Here Insert Picture Description
For each grid, only the calculation of the probability of each category of object grid contains training. Only those grid makes sense.

Guess you like

Origin blog.csdn.net/qq_42278791/article/details/91594894