yolov3 series (0): yolov3 in detail

Reference tutorial

Target detection algorithm and EfficientDet explanation

paper

https://pjreddie.com/media/files/papers/YOLOv3.pdf

translation

https://zhuanlan.zhihu.com/p/34945787

yoloV3 homepage:

https://pjreddie.com/darknet/yolo/

Reference website

https://blog.csdn.net/leviopku/article/details/82660381
https://blog.csdn.net/u012746060/article/details/81183006
https://blog.csdn.net/Patrick_Lxc/article/details/80615433
https://www.cnblogs.com/gezhuangzhuang/p/10596545.html
https://blog.csdn.net/m0_37857151/article/details/81330699


The yolo series of object detection algorithms is a landmark in the history of object detection. The v3 algorithm builds on v1 and v2, so it helps to look at yolov1 and yolov2 first.

0. Overall introduction


1. Network structure

The figure below shows the network structure
yolov3
DBL (Darknetconv2d_BN_Leaky): the basic building block of yolov3. It consists of convolution + BN + Leaky ReLU.
Res_unit: two DBL blocks in sequence, whose output is added to the unit's input (add).
Resn: n is a number (res1, res2, ..., res8) indicating how many res_units the res_block contains. See ResNet.
concat: tensor concatenation. It joins an intermediate darknet layer with a later layer. Concatenation is not the same as the add in a residual layer: concatenation expands the dimension of the tensor, whereas add simply sums element-wise and leaves the tensor dimensions unchanged.
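
The difference between concat and add can be illustrated at the level of tensor shapes. The following is a minimal sketch in plain Python. The helper names `add_shape` and `concat_shape` are hypothetical, and the channel counts are assumptions chosen for illustration rather than values stated in this article.

```python
# Shapes are written as (channels, height, width).
# add_shape and concat_shape are hypothetical helpers for illustration only.

def add_shape(a, b):
    """Shape after an element-wise add, as in a res_unit shortcut."""
    if a != b:
        raise ValueError(f"add needs identical shapes, got {a} and {b}")
    return a


def concat_shape(a, b):
    """Shape after concatenating two feature maps along the channel axis."""
    if a[1:] != b[1:]:
        raise ValueError(f"concat needs matching spatial size, got {a} and {b}")
    return (a[0] + b[0], a[1], a[2])


# Assumed shapes: an upsampled later-layer feature map and an intermediate
# darknet feature map that share the same 26x26 spatial size.
upsampled = (256, 26, 26)
middle = (512, 26, 26)

print(concat_shape(upsampled, middle))  # (768, 26, 26): channels grow
print(add_shape(middle, middle))        # (512, 26, 26): shape unchanged
```

Under these assumptions, concat requires only the spatial sizes to match and sums the channel counts, while add requires the two shapes to be identical.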


darknet-53

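yolov3 uses the first 52 layers of darknet53. The network is fully convolutional and makes heavy use of residual skip connections. To reduce the negative effect of pooling on gradients, the author drops pooling entirely and uses the stride of conv layers to downsample: in this network, downsampling is done by convolutions with stride 2.

To improve detection accuracy on small objects, yolov3 adopts an FPN-like upsample-and-fuse approach (3 scales are fused in the end; the other two scales are 26*26 and 52*52) and performs detection on feature maps of multiple scales.

The 3 prediction branches are also fully convolutional. The last conv layer in each branch has 255 kernels, which corresponds to the 80 classes of the COCO dataset: 3*(80+4+1)=255, where 3 is the number of bounding boxes per grid cell, 4 is the four coordinates of a box, and 1 is the objectness score.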
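
The downsampling and the kernel count described above can be checked with a little arithmetic. The sketch below uses the standard convolution output-size formula and assumes a 416*416 input, 3*3 kernels with padding 1, and five stride-2 convolutions. None of these values are stated in this article, so treat them as assumptions; the function names are hypothetical.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Output side length of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample_sizes(input_size, num_stride2_convs=5):
    """Side length after each successive stride-2 convolution."""
    sizes = []
    size = input_size
    for _ in range(num_stride2_convs):
        size = conv_out_size(size)
        sizes.append(size)
    return sizes


def head_filters(num_classes, boxes_per_cell=3):
    """Kernels in the last conv layer: boxes * (classes + 4 coords + 1 objectness)."""
    return boxes_per_cell * (num_classes + 4 + 1)


sizes = downsample_sizes(416)  # 416 is an assumed input size
print(sizes)                   # [208, 104, 52, 26, 13]
print(sizes[-3:])              # [52, 26, 13], the three detection scales
print(head_filters(80))        # 255 for the 80 COCO classes
```

With a different number of classes, the same formula gives the kernel count of the last layer, for example `head_filters(20)` returns 75.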

output

yolov3-output
The so-called multi-scale prediction comes from these three prediction branches. The outputs y1, y2, and y3 all have depth 255, and their side lengths follow the ratio 13:26:52. yolov3 has each grid cell predict 3 boxes. Each box needs five basic parameters (x, y, w, h, confidence) plus the probabilities for 80 classes, so 3*(80+4+1)=255.
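
To make the 255-deep output concrete, the following sketch splits the prediction vector of one grid cell into its 3 boxes and counts the boxes predicted across all three scales. It assumes the 255 values are laid out box by box (three consecutive groups of 85) and that the side lengths are exactly 13, 26, and 52. Both are assumptions for illustration, and the function names are hypothetical. The values are raw network outputs, with no activation or anchor decoding applied.

```python
NUM_BOXES = 3                     # boxes predicted per grid cell
NUM_CLASSES = 80                  # COCO classes
BOX_ATTRS = 4 + 1 + NUM_CLASSES   # x, y, w, h, confidence, class probabilities


def split_cell_prediction(vec):
    """Split one grid cell's 255 raw values into 3 per-box records."""
    if len(vec) != NUM_BOXES * BOX_ATTRS:
        raise ValueError(f"expected {NUM_BOXES * BOX_ATTRS} values, got {len(vec)}")
    boxes = []
    for i in range(NUM_BOXES):
        chunk = vec[i * BOX_ATTRS:(i + 1) * BOX_ATTRS]
        boxes.append({
            "xywh": chunk[:4],
            "confidence": chunk[4],
            "class_probs": chunk[5:],
        })
    return boxes


def total_boxes(side_lengths=(13, 26, 52)):
    """Number of boxes predicted over all scales (assumed side lengths)."""
    return sum(NUM_BOXES * s * s for s in side_lengths)


cell = list(range(255))            # dummy values standing in for one cell's output
boxes = split_cell_prediction(cell)
print(len(boxes), len(boxes[0]["class_probs"]))  # 3 80
print(total_boxes())                             # 10647
```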

How v1, v2, and v3 came about

To be continued


reference

  1. Understand YOLO v3 in one article
  2. yolo v3 of yolo series [in-depth analysis]


Origin blog.csdn.net/qq122716072/article/details/108347462