YOLOv5 parameter settings and model-training pitfalls


Foreword

The Blue Bridge Cup is over. What can I say: the problem setters changed and didn't play fair, and of course part of it is that I'm just not good enough... OK, time to come back and do something serious.

Last time we briefly played with YOLOv5. If you followed that article, your YOLO environment should at least be set up, and you should know how to prepare a dataset for training. I also said back then that I had crashed and burned, and that the culprit was the image size. In fact, I hadn't crashed at all: the real issue was the parameters I used when training and testing the model. Why do I say I didn't crash? Simple.

(screenshot)

So why don't the results show up? There are many possible reasons, but by far the most likely one is the confidence threshold. So the problem to solve today is getting my own model to actually produce detections, and to record the process.

Problems when training your own model

No target boxes appear

Problem analysis

This is the problem at hand: the model has clearly been trained, yet no target boxes appear. For testing we even used images from the training set itself, and during the actual training process we could see that they ought to be recognized. But nothing comes out.

Let's go through the likely causes. The first, and the one most likely to mislead people, is the img-size parameter, which appears in both detect.py (screenshot) and train.py (screenshot).

This parameter has nothing to do with the original size of your images. It is the size the image is resized to before being fed into the network, and that resizing is required. If you've read my earlier "PyTorch from 0 to deployment" posts you'll know why, so I won't repeat the details here.
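To make that concrete, here is a minimal sketch of the idea, modeled loosely on the letterbox helper YOLOv5 uses at inference time (the real one lives in utils/datasets.py and also handles stride alignment and other options):

import cv2
import numpy as np

def letterbox(img, new_size=640, pad_value=114):
    # scale so the longer side fits new_size, keeping the aspect ratio
    h, w = img.shape[:2]
    r = min(new_size / h, new_size / w)
    nh, nw = round(h * r), round(w * r)
    resized = cv2.resize(img, (nw, nh), interpolation=cv2.INTER_LINEAR)
    # pad the remainder with grey so the network always sees new_size x new_size
    canvas = np.full((new_size, new_size, 3), pad_value, dtype=np.uint8)
    top, left = (new_size - nh) // 2, (new_size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas

So a 1920x1080 photo simply becomes a 640x640 padded input; the file on disk is never touched.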

So that parameter had me misled at first. Since it isn't the cause, what else could be? There is one other obvious candidate: the confidence threshold.

But if it's a confidence problem, why did the test images during training show clear target labels? Think of it this way: deep learning is, in essence, machine learning, a learning process, and training a neural network is essentially fitting a model to data.

Let's take the most intuitive example: a student studying for exams.

For a student to get good grades, there are really only two factors: talent + study.

In our setting, "talent" is the neural network architecture, i.e., how well the network structure is designed and optimized. "Study" is the process of acquiring the weights, i.e., the training process.

At this point in the troubleshooting we can clearly rule out "talent", the network architecture, first. So the problem comes down to "study".

There are different ways to study: grind through piles of practice problems, or study carefully (make sure you truly understand what you do). This brings out the things deep learning depends on: the amount of data (the problems), the learning rate, and the training time (number of epochs).

So what goes wrong with the problem-grinding strategy? For a start, it needs a big enough dataset. But there's a catch: doing lots of problems and seeing lots of problem types doesn't mean you've mastered anything; doing more doesn't equal understanding more. The deep learning training process is like grinding problems with the answer key in hand (which is why detections look fine when testing during training). In the real exam there is no answer key: you've seen plenty of problems, but you never grasped the essence, so you can't solve them. In our case, the model can't detect anything.

Now the second approach: studying carefully. This one competes on precision rather than quantity, learning to generalize from few examples. The catch is that it's slow, which makes it suitable when the dataset is small. That is exactly our situation: 24 images in total. So what do we need to do? Aim for precision: train longer, learn more slowly, take it step by step.

So which parameters actually control all of this?

Let's turn our attention to the file train.py.

(screenshot) There is also lr, the learning rate, but there is no need to touch it; we can achieve the same effect by increasing the number of training epochs.

The first needs no explanation: train longer, i.e., more epochs. The second is batch-size.

This parameter matters for two reasons. First, we are training on a GPU, and the batch size is constrained by what the card can handle. Second, it is exactly the "study carefully" question from before: the bigger the batch, the faster you get through the material, but looking at N problems at once means you learn less from each one. So, for the same training length (number of epochs), set it a bit smaller. Here it is 4 (the default is 8), meaning we study 4 problems at a time, which naturally gives a bit more precision.

Retraining

Let's look at our original command:

python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/mydata.yaml --epoch 200 --batch-size 4 --device 0



Taking the flags in order: the first means transfer learning, retraining from pretrained weights rather than from scratch, which converges faster. The second is the model configuration: how many classes there are and so on. The third points to our dataset. The fourth is the number of training epochs (think of it as study time). The fifth is how many samples enter training at once, the batch size. The sixth is the training device; mine is cuda:0, i.e., my own GPU, a GTX 1650 (hey, I'm broke, the budget card will have to do for now).

So perhaps we simply trained for too few epochs. Let's try 300 (which is also the default). To put that in perspective: with our 24 training images and a batch size of 4, one epoch is 6 optimizer steps, so 300 epochs gives the model 1800 weight updates to fit this tiny dataset.

For convenience, I'll just change the defaults directly in the file here. (screenshot)
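For reference, the lines being edited look roughly like this; the argument names match yolov5 v5.0's train.py, and the defaults shown are the values after my edit (your copy may differ slightly):

parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=4, help='total batch size for all GPUs')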

Let it run, and afterwards check the results:

tensorboard --logdir=runs


(screenshot) (screenshot)

Testing

Now let's test our model. Again for convenience, I changed the parameters directly in the file (screenshot); partly convenience, and partly because there are some hyperparameters I was too lazy to set.
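That said, you don't have to edit detect.py at all; the same thing can be done from the command line. A sketch of such a call, where the weights path (runs/train/exp/weights/best.pt) is just where my run happened to save, and 0.25 is a deliberately low confidence threshold so weak detections still show:

python detect.py --weights runs/train/exp/weights/best.pt --source data/images --conf-thres 0.25 --device 0

Either way, let's look at the result: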

(screenshot)

Great, the boxes are showing up. But now there's a new problem: what on earth are those question marks?

Garbled Chinese labels

Having finally sorted out the training problem, the next ailment is this one. It is also easy to fix; it's essentially an OpenCV issue.

OpenCV does not support Chinese text by default, and in the version we are using, that's where the problem is located. So we just need to modify the source a little. (screenshot)
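You can reproduce the symptom in isolation. This little snippet is my own demo, not YOLOv5 code; every non-ASCII character comes out as a question mark because OpenCV's built-in Hershey fonts only cover ASCII:

import cv2
import numpy as np

img = np.zeros((100, 400, 3), dtype=np.uint8)
cv2.putText(img, 'person 人', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
cv2.imwrite('demo.jpg', img)  # 'person' renders fine, '人' becomes '?'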

My first idea was to specify a font directly in cv2.putText, but perhaps because this cv version differs from the one I usually use, there was no parameter for it. So in the end I Bing'ed around and found a solution.

First, here's what it looks like after the fix: (screenshot)

First, a little search tip:

yolov5-5 中文框 -csdn

Original reference: zhuanlan.zhihu.com/p/359494157

So how do we fix it? First, download the font:

Link: pan.baidu.com/s/1qXVXBgfD… Extraction code: 6666

After downloading, unzip it; I put it in this folder: (screenshot)

Then find this file (screenshot), locate the plot_one_box function (screenshot), and modify the code:

# Note: plots.py needs these imports at the top (most are already there in
# yolov5 v5.0): import random, cv2, numpy as np, and
# from PIL import Image, ImageDraw, ImageFont

def plot_one_box(x, img, color=None, label=None, line_thickness=3):
    # Plots one bounding box on image img and returns the annotated image,
    # rendering the label with PIL so Chinese characters display correctly
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        font_size = t_size[1]
        font = ImageFont.truetype('../data/font/simsun.ttc', font_size)  # the SimSun font downloaded above
        t_size = font.getsize(label)
        c2 = c1[0] + t_size[0], c1[1] - t_size[1]
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled label background
        # draw the text via PIL (supports Chinese), then convert back to BGR
        img_PIL = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        draw = ImageDraw.Draw(img_PIL)
        draw.text((c1[0], c2[1] - 2), label, fill=(255, 255, 255), font=font)
        return cv2.cvtColor(np.array(img_PIL), cv2.COLOR_RGB2BGR)
    # original cv2.putText version, kept for reference:
    # if label:
    #     tf = max(tl - 1, 1)  # font thickness
    #     t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
    #     c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
    #     cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
    #     cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
    return img  # no label: still return the image, since the call sites below assign the return value


Still in the same file, the call site must now assign the return value back (screenshot):

mosaic = plot_one_box(box, mosaic, label=label, color=color, line_thickness=tl)

Then back in detect.py, do the same: (screenshot)

im0 = plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

OK, so far so good. Oh, and in addition to the encoding fix above, to be safe the corresponding call in test.py can be modified the same way. (screenshot)

With that, we're done.

Next comes a rundown of the parameters. Once you know these, you'll be able to use YOLOv5 proficiently.

After that, my plan is to read the paper first, then the source code, and finally engineer YOLOv5 into a real project. Yes, I'm paving the way for the "Internet+" competition.

Parameters

detect.py parameters

weights: path to the trained weights
source: test data; an image/video path, '0' (the built-in webcam), or a stream such as rtsp
output: where to save the predicted images/videos
img-size: network input image size
conf-thres: confidence threshold
iou-thres: IoU threshold used for NMS
device: which device to run on
view-img: whether to display the predicted images/videos, default False
save-txt: whether to save predicted box coordinates as txt files, default False
classes: keep only the given classes, e.g. 0 or 0 2 3
agnostic-nms: whether NMS also suppresses boxes across different classes, default False
augment: augmented inference (TTA): multi-scale, flips, etc.
update: if True, run strip_optimizer on the models to remove optimizer state etc. from the .pt file, default False
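As a quick illustration of combining these flags (the values are just examples): read from the webcam, show a live preview, raise the confidence threshold, and keep only class 0:

python detect.py --source 0 --view-img --conf-thres 0.4 --classes 0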

train.py parameters

weights: initial weights (mainly for transfer learning; you can also leave default='')
cfg: model configuration file (network structure)
data: dataset configuration file (dataset paths, class names, etc.)
hyp: hyperparameter file
epochs: total number of training epochs
batch-size: batch size
img-size: input image resolution
rect: rectangular training, default False
resume: resume training from the last interrupted run
nosave: do not save the model, default False
notest: skip testing, default False
noautoanchor: disable automatic anchor adjustment, default False
evolve: hyperparameter evolution, default False
bucket: Google Cloud Storage bucket, rarely used
cache-images: cache images in memory in advance to speed up training, default False
name: run name; if set, results.txt becomes results_name.txt, default empty
device: training device: cpu; 0 (a single GPU, cuda:0); 0,1,2,3 (multiple GPUs)
multi-scale: multi-scale training, default False
single-cls: treat the dataset as single-class, default False
adam: use the Adam optimizer
sync-bn: use cross-GPU synchronized BatchNorm (for DDP mode)
local_rank: GPU rank (for DDP)
logdir: directory for logs
workers: maximum number of dataloader workers
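And for completeness, a fuller variant of our earlier training command that exercises a few more of these flags (the workers value and cache flag are illustrative, not required):

python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/mydata.yaml --epochs 300 --batch-size 4 --img-size 640 --device 0 --workers 4 --cache-images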

Summary

That's about it; this post mainly records the follow-up fixes to the previous one.
