Following up on the earlier post about installing YOLO, this one covers how to train and test YOLO on your own data. Here we detect only a single class, the steering wheel inside a car, but the same steps extend to any other class: cars, animals, people, and so on.
Building the dataset
Training needs a dataset behind it, so the first step is building one. Since we only detect one class, I collected and labeled a bit over 500 images. I used LabelImg for the annotation; download it from GitHub, where the README also explains how to use it. Running it under Anaconda is recommended.
The first time you use it, you need to run the first three setup commands; every time after that, just switch into the directory and run the third command to open the annotation UI.
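For reference, the commands in question are roughly the ones from the LabelImg README's Anaconda route (they may differ slightly depending on your LabelImg version):

```shell
# one-time setup, run inside the labelImg checkout from an Anaconda prompt
conda install pyqt=5
pyrcc5 -o libs/resources.py resources.qrc
# every later session: just run this from the same directory
python labelImg.py
```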
The dataset
I simply created a directory named VOC2007 at the root of one of my drives.
Annotations holds the XML files saved by LabelImg.
Under ImageSets, only the Main folder is used; it holds files such as train.txt and test.txt, while the other two folders are not needed here. JPEGImages holds the collected images; both .jpg and .png work. Some tutorials insist on .jpg, but that is not actually required; .png works too, it just needs a few small changes later on. My images are .png. Before labeling, it is best to batch-rename the images to something like 000001.png.
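Putting those pieces together, the resulting layout looks roughly like this (a sketch; the drive letter matches the E: root used by the conversion script later):

```
E:\VOC2007\
├── Annotations\        # LabelImg XML files, one per image
├── ImageSets\
│   └── Main\
│       └── train.txt   # image names, no path, no extension
└── JPEGImages\         # 000001.png, 000002.png, ...
```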
The train.txt file under Main simply lists the image names, with no extension and no path; it can be generated with a small batch script.
Converting the generated XML files to TXT
There is a voc_label.py file under the darknet-master\scripts directory:
```python
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

#sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
#classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

# My Main folder only has train.txt: all images are used for training,
# none for validation or testing.
sets = [('2007', 'train')]
# Only one class: the steering wheel.
classes = ["wheel"]
# Hard-coded the data path so the script does not have to live in the
# scripts directory; a few paths below change accordingly.
wd = "E:"

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_id):
    #in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    #out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
    in_file = open('%s/VOC%s/Annotations/%s.xml'%(wd, year, image_id))
    out_file = open('%s/VOC%s/labels/%s.txt'%(wd, year, image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

#wd = getcwd()
#for year, image_set in sets:
#    if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
#        os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
#    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
#    list_file = open('%s_%s.txt'%(year, image_set), 'w')
#    for image_id in image_ids:
#        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))
#        convert_annotation(year, image_id)
#    list_file.close()

for year, image_set in sets:
    if not os.path.exists('%s/VOC%s/labels/'%(wd, year)):
        os.makedirs('%s/VOC%s/labels/'%(wd, year))
    image_ids = open('%s/VOC%s/ImageSets/Main/%s.txt'%(wd, year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        # .png here instead of the original .jpg, matching my images
        list_file.write('%s/VOC%s/JPEGImages/%s.png\n'%(wd, year, image_id))
        convert_annotation(year, image_id)
    list_file.close()
```
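To see what convert() actually produces: it maps VOC pixel corners (xmin, xmax, ymin, ymax) into YOLO's normalized (x_center, y_center, width, height), each as a fraction of the image size. A quick standalone check (the box coordinates below are illustrative, not from the dataset):

```python
def convert(size, box):
    # size = (image_width, image_height); box = (xmin, xmax, ymin, ymax)
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 * dw   # box centre x, normalised to [0, 1]
    y = (box[2] + box[3]) / 2.0 * dh   # box centre y
    w = (box[1] - box[0]) * dw         # box width as a fraction of image width
    h = (box[2 + 1] - box[2]) * dh     # box height as a fraction of image height
    return (x, y, w, h)

# A 100x200-pixel box centred in a 416x416 image maps to
# (0.5, 0.5, 100/416, 200/416):
print(convert((416, 416), (158, 258, 108, 308)))
```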
Running the script generates a labels folder under the VOC2007 directory, holding one converted .txt label file per image.
It also generates 2007_train.txt in the root of the E: drive; move that file into the darknet-master\build\darknet\x64 directory.
Modifying the configuration files
Go into the darknet-master\build\darknet\x64 directory.
First, edit the voc.names file in the data folder. Since we only detect one class, the file just needs a single line containing wheel.
Next, edit the voc.data file in the cfg folder so it points at our single class and our training list.
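For a single class, the file boils down to something like this (a sketch; the paths are my assumptions matching the layout above, and valid points back at the training list since every image is used for training):

```
classes = 1
train   = 2007_train.txt
valid   = 2007_train.txt
names   = data/voc.names
backup  = backup/
```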
Then edit a .cfg file in the cfg folder, i.e. the network definition. Any of them will do; here I used yolov2.cfg.
```
[net]
# Testing
batch=5              # kept small; my machine cannot handle a larger batch
subdivisions=1
# Training
# batch=64
# subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.0005 # lowered the learning rate
burn_in=1000
max_batches = 2000   # with only ~500 images there is no need for many iterations
policy=steps
steps=200,250
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]
layers=-9

[convolutional]
batch_normalize=1
size=1
stride=1
pad=1
filters=64
activation=leaky

[reorg]
stride=2

[route]
layers=-1,-4

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30    # the last conv layer before [region] is special: filters = num*(classes+5)
activation=linear

[region]
anchors = 0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828
bias_match=1
classes=1     # one class
coords=4
num=5
softmax=1
jitter=.3
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=0      # random=1 enables multi-scale training on randomly resized inputs; 0 saves computation
```
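The filter count of that last convolutional layer can be checked with one line of arithmetic:

```python
num, classes = 5, 1            # 5 anchor boxes per cell, 1 object class
filters = num * (classes + 5)  # 5 extra values per box: x, y, w, h, objectness
print(filters)                 # 30
```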
Finally, modify the yolo.c file: open the darknet project, find the yolo.c source file, change voc_names to our single class (wheel), and rebuild the solution.
That rebuild updates darknet.exe under the x64 directory.
Training
Run the following command in the darknet-master\build\darknet\x64 directory:
darknet.exe detector train cfg\voc.data cfg\yolov2.cfg
The weight files produced during training are written to the backup directory.
Testing
Run the following command in the darknet-master\build\darknet\x64 directory:
darknet.exe detector test cfg/voc.data cfg/yolov2.cfg backup/yolov2_1800.weights E:/VOC2007/JPEGImages/00001.png
This was a first attempt at training YOLO on my own data; so far I have only applied it without digging into the details, and later posts will explore it further.