I. Introduction
In the previous article, we learned how the VOC data set is organized. If we want to train on our own data set, we can organize the data according to the VOC format. Some tools may be needed for that, and I will talk about them later if I get the chance.
However, the data format required by the official darknet versions of yolov3 and yolov4 is different, so we still need to do some conversion. The good news is that a Python script can complete the format conversion quickly.
II. YOLO's data format
As mentioned above, YOLO's data format differs from VOC's, mainly in how the category and coordinate information is organized. YOLO does not read this information directly from the .xml files described in the previous article. Instead, it reads it from .txt files that store the category and coordinate information in a specific format. So what exactly is that format? See below:
- Each .txt file corresponds to one picture, and the two share the same name.
- Each line corresponds to one object: the first value is the class_id, and the next four numbers are the BoundingBox, given as (center x, center y, width, height). These are relative coordinates between 0 and 1, that is, the pixel values divided by the image width and height.
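To make the format concrete, here is a minimal sketch that turns one pixel-coordinate box into a label line. The helper name to_yolo_line, the image size, and the box values are made up for illustration and are not part of darknet.

```python
def to_yolo_line(class_id, img_w, img_h, xmin, ymin, xmax, ymax):
    """Build one YOLO label line from a pixel-coordinate box (hypothetical helper)."""
    # Box center, normalized by the image width and height
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    # Box size, normalized the same way
    width = (xmax - xmin) / float(img_w)
    height = (ymax - ymin) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


# Assumed example: a 500x400 image with one object of class 11
# whose box spans (100, 100) to (300, 300) in pixels.
line = to_yolo_line(11, 500, 400, 100, 100, 300, 300)
print(line)  # 11 0.400000 0.500000 0.400000 0.500000
```

The fixed six decimal places are only a choice made for this sketch so that the output is easy to read.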
III. Format conversion
Since the label format in VOC differs from the one in YOLO, we need to convert it. We can use the official voc_label.py script to do the conversion. Let's briefly go over what to pay attention to.
1. First, assume that you have already obtained the VOC data set and stored it in the VOCdevkit folder. This folder contains a VOC2007 subfolder, a VOC2012 subfolder, or both. Under VOC2007 and VOC2012 are the subfolders described in the previous article, such as Annotations, ImageSets, and JPEGImages. (Note that the training/validation data and the test data are downloaded separately, but their image names and xml file names continue one numbering sequence, so no name appears in both and the two downloads can be merged directly. So if you want to convert the train, val, and test data in one go, just copy the files in each folder of the downloaded test data set into the folder with the same name in the training/validation data set.)
In fact, apart from Main, the other files in the ImageSets folder are not used, and it is fine to keep only the files in the figure below.
2. Put the voc_label.py script file in the same directory as your VOCdevkit folder.
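Before running the script, the layout described in steps 1 and 2 can be checked with a short sketch like the one below. The helper name missing_voc_dirs is hypothetical, and the sketch assumes it is run from the directory that contains VOCdevkit.

```python
import os


def missing_voc_dirs(devkit_root, year):
    """Return the expected VOC subfolders that are missing (hypothetical helper)."""
    base = os.path.join(devkit_root, "VOC%s" % year)
    # The three folders that the conversion reads from
    expected = ["Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"]
    return [d for d in expected if not os.path.isdir(os.path.join(base, d))]


if __name__ == "__main__":
    # An empty list means the layout looks as expected
    print(missing_voc_dirs("VOCdevkit", "2007"))
```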
3. Modify the code according to the actual situation:
- The first is sets: look at where sets is used in the code, such as the snippet below, to understand what values it should take in your case. In fact, it is closely tied to the names and paths of your folders.
    for year, image_set in sets:
        if not os.path.exists('VOCdevkit/VOC%s/labels/' % (year)):
            os.makedirs('VOCdevkit/VOC%s/labels/' % (year))
        image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'
                         % (year, image_set)).read().strip().split()
Because there is only one VOC2007 subfolder under my VOCdevkit folder, and the VOCdevkit/VOC2007/ImageSets/Main path contains train.txt, val.txt, and test.txt, which store the names of the images used for training, validation, and testing across all categories (that is, all images), my sets setting becomes sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')].
- The second is classes: list here the categories included in your data set.
    import os
    import xml.etree.ElementTree as ET
    from os import getcwd

    # (year, image_set) pairs; each one must match an existing file
    # VOCdevkit/VOC<year>/ImageSets/Main/<image_set>.txt
    sets = [('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

    # Categories in the data set; the index in this list becomes the class_id
    classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
               "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
               "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


    def convert(size, box):
        # size = (image width, image height)
        # box = (xmin, xmax, ymin, ymax) in pixels
        dw = 1. / size[0]
        dh = 1. / size[1]
        x = (box[0] + box[1]) / 2.0  # center x in pixels
        y = (box[2] + box[3]) / 2.0  # center y in pixels
        w = box[1] - box[0]          # width in pixels
        h = box[3] - box[2]          # height in pixels
        # Scale everything to relative coordinates
        return (x * dw, y * dh, w * dw, h * dh)


    def convert_annotation(year, image_id):
        in_path = 'VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id)
        out_path = 'VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id)
        root = ET.parse(in_path).getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)

        # Write one line per object; the file is closed when the block ends
        with open(out_path, 'w') as out_file:
            for obj in root.iter('object'):
                difficult = obj.find('difficult').text
                cls = obj.find('name').text
                # Skip unknown categories and objects marked as difficult
                if cls not in classes or int(difficult) == 1:
                    continue
                cls_id = classes.index(cls)
                xmlbox = obj.find('bndbox')
                b = (float(xmlbox.find('xmin').text),
                     float(xmlbox.find('xmax').text),
                     float(xmlbox.find('ymin').text),
                     float(xmlbox.find('ymax').text))
                bb = convert((w, h), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


    wd = getcwd()

    for year, image_set in sets:
        if not os.path.exists('VOCdevkit/VOC%s/labels/' % (year)):
            os.makedirs('VOCdevkit/VOC%s/labels/' % (year))
        with open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)) as f:
            image_ids = f.read().strip().split()
        # <year>_<image_set>.txt lists the full path of every image in the set
        with open('%s_%s.txt' % (year, image_set), 'w') as list_file:
            for image_id in image_ids:
                list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'
                                % (wd, year, image_id))
                convert_annotation(year, image_id)
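If you are unsure what to put in sets and classes for your own data, both can be read off the folder itself. The following is only a sketch under the folder layout described above. The helper names discover_sets and collect_classes are hypothetical and are not part of voc_label.py.

```python
import os
import xml.etree.ElementTree as ET


def discover_sets(devkit_root, splits=("train", "val", "test")):
    """List (year, image_set) pairs whose split file exists (hypothetical helper)."""
    found = []
    for name in sorted(os.listdir(devkit_root)):
        if not name.startswith("VOC"):
            continue
        year = name[len("VOC"):]  # "VOC2007" -> "2007"
        main_dir = os.path.join(devkit_root, name, "ImageSets", "Main")
        for split in splits:
            if os.path.isfile(os.path.join(main_dir, split + ".txt")):
                found.append((year, split))
    return found


def collect_classes(annotations_dir):
    """Return the sorted category names found in a folder of .xml files."""
    names = set()
    for file_name in os.listdir(annotations_dir):
        if not file_name.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotations_dir, file_name)).getroot()
        for obj in root.iter("object"):
            names.add(obj.find("name").text)
    return sorted(names)


if __name__ == "__main__":
    if os.path.isdir("VOCdevkit"):
        print(discover_sets("VOCdevkit"))
```

Because the script uses the position of a name in classes as its class_id, the order of the list matters. The alphabetical order returned here is just one possible choice, and it should stay fixed once labels have been generated.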
4. Run voc_label.py. It generates 2007_train.txt, 2007_val.txt, and 2007_test.txt in the same directory as the VOCdevkit folder, and a labels folder under the VOCdevkit/VOC2007 path.
So what are the specific contents of the files generated above?
2007_train.txt, 2007_val.txt, 2007_test.txt
These files store the full paths of the pictures whose names are listed in train.txt, val.txt, and test.txt. Compared with train.txt, val.txt, and test.txt, the only difference is that each picture name is expanded into a full path. The number of entries and the pictures they refer to are exactly the same.
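This one-to-one relationship can be checked with a few lines of Python. The sketch below assumes that the split file contains one image name per line, as in the Main folder described above. The helper name list_matches_split is hypothetical.

```python
import os


def list_matches_split(list_path, split_path):
    """Check that a generated list file mirrors a split file (hypothetical helper)."""
    with open(split_path) as f:
        image_ids = f.read().strip().split()
    with open(list_path) as f:
        image_paths = [line.strip() for line in f if line.strip()]
    # Reduce each full path to the bare image name, e.g. ".../000001.jpg" -> "000001"
    listed_ids = [os.path.splitext(os.path.basename(p))[0] for p in image_paths]
    return listed_ids == image_ids


if __name__ == "__main__":
    split_file = os.path.join("VOCdevkit", "VOC2007", "ImageSets", "Main", "train.txt")
    if os.path.isfile("2007_train.txt") and os.path.isfile(split_file):
        print(list_matches_split("2007_train.txt", split_file))
```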
The labels folder
The .txt files stored in this folder correspond to all the pictures (training + validation + testing), one file per picture, with matching names. Each file stores the category and coordinate information.
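One way to sanity-check a generated label file is to decode a line back into a pixel box and compare it with the original annotation. The sketch below is only an illustration. The helper name parse_yolo_line is hypothetical, and the image size has to be supplied by the caller because the label file does not store it.

```python
def parse_yolo_line(line, img_w, img_h):
    """Decode one label line into (class_id, xmin, ymin, xmax, ymax) in pixels."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    class_id = int(fields[0])
    x, y, w, h = (float(v) for v in fields[1:])
    # Relative coordinates must stay inside the image
    if not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
        raise ValueError("coordinates must lie in [0, 1]")
    # Undo the normalization: center and size back to corner coordinates
    xmin = (x - w / 2.0) * img_w
    xmax = (x + w / 2.0) * img_w
    ymin = (y - h / 2.0) * img_h
    ymax = (y + h / 2.0) * img_h
    return class_id, xmin, ymin, xmax, ymax


# Assumed example line for a 500x400 image
decoded = parse_yolo_line("11 0.4 0.5 0.4 0.5", 500, 400)
print(decoded)
```

Small floating-point differences from the original pixel values are to be expected after the round trip.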
IV. Summary
We now know that voc_label.py converts a VOC-format data set into the YOLO format and generates some new, reorganized files. So how should these files be used? How to use them for training will be covered in the next article. I hope the above content helps you!