Converting a VOC-format data set to YOLO (darknet) format

I. Introduction

In the previous article, we learned how the VOC data set is organized. If we want to train on our own data set, we can organize our data in the VOC format.

Of course, some tools may be needed for that; I will cover them later if I get the chance!

However, the official darknet versions of yolov3 and yolov4 do not take data in this format, so some conversion is still needed. The good news is that a Python script can do the format conversion for us quickly!

II. YOLO's data format

As mentioned earlier, YOLO's data format differs from VOC's, mainly in how the category and coordinate information is organized: YOLO does not read the .xml files described in the previous article directly, but reads .txt files that store the category and coordinate information in a fixed format. So what exactly is that format? It follows two rules:


Each image has one .txt file with the same name (apart from the extension).

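Each line corresponds to one object: the first field is class_id, and the next four numbers describe the bounding box as (center x, center y, width, height). These coordinates are relative values between 0 and 1: x values and widths are divided by the image width, y values and heights by the image height.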
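
To make the layout concrete, here is a minimal sketch that decodes one label line back into pixel coordinates. The function name yolo_line_to_pixel_box, the sample line and the 500x375 image size are made up for illustration; none of them come from darknet.

```python
def yolo_line_to_pixel_box(line, img_w, img_h):
    """Decode 'class_id cx cy w h' (relative values) into pixel corners."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    # Scale the relative width and height back to pixels.
    box_w, box_h = w * img_w, h * img_h
    # The label stores the box center, so step back by half the box size.
    xmin = cx * img_w - box_w / 2.0
    ymin = cy * img_h - box_h / 2.0
    return class_id, (xmin, ymin, xmin + box_w, ymin + box_h)


# Hypothetical label line for a 500x375 image.
class_id, box = yolo_line_to_pixel_box("11 0.5 0.4 0.2 0.8", 500, 375)
print(class_id, [round(v, 1) for v in box])
```

The returned tuple is ordered (xmin, ymin, xmax, ymax), which is the order of the corner values in a VOC bndbox element.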

III. Format conversion

Since the label format in VOC differs from the one in YOLO, we need to convert it. The official voc_label.py script can do the conversion. Here is briefly what to watch out for!

1. First, assume you have already obtained the VOC data set and stored it in the VOCdevkit folder. This folder contains a VOC2007 subfolder, a VOC2012 subfolder, or both. Under VOC2007 and VOC2012 are the subfolders described in the previous article, such as Annotations, ImageSets and JPEGImages. (Note that the training/validation data and the test data are downloaded separately, but their image names and xml file names belong to one continuous numbering, so no name appears in both sets and the images can simply be mixed together. If you want to convert the train, val and test data in one go, just copy the files from each folder of the downloaded test set into the folder with the same name in the training/validation set.)

In fact, most of the files in the Main folder are not used; it is enough to keep only the split lists that the script reads (train.txt, val.txt and test.txt in my case).
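
The merge described in step 1 can be sketched roughly as follows. This is only an illustration under assumptions: the function name merge_voc_folders and the choice of subfolders are mine, and the sketch refuses to overwrite a file rather than trusting that the names never collide.

```python
import os
import shutil


def merge_voc_folders(src_root, dst_root,
                      subfolders=("Annotations", "JPEGImages", "ImageSets/Main")):
    """Copy the files in each subfolder of src_root into the same-named
    subfolder of dst_root, refusing to overwrite existing files."""
    copied = 0
    for sub in subfolders:
        src_dir = os.path.join(src_root, sub)
        dst_dir = os.path.join(dst_root, sub)
        if not os.path.isdir(src_dir):
            continue  # nothing to merge for this subfolder
        os.makedirs(dst_dir, exist_ok=True)
        for name in sorted(os.listdir(src_dir)):
            src = os.path.join(src_dir, name)
            dst = os.path.join(dst_dir, name)
            if not os.path.isfile(src):
                continue  # skip nested directories
            if os.path.exists(dst):
                raise FileExistsError("would overwrite " + dst)
            shutil.copy2(src, dst)
            copied += 1
    return copied
```

It would be called with the VOC2007 folder of the extracted test download as src_root and VOCdevkit/VOC2007 as dst_root; where the test download was extracted depends on your own setup.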

2. Put the voc_label.py script in the same directory as your VOCdevkit folder.

3. Modify the code according to the actual situation:

  • The first is sets: look at where sets is used in the code, such as the snippet below, to work out what values to put in your own sets; it is tied directly to the names and paths of your folders.
for year, image_set in sets:
    if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
        os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'
                     % (year, image_set)).read().strip().split()

Because my VOCdevkit folder contains only one subfolder, VOC2007, and the path VOCdevkit/VOC2007/ImageSets/Main holds the files train.txt, val.txt and test.txt, which store the names of the images used for training, validation and testing across all categories (that is, all images), my setting became sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')].

  • The second is classes: list here the categories contained in your data set. The complete script, with both settings in place, looks like this:
import xml.etree.ElementTree as ET
import os
from os import getcwd

sets=[('2007', 'train'), ('2007', 'val'),('2007', 'test')]

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", 
"car", "cat", "chair", "cow", "diningtable", "dog", "horse", 
"motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


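def convert(size, box):
    # size is (image width, image height); box is (xmin, xmax, ymin, ymax) in pixels.
    dw = 1./size[0]
    dh = 1./size[1]
    # Box center and box size, still in pixels.
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    # Scale by the image size so that every value lies between 0 and 1.
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)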

def convert_annotation(year, image_id):
    # Read one VOC xml annotation and write the matching YOLO label file.
    in_path = 'VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id)
    out_path = 'VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id)
    root = ET.parse(in_path).getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    # The with block closes the label file even if an annotation is malformed.
    with open(out_path, 'w') as out_file:
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # Skip categories that are not in classes and objects marked difficult.
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            # convert() expects the order (xmin, xmax, ymin, ymax).
            b = (float(xmlbox.find('xmin').text),
                 float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()

for year, image_set in sets:
    # Create the folder for the label files if it does not exist yet.
    if not os.path.exists('VOCdevkit/VOC%s/labels/' % year):
        os.makedirs('VOCdevkit/VOC%s/labels/' % year)
    # Image ids (file names without extension) that belong to this split.
    with open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)) as f:
        image_ids = f.read().strip().split()

    # One list file per split, holding the full path of every image in it.
    with open('%s_%s.txt' % (year, image_set), 'w') as list_file:
        for image_id in image_ids:
            list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'
                            % (wd, year, image_id))
            convert_annotation(year, image_id)
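
The conversion performed by the script can be tried on a single annotation without touching any files. The sketch below is not part of voc_label.py: the sample xml, the shortened class list and the function name voc_xml_to_yolo_lines are made up for illustration, and it divides by the image size directly instead of multiplying by dw and dh.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in VOC layout (not taken from the real data set).
SAMPLE_XML = """
<annotation>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>75</ymin><xmax>300</xmax><ymax>225</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <difficult>1</difficult>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>60</xmax><ymax>120</ymax></bndbox>
  </object>
</annotation>
"""

CLASSES = ["dog", "person"]  # shortened class list, for this example only


def voc_xml_to_yolo_lines(xml_text, classes):
    """Return the YOLO label lines for one VOC annotation given as text."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    img_w = int(size.find("width").text)
    img_h = int(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        # Same filter as the script: unknown classes and difficult objects are skipped.
        if name not in classes or int(obj.find("difficult").text) == 1:
            continue
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        xmax = float(box.find("xmax").text)
        ymin = float(box.find("ymin").text)
        ymax = float(box.find("ymax").text)
        # Center and size, normalized by the image width and height.
        cx = (xmin + xmax) / 2.0 / img_w
        cy = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        values = " ".join(str(v) for v in (cx, cy, w, h))
        lines.append("%d %s" % (classes.index(name), values))
    return lines


print(voc_xml_to_yolo_lines(SAMPLE_XML, CLASSES))  # ['0 0.4 0.4 0.4 0.4']
```

The second object produces no line because it is marked difficult, which matches the filter in convert_annotation.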

4. Run voc_label.py. It generates 2007_train.txt, 2007_val.txt and 2007_test.txt in the same directory as the VOCdevkit folder, and a labels folder under VOCdevkit/VOC2007.

So what are the specific contents of the files generated above?

  • 2007_train.txt, 2007_val.txt, 2007_test.txt

These files store the full paths of the images whose names are listed in train.txt, val.txt and test.txt. Compared with train.txt, val.txt and test.txt, the only difference is that each image name has been expanded into a full path; the number of entries and the images they refer to are exactly the same.

  • labels folder

The .txt files stored in this folder correspond to all the images (training + validation + testing): one file per image, with the same name as the image. Each file stores the category and coordinate information of the objects in that image.
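
After a run, the generated files can be sanity-checked with a short script. The following is a rough sketch under assumptions: the helper names label_path_for and check_list_file are mine, and the label path is derived simply by mirroring the JPEGImages and labels paths that voc_label.py itself writes.

```python
import os


def label_path_for(image_path):
    """Map .../JPEGImages/<id>.jpg to .../labels/<id>.txt."""
    folder, name = os.path.split(image_path)
    parent = os.path.dirname(folder)
    return os.path.join(parent, "labels", os.path.splitext(name)[0] + ".txt")


def check_list_file(list_path, num_classes):
    """Return a list of problems found for the images listed in list_path."""
    problems = []
    with open(list_path) as f:
        # One image path per line; keep spaces inside paths intact.
        image_paths = [ln.strip() for ln in f if ln.strip()]
    for image_path in image_paths:
        label_path = label_path_for(image_path)
        if not os.path.isfile(label_path):
            problems.append("missing label: " + label_path)
            continue
        with open(label_path) as f:
            for n, line in enumerate(f, 1):
                parts = line.split()
                if not parts:
                    continue  # ignore blank lines
                if len(parts) != 5:
                    problems.append("%s:%d: expected 5 fields" % (label_path, n))
                    continue
                cls_id = int(parts[0])
                values = [float(v) for v in parts[1:]]
                if not 0 <= cls_id < num_classes:
                    problems.append("%s:%d: class id out of range" % (label_path, n))
                if any(v < 0.0 or v > 1.0 for v in values):
                    problems.append("%s:%d: value outside 0..1" % (label_path, n))
    return problems
```

For the 20 VOC categories, check_list_file('2007_train.txt', 20) should return an empty list when every listed image has a well-formed label file.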

IV. Summary

We now know that voc_label.py converts a VOC-format data set into YOLO format and generates some new, reorganized files. So how should these files be used? How to use them for training will be covered in the next article. I hope the above helps!

Origin blog.csdn.net/qq_39507748/article/details/110819929