Converting a Pascal VOC Dataset to YOLO Format

Introduction

Pascal VOC (Visual Object Classes) is a widely used benchmark dataset for image classification and object detection, and its XML annotation layout is commonly used as an annotation format in its own right. YOLO (You Only Look Once) is a family of real-time object detection models. To train a YOLO model on data annotated in Pascal VOC format, the annotations usually need to be converted to YOLO format first.

Pascal VOC format

Datasets in Pascal VOC format usually consist of image files and one XML file per image. Each XML file records the image size and, for every object in the image, its class name and its bounding box, given as the pixel coordinates xmin, ymin, xmax, and ymax.
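To make the structure concrete, here is a minimal sketch that parses a hand-written annotation with Python's standard xml.etree.ElementTree module. The file name, class name, and coordinates are made up for illustration, and the helper name read_voc_objects is hypothetical; real VOC files carry additional fields (such as pose, truncated, and difficult) that are omitted here.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal VOC-style annotation (values are illustrative only).
SAMPLE_XML = """<annotation>
    <filename>example.jpg</filename>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>dog</name>
        <bndbox>
            <xmin>100</xmin>
            <ymin>50</ymin>
            <xmax>300</xmax>
            <ymax>250</ymax>
        </bndbox>
    </object>
</annotation>"""

def read_voc_objects(xml_text):
    """Return (image_width, image_height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    img_w = int(size.find('width').text)
    img_h = int(size.find('height').text)

    objects = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        # Some tools write coordinates as floats, so parse via float first.
        coords = tuple(int(float(box.find(tag).text))
                       for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        objects.append((obj.find('name').text,) + coords)
    return img_w, img_h, objects

print(read_voc_objects(SAMPLE_XML))
```

The image size is read alongside the objects because the YOLO format needs it to normalize the box coordinates.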

YOLO format

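The YOLO format stores the annotations for each image in a text file with the same base name as the image. Each line describes one object in the form class x_center y_center width height, where class is an integer class index (starting at 0) and the four box values are normalized by the image width and height, so they fall between 0 and 1.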
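As a rough illustration of how such a line is read, the sketch below turns one label line back into pixel corner coordinates, assuming the box values are normalized by the image size as is the usual YOLO convention. The function name yolo_line_to_pixel_box and the sample line are hypothetical.

```python
def yolo_line_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, xmin, ymin, xmax, ymax) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    x_center, y_center, width, height = (float(p) for p in parts[1:])

    # Undo the normalization by scaling back to pixel units.
    box_w = width * img_w
    box_h = height * img_h
    xmin = x_center * img_w - box_w / 2
    ymin = y_center * img_h - box_h / 2
    return class_id, xmin, ymin, xmin + box_w, ymin + box_h

# A made-up label line for a 400x200 image.
print(yolo_line_to_pixel_box("1 0.5 0.5 0.25 0.5", 400, 200))
# -> (1, 150.0, 50.0, 250.0, 150.0)
```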

Conversion steps

  1. Parse the image size and each object's class name and bounding box from the Pascal VOC XML file.

  2. Map the class name to an integer index, compute the box's center coordinates, width, and height, and divide them by the image width and height to get YOLO format values.

  3. Write the YOLO format annotations to a text file, one line per object.
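The arithmetic behind these steps can be sketched for a single box as follows. This assumes the usual YOLO convention that center and size are divided by the image width and height; the helper names voc_box_to_yolo and format_yolo_line, the class index 0, and the sample numbers are illustrative only.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

def format_yolo_line(class_id, box):
    """Format a class index and a normalized box as one YOLO label line."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)

# A 200x200 pixel box with its top-left corner at (100, 50) in a 500x375 image.
box = voc_box_to_yolo(100, 50, 300, 250, img_w=500, img_h=375)
print(format_yolo_line(0, box))
# -> 0 0.400000 0.400000 0.400000 0.533333
```

Keeping the conversion in a small pure function makes it easy to check by hand before running it over a whole dataset.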

Python code example

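import os
import xml.etree.ElementTree as ET

def convert_voc_to_yolo(voc_folder, output_folder, classes):
    """Convert every Pascal VOC XML file in voc_folder to a YOLO .txt file.

    classes is the ordered list of class names; a name's position in the
    list becomes its YOLO class index.
    """
    os.makedirs(output_folder, exist_ok=True)

    for xml_file in os.listdir(voc_folder):
        if not xml_file.endswith('.xml'):
            continue

        root = ET.parse(os.path.join(voc_folder, xml_file)).getroot()

        # The image size is needed to normalize coordinates to the 0-1 range.
        size = root.find('size')
        img_w = float(size.find('width').text)
        img_h = float(size.find('height').text)

        yolo_file = os.path.splitext(xml_file)[0] + '.txt'
        yolo_path = os.path.join(output_folder, yolo_file)

        with open(yolo_path, 'w') as f:
            for obj in root.findall('object'):
                cls = obj.find('name').text.strip()
                if cls not in classes:
                    continue  # skip objects whose class is not in the list
                cls_id = classes.index(cls)

                bbox = obj.find('bndbox')
                xmin = float(bbox.find('xmin').text)
                ymin = float(bbox.find('ymin').text)
                xmax = float(bbox.find('xmax').text)
                ymax = float(bbox.find('ymax').text)

                # Box center and size, normalized by the image dimensions.
                x_center = (xmin + xmax) / 2 / img_w
                y_center = (ymin + ymax) / 2 / img_h
                width = (xmax - xmin) / img_w
                height = (ymax - ymin) / img_h

                f.write(f"{cls_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n")

voc_folder = 'path_to_voc_data'
output_folder = 'path_to_output_folder'
classes = ['class_a', 'class_b']  # placeholder: replace with your dataset's class names, in order
convert_voc_to_yolo(voc_folder, output_folder, classes)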

Conclusion

This article described how to convert a Pascal VOC format dataset to YOLO format for training a YOLO model. By parsing each XML file and computing the class index and the normalized center coordinates, width, and height of every object, we can generate annotation files suitable for a YOLO model.
