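Preface

This article explains how to convert annotations from VOC format to YOLO format. Converting from YOLO format back to VOC format is covered in a separate article.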

I. Introduction to the VOC and YOLO formats

1. VOC format

VOC datasets store their annotations as XML files. An example:

<annotation>
   <folder>img</folder>
   <filename>pikaqiu.jpg</filename>
   <path>E:\cv_code\image_processing\test\img\pikaqiu.jpg</path>
   <source>
      <database>Unknown</database>
   </source>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
      <name>pikaqiu</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>

From this file we need the following: the image name, <filename>pikaqiu.jpg</filename>; the image width, height, and number of channels, <size> <width>1062</width> <height>974</height> <depth>3</depth> </size>; the class name, <name>pikaqiu</name>; and the bounding box, <xmin>83</xmin> <ymin>74</ymin> <xmax>987</xmax> <ymax>920</ymax>.
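As a quick check that these fields can be read as described, here is a minimal sketch that parses a trimmed copy of the sample annotation from a string. The helper name `read_annotation` is hypothetical and used only for illustration; it assumes every file has a `<size>` element and integer pixel coordinates, as in the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation shown above.
SAMPLE_XML = """
<annotation>
   <filename>pikaqiu.jpg</filename>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <object>
      <name>pikaqiu</name>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>
"""


def read_annotation(xml_text):
    """Return (filename, (width, height, depth), objects) from VOC XML text.

    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    dims = tuple(int(size.find(tag).text) for tag in ("width", "height", "depth"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.find("name").text, coords))
    return root.find("filename").text, dims, objects


filename, dims, objects = read_annotation(SAMPLE_XML)
print(filename, dims, objects)
# pikaqiu.jpg (1062, 974, 3) [('pikaqiu', (83, 74, 987, 920))]
```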

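2. YOLO format

Each object is described as (class, xCenter, yCenter, w, h): the class index, the center coordinates of the bounding box, and the width and height of the box. The center coordinates and the box size are relative values, normalized by the image width and height. In the label file, each object takes one line, with the five values separated by spaces.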
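To make the normalization concrete, the sketch below applies the same formulas as the conversion script in this article to the box from the sample annotation (83, 74, 987, 920 in a 1062 x 974 image). The function name `voc_box_to_yolo` is a hypothetical name chosen for this example.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, width, height):
    """Convert corner coordinates in pixels to a normalized YOLO box."""
    x_center = (xmin + xmax) / (2 * width)
    y_center = (ymin + ymax) / (2 * height)
    w = (xmax - xmin) / width
    h = (ymax - ymin) / height
    return x_center, y_center, w, h


# Box and image size taken from the sample annotation; class index 0 = pikaqiu.
x, y, w, h = voc_box_to_yolo(83, 74, 987, 920, 1062, 974)
line = f"0 {x:.6f} {y:.6f} {w:.6f} {h:.6f}"
print(line)  # 0 0.503766 0.510267 0.851224 0.868583
```

All four values fall between 0 and 1 because the box lies inside the image.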

II. Steps

1. Import the libraries

import xml.etree.ElementTree as ET
import os

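2. Set the file paths and the label mapping

# Directory containing the XML files
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Directory where the YOLO-format files are saved (must already exist)
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Mapping from class name to numeric label
class_map = {"pikaqiu": 0}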
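For datasets with more than one class, writing `class_map` by hand is error-prone. The following is a rough sketch, not part of the original workflow, that scans a directory of annotations and assigns IDs in alphabetical order. The helper name `build_class_map`, the alphabetical ordering, and the extra class `cat` in the demo are all assumptions made for illustration; the IDs must still agree with whatever class order your training configuration expects.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def build_class_map(xml_dir):
    """Collect every <name> under <object> and number the names alphabetically.

    Hypothetical helper; the alphabetical ordering is an assumption.
    """
    names = set()
    for xml_file in os.listdir(xml_dir):
        if not xml_file.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(xml_dir, xml_file)).getroot()
        for obj in root.iter("object"):
            names.add(obj.find("name").text)
    return {name: idx for idx, name in enumerate(sorted(names))}


# Self-contained demo: two tiny annotation files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for stem, name in (("a", "pikaqiu"), ("b", "cat")):
        with open(os.path.join(demo_dir, stem + ".xml"), "w") as f:
            f.write(f"<annotation><object><name>{name}</name></object></annotation>")
    demo_map = build_class_map(demo_dir)

print(demo_map)  # {'cat': 0, 'pikaqiu': 1}
```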

3. Iterate over and parse the XML files

# Iterate over every file in the XML directory
for xml_file in os.listdir(xml_dir):
    if not xml_file.endswith(".xml"):
        continue

    # Parse the XML file
    tree = ET.parse(os.path.join(xml_dir, xml_file))
    root = tree.getroot()

4. Write the TXT files

    # Read the image size
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    # Iterate over every object
    for obj in root.iter("object"):
        # Read the class name; skip classes that are not in class_map
        cls_name = obj.find("name").text
        if cls_name not in class_map:
            continue

        cls_id = class_map[cls_name]

        # Read the bounding-box corner coordinates
        bbox = obj.find("bndbox")
        xmin = float(bbox.find("xmin").text)
        ymin = float(bbox.find("ymin").text)
        xmax = float(bbox.find("xmax").text)
        ymax = float(bbox.find("ymax").text)

        # Compute the normalized center coordinates and box size
        x = (xmin + xmax) / (2 * width)
        y = (ymin + ymax) / (2 * height)
        w = (xmax - xmin) / width
        h = (ymax - ymin) / height

        # Append one line per object to the YOLO-format file.
        # Note: append mode duplicates lines if the script is run twice,
        # so clear old .txt files before re-running.
        yolo_file = os.path.splitext(xml_file)[0] + ".txt"
        with open(os.path.join(yolo_dir, yolo_file), "a") as f:
            f.write(f"{cls_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}\n")
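Because the code above opens each label file in append mode ("a"), running it a second time adds every line again. One possible alternative, sketched here under the assumption that each XML file maps to exactly one .txt file, is to collect the lines first and write the file once in "w" mode. The function name `convert_file` is hypothetical; it also creates the output directory if it is missing.

```python
import os
import xml.etree.ElementTree as ET


def convert_file(xml_path, yolo_dir, class_map):
    """Convert one VOC XML file to one YOLO .txt file; return the lines written.

    Hypothetical helper: same formulas as the loop above, but the label
    file is overwritten rather than appended to.
    """
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        cls_name = obj.find("name").text
        if cls_name not in class_map:
            continue  # skip classes that have no numeric label
        bbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x = (xmin + xmax) / (2 * width)
        y = (ymin + ymax) / (2 * height)
        w = (xmax - xmin) / width
        h = (ymax - ymin) / height
        lines.append(f"{class_map[cls_name]} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")

    # Create the output directory if needed, then overwrite the label file.
    os.makedirs(yolo_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(xml_path))[0]
    with open(os.path.join(yolo_dir, stem + ".txt"), "w") as f:
        f.writelines(line + "\n" for line in lines)
    return lines
```

Inside the directory loop this would be called as `convert_file(os.path.join(xml_dir, xml_file), yolo_dir, class_map)`. A side effect of this design is that an image with no matching objects still gets an empty .txt file, which may or may not be what your training tool expects.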

5. Complete code

import xml.etree.ElementTree as ET
import os

# Directory containing the XML files
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Directory where the YOLO-format files are saved (must already exist)
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Mapping from class name to numeric label
class_map = {"pikaqiu": 0}

# Iterate over every file in the XML directory
for xml_file in os.listdir(xml_dir):
    if not xml_file.endswith(".xml"):
        continue

    # Parse the XML file
    tree = ET.parse(os.path.join(xml_dir, xml_file))
    root = tree.getroot()

    # Read the image size
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    # Iterate over every object
    for obj in root.iter("object"):
        # Read the class name; skip classes that are not in class_map
        cls_name = obj.find("name").text
        if cls_name not in class_map:
            continue

        cls_id = class_map[cls_name]

        # Read the bounding-box corner coordinates
        bbox = obj.find("bndbox")
        xmin = float(bbox.find("xmin").text)
        ymin = float(bbox.find("ymin").text)
        xmax = float(bbox.find("xmax").text)
        ymax = float(bbox.find("ymax").text)

        # Compute the normalized center coordinates and box size
        x = (xmin + xmax) / (2 * width)
        y = (ymin + ymax) / (2 * height)
        w = (xmax - xmin) / width
        h = (ymax - ymin) / height

        # Append one line per object to the YOLO-format file.
        # Note: append mode duplicates lines if the script is run twice,
        # so clear old .txt files before re-running.
        yolo_file = os.path.splitext(xml_file)[0] + ".txt"
        with open(os.path.join(yolo_dir, yolo_file), "a") as f:
            f.write(f"{cls_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}\n")
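After the script has run, it can be worth spot-checking the generated label files. The sketch below is one simple way to do that, assuming the five-value line layout written by the script; the function name `check_label_line` is hypothetical. It only checks the field count, the class index range, and that the coordinates lie in [0, 1].

```python
def check_label_line(line, num_classes):
    """Return True if a label line looks like 'class x y w h' with sane values."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        cls_id = int(parts[0])
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    return 0 <= cls_id < num_classes and all(0.0 <= v <= 1.0 for v in values)


print(check_label_line("0 0.503766 0.510267 0.851224 0.868583", 1))  # True
print(check_label_line("0 1.2 0.5 0.3 0.3", 1))  # False
```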

YOLO to VOC

Converting YOLO format to VOC format (Bo菜来了's blog, CSDN)
