Table of contents

Preface

1. Introduction to the VOC and YOLO formats

1. VOC format

2. YOLO format

2. Usage steps

1. Import the libraries

2. Set the file paths and image information

3. Iterate over and parse the YOLO files

4. Create the XML file and write the information

5. Complete code

Preface

This article describes how to convert YOLO-format annotations to VOC XML format. The reverse conversion, from XML to YOLO, is covered in a separate article.

1. Introduction to the VOC and YOLO formats

1. VOC format

VOC datasets store their annotations as XML. Here is an example:

<annotation>
   <folder>img</folder>
   <filename>pikaqiu.jpg</filename>
   <path>E:\cv_code\image_processing\test\img\pikaqiu.jpg</path>
   <source>
      <database>Unknown</database>
   </source>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
      <name>pikaqiu</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>

The fields we need are: the image name, <filename>pikaqiu.jpg</filename>; the image width, height, and number of channels, <size> <width>1062</width> <height>974</height> <depth>3</depth> </size>; the class name, <name>pikaqiu</name>; and the bounding box, <xmin>83</xmin> <ymin>74</ymin> <xmax>987</xmax> <ymax>920</ymax>.
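As a quick illustration of where those fields live in the tree, the sketch below reads them back out of a trimmed copy of the annotation with xml.etree.ElementTree. The helper name read_voc_fields and the VOC_XML string are assumptions made for this example; they are not part of the conversion script.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example annotation (illustrative only).
VOC_XML = """
<annotation>
   <filename>pikaqiu.jpg</filename>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <object>
      <name>pikaqiu</name>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>
"""


def read_voc_fields(xml_text):
    """Return (filename, (width, height, depth), [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    size = root.find("size")
    dims = tuple(int(size.findtext(tag)) for tag in ("width", "height", "depth"))
    boxes = []
    # One <object> element per annotated box
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return filename, dims, boxes


filename, dims, boxes = read_voc_fields(VOC_XML)
print(filename, dims, boxes)
```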

2. YOLO format

Each object is described by (class, xCenter, yCenter, w, h): the class index, the coordinates of the bounding-box center, and the width and height of the box, all expressed relative to the image size.
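In a label file, each object is one whitespace-separated line holding these five values. A minimal sketch of parsing such a line follows; the function name parse_yolo_line and the sample line are hypothetical and only serve the illustration.

```python
def parse_yolo_line(line):
    """Parse one 'class xCenter yCenter w h' label line into typed values."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    cls_id = int(parts[0])
    x_center, y_center, w, h = (float(p) for p in parts[1:])
    return cls_id, x_center, y_center, w, h


# Hypothetical label line: class 0, box centered in the image.
label = parse_yolo_line("0 0.5 0.5 0.5 0.25")
print(label)
```

Raising on a malformed line is one possible design choice; the script in this article silently skips short lines instead.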

2. Usage steps

1. Import the libraries

import xml.etree.ElementTree as ET
import os

2. Set the file paths and image information

# Directory containing the YOLO-format label files
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Directory where the XML files are saved
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Image size in pixels (the same size is applied to every image)
img_width, img_height = 1062, 974

# Mapping from class name to numeric class label
class_map = {"pikaqiu": 0}
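Because class_map goes from name to number while YOLO files store the number, the name has to be found by reverse lookup. One possible approach, shown here as a sketch rather than as the script's own method, is to invert the dictionary once; the name id_to_name is an assumption introduced for this example.

```python
# Same shape as class_map in this article: class name -> numeric label.
class_map = {"pikaqiu": 0}

# Invert it once so a YOLO class id can be looked up directly.
id_to_name = {cls_id: name for name, cls_id in class_map.items()}

print(id_to_name.get(0))  # name registered for id 0
print(id_to_name.get(7))  # None, because id 7 has no entry
```

This only works cleanly when every class name maps to a distinct number.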

3. Iterate over and parse the YOLO files

# Iterate over every file in the YOLO label directory
for yolo_file in os.listdir(yolo_dir):
    if not yolo_file.endswith(".txt"):
        continue

    # Read the YOLO label file
    with open(os.path.join(yolo_dir, yolo_file), "r") as f:
        lines = f.readlines()

    # Derive the image file name from the label file name
    img_file = os.path.splitext(yolo_file)[0] + ".jpg"

4. Create the XML file and write the information

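The normalized YOLO values (x, y, w, h) are converted to pixel corner coordinates as follows, where width and height are the image dimensions in pixels:

$x_{min} = \text{width} \cdot (x - 0.5w)$

$y_{min} = \text{height} \cdot (y - 0.5h)$

$x_{max} = \text{width} \cdot (x + 0.5w)$

$y_{max} = \text{height} \cdot (y + 0.5h)$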
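The four formulas can be checked in isolation with a small helper. This is a sketch under the same conventions as the script (truncation with int(), no clamping to the image bounds); the function name yolo_to_voc_box and the sample values are hypothetical.

```python
def yolo_to_voc_box(x, y, w, h, img_width, img_height):
    """Convert a normalized center/size box to integer pixel corners."""
    xmin = int((x - w / 2) * img_width)
    ymin = int((y - h / 2) * img_height)
    xmax = int((x + w / 2) * img_width)
    ymax = int((y + h / 2) * img_height)
    return xmin, ymin, xmax, ymax


# A box centered in a 100 x 200 image, covering half of each dimension.
box = yolo_to_voc_box(0.5, 0.5, 0.5, 0.5, img_width=100, img_height=200)
print(box)  # (25, 50, 75, 150)
```

Note that int() truncates toward zero, so the resulting corners can differ by one pixel from the values originally annotated.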

    # Build the XML document
    # Create the root node
    root = ET.Element("annotation")
    # Create a child node
    filename = ET.SubElement(root, "filename")
    # Set its text
    filename.text = img_file
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = str(img_width)
    height = ET.SubElement(size, "height")
    height.text = str(img_height)
    depth = ET.SubElement(size, "depth")
    depth.text = "3"

    # Iterate over every object in the label file
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 5:
            continue

        cls_id = int(parts[0])

        # Skip objects whose class id has no entry in class_map
        if cls_id not in class_map.values():
            continue

        # Look up the class name for this numeric id
        cls_name = None
        for k, v in class_map.items():
            if v == cls_id:
                cls_name = k
                break

        x = float(parts[1])
        y = float(parts[2])
        w = float(parts[3])
        h = float(parts[4])

        # Compute the bounding-box corners in pixels
        xmin = int((x - w / 2) * img_width)
        ymin = int((y - h / 2) * img_height)
        xmax = int((x + w / 2) * img_width)
        ymax = int((y + h / 2) * img_height)

        # Add the object to the XML tree
        obj = ET.SubElement(root, "object")
        name = ET.SubElement(obj, "name")
        name.text = cls_name
        bndbox = ET.SubElement(obj, "bndbox")
        xmin_node = ET.SubElement(bndbox, "xmin")
        xmin_node.text = str(xmin)
        ymin_node = ET.SubElement(bndbox, "ymin")
        ymin_node.text = str(ymin)
        xmax_node = ET.SubElement(bndbox, "xmax")
        xmax_node.text = str(xmax)
        ymax_node = ET.SubElement(bndbox, "ymax")
        ymax_node.text = str(ymax)
    # Write the tree to a file
    tree = ET.ElementTree(root)
    tree.write(os.path.join(xml_dir, os.path.splitext(yolo_file)[0] + ".xml"))
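By default, tree.write() emits the whole document on a single line. If a layout closer to the VOC example is wanted, one option (an addition to the script, not something it does) is to call ET.indent before writing. The sketch below assumes Python 3.9 or later, where ET.indent was introduced, and uses a tiny stand-in tree.

```python
import xml.etree.ElementTree as ET

# Small stand-in for the annotation tree built by the script.
root = ET.Element("annotation")
ET.SubElement(root, "filename").text = "pikaqiu.jpg"

tree = ET.ElementTree(root)
# Insert newlines and three-space indentation in place (Python 3.9+).
ET.indent(tree, space="   ")

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In the script, the same call would go immediately before tree.write().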

5. Complete code

import os
import xml.etree.ElementTree as ET

# Directory containing the YOLO-format label files
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Directory where the XML files are saved
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Image size in pixels (the same size is applied to every image)
img_width, img_height = 1062, 974

# Mapping from class name to numeric class label
class_map = {"pikaqiu": 0}

# Iterate over every file in the YOLO label directory
for yolo_file in os.listdir(yolo_dir):
    if not yolo_file.endswith(".txt"):
        continue

    # Read the YOLO label file
    with open(os.path.join(yolo_dir, yolo_file), "r") as f:
        lines = f.readlines()

    # Derive the image file name from the label file name
    img_file = os.path.splitext(yolo_file)[0] + ".jpg"

    # Build the XML document
    # Create the root node
    root = ET.Element("annotation")
    # Create a child node
    filename = ET.SubElement(root, "filename")
    # Set its text
    filename.text = img_file
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = str(img_width)
    height = ET.SubElement(size, "height")
    height.text = str(img_height)
    depth = ET.SubElement(size, "depth")
    depth.text = "3"

    # Iterate over every object in the label file
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 5:
            continue

        cls_id = int(parts[0])

        # Skip objects whose class id has no entry in class_map
        if cls_id not in class_map.values():
            continue

        # Look up the class name for this numeric id
        cls_name = None
        for k, v in class_map.items():
            if v == cls_id:
                cls_name = k
                break

        x = float(parts[1])
        y = float(parts[2])
        w = float(parts[3])
        h = float(parts[4])

        # Compute the bounding-box corners in pixels
        xmin = int((x - w / 2) * img_width)
        ymin = int((y - h / 2) * img_height)
        xmax = int((x + w / 2) * img_width)
        ymax = int((y + h / 2) * img_height)

        # Add the object to the XML tree
        obj = ET.SubElement(root, "object")
        name = ET.SubElement(obj, "name")
        name.text = cls_name
        bndbox = ET.SubElement(obj, "bndbox")
        xmin_node = ET.SubElement(bndbox, "xmin")
        xmin_node.text = str(xmin)
        ymin_node = ET.SubElement(bndbox, "ymin")
        ymin_node.text = str(ymin)
        xmax_node = ET.SubElement(bndbox, "xmax")
        xmax_node.text = str(xmax)
        ymax_node = ET.SubElement(bndbox, "ymax")
        ymax_node.text = str(ymax)
    # Write the tree to a file
    tree = ET.ElementTree(root)
    tree.write(os.path.join(xml_dir, os.path.splitext(yolo_file)[0] + ".xml"))

Converting YOLO format to VOC format