Contents

Preface

1. Introduction to the VOC and YOLO formats

1. VOC format

2. YOLO format

2. Steps

1. Import the libraries

2. Set the file paths and image information

3. Iterate over and parse the YOLO files

4. Create the XML file and write the information

5. Complete code

Preface

This article describes how to convert YOLO-format annotations to the VOC XML format. For the reverse conversion, from XML to YOLO, see the separate article on that topic.

1. Introduction to the VOC and YOLO formats

1. VOC format

The VOC dataset stores its annotations as XML. Here is an example:

<annotation>
   <folder>img</folder>
   <filename>pikaqiu.jpg</filename>
   <path>E:\cv_code\image_processing\test\img\pikaqiu.jpg</path>
   <source>
      <database>Unknown</database>
   </source>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
      <name>pikaqiu</name>
      <pose>Unspecified</pose>
      <truncated>0</truncated>
      <difficult>0</difficult>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>

The fields we need are the image name, <filename>pikaqiu.jpg</filename>; the image width, height, and number of channels, <size> <width>1062</width> <height>974</height> <depth>3</depth> </size>; the class name, <name>pikaqiu</name>; and the box coordinates, <xmin>83</xmin> <ymin>74</ymin> <xmax>987</xmax> <ymax>920</ymax>.
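
As a rough illustration of where those fields sit in the tree, the sketch below reads them back with xml.etree.ElementTree. The helper name read_voc and the trimmed inline copy of the annotation are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the annotation above, kept inline so the sketch is self-contained.
VOC_XML = """<annotation>
   <filename>pikaqiu.jpg</filename>
   <size>
      <width>1062</width>
      <height>974</height>
      <depth>3</depth>
   </size>
   <object>
      <name>pikaqiu</name>
      <bndbox>
         <xmin>83</xmin>
         <ymin>74</ymin>
         <xmax>987</xmax>
         <ymax>920</ymax>
      </bndbox>
   </object>
</annotation>"""


def read_voc(xml_text):
    """Return (filename, (width, height, depth), [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    size = tuple(int(root.findtext(f"size/{tag}")) for tag in ("width", "height", "depth"))
    boxes = []
    for obj in root.iter("object"):
        coords = [int(obj.findtext(f"bndbox/{tag}")) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return filename, size, boxes


filename, size, boxes = read_voc(VOC_XML)
print(filename, size, boxes)
```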

2. YOLO format

Each line has the form (class, xCenter, yCenter, w, h): the class index, the center coordinates of the bounding box, and the width and height of the bounding box, with the coordinates and dimensions expressed relative to the image size.
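
To make the line layout concrete, here is a minimal sketch that parses one label line into named fields. The YoloBox type, the parse_yolo_line helper, and the sample line are all hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    cls_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line):
    """Parse one label line; return None for blank or malformed lines."""
    parts = line.split()
    if len(parts) < 5:
        return None
    return YoloBox(int(parts[0]), *(float(p) for p in parts[1:5]))


# Made-up label line: class 0, box centered in the image,
# a quarter of the image wide and half of the image tall.
box = parse_yolo_line("0 0.5 0.5 0.25 0.5")
print(box)
```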

2. Steps

1. Import the libraries

import xml.etree.ElementTree as ET
import os

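2. Set the file paths and image information

# Directory containing the YOLO-format label files
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Directory where the XML files are saved
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Image size (every image is assumed to have the same dimensions)
img_width, img_height = 1062, 974

# Mapping between class names and numeric class labels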
class_map = {"pikaqiu": 0}
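
Because YOLO files store the numeric id while VOC stores the name, it can be convenient to invert the mapping once up front. The sketch below shows one way to do that; the name id_to_name is an assumption and is not used elsewhere in this article.

```python
class_map = {"pikaqiu": 0}

# Invert name -> id into id -> name for direct lookup while parsing labels.
id_to_name = {cls_id: name for name, cls_id in class_map.items()}

print(id_to_name[0])      # the name registered for id 0
print(id_to_name.get(7))  # None, because id 7 is not in the mapping
```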

3. Iterate over and parse the YOLO files

# Iterate over every file in the YOLO label directory
for yolo_file in os.listdir(yolo_dir):
    if not yolo_file.endswith(".txt"):
        continue

    # Read the YOLO label file
    with open(os.path.join(yolo_dir, yolo_file), "r") as f:
        lines = f.readlines()

    # Derive the image file name (assumes .jpg images)
    img_file = os.path.splitext(yolo_file)[0] + ".jpg"
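
The same filtering step can be tried in isolation on a temporary directory. This is a sketch using pathlib rather than os.listdir; the helper list_label_files and the file names are made up for the example, and the .jpg extension is the same assumption the code above makes.

```python
import tempfile
from pathlib import Path


def list_label_files(label_dir):
    """Return (label_path, image_name) pairs for every .txt file, sorted by name."""
    pairs = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        pairs.append((path, path.stem + ".jpg"))  # assumes .jpg images
    return pairs


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "img_001.txt").write_text("0 0.5 0.5 0.25 0.5\n")
    (Path(tmp) / "notes.md").write_text("not a label file\n")
    names = [(p.name, img) for p, img in list_label_files(tmp)]

print(names)
```

Only the .txt file is picked up, and its image name is derived from the label file's stem.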

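4. Create the XML file and write the information

The pixel coordinates of the box corners follow from the normalized YOLO values (x, y, w, h) and the image width and height:

$x_{min} = \text{width} \times (x - 0.5w)$

$y_{min} = \text{height} \times (y - 0.5h)$

$x_{max} = \text{width} \times (x + 0.5w)$

$y_{max} = \text{height} \times (y + 0.5h)$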
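
A small worked example of these formulas, written as a standalone function. The function name yolo_to_voc_box and the 1000 x 800 image size are assumptions chosen so the arithmetic is easy to follow; int() truncates toward zero, as in the conversion code in this article.

```python
def yolo_to_voc_box(x, y, w, h, img_width, img_height):
    """Convert a normalized center/size box to integer pixel corners."""
    xmin = int((x - w / 2) * img_width)
    ymin = int((y - h / 2) * img_height)
    xmax = int((x + w / 2) * img_width)
    ymax = int((y + h / 2) * img_height)
    return xmin, ymin, xmax, ymax


# A box centered in a 1000 x 800 image, a quarter as wide and half as tall.
print(yolo_to_voc_box(0.5, 0.5, 0.25, 0.5, 1000, 800))  # (375, 200, 625, 600)
```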

    # Build the XML tree
    # Create the root node
    root = ET.Element("annotation")
    # Create a child node
    filename = ET.SubElement(root, "filename")
    # Set its text
    filename.text = img_file
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = str(img_width)
    height = ET.SubElement(size, "height")
    height.text = str(img_height)
    depth = ET.SubElement(size, "depth")
    depth.text = "3"

    # Iterate over every object in the label file
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 5:
            continue

        cls_id = int(parts[0])

        # Look up the class name for this id; class_map maps name -> id
        cls_name = None
        for k, v in class_map.items():
            if v == cls_id:
                cls_name = k
                break

        # Skip labels whose class id is not in class_map
        if cls_name is None:
            continue

        x = float(parts[1])
        y = float(parts[2])
        w = float(parts[3])
        h = float(parts[4])

        # Compute the bounding-box corners in pixels
        xmin = int((x - w / 2) * img_width)
        ymin = int((y - h / 2) * img_height)
        xmax = int((x + w / 2) * img_width)
        ymax = int((y + h / 2) * img_height)

        # Add the object to the XML tree
        obj = ET.SubElement(root, "object")
        name = ET.SubElement(obj, "name")
        name.text = cls_name
        bndbox = ET.SubElement(obj, "bndbox")
        xmin_node = ET.SubElement(bndbox, "xmin")
        xmin_node.text = str(xmin)
        ymin_node = ET.SubElement(bndbox, "ymin")
        ymin_node.text = str(ymin)
        xmax_node = ET.SubElement(bndbox, "xmax")
        xmax_node.text = str(xmax)
        ymax_node = ET.SubElement(bndbox, "ymax")
        ymax_node.text = str(ymax)
    # Write the tree to a file
    tree = ET.ElementTree(root)
    tree.write(os.path.join(xml_dir, os.path.splitext(yolo_file)[0] + ".xml"))
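
By default ElementTree writes the whole tree on a single line. If a layout like the VOC example earlier in the article is wanted, one option is ET.indent, which was added in Python 3.9. The sketch below assumes that Python version and uses a cut-down tree for brevity.

```python
import xml.etree.ElementTree as ET

root = ET.Element("annotation")
ET.SubElement(root, "filename").text = "pikaqiu.jpg"
size = ET.SubElement(root, "size")
ET.SubElement(size, "width").text = "1062"

# ET.indent (Python 3.9+) inserts newlines and indentation in place.
ET.indent(root, space="   ")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Calling ET.indent on the root just before tree.write would give the same layout in the saved file.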

5. Complete code

import os
import xml.etree.ElementTree as ET

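# Directory containing the YOLO-format label files
yolo_dir = "E:/cv_code/image_processing/aug_datasets/label/"
# Directory where the XML files are saved
xml_dir = "E:/cv_code/image_processing/aug_datasets/Annotations/"
# Image size (every image is assumed to have the same dimensions)
img_width, img_height = 1062, 974

# Mapping between class names and numeric class labels
class_map = {"pikaqiu": 0}

# Iterate over every file in the YOLO label directory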
for yolo_file in os.listdir(yolo_dir):
    if not yolo_file.endswith(".txt"):
        continue

    # Read the YOLO label file
    with open(os.path.join(yolo_dir, yolo_file), "r") as f:
        lines = f.readlines()

    # Derive the image file name (assumes .jpg images)
    img_file = os.path.splitext(yolo_file)[0] + ".jpg"

    # Build the XML tree
    # Create the root node
    root = ET.Element("annotation")
    # Create a child node
    filename = ET.SubElement(root, "filename")
    # Set its text
    filename.text = img_file
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = str(img_width)
    height = ET.SubElement(size, "height")
    height.text = str(img_height)
    depth = ET.SubElement(size, "depth")
    depth.text = "3"

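    # Iterate over every object in the label file
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 5:
            continue

        cls_id = int(parts[0])

        # Look up the class name for this id; class_map maps name -> id
        cls_name = None
        for k, v in class_map.items():
            if v == cls_id:
                cls_name = k
                break

        # Skip labels whose class id is not in class_map
        if cls_name is None:
            continue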

        x = float(parts[1])
        y = float(parts[2])
        w = float(parts[3])
        h = float(parts[4])

        # Compute the bounding-box corners in pixels
        xmin = int((x - w / 2) * img_width)
        ymin = int((y - h / 2) * img_height)
        xmax = int((x + w / 2) * img_width)
        ymax = int((y + h / 2) * img_height)

        # Add the object to the XML tree
        obj = ET.SubElement(root, "object")
        name = ET.SubElement(obj, "name")
        name.text = cls_name
        bndbox = ET.SubElement(obj, "bndbox")
        xmin_node = ET.SubElement(bndbox, "xmin")
        xmin_node.text = str(xmin)
        ymin_node = ET.SubElement(bndbox, "ymin")
        ymin_node.text = str(ymin)
        xmax_node = ET.SubElement(bndbox, "xmax")
        xmax_node.text = str(xmax)
        ymax_node = ET.SubElement(bndbox, "ymax")
        ymax_node.text = str(ymax)
    # Write the tree to a file
    tree = ET.ElementTree(root)
    tree.write(os.path.join(xml_dir, os.path.splitext(yolo_file)[0] + ".xml"))
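
For reuse or testing, the per-file logic can also be wrapped in a function that takes the label lines and returns the XML element, leaving file I/O to the caller. This is only a sketch: the function name build_annotation, its parameters, and the sample lines are hypothetical, and the depth is hard-coded to 3 as in the script above.

```python
import xml.etree.ElementTree as ET


def build_annotation(lines, img_file, img_width, img_height, id_to_name):
    """Build a VOC <annotation> element from the lines of one YOLO label file."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = img_file
    size = ET.SubElement(root, "size")
    for tag, value in (("width", img_width), ("height", img_height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)

    for line in lines:
        parts = line.split()
        if len(parts) < 5 or int(parts[0]) not in id_to_name:
            continue  # skip malformed lines and unknown class ids
        x, y, w, h = (float(p) for p in parts[1:5])
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = id_to_name[int(parts[0])]
        bndbox = ET.SubElement(obj, "bndbox")
        corners = (
            ("xmin", (x - w / 2) * img_width),
            ("ymin", (y - h / 2) * img_height),
            ("xmax", (x + w / 2) * img_width),
            ("ymax", (y + h / 2) * img_height),
        )
        for tag, value in corners:
            ET.SubElement(bndbox, tag).text = str(int(value))
    return root


# Made-up input: one valid label and one with an unknown class id (9).
sample_lines = ["0 0.5 0.5 0.25 0.5", "9 0.1 0.1 0.1 0.1"]
root = build_annotation(sample_lines, "demo.jpg", 1000, 800, {0: "pikaqiu"})
print(ET.tostring(root, encoding="unicode"))
```

Only the first sample line produces an object element; the second is skipped because its class id has no name.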

Converting the YOLO format to the VOC format