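Adding a new class to a YOLO dataset without re-annotating it

Problem: my dataset is already labeled with classes 0 and 1. I want to add class 2, whose objects already appear in my images, without annotating the whole dataset again.

Solution: run detect on my dataset with another model's weights and save the results as txt files → change the class ID and merge the txt files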

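1. Use the weights of an existing model (one that can detect class 2) to run detect on your own dataset and save the txt files to path A (by default detect.py writes them under runs/detect/exp*/labels)

python detect.py --save-txt --source <your dataset directory> --weights ./weights/yolov5x.pt

Your dataset directory, in my case:
E:\Desktop\detection project\datasets\VOC2000\labels\val
(--source must be given the folder that holds the images for this split, and a path containing spaces has to be wrapped in quotes.)
weights: yolov5x.pt. The YOLO pretrained model already includes the person class that I need.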

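2. Change the class ID and merge the txt files.

Change the class ID: the class we need (class 2 in our dataset) may be class 0 in the other model's detect output, so 0 has to be changed to 2.

Merge the txt files: append the class-2 lines to the txt files of your own dataset, and only the class-2 lines.

Be aware that the other model detects more than just the class 2 you want, so the classes you don't need have to be filtered out. After trying a lot of code I finally found the cleanest approach; the others either didn't change the class ID or merged in other classes as well, and were never quite right!!!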
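The filtering and renumbering described above can be sketched as one small function that works on label lines. This is only an illustration, not the scripts used in this post: `extract_class` is a hypothetical helper, and it assumes the usual YOLO txt layout of `class x_center y_center width height` on each line.

```python
def extract_class(lines, src_id, dst_id):
    """Keep only the lines of class src_id and renumber them to dst_id."""
    out = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and every class other than the one we want.
        if not fields or fields[0] != str(src_id):
            continue
        fields[0] = str(dst_id)
        out.append(" ".join(fields))
    return out


if __name__ == "__main__":
    # Made-up detections: two of class 0 and one of class 16.
    detected = [
        "0 0.50 0.50 0.20 0.40",
        "16 0.30 0.70 0.10 0.10",
        "0 0.10 0.20 0.05 0.08",
    ]
    for row in extract_class(detected, src_id=0, dst_id=2):
        print(row)
```

Comparing the whole first field, rather than the first characters of the line, means class 1 can never be confused with class 10 or 16.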

First, extract the class you need (our class 2) from the many detected classes into a new folder:

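import os

# txt files written by yolov5 detect.py for the inference images
YOLOV5_LABEL_ROOT = r'E:\Desktop\detection project\datasets\VOC2000\labels\labels'
# output folder for the extracted labels; use a new, empty folder
DATASET_LABEL_ROOT = r'E:\Desktop\detection project\datasets\VOC2000\labels\val-only'

# Class ID to keep, as numbered by the other model.
# 38 is the example ID used in these scripts; person is 0 in the
# COCO-pretrained yolov5 weights.
SOURCE_CLASS = '38'

if __name__ == '__main__':
    os.makedirs(DATASET_LABEL_ROOT, exist_ok=True)

    for file_name in os.listdir(YOLOV5_LABEL_ROOT):

        # only read .txt files
        if not file_name.endswith(".txt"):
            continue

        file_path = os.path.join(YOLOV5_LABEL_ROOT, file_name)
        with open(file_path, "r") as f:
            # keep the lines whose first field is the wanted class ID
            # (slicing with [:1] also copes with blank lines)
            kept = [line for line in f if line.split()[:1] == [SOURCE_CLASS]]

        # no object of the wanted class in this image
        if not kept:
            continue

        data_path = os.path.join(DATASET_LABEL_ROOT, file_name)
        print(data_path)
        # write the extracted lines to a file with the same name
        with open(data_path, "w") as fd:
            fd.writelines(kept)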

Next, change the class ID you just extracted to the ID the class should have in your own dataset.

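import os
import re

# folder holding the extracted labels from the previous step
path = r'E:\Desktop\detection project\datasets\VOC2000\labels\val-only'
# collect the txt files
files = []
for file in os.listdir(path):
    if file.endswith(".txt"):
        files.append(os.path.join(path, file))
# read, modify and rewrite each file
for file in files:
    with open(file, 'r') as f:
        # Replace a leading 38 with 555, where 555 stands for the class ID in your own dataset.
        # \b stops the pattern from also matching longer IDs that start with 38.
        new_data = re.sub(r'^38\b', '555', f.read(), flags=re.MULTILINE)
    with open(file, 'w') as f:
        f.write(new_data)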
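One thing to watch with the regex approach: a pattern that only anchors the start of the line matches every class ID beginning with those digits. The sketch below uses made-up label text (not from my dataset) to compare a prefix-only pattern with one that ends in a word boundary `\b`.

```python
import re

# Two hypothetical label lines: class 3 and class 38.
labels = "3 0.5 0.5 0.2 0.2\n38 0.1 0.1 0.3 0.3\n"

# Prefix-only pattern: the "3" at the start of "38" is rewritten as well.
loose = re.sub(r"^3", "2", labels, flags=re.MULTILINE)

# With \b the pattern matches the ID 3 only when it is the whole field.
strict = re.sub(r"^3\b", "2", labels, flags=re.MULTILINE)

print(loose)
print(strict)
```

With the prefix-only pattern the class 38 line silently becomes class 28, so it is worth checking the pattern against the IDs that actually occur in your files.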

Finally, merge the two folders by appending the newly extracted class to the label files of your own dataset. Done!!

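import os

# folder with the extracted, renumbered labels
path = r'E:\Desktop\detection project\datasets\VOC2000\labels\val-only'
# label folder of your own dataset
oldpath = r'E:\Desktop\detection project\datasets\VOC2000\labels\val'

for name in os.listdir(path):
    old_file = os.path.join(oldpath, name)
    # merge only into label files that already exist in the dataset
    if not os.path.isfile(old_file):
        continue
    with open(os.path.join(path, name), 'r') as f1:
        new_lines = f1.read()
    with open(old_file, 'r') as f2:
        old_lines = f2.read()
    with open(old_file, 'a') as f2:
        # start on a fresh line if the existing file lacks a trailing newline
        if old_lines and not old_lines.endswith('\n'):
            f2.write('\n')
        f2.write(new_lines)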
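Appending is not repeatable: running the merge twice writes the new boxes twice. As a possible variation, the sketch below skips lines that are already present in the target file. `append_labels` is a hypothetical helper, and the demo runs on temporary files rather than the dataset paths above.

```python
import os
import tempfile


def append_labels(src_file, dst_file):
    """Append lines from src_file to dst_file, skipping lines already there.

    Returns the number of lines that were actually added.
    """
    with open(dst_file, "r") as f:
        existing = f.read()
    seen = set(existing.splitlines())

    with open(src_file, "r") as f:
        new_lines = [ln for ln in f.read().splitlines() if ln and ln not in seen]
    if not new_lines:
        return 0

    with open(dst_file, "a") as f:
        # Avoid gluing the first new line onto an unterminated last line.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write("\n".join(new_lines) + "\n")
    return len(new_lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dst = os.path.join(tmp, "dst.txt")
        src = os.path.join(tmp, "src.txt")
        with open(dst, "w") as f:
            f.write("0 0.5 0.5 0.1 0.1")  # no trailing newline on purpose
        with open(src, "w") as f:
            f.write("2 0.3 0.3 0.2 0.2\n")
        print(append_labels(src, dst))  # first run adds the line
        print(append_labels(src, dst))  # second run finds nothing new
        with open(dst) as f:
            print(f.read())
```

The comparison is on the exact text of each line, so it only catches repeats of the same merge, not near-duplicate boxes.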

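Then retrain on the updated dataset (remember to add the new class to the class list in your dataset configuration).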
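Before retraining, it can help to confirm that the merged labels contain the classes you expect. A rough sketch, assuming one txt file per image with the class ID as the first field: `count_classes` is a hypothetical helper, and the demo builds a throwaway label folder instead of reading the real one.

```python
import os
import tempfile
from collections import Counter


def count_classes(label_dir):
    """Count how many boxes of each class ID appear in a label folder."""
    counts = Counter()
    for name in os.listdir(label_dir):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(label_dir, name), "r") as f:
            for line in f:
                fields = line.split()
                if fields:  # ignore blank lines
                    counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "img1.txt"), "w") as f:
            f.write("0 0.5 0.5 0.1 0.1\n2 0.3 0.3 0.2 0.2\n")
        with open(os.path.join(tmp, "img2.txt"), "w") as f:
            f.write("1 0.4 0.4 0.1 0.1\n2 0.6 0.6 0.2 0.2\n")
        print(sorted(count_classes(tmp).items()))
```

If an unexpected ID shows up here (for example a leftover 38), the renumbering step was skipped for some files.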
