Converting labelme json annotations to yolo format

The json files that labelme generates during data labeling cannot be used directly for model training. The mainstream object detection training platforms and project frameworks each have their own data format requirements, usually voc, coco or yolo. Since the yolov8 project is currently popular, this blog post walks through the process of converting json annotations to yolo format, together with the code.

1. Data structure

1.1 labelme (json) data format

A labelme json file stores the following top-level fields:

shapes: an array that stores all labels and their corresponding boxes
imagePath: the image path; the json file and the jpg image are stored in the same folder
imageHeight: the height of the image
imageWidth: the width of the image
shapes is an array of dictionaries. Each dictionary stores the following:

label: the label category name
points: the start point and end point of the annotation box, in pixels
shape_type: the shape of the annotation; rectangle means a rectangular box
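
For reference, the fields described above can be pictured as a Python dictionary. This is a hand-written sketch rather than a real annotation: the file name, image size and coordinates are made-up values, and a real labelme file contains further fields that are not shown here.

```python
import json

# Hypothetical minimal labelme annotation; every value here is illustrative only.
sample = {
    "shapes": [
        {
            "label": "火",  # label category name ("fire")
            "points": [[100.0, 50.0], [300.0, 250.0]],  # start and end point, in pixels
            "shape_type": "rectangle",
        }
    ],
    "imagePath": "example.jpg",  # assumed file name
    "imageHeight": 480,
    "imageWidth": 640,
}

# Round-trip through json text, as if the annotation had been read from disk
data = json.loads(json.dumps(sample, ensure_ascii=False))

for shape in data["shapes"]:
    (x1, y1), (x2, y2) = shape["points"]
    print(shape["label"], shape["shape_type"], x1, y1, x2, y2)
```

Note that the image size lives at the top level, while the label name and the two corner points live inside each entry of shapes.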

1.2 yolo data format

Directory layout
There is no fixed requirement for the directory layout. Usually an images folder stores the original images and a labels folder stores the txt label files.

Txt file structure
Each line of a txt label file has the form {target category id} {normalized center point x coordinate} {normalized center point y coordinate} {normalized box width w} {normalized box height h}. Unlike other formats, yolo labels contain only category ids and no category names. In addition, the xywh values of the annotation box are expressed relative to the image size, so they are not affected when the image is resized.
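
To make the relative coordinates concrete, the sketch below turns one yolo label line back into pixel corner coordinates for a given image size. The function name yolo_line_to_pixels and the sample line are assumptions made for illustration; they are not part of any yolo tooling.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Turn one yolo label line into (class_id, x1, y1, x2, y2) in pixels."""
    class_id, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    # Scale the relative values back up by the image width and height
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return int(class_id), x1, y1, x2, y2


# Illustrative line: class 0, a centered box covering half the image in each direction
line = "0 0.5000 0.5000 0.5000 0.5000"
print(yolo_line_to_pixels(line, 640, 480))   # (0, 160.0, 120.0, 480.0, 360.0)
print(yolo_line_to_pixels(line, 1280, 960))  # (0, 320.0, 240.0, 960.0, 720.0)
```

The same label line describes the same region of the picture at both resolutions, which is why the txt file does not need to change when the image is resized.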

2. Conversion code

2.1 Code logic description

json2yolo converts json format annotations into yolo format. The json format describes the category of each annotation box with a name stored as a str, while the yolo format describes it directly with a category id. In the json annotation, each box is given in absolute pixel coordinates, specifically x1, y1, x2, y2 (x1, y1 is the start point of the box and x2, y2 is its end point). The yolo format instead describes the box with relative coordinates (x values divided by the image width w, y values divided by the image height h) in the form cx, cy, w, h, where cx, cy is the center point of the box and w, h are its width and height.
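
The coordinate conversion described above can be sketched on a single box. The helper name box_to_yolo and the numbers are assumptions chosen for illustration: a box from (100, 50) to (300, 250) in a 640 x 480 image.

```python
def box_to_yolo(x1, y1, x2, y2, img_w, img_h):
    """Convert absolute corner coordinates to relative cx, cy, w, h."""
    cx = (x1 + x2) / 2 / img_w  # center x, relative to the image width
    cy = (y1 + y2) / 2 / img_h  # center y, relative to the image height
    w = abs(x2 - x1) / img_w    # abs() keeps the size positive for any corner order
    h = abs(y2 - y1) / img_h
    return cx, cy, w, h


cx, cy, w, h = box_to_yolo(100, 50, 300, 250, 640, 480)
print("%.4f %.4f %.4f %.4f" % (cx, cy, w, h))  # 0.3125 0.3125 0.3125 0.4167
```

Because the width and height use abs, the result is the same when the box was drawn from the bottom right corner to the top left one.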

2.2 Complete code

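import json
import os

def json2yolo(path):
    # Category dictionary: labelme label name -> yolo category id ('火' = fire, '烟雾' = smoke)
    dic = {'火': '0', '烟雾': '1'}
    # Read as UTF-8 because the file contains Chinese characters
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Image width and height recorded in the json file
    img_w = data["imageWidth"]
    img_h = data["imageHeight"]
    all_line = ''
    for i in data["shapes"]:
        # Normalize the corner points, then compute cx, cy, w, h
        [[x1, y1], [x2, y2]] = i['points']
        x1, x2 = x1 / img_w, x2 / img_w
        y1, y2 = y1 / img_h, y2 / img_h
        cx = (x1 + x2) / 2
        cy = (y1 + y2) / 2
        # Use separate names for the box size so the image size is not
        # overwritten before the next shape is normalized
        bw = abs(x2 - x1)
        bh = abs(y2 - y1)

        # Assemble one line of the yolo txt file
        line = "%s %.4f %.4f %.4f %.4f\n" % (dic[i['label']], cx, cy, bw, bh)
        all_line += line
    # print(all_line)
    # Swap only the file extension, so "json" elsewhere in the path is left alone
    filename = os.path.splitext(path)[0] + '.txt'
    with open(filename, 'w', encoding='utf-8') as fh:
        fh.write(all_line)

path = "D:/yolo_seq/fire_smoke/data/data/"
path_list = os.listdir(path)
# Keep only the json files in the folder
path_list2 = [x for x in path_list if x.endswith(".json")]
for p in path_list2:
    json2yolo(path + p)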


Origin blog.csdn.net/m0_74259636/article/details/132768892