Converting your own classification dataset into the ILSVRC dataset format


Foreword

This article explains how to convert an existing image classification dataset into the ILSVRC format.


1. Original dataset

The original classification dataset is split into a train directory and a val directory. Under each of them are folders named after the categories, one folder per category. That is, the following structure needs to be satisfied:

-dataset
    -train
        -class1
        -class2
        ...
    -val
        -class1
        -class2
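
Before converting, it can help to confirm that a dataset really follows this layout. The snippet below is a minimal sketch of such a check, not part of the original article: the helper name build_class_id is hypothetical, and it assumes that ids are assigned in alphabetical order of the folder names under train and that every class in val must also exist in train. Use whatever id assignment your training code expects.

```python
import os


def build_class_id(root):
    """Check the train/val layout under root and map each class name to an id."""
    splits = {}
    for mode in ("train", "val"):
        split_dir = os.path.join(root, mode)
        if not os.path.isdir(split_dir):
            raise FileNotFoundError(f"missing directory: {split_dir}")
        # Each immediate subdirectory is treated as one category.
        splits[mode] = sorted(
            name for name in os.listdir(split_dir)
            if os.path.isdir(os.path.join(split_dir, name))
        )
    # Assumption: a class that appears only in val is treated as an error.
    unknown = set(splits["val"]) - set(splits["train"])
    if unknown:
        raise ValueError(f"val contains classes missing from train: {sorted(unknown)}")
    # Assumption: ids follow the alphabetical order of the train folder names.
    return {name: idx for idx, name in enumerate(splits["train"])}
```

The returned dictionary has the same shape as the class_id dictionary used by the conversion code in the next section, so it could be used to fill that dictionary in instead of typing it by hand.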
        

2. Format conversion

The conversion code is as follows:

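import os

root = ""  # dataset root directory
# Dictionary mapping each category name to its id, e.g. {"class1": 0, "class2": 1}.
# Be sure to keep a record of this mapping.
class_id = {

}


def create_txt(filename, mode):
    # The list file is written to the dataset root, next to train and val.
    filepath = os.path.join(root, filename)
    # "w" rather than "a+", so re-running the script does not append duplicate lines.
    with open(filepath, "w") as f:
        img_dir = os.path.join(root, mode)
        for root1, dirs, files in os.walk(img_dir):
            for file in files:
                # The folder that contains the image is its category name.
                classname = os.path.basename(root1)
                imgpath = os.path.join(root1, file)
                # One line per image: "<image path> <category id>"
                data = imgpath + " " + str(class_id[classname]) + "\n"
                f.write(data)


if __name__ == '__main__':
    create_txt("val.txt", "val")  # for the training set: create_txt("train.txt", "train")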

After running the script once for each split, train.txt and val.txt are generated in the same directory as train and val (the dataset root). Each line in these files records an image path followed by its category id, separated by a space.
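
To sanity-check a generated list file, something like the following sketch could be used. It is an illustration rather than part of the original article: the function name summarize_list_file is hypothetical, and it assumes each line has the form "image path, space, category id" described above. It counts the images per category id and collects any listed paths that do not exist on disk.

```python
import os
from collections import Counter


def summarize_list_file(list_path):
    """Return (images per category id, listed paths that do not exist)."""
    counts = Counter()
    missing = []
    with open(list_path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the last space so paths containing spaces stay intact.
            img_path, label = line.rsplit(" ", 1)
            counts[int(label)] += 1
            if not os.path.isfile(img_path):
                missing.append(img_path)
    return counts, missing
```

For example, calling summarize_list_file(os.path.join(root, "val.txt")) and seeing an empty missing list and a non-zero count for every id in class_id suggests the conversion covered every category.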


Summary

That is all for this article. If you have any questions, feel free to discuss them in the comments.

Origin blog.csdn.net/qq_55068938/article/details/128190471