Foreword
This article explains how to convert an existing image classification dataset into a dataset in ILSVRC format.
1. Original dataset
The original classification dataset is organized roughly as follows: the root directory contains a train directory and a val directory.
Under train and val there is one folder per category, named after the category, as shown in the figure. In other words, the
following structure must be satisfied:
-dataset
  -train
    -class1
    -class2
    ...
  -val
    -class1
    -class2
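The conversion step needs a dictionary that maps each category name to an id, and it can be derived from this layout instead of being typed by hand. The following is a minimal sketch, assuming that ids are assigned in sorted order of the folder names under train; that ordering is an assumption, and any fixed assignment works as long as it is used consistently. The helper names build_class_id and missing_classes are hypothetical and do not come from the original code.

```python
import os


def build_class_id(root, split="train"):
    """Map each class folder under root/split to an integer id.

    Assumption: ids follow the sorted order of the folder names.
    """
    split_dir = os.path.join(root, split)
    classes = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
    return {name: idx for idx, name in enumerate(classes)}


def missing_classes(root, class_id, split="val"):
    """Return class folders under root/split that have no id in class_id."""
    split_dir = os.path.join(root, split)
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name)) and name not in class_id
    )
```

If missing_classes returns a non-empty list, the val directory contains a category that train does not, and looking up its id during conversion would fail.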
2. Format conversion
The conversion code is as follows:
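import os

root = ""  # dataset root directory
# Dictionary mapping each category name to its id, e.g. {"class1": 0, "class2": 1}.
# Be sure to remember this mapping.
class_id = {
}

def create_txt(filename, mode):
    """Write one "image_path class_id" line per image under root/mode."""
    filepath = os.path.join(root, filename)
    img_dir = os.path.join(root, mode)
    # "w" instead of "a+" so that re-running does not append duplicate lines
    with open(filepath, "w") as f:
        for root1, dirs, files in os.walk(img_dir):
            # the folder that directly contains the images is the category name
            classname = os.path.basename(root1)
            for file in files:
                imgpath = os.path.join(root1, file)
                f.write(imgpath + " " + str(class_id[classname]) + "\n")

if __name__ == '__main__':
    # for the training set, call create_txt("train.txt", "train")
    create_txt("val.txt", "val")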
After running it for both splits, train.txt and val.txt are generated in the same directory as train and val (the dataset root), as shown in the figure.
Each line records an image path followed by its category id, separated by a space.
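To sanity-check a generated file, each line can be split back into a path and an id. The following is a minimal sketch, assuming each line has the "path id" layout with a single space before the id; the helper name read_list is hypothetical. It splits from the right so that paths containing spaces are preserved.

```python
def read_list(txt_path):
    """Parse lines of the form "<image path> <class id>" into (path, id) pairs."""
    samples = []
    with open(txt_path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # split on the last space only: the path itself may contain spaces
            path, label = line.rsplit(" ", 1)
            samples.append((path, int(label)))
    return samples
```

For example, read_list("val.txt") returns a list of (path, id) tuples, which can then be checked with os.path.exists or counted per id.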
Summary
That is all for this article. If you have any questions, feel free to discuss them in the comment area.