Make val_list.txt and train_list.txt

If you have only the raw dataset (just the image files) and no list files such as val_list.txt for semantic segmentation, you can generate them with the following code:

## Build the label list
import os
import pandas as pd
from sklearn.model_selection import train_test_split

DATAPATH = '/home/aistudio/work/dataset/'

# Sort both listings so that images and labels line up by position
lab_train_lists = os.listdir(DATAPATH + 'lab_train')
lab_train_lists.sort()
print(lab_train_lists[:2])
img_train_lists = os.listdir(DATAPATH + 'img_train')
img_train_lists.sort()

# Each line has the form "<image path> <label path>"
total = [DATAPATH + 'img_train/' + img_train_list + ' ' +
         DATAPATH + 'lab_train/' + lab_train_list
         for img_train_list, lab_train_list in zip(img_train_lists, lab_train_lists)]
df = pd.DataFrame(total)

# Hold out 20% of the pairs for validation; random_state makes the split reproducible
train_df, val_df = train_test_split(df, test_size=0.2, random_state=1000)
val_df.to_csv(DATAPATH + 'val_list.txt', index=False, header=False)
train_df.to_csv(DATAPATH + 'train_list.txt', index=False, header=False)

# Test list: one image path per line
test_lists = os.listdir(DATAPATH + 'img_testA/')
test_lists.sort()
test_total = [DATAPATH + 'img_testA/' + test_list for test_list in test_lists]
test_df = pd.DataFrame(test_total)
test_df.to_csv(DATAPATH + 'test_list.txt', index=False, header=False)

This code was run on AI Studio; change DATAPATH to point to your own dataset.
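One caveat: the code above pairs images and labels purely by sorted position, so a missing or extra file would silently shift every later pair. Here is a minimal sketch of a safer pairing. It assumes each image and its label share the same base name (for example 0001.jpg and 0001.png), which the original post does not state, and pair_by_stem is a hypothetical helper name:

```python
from pathlib import Path


def pair_by_stem(img_dir, lab_dir):
    """Return 'image_path label_path' lines, matching files by base name.

    Assumption: an image and its label share the same stem
    (file name without extension).
    """
    img_dir, lab_dir = Path(img_dir), Path(lab_dir)
    # Map each label's stem to its full path
    labels = {p.stem: p for p in lab_dir.iterdir() if p.is_file()}
    lines, missing = [], []
    for img in sorted(p for p in img_dir.iterdir() if p.is_file()):
        lab = labels.pop(img.stem, None)
        if lab is None:
            missing.append(img.name)
        else:
            lines.append(f"{img} {lab}")
    # Anything left in `labels` is a label without an image
    if missing or labels:
        raise ValueError(
            f"{len(missing)} image(s) without a label, "
            f"{len(labels)} label(s) without an image"
        )
    return lines
```

The returned list can be passed to pd.DataFrame in place of total, and the rest of the script stays the same.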

I am also posting my earlier, loop-based version below because it is more readable. The listing at the top was just an exercise in writing the same thing with list comprehensions.

## Build the label list
import os
import pandas as pd
from sklearn.model_selection import train_test_split
lab_train_lists=os.listdir('/home/aistudio/work/dataset/lab_train')
lab_train_lists.sort()
img_train_lists=os.listdir('/home/aistudio/work/dataset/img_train')
img_train_lists.sort()
print(img_train_lists[0])
print(lab_train_lists[0])
total=[]
for i in range(len(img_train_lists)):
    total.append('/home/aistudio/work/dataset/img_train/'+img_train_lists[i]+' '+'/home/aistudio/work/dataset/lab_train/'+lab_train_lists[i])
print(total[:10])

df=pd.DataFrame(total)
#val_df=df.loc[:14598]
#train_df=df.loc[14598:]
train_df, val_df = train_test_split(df, test_size=0.2, random_state=1000)
val_df.to_csv("/home/aistudio/work/dataset/val_list.txt",index=0,header=0)
train_df.to_csv("/home/aistudio/work/dataset/train_list.txt",index=0,header=0)

# test_lists=os.listdir('/home/aistudio/work/dataset/img_testA/')
# test_lists.sort()
# test_total=[]
# for i in range(len(test_lists)):
#     test_total.append('/home/aistudio/work/dataset/img_testA/'+test_lists[i])
# test_df=pd.DataFrame(test_total)
# test_df.to_csv("/home/aistudio/work/dataset/test_list.txt",index=0,header=0)

As you can see, the listing at the top simply replaces the for loop with a list comprehension; otherwise the two versions do the same thing.
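If pandas and scikit-learn are not installed, a similar 80/20 split can be done with the standard library alone. This is only a sketch, not the method used above: split_lines and write_list are hypothetical names, and the shuffle will not reproduce the exact partition that train_test_split gives with random_state=1000.

```python
import random


def split_lines(lines, val_fraction=0.2, seed=1000):
    """Shuffle a copy of `lines` and return (train_lines, val_lines)."""
    shuffled = list(lines)
    # A dedicated Random instance keeps the split reproducible
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def write_list(path, lines):
    """Write one entry per line, with no index or header."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)
```

For example, train, val = split_lines(total) followed by write_list(DATAPATH + 'train_list.txt', train) and write_list(DATAPATH + 'val_list.txt', val) would produce files in the same one-pair-per-line format.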

Origin blog.csdn.net/zhou_438/article/details/109438369