【DEBUG】Custom dataset error in Datasets: FileNotFoundError: Unable to find 'xx' at /

When training Stable Diffusion on a custom dataset, we created the dataset folder as described in the Datasets documentation, but running the test code raised an error.

from datasets import load_dataset
ds = load_dataset('imagefolder', data_files='/xxxxx')
ds["train"][0]

#>>>FileNotFoundError: Unable to find 'xxxxx' at /

We first checked that the directory layout was consistent with the demo:

folder/train/metadata.jsonl
folder/train/0001.png
folder/train/0002.png
folder/train/0003.png
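
For reference, here is a minimal sketch of how the metadata.jsonl file in that layout could be generated. It assumes one JSON object per line with a file_name key naming the image; the text column and the captions are hypothetical placeholders, and write_metadata is an illustrative helper, not part of Datasets.

```python
import json
import os
import tempfile


def write_metadata(train_dir, captions):
    """Write one JSON object per line: the image file name plus its caption."""
    path = os.path.join(train_dir, "metadata.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for file_name, text in captions.items():
            # "file_name" links the row to an image in the same folder;
            # "text" is a hypothetical caption column used for illustration.
            f.write(json.dumps({"file_name": file_name, "text": text}) + "\n")
    return path


# Usage: write two rows into a temporary folder and read them back.
with tempfile.TemporaryDirectory() as train_dir:
    path = write_metadata(
        train_dir,
        {"0001.png": "a red apple", "0002.png": "a blue car"},
    )
    with open(path, encoding="utf-8") as f:
        rows = [json.loads(line) for line in f]
```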

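After checking the source code, we found that it uses fs.glob to match data_files, which only resolves the path at that one folder level. A bare directory path is not itself a file, so the fs.isfile filter in the line below leaves nothing and the error is raised. If you need multi-level folder traversal, you need to pass in a glob expression, that is, add /** after the path.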

glob_iter = [PurePath(filepath) for filepath in fs.glob(pattern) if fs.isfile(filepath)]
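
The effect of that filter can be reproduced outside the library. The sketch below uses Python's standard glob module as a stand-in for fs.glob (an assumption made for illustration; the real code goes through a filesystem object), and resolve_files and demo are hypothetical helper names.

```python
import glob
import os
import tempfile


def resolve_files(pattern):
    """Expand a pattern and keep only regular files, mirroring the isfile filter."""
    matches = glob.glob(pattern, recursive=True)
    return sorted(p for p in matches if os.path.isfile(p))


def demo():
    """Build folder/train/... in a temp dir and compare the two patterns."""
    with tempfile.TemporaryDirectory() as root:
        train = os.path.join(root, "train")
        os.makedirs(train)
        for name in ("metadata.jsonl", "0001.png", "0002.png"):
            with open(os.path.join(train, name), "w") as f:
                f.write("")
        # Bare directory path: the only match is the directory, which is not a file.
        bare = resolve_files(root)
        # Recursive pattern: every file under the directory is matched.
        recursive = resolve_files(os.path.join(root, "**"))
        return bare, [os.path.basename(p) for p in recursive]


bare, recursive = demo()
print(bare)
print(recursive)
```

With the bare path the list of files is empty, which corresponds to the "Unable to find" error; with the trailing ** the files inside train are found.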

Change the code to:

from datasets import load_dataset
ds = load_dataset('imagefolder', data_files='/xxxxx/**')
ds["train"][0]

With this change, the dataset loads and the problem is solved.


Origin blog.csdn.net/u011459717/article/details/131090205