Working around Hugging Face model and dataset downloads that fail in code because of network restrictions (a partial fix)

Downloading Hugging Face models

In practice, the model is downloaded manually with git. The steps are:

sudo apt-get update
sudo apt-get install git-lfs
git lfs install 

Then clone the model repository:

git clone https://huggingface.co/roberta-large
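Once the repository is cloned, the model can be loaded from the local directory instead of from the Hub. The following is a minimal sketch, assuming the clone lives in ./roberta-large and that the transformers package is installed. The helper names `is_lfs_pointer` and `load_local_roberta` are made up for illustration. The pointer check is there because a clone made without git-lfs leaves small text stubs in place of the real weight files.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is an unresolved Git LFS pointer, not real data."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def load_local_roberta(model_dir="./roberta-large"):
    """Load tokenizer and model from a directory created by `git clone`.

    Hypothetical helper; the default directory name is an assumption.
    """
    model_path = Path(model_dir)

    # A cloned model repository is expected to contain config.json.
    if not (model_path / "config.json").is_file():
        raise FileNotFoundError(f"No config.json found in {model_path}")

    # Fail early if git-lfs did not fetch the real files.
    for item in model_path.iterdir():
        if item.is_file() and is_lfs_pointer(item):
            raise RuntimeError(
                f"{item.name} is a Git LFS pointer; run `git lfs pull` in {model_path}"
            )

    # Imported lazily so the checks above work without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # local_files_only=True keeps the library from contacting huggingface.co.
    tokenizer = AutoTokenizer.from_pretrained(str(model_path), local_files_only=True)
    model = AutoModel.from_pretrained(str(model_path), local_files_only=True)
    return tokenizer, model
```

Calling `tokenizer, model = load_local_roberta("./roberta-large")` on the server should then work without network access, provided the cloned directory was copied over completely.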

Downloading Hugging Face datasets

First, some datasets can also be downloaded with git. This only works when the file list on the dataset page contains the dataset itself. Some dataset repositories contain only a Python script that downloads the data, and cloning those does not help.
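Whether a cloned dataset repository holds real data or only a loading script can be checked roughly by looking at file extensions. The sketch below is only a heuristic: the extension list, the metadata file names, and the function names `find_data_files` and `is_script_only` are all assumptions made for illustration, not part of any Hugging Face API.

```python
from pathlib import Path

# Assumed extensions that usually indicate real data files.
DATA_SUFFIXES = {".parquet", ".csv", ".tsv", ".json", ".jsonl", ".arrow", ".zip", ".gz"}
# Assumed metadata files that match the suffixes above but carry no data.
METADATA_NAMES = {"dataset_infos.json", "dataset_info.json"}


def find_data_files(repo_dir):
    """Return data-like files in a cloned dataset repository, skipping .git."""
    repo = Path(repo_dir)
    found = []
    for path in repo.rglob("*"):
        rel = path.relative_to(repo)
        if not path.is_file() or ".git" in rel.parts:
            continue
        if path.name in METADATA_NAMES:
            continue
        if path.suffix.lower() in DATA_SUFFIXES:
            found.append(rel.as_posix())
    return sorted(found)


def is_script_only(repo_dir):
    """Return True if no data-like file was found, so git clone alone is not enough."""
    return not find_data_files(repo_dir)
```

If `is_script_only` returns True for a clone, the data most likely has to be fetched through `load_dataset` on a machine with network access.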

Otherwise, the only options are to use a proxy (VPN) or switch to a different network, download the dataset on a local machine, and then upload it to the server:

# Download the dataset and store it locally
from datasets import load_dataset
dataset = load_dataset('super_glue', 'cb', cache_dir='./raw_datasets')
dataset.save_to_disk('superglue_cb')

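# Load the local files
# The path must point at the directory written by save_to_disk
# (here it is assumed to have been uploaded under saved_to_disk/ on the server)
from datasets import load_from_disk
raw_dataset = load_from_disk("saved_to_disk/superglue_cb")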
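For the upload step between saving and loading, it can be convenient to pack the saved directory into a single archive first. This is a minimal sketch using only the standard library. The function names `pack_dataset` and `unpack_dataset` and the paths in the usage note are hypothetical, and the transfer itself (for example with scp) is left out.

```python
import shutil


def pack_dataset(saved_dir, archive_base):
    """Create <archive_base>.tar.gz from the contents of saved_dir.

    Returns the path of the archive that was written.
    archive_base should lie outside saved_dir so the archive does not include itself.
    """
    return shutil.make_archive(archive_base, "gztar", root_dir=saved_dir)


def unpack_dataset(archive_path, target_dir):
    """Extract the archive into target_dir and return target_dir."""
    shutil.unpack_archive(archive_path, target_dir)
    return target_dir
```

On the local machine this might be called as `pack_dataset("superglue_cb", "superglue_cb_upload")`, and on the server as `unpack_dataset("superglue_cb_upload.tar.gz", "saved_to_disk/superglue_cb")`, after which `load_from_disk` can read the extracted directory.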

Origin blog.csdn.net/Defiler_Lee/article/details/132825677