Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not

Problem:

Python reports the following errors when using transformers:

1.

Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not corrupted.

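In other words, the tokenizer could not read the vocabulary file, and the message suggests that the file is either inaccessible or corrupted.

2.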

Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96

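In other words, a repo id may contain only letters, digits, '-', '_' and '.'; '--' and '..' are forbidden; '-' and '.' cannot start or end the name; and the maximum length is 96 characters.

Solution:

1.

At first I assumed the downloaded files were corrupted, but the error persisted after downloading them again.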


I came across a suggested fix saying that simply upgrading the packages would be enough:

pip install --upgrade transformers sentencepiece

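After trying it, however, the error was still there.

My guess was that the error came not from a corrupted file but from the file being inaccessible, possibly because the path contained Chinese characters. After I changed the path to an English-only one, the error disappeared.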
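The check described above can be sketched as a small helper that runs before the model is loaded. This is only an illustration: the function name `check_model_path` and the example path are hypothetical, and the non-ASCII check reflects the guess made in this post rather than a documented limitation of transformers.

```python
from pathlib import Path


def check_model_path(path_str):
    """Return a list of human-readable problems found with a local model path."""
    problems = []
    # Assumption from this post: non-ASCII (e.g. Chinese) characters in the
    # path may prevent the vocabulary file from being opened.
    if not path_str.isascii():
        problems.append("path contains non-ASCII characters")
    # The model folder itself must exist on disk.
    if not Path(path_str).is_dir():
        problems.append("path is not an existing directory")
    return problems


if __name__ == "__main__":
    # Hypothetical path, used only to demonstrate the check.
    for problem in check_model_path(r"D:\NLP\模型\mDeBERTa-v3-base-mnli-xnli"):
        print("warning:", problem)
```

An empty list means neither of the two problems was detected; it does not guarantee that the files inside the folder are intact.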

2.

Then the following error appeared:

Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96

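It turned out that I had accidentally written the "-" characters in the folder name mDeBERTa-v3-base-mnli-xnli as "_" in my code. After changing every "_" in that name to "-", the error disappeared.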

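from transformers import pipeline

# Local folder holding the downloaded model files; the name must match the
# folder on disk exactly (hyphens, not underscores).
model_name = r"D:\NLP\model\mDeBERTa-v3-base-mnli-xnli"

# Build a zero-shot classification pipeline from the local model folder.
classifier = pipeline("zero-shot-classification", model=model_name)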

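Although the second error message talks about which characters a repo id may contain, it can also simply mean that the model folder name is misspelled. A likely explanation is that when the string does not point to an existing local folder, transformers treats it as a Hugging Face Hub repo id, and a Windows path cannot pass the repo id validation.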
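To make this kind of typo easier to spot, one option is to verify the folder before handing it to `pipeline`. The sketch below uses only the standard library. The helper name `resolve_model_dir` is hypothetical and not part of transformers, and the similarity cutoff of 0.6 is an arbitrary choice.

```python
import difflib
from pathlib import Path


def resolve_model_dir(path_str):
    """Return path_str if it is an existing directory, else raise a helpful error."""
    path = Path(path_str)
    if path.is_dir():
        return str(path)

    # Collect sibling folder names so a near miss such as "_" written
    # instead of "-" can be suggested in the error message.
    parent = path.parent
    siblings = [p.name for p in parent.iterdir() if p.is_dir()] if parent.is_dir() else []
    suggestions = difflib.get_close_matches(path.name, siblings, n=3, cutoff=0.6)

    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise FileNotFoundError(f"Model directory not found: {path_str}.{hint}")
```

The returned string could then be passed as the `model` argument, so a misspelled folder name fails early with a clear message instead of the repo id error.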
