How to change huggingface transformers default cache directory
Foreword
Recently, I have been learning to use the TensorFlow framework for NLP tasks. I noticed that the transformers library from huggingface is very powerful, so I started learning to use it for these tasks. After only a short time with the library, I found it simple and powerful to work with, so I plan to study it further.
- While learning, I noticed that the models and datasets downloaded when running a program are placed under the user directory on the C drive by default. To reduce the load on the C drive, I wanted to change this default directory. From the official documentation I learned that there are two ways to do this: one is to temporarily specify cache_dir, and the other is to set an environment variable directly. Both are described below.
How to modify the default cache folder of huggingface transformers on Windows
- The official description of the cache address:
The first method: setting an environment variable:
On Windows, for convenience in future use, I chose the method of setting the cache location through the TRANSFORMERS_CACHE environment variable, which I defined as a user environment variable:
The next key step:
Add the following entry to the user environment variable Path (or the system environment variable PATH):
%TRANSFORMERS_CACHE%
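The same variable can also be set from within Python, as long as it is assigned before the transformers library is imported (the library reads it when it loads). The path below is a hypothetical example, not a required location:

```python
import os

# Point the transformers cache at a hypothetical folder on the D: drive.
# NOTE: this must run before `import transformers`, because the library
# reads TRANSFORMERS_CACHE when it is first imported.
os.environ["TRANSFORMERS_CACHE"] = r"D:\hf_cache"

print(os.environ["TRANSFORMERS_CACHE"])
```

Setting the variable in the OS (as above in the environment-variable dialog) is more convenient for everyday use, since it applies to every script without modification.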
The second method: pass the cache_dir argument when calling the from_pretrained function, specifying the cache folder directly.
For example, specify the cache location as the current directory:
from transformers import AutoModel
AutoModel.from_pretrained('bert-base-chinese', cache_dir='./')
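The relationship between the two methods can be sketched as follows. resolve_cache_dir is a hypothetical helper written only for illustration, not part of the transformers API; it shows the precedence that applies in practice: an explicit cache_dir argument wins over the TRANSFORMERS_CACHE environment variable, which in turn wins over the default cache under the user's home directory:

```python
import os

def resolve_cache_dir(cache_dir=None):
    """Hypothetical sketch of cache-directory precedence:
    explicit cache_dir > TRANSFORMERS_CACHE env var > default user cache."""
    if cache_dir is not None:
        # An explicit argument to from_pretrained takes priority.
        return cache_dir
    env = os.environ.get("TRANSFORMERS_CACHE")
    if env:
        # Next, the environment variable set system-wide or per-session.
        return env
    # Finally, the default location under the user's home directory
    # (on Windows this ends up on the C drive).
    return os.path.join(os.path.expanduser("~"), ".cache", "huggingface")

print(resolve_cache_dir(cache_dir="./"))  # explicit argument wins
```

This is why the cache_dir argument is suitable for one-off overrides in a single script, while the environment variable is the better choice for changing the default everywhere.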
That's all.