1. Introduction
When using EDA (Easy Data Augmentation) for data augmentation, the Synonyms library is needed to look up synonyms.
Synonyms is a Chinese synonym toolkit that can be used for many natural language understanding (NLU) tasks, such as text alignment, recommendation algorithms, similarity calculation, semantic shifting, keyword extraction, concept extraction, automatic summarization, and search engines. The toolkit currently supports finding synonyms and comparing sentence similarity, and has a vocabulary of 125,792 words. Its underlying technology is Word2vec.
2. Problem encountered
The first time synonyms is used after installation, it tries to download the word vector file, but the download fails with a 403 error. The URL it requests is https://gitee.com/chatopera/cskefu/attach_files/610602/download/words.vector.gz, and opening this URL in a browser is denied as well.
3. Solution
Manually download the word vector file from the download link provided on GitHub, and then put the file in the specified location.
Download link:
https://github.com/chatopera/Synonyms/releases/download/3.15.0/words.vector.gz
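Word vector file storage location (the data directory of the installed synonyms package; the path below is from my environment, so adjust it to yours): /home/zhenhengdong/anaconda3/lib/python3.9/site-packages/synonyms/data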
After downloading the word vector file and placing it in the specified location, `import synonyms` works again.
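The manual steps above can also be scripted. The following is a minimal sketch, not part of the Synonyms library: the helper names synonyms_data_dir, download_vectors, and place_vector_file are hypothetical, and it assumes the file should keep the name words.vector.gz inside the package's data directory. It locates the package with importlib without importing it, since importing is what triggers the failing download.

```python
import importlib.util
import shutil
import urllib.request
from pathlib import Path

# Download link from the GitHub release mentioned above.
VECTOR_URL = "https://github.com/chatopera/Synonyms/releases/download/3.15.0/words.vector.gz"
# Assumption: the file keeps its original name inside the data directory.
VECTOR_NAME = "words.vector.gz"


def synonyms_data_dir(package="synonyms"):
    """Return <package dir>/data without importing the package itself."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise RuntimeError(f"package {package!r} is not installed")
    return Path(spec.origin).parent / "data"


def download_vectors(dest_dir, url=VECTOR_URL):
    """Download the word vector file straight into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / VECTOR_NAME
    urllib.request.urlretrieve(url, str(dest))
    return dest


def place_vector_file(src, dest_dir):
    """Copy an already downloaded file into dest_dir under the expected name."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / VECTOR_NAME
    shutil.copyfile(src, dest)
    return dest


# Example usage (uncomment one of the two lines):
# download_vectors(synonyms_data_dir())
# place_vector_file("words.vector.gz", synonyms_data_dir())
```

If the file was already downloaded in a browser, place_vector_file is enough; download_vectors is only useful when the GitHub link is reachable from the machine where synonyms is installed.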