[Engineering Practice] Fixing Synonyms failing to download the word vector file

1. Introduction

        When using EDA (Easy Data Augmentation) for data augmentation, the Synonyms library is needed to look up synonyms.

        Synonyms is a Chinese synonym toolkit that can be used for many natural language understanding (NLU) tasks, such as text alignment, recommendation algorithms, similarity calculation, semantic shifting, keyword extraction, concept extraction, automatic summarization, and search engines. The toolkit currently supports looking up synonyms and comparing sentence similarity, and has a vocabulary of 125,792 words. The underlying technology is Word2vec.

2. The problem

        After synonyms is installed, the first import tries to download the word vector file, and that download fails. The request to https://gitee.com/chatopera/cskefu/attach_files/610602/download/words.vector.gz returns a 403 error, and opening the same URL in a browser is also denied.

3. Solution

        Download the word vector file manually from the address provided on GitHub, then put it in the location that synonyms expects.

        Download link: 

https://github.com/chatopera/Synonyms/releases/download/3.15.0/words.vector.gz

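        Word vector file storage location (the data directory inside the installed synonyms package; this is the path on the author's machine): /home/zhenhengdong/anaconda3/lib/python3.9/site-packages/synonyms/data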
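        The manual steps above can also be scripted. The following is a minimal sketch, not part of the Synonyms library: the helper names package_data_dir and fetch_file are hypothetical, and it assumes that the data directory sits directly inside the installed package, as in the path shown above. It locates the package with importlib.util.find_spec, which does not import synonyms and therefore does not trigger the failing download.

```python
import importlib.util
import shutil
import urllib.request
from pathlib import Path

# Download address taken from the article above.
VECTORS_URL = "https://github.com/chatopera/Synonyms/releases/download/3.15.0/words.vector.gz"


def package_data_dir(package_name: str = "synonyms") -> Path:
    """Return <package dir>/data without importing the package (hypothetical helper)."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise RuntimeError(f"{package_name!r} is not an installed package")
    return Path(list(spec.submodule_search_locations)[0]) / "data"


def fetch_file(url: str, dest: Path) -> Path:
    """Download url to dest, writing to a temporary name first (hypothetical helper)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    # Rename only after the full download, so a partial file never
    # sits under the final name.
    tmp.replace(dest)
    return dest


# Example call (performs a real download of a large file):
# fetch_file(VECTORS_URL, package_data_dir() / "words.vector.gz")
```

        Run it with the same Python interpreter (for example the same conda environment) that has synonyms installed, so that the file lands in the right site-packages directory.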

        After the word vector file is downloaded and placed in that location, import synonyms works again.
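        Before importing synonyms again, it can be worth checking that the file really is a gzip archive and not a saved error page (a denied request may leave an HTML body behind). This is a small sketch under that assumption; looks_like_gzip is a hypothetical helper that only inspects the two-byte gzip magic number and does not validate the rest of the archive.

```python
from pathlib import Path

# Every gzip file starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path) -> bool:
    """Return True if path exists and starts with the gzip magic number."""
    path = Path(path)
    if not path.is_file():
        return False
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

        Call it on the words.vector.gz file in the synonyms data directory; if it returns False, download the file again.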


Origin blog.csdn.net/weixin_44750512/article/details/131677363