Downloading and installing the Chinese pre-trained model ERNIE 2.0

In July 2019, Baidu upgraded ERNIE and released ERNIE 2.0, a continual-learning framework for semantic understanding, together with pre-trained models built on that framework. It draws on Baidu's large-scale data and on the efficient multi-machine, multi-GPU training of PaddlePaddle, and uses deep neural networks with multi-task learning to keep learning from large amounts of data and knowledge. Models pre-trained with the ERNIE framework have cumulatively learned more than 1 billion pieces of knowledge, covering lexical, syntactic, and semantic aspects of natural language. They provide strong general-purpose semantic representations, can noticeably improve results across a variety of NLP scenarios, and are efficient and convenient to use.

This post shows how to download and use it.

First, download the pre-trained model

ERNIE 2.0 English Base model
https://ernie.bj.bcebos.com/ERNIE_Base_en_stable-2.0.0.tar.gz
Includes the pre-trained model parameters, the dictionary vocab.txt, and the model configuration ernie_config.json.

ERNIE 2.0 English Large model
https://ernie.bj.bcebos.com/ERNIE_Large_en_stable-2.0.0.tar.gz
Includes the pre-trained model parameters, the dictionary vocab.txt, and the model configuration ernie_config.json.
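
The download step can be scripted. The following is a minimal sketch, not an official ERNIE tool: the helper names download_and_extract and missing_files are hypothetical, and the internal directory layout of the archive is an assumption, so the check simply searches the whole extracted tree for the two files named above. Only extract archives from a source you trust.

```python
import os
import tarfile
import urllib.request

# URL taken from the list above; swap in the Large model URL if needed.
BASE_URL = "https://ernie.bj.bcebos.com/ERNIE_Base_en_stable-2.0.0.tar.gz"
# Files the post says each archive contains.
EXPECTED_FILES = ("vocab.txt", "ernie_config.json")


def download_and_extract(url, dest_dir):
    """Download a .tar.gz archive into dest_dir (if absent) and unpack it there."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
    return dest_dir


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that appear nowhere under model_dir."""
    found = set()
    for _root, _dirs, files in os.walk(model_dir):
        found.update(files)
    return [name for name in expected if name not in found]


# Example usage (performs a large download, so it is left commented out):
# model_dir = download_and_extract(BASE_URL, "./ernie_base_en")
# print("missing:", missing_files(model_dir))
```

An empty list from missing_files suggests the archive was unpacked completely; the model parameters themselves are not checked here because their file names are not given in this post.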

Second, download the data

Chinese data: https://ernie.bj.bcebos.com/task_data_zh.tgz

English data: Because of dataset licensing restrictions, the English datasets cannot be provided here directly. To obtain the GLUE data, see the GLUE homepage ( https://gluebenchmark.com/tasks ) and use the download script provided by GLUE ( https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e ).

Suppose all the downloaded datasets are placed under the path $GLUE_DATA. Once the download is complete, run sh ./script/en_glue/preprocess/cvt.sh $GLUE_DATA to convert all of the data; by default, the converted data is written to the folder ./glue_data_processed/.
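
If you prefer to drive the conversion from Python, the sketch below wraps the same command. It assumes the script is started from the root of the ERNIE repository (so that the relative path to cvt.sh resolves) and that the GLUE_DATA environment variable is set; build_convert_command and convert_glue are hypothetical helper names, not part of ERNIE.

```python
import os
import subprocess

# Relative path to the conversion script, as given in the text above.
CONVERT_SCRIPT = "./script/en_glue/preprocess/cvt.sh"


def build_convert_command(glue_data_dir):
    """Build the argument list equivalent to: sh cvt.sh $GLUE_DATA"""
    return ["sh", CONVERT_SCRIPT, glue_data_dir]


def convert_glue(glue_data_dir):
    """Run the conversion; raises if the data directory or the script is missing."""
    if not os.path.isdir(glue_data_dir):
        raise FileNotFoundError("GLUE data directory not found: " + glue_data_dir)
    if not os.path.isfile(CONVERT_SCRIPT):
        raise FileNotFoundError("Run this from the ERNIE repository root")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(build_convert_command(glue_data_dir), check=True)


# Example usage (commented out because it needs the ERNIE repo and the data):
# convert_glue(os.environ["GLUE_DATA"])
```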

Third, install PaddlePaddle

This project depends on Paddle Fluid 1.5. See the installation guide
( https://www.paddlepaddle.org.cn/#quick-start ) for installation instructions.
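
After installing, a quick check of the installed version can catch a mismatch early. This is a rough sketch: the helper names are hypothetical, and treating any 1.5.x release as acceptable is an assumption based on the "Paddle Fluid 1.5" requirement stated above.

```python
def installed_paddle_version():
    """Return the installed Paddle version string, or None if Paddle is absent."""
    try:
        import paddle
    except ImportError:
        return None
    return paddle.__version__


def is_fluid_1_5(version):
    """True when the version string starts with major.minor 1.5 (assumption)."""
    return version.split(".")[:2] == ["1", "5"]


if __name__ == "__main__":
    version = installed_paddle_version()
    if version is None:
        print("Paddle is not installed")
    elif is_fluid_1_5(version):
        print("Paddle {} matches the 1.5 requirement".format(version))
    else:
        print("Paddle {} found, but this project expects 1.5".format(version))
```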

[Important] After installation, promptly add the paths of the CUDA, cuDNN, NCCL2, and other dynamic libraries to the LD_LIBRARY_PATH environment variable; otherwise training will fail with errors about the missing libraries. For PaddlePaddle configuration details, see:
https://www.paddlepaddle.org.cn/documentation/docs/zh/1.5/beginners_guide/quick_start_cn.html
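
To see whether the library directories are already on LD_LIBRARY_PATH before starting training, something like the sketch below can help. The directory paths in the usage example are placeholders for a typical Linux install and are assumptions; substitute the locations of CUDA, cuDNN, and NCCL2 on your machine. Note that the variable has to be exported in the shell before Python starts; changing it inside an already running process does not affect library loading.

```python
import os


def missing_library_dirs(required_dirs, ld_library_path):
    """Return the required directories that are not listed in ld_library_path."""
    present = set(
        os.path.normpath(p) for p in ld_library_path.split(":") if p
    )
    return [d for d in required_dirs if os.path.normpath(d) not in present]


if __name__ == "__main__":
    # Placeholder locations (assumptions); adjust to your installation.
    required = ["/usr/local/cuda/lib64", "/usr/local/nccl/lib"]
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for directory in missing_library_dirs(required, current):
        # Print the shell line to add, rather than modifying this process.
        print("export LD_LIBRARY_PATH={}:$LD_LIBRARY_PATH".format(directory))
```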

If you want to learn more about Paddle, for example how to model practical problems or build your own networks, the official documentation has more for reference:
Basic concepts : the basic concepts used in Fluid
Preparing data : how data is fed to the network when training with Fluid, and the supported data types
Configuring a simple network : how to model a problem and build a network with the relevant Fluid operators
Training a neural network : how to use Fluid for single-machine and multi-machine training, and how to save and load model variables
Model evaluation and debugging : the model evaluation and debugging methods in Fluid

ERNIE's other dependencies are listed in the requirements.txt file; install them with the following command:

pip install -r requirements.txt

Key points!
For the full ERNIE documentation and tutorials, follow the link below. Starring the repository is recommended so that it is easy to find again later.
GitHub: https://github.com/PaddlePaddle/ERNIE
New versions and the latest developments are published on GitHub first, so keep an eye on it.

You are also invited to join the official ERNIE technical exchange QQ group: 760 439 550. Technical questions can be discussed in the group, and ERNIE developers answer them promptly.

Origin blog.csdn.net/qq_40247584/article/details/102917265