Google releases ALBERT V2 and Chinese model

When it launched in September 2019, Google's ALBERT language model achieved SOTA results on popular natural language understanding (NLU) benchmarks such as GLUE, RACE, and SQuAD 2.0. Google has now released a major V2 ALBERT update and open-sourced the Chinese ALBERT model.

As the full name "A Lite BERT" suggests, ALBERT is a lighter version of the company's BERT (Bidirectional Encoder Representations from Transformers) language representation model, which has become a mainstay of NLU research. The paper "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" was accepted to ICLR 2020, held in Addis Ababa, Ethiopia this April.

As outlined in a previous Synced report, ALBERT is a leaner BERT: an ALBERT configuration comparable to BERT-large achieved SOTA on three NLP benchmarks with 18x fewer parameters and 1.7x faster training.
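Part of that parameter reduction comes from ALBERT's factorized embedding parameterization (the rest comes from cross-layer parameter sharing, which this sketch does not model). A rough back-of-the-envelope calculation, using the vocabulary, hidden, and embedding sizes the ALBERT paper reports for its "large" configuration:

```python
# Rough arithmetic for ALBERT's factorized embedding parameterization.
# Sizes are those reported in the ALBERT paper for the "large" setting:
# vocab V = 30000, hidden H = 1024, embedding E = 128.
V, H, E = 30_000, 1024, 128

bert_style = V * H             # BERT: a single V x H embedding matrix
albert_style = V * E + E * H   # ALBERT: V x E lookup + E x H projection

print(f"BERT-style embeddings: {bert_style:,} parameters")
print(f"ALBERT factorized:     {albert_style:,} parameters")
print(f"reduction factor:      {bert_style / albert_style:.1f}x")
```

The embedding factorization alone cuts embedding parameters by roughly 7.7x here; combined with sharing one set of transformer weights across all layers, this is how ALBERT reaches the overall 18x reduction versus BERT-large.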

Comparison between v2 and v1 models
The main changes in ALBERT v2 involve three strategies: no dropout, additional training data, and longer training. The researchers trained ALBERT-base for 10M steps and the other models for 3M steps. The results show that the v2 models generally perform significantly better than their v1 counterparts.
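Dropout randomly zeroes activations during training as a regularizer; setting its rate to zero, as v2 does, makes the layer a no-op. A minimal stdlib illustration of the mechanism (not ALBERT's actual implementation):

```python
import random

def dropout(activations, p, training=True, seed=None):
    """Inverted dropout: zero each activation with probability p and
    rescale survivors by 1/(1-p). With p=0.0 (the ALBERT v2 setting)
    the input passes through unchanged."""
    if not training or p == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [0.5, -1.2, 0.8, 2.0]
assert dropout(acts, p=0.0) == acts   # v2 behavior: dropout disabled
print(dropout(acts, p=0.5, seed=0))   # v1-style: roughly half zeroed, rest doubled
```

Removing dropout trades some regularization for capacity, which the extra data and longer training schedule compensate for.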

The one exception is ALBERT-xxlarge v2, which performs slightly worse than the first version. The researchers identified two possible reasons: (1) the additional 1.5 million training steps did not significantly improve performance; and (2) for v1 they performed hyperparameter searches, while for v2 they simply adopted the v1 parameters, fine-tuning only the RACE hyperparameters. "Given that downstream tasks are sensitive to fine-tuning hyperparameters, we should be careful about so-called slight improvements," the researchers noted.

Google also released the Chinese ALBERT model, built using training data from the Chinese Language Understanding Evaluation Benchmark (CLUE).

The paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations is available on arXiv, and the v2 ALBERT models are available on GitHub.

Origin: blog.csdn.net/virone/article/details/131763717