Very detailed Transformer notes, covering XLNet, MT-DNN, ERNIE, ERNIE 2.0, RoBERTa, and more


Teacher Hua Xiaozhuan has updated his personal notes, adding Transformer notes that cover XLNet, MT-DNN, ERNIE, ERNIE 2.0, RoBERTa, and more. The content is very detailed and well worth studying, so I recommend it.


The author is a former senior algorithm engineer at Alibaba and former chief algorithm researcher at Zhiyi Technology; he is currently a senior researcher at Tencent and the author of "Python vs. Machine Learning". Teacher Hua is also a guest on our Knowledge Planet community.

These are the author's study and summary notes, accumulated over many years, now organized and open-sourced for everyone.

Note address:

http://www.huaxiaozhuan.com/Deep Learning/chapters/7_Transformer.html

Introduction to Transformer


Transformer is a feature extractor built on the attention mechanism; it can replace CNNs and RNNs for extracting features from sequences.

Transformer was first proposed in the paper "Attention Is All You Need", where it is used in an encoder-decoder architecture. In practice, the Transformer can also be applied as an encoder alone or as a decoder alone.
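The attention-based feature extraction described above can be sketched as single-head scaled dot-product self-attention. This is a minimal NumPy illustration, not the notes' own code; the weight matrices are random placeholders standing in for trained parameters.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention over X of shape (seq_len, d_model)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v              # project to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the key positions
    return weights @ V                               # each token: weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                          # 5 tokens, d_model = 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (5, 8): one contextualized vector per token
```

Unlike an RNN, every token attends to every other token in a single step, which is why the sequence can be processed in parallel.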

Transformer notes: table of contents


1. Transformer

  • 1.1 Structure

  • 1.2 Transformer vs CNN vs RNN

  • 1.3 Experimental results

2. Universal Transformer

  • 2.1 Structure

  • 2.2 ACT

  • 2.3 Experimental results

3. Transformer-XL

  • 3.1 Segment-level recurrence

  • 3.2 Relative position encoding

  • 3.3 Experimental results

4. GPT

  • 4.1 GPT V1

  • 4.2 GPT V2

5. BERT

  • 5.1 Pre-training

  • 5.2 Model structure

  • 5.3 Fine-tuning

  • 5.4 Performance

6. ERNIE

  • 6.1 ERNIE 1.0

  • 6.2 ERNIE 2.0

7. XLNet

  • 7.1 Autoregressive vs autoencoding language models

  • 7.2 Permutation language model

  • 7.3 Two-stream self-attention

  • 7.4 Partial prediction

  • 7.5 Incorporating Transformer-XL

  • 7.6 Multiple inputs

  • 7.7 Model comparison

  • 7.8 Experiments

8. MT-DNN

  • 8.1 Model

  • 8.2 Experiments

9. BERT extensions

  • 9.1 BERT-wwm-ext

  • 9.2 RoBERTa

Note screenshots


Other


Teacher Hua Xiaozhuan's personal website:

http://www.huaxiaozhuan.com/
Note address:

http://www.huaxiaozhuan.com/Deep Learning/chapters/7_Transformer.html

GitHub:

https://github.com/huaxz1986


Origin: blog.51cto.com/15064630/2578562