BERT learning materials

https://zhuanlan.zhihu.com/p/49271699


BERT represents significant recent progress in NLP: most NLP tasks can now use a similar two-stage model (unsupervised pre-training followed by task-specific fine-tuning) to directly improve their results.
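
The two-stage idea is easy to see in code. Below is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is named in the original post): stage one loads weights already pre-trained on large unlabeled text, stage two fine-tunes them on a small labeled task.

```python
# Minimal two-stage sketch (assumes: pip install torch transformers).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Stage 1: load pre-trained weights (checkpoint name is an assumption).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Stage 2: fine-tune on a small labeled task (one illustrative step,
# with a hypothetical two-example sentiment batch).
batch = tokenizer(["a great movie", "a dull movie"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()  # gradients for an optimizer step during fine-tuning
```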

The Transformer was proposed by Google for the machine-translation task in the 2017 paper "Attention Is All You Need" and drew considerable attention; many studies have since demonstrated that the Transformer's ability to extract features is far stronger than the LSTM's.
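
To make the comparison concrete, here is a minimal sketch of the scaled dot-product attention at the heart of the Transformer, written in plain NumPy; the shapes and variable names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) arrays; returns attention-weighted values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # mix values by attention

# Every token attends to every other token in one matrix product,
# so the whole sequence is processed in parallel.
x = np.random.rand(5, 8)                     # 5 tokens, dimension 8
out = scaled_dot_product_attention(x, x, x)  # self-attention
```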

Going forward, the Transformer will gradually replace the RNN as the mainstream NLP tool: the RNN's parallel-computing ability is held back by the sequential dependency inherent in its structure.
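
A toy forward pass shows the bottleneck: each hidden state needs the previous one, so the loop over time steps cannot be parallelized (the weights and sizes here are hypothetical).

```python
import numpy as np

def rnn_forward(x, W_h, W_x):
    """x: (T, d_in) inputs; returns the (T, d_h) hidden states."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in x:  # strictly one step at a time: h_t depends on h_{t-1}
        h = np.tanh(W_h @ h + W_x @ x_t)
        states.append(h)
    return np.stack(states)

T, d_in, d_h = 10, 4, 8
out = rnn_forward(np.random.rand(T, d_in),
                  np.random.rand(d_h, d_h),
                  np.random.rand(d_h, d_in))
```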

The CNN never became mainstream in NLP. Its biggest advantage is that convolutions parallelize easily, so it is fast, but it has a natural weakness for NLP in capturing long-distance dependencies within a sequence.
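
One way to quantify that weakness: with stride 1 and no dilation, a stack of 1D convolutions grows its receptive field only linearly with depth, so linking two distant tokens takes many layers (the kernel size below is an assumed example), whereas self-attention connects any two positions in a single layer.

```python
def conv1d_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Receptive field of stacked stride-1, dilation-1 1D convolutions."""
    return num_layers * (kernel_size - 1) + 1

# With kernel size 3, spanning tokens 100 positions apart takes ~50 layers.
print(conv1d_receptive_field(3, 50))  # 101
```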


https://zhuanlan.zhihu.com/p/37601161   Attention models in deep learning

https://jalammar.github.io/illustrated-transformer/   The Illustrated Transformer



Source: blog.51cto.com/kankan/2404473