Large Language Model Research Summary

Since the emergence of ChatGPT, large language models have been released one after another, and the models making the news seem to change every day. It has become almost impossible to keep track of which organization released each model, what its distinguishing features are, and how the models relate to one another. For example, GPT-3 and GPT-3.5 alone cover a whole series of model versions and variants, and then there are Alpaca, Vicuna, CAMEL...

So I did a small survey of the well-known large language models, mainly to become familiar with them. After sorting them out, things felt much clearer, and I can now browse Zhihu and keep learning with ease.

1. Basic Language Models

A basic (base) language model is a model that has only been pre-trained on a large-scale text corpus, without instruction fine-tuning, downstream-task fine-tuning, or any alignment optimization such as learning from human feedback.
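To make the definition concrete, here is a minimal sketch (my own illustration, not taken from the surveyed models) of the next-token-prediction objective that such base models are optimized for during pre-training. The random tensors simply stand in for a real Transformer's output logits and a tokenized batch of corpus text; no instruction data or human feedback is involved at this stage.

```python
# Sketch of the causal (next-token) pre-training objective of a base LM.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 8, 2

# Stand-in for a decoder-only LM: random logits, one vector per position.
# A real base model would produce these from a Transformer decoder stack.
logits = torch.randn(batch, seq_len, vocab_size)
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Shift by one so position t predicts token t+1, then average cross-entropy.
pretrain_loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
print(pretrain_loss)
```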

Basic information table of base LLMs. In the table, GPT-style denotes a decoder-only autoregressive language model, T5-style denotes an encoder-decoder language model, GLM-style denotes the GLM-specific model structure, and Multi-task denotes the ERNIE 3.0 model structure.
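As a rough illustration of the GPT-style versus T5-style distinction in the legend above, the sketch below assumes the Hugging Face transformers library is installed; the checkpoints "gpt2" and "t5-small" are just small public examples of each architecture family, not models from the table.

```python
# Decoder-only (GPT-style) vs. encoder-decoder (T5-style) in transformers.
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

# GPT-style: decoder-only autoregressive LM, generates by continuing the prompt.
gpt_tok = AutoTokenizer.from_pretrained("gpt2")
gpt_model = AutoModelForCausalLM.from_pretrained("gpt2")
gpt_out = gpt_model.generate(
    **gpt_tok("Large language models are", return_tensors="pt"),
    max_new_tokens=20,
)
print(gpt_tok.decode(gpt_out[0], skip_special_tokens=True))

# T5-style: encoder-decoder LM, maps an input sequence to an output sequence.
t5_tok = AutoTokenizer.from_pretrained("t5-small")
t5_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
t5_out = t5_model.generate(
    **t5_tok("translate English to German: Hello, world!", return_tensors="pt"),
    max_new_tokens=20,
)
print(t5_tok.decode(t5_out[0], skip_special_tokens=True))
```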


Source: blog.csdn.net/u013250861/article/details/130451965