DataFun: Detailed explanation of the model behind ChatGPT

Overview


Transformer

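Each component of the Transformer has its own role: self-attention mixes information across tokens, the feed-forward layers transform each position independently, and residual connections with layer normalization keep training stable.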

Multi-head self-attention

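Each word matters to a different degree when encoding a given position. To capture this, the model learns three projection matrices, Q, K and V (query, key, value), and uses them to compute a weight for every pair of words.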
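To make the role of the three matrices concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. The function names (`self_attention`, `matmul`, `softmax`) and the toy input are illustrative assumptions, not code from the original talk; real implementations use a tensor library and batched operations.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: (n_tokens, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices.
    Returns (output, weights); weights[i][j] is how much token i attends to token j.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    # scores[i][j] = (q_i . k_j) / sqrt(d_k)
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Toy input: three "words" with 2-dimensional embeddings (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # identity projections keep the example easy to follow
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors, with larger weights on the words that are more relevant to that position.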
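Multi-head attention uses multiple sets of Q, K and V matrices, one set per head, so that each head can attend to a different kind of relationship.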
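The idea of several Q, K, V sets can be sketched as follows: each head runs its own attention, the per-head outputs are concatenated, and an output projection mixes them. This is a simplified illustration in plain Python; `multi_head_attention`, `attention_head`, the output matrix `w_o`, and the toy weights are assumptions made for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention_head(x, w_q, w_k, w_v):
    """One attention head: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    weights = [
        softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
        for qi in q
    ]
    return matmul(weights, v)

def multi_head_attention(x, heads, w_o):
    """heads: list of (w_q, w_k, w_v) triples, one per head.
    w_o: (n_heads * d_v, d_model) output projection."""
    head_outputs = [attention_head(x, *h) for h in heads]
    # Concatenate the outputs of all heads for each token along the feature axis.
    concat = [sum((out[i] for out in head_outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)

# Two tokens, d_model = 2, two heads with d_k = d_v = 1 (made-up weights).
x = [[1.0, 0.0], [0.0, 1.0]]
head_a = ([[1.0], [0.0]],) * 3  # this head only looks at feature 0
head_b = ([[0.0], [1.0]],) * 3  # this head only looks at feature 1
w_o = [[1.0, 0.0], [0.0, 1.0]]
out = multi_head_attention(x, [head_a, head_b], w_o)
```

Because each head has its own projections, the two heads in this toy example produce different attention patterns over the same two tokens.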

RLHF

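RLHF (reinforcement learning from human feedback) commonly includes a reward model trained on human comparisons between two responses to the same prompt. As a rough sketch of that one step, and not necessarily the exact setup used for ChatGPT, the pairwise loss below pushes the reward of the preferred response above the reward of the rejected one. The function name and the scalar reward values are hypothetical.

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Return -log(sigmoid(reward_chosen - reward_rejected)) for one comparison."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch on the sign to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# Hypothetical reward-model scores for a preferred and a rejected response.
loss_correct_order = pairwise_preference_loss(2.0, 0.5)
loss_wrong_order = pairwise_preference_loss(0.5, 2.0)
```

The loss is small when the reward model already ranks the human-preferred response higher, and large when it ranks the pair the wrong way round.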

ChatGPT training process


Chain of thought (CoT)



Origin blog.csdn.net/uncle_ll/article/details/131668411