Transformer model explained

Table of contents

A "game" by itself is just a game

Combined with "Beijing": the Winter Olympic Games

Transformer: encoding and decoding in 12 steps

Self-attention is the Transformer's "dismantle and compare" step: generate the components (V) and the weights (K), with a feed-forward network to adjust the weights as a preliminary transformation

Encoder attention takes context information into account

Attention mechanism: multi-head attention prevents one head's "mutiny" from making the whole model fail

Data flow: an embedding algorithm turns each word into a vector of the same length, 512 dimensions

Calculations with the Q, K, and V weight matrices produce the component descriptions and the relationship descriptions

Eight weight matrices guard against failure and dilute the influence of the initial random weights

Word vectorization, then matrix multiplication produces the relationship descriptions, the attention weights, and the final weighted sum


A "game" by itself is just a game

Combined with "Beijing": the Winter Olympic Games

The meaning of a word depends on the words around it, so the model has to read the context.


Transformer: encoding and decoding in 12 steps
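The "12 steps" are the stacked encoder and decoder layers (6 of each in the original Transformer). A minimal sketch of that flow, with the layer internals left as placeholders, might look like this; all names and shapes are illustrative, not from the post:

```python
import numpy as np

# Sketch of the encode/decode flow: 6 encoder layers, then 6 decoder layers
# (12 steps in total). The layer bodies are placeholders for the attention
# and feed-forward sublayers described in the sections below.

def encoder_layer(x):
    return x                      # placeholder: self-attention + feed-forward

def decoder_layer(y, memory):
    return y                      # placeholder: masked attention + cross-attention + feed-forward

def transformer(src, tgt, n_layers=6):
    memory = src
    for _ in range(n_layers):     # encoding: 6 steps
        memory = encoder_layer(memory)
    out = tgt
    for _ in range(n_layers):     # decoding: 6 steps
        out = decoder_layer(out, memory)
    return out

src = np.random.randn(5, 512)     # 5 source words, 512-dim vectors
tgt = np.random.randn(3, 512)     # 3 target words
print(transformer(src, tgt).shape)  # (3, 512)
```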

Self-attention is the Transformer's "dismantle and compare" step: it generates the components (V) and the weights (K), and a feed-forward network then adjusts the weights as a preliminary transformation

Encoder attention takes context information into account
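In the standard Transformer, this context-aware comparison is scaled dot-product attention; the formula below is the one from the original paper (the symbols Q, K, V and d_k are introduced in the data-flow section of this post):

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
$$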

Reference video: [Transformer model] Learn easily with graceful animations, remember easily with vivid metaphors (Bilibili)

Attention mechanism: multi-head attention prevents one head's "mutiny" from making the whole model fail
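The "many heads, so one mutiny cannot sink the model" metaphor corresponds to multi-head attention in the original paper: h independent heads, each with its own projection matrices, whose outputs are concatenated and projected back:

$$
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O},
\qquad
\mathrm{head}_i = \mathrm{Attention}\!\left(Q W_i^{Q},\; K W_i^{K},\; V W_i^{V}\right)
$$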


Data flow: an embedding algorithm turns each word into a vector of the same length, 512 dimensions
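A minimal sketch of that vectorization step, assuming a toy vocabulary and a randomly initialized embedding table; only the 512-dimension size comes from the post:

```python
import numpy as np

# Word vectorization: look up one 512-dimensional vector per word.
vocab = {"the": 0, "game": 1, "in": 2, "beijing": 3}   # toy vocabulary (illustrative)
d_model = 512                                          # every word gets a vector of the same length
embedding = np.random.randn(len(vocab), d_model)       # one 512-dim row per word

sentence = ["the", "game", "in", "beijing"]
X = embedding[[vocab[w] for w in sentence]]            # look up each word's vector
print(X.shape)                                         # (4, 512): 4 words, 512 dims each
```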


Calculations with the Q, K, and V weight matrices produce the component descriptions and the relationship descriptions
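A minimal sketch of this projection step, assuming the usual learned matrices W_Q, W_K, W_V (random stand-ins here): the queries and keys give the relationship descriptions, the values give the component descriptions.

```python
import numpy as np

d_model, d_k = 512, 64
X = np.random.randn(4, d_model)        # 4 word vectors from the embedding step

W_Q = np.random.randn(d_model, d_k)
W_K = np.random.randn(d_model, d_k)
W_V = np.random.randn(d_model, d_k)

Q = X @ W_Q    # queries: how each word asks about the others (relationship description)
K = X @ W_K    # keys: how each word presents itself for comparison (relationship description)
V = X @ W_V    # values: the content each word contributes (component description)
print(Q.shape, K.shape, V.shape)       # (4, 64) each
```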


Eight weight matrices guard against failure and dilute the influence of the initial random weights
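A sketch of the eight-head idea under the original paper's sizes (d_model = 512, 8 heads of 64 dimensions each): every head gets its own randomly initialized weight set, so no single initialization dominates the result.

```python
import numpy as np

d_model, n_heads = 512, 8
d_k = d_model // n_heads
X = np.random.randn(4, d_model)                # 4 word vectors

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

heads = []
for _ in range(n_heads):                       # one independent weight set per head
    W_Q, W_K, W_V = (np.random.randn(d_model, d_k) for _ in range(3))
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    A = softmax(Q @ K.T / np.sqrt(d_k))        # this head's attention weights
    heads.append(A @ V)                        # this head's output

W_O = np.random.randn(n_heads * d_k, d_model)
out = np.concatenate(heads, axis=-1) @ W_O     # combine all 8 heads
print(out.shape)                               # (4, 512)
```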


Word vectorization, then matrix multiplication produces the relationship descriptions, the attention weights, and the final weighted sum
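Putting the pipeline together for a single head, with random toy data; only the order of the steps (word vectors, relationship scores, attention weights, weighted sum) follows the post.

```python
import numpy as np

np.random.seed(0)
d_model, d_k = 512, 64

X = np.random.randn(4, d_model)                     # step 1: word vectorization (4 words)
W_Q, W_K, W_V = (np.random.randn(d_model, d_k) for _ in range(3))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V                 # step 2: matrix multiplication
scores = Q @ K.T / np.sqrt(d_k)                     # step 3: relationship descriptions (scores)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # step 4: attention weights (softmax)
output = weights @ V                                # step 5: final weighted sum

print(weights.shape, output.shape)                  # (4, 4) (4, 64)
```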



Origin: blog.csdn.net/qq_38998213/article/details/132376346