A Simple Walkthrough of the Transformer Code
To better understand the Transformer code, I read through the related material and consolidated it here, drawing on the work of many experienced authors.
The content is organized into blocks that follow the Transformer architecture, and it can be viewed and modified in Jupyter.
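As an illustration of what one such block might look like, here is a minimal sketch of scaled dot-product attention, the core of the architecture's attention layers, written in plain Python. It is not taken from the repository: the function names and the toy input are assumptions chosen for illustration, and real implementations typically use a tensor library rather than nested lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors.

    Hypothetical helper for illustration only; no masking, batching, or heads.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy input: two tokens with 2-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

In a Jupyter cell, evaluating `attended` shows one output vector per token. Because the toy values are one-hot, each output row equals that token's attention weights, so each token attends most strongly to itself.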
The schematic diagram of the Transformer architecture is as follows:
The material is already quite detailed, so please refer to the repository for the full content.
The repository is available through both of the following channels: