Foreword

This post records the parameter counts of common deep learning models. Unless noted otherwise, each count was obtained with `torchinfo.summary(model)`.

The models come from official PyTorch, HuggingFace, and similar sources; a minimal sketch of the counting procedure follows the version table below.

If there are any mistakes or suggestions, please point them out in the comments.
Third-party library | Version |
---|---|
transformers | 4.30.2 |
PyTorch | 2.0.1 |
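For reference, here is a minimal sketch of how a row in the tables below can be reproduced. The `bert-base-uncased` checkpoint name is just an example; any HuggingFace checkpoint works the same way:

```python
# Minimal sketch: count parameters the same way as the tables below.
# Assumes torchinfo and transformers are installed; "bert-base-uncased"
# is only an example checkpoint.
import torchinfo
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
stats = torchinfo.summary(model)  # no input is needed just to count parameters
print(stats.total_params)         # 109,482,240 for BERT-base (see the table below)
```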
1. NLP
1.1 Transformer Architecture
Encoder-Only Architecture
Model | Source | Total parameters (exact) | Total parameters (approx.) |
---|---|---|---|
BERT-base | HuggingFace | 109,482,240 | 109.5M |
BERT-large | HuggingFace | 335,141,888 | 335.1M |
RoBERTa-base | HuggingFace | 124,645,632 | 124.6M |
RoBERTa-large | HuggingFace | 355,359,744 | 355.3M |
DeBERTa-base | HuggingFace | 138,601,728 | 138.6M |
DeBERTa-large | HuggingFace | 405,163,008 | 405.2M |
DeBERTa-xlarge | HuggingFace | 757,804,032 | 757.8M |
DistilBERT | HuggingFace | 66,362,880 | 66.4M |
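The torchinfo count can also be cross-checked with plain PyTorch; a sketch, assuming the `distilbert-base-uncased` checkpoint for the DistilBERT row above:

```python
# Cross-check a row without torchinfo: sum over all parameter tensors.
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased")
total = sum(p.numel() for p in model.parameters())
print(f"{total:,}")  # expected: 66,362,880, matching the DistilBERT row above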
Decoder-Only Architecture
Model | Source | Total parameters (exact) | Total parameters (approx.) |
---|---|---|---|
GPT | HuggingFace | 116,534,784 | 116.5M |
GPT-2 | HuggingFace | 124,439,808 | 124.4M |
GPT-2-medium | HuggingFace | 354,823,168 | 354.8M |
GPT-2-large | HuggingFace | 774,030,080 | 774.0M |
GPT-J | HuggingFace | 5,844,393,984 | 5.8B |
LLaMA | HuggingFace | 6,607,343,616 | 6.6B |
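Decoder-only checkpoints load the same way; a sketch for the GPT-2 row, assuming the `gpt2` checkpoint (note that `AutoModel` returns the bare transformer without a task head):

```python
# Decoder-only example; "gpt2" is the 124M checkpoint from the table above.
import torchinfo
from transformers import AutoModel

model = AutoModel.from_pretrained("gpt2")
print(torchinfo.summary(model).total_params)  # expected: 124,439,808
```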
Encoder-Decoder Architecture
Model | Source | Total parameters (exact) | Total parameters (approx.) |
---|---|---|---|
Transformer | PyTorch | 44,140,544 | 44.1M |
T5-small | HuggingFace | 93,405,696 | 93.4M |
T5-base | HuggingFace | 272,252,160 | 272.3M |
T5-large | HuggingFace | 803,466,240 | 803.5M |
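The PyTorch row above is simply `torch.nn.Transformer` with its default hyperparameters; a sketch:

```python
# nn.Transformer defaults: d_model=512, nhead=8, 6 encoder and 6 decoder
# layers, dim_feedforward=2048 -- which yields the 44.1M figure above.
import torch.nn as nn

model = nn.Transformer()
total = sum(p.numel() for p in model.parameters())
print(f"{total:,}")  # expected: 44,140,544
```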
2. CV
2.1 CNN Architecture
Model | Source | Total parameters (exact) | Total parameters (approx.) |
---|---|---|---|
AlexNet | PyTorch | 61,100,840 | 61.1M |
GoogLeNet | PyTorch | 13,004,888 | 13.0M |
VGG-11 | PyTorch | 132,863,336 | 132.9M |
VGG-13 | PyTorch | 133,047,848 | 133.0M |
VGG-16 | PyTorch | 138,357,544 | 138.4M |
VGG-19 | PyTorch | 143,667,240 | 143.7M |
ResNet-18 | PyTorch | 11,689,512 | 11.7M |
ResNet-34 | PyTorch | 21,797,672 | 21.8M |
ResNet-50 | PyTorch | 25,557,032 | 25.6M |
ResNet-101 | PyTorch | 44,549,160 | 44.5M |
ResNet-152 | PyTorch | 60,192,808 | 60.2M |
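The CNN rows come from torchvision. No pretrained weights are needed, since the architecture alone fixes the parameter count; a sketch for the ResNet-50 row:

```python
# torchvision CNN example; random init is fine, the count is
# architecture-determined.
import torchinfo
from torchvision import models

model = models.resnet50()
print(torchinfo.summary(model).total_params)  # expected: 25,557,032
```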
2.2 Transformer Architecture
Model | Source | Total parameters (exact) | Total parameters (approx.) |
---|---|---|---|
SwinTransformer-tiny | PyTorch | 28,288,354 | 28.3M |
SwinTransformer-small | PyTorch | 49,606,258 | 49.6M |
SwinTransformer-base | PyTorch | 87,768,224 | 87.8M |
ViT-base-16 | PyTorch | 86,567,656 | 86.6M |
ViT-base-32 | PyTorch | 88,224,232 | 88.2M |
ViT-large-16 | PyTorch | 304,326,632 | 304.3M |
ViT-large-32 | PyTorch | 306,535,400 | 306.5M |
ViT-Huge-14 | PyTorch | 632,045,800 | 632.0M |
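The vision-transformer rows are also torchvision models; a sketch for the ViT-base-16 row (`vit_b_16` in torchvision):

```python
# torchvision ViT example: ViT-Base with 16x16 patches.
from torchvision import models

model = models.vit_b_16()
total = sum(p.numel() for p in model.parameters())
print(f"{total:,}")  # expected: 86,567,656, matching the ViT-base-16 row
```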