Summary of common deep learning model sizes (continuously updated)

Foreword

This post records the parameter counts of common deep learning models. Each count is obtained with:

```python
torchinfo.summary(model)
```

The models come from official PyTorch, HuggingFace, and similar sources.
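For reference, here is a minimal sketch of how such a count can be reproduced; the two model names below are illustrative examples, not necessarily the exact checkpoints behind every row:

```python
# Minimal sketch: counting parameters with torchinfo.
# The model names here are illustrative examples.
import torchinfo
import torchvision.models as tvm
from transformers import AutoModel

# A HuggingFace model
bert = AutoModel.from_pretrained("bert-base-uncased")
torchinfo.summary(bert)  # the "Total params" line gives the count

# An official PyTorch (torchvision) model
resnet = tvm.resnet50()
torchinfo.summary(resnet)

# Equivalent plain-PyTorch count
print(sum(p.numel() for p in resnet.parameters()))  # 25,557,032
```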

If there are any mistakes or suggestions, please point them out in the comments.

| Third-party library | Version |
| --- | --- |
| transformers | 4.30.2 |
| PyTorch | 2.0.1 |

1. NLP

1.1 Transformer Architecture

Encoder-Only Architecture

| Model | Source | Total parameters | Approx. |
| --- | --- | --- | --- |
| BERT-base | HuggingFace | 109,482,240 | 109.5M |
| BERT-large | HuggingFace | 335,141,888 | 335.1M |
| RoBERTa-base | HuggingFace | 124,645,632 | 124.6M |
| RoBERTa-large | HuggingFace | 355,359,744 | 355.3M |
| DeBERTa-base | HuggingFace | 138,601,728 | 138.6M |
| DeBERTa-large | HuggingFace | 405,163,008 | 405.2M |
| DeBERTa-xlarge | HuggingFace | 757,804,032 | 757.8M |
| DistilBERT | HuggingFace | 66,362,880 | 66.4M |
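The post does not name the exact checkpoints; the standard HuggingFace IDs below are an assumption that reproduces these counts with `AutoModel` (i.e., the bare encoder without a task head):

```python
# Assumed checkpoint IDs for the encoder-only rows above.
from transformers import AutoModel

for name in ["bert-base-uncased", "bert-large-uncased",
             "roberta-base", "roberta-large",
             "microsoft/deberta-base", "microsoft/deberta-large",
             "microsoft/deberta-xlarge", "distilbert-base-uncased"]:
    model = AutoModel.from_pretrained(name)
    print(f"{name}: {sum(p.numel() for p in model.parameters()):,}")
```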

Decoder-Only Architecture

| Model | Source | Total parameters | Approx. |
| --- | --- | --- | --- |
| GPT | HuggingFace | 116,534,784 | 116.5M |
| GPT-2 | HuggingFace | 124,439,808 | 124.4M |
| GPT-2-medium | HuggingFace | 354,823,168 | 354.8M |
| GPT-2-large | HuggingFace | 774,030,080 | 774.0M |
| GPT-J | HuggingFace | 5,844,393,984 | 5.8B |
| LLaMA | HuggingFace | 6,607,343,616 | 6.6B |
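Likewise hedged: `"openai-gpt"`, `"gpt2"`/`"gpt2-medium"`/`"gpt2-large"`, and `"EleutherAI/gpt-j-6b"` are the usual HuggingFace IDs for these rows; LLaMA weights are gated, so no public ID is assumed here. A quick check for GPT-2:

```python
# GPT-2 small via the assumed "gpt2" checkpoint; AutoModel loads the
# bare decoder (GPT2Model), whose count matches the table.
from transformers import AutoModel

gpt2 = AutoModel.from_pretrained("gpt2")
print(sum(p.numel() for p in gpt2.parameters()))  # 124,439,808
```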

Encoder-Decoder Architecture

| Model | Source | Total parameters | Approx. |
| --- | --- | --- | --- |
| Transformer | PyTorch | 44,140,544 | 44.1M |
| T5-small | HuggingFace | 93,405,696 | 93.4M |
| T5-base | HuggingFace | 272,252,160 | 272.3M |
| T5-large | HuggingFace | 803,466,240 | 803.5M |
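The PyTorch `Transformer` row matches `torch.nn.Transformer()` with all defaults (d_model=512, nhead=8, 6 encoder and 6 decoder layers, dim_feedforward=2048), which can be verified directly:

```python
# torch.nn.Transformer with default hyperparameters.
import torch
import torchinfo

model = torch.nn.Transformer()  # d_model=512, nhead=8, 6+6 layers
torchinfo.summary(model)        # Total params: 44,140,544
```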

2. CV

2.1 CNN Architecture

| Model | Source | Total parameters | Approx. |
| --- | --- | --- | --- |
| AlexNet | PyTorch | 61,100,840 | 61.1M |
| GoogLeNet | PyTorch | 13,004,888 | 13.0M |
| VGG-11 | PyTorch | 132,863,336 | 132.9M |
| VGG-13 | PyTorch | 133,047,848 | 133.0M |
| VGG-16 | PyTorch | 138,357,544 | 138.4M |
| VGG-19 | PyTorch | 143,667,240 | 143.7M |
| ResNet-18 | PyTorch | 11,689,512 | 11.7M |
| ResNet-34 | PyTorch | 21,797,672 | 21.8M |
| ResNet-50 | PyTorch | 25,557,032 | 25.6M |
| ResNet-101 | PyTorch | 44,549,160 | 44.5M |
| ResNet-152 | PyTorch | 60,192,808 | 60.2M |
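The CNN rows all come from `torchvision.models`, whose constructor names use torchvision's lowercase scheme (e.g. `googlenet`, `vgg16`, `resnet50`). A sketch for one row:

```python
# Sketch: reproducing a CNN row from torchvision (ResNet-34 here).
import torchvision.models as tvm

model = tvm.resnet34()  # architecture only; pretrained weights not needed
print(f"{sum(p.numel() for p in model.parameters()):,}")  # 21,797,672
```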

2.2 Transformer Architecture

| Model | Source | Total parameters | Approx. |
| --- | --- | --- | --- |
| SwinTransformer-tiny | PyTorch | 28,288,354 | 28.3M |
| SwinTransformer-small | PyTorch | 49,606,258 | 49.6M |
| SwinTransformer-base | PyTorch | 87,768,224 | 87.8M |
| ViT-base-16 | PyTorch | 86,567,656 | 86.6M |
| ViT-base-32 | PyTorch | 88,224,232 | 88.2M |
| ViT-large-16 | PyTorch | 304,326,632 | 304.3M |
| ViT-large-32 | PyTorch | 306,535,400 | 306.5M |
| ViT-huge-14 | PyTorch | 632,045,800 | 632.0M |
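These counts match torchvision's `vit_*` and `swin_*` constructors (an inference from the numbers, since the post does not name the constructors); for example:

```python
# Assumed torchvision constructors behind two of the rows above.
import torchvision.models as tvm

vit = tvm.vit_b_16()  # ViT-base-16
swin = tvm.swin_t()   # SwinTransformer-tiny
print(sum(p.numel() for p in vit.parameters()))   # 86,567,656
print(sum(p.numel() for p in swin.parameters()))  # 28,288,354
```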

Source: blog.csdn.net/raelum/article/details/131626578