GPU selection for large model training

GPU comparison for large model training

The A100 is the first choice for large model training, the A40 is typically used for inference, and the H100 has been launched as the next-generation replacement for the A100.

Can a 4090 be used for training large models?

The 4090 is not practical for training large models, but for inference/serving it is not only feasible, it is slightly more cost-effective than the H100. The biggest differences between the H100/A100 and the 4090 are communication bandwidth and memory; the gap in raw compute is much smaller.

| | H100 | A100 | 4090 |
|---|---|---|---|
| Tensor FP16 compute | 989 Tflops | 312 Tflops | 330 Tflops |
| Tensor TF32 compute | 495 Tflops | 156 Tflops | 83 Tflops |
| Memory capacity | 80 GB | 80 GB | 24 GB |
| Memory bandwidth | 3.35 TB/s | 2 TB/s | 1 TB/s |
| Communication bandwidth | 900 GB/s | 900 GB/s | 64 GB/s |
| Communication latency | ~1 us | ~1 us | ~10 us |
| Selling price | $30000~$40000 | $15000 | $1600 |
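The memory gap in the table is the core reason inference fits on a 4090 while training does not. As a rough sketch (rule-of-thumb byte counts, not exact framework numbers; the 7B model size is a hypothetical example), weights-only FP16 inference needs about 2 bytes per parameter, while mixed-precision Adam training needs roughly 16 bytes per parameter before activations:

```python
def inference_mem_gb(n_params_billion, bytes_per_param=2):
    # FP16 weights only; KV cache and activations add more in practice.
    return n_params_billion * bytes_per_param

def training_mem_gb(n_params_billion):
    # Mixed-precision Adam rule of thumb: FP16 weights + grads (2+2 bytes)
    # plus FP32 master weights and two optimizer moments (4+4+4 bytes)
    # ~= 16 bytes per parameter, before activations.
    return n_params_billion * 16

print(inference_mem_gb(7))  # 14 GB -> fits in a 24 GB 4090
print(training_mem_gb(7))   # 112 GB -> exceeds even one 80 GB A100/H100
```

Training a 7B model therefore needs multiple GPUs even at 80 GB each, which is where the 4090's 64 GB/s PCIe interconnect (vs. 900 GB/s NVLink) becomes the bottleneck.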


Origin blog.csdn.net/bestpasu/article/details/134098807