Nvidia H100: Are 550,000 GPUs Enough for This Year?

Author: Doug Eadline


August 17, 2023

The GPU Squeeze continues to place a premium on Nvidia H100 GPUs. In a recent Financial Times article, Nvidia reports that it expects to ship 550,000 of its latest H100 GPUs worldwide in 2023. The appetite for GPUs is obviously coming from the generative AI boom, but the HPC market is also competing for these accelerators. It is not clear if this number includes the throttled China-specific A800 and H800 models.

The bulk of the GPUs will be going to US technology companies, but the Financial Times notes that Saudi Arabia has purchased at least 3,000 Nvidia H100 GPUs and that the UAE has also bought thousands of Nvidia chips. The UAE has already developed its own open-source large language model, Falcon, trained on 384 A100 GPUs at the state-owned Technology Innovation Institute in Masdar City, Abu Dhabi.

The flagship H100 GPU (14,592 CUDA cores, 80GB of HBM3 capacity, 5,120-bit memory bus) carries a hefty average price of $30,000; Nvidia CEO Jensen Huang calls it the first chip designed for generative AI. In Saudi Arabia, King Abdullah University of Science and Technology (KAUST) is building its own GPU-based supercomputer, Shaheen III, which employs 700 Grace Hopper superchips, each combining a Grace CPU with an H100 Tensor Core GPU. Interestingly, the GPUs are being used to create an LLM developed by Chinese researchers who cannot study or work in the US.

Meanwhile, generative AI (GAI) investments continue to fund GPU infrastructure purchases. As reported, funding to GAI start-ups in the first six months of 2023 is up more than 5x compared with full-year 2022, and the generative AI infrastructure category has captured over 70% of that funding since Q3 2022.

Worth the Wait

The cost of an H100 varies depending on how it is packaged and, presumably, on how many you are able to purchase. The current (August 2023) retail price for an H100 PCIe card is around $30,000 (lead times can vary as well). A back-of-the-envelope estimate puts market spending at $16.5 billion for 2023, a big chunk of which will go to Nvidia. In a recent social media post, Barron's senior writer Tae Kim estimated that it costs Nvidia about $3,320 to make an H100, roughly a 1,000% profit based on the retail price of an Nvidia H100 card.
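
For the curious, here is a minimal sketch of that back-of-the-envelope arithmetic in Python, using only the figures cited above (variable names are illustrative, not official Nvidia data). Strictly computed, the markup comes out near 800% (a retail price roughly 9x the estimated cost), so the 1,000% figure is a loose rounding.

```python
# Back-of-the-envelope H100 market math, using only the article's figures.

units_2023 = 550_000       # H100 GPUs Nvidia expects to ship in 2023
retail_price_usd = 30_000  # approximate retail price of an H100 PCIe card
est_cost_usd = 3_320       # Tae Kim's estimated manufacturing cost per card

# Total 2023 market spending at the retail price.
market_spend_usd = units_2023 * retail_price_usd
print(f"Estimated 2023 H100 spending: ${market_spend_usd / 1e9:.1f} billion")  # $16.5 billion

# Markup over the estimated manufacturing cost.
markup_pct = (retail_price_usd - est_cost_usd) / est_cost_usd * 100
print(f"Markup over estimated cost: {markup_pct:.0f}%")  # ~804%
```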

[Image: The Nvidia H100 PCIe GPU]

As often reported, Nvidia's partner TSMC can barely meet the demand for GPUs. The GPUs require the more complex CoWoS (Chip on Wafer on Substrate) manufacturing process, a "2.5D" packaging technology from TSMC in which multiple active silicon dies (usually GPUs and HBM stacks) are integrated on a passive silicon interposer. Using CoWoS adds a complex multi-step, high-precision engineering process that slows down the rate of GPU production.

This situation was confirmed by Charlie Boyle, VP and GM of Nvidia's DGX systems. Boyle says the delays stem not from miscalculated demand or wafer-yield issues at TSMC, but from the CoWoS chip-packaging technology.

Original link: https://www.hpcwire.com/2023/08/17/nvidia-h100-are-550000-gpus-enough-for-this-year/


// Since you've read this far, let's chat a bit!

1. Someone asked, "Do large models make money?" I don't know how to answer that, but Nvidia is already picking the low-hanging fruit. Its first-mover advantage comes from laying out the CUDA software stack more than ten years ago and from years of accumulated work on GPU architecture.

2. With more than 30 domestic accelerator-card companies in the fray, 2024 will be the peak of the battle. A few predictions:

  • Everyone will fight for the market, racing to push out dedicated cards for large models.

  • Small or slow-moving companies will be in serious danger next year; mergers and acquisitions can be seen as a way out.

  • Competition for share across computing-power centers, the Xinchuang (domestic IT innovation) market, and city-level deployments.

  • 2024 will be the breakout moment for computing-power infrastructure software companies!

  • The Yankees will pick their targets and time their strikes with precision. Trouble and spies are everywhere!

Origin: https://blog.csdn.net/weixin_45571628/article/details/132429573