20 GPUs can carry the equivalent of the entire world's Internet traffic, and the Grace CPU Superchip is here. What did NVIDIA release at this year's GTC? ...


Compiled by | Su Mi

Produced by | CSDN (ID: CSDNnews)

Where are the limits of technology?

Presumably, the tech world's answer would most likely be: there are none!

Not so. At the GTC 2022 keynote, Jen-Hsun Huang, in his trademark black leather jacket, took command as Nvidia unveiled the H100 GPU, built on TSMC's 4nm process with 80 billion transistors; the Grace CPU, based on the latest Arm v9 architecture; and software and hardware including the metaverse platform Omniverse and the autonomous driving platform DRIVE Hyperion 9, once again breaking its own and even the industry's records.


Can 20 H100 GPUs sustain the equivalent of the entire world's Internet traffic?

Nvidia is a company you could call a "chip overlord", and a heavyweight in artificial intelligence computing and the metaverse.

In what it does best, GPUs, Nvidia announced the arrival of its next-generation accelerated computing platform based on the Hopper™ architecture, which jumps an order of magnitude in performance over the previous generation and powers the next wave of AI data centers.

The new architecture, named after pioneering American computer scientist Grace Hopper, replaces the NVIDIA Ampere architecture introduced two years ago.

In addition to this, Nvidia also released its first Hopper-based GPU, the Nvidia H100.


As the successor to the A100, the H100 moves beyond the previous generation's manufacturing process to TSMC's most advanced 4nm process and packs 80 billion transistors. It accelerates progress in AI, HPC, memory bandwidth, interconnect, and communications, even achieving nearly 5 terabytes per second of external connectivity.

In terms of performance, the H100 features a new Transformer Engine built for the Transformer models that are now the standard in natural language processing. The H100 accelerator can speed up these networks by up to 6x over the previous generation without loss of accuracy.

In addition, the H100 is the first GPU to support PCIe Gen5 and the first to utilize HBM3, achieving 3TB/s of memory bandwidth. Twenty H100 GPUs could sustain the equivalent of the entire world's Internet traffic, making it possible for customers to run real-time inference on advanced recommender systems and large language models.

In addition to the above, H100 has achieved the following breakthroughs in technology:

  • Second-generation secure Multi-Instance GPU (MIG). The previous generation's MIG technology could partition a GPU into seven smaller, fully isolated instances to handle different types of workloads. The Hopper architecture extends MIG capabilities by up to 7x over the previous generation by offering secure multi-tenant configurations for each GPU instance in cloud environments.

  • Confidential computing. The H100 is the world's first accelerator with confidential computing capabilities, protecting AI models and customer data while they are being processed. Customers can also apply confidential computing to federated learning in privacy-sensitive industries such as healthcare and financial services, as well as on shared cloud infrastructure.

  • Fourth-generation NVLink support. To accelerate the largest AI models, NVIDIA has combined NVLink with a new external NVLink Switch, extending NVLink as a scale-up network beyond the server and connecting up to 256 H100 GPUs at 9x higher bandwidth than the previous generation, which used NVIDIA HDR Quantum InfiniBand.

  • Dynamic programming accelerated by the new DPX instructions, which apply to a wide variety of algorithms, including route optimization and genomics. Dynamic programming runs up to 40x faster than on CPUs and 7x faster than on previous-generation GPUs. Examples include the Floyd-Warshall algorithm for finding optimal routes for fleets of autonomous robots in dynamic warehouse environments, and the Smith-Waterman algorithm used in sequence alignment for DNA and protein classification and folding.
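The dynamic-programming recurrence behind the Floyd-Warshall algorithm NVIDIA cites can be sketched in plain Python; the graph weights below are hypothetical, and this is of course the reference algorithm itself, not the DPX instruction path.

```python
# Floyd-Warshall all-pairs shortest paths, one of the DP algorithms the
# new DPX instructions accelerate. Pure-Python sketch with made-up weights.
INF = float("inf")

def floyd_warshall(dist):
    """dist: n x n matrix of edge weights (INF where no edge).
    Relaxes it in place into shortest-path distances and returns it."""
    n = len(dist)
    for k in range(n):              # allow paths through intermediate node k
        for i in range(n):
            for j in range(n):
                # DP relaxation: keep the cheaper of the known path i->j
                # and the path i->k->j through the new intermediate node
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Tiny warehouse-style routing example with 4 waypoints
g = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
print(floyd_warshall(g)[0][2])  # prints 5 (route 0 -> 1 -> 2)
```

DPX speeds up exactly this kind of triple-nested min-plus relaxation, which is why route optimization and genomics alignment are the headline use cases.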

Jen-Hsun Huang said, "Data centers are becoming artificial intelligence factories. The NVIDIA H100 is the engine of the global AI infrastructure, and enterprises use it to accelerate their AI-driven businesses."

It is worth noting that Nvidia has also released a series of products based on the H100.

“Artificial intelligence has fundamentally changed what software can do and how it is produced. Companies revolutionizing their industries with AI realize the importance of their AI infrastructure,” said Jen-Hsun Huang. “Our new DGX H100 systems will power enterprise AI factories, distilling data into our most valuable resource: intelligence.”

Based on the H100 of the Hopper architecture, NVIDIA has launched the fourth-generation DGX™ system DGX H100.

Featuring 8 H100 GPUs, the DGX H100 can deliver 32 petaflops of AI performance at the new FP8 precision, providing the scale to meet the large-scale computing needs of large language models, recommender systems, healthcare research, and climate science.
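The 32-petaflops figure is consistent with simple multiplication against the per-GPU peak, roughly 4 petaflops of FP8 throughput (with sparsity) per H100, which is NVIDIA's published spec rather than a measurement here:

```python
# Sanity-check the DGX H100 headline number: 8 GPUs x ~4 PFLOPS FP8 each.
# The per-GPU figure is NVIDIA's published peak with sparsity, assumed here.
fp8_pflops_per_gpu = 4    # H100 FP8 Tensor Core peak (with sparsity), petaflops
gpus_per_dgx = 8
total_pflops = fp8_pflops_per_gpu * gpus_per_dgx
print(total_pflops)       # prints 32, matching the quoted DGX H100 figure
```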

Each GPU in the DGX H100 system is connected by fourth-generation NVLink, providing 900GB/s of connectivity, 1.5x the previous generation. NVSwitch™ enables all eight H100 GPUs to be connected via NVLink.

Nvidia says it can also connect up to 32 DGXs (containing a total of 256 H100 GPUs) using its NVLink technology to create a "DGX Pod."

"The bandwidth of the DGX POD is 768 terabytes per second. In comparison, the current bandwidth of the entire Internet is 100 terabytes per second," Jen-Hsun Huang explained.

Multiple DGX Pods can be connected together to create DGX SuperPODs, which Jen-Hsun Huang calls "modern AI factories".

To that end, Nvidia has also built a new supercomputer called Eos, comprising 18 DGX Pods. In AI processing power, it is expected to be four times as powerful as Fugaku, currently the world's most powerful supercomputer.

Eos is expected to go live in the next few months and will be the fastest AI computer in the world.


A superchip built from two CPUs: the Grace CPU Superchip

In the CPU arena, Jen-Hsun Huang officially introduced the Grace CPU Superchip, Nvidia's first Arm-based CPU designed for data centers, in his keynote.


As for why it is called a superchip, Huang said it will deliver twice the performance and energy efficiency of today's leading server chips.

In essence, though, this superchip is a combination of two CPUs: two CPU dies joined together via NVLink-C2C, a new high-speed, low-latency chip-to-chip interconnect.

According to Nvidia, the Grace CPU Superchip is designed for top performance: it packs 144 Arm Neoverse cores across its two dies and has achieved an estimated score of 740 on the SPECrate2017_int_base benchmark.

That is more than 1.5x the performance of the dual CPUs currently shipping in the DGX A100, as estimated with the same class of compilers in NVIDIA's labs.
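Taken together, the two claims imply a baseline score for the DGX A100's dual CPUs; the figure below is derived from NVIDIA's numbers, not independently published:

```python
# Back out the implied dual-CPU baseline from NVIDIA's claims: the Grace
# Superchip's estimated SPECrate2017_int_base score of 740 is ~1.5x the
# two CPUs shipping in today's DGX A100. The baseline is derived, not measured.
grace_superchip_score = 740   # NVIDIA's estimated SPECrate2017_int_base score
claimed_speedup = 1.5         # "more than 1.5x" the DGX A100's dual CPUs
implied_baseline = grace_superchip_score / claimed_speedup
print(round(implied_baseline))  # prints 493, the implied dual-CPU score
```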

The Grace CPU Superchip's LPDDR5x memory subsystem provides twice the bandwidth of traditional DDR5 designs, up to 1 terabyte per second, while greatly reducing power consumption: the entire CPU, memory included, draws only 500 watts.

Nvidia says the Grace CPU Superchip will excel in the most demanding HPC, AI, data analytics, scientific computing, and hyperscale computing applications, offering the highest performance, memory bandwidth, energy efficiency, and configurability, and will ship in early 2023.


The first Omniverse computing system OVX

As a major player in the metaverse field, NVIDIA launched OVX, a new industrial digital-twin computing system, at this year's GTC developer conference.

OVX was created to run digital twin simulations in the Omniverse, "a real-time physically accurate world simulation and 3D design collaboration platform" published by Nvidia.

"Just as we provided DGX for AI, we now provide OVX for Omniverse," said Jen-Hsun Huang.

OVX is the first Omniverse computing system, consisting of eight Nvidia A40 GPUs, three Nvidia ConnectX-6 Dx 200-Gbps NICs, dual Intel Ice Lake 8362 CPUs, 1TB of system memory, and 16TB of NVMe storage.

When connected to a Spectrum-3 switch fabric, an OVX computing system can scale from a single pod of 8 OVX servers to a SuperPOD of 32 OVX servers. Multiple SuperPODS can also be deployed for larger simulation needs.

According to NVIDIA, "OVX will enable designers, engineers and planners to build physically accurate digital twins of buildings, or create large-scale, realistic simulated environments with precise time synchronization between the physical and virtual worlds."

Jen-Hsun Huang also pointed out in his speech that, given the complexity of industrial systems, "Omniverse software and computers need to be scalable, low-latency, and support precise timing." Because ordinary data centers process data as quickly as possible rather than at precise times, Nvidia wanted to create a "synchronized data center" with OVX.

The first-generation OVX system has already been deployed within NVIDIA and among some early customers; a second-generation system is in development and will benefit from NVIDIA's new Spectrum-4 Ethernet platform, also announced at the event.

Spectrum-4 is a 51.2 Tbps Ethernet switch with 100 billion transistors that enables nanosecond-level timing accuracy.

In addition, at the Omniverse level, NVIDIA also released a new product called Omniverse Cloud, a cloud service designed to facilitate real-time 3D design collaboration between creatives and engineers.

Omniverse Cloud is said to eliminate the complexity that arises from the need for multiple designers to work together in a variety of different tools and locations.

"We want Omniverse to reach every one of the tens of millions of designers, creators, roboticists and AI researchers," said Jen-Hsun Huang.


Autonomous Driving DRIVE Hyperion 9

Autonomous driving is a field where the major technology giants have converged in recent years. Everyone knows it is a lucrative prize, but whether you can win it depends on real capability.

Unlike Apple, whose vision of car-making is to keep the entire software and hardware ecosystem in its own hands, NVIDIA's goal in autonomous driving is clear: to build a fully autonomous driving solution step by step.

Following the 2019 announcement of its Orin autonomous-driving chip, which officially entered production and sales this month, NVIDIA has released its next-generation software-defined autonomous driving platform: DRIVE Hyperion 9.


According to the official introduction, the DRIVE Hyperion 9 platform adopts an open, modular design encompassing the compute architecture, the sensor suite, and the complete NVIDIA DRIVE Chauffeur and Concierge applications, making it easy for developers to take exactly what they need.

NVIDIA has also added computational redundancy to the DRIVE Hyperion 9 architecture. The platform uses the DRIVE Atlan vehicle SoC announced in 2021, whose performance is more than twice that of Orin. At the parameter level, the DRIVE Hyperion 9 architecture includes 14 cameras, 9 radars, 3 lidars, and 20 ultrasonic sensors for automated and autonomous driving, plus 3 cameras and 1 radar for interior occupant sensing.

Nvidia likens DRIVE Hyperion to the vehicle's nervous system and DRIVE Atlan to its brain; this generation of the system spans functions ranging from NCAP safety features to Level 3 driving and Level 4 parking, with advanced AI cockpit capabilities.

NVIDIA expects DRIVE Hyperion 9 to appear in mass-production vehicles in 2026; its programmable architecture is built on multiple DRIVE Atlan computers to enable intelligent driving and in-cabin functionality.

References:

https://nvidianews.nvidia.com/news/nvidia-announces-hopper-architecture-the-next-generation-of-accelerated-computing

https://blogs.nvidia.com/blog/2022/03/22/drive-hyperion-9-atlan/

https://venturebeat.com/2022/03/22/nvidia-introduces-arm-based-grace-cpu-superchip/


 
  

Origin blog.csdn.net/csdnnews/article/details/123700526