[NVIDIA GTC 23 Highlights] Advances in Accelerated Computing for AI and Scientific Computing

[NVIDIA GTC Highlights] Video link for the session "Advances in Accelerated Computing for AI and Scientific Computing"

Data Center Development Prospects


Three trends are shaping the future of the data center

  1. Energy constraints. Data centers consume about 2% of the world's energy, so new approaches must reduce energy consumption while maintaining performance and reliability.
  2. Accelerated computing. The rise of accelerated computing is changing data centers. With the end of Moore's Law, traditional CPUs can no longer keep up with resource and computing demands, and their energy consumption and costs grow disproportionately.
  3. The AI revolution. AI is changing every aspect of how we live and work, and foundation AI models are bringing many new products, applications, and services, along with automation and personalization.

Accelerated computing requires full-stack optimization of software and hardware, and it can only be done one application domain at a time (molecular genetics, seismic processing, quantum chemistry, and so on). NVIDIA has invested in accelerated computing for nearly 20 years, and over the past decade it has increased the performance of HPC applications by 500x. This is not just about performance; it is increasingly about sustainability.

For example, the world's top 500 supercomputers consume staggering amounts of power. Through acceleration, NVIDIA has been able to significantly reduce power consumption and improve energy efficiency.

New product introductions


Computing platform

For AI tasks, foundation models are extremely important. These pre-trained models have many general skills and serve as the basis for building a wide range of applications.

ChatGPT was a watershed moment that focused the world's attention on AI.

Nvidia launched a set of pre-trained generative AI models that businesses can customize and deploy for their own applications. GPT-style models are well suited to tasks such as sentiment analysis, content generation, and summarization.

Community models such as BloomZ support 101 languages and applications including translation, natural language understanding, and question answering. Using these models, enterprises can fast-track their adoption of generative AI for language understanding and question answering, and they can also customize the models in various ways to match their domain and business goals, as shown in the figure below.

[Figure: customizing pre-trained models for enterprise domains and business goals]
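
As a rough illustration of how an enterprise might try out a community model such as BloomZ, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name and the prompt are only example assumptions on my part, not anything NVIDIA ships.

```python
# Minimal sketch: prompting a community BLOOMZ checkpoint with Hugging Face
# transformers. The model name and prompt are illustrative assumptions.
from transformers import pipeline

# bigscience/bloomz-560m is a small public BLOOMZ checkpoint; larger variants
# expose the same interface.
generator = pipeline("text-generation", model="bigscience/bloomz-560m")

# BLOOMZ is instruction-tuned across many languages, so a plain natural-language
# prompt is enough for tasks like translation or question answering.
prompt = "Translate to French: The data center of the future is accelerated."
print(generator(prompt, max_new_tokens=30)[0]["generated_text"])
```
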
Hopper GPU
Hopper-architecture GPUs are at the heart of Nvidia's current computing platform for state-of-the-art AI and HPC workloads. The H100 GPU features five breakthrough innovations as well as faster, more powerful Tensor Cores and NVLink. Hopper also introduces new DPX instructions to accelerate dynamic programming applications. The Golden Suite is the set of workloads Nvidia uses to measure its progress in HPC, AI, and data science. The H100 is now in full production and has been adopted by many cloud service providers.

[Figure: the H100 GPU]
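
To make concrete what kind of workload the DPX instructions target, here is a plain-Python sketch of a classic dynamic-programming recurrence (Levenshtein edit distance). It only illustrates the class of min/add inner loops that DPX accelerates in hardware; it does not use the instructions themselves.

```python
# Plain-Python sketch of a classic dynamic-programming recurrence (Levenshtein
# edit distance). Hopper's DPX instructions accelerate exactly this kind of
# min/add inner loop in hardware; this code only illustrates the workload class.
def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))          # distances for the previous row
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # The DP recurrence: deletion, insertion, or substitution.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

print(edit_distance("hopper", "copper"))    # -> 1
```
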
BlueField-3
BlueField-3 is now in full production. It reduces server footprint and power consumption by offloading work from CPU cores, and it isolates control- and management-plane applications, reducing the attack surface. BlueField is a chip purpose-built to offload and accelerate virtualization, networking, storage, and security software. BlueField-3 can also optimize scientific computing: its cores can offload and accelerate MPI collective operations, allowing computation and communication to proceed in parallel.
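
As a reminder of what an MPI collective looks like on the host side, here is a minimal mpi4py sketch of an allreduce, the kind of operation BlueField-3 cores can offload so the host can keep computing. The script itself is DPU-agnostic; the offload is transparent to it.

```python
# Minimal mpi4py sketch of an MPI collective (allreduce). This is the class of
# operation BlueField-3 can offload and progress on the DPU, overlapping it
# with computation on the host; the script itself knows nothing about the DPU.
# Run with e.g.:  mpirun -np 4 python allreduce_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank contributes a local buffer...
local = np.full(4, rank, dtype=np.float64)
result = np.empty_like(local)

# ...and the collective sums the buffers across all ranks.
comm.Allreduce(local, result, op=MPI.SUM)

if rank == 0:
    print("sum across ranks:", result)
```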

Performance tools
Rescale has just released a feature called Performance Profiles. It gives users detailed visibility into their workloads and performance across different compute architectures and scales, as shown in the figure below. Colors represent different systems: red is an x86 CPU system, while blue and green are GPU-accelerated systems. The chart shows that the Nvidia A100 GPU-accelerated systems deliver the best performance and energy efficiency at the lowest power and cost.

[Figure: Rescale Performance Profiles comparing an x86 CPU system with GPU-accelerated systems]

This is the part I am most interested in, since my recent work is on performance analysis tools. The feature looks quite powerful, but I suspect these numbers are not measured from actually running the program; they are more likely simulations of how it would run on different systems. I may look into this in more detail later.

Inference platform

In addition to AI training, the H100 is also used in the Nvidia inference platform. As generative AI adoption continues, more advanced computing solutions are required to handle increasingly complex inference workloads.

  1. Triton is open-source, multi-framework inference serving software for GPUs and CPUs. The companion Triton Management Service (TMS) automates the deployment of Triton instances for resource-efficient inference (a minimal client sketch follows this list).
  2. TensorRT is a high-performance deep learning inference SDK for NVIDIA GPUs (most readers will already be familiar with it). TensorRT recently added multi-GPU, multi-node inference for GPT-3-based large language models.
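
For readers who have not used Triton before, here is a minimal client-side sketch using the tritonclient HTTP API. The server URL, model name, tensor names, and shapes are placeholder assumptions; substitute whatever your Triton deployment actually serves.

```python
# Minimal sketch of querying a running Triton Inference Server over HTTP.
# Model name and tensor names below are hypothetical placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Describe the input tensor expected by the (hypothetical) model "my_model".
data = np.random.rand(1, 16).astype(np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

# Run inference and fetch the (hypothetical) output tensor "OUTPUT0".
result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```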

Nvidia also launched two new GPUs for different inference tasks:

  1. One use case is AI inference on video content: transcription, augmented reality, and inference based on the content itself. The new L4 GPU is optimized for video transcoding, video content understanding, and AR.
  2. The other use case is generative AI, which is driving an explosion of everyday AI applications in image, video, text, and even 3D generation. The L40 GPU delivers more than seven times the performance on workloads such as Stable Diffusion image generation (a minimal generation sketch follows this list).
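
To give a rough sense of the workload those L40 numbers refer to, here is a minimal Stable Diffusion sketch with the Hugging Face diffusers library. The checkpoint name is an example assumption, and the code runs on any recent CUDA GPU, not specifically an L40.

```python
# Minimal Stable Diffusion sketch with Hugging Face diffusers. The checkpoint
# name is an illustrative assumption; this is not tied to the L40 specifically.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example public checkpoint
    torch_dtype=torch.float16,          # half precision to save GPU memory
).to("cuda")

image = pipe("a photo of a data center made of glass, golden hour").images[0]
image.save("datacenter.png")
```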


Finally, Grace Hopper for inference.
Compared with traditional systems, Grace Hopper provides more than 7x the bandwidth and memory capacity. This enables the highest inference performance for memory-heavy workloads such as vector databases and recommendation systems. The Grace CPU is designed to be paired with Nvidia GPUs at massive scale and offers better energy efficiency and data movement.

Chip design

Accelerated computing also plays an important role in the design of the chip itself.

Photolithography is a critical process in chip fabrication, and one of its hardest challenges is computing the masks. Simulating the interference patterns of light for modern chips consumes tens of billions of CPU hours each year. Nvidia announced cuLitho, a library that brings accelerated computing to computational lithography and enables semiconductor leaders such as TSMC to accelerate the manufacture of next-generation chips. A comparison with the traditional CPU-based computation is shown in the figure below.

[Figure: comparison of mask computation time on CPUs vs. GPUs]

Quantum platform

Quantum computing is also an area of great potential, but some significant challenges must be overcome to make it a reality. One of the biggest hurdles is building quantum algorithms that can exploit quantum properties such as superposition and entanglement. New workflows are needed so that quantum developers can work through both the physics and the computer-science challenges of qubit-based computing.

Nvidia announced the Nvidia Quantum platform, which consists of three parts.

  1. cuQuantum: a set of accelerated libraries for quantum algorithm development
  2. CUDA Quantum: a hybrid quantum-classical computing platform with an integrated heterogeneous programming model (a minimal sketch follows this list)
  3. DGX Quantum: a system for building applications that combine quantum and classical computing
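
To give a flavor of the hybrid programming model, here is a minimal CUDA Quantum sketch in Python that builds and samples a two-qubit Bell state. It assumes the cudaq Python package and its kernel-builder API; exact names may differ between CUDA Quantum releases.

```python
# Minimal CUDA Quantum sketch: build and sample a two-qubit Bell state.
# Assumes the cudaq Python package and its kernel-builder API; exact names
# may differ between releases.
import cudaq

kernel = cudaq.make_kernel()
qubits = kernel.qalloc(2)

kernel.h(qubits[0])               # put qubit 0 into superposition
kernel.cx(qubits[0], qubits[1])   # entangle qubit 1 with qubit 0
kernel.mz(qubits)                 # measure both qubits in the Z basis

# Sample the kernel; on a GPU backend the state-vector simulation runs on CUDA.
counts = cudaq.sample(kernel)
print(counts)                     # expect roughly 50/50 "00" and "11"
```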

The Nvidia Quantum platform addresses multiple problems, such as algorithms for error correction and control, QPU design, hybrid application development, and tight integration with GPUs, and Nvidia is working closely with multiple vendors on it.

Summary


This report gives a rough overview of some of Nvidia's powerful recent offerings. They provide unmatched capabilities and will drive productivity in the era of AI supercomputing. Nvidia is the undisputed leader in GPU acceleration, and people in related industries should keep up with its product and technology updates.

We live in an age of opportunity, and what we can achieve through technology is limited only by our imagination!

Origin blog.csdn.net/hug_clone/article/details/129746621