How can cloud computing be restructured? Amazon Cloud Technology continues its hardware innovation to help enterprises implement generative AI

In 2023, the innovative power of generative AI accelerated across the world, demonstrating its capabilities through breakthroughs in image and video generation, human-computer dialogue, and other fields. However, when implementing generative AI, enterprises often run into problems such as insufficient computing power and difficulty completing large-model training, and these have become the last barrier to the commercial deployment of generative AI.


At the re:Invent 2023 conference, Amazon Cloud Technology addressed the computing-power pain points of enterprise generative AI by releasing a number of new hardware chips, including the powerful, lower-power Amazon Graviton4 and the high-performance Amazon Trainium2 for model training. Covering processors, model training, virtualization system architecture, supercomputing, and more, these infrastructure innovations give the development of generative AI a strong push and open the curtain on a new era of enterprise generative AI.

A new generation of self-developed processor chips: Amazon Graviton4

Bring stronger performance instances


The implementation of generative AI is inseparable from powerful computing power, and processor chips are naturally one of its main sources. Compared with Amazon Graviton3, Amazon Graviton4, the latest processor chip released by Amazon Cloud Technology, delivers 30% better compute performance, 50% more cores, and 75% more memory bandwidth; it speeds up database applications by 40% and large Java applications by 45%.


Amazon Graviton4

In terms of cores, Amazon Graviton4 uses the "Demeter" Neoverse V2 core based on the Armv9 architecture, while Amazon Graviton3 uses the "Zeus" V1 core. The V2 core executes 13% more instructions per clock cycle than the V1. Together with the larger core count, this yields a final performance increase of 30%, while performance per watt remains basically the same as Amazon Graviton3;

In terms of core count, the Amazon Graviton4 package has 96 V2 cores, 50% more than the 64 cores of Amazon Graviton3 and Amazon Graviton3E;

In terms of memory controllers, Amazon Graviton4 is packaged with 12 DDR5 memory controllers, whereas Amazon Graviton3 had only 8. In addition, the DDR5 memory used by Amazon Graviton4 is 16.7% faster, running at 5,600 MT/s. Altogether, Amazon Graviton4 delivers 537.6 GB/s of memory bandwidth per socket, 75% more than the 307.2 GB/s of the Amazon Graviton3 and Amazon Graviton3E processors. Amazon Graviton4 is currently available in preview in the new R8g instances, which offer up to 3x the vCPUs and memory of R7g instances.
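The bandwidth figures above follow directly from the memory configuration. As a quick sanity check, peak DRAM bandwidth is the transfer rate times the 8-byte (64-bit) channel width times the channel count:

```python
def peak_bandwidth_gbs(transfers_mts: int, channels: int) -> float:
    """Peak DRAM bandwidth in GB/s for 64-bit (8-byte) DDR5 channels."""
    return transfers_mts * 8 * channels / 1000

graviton3 = peak_bandwidth_gbs(4800, 8)    # 8 x DDR5-4800 controllers
graviton4 = peak_bandwidth_gbs(5600, 12)   # 12 x DDR5-5600 controllers
print(graviton3)                          # 307.2 GB/s
print(graviton4)                          # 537.6 GB/s
print(round(graviton4 / graviton3, 2))    # 1.75, i.e. 75% more bandwidth
```

The 1.75 ratio matches the quoted 75% generational improvement exactly.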


Comparison of Amazon Graviton chip generations, from left to right: generations 1 to 4

At the same time, because the entire Amazon Graviton line uses the Arm architecture, Amazon EC2 instances powered by Amazon Graviton chips can cost up to 20% less than comparable x86-based instances, and can use up to 60% less energy at the same level of performance.

Currently, more than 150 Amazon Cloud Technology computing instance types use Amazon Graviton processors, and more than 50,000 customers, including the top 100 Amazon EC2 customers, are using these instances. The launch of Amazon Graviton4 will further improve the performance of Graviton-based instances, helping enterprises obtain more cost-effective computing power.

New-generation training chip: Amazon Trainium2

Improve the efficiency of large model training

On the other hand, many companies have begun to train their own large generative AI models, and they quickly find that this calls for specialized training chips. Amazon Trainium2, the new-generation training chip released by Amazon Cloud Technology, is optimized specifically for training large models. Compared with the previous generation, training performance is up to 4x higher, memory capacity is 3x larger, and energy efficiency is 2x better.

79334ea27cd82dffb9557071a2d5ee96.jpeg

Amazon Cloud Technology previously launched Amazon Trainium, a chip dedicated to large-model training. In tests with the BERT-Large model, scaling from a single node to a 16-node cluster, Amazon EC2 Trn1 instances based on Amazon Trainium achieved 1.2 to 1.5 times the training throughput of a comparable P4d cluster, at only about 40% of the training cost per million sequences of a P4d cluster of the same scale. Amazon Trainium2 will further enhance the performance of Amazon Cloud Technology's latest Amazon EC2 Trn2 training instances, allowing customers to train large models with 300 billion parameters in just a few weeks, accelerating the era of generative AI.
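The cost comparison above reduces to a simple formula: cost per million sequences = hourly price x (1,000,000 / throughput) / 3600. A minimal sketch with hypothetical, illustrative prices and throughputs (the real figures are not published here) shows how a roughly 40% cost ratio can arise:

```python
def cost_per_million_sequences(hourly_price: float, sequences_per_second: float) -> float:
    """Cost to push one million training sequences through a cluster."""
    seconds_needed = 1_000_000 / sequences_per_second
    return hourly_price * seconds_needed / 3600

# Hypothetical numbers only, not published AWS prices: assume a Trn1 cluster
# at half the hourly price with 1.25x the throughput of the P4d cluster.
p4d  = cost_per_million_sequences(hourly_price=32.0, sequences_per_second=100.0)
trn1 = cost_per_million_sequences(hourly_price=16.0, sequences_per_second=125.0)
print(round(trn1 / p4d, 2))  # 0.4 -> Trn1 costs ~40% of P4d per million sequences
```

The point of the sketch is that the cost ratio combines both price and throughput, which is why a 1.2-1.5x throughput edge can translate into a much larger cost advantage.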

Comparison of throughput and cost between Trn1 and similar instances

Anthropic, a generative AI unicorn, has already planned to use Amazon Trainium2 chips to train its generative AI product, Claude.

Amazon Nitro

Redefining virtualization

System architecture design is another major difficulty in implementing enterprise generative AI. As demand for computing power keeps growing, the number of GPUs and other chips in an instance increases, and a large share of system resources is consumed connecting chips and scheduling tasks, so performance improvements hit diminishing returns. This is where virtualization design comes into play.

Amazon Cloud Technology's Amazon Nitro system is the foundational platform for the new generation of Amazon EC2 instances. Through dedicated Amazon Nitro cards, it offloads storage, networking, and management functions from the CPU to dedicated hardware and software, so that almost all of a server's resources go to the instances themselves, improving resource utilization and reducing costs.


The Amazon Nitro system includes a very lightweight hypervisor that consumes less than 1% of system resources, compared with roughly 30% for a traditional hypervisor. By moving virtualization functions off the server and onto the Amazon Nitro dedicated chips independently developed by Amazon Cloud Technology, the performance loss that virtualization imposes on the physical server is minimized.

At the same time, Amazon Nitro provides hardware-level security mechanisms. The Amazon Nitro security chip restricts write operations from a user's Amazon EC2 instance to the underlying hardware, so user data is well protected. In addition, with a variety of Amazon Nitro network and storage cards, storage virtualization, network I/O virtualization, and server hardware upgrade cycles can be decoupled from one another, ensuring I/O performance.


Today, the Amazon Nitro system has reached its fifth generation, with network performance improved to 100 Gbps. With Amazon Nitro, users can improve the security and stability of Amazon EC2 instance operation and management, instance designs can be more flexible, and, most importantly, the system overhead of virtualization itself is almost completely eliminated, allowing system resources to be fully devoted to workloads and improving computing efficiency.

NVIDIA x Amazon Cloud Technology

Create “the most powerful supercomputer on the cloud”

Finally, let us look at a practical case of enterprise generative AI built on Amazon Cloud Technology: NVIDIA's supercomputing project. Based on Amazon Nitro and Amazon EFA, 16,384 NVIDIA GH200 chips can be connected into the AI "factory" NVIDIA DGX Cloud.

This can be seen as a huge virtualized GPU cluster providing 65 exaflops of AI computing power (for comparison, Frontier, the world's number one supercomputer, delivers about 1.1 exaflops). It will be the world's first cloud AI supercomputer to combine NVIDIA GH200 Grace Hopper Superchips with Amazon Cloud Technology's scalable UltraCluster, and NVIDIA intends to use it for its own AI research and development and custom model development.
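A back-of-the-envelope check of those numbers: dividing the quoted 65 exaflops across 16,384 GH200 chips implies roughly 4 petaflops per chip, which is consistent with low-precision AI throughput (FP8-class) rather than the FP64 figure used to rank Frontier, so the two exaflops numbers are not directly comparable precisions.

```python
chips = 16_384
total_exaflops = 65

# 1 exaflop = 1,000 petaflops
per_chip_petaflops = total_exaflops * 1000 / chips
print(round(per_chip_petaflops, 2))   # ~3.97 petaflops per GH200 chip

# Ratio against Frontier's ~1.1 exaflops (measured at FP64, a different precision)
print(round(total_exaflops / 1.1, 1))  # ~59.1x
```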

Other enterprises can use this as a reference, employing Amazon Nitro to build virtualized clusters at whatever level of computing power their own generative AI applications require.


As the world's largest cloud service provider, Amazon Cloud Technology has infrastructure all over the world and can meet the varied needs of enterprise customers through a wide range of hardware and instances. The innovations in hardware, chips, and other infrastructure launched at re:Invent 2023 should further improve infrastructure performance, reconstruct cloud computing from the perspective of computing power, and help enterprise users quickly enter the new era of generative AI.


Shenyao's Technology Observation was founded by Shensky, a senior technology media professional with 20 years of experience in enterprise technology content and communication, who has long focused on the industrial Internet, enterprise digitalization, ICT infrastructure, automotive technology, and related topics.


Origin blog.csdn.net/W5AeN4Hhx17EDo1/article/details/134820134