Alibaba Cloud Enterprise ECS Conference Launches the Latest Generation of Heterogeneous Computing Products

Abstract: As deep learning drives progress in artificial intelligence, the parameter space of multi-layer neural network models has grown from millions to tens of billions of parameters, which poses new challenges for computing power. At the Alibaba Cloud Enterprise ECS conference on August 9, Alibaba Cloud launched the GN5 specification family, its latest generation of general-purpose GPU instances for heterogeneous computing and a cloud-based tool for building deep learning acceleration platforms. Compared with the previous generation of GPU computing instances, GN5 improves maximum performance by as much as 94 times.

The well-rounded GN5 uses NVIDIA's flagship P100 GPU, based on the Pascal architecture. It lets users build an agile, elastic, high-performance, and cost-effective deep learning platform on demand in the cloud, backed by Alibaba Cloud's efficient and stable global cloud infrastructure.
Compared with the previous generation, the single-instance performance of GN5 has increased fivefold. A single instance can provide up to 8 NVIDIA P100 GPUs with more than 20,000 parallel processing cores, delivering up to 75 TFLOPS of FP32 single-precision, 150 TFLOPS of FP16 half-precision, and 38 TFLOPS of FP64 double-precision floating-point performance.
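
The instance-level figures above appear consistent with simply multiplying per-GPU peak numbers by eight. The sketch below shows that arithmetic. It assumes the PCIe version of the Tesla P100 (3,584 CUDA cores, 18.7 / 9.3 / 4.7 TFLOPS peak for FP16 / FP32 / FP64); the article does not state which P100 variant GN5 uses, and the names `P100_PCIE` and `aggregate` are illustrative only.

```python
# Per-GPU peak figures for the PCIe version of the NVIDIA Tesla P100.
# Assumption: GN5 uses this variant; the article does not say so explicitly.
P100_PCIE = {
    "cuda_cores": 3584,
    "fp16_tflops": 18.7,
    "fp32_tflops": 9.3,
    "fp64_tflops": 4.7,
}


def aggregate(per_gpu, gpu_count):
    """Scale per-GPU peak figures linearly to a multi-GPU instance."""
    return {name: value * gpu_count for name, value in per_gpu.items()}


# An 8-GPU instance, the largest size mentioned in the article.
totals = aggregate(P100_PCIE, gpu_count=8)
for name, value in totals.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the totals come to roughly 28,700 cores and about 74, 150, and 38 TFLOPS, which matches the rounded numbers quoted above. These are theoretical peaks; real workloads rarely scale perfectly linearly across GPUs.
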
Self-built physical GPU servers are often hard to expand and adapt. The GN5 specification family offers a flexible series of configurations: users can select a suitable specification on demand according to their deep learning computing needs, create an instance within minutes, and scale GPU instances out horizontally or up vertically as computing requirements change.
To make better use of parallel computing across multiple GPU cards, GN5 supports GPUDirect. With peer-to-peer communication between GPU cards, GPUs can communicate directly over the PCIe bus with high bandwidth and low latency, without CPU intervention, which greatly improves the efficiency of exchanging model parameters during deep learning training.
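
A back-of-the-envelope sketch of why skipping the CPU helps. It is not a measurement of GN5. It assumes a PCIe 3.0 x16 link with a theoretical peak of about 15.75 GB/s per direction, a hypothetical model with one billion FP32 parameters, and a deliberately simplified cost model in which a copy staged through host memory crosses the bus twice while a peer-to-peer copy crosses it once. The function name `transfer_seconds` is illustrative.

```python
# Assumed link: PCIe 3.0 x16, theoretical peak in GB/s per direction.
PCIE3_X16_GBPS = 15.75


def transfer_seconds(num_params, bytes_per_param=4, hops=1,
                     bandwidth_gbps=PCIE3_X16_GBPS):
    """Idealized time to move a parameter set across the bus `hops` times."""
    size_gb = num_params * bytes_per_param / 1e9
    return hops * size_gb / bandwidth_gbps


params = 1_000_000_000  # hypothetical model size, FP32 parameters

direct = transfer_seconds(params, hops=1)  # GPU -> GPU, peer-to-peer
staged = transfer_seconds(params, hops=2)  # GPU -> host memory -> GPU

print(f"peer-to-peer: {direct:.3f} s, staged via host: {staged:.3f} s")
```

In this simplified model the peer-to-peer path takes half the time of the staged path. Actual gains depend on the PCIe topology, the transfer sizes, and how communication overlaps with computation.
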

Besides GPUs, deep learning also requires massive data storage, business service capabilities, monitoring, and more, all of which are complex, time-consuming, and labor-intensive to build in the traditional model. GN5 integrates with the ECS elastic computing ecosystem. It can connect to OSS object storage, NAS file storage, and similar services to store and access massive deep learning training data at low cost. Data can be preprocessed with the EMR service, and GPU resources can be monitored, with alarms, through the cloud monitoring service. Load balancing, elastic scaling, and resource scheduling make it possible to quickly build a complete elastic GPU service in the cloud. GN5 can also be used with container services, which simplify deployment and operations and provide resource scheduling.

GN5 instances support flexible payment methods. Users can pay annually to obtain the largest usage discount. Monthly payment reduces the one-time investment in computing resources while keeping the per-hour price relatively low. Hourly payment lets users meet temporary, short-term needs at the lowest cost per use.
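
The trade-off between the three payment methods can be sketched as a simple cost comparison. All prices below are hypothetical placeholders chosen only to illustrate the pattern described above; they are not Alibaba Cloud list prices, and the names `yearly_cost` and `cheapest` are illustrative.

```python
# Hypothetical placeholder prices (NOT real Alibaba Cloud prices).
HOURLY_RATE = 10.0      # cost per instance-hour, pay-by-hour
MONTHLY_RATE = 4000.0   # cost per month, monthly subscription
ANNUAL_RATE = 40000.0   # cost per year, annual subscription


def yearly_cost(hours_per_month, months_used):
    """Total cost over one year under each payment method."""
    return {
        "hourly": HOURLY_RATE * hours_per_month * months_used,
        "monthly": MONTHLY_RATE * months_used,
        "annual": ANNUAL_RATE,
    }


def cheapest(hours_per_month, months_used):
    """Name of the payment method with the lowest total cost."""
    costs = yearly_cost(hours_per_month, months_used)
    return min(costs, key=costs.get)


print(cheapest(40, 2))    # short experiment: a few hours over two months
print(cheapest(720, 6))   # continuous use for half a year
print(cheapest(720, 12))  # continuous use for the whole year
```

With these placeholder prices, occasional use favors hourly payment, sustained use over several months favors monthly payment, and year-round use favors annual payment.
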

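At this conference, Alibaba Cloud also released an enterprise-class ECS product line spanning general-purpose computing to heterogeneous computing products, and upgraded its infrastructure with third-generation distributed storage technology and the second-generation Apsara virtualized network switch. With these upgrades, Alibaba Cloud can better help enterprises with their digital transformation in the era of big data and artificial intelligence.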

 Original link: http://click.aliyun.com/m/28453/
