A tool for cutting AI costs! Alibaba Cloud's Elastic Accelerated Computing Instance is here, saving up to 50% of inference costs

Introduction: Alibaba Cloud recently launched the Elastic Accelerated Computing Instance (EAIS) family and, within it, the Elastic Accelerated Inference Instance (EAIS.EI for short). The product decouples the GPU from CPU and memory for the first time, which can greatly improve the efficiency of AI inference and cut costs.

In AI inference scenarios, EAIS.EI lets users choose how much GPU computing power to provision. According to reports, the product can save up to 50% of inference costs. EAIS.EI instances currently support mainstream deep learning frameworks such as TensorFlow and PyTorch, and deliver up to 19.5 TFLOPS at FP32 and up to 312 TFLOPS at FP16 mixed precision.

Whether you are browsing Taobao or scrolling through Douyin, artificial intelligence is computing furiously behind the scenes: recommending short videos, applying AI beauty filters to photos, and, when you order food, even helping the delivery rider optimize the route.

One of the fuels behind the rise of artificial intelligence is the GPU, a heterogeneous accelerator whose strengths differ from those of the general-purpose CPU. In terms of computing power, if the CPU is a generalist, the GPU is a specialist that excels at deep learning and image processing, which is why it shines in today's AI, live streaming, and short video workloads.

However, different deep learning applications need CPU, GPU, and memory in different ratios. Cloud servers are constrained by their instance specifications and often offer only fixed ratios, so some resources can sit idle, especially in inference scenarios. Deep learning applications involve two phases: training and inference. Inference compute demand tracks business volume closely and is often the bulk of total operating costs, accounting for up to 90% of the cost.

EAIS provides customers with a pool of heterogeneous computing power. Users can attach the GPU resources they need to any Alibaba Cloud ECS server, flexibly tune the ratio of CPU/memory to GPU for each application, and match a suitable combination of resources, which effectively improves resource utilization.
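The effect of decoupling on utilization can be illustrated with a little arithmetic. The Python sketch below is only an illustration under stated assumptions: the list of attachable accelerator sizes and the 4 TFLOPS workload are hypothetical, and only the 19.5 TFLOPS FP32 ceiling comes from the article. It does not call any Alibaba Cloud API.

```python
from typing import Sequence


def pick_accelerator(required_tflops: float, sizes: Sequence[float]) -> float:
    """Return the smallest accelerator size that covers the requirement."""
    candidates = [s for s in sorted(sizes) if s >= required_tflops]
    if not candidates:
        raise ValueError("requirement exceeds the largest available size")
    return candidates[0]


def utilization(required_tflops: float, provisioned_tflops: float) -> float:
    """Fraction of the provisioned GPU compute that the workload uses."""
    return required_tflops / provisioned_tflops


# Hypothetical attachable sizes in FP32 TFLOPS; not an actual EAIS.EI catalog.
HYPOTHETICAL_SIZES = [2.5, 5.0, 10.0, 19.5]

required = 4.0  # hypothetical inference workload, in FP32 TFLOPS

# Fixed-ratio server: the workload gets a whole 19.5 TFLOPS GPU regardless.
fixed = utilization(required, 19.5)

# Decoupled: attach only the smallest size that covers the workload.
attached_size = pick_accelerator(required, HYPOTHETICAL_SIZES)
attached = utilization(required, attached_size)

print(f"fixed-ratio utilization: {fixed:.0%}")
print(f"attached {attached_size} TFLOPS, utilization: {attached:.0%}")
```

Under these made-up numbers the attached accelerator is far better utilized than the bundled GPU, which is the kind of gap the resource matching described above is meant to close.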

In addition, AI inference services usually see periodic peaks and valleys in business volume. EAIS.EI instances work with the elastic scaling service ESS to detect changes in business volume quickly, keep operations efficient, and make the business more flexible. Precise resource matching, combined with the elasticity of cloud services, greatly reduces costs.
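To see why following the peaks and valleys matters, compare provisioning for the daily peak all day with provisioning hour by hour. The Python sketch below uses a made-up 24-hour load profile and a hypothetical per-instance capacity; it is plain arithmetic and does not model how ESS actually makes scaling decisions.

```python
import math
from typing import Sequence


def instances_needed(load: float, capacity_per_instance: float) -> int:
    """Instances required to serve a load, keeping at least one running."""
    return max(1, math.ceil(load / capacity_per_instance))


def instance_hours_static(hourly_load: Sequence[float], capacity: float) -> int:
    """Provision for the peak hour and keep that fleet running all day."""
    return instances_needed(max(hourly_load), capacity) * len(hourly_load)


def instance_hours_elastic(hourly_load: Sequence[float], capacity: float) -> int:
    """Resize the fleet every hour to match that hour's load."""
    return sum(instances_needed(load, capacity) for load in hourly_load)


# Hypothetical requests per second for each hour of one day.
load = [100] * 8 + [400] * 8 + [1000] * 4 + [400] * 4
capacity = 200  # hypothetical requests per second one instance can serve

static_hours = instance_hours_static(load, capacity)    # 5 instances * 24 h = 120
elastic_hours = instance_hours_elastic(load, capacity)  # 8 + 16 + 20 + 8 = 52
saving = 1 - elastic_hours / static_hours

print(f"static: {static_hours} instance-hours")
print(f"elastic: {elastic_hours} instance-hours")
print(f"saving: {saving:.0%}")
```

The saving depends entirely on the shape of the load curve: a flat profile yields none, while a sharply peaked one yields a lot.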

The head of Alibaba Cloud's heterogeneous computing products revealed that, beyond inference scenarios, elastic accelerated computing instances will in the future also cover graphics and multimedia encoding scenarios, and will even decouple the Hanguang 800 from CPU/memory.

Together with the Dragon AI accelerator and cGPU container technology, the elastic accelerated computing instance will form the three distinctive advantages of Alibaba Cloud's heterogeneous computing, giving users a flexible, efficient, high-performance heterogeneous computing infrastructure.

Original link: https://developer.aliyun.com/article/775645?

Origin blog.csdn.net/alitech2017/article/details/109100090