Machine learning platform PAI now supports preemptible instances, cutting model service costs by up to 90%

Preemptible instances reduce the cost and improve the efficiency of model inference services. They suit cost-sensitive inference scenarios such as asynchronous inference for AIGC content generation, batch image processing, and batch audio and video processing.

As AI development and services continue to pursue efficiency, Alibaba Cloud's machine learning platform PAI has announced support for preemptible instances (Spot Instance). For model inference, users can select preemptible instances through the PAI-EAS online model service platform to run tasks that are relatively insensitive to inference latency, thereby saving service costs. Compared with pay-as-you-go instances of the same specification, preemptible instances can cut costs by up to 90%.

What is a preemptible instance

A preemptible instance (Spot Instance) is a purchasing option for computing resources. Unlike traditional subscription (prepaid) instances and pay-as-you-go instances, the price of a preemptible instance fluctuates in real time with supply and demand. PAI-EAS preemptible instances run on idle computing resources in the public resource group and give users low-cost capacity, with prices as low as 10% of the pay-as-you-go price. Unit prices of the purchasing options compare as follows: preemptible instance without protection period < preemptible instance with protection period < prepaid instance < pay-as-you-go instance.

Before using PAI-EAS preemptible instances, users set a bid limit (the maximum price they are willing to pay) and choose whether to enable a 1-hour protection period. Once the service is deployed, PAI-EAS automatically bids for the corresponding resources.

Purchasing preemptible instances:

  • When instance inventory is sufficient and the user's bid limit is not lower than the current market price of the preemptible instance, the bid succeeds and the resource is acquired.

Using preemptible instances:

  • If the user sets a 1-hour protection period, the instance is guaranteed to remain available for at least 1 hour after it is successfully acquired. During the protection period, if the market price exceeds the user's bid limit, the instance is still billed at the bid limit. After the protection period ends, the instance is released immediately if inventory becomes insufficient or the bid limit falls below the market price.
  • If the user does not set a protection period, the instance is released immediately whenever inventory becomes insufficient or the bid limit falls below the market price.
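
The purchase and release rules above can be summarized in a short Python sketch. The function names `can_acquire` and `should_release` and their parameters are hypothetical and chosen only for illustration; they are not part of any PAI-EAS API.

```python
def can_acquire(bid_limit: float, market_price: float, in_stock: bool) -> bool:
    """A bid succeeds when stock is sufficient and the bid limit is not below the market price."""
    return in_stock and bid_limit >= market_price


def should_release(bid_limit: float, market_price: float, in_stock: bool,
                   hours_running: float, protected: bool) -> bool:
    """Return True if a running preemptible instance would be released."""
    if protected and hours_running <= 1.0:
        return False  # inside the 1-hour protection period: never released
    # Otherwise the instance is released on a stock shortage or when outbid.
    return (not in_stock) or bid_limit < market_price


# Market price rises above the bid 30 minutes after deployment.
print(should_release(5.0, 6.0, True, 0.5, protected=True))   # protected instance
print(should_release(5.0, 6.0, True, 0.5, protected=False))  # unprotected instance
```

The first call reports that the protected instance survives the price spike, while the second reports that the unprotected instance is released.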

Deploying with multiple instance specifications:

  • If a service deployed on preemptible instances specifies only a single instance specification, it may remain pending for a long time because the bid is too low or inventory is insufficient. To address this, PAI-EAS deployment supports selecting multiple instance specifications. PAI-EAS starts resources by traversing the specification list in the service resource configuration, which greatly reduces the deployment risk caused by the release of preemptible instances and keeps the service running stably.
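
As a rough illustration of the traversal described above, the following Python sketch walks an ordered specification list and picks the first entry that can be started. The `Spec` and `Quote` types, the specification labels, and the prices are all assumptions made for this example; the actual PAI-EAS scheduler is not documented here and may behave differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Spec:
    name: str                   # hypothetical specification label
    bid_limit: Optional[float]  # yuan/hour; None marks an ordinary pay-as-you-go spec


@dataclass
class Quote:
    market_price: float  # current preemptible price in yuan/hour
    in_stock: bool


def first_available(specs: List[Spec], quotes: Dict[str, Quote]) -> Optional[Spec]:
    """Walk the list in order and return the first spec that can be started."""
    for spec in specs:
        quote = quotes.get(spec.name)
        if quote is None or not quote.in_stock:
            continue  # unknown or out of stock: try the next spec
        if spec.bid_limit is None or spec.bid_limit >= quote.market_price:
            return spec
    return None  # nothing can be started: the service stays pending


specs = [Spec("gpu-a", 5.0), Spec("gpu-b", 4.0), Spec("gpu-a-payg", None)]
quotes = {
    "gpu-a": Quote(6.1, True),       # market price above the bid limit
    "gpu-b": Quote(3.5, False),      # out of stock
    "gpu-a-payg": Quote(0.0, True),  # market price is unused for pay-as-you-go
}
chosen = first_available(specs, quotes)
print(chosen.name if chosen else "pending")
```

Placing a pay-as-you-go specification last in the list gives the service a fallback when every preemptible entry is outbid or out of stock.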

The figure below illustrates the pricing rules of PAI-EAS preemptible instances (Spot Instance). In this example, the pay-as-you-go price of the instance is 13.98 yuan/hour, the user's bid limit is 5 yuan/hour, and a 1-hour protection period is set. With preemptible instances, users obtain the same computing resources at a lower price.
[Figure: pricing example for a PAI-EAS preemptible instance]

Note: The example uses the price of a preemptible instance in the PAI-EAS public resource group on April 23, 2023. The instance specification is 8 vCPU + 30 GB + 1 × A10.
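
The arithmetic behind this example can be checked with a few lines of Python. The two prices come from the example above; the function names are hypothetical, and billing at the market price whenever it is below the bid limit is an assumption, since the text only states that the charge is capped at the bid limit during the protection period.

```python
PAYG_PRICE = 13.98  # yuan/hour, pay-as-you-go price from the example
BID_LIMIT = 5.00    # yuan/hour, the user's bid limit from the example


def hourly_charge(market_price: float, bid_limit: float) -> float:
    """Charge per hour inside the protection period: capped at the bid limit.

    Assumption: when the market price is below the bid limit, the market
    price is what gets billed.
    """
    return min(market_price, bid_limit)


def minimum_saving(payg_price: float, bid_limit: float) -> float:
    """Smallest possible saving versus pay-as-you-go, reached when billed at the bid limit."""
    return 1.0 - bid_limit / payg_price


print(hourly_charge(6.00, BID_LIMIT))  # market price spikes above the bid limit
print(hourly_charge(3.20, BID_LIMIT))  # market price below the bid limit
print(f"{minimum_saving(PAYG_PRICE, BID_LIMIT):.1%}")
```

Even in the worst case, where every hour is billed at the 5 yuan bid limit, the instance costs roughly 64% less than the 13.98 yuan pay-as-you-go price.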

Application Scenarios of PAI-EAS Preemptible Instances

PAI-EAS preemptible instances (Spot Instance) suit scenarios that are highly price-sensitive but relatively insensitive to the real-time performance and stability of the inference service, such as:

  • Asynchronous inference scenarios for AIGC content generation
  • Image analysis for batch post-processing such as image recognition, OCR, etc.
  • Video analysis for batch post-processing such as video segmentation and video classification
  • Speech analysis for asynchronous inference or batch inference such as speech segmentation and speech-to-text
  • Asynchronous batch processing scenarios for AI painting such as Stable Diffusion

When users do not need inference results in real time and can accept some delay (for example, up to 1 hour), preemptible instances are a good way to optimize service costs.

In real business scenarios, customers can first purchase a certain amount of prepaid resources as guaranteed capacity so that the service runs smoothly. For the elastic part, they can use preemptible instances of different specifications according to the business scenario, and rely on the automatic elastic scaling provided by PAI-EAS to scale preemptible instances out and in. When preemptible instances cannot be scaled out because of price, the multi-specification instance option in PAI-EAS lets users fall back to ordinary pay-as-you-go instances. This combination keeps costs down while keeping the service stable.
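
The layered strategy above can be sketched as a simple allocation rule: fill demand from prepaid capacity first, then from preemptible instances, and cover any remainder with pay-as-you-go instances. The function `plan_replicas` and its parameters are hypothetical and only illustrate the idea; they do not describe how PAI-EAS autoscaling is implemented.

```python
from typing import Dict


def plan_replicas(demand: int, prepaid: int, spot_available: int) -> Dict[str, int]:
    """Split the required replica count across the three purchasing options."""
    base = min(demand, prepaid)           # guaranteed prepaid capacity comes first
    elastic = demand - base               # whatever the baseline cannot cover
    spot = min(elastic, spot_available)   # cheapest elastic capacity
    payg = elastic - spot                 # fallback when preemptible capacity runs out
    return {"prepaid": base, "spot": spot, "pay_as_you_go": payg}


# Peak load: 10 replicas needed, 4 prepaid, only 3 preemptible instances obtainable.
print(plan_replicas(demand=10, prepaid=4, spot_available=3))
# Off-peak load: prepaid capacity alone is enough.
print(plan_replicas(demand=3, prepaid=4, spot_available=10))
```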

How to configure preemptible instances using PAI-EAS

1. Open the PAI-EAS console and click "Deployment Service" to open the detailed configuration page.

2. In the "Resource Deployment Information" section, set "Resource Group Type" to "Public Resource Group" and switch "Resource Configuration Method" to "Advanced Resource Configuration" to configure preemptible instance (Spot Instance) resources.

3. Select the protection period for preemptible instances:

1-hour protection period: After a successful deployment, the instance is guaranteed to be available for at least 1 hour. Once the protection period ends, you can continue to use the preemptible instance as long as inventory is sufficient and your bid limit is not lower than the current market price.

No protection period: There is no guaranteed period of use. You can continue to use the preemptible instance only while inventory is sufficient and your bid limit is not lower than the current market price. In exchange, the price is lower than with the 1-hour protection period.

4. Select the instance specification. The console shows the current preemptible price of the specification next to its original price to help you set a bid. As long as the preemptible price does not exceed your bid and inventory is sufficient, you keep the instance.

Click "+" to add another instance specification. After the service goes online, PAI-EAS starts resources by traversing the specification list in the service resource configuration, which reduces the risk caused by the release of preemptible instances.

5. After completing other configurations, click the "Deploy" button to launch the service.

Reference documents

Overview of EAS preemptible instances (Spot Instance)
https://help.aliyun.com/document_detail/52088.htm?spm=a2c6h.12873639.article-detail.4.23cd4fe1amQ1Rz

Advanced Configuration: Multi-Specification Instance Selection
https://help.aliyun.com/document_detail/602247.html?spm=a2c6h.12873639.article-detail.5.23cd4fe1amQ1Rz

Origin blog.csdn.net/bjchenxu/article/details/130772071