Serverless meets FinOps: Economical Serverless

Abstract: Based on FunctionGraph's FinOps exploration and practice in the serverless field, this paper proposes the industry's first total cost estimation model for serverless functions.
Lichuan: Serverless R&D Expert, HUAWEI CLOUD
Pingshan: Head of Serverless, HUAWEI CLOUD Middleware
Feng Jia: Chief Expert, HUAWEI CLOUD Middleware

Key Takeaways:

1) Although the rapid development of serverless has attracted broad and deep attention, there is still no effective theoretical guidance for estimating the total cost of serverless functions in advance. Based on FunctionGraph's FinOps exploration and practice in the serverless field, this paper proposes the industry's first total cost estimation model for serverless functions;

2) Based on an analysis of the key factors in the cost model, five methods for optimizing function operating costs are proposed. To further help users reduce costs and improve efficiency, HUAWEI CLOUD also introduces, for the first time, a transparent, efficient, one-click "User Function Cost Research Center".

Problem Introduction

Serverless's millisecond-granularity pay-per-use model means users no longer pay for resources' idle time. However, for a given application function, the factors affecting its billing are not unique, so accurately estimating the total bill over the function's running period becomes a difficult task for users.

Take the periodic leasing model of traditional cloud resources as an example: by multiplying the number of periods by the unit price per period, users can easily estimate the total cost over the lease term and form a clear expectation of what they will pay. Even if the cloud platform adopts tiered pricing or price-discrimination strategies, the total lease cost is still not hard to calculate.

In serverless scenarios, however, there is still no effective theoretical guidance for estimating the total cost of a function in advance. On the one hand, the key factors affecting function billing are not unique, including function memory specification, single-instance concurrency, and function execution time; on the other hand, function traffic fluctuates dynamically, so pay-as-you-go billing carries greater uncertainty.

Of course, the point of a theoretical model for function billing is to give users an effective basis for evaluating total function cost. More importantly, how to further use the estimation model to help users optimize their application functions and configuration choices, and thereby significantly reduce total cost, is an urgent question for FinOps in the serverless field.

FinOps focuses on cloud resource management and cost optimization: by organically linking technical, business, and finance professionals, it optimizes cloud resource costs for users, enterprises, and organizations, and improves the return on cloud investment [1]. Based on HUAWEI CLOUD FunctionGraph's FinOps exploration and practice in the serverless field, this paper analyzes the function billing model and its key influencing factors in serverless scenarios, and introduces a model framework for estimating total billing over a function's running period in advance. More importantly, the model provides an effective basis for helping users optimize total function running cost, improve serverless resource management on the cloud, and achieve Economical Serverless.

1. Glossary and background knowledge

First, a brief description of the concepts listed in Table 1 is given.

Table 1: Common terms for serverless functions

Memory specification (Memory): Also known as function specification or function instance specification, this indicates the resource size the serverless platform allocates to a single instance of the function. It is generally expressed as the amount of memory the function can use and is specified by the user; the CPU share available to the instance is proportional to the memory size. Serverless cloud platforms usually provide a variety of specifications to choose from. Taking FunctionGraph as an example, users can choose from 15 function specifications, as shown in Figure 1.

Figure 1: FunctionGraph provides various function memory specifications

Function Execution Time: The time consumed by the execution of the function itself while serving a single call request, mainly determined by the function's code logic. Generally, for CPU-intensive functions, increasing the resource specification (memory-CPU share) can significantly reduce execution latency; for functions that spend most of their time on network IO and similar operations, however, increasing the specification yields very limited latency improvement.

Maximum Requests per Instance: The maximum number of requests a single instance of the function can process simultaneously. This mainly suits scenarios where function execution spends significant time waiting for downstream services to return, such as database access or disk IO. For the same traffic load, increasing a function's single-instance concurrency reduces the number of billed instances, lowers users' bills, and reduces the proportion of call requests that hit cold starts.

Maximum Instances per Function: The upper limit on the number of instances of the same function running simultaneously. For users, the maximum instance count prevents cost from spiraling out of control when the cloud platform over-scales during abnormal traffic spikes or function failures; for the cloud platform, it prevents platform resources from being exhausted by a single function under abnormal conditions, thereby ensuring performance isolation between different functions.

2. Function billing and cost model

For a function billing estimation model from the single-instance perspective, see [2]. In real production environments, serverless cloud platforms usually respond to call requests in FCFS (First Come, First Served) order, except for asynchronous functions. The platform adapts to tidal fluctuations in function traffic through automatic instance scaling, and the variation over time of the number of concurrent instances running in the system can be fully characterized by a piecewise-constant function, as shown in Figure 2.

Figure 2: The number of concurrent instances of the function changes with the scaling process

Although billing methods differ across serverless cloud vendors, function billing generally includes two parts: billing for the resources used by the function and billing for the number of requests, as follows:
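As an illustrative sketch (the exact pricing formula in the original was not preserved here; the symbols and structure below are assumptions modeled on typical pay-per-use serverless pricing, not any vendor's exact terms), the total monthly billing can be written as:

    Cost_total ≈ p_res × M × (N × t / c) + p_req × N − Free    (1)

where M is the instance memory specification (GB), t the average billed execution time per request (s), N the number of requests in the month, c the single-instance concurrency, p_res the unit price per GB-second of resources, p_req the unit price per request, and Free the platform's monthly free allowance.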

The last part of the formula represents the total free allowance provided by the cloud platform, which is independent of function call traffic and function configuration.

3. Discussion on cost optimization methods

With this cost estimation model, we can examine the key factors that drive user cost. Ignoring the free allowance provided by the cloud platform in estimation formula (1), the structure of the function's total monthly cost is as follows:
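Under the same illustrative assumptions used for the sketch of formula (1) above (hypothetical symbols, not a vendor's exact pricing terms), dropping the free allowance leaves:

    Cost_month ≈ p_res × M × (N × t / c) + p_req × N    (2)

The user-controllable levers are then visible term by term: the execution time t, the code and package size that inflate t through cold starts, the single-instance concurrency c, and the memory specification M, which the following five points address in turn.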

Point 1: Optimize the function's code logic to reduce execution latency

For the same traffic load, lower execution latency saves users billing costs. Continuously optimizing function code and improving execution efficiency is a natural requirement of software engineering under any business logic, but in serverless scenarios it is even more pressing.
Specifically, consider adopting lightweight programming languages such as Python or Node.js, removing unnecessary items from the function's initialization configuration, moving operations such as connecting to databases and other services into the initialization phase before the function's execution entry point, and simplifying the code logic.
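The advice above about moving connection setup into the initialization phase can be sketched as follows (a minimal, self-contained illustration with hypothetical names; the client creation stands in for an expensive call such as opening a real database connection, and is not a FunctionGraph-specific API):

```python
INIT_CALLS = 0

def create_client():
    """Stand-in for an expensive setup step, e.g. opening a database
    connection. Counting calls lets us verify it runs only once."""
    global INIT_CALLS
    INIT_CALLS += 1
    return {"connected": True}

# Initialization phase: executed once per instance (at cold start),
# not once per request.
CLIENT = create_client()

def handler(event, context=None):
    # Execution phase: reuses the cached client, so warm requests pay
    # no connection-setup latency (and hence no extra billed time).
    return {"ok": CLIENT["connected"], "event": event}

# A warm instance serving three requests initializes the client only once.
for i in range(3):
    handler({"request_id": i})
print(INIT_CALLS)  # → 1
```

Had `create_client()` been called inside `handler`, every request would pay the setup cost, and that latency would be billed every time.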

In addition, to help users understand how their functions are running, FunctionGraph provides in-depth visualization and observability for application functions and supports a rich set of monitoring metrics, including call count, error count, and running latency. Figure 3 shows an example of function runtime monitoring.

Figure 3: Example of FunctionGraph function runtime monitoring

Point 2: Optimize the size of function code packages, dependency packages, and images

When a function call triggers a cold start, the cold-start latency is, from the billing perspective, included in the execution latency and billed together with it. A considerable proportion of cold-start latency is consumed by the cloud platform downloading the user's code package and dependency packages from a third-party storage service (such as HUAWEI CLOUD Object Storage Service, OBS), or pulling the user's application image from an image repository, as shown in Figure 4. Although most cloud platforms pre-cache user code and images through various caching mechanisms to optimize cold-start performance, the latency of loading user code during instance startup remains significant. Therefore, the size of the function code package, including dependency packages and images, should be minimized to reduce billed time.

Figure 4: Billing time and optimization points under hot and cold start

Point 3: Write single-purpose, lightweight functions

Under a serverless programming framework, functions should be written as lightweight, single-purpose code as much as possible, that is, "functions should be small and purpose-built" [3]: let a function do only one thing. On the one hand, the running latency of a single-purpose function is easier to optimize in a targeted way; on the other hand, when one function implements multiple capabilities at once, there is a high probability that all of them will suffer in performance simultaneously, ultimately raising the total billing over the function's running period.

Figure 5: Example of HUAWEI CLOUD FunctionGraph function flow

If an application really needs to provide multiple capabilities, consider decomposing the large function into multiple small functions and implementing the overall logic through orchestration, as with the FunctionGraph function flow feature shown in Figure 5. Decomposing large functions is also one of the best practices for handling abnormal scenarios such as timeouts in serverless computing [4].
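The decomposition idea can be sketched in miniature (hypothetical function names; in production the composition would be expressed as a function flow definition rather than in-process calls):

```python
def validate(event):
    """First small function: does one thing — input validation."""
    if "user_id" not in event:
        raise ValueError("missing user_id")
    return event

def enrich(event):
    """Second small function: does one thing — adds derived fields.
    The region value here is purely illustrative."""
    return {**event, "region": "cn-north-4"}

def flow(event):
    # Stand-in for an orchestrated function flow chaining the two
    # single-purpose functions; each can be sized and tuned separately.
    return enrich(validate(event))

print(flow({"user_id": 42}))  # → {'user_id': 42, 'region': 'cn-north-4'}
```

Because each small function has its own specification and timeout, a slow enrichment step no longer forces a larger memory size or longer timeout onto the validation step.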

Point 4: Where the business model supports it, adopt single-instance multi-concurrency

As the function cost structure in formula (2) shows, under the premise of the user's business model, configuring a suitable single-instance concurrency can effectively reduce the function's total monthly cost. If the user does not configure it, the cloud platform's default is usually 1, meaning a single instance processes only one request at a time; when the function is called concurrently, the platform therefore starts multiple instances to respond, increasing the number of billed instances, as shown in Figure 6. In addition, single-instance multi-concurrency can also improve the tail latency of call requests waiting to be served.

Figure 6: Single Instance Concurrency: Billing Hours Perspective and Instance Count Perspective

Of course, higher single-instance concurrency is not always better. An excessively high concurrency setting intensifies resource competition among the threads within a function instance (e.g., CPU contention), degrading function response performance and hurting the QoS of user applications. Also, as noted in the background section, not all application functions suit single-instance multi-concurrency: it mainly fits scenarios where a considerable share of execution latency is spent waiting for downstream services to return, leaving instance resources such as CPU largely idle, for example when accessing databases, message queues, and other middleware, or performing disk or network IO. Single-instance multi-concurrency also requires users to adapt their code's error handling (e.g., consider request-level error-capture granularity) and the thread safety of globally shared variables (e.g., lock protection).

Point 5: Choose function resource specifications with their impact on execution latency in mind

Finally, consider the choice of function resource specification. Formula (2) clearly shows that larger instance memory corresponds to higher billing cost, but the choice of memory specification must also consider its impact on function execution latency. From the user function's perspective, execution latency is determined not only by the code's business logic but also by the resources available to the running instance. A larger instance specification means more usable memory and a larger CPU share, which may significantly improve the execution performance of memory- or CPU-intensive functions and reduce execution latency. This improvement has an upper bound, however: beyond a certain specification, additional resources do almost nothing to reduce execution latency, as shown by the dashed line in Figure 7. These facts show that, for a given user function, a reasonable instance specification should be configured so that the total billing cost comes as close as possible to its minimum, as shown by the solid line in Figure 7.

Figure 7: The choice of function specification needs to consider the impact on both cost and execution latency

Figure 8: Analysis of key factors of function billing cost
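The cost-versus-specification trade-off of Point 5 can be sketched with a toy sweep. The latency curve and unit price below are pure assumptions for illustration (real curves come from profiling the actual function, e.g. via runtime monitoring): latency falls steeply at small sizes and then flattens, so cost per invocation (memory × latency × price) has an interior minimum rather than favoring the smallest or largest size.

```python
def latency_s(mem_gb: float) -> float:
    """Hypothetical latency curve: steep gains at small sizes that
    flatten toward a floor of 0.3 s as resources grow."""
    return 0.3 + 2.0 / (mem_gb * mem_gb)

PRICE_PER_GB_S = 0.0001  # assumed unit price, not real FunctionGraph pricing

def cost_per_invocation(mem_gb: float) -> float:
    # Resource-billing term of the sketch model: memory × billed time × price.
    return mem_gb * latency_s(mem_gb) * PRICE_PER_GB_S

specs = [0.5, 1.0, 2.0, 4.0]  # candidate memory specifications in GB
for m in specs:
    print(f"{m} GB: latency {latency_s(m):.2f} s, cost {cost_per_invocation(m):.6f}")

best = min(specs, key=cost_per_invocation)
print("cheapest specification:", best, "GB")  # → cheapest specification: 2.0 GB
```

Under these assumed numbers, 0.5 GB is cheapest per GB-second but so slow that it bills the most, while 4.0 GB barely improves latency over 2.0 GB and costs more; the sweep lands on 2.0 GB, mirroring the solid-line minimum in Figure 7.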

4. Serverless Function Cost Research Center

Reducing costs and improving efficiency for users is a core principle of FunctionGraph. Although the five cost optimization methods analyzed above are discussed from the user's perspective, we believe these problems should not fall entirely within the user's scope of concern. On the contrary, FunctionGraph keeps exploring how to help users achieve the best FinOps results in the serverless field, so that users truly enjoy the benefits of Economical Serverless; for example, building on deep instance-level visualization and observability, it helps users automate the whole function FinOps process and provides transparent, efficient, one-click function resource management and cost optimization services.

Figure 9: Online resource consumption perception and dynamic specification recommendation

To this end, based on internal practice, FunctionGraph will soon launch the "User Function Cost Research Center", a cost analysis and optimization center providing several heavyweight features, including offline power tuning, online resource consumption perception, online resource recommendation (as shown in Figure 9), and predictive auto-scaling preview. These minimize the technical threshold for users to implement function FinOps and bring great convenience to users' business development and serverless transformation.

5. Summary and Outlook

This paper discussed FinOps in serverless computing scenarios, presented the industry's first total cost estimation model for user functions, and, based on this model, provided a theoretical reference and practical basis for users to optimize application functions, improve serverless resource management efficiency, and reduce total cost.

When an emerging technology field rises, the first question to answer is "Why & Value". As the next-generation serverless function compute and orchestration service built on Huawei Yuanrong, FunctionGraph, combining FinOps and other technical concepts, continues to provide users with economical serverless services. In follow-up posts we will share more cutting-edge theory and case practice around general-purpose, full-scenario serverless and give back to the community, including FunctionGraph's practical experience with serverless for microservices.

References:

[1] What is FinOps: https://www.finops.org/introduction/what-is-finops/

[2] Running Lambda Functions Faster and Cheaper: https://levelup.gitconnected.com/running-lambda-functions-faster-and-cheaper-416260fbc375?gi=4370e4c57684

[3] AWS Lambda Cost Optimization Strategies That Work: https://dashbird.io/blog/aws-lambda-cost-optimization-strategies/

[4] Timeout Best Practices: https://lumigo.io/learn/aws-lambda-timeout-best-practices/

 
