How to design and implement adaptive load balancing

In a modern distributed application, a service request is processed by a server pool made up of physical or virtual machines. The pool is usually large, and service capacity varies from server to server; subject to the network, memory, CPU, and downstream services, a server's capacity fluctuates constantly rather than settling into a stable state. How to design and implement a load-balancing algorithm for such a system is a challenging problem.

Background: why adaptive load balancing is needed

Load balancing has two main objectives:

  • Keep request response times short and the blocking probability low;
  • Keep the overhead of the load-balancing algorithm itself at a controllable level, so that it does not consume too much CPU, network, or other resources.

Adaptive load balancing means that whether the system is idle, busy, or in a steady state, the load-balancing algorithm continuously evaluates the service capacity of the system and distributes traffic accordingly, so that the system as a whole always maintains good performance, with no server starved, overloaded, or crashing.

Such an algorithm is indispensable in today's e-commerce systems, data centers, cloud computing, and other fields; adaptive load balancing makes better use of resources and improves performance. For example, at midnight on Double Eleven, users' payment orders are concentrated in a short window, and the request rate to the whole e-commerce system reaches its peak. If these requests are routed to only a small subset of servers, each such machine receives requests far faster than it can process them; new tasks cannot be handled in time, and requests pile up in queues.

For users, once tasks pile up, requests slow down or even time out and the experience degrades severely, possibly to the point that the service is unavailable. The affected machines become more and more overloaded as the backlog grows, until they finally go down. The remaining machines then absorb the traffic of the crashed ones, the process repeats, and eventually the entire application becomes unavailable: a cascading system failure.
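The runaway-queue dynamic described above is easy to see in a toy model (a sketch in Python with made-up arrival and service rates, not part of the contest code): once the arrival rate exceeds the service rate, the backlog grows without bound, and per-request latency grows with it.

```python
def simulate_queue(arrival_rate, service_rate, seconds):
    """Discrete-time sketch: each second, `arrival_rate` requests arrive
    and at most `service_rate` of the queued requests are processed."""
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += arrival_rate
        backlog -= min(backlog, service_rate)
        history.append(backlog)
    return history

# An overloaded server (120 req/s in, 100 req/s out) accumulates
# 20 requests of backlog every second; a balanced one stays at zero.
overloaded = simulate_queue(arrival_rate=120, service_rate=100, seconds=10)
balanced = simulate_queue(arrival_rate=90, service_rate=100, seconds=10)
```

After ten seconds the overloaded server is already 200 requests behind, while the balanced one has no queue at all.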

To prevent this from happening, a common approach comes to mind: load-test the service offline before going live, take the measured capacity as a rate limit, and once the online request rate exceeds that limit, reject new requests so that the service stays available. However, this approach has its own problems: the limit derived from load testing is usually conservative, so the full performance of a heterogeneous system cannot be realized; and it cannot react intelligently to capacity drops caused by the network, downstream-service changes, and other issues, so the risk of downtime remains.
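The static approach can be sketched as a fixed-threshold limiter (illustrative Python; the threshold is a made-up value standing in for the load-test result): any request beyond the pre-measured rate is rejected in the current window, even if the servers could in fact have handled it.

```python
import time

class StaticRateLimiter:
    """Rejects requests once a rate fixed by offline load tests is
    exceeded -- simple, but blind to the servers' live capacity."""
    def __init__(self, limit_per_second):
        self.limit = limit_per_second
        self.window_start = time.monotonic()
        self.count = 0

    def try_acquire(self):
        now = time.monotonic()
        if now - self.window_start >= 1.0:   # start a new one-second window
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False                         # over the static limit: reject

limiter = StaticRateLimiter(limit_per_second=1000)
accepted = sum(limiter.try_acquire() for _ in range(1500))
```

Here 1500 requests arrive in one burst and exactly 1000 are admitted; the remaining 500 are dropped regardless of how the backend is actually doing, which is precisely the drawback described above.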

Therefore, we need a load-balancing algorithm with adaptive capability, so as to schedule traffic more sensibly, safeguard stability, and pursue extreme performance under large traffic peaks such as major promotional events.
 

The problem in the Middleware Performance Challenge

Below we use the preliminary-round scenario of the Fifth Middleware Performance Challenge to explore the basic ideas behind designing and implementing adaptive load balancing.

The Challenge scenario consists of a load-generation program (Alibaba Cloud Performance Testing Service, PTS), a service caller (Consumer), and three service providers (Provider) of different hardware specifications. During evaluation, each program is deployed on a separate physical machine, so that competition for CPU and network resources does not cause jitter in the programs that would affect the final evaluation results.

The benchmarker sends requests to the Consumer. Upon receiving a request, the Consumer selects one of the three Providers, which differ in physical specification, service response time, and maximum concurrency, invokes it, and returns the result. This Provider-selection step is the load-balancing algorithm to be implemented in the Challenge.

To simplify deployment and improve performance, the Challenge does not use a service registration and discovery mechanism. The URLs of the three Providers are configured directly in the Consumer, so during development and testing contestants can reach each Provider directly via hostnames such as provider-small.
 

Analyzing the problem

The problem statement is very simple: if we ignore the option of the Consumer rejecting requests outright, the scenario reduces to a 3-choose-1 decision. How to make that decision well is the difficult and important part that the Challenge examines.

The organizers provide Random as the default algorithm: pick one of the 3 Providers uniformly at random. For a single-dispatcher (in this contest, the Consumer), homogeneous system, Random asymptotically balances the load: the total number of requests each Provider receives converges to the same value. However, for multiple dispatchers or a heterogeneous system, the Random algorithm lacks global state and cannot guarantee global randomness; in extreme cases, several dispatchers may route requests to the same Provider simultaneously, overloading it and creating a risk of service downtime. In a heterogeneous system, the actual capacity of each Provider differs, so even if every Provider receives requests at the same rate, some will be idle while others are stable or overloaded; the optimal traffic distribution is not achieved, and response times are not minimized. Clearly, Random is not adaptive and does not meet the requirements of the problem.
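A minimal version of the default Random strategy makes the mismatch concrete (a Python sketch, not the official implementation; the 1:2:3 capacities mirror the small/medium/large ratio from the problem): every Provider receives roughly one third of the traffic regardless of its capacity.

```python
import random

# Relative capacities of the three Providers (1:2:3, per the problem).
PROVIDERS = {"provider-small": 1, "provider-medium": 2, "provider-large": 3}

def random_select(providers):
    """The baseline algorithm: pick any Provider uniformly at random."""
    return random.choice(list(providers))

random.seed(42)
counts = {name: 0 for name in PROVIDERS}
for _ in range(60_000):
    counts[random_select(PROVIDERS)] += 1

# Each Provider ends up with ~20_000 requests, so provider-small
# (capacity 1) carries the same load as provider-large (capacity 3).
```

Under this distribution the small Provider is pushed toward overload while the large one sits partly idle, which is exactly the heterogeneous-system failure mode described above.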

So how can adaptive load balancing be achieved? Next, using the conditions given in the problem, we walk through the design of such an algorithm step by step.

The first problem an adaptive algorithm must solve is how to evaluate service capacity.

In the contest, Providers are divided by hardware specification into small, medium, and large, with CPU and memory in the ratio 1:2:3. During evaluation, the processing capability of each Provider changes dynamically, mainly through changes in single-request response time and the allowed maximum concurrency. When the request rate from the Consumer is too high, new requests on the Provider side queue up for processing; when the number of working threads plus the number of queued requests reaches the maximum thread count, the Provider's thread pool is exhausted and it returns an error. When implementing and tuning the algorithm, you should try to avoid thread-pool exhaustion and reduce queuing. Combining the program's behavior with the hardware limits, distinguishing the bottleneck at each stage, and producing a realistic capacity estimate is the first difficulty of the problem. For the parameters used in this problem and the way they change, existing theory and practice alone are unlikely to achieve the best result, so contestants need to fully understand the problem statement and the parameter configuration, and design an algorithm well suited to the actual scenario.
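One common way to estimate capacity from the signals the problem exposes (response time and thread-pool errors) is an exponentially weighted moving average of each Provider's observed latency: a rising estimate signals queuing before errors appear. This is a generic sketch in Python, not the contest's reference solution, and the smoothing factor is an assumed tuning knob.

```python
class LatencyEstimator:
    """Per-Provider EWMA of response time: recent samples dominate, so
    the estimate rises quickly when a Provider starts to queue."""
    def __init__(self, alpha=0.2):
        self.alpha = alpha      # smoothing factor (assumed value; tune it)
        self.ewma_ms = None

    def observe(self, latency_ms):
        if self.ewma_ms is None:
            self.ewma_ms = float(latency_ms)
        else:
            self.ewma_ms = (self.alpha * latency_ms
                            + (1 - self.alpha) * self.ewma_ms)
        return self.ewma_ms

est = LatencyEstimator()
for sample in [50, 50, 50, 200, 200]:   # latency jumps as queuing begins
    est.observe(sample)
```

After the two slow samples the estimate has climbed from 50 ms to about 104 ms; a selector can treat that climb as a drop in the Provider's effective capacity and shift traffic away before the thread pool is exhausted.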

The second issue to consider is how to apply the results of the capacity assessment: what state should be maintained to represent each Provider's service capability, and how should the selection stage make decisions based on that state?

In the conventional single-dispatcher load-balancing model, one dispatcher maintains the state of all Providers; in a homogeneous system this approach can achieve asymptotically optimal load balancing. But its problems are also obvious: the single dispatcher is a natural performance bottleneck with poor scalability, and as the number of Providers grows, the state the dispatcher must maintain and the communication cost grow with it. To reduce the difficulty, the Challenge does not pose a multi-dispatcher model; yet multiple dispatchers with multiple Providers is the most common situation in real production environments of microservice frameworks such as Dubbo. An algorithm that is both high-performance and scalable would therefore be a nice bonus.
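With per-Provider capacity estimates in hand, the selection stage can be as simple as capacity-weighted random choice (a Python sketch; the fixed 1:2:3 weights stand in for whatever live estimates the algorithm actually maintains): Providers then receive traffic in proportion to estimated capacity rather than uniformly.

```python
import random

def weighted_select(weights):
    """Pick a Provider with probability proportional to its weight,
    e.g. a live capacity estimate for small/medium/large."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for name, w in weights.items():
        r -= w
        if r <= 0:
            return name
    return name  # guard against floating-point rounding at the boundary

random.seed(7)
weights = {"provider-small": 1, "provider-medium": 2, "provider-large": 3}
counts = {name: 0 for name in weights}
for _ in range(60_000):
    counts[weighted_select(weights)] += 1

# Traffic now follows the 1:2:3 capacity ratio instead of 1:1:1.
```

Because the weights are re-read on every call, updating them from the capacity estimator is enough to make the distribution track capacity changes; no other coordination is needed.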

The third point is the use of the auxiliary interfaces. So as not to constrain algorithm design, the problem provides several auxiliary interfaces that may be used, including support for two-way communication and Provider-side rate limiting. These interfaces are optional; whether to use them depends on the algorithm's implementation.

In the evaluation environment, the request rate can at times exceed the processing rate of any single Provider, and the combined processing rate of the three Providers floats above and below the overall request rate. The final score consists of the total number of successful requests and the maximum TPS; failed requests are not counted. This constraint can be read in two ways: first, to keep the service from being heavily overloaded, requests can be rejected when appropriate; second, to get the best performance out of each Provider, each should be given a reasonable number of requests, and a suitable degree of overload is allowed.
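The first reading, protecting a Provider by rejecting excess work, can be sketched as a cap on in-flight requests (illustrative Python; the limit would come from the capacity estimate, and in the actual Java contest code a semaphore or atomic counter would play this role):

```python
class InflightLimiter:
    """Caps concurrent in-flight requests to one Provider; callers that
    fail to acquire a slot should fail fast instead of queuing."""
    def __init__(self, max_inflight):
        self.max_inflight = max_inflight
        self.inflight = 0

    def acquire(self):
        if self.inflight >= self.max_inflight:
            return False        # at capacity: reject rather than queue
        self.inflight += 1
        return True

    def release(self):
        self.inflight -= 1      # call when the Provider's response arrives

limiter = InflightLimiter(max_inflight=3)
results = [limiter.acquire() for _ in range(5)]   # only 3 slots available
limiter.release()                                 # one request completes
late = limiter.acquire()                          # a freed slot is reusable
```

This single-threaded sketch ignores synchronization; a production version would use a thread-safe primitive, and the cap itself should track the live capacity estimate rather than stay fixed.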

The above covers only the main algorithm-design ideas. For a load-balancing algorithm, excellent engineering implementation is also crucial: choose appropriate data structures, make full use of memory and CPU, and squeeze every bit of performance out of the contest environment. Of course, the evaluation score is not everything; good code structure, coding style, and generality also account for a large share of the final preliminary-round results.
 

The evaluation environment and process

The evaluation environment consists of one 4-core 8 GB load-generation machine, one 4-core 8 GB gateway machine, and three 4-core 8 GB Provider machines. The Consumer and Provider programs run with limited CPU and memory, and each evaluation task has exclusive use of the five machines. An evaluation run proceeds as follows:

  • Prepare the evaluation environment; create and lock the workspace;
  • Pull the submitted code from the Git repository address;
  • Build the code and generate the final executable fat JAR;
  • Start the three Providers and verify service availability;
  • Start the Consumer and verify service availability;
  • Warm up the system for 30 seconds;
  • Run the formal evaluation for one minute;
  • Take the total number of successful requests and the maximum TPS of the formal evaluation as the final score, and report it to the Tianchi system;
  • Stop the Consumer and the three Providers in order;
  • Clean up the Docker instances and images;
  • Collect the logs and upload them to OSS;
  • Unlock the workspace and clean up the environment.
     

Summary

This article took the problem of the Fifth Middleware Performance Challenge as its background and, from the perspectives of the problem scenario, problem analysis, and the evaluation environment and process, introduced the basic design ideas of an adaptive load-balancing algorithm. We hope it helps the students taking part in the competition, and we also welcome more students to sign up for the technical challenge and share your thinking and practice on the algorithm.


Original link
This article is original content from the Yunqi community and may not be reproduced without permission.


Origin blog.csdn.net/weixin_43970890/article/details/91978261