What are the common strategies for load balancing?

Analysis & Answers


Round Robin

This method distributes incoming requests in turn to each active server in the cluster. When it is used, all servers assigned to the virtual service should have similar resource capacity and run the same application. If the servers have the same or similar performance, round robin keeps their loads roughly equal, and under that premise it is a simple and efficient way to distribute requests. If the servers differ in capacity, however, a weaker server still receives its turn in every cycle, even when it can no longer keep up with the requests it already has. This can overload the less capable servers.
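
The rotation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular load balancer; the server names are placeholders.

```python
from itertools import cycle

def round_robin(servers):
    """Return an endless iterator that hands out servers in a fixed rotation."""
    return cycle(servers)

picker = round_robin(["A", "B", "C"])  # placeholder server names
assignments = [next(picker) for _ in range(6)]
# Each server gets every third request, regardless of its capacity or load.
```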

Weighted Round Robin

This algorithm addresses the shortcoming of simple round robin: incoming requests are still distributed to the servers in order, but according to a weight assigned to each server in advance. The administrator defines each server's weight according to its processing capacity. For example, the most capable server, A, is given a weight of 100, while the least capable server, B, is given a weight of 50. Server A then receives 2 consecutive requests before server B receives its first request, and so on.
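
The 100 versus 50 example can be reproduced with a small sketch. It assumes the simplest possible scheme: the weights are reduced by their greatest common divisor and each server is repeated that many times in the rotation. Real load balancers may interleave the turns differently.

```python
from functools import reduce
from itertools import cycle
from math import gcd

def weighted_round_robin(weights):
    """Build a rotation in which each server appears in proportion to its weight."""
    divisor = reduce(gcd, weights.values())  # 100 and 50 reduce to 2 and 1
    schedule = [name for name, w in weights.items() for _ in range(w // divisor)]
    return cycle(schedule)

picker = weighted_round_robin({"A": 100, "B": 50})
first_six = [next(picker) for _ in range(6)]
# A is served twice in a row before B gets its turn.
```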

Least Connection

Neither of the above methods takes into account how many connections each server is holding at a given time. Server B may receive fewer connections than server A and still be overloaded, because the users on server B keep their connections open longer. In other words the number of open connections, and with it the load on the server, accumulates over time. The "least connections" algorithm avoids this problem: incoming requests are distributed based on the number of connections currently open to each server, so the server with the fewest active connections automatically receives the next incoming request. The same premise applies as for simple round robin: all servers in the virtual service should have similar resource capacity. Note that in a low-traffic environment the traffic will not be evenly spread across the servers, and the first server will be favoured. If all servers have the same number of connections, the first server is chosen; unless there is continuous active traffic to the first server, it therefore keeps being preferred.
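
A minimal sketch of the selection rule follows, assuming the load balancer keeps a per-server counter of open connections (the `active` dictionary here is a stand-in for that bookkeeping). It also shows the low-traffic effect: when every counter is equal, the first server wins the tie.

```python
def pick_least_connections(active):
    """Return the server with the fewest open connections.

    Ties go to the server listed first, because min() keeps the first minimum.
    """
    return min(active, key=active.get)

active = {"A": 0, "B": 0, "C": 0}  # open connections per server

first = pick_least_connections(active)   # all tied, so the first server
active[first] += 1                       # the connection stays open
second = pick_least_connections(active)  # A is busy, so the next idle server
active[second] += 1

active[first] -= 1                       # A's connection closes before the next request
third = pick_least_connections(active)   # A is idle again and is preferred over C
```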

Source IP Hash

This method computes a hash of the request's source IP address and uses the hash value to select the real server. The same client host is therefore always mapped to the same server, and the load balancer does not need to store any source IP addresses to achieve this. Note, however, that this method may lead to an unbalanced server load.
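
As an illustration, the mapping can be sketched as a hash taken modulo the number of servers. The choice of SHA-256 and the server names are assumptions made for this example; an actual load balancer may use a different hash function.

```python
import hashlib

def pick_by_source_ip(ip, servers):
    """Map a client IP to a server using a stable hash of the address."""
    digest = hashlib.sha256(ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["s1", "s2", "s3"]  # placeholder server names
chosen = pick_by_source_ip("203.0.113.7", servers)
again = pick_by_source_ip("203.0.113.7", servers)
# No per-client state is stored: the same address always hashes to the same index.
```

Because the result depends only on the address, many clients behind one NAT address all land on the same server, which is one way the imbalance mentioned above can arise.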

Least Connection Slow Start Time

For the least connection and weighted least connection scheduling methods, a slow-start period can be configured for a server that has just come online. During this period the number of connections it receives is limited and increases gradually. This gives the server a "transition time" and ensures that it is not overloaded by too many connections right after startup. This value is set in the L7 configuration interface.
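
The text does not say how the limit grows during the slow-start period. The sketch below assumes a linear ramp, purely for illustration; the function name and parameters are hypothetical.

```python
def slow_start_limit(seconds_online, slow_start_seconds, max_connections):
    """Connection cap for a server that came online `seconds_online` seconds ago.

    Assumption: the cap grows linearly from 1 up to max_connections.
    """
    if slow_start_seconds <= 0 or seconds_online >= slow_start_seconds:
        return max_connections
    ramp = max(seconds_online, 0) / slow_start_seconds
    return max(1, int(max_connections * ramp))

at_start = slow_start_limit(0, 60, 100)   # just joined: minimal cap
halfway = slow_start_limit(30, 60, 100)   # half of the period has passed
finished = slow_start_limit(60, 60, 100)  # period over: full cap
```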

Weighted Least Connection

If the servers differ in resource capacity, the "weighted least connections" method is more appropriate. It combines the number of active connections with weights that the administrator assigns according to each server's capacity, and so takes advantage of both least connections and weighting. This generally gives a very balanced and fair utilization of the servers: the ratio of connection count to weight is computed for each server, and the server with the lowest ratio in the cluster automatically receives the next request. Note, however, that in low-traffic situations the considerations described under "Least Connection" also apply to this method.
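
The ratio rule can be sketched as follows. The connection counts and weights are made-up numbers for the example.

```python
def pick_weighted_least_connections(active, weights):
    """Return the server with the lowest ratio of open connections to weight."""
    return min(active, key=lambda name: active[name] / weights[name])

active = {"A": 10, "B": 6}      # open connections per server (example values)
weights = {"A": 100, "B": 50}   # administrator-assigned weights (example values)

chosen = pick_weighted_least_connections(active, weights)
# A: 10 / 100 = 0.10, B: 6 / 50 = 0.12, so A is chosen although it holds more connections.
```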

Agent Based Adaptive Balancing

In addition to the above methods, the load balancer contains adaptive logic that periodically monitors the status of the servers and adjusts their weights. With the very powerful "agent-based adaptive balancing" method, the load balancer checks the load of all servers at regular intervals as follows: each server must provide a file containing a number from 0 to 99 that indicates its actual load (0 = idle, 99 = overloaded, 101 = failed, 102 = administratively disabled), and the load balancer retrieves this file with an HTTP GET request. It is the job of each server in the cluster to report its own load status in this file; there are no restrictions on how the server calculates it.

Depending on the overall load of the servers, one of two strategies is used. In normal operation, the scheduling algorithm calculates a weight ratio from the collected load values and the number of connections allocated to each server, so if a server is overloaded the system adjusts the weights transparently. As with weighted round robin, an uneven distribution can be corrected by assigning different weights to the servers.

In a very low-traffic environment, however, the load values reported by the servers do not form a representative sample, and distributing load based on them leads to loss of control and oscillating scheduling decisions. In this case it is more reasonable to distribute load according to static weight ratios. When the load of all servers is below a lower limit defined by the administrator, the load balancer automatically switches to weighted round robin; when the load rises above that limit, it switches back to the adaptive mode.
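
The status codes and the mode switch can be sketched as below. The text does not give the exact weight formula, so `adaptive_weights` uses an assumed rule (static weight scaled by the remaining headroom) only to make the idea concrete; all function names are hypothetical.

```python
FAILED, DISABLED = 101, 102  # special values from the reported load file

def is_available(load):
    """A server is usable only if it reports a load between 0 and 99."""
    return 0 <= load <= 99

def choose_mode(loads, lower_limit):
    """Use static weighted round robin while every usable server is below the limit."""
    usable = [load for load in loads.values() if is_available(load)]
    if usable and all(load < lower_limit for load in usable):
        return "weighted_round_robin"
    return "adaptive"

def adaptive_weights(loads, static_weights):
    """Assumed rule: scale each static weight by the server's free capacity."""
    return {
        name: static_weights[name] * (100 - load) / 100
        for name, load in loads.items()
        if is_available(load)  # failed or disabled servers get no weight
    }

quiet = choose_mode({"A": 5, "B": 10}, lower_limit=20)
busy = choose_mode({"A": 5, "B": 80}, lower_limit=20)
weights = adaptive_weights({"A": 50, "B": FAILED}, {"A": 100, "B": 50})
```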

Fixed Weighted

Only the server with the highest weight is used, provided the other servers are given lower weight values. If the server with the highest weight goes down, the server with the next highest priority serves the clients. With this method, the weight of each real server must be configured according to its priority.
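
A minimal sketch of this failover behaviour, assuming the load balancer knows which servers are currently healthy (the `healthy` set is a stand-in for its health checks):

```python
def pick_fixed_weighted(weights, healthy):
    """Return the healthy server with the highest weight, or None if none is healthy."""
    candidates = [name for name in weights if name in healthy]
    if not candidates:
        return None
    return max(candidates, key=weights.get)

weights = {"primary": 100, "standby": 50, "spare": 10}  # example priorities

normal = pick_fixed_weighted(weights, healthy={"primary", "standby", "spare"})
failover = pick_fixed_weighted(weights, healthy={"standby", "spare"})
```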

Weighted Response

Traffic is scheduled by weighted round robin, with the weights calculated from the response times of the server health checks. Each health check is timed to record how long the server took to respond successfully. Note that this method assumes the health-check response time reflects the speed of the machine, which may not always be true. The response times of all servers in the virtual service are added together, and this sum is used to calculate the weight of each individual real server; the weights are recalculated approximately every 15 seconds.
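
The text says the sum of the response times is used to derive each weight but does not give the formula. The sketch below assumes the weight is the sum divided by the server's own response time, so that faster servers get larger weights; treat it as one plausible reading rather than the actual calculation.

```python
def response_weights(response_times_ms):
    """Assumed rule: weight = total response time / this server's response time."""
    total = sum(response_times_ms.values())
    return {name: total / t for name, t in response_times_ms.items()}

# Example health-check timings in milliseconds (made-up values).
weights = response_weights({"A": 100, "B": 300})
# A answered three times faster than B, so it gets three times B's weight.
```

The resulting weights would then feed a weighted round-robin scheduler and be refreshed on each recalculation interval.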

Reflect & Expand

5 types of load balancing used in Dubbo:

  1. RandomLoadBalance is a relatively simple load balancing strategy and is the default one used by Dubbo. It distributes requests by weighted random selection.
  2. LeastActiveLoadBalance gives priority to the service provider with the lowest active count. The active count is simply the number of calls to that provider currently in progress; the counts are stored in a ConcurrentHashMap, from which the smallest is looked up. If several providers share the smallest active count, one of them is chosen by the weighted random method above.
  3. ConsistentHashLoadBalance uses consistent hashing to map a request to a specific provider based on the request parameters. To prevent load from skewing toward a few providers, virtual nodes are added to the hash ring.
  4. RoundRobinLoadBalance is a weighted round-robin algorithm. Each provider keeps a static weight and a dynamic weight that changes continuously. On each call the provider with the largest dynamic weight is selected, and the total weight is then subtracted from its dynamic weight. For the next call, each dynamic weight is recomputed as [original weight] + [dynamic weight]. After a provider is selected its dynamic weight drops, while the dynamic weights of the providers that were not selected gradually rise, which gives round-robin behaviour in proportion to the weights.
  5. ShortestResponseLoadBalance calls the provider with the shortest response time. If there are several, one is selected by weighted random. Like LeastActiveLoadBalance, it reads its statistics from a ConcurrentHashMap, computes the average of previous response times, and compares them.
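
The dynamic-weight scheme described for RoundRobinLoadBalance can be traced with a short Python sketch. This follows the description in the list, not Dubbo's Java source, and the provider names and weights are example values.

```python
def smooth_weighted_round_robin(weights, rounds):
    """Simulate `rounds` selections using static and dynamic weights."""
    total = sum(weights.values())
    current = {name: 0 for name in weights}  # dynamic weights start at zero
    picks = []
    for _ in range(rounds):
        # Dynamic weight = previous dynamic weight + original weight.
        for name, weight in weights.items():
            current[name] += weight
        # Select the provider with the largest dynamic weight (first one on a tie).
        chosen = max(current, key=current.get)
        # Subtract the total weight from the selected provider's dynamic weight.
        current[chosen] -= total
        picks.append(chosen)
    return picks

picks = smooth_weighted_round_robin({"A": 5, "B": 1, "C": 1}, rounds=7)
# picks == ["A", "A", "B", "A", "C", "A", "A"]
```

Over one full cycle of 7 calls, A is chosen 5 times and B and C once each, but the lighter providers are interleaved rather than served only after A's five turns.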

Origin blog.csdn.net/jjclove/article/details/124924011