Analysis of Load Balancing

Foreword

Load balancing comes up constantly in our work, because every layer of a system's request path uses it: the access layer, the service layer, the data layer, and of course MQ, distributed caches, and so on all embed some load balancing ideas. A short definition: load balancing distributes requests across multiple operation units for execution. It is essentially a divide-and-conquer idea, and under high concurrency it is a very effective one.
From the brief definition above we can pick out two things: request distribution and the operation unit. This is the familiar controller + executor pattern, or Master + Worker pattern. Of course, a mature load balancer offers more than these two core capabilities; let's look at the core functions one by one:

Operating unit configuration

The operation unit here is the upstream server, the real executor that handles the business. It needs to be configurable (ideally dynamically) so that users can add and remove operation units; these units are the objects the load balancer distributes requests to.

Load balancing algorithm

Since requests need to be distributed, we need an algorithm that decides which configured executor receives each request, such as the common round-robin, random, and consistent hashing strategies; a small sketch follows.
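To make this concrete, here is a minimal sketch (in Java, not tied to any particular framework) of a thread-safe round-robin selector over a configurable server list; the class and method names are illustrative:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin load balancer: the server list is the
// "operation unit configuration", pick() is the balancing algorithm.
public class RoundRobinBalancer {
    private final List<String> servers = new CopyOnWriteArrayList<>();
    private final AtomicInteger cursor = new AtomicInteger();

    public void add(String server)    { servers.add(server); }    // dynamic add
    public void remove(String server) { servers.remove(server); } // dynamic remove

    public String pick() {
        Object[] snap = servers.toArray(); // consistent snapshot of the current list
        if (snap.length == 0) throw new IllegalStateException("no servers configured");
        return (String) snap[Math.floorMod(cursor.getAndIncrement(), snap.length)];
    }
}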

Failure retry

With multiple execution units configured, an individual server going down is a matter of when, not if. So when a request is dispatched to a server that turns out to be down, a failure retry mechanism should redistribute the request to a healthy executor, as sketched below.
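Continuing the sketch above, a rough illustration of failure retry: on an error, simply pick again and try the next unit. The call() method here is a hypothetical stand-in for the real remote invocation:

import java.util.concurrent.ThreadLocalRandom;

// Failure-retry sketch on top of RoundRobinBalancer above: on failure,
// try up to maxRetries further servers before giving up.
public class RetryingCaller {
    public static String callWithRetry(RoundRobinBalancer lb, int maxRetries) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String server = lb.pick(); // round-robin naturally advances past the bad node
            try {
                return call(server);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next unit
            }
        }
        throw last;
    }

    // Stand-in for a real HTTP/RPC call; fails randomly for demo purposes.
    static String call(String server) {
        if (ThreadLocalRandom.current().nextBoolean())
            throw new RuntimeException("server " + server + " is down");
        return "ok from " + server;
    }
}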

Health check

Failure retry only discovers that a server is down when a request is actually forwarded to it; it is a lazy strategy. A health check instead excludes down machines in advance, most commonly by using a heartbeat to probe whether each executor is still alive.
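A minimal sketch of the heartbeat idea, assuming a plain TCP connect counts as one probe; a real health checker wraps this in a scheduler with rise/fall counters like those shown in the Nginx section below:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal TCP heartbeat probe: a checker would mark a unit down after
// `fall` consecutive failures and up again after `rise` consecutive successes.
public class TcpHealthCheck {
    public static boolean probe(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // TCP connect succeeded: consider the unit alive
        } catch (IOException e) {
            return false; // connect failed or timed out: count as one failure
        }
    }
}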
With the core functions above, you roughly have a load balancer. These principles appear in many places, either as dedicated middleware or embedded inside other middleware: LVS, F5, and Nginx at the access layer; the various RPC frameworks at the service layer; message queues such as RocketMQ and Kafka; distributed caches such as Redis and memcached; database middleware such as ShardingSphere and mycat. This divide-and-conquer idea is used throughout. Below we analyze how some common middleware do load balancing, divided roughly into two kinds: stateless and stateful.

Stateless

When the execution unit itself holds no state, load balancing is easier, because the execution units are interchangeable. Common stateless middleware includes Nginx, RPC frameworks, distributed schedulers, and so on.

Access layer

Nginx is probably our most common access-layer middleware, providing layer-4 to layer-7 load balancing with high-performance forwarding, and it supports all the core functions above.

Operating unit configuration

Nginx provides a simple static operation unit configuration, as follows:

upstream tomcatTest {
     server 127.0.0.1:8081;   #tomcat-8081
     server 127.0.0.1:8082;   #tomcat-8082
}
location / {
     proxy_pass http://tomcatTest;
}

The above configuration is static: adding or removing a server requires restarting Nginx, which is very inconvenient. Dynamic unit configuration is also possible with the help of a third-party service registry such as Consul or etcd; the principle is roughly as follows:

Each operation unit registers itself in Consul on startup and is removed from Consul when it goes down; on the Nginx side, a consul-template process watches for changes to the operation units in Consul, rewrites Nginx's upstream configuration accordingly, and then reloads Nginx.

Load balancing algorithm

Common algorithms include ip_hash, round-robin (the default), and hash; the configuration is also very simple:

upstream tomcatTest {
     ip_hash;                 # balance by client IP, commonly called IP binding
     server 127.0.0.1:8081;   #tomcat-8081
     server 127.0.0.1:8082;   #tomcat-8082
}

Failure retry

upstream tomcatTest {
     server 127.0.0.1:8081 max_fails=2 fail_timeout=20s;
}
location / {
     proxy_pass http://tomcatTest;
     proxy_next_upstream error timeout http_500;
}

When a server fails max_fails times within fail_timeout, it is considered unavailable for that window; proxy_next_upstream lists the conditions (here: connection errors, timeouts, and HTTP 500 responses) under which the request is retried on the next execution unit.

Health check

Nginx performs health checks via the third-party nginx_upstream_check_module module, which supports both TCP and HTTP heartbeat probes:

upstream tomcatTest {
     server 127.0.0.1:8081;
     check interval=3000 rise=2 fall=5 timeout=5000 type=tcp;
}

interval: interval between probes, in milliseconds;
rise: number of consecutive successful probes after which the unit is marked available;
fall: number of consecutive failed probes after which the unit is marked unavailable;
timeout: probe request timeout, in milliseconds;
type: probe type, tcp or http.

Service layer

The service layer mainly means microservice frameworks such as Dubbo and Spring Cloud, which have load balancing strategies built in and are very convenient to use.

Operating unit configuration

RPC frameworks generally rely on a registry component; just like Nginx's dynamic configuration, the operating units are changed through the registry, except the RPC framework depends on the registry out of the box. A service registers itself on startup and is removed when it becomes unavailable, and the change is automatically pushed to consumers, completely transparently to the user. All the consumer side has to do is apply a distribution algorithm over the service list the registry provides, as sketched below.
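Conceptually, the consumer side looks something like this sketch; RegistryClient is a hypothetical interface standing in for a real registry SDK such as a ZooKeeper or Nacos client:

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Conceptual consumer-side balancing over a registry-provided provider list.
public class ConsumerSideBalancer {
    // Hypothetical registry client: returns the current live providers of a service.
    interface RegistryClient { List<String> currentProviders(String service); }

    private final RegistryClient registry;

    public ConsumerSideBalancer(RegistryClient registry) { this.registry = registry; }

    // Random strategy: pick any live provider of the service.
    public String pick(String service) {
        List<String> providers = registry.currentProviders(service);
        if (providers.isEmpty()) throw new IllegalStateException("no providers for " + service);
        return providers.get(ThreadLocalRandom.current().nextInt(providers.size()));
    }
}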

Load balancing algorithm

Spring Cloud provides the Ribbon component for load balancing, while Dubbo builds balancing strategies in directly; common algorithms include round-robin, random, least active calls, consistent hash, etc. For example, configuring the round-robin algorithm in Dubbo:
<dubbo:reference interface="" loadbalance="roundrobin" />
And configuring a random rule for Ribbon:

import com.netflix.loadbalancer.IRule;
import com.netflix.loadbalancer.RandomRule;

// Declaring this bean overrides Ribbon's default rule for the client.
@Bean
public IRule loadBalancer(){
    return new RandomRule();
}

Failure retry

For an RPC framework this is really a fault-tolerance mechanism. Dubbo, for example, has several built-in cluster fault-tolerance strategies: Failover, Failfast, Failsafe, Failback, Forking, and Broadcast. The default is Failover, automatic failover: when a call fails, retry another server. Configuring fault tolerance is also very simple:

<dubbo:reference cluster="failback" retries="2"/>

Health check

Registries generally have a health check function built in: they check in real time whether servers are available, remove those that are not, and push the update to consumers, all completely transparently to the user.

Distributed scheduling

Distributed schedulers separate the scheduler from the executors. The executors are likewise exposed to the scheduler through a registry, and the scheduler balances work across them; the process is much the same as above, so we won't walk through each case.

Notice that since registries became common, stateless load balancing has become straightforward: execution units can be added and removed dynamically through the registry, which makes scaling out and in very convenient.
Stateful

Stateful execution units are harder to balance than stateless ones, because each node's state is part of the overall system; nodes cannot be added or removed at will. Common stateful middleware includes message queues, distributed caches, and database middleware.

Message queues

High-throughput, high-performance message queues such as RocketMQ and Kafka are now mainstream, with strong horizontal scaling ability. RocketMQ introduces the Message Queue mechanism and Kafka introduces partitions: one topic maps to multiple queues/partitions, using the divide-and-conquer idea to improve throughput and performance.
Operating unit configuration

The operating unit in a message queue is the partition, or Message Queue. RocketMQ, for example, lets you modify the number of read and write queues dynamically, and its rocketmq-console web console lets you change them directly.

Load balancing algorithm

Message queues have a producer side and a consumer side. By default the producer sends messages to each Message Queue in turn, and the strategy can be customized by implementing MessageQueueSelector (see the sketch below). Consumer-side queue allocation strategies include averaging (the default), circular averaging, manual configuration, designated machine room, nearest machine room, and consistent hash modes.
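For example, a minimal sketch of a custom producer-side strategy using RocketMQ's MessageQueueSelector interface, routing all messages with the same order ID to the same queue (producer and topic setup omitted):

import java.util.List;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.MessageQueueSelector;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageQueue;

// Route all messages for the same order ID to the same Message Queue,
// replacing the default round-robin send so per-order ordering is preserved.
public class OrderlySend {
    public static void send(DefaultMQProducer producer, Message msg, long orderId) throws Exception {
        producer.send(msg, new MessageQueueSelector() {
            @Override
            public MessageQueue select(List<MessageQueue> mqs, Message m, Object arg) {
                long id = (Long) arg;
                return mqs.get((int) (id % mqs.size())); // stable queue per order
            }
        }, orderId);
    }
}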

Failure retry

A stateful execution unit cannot simply be removed when it goes down; data integrity must be preserved. The usual approach is master-slave replication, with a slave taking over when the master fails. Taking RocketMQ as an example, each master broker can have its own slave; the slave exists mainly to keep the data safe and available, so consumers can still read messages from it, but it does not accept new messages.

Health check

Message queues also have a core coordination component, which can be thought of as a registry: Kafka uses ZooKeeper and RocketMQ uses NameServer. It stores the routing metadata, such as which Message Queues belong to which topic; when a broker is found to be unavailable, producers are notified in much the same way a registry notifies its consumers.

Distributed caches

Common distributed caches include Redis and memcached. To hold more data they are generally sharded, and there are several ways to shard; with Redis, for example, there is client-side sharding, proxy-based sharding, and the official Redis Cluster solution.

Operating unit configuration

Although a cache is stateful, it has its particularities: it cares most about hit rate and can usually tolerate some data loss. For example, the proxy-based sharding middleware codis is completely transparent to the client, so Redis instances can be added or removed without affecting the service.

Load balancing algorithm

On the premise of preserving the hit rate, proxy-based sharding generally uses a consistent hash algorithm, while the official Redis Cluster solution has 16384 hash slots built in and maps a key to a slot by simple modulo: CRC16(key) mod 16384, as sketched below.
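For reference, a sketch of Cluster's slot mapping; Redis uses the CRC16/XMODEM variant, and the hash-tag refinement is noted but omitted:

import java.nio.charset.StandardCharsets;

// Redis Cluster slot mapping sketch: slot = CRC16(key) mod 16384.
// Real Redis also honors hash tags: if the key contains {...}, only the
// part inside the braces is hashed; that refinement is omitted here.
public class ClusterSlot {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000.
    static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}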

Failure retry

Stateful shards generally have replicas: when the master shard goes down, a replica takes over, implementing failover. Redis's sentinel mode works this way, as do middleware such as codis with built-in support. There is no need to shift traffic to other shards, and to users this kind of takeover is completely imperceptible.

Health check

Take Redis as an example: in sentinel mode, sentinels monitor nodes in real time via heartbeats and trigger failover once a node is objectively marked down. Notice that health checks are almost always heartbeat-based.

Database layer

Load balancing at the database layer is the most complex case: it is stateful, and on top of that data safety is paramount. Common database middleware includes mycat, shardingjdbc, and so on.

Operating unit configuration

Take sharding as an example: the operation unit here is a sharded data table. Data volume often exceeds expectations, so rather than fixing the number of shards up front, it is best to let the load algorithm generate the tables automatically, and to evaluate the chosen algorithm carefully in advance, because changing it later is difficult.

Load balancing algorithm

Taking mycat as an example, it provides a variety of sharding algorithms: range, modulo, sharding by date, hash, consistent hash, sharding by enumeration, etc. For example, the following configuration shards by date:

<tableRule name="sharding-by-date">
    <rule>
        <columns>create_time</columns>
        <algorithm>sharding-by-date</algorithm>
    </rule>
</tableRule>
<function name="sharding-by-date" class="io.mycat.route.function.PartitionByDate">
    <property name="dateFormat">yyyy-MM-dd</property>
    <property name="sBeginDate">2021-01-01</property>
    <property name="sEndDate">2051-01-01</property>
    <property name="sPartionDay">10</property>
</function>

The configuration specifies the start date, end date, and number of days per partition. Because data arrives continuously over time, this scheme scales well; with a modulo scheme, by contrast, the shard count must be chosen carefully up front, because changing it later is very troublesome, unlike a cache, which can rely on consistent hashing to preserve the hit rate.
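As a rough illustration of the idea (not mycat's actual implementation), the shard index for a row is the number of days since sBeginDate divided by sPartionDay:

import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Sketch of the date-range sharding configured above: records are grouped
// into 10-day buckets counted from the begin date; each bucket is a table shard.
public class DateShard {
    static final LocalDate BEGIN = LocalDate.parse("2021-01-01"); // sBeginDate
    static final int PARTITION_DAYS = 10;                         // sPartionDay

    public static int shardIndex(LocalDate createTime) {
        long days = ChronoUnit.DAYS.between(BEGIN, createTime);
        if (days < 0) throw new IllegalArgumentException("before begin date");
        return (int) (days / PARTITION_DAYS); // e.g. 2021-01-15 -> shard 1
    }
}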

Failure retry

For stateful nodes, a standby database is indispensable. mycat, for example, provides automatic master-slave switching: when the master goes down it switches to the slave. The routine is basically the same everywhere, and the data must not be lost.

Health check

Active detection is likewise essential here, generally periodic probing with a heartbeat statement, followed by master-slave switching on failure.

Those were three common kinds of stateful middleware. Although they are all stateful, the nature of the state differs (transient vs. final), so the handling differs greatly as well.

There is actually one more stateful middleware: the registry itself. It supports multiple nodes, but each node holds the full data set; because a registry usually stores only a small amount of data, its balancing strategy can be as simple as in the stateless case.
Summary

In summary, we can see that the divide-and-conquer idea is used widely across software. Faced with big problems, large data volumes, or heavy concurrency, the core move is to split; how to split depends on the business, choosing different splitting or balancing algorithms while ensuring the basic functions described above are in place.

Reference: "2020 latest Java basics and detailed video tutorials and learning routes!

Original link: https://juejin.cn/post/6914911007349407751
