Introduction to Distributed Rate Limiting Algorithms and Implementations

In a distributed system architecture, when facing a sudden surge of highly concurrent requests, how to limit traffic to protect system availability is a problem that deserves attention. There are many ways to implement distributed rate limiting, including gateway-based implementations and middleware-based implementations such as Redis. This article briefly introduces the common algorithms and implementation schemes for rate limiting.


1. Overview of distributed rate limiting

Rate limiting is a technical means of keeping a system stable and available in high-concurrency business scenarios once requests exceed its processing capacity. Examples include the picture verification codes 12306 shows when users grab train tickets and the queuing mechanisms of flash-sale events such as Double Eleven; technology vendors have produced an endless stream of rate limiting schemes. Flow restrictions are also common in real life, such as subway crowd control during rush hours, visitor caps at scenic spots during Golden Week, and queuing for nucleic acid tests at the peak of the epidemic.

In a distributed system facing high-concurrency access with large data volumes, multiple service nodes may work together and each node may handle many requests. A method is therefore needed to control the frequency of access so that excessive requests do not overload or crash the system. The commonly used technique is rate limiting: a restriction on resource access within a certain time window. Distributed rate limiting treats the entire distributed environment as a whole; once the request count or concurrency limit is reached, requests are made to wait, queued, degraded, or denied service.

1.1 Distributed rate limiting rules

Rate limiting usually constrains two dimensions, time and resources: 1) a specific time window, such as per minute or per second; 2) available resources, such as a maximum number of accesses or connections.


In actual usage scenarios, restrictions are based on multiple rules combined. The common rate limiting rules are as follows:

  1. Control based on QPS and connection count: set a limit on the QPS and the number of connections of each node; if a node's request volume exceeds the limit, further requests are rejected.
  2. Control based on transfer rate: limit the transfer rate of each request, and reject a request once its transfer rate exceeds the limit.
  3. Restriction by IP dimension: limit traffic based on the IP address of each request; requests whose IP addresses belong to the same subnet can share the same rate limiting rule.
  4. Restriction by server: limit traffic according to the server each request accesses; requests hitting the same server share the same rate limiting rule.
  5. Blacklist and whitelist restrictions: requests from blacklisted sources are rejected and requests from whitelisted sources are allowed. The blacklist can be set according to actual needs, and the whitelist can come from configuration files or user input.

In actual business scenarios, the rules must be adjusted and optimized for the specific situation so that the distributed rate limiting algorithm meets real needs and improves the performance and availability of the system.

1.2 Distributed rate limiting schemes

The mainstream solutions for distributed rate limiting include limiting on the user client, limiting at the service gateway layer, and limiting based on middleware or dedicated rate limiting components.

  1. Client-side rate limiting: similar to the verification codes of 12306 ticket grabbing, adding verification steps at the front end throttles traffic at the source and greatly reduces the access pressure on the system.
  2. Gateway-layer rate limiting: in a distributed system, the service gateway is the first checkpoint on the distributed call chain and receives all user requests. From the service gateway layer -> backend services -> backend cache -> database layer, the traffic path is funnel-shaped, so traffic control done at the gateway layer greatly reduces the pressure on the backend layers. Common gateway implementations include Nginx and Spring Cloud Gateway.


  3. Middleware-based rate limiting: rate limiting at the gateway layer cannot be controlled by backend business logic, so middleware such as Redis can implement rate limiting at the service layer. Redis's scripting capability decouples the rate limiting module from business functions, while its high-concurrency performance and high-availability architecture yield a stable middleware-layer rate limiting mechanism.
  4. Component-based rate limiting: open source rate limiting components such as Alibaba's Sentinel provide ready-made rate limiting functionality.

2. Common algorithms for distributed rate limiting

Whether rate limiting is implemented at the gateway layer or on top of middleware, the underlying algorithm is usually one of the following:

  1. Counter algorithm: count the requests within the effective time window, incrementing the counter on each call (and, when limiting concurrency, decrementing when the call ends). This can be implemented with Redis's INCR or other counting tools.
  2. Sliding window algorithm: the window slides forward one step at a time, which mitigates the boundary problem where bursts around the window edge exceed the threshold.
  3. Leaky bucket algorithm: a constant-rate algorithm; no matter how fast clients send requests, the server processes them at a fixed rate.
  4. Token bucket algorithm: a timer generates a fixed number of tokens into a bucket per time period, and each request takes one token from the bucket; when no tokens remain, requests are queued or refused service.

2.1 Counter Rate Limiting Algorithm

The counter algorithm is a simple and effective rate limiting scheme. It maintains a counter of requests per time window, and when the counter reaches a threshold, further requests are throttled or rejected. The scheme is easy to implement, but its limiting may not be accurate enough.

The steps to implement the counter algorithm are as follows (a minimal sketch follows the list):

  • Maintain a counter count: on receiving a request, increment the counter and record the current time.
  • Check whether the current time falls within the same minute as the last recorded time. If so, check whether count exceeds the threshold and reject the request if it does. If not, a new window has begun, so reset count to 1.
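Below is a minimal single-node sketch of this fixed-window counter, assuming the limit applies per window of configurable length; the class and method names are illustrative. A distributed version would keep the counter in Redis, as shown in section 3.2.

public class FixedWindowCounter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length, e.g. 60_000 for one minute
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now;  // a new window begins: reset the counter
            count = 0;
        }
        count++;
        return count <= limit;  // reject once the window's quota is used up
    }
}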


The problem with the counter algorithm is that traffic straddling the window boundary can exceed the threshold. For example, with a limit of 100 per minute, 100 requests arriving just before the boundary and another 100 arriving just after it add up to 200 requests within a span far shorter than two minutes, producing a request spike that clearly violates the intent of the limit. The sliding window algorithm was developed to solve this problem.

2.2 Sliding Window Rate Limiting Algorithm

The sliding window algorithm is a window-based scheme that limits the request rate by maintaining a window that slides over time. The basic idea is to assign requests to sub-windows in chronological order; the number and size of the sub-windows can be adjusted according to actual needs. When the number of requests across the whole window reaches the limit, further requests are throttled or rejected. After traffic is shaped by the sliding window algorithm, the maximum allowed rate is never exceeded within any window-sized span, so the traffic curve is smoother, which largely solves the boundary-burst problem described above.

The steps to implement the sliding window algorithm are as follows (a sketch follows the list):

  • Initialize the request counter count, the window size, and the sliding window itself.
  • On receiving a request, compute which time slot it falls into and assign it to the corresponding sub-window.
  • As time advances, slide the window one step to the right and recompute the number of requests inside the window.
  • If the number of requests in the window exceeds the limit, throttle the request rate or reject the request.
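A minimal single-node sketch follows. For brevity it uses the sliding-log variant, which records exact timestamps instead of sub-window buckets but enforces the same guarantee: no more than limit requests within any window-sized span. The names are illustrative.

import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowLimiter {
    private final int limit;          // max requests within any window-sized span
    private final long windowMillis;  // window length in milliseconds
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // evict timestamps that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= limit) {
            return false;  // the sliding window is already full: reject
        }
        timestamps.addLast(now);
        return true;
    }
}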


For example, if 100 requests (the full quota) are received right at the boundary, requests arriving as the next sub-window begins still count against the same sliding window; since the window total has already reached its upper limit, no new requests are accepted.

2.3 Leaky Bucket Rate Limiting Algorithm

The leaky bucket (Leaky Bucket) algorithm can be understood as the process of pouring water into a bucket that leaks. Water flows into the bucket at an arbitrary rate and leaks out at a fixed rate; once the water exceeds the bucket's capacity, the excess overflows, that is, it is discarded. Because the bucket capacity and outflow rate are constant, the overall processing rate is guaranteed: requests are placed into the bucket, and when the bucket is full, subsequent requests are dropped.

  • Inflowing water drops represent requests arriving at the system; the inflow rate is unpredictable.
  • The bucket capacity represents the number of requests the system can hold; when the bucket is full, the rate limiting threshold has been reached and further drops are discarded (that is, the requests are rejected).
  • Outflowing water drops leave at a constant rate, representing the service processing requests at a fixed rate.


The main steps of the leaky bucket algorithm (a sketch follows the list):

  1. Initialization: define the capacity of the bucket and the outflow (leak) rate.
  2. Buffering: when a request arrives, if the bucket has not reached its capacity, put the request into the bucket to await processing; otherwise discard it.
  3. Draining: periodically take pending requests out of the bucket and process them at a constant rate until the bucket is empty.
  4. Overload protection: if external requests exceed the system's processing capacity, the bucket bounds how many are admitted, preventing system overload.
  5. Smooth warm-up: because requests drain at a fixed rate, a freshly started system is not hit with too many requests at once and can ramp up gradually to its maximum processing capacity.
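Below is a minimal single-node sketch of the leaky bucket used as a meter: the "water level" drains at the fixed outflow rate, and a request is rejected when admitting it would overflow the bucket. Names are illustrative; the queue-draining variant described in the steps above can be built on the same bookkeeping.

public class LeakyBucketLimiter {
    private final long capacity;        // how many requests the bucket can hold
    private final double leakRatePerMs; // requests drained per millisecond
    private double water = 0;           // current bucket level
    private long lastLeakTime = System.currentTimeMillis();

    public LeakyBucketLimiter(long capacity, double leakRatePerSecond) {
        this.capacity = capacity;
        this.leakRatePerMs = leakRatePerSecond / 1000.0;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // drain the bucket at the constant outflow rate since the last call
        water = Math.max(0.0, water - (now - lastLeakTime) * leakRatePerMs);
        lastLeakTime = now;
        if (water + 1 > capacity) {
            return false;  // the bucket would overflow: discard the request
        }
        water += 1;  // the request enters the bucket
        return true;
    }
}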

Note that the advantage of the leaky bucket algorithm is its constant processing rate, while its disadvantage is that it cannot absorb burst traffic. It therefore usually needs to be combined with other algorithms to achieve a more flexible and efficient rate limiting strategy.

2.4 Token Bucket Rate Limiting Algorithm

The token bucket (Token Bucket) algorithm is an improvement on the leaky bucket algorithm: it not only smooths traffic but also allows a certain degree of burst.

  1. Token: only requests that obtain a token are processed; all other requests are either queued or discarded directly.
  2. Bucket: the place where tokens are held; all requests take their tokens from this bucket.


The implementation of the token bucket algorithm mainly includes the following steps (a sketch follows the list):

  1. Define the capacity of the token bucket and the token release rate: together they express the strength of flow control. The capacity is the maximum number of tokens the bucket can store, and the release rate is the number of tokens added per unit time (such as per second). Both can be adjusted according to actual needs.
  2. Take tokens from the bucket: before traffic is sent or a request is handled, a token must first be obtained from the bucket. If tokens remain, the traffic or request proceeds; if the bucket is empty, the new traffic is rejected.
  3. Maintain the number of tokens in the bucket: to keep flow control accurate, new tokens are added to the bucket regularly, and tokens beyond the bucket capacity are discarded, which is what bounds the burst size.
  4. Handle rejected requests or traffic: when new requests or data traffic are rejected, some fallback measure is needed, such as returning a timeout error or relying on a retry mechanism.
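A minimal single-node sketch, with illustrative names, that refills the bucket lazily based on elapsed time rather than with a separate timer thread (the effect is the same as the periodic refill described above):

public class TokenBucketLimiter {
    private final long capacity;      // max tokens the bucket can hold (burst size)
    private final double refillPerMs; // tokens added per millisecond
    private double tokens;
    private long lastRefillTime = System.currentTimeMillis();

    public TokenBucketLimiter(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = refillPerSecond / 1000.0;
        this.tokens = capacity;  // start full so an initial burst can pass
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // add the tokens accrued since the last call, capped at the capacity
        tokens = Math.min(capacity, tokens + (now - lastRefillTime) * refillPerMs);
        lastRefillTime = now;
        if (tokens < 1) {
            return false;  // no token available: queue or reject the request
        }
        tokens -= 1;  // take one token and let the request through
        return true;
    }
}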

The advantages of the token bucket algorithm include tolerance of burst traffic and fairness: all requests or data traffic are controlled at the same rate, without phases that pass quickly and phases that pass slowly. The disadvantage is that the token count and release rate need to be tuned dynamically to suit different network environments and traffic characteristics. Compared with the leaky bucket algorithm, the token bucket algorithm is therefore better suited to complex flow control scenarios.

3. Distributed rate limiting implementation schemes

3.1 Implementation using Nginx

Nginx implements rate limiting with its built-in ngx_http_limit_req_module and ngx_http_limit_conn_module modules, which can limit the request rate per key (such as per client IP) and cap the number of concurrent connections. The following is a simple example of gateway-layer rate limiting with Nginx:

1) Control the request rate

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
  • $binary_remote_addr keys the limit on the client address; the binary_ form stores the address compactly to reduce memory usage, so requests from the same client IP share one counter.
  • zone=mylimit:10m creates a 10 MB shared memory zone named mylimit, used to store access frequency information.
  • rate=2r/s sets the allowed request frequency for each key, here limited to 2 requests per second; a rate such as 30r/m is also valid.

2) Control the number of concurrent connections

 # configured together in the http context
 # request rate limiting zone
 limit_req_zone $uri zone=api_read:20m rate=50r/s;

 # a connection zone keyed by client IP
 limit_conn_zone $binary_remote_addr zone=perip_conn:10m;

 # a connection zone keyed by server name
 limit_conn_zone $server_name zone=perserver_conn:100m;

 # inside the server block:
 location / {
  if (!-e $request_filename){
   rewrite ^/(.*) /index.php last;
  }

  # queue excess requests up to the burst size (burst defaults to 0)
  limit_req zone=api_read burst=100;

  # connection limit: at most 50 concurrent connections per IP
  limit_conn perip_conn 50;

  # connection limit for the whole server (caps this server's concurrent connections)
  limit_conn perserver_conn 200;

  # per-connection bandwidth limit
  #limit_rate 100k;
 }

The rate limiting configuration parameters are as follows:

  • rate=50r/s: 50 requests are admitted per second
  • burst=100: up to 100 excess requests are queued instead of being rejected immediately
  • perip_conn 50: at most 50 concurrent connections per IP
  • perserver_conn 200: caps the number of concurrent connections to this server

3.2 Implementation using Redis

1) Rate limiting with a Redis + Lua script

A Redis + Lua script can implement rate limiting with high concurrency and high performance:

-- first KEY passed to the script (the key used for rate limiting)
local key = KEYS[1]
-- first argument passed to the script (the rate limit threshold)
local limit = tonumber(ARGV[1])

-- current traffic count
-- (redis.call gets the value cached under the key; nil is treated as 0)
local currentLimit = tonumber(redis.call('get', key) or "0")

-- check whether the recorded count would exceed the threshold; if so the
-- request is rate-limited and 0 is returned. Otherwise increment the key,
-- start the expiry window on the first request, and return 1.

-- over the limit? (return immediately if so)
if currentLimit + 1 > limit then
    return 0
else
    local newCount = redis.call("INCRBY", key, 1) -- not over the limit: request count + 1
    if newCount == 1 then
        redis.call("EXPIRE", key, 2) -- first request in the window: expire after 2 seconds
    end
    return 1 -- allow the request
end
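A small caller sketch, assuming the Jedis client; the key name and threshold are illustrative. It evaluates the script above atomically and treats a return value of 1 as permission to proceed.

import redis.clients.jedis.Jedis;
import java.util.List;

public class RedisCounterLimiter {
    private final Jedis jedis;
    private final String script;  // the Lua script above, loaded as a string

    public RedisCounterLimiter(Jedis jedis, String script) {
        this.jedis = jedis;
        this.script = script;
    }

    // Returns true when the request identified by key is allowed through.
    public boolean allow(String key, int limit) {
        Object result = jedis.eval(script, List.of(key), List.of(String.valueOf(limit)));
        return Long.valueOf(1L).equals(result);
    }
}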

2) Token bucket algorithm with Redis

Each request tries to obtain a token from Redis, and the request is rejected when no token is available in the bucket. A workable process is as follows:

  1. Initialize the token bucket: create a key in Redis, for example "rate_limiter:myapp", that stores the current number of tokens in the bucket.
  2. Refill tokens: add tokens on a schedule, for example with INCRBY "rate_limiter:myapp" 1 from a timer task, or refill lazily by computing how many tokens have accrued since the last request.
  3. Cap the bucket: if a refill would push the count above the bucket capacity, trim it back to the capacity; this is what bounds the burst size.
  4. Take a token: when a request arrives, read the current count with GET "rate_limiter:myapp"; if it is greater than zero, decrement it and let the request through, otherwise reject or queue the request.
  5. Handle limited requests: a rejected request can be told, based on the refill rate, how long to wait before retrying.
  6. Set an expiration time: give the key an expiry so that state for idle keys does not linger and the bucket recovers cleanly after a Redis restart.
  7. Reinitialize: to reset the rate limit, delete the "rate_limiter:myapp" key with DEL and initialize the token bucket again.

Because the refill, check, and decrement steps must not interleave across concurrent requests, they are usually wrapped in a single Lua script so Redis executes them atomically, as sketched below.
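A minimal sketch of the lazy-refill variant, assuming the Jedis client and Java 15+ text blocks; the hash field names and the 60-second expiry are illustrative choices, not a fixed convention. The bucket state lives in a Redis hash, and the whole take-a-token step runs atomically inside Redis.

import redis.clients.jedis.Jedis;
import java.util.List;

public class RedisTokenBucket {
    private static final String SCRIPT = """
            local key = KEYS[1]
            local capacity = tonumber(ARGV[1])
            local rate = tonumber(ARGV[2])   -- tokens added per second
            local now = tonumber(ARGV[3])    -- caller-supplied unix time (seconds)
            local tokens = tonumber(redis.call('HGET', key, 'tokens') or capacity)
            local last = tonumber(redis.call('HGET', key, 'last') or now)
            -- lazy refill: add tokens accrued since the last call, capped at capacity
            tokens = math.min(capacity, tokens + (now - last) * rate)
            local allowed = 0
            if tokens >= 1 then
                allowed = 1
                tokens = tokens - 1
            end
            redis.call('HSET', key, 'tokens', tokens)
            redis.call('HSET', key, 'last', now)
            redis.call('EXPIRE', key, 60)  -- clean up idle buckets
            return allowed
            """;

    // Returns true when a token was taken, false when the bucket is empty.
    public static boolean tryAcquire(Jedis jedis, String key, int capacity, int rate) {
        long now = System.currentTimeMillis() / 1000;
        Object r = jedis.eval(SCRIPT, List.of(key),
                List.of(String.valueOf(capacity), String.valueOf(rate), String.valueOf(now)));
        return Long.valueOf(1L).equals(r);
    }
}
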
3.3 Implementation using Spring Cloud Gateway

Spring Cloud Gateway ships a rate limiting filter, RequestRateLimiterGatewayFilterFactory, which takes two parameters: a KeyResolver, which decides the key each request is counted under, and a RateLimiter, whose Redis-backed default is implemented with the token bucket algorithm.

1) Configure Redis and the rate limiting settings

- id: server2
  uri: lb://eureka-server1
  predicates:
    - Path=/server1/test
  filters:
    - name: RequestRateLimiter
      args:
        key-resolver: "#{@hostAddrKeyResolver}"
        redis-rate-limiter.replenishRate: 1   # token bucket refill rate, per second
        redis-rate-limiter.burstCapacity: 1   # total capacity of the token bucket
        redis-rate-limiter.requestedTokens: 1 # tokens consumed per request

With these parameters, tokens are generated at 1 per second, the bucket capacity is also 1, and each request consumes one token, so only one request per second is allowed through. The key-resolver entry references a Spring bean named hostAddrKeyResolver, which can be defined as sketched below.
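A minimal sketch of that bean, assuming it should key requests by the client host address (handling of a null remote address is omitted for brevity):

import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.core.publisher.Mono;

@Configuration
public class RateLimiterConfig {

    // Resolves each request to a rate limiting key (the client host address),
    // matching the "#{@hostAddrKeyResolver}" bean reference in the YAML above.
    @Bean
    public KeyResolver hostAddrKeyResolver() {
        return exchange -> Mono.just(
                exchange.getRequest().getRemoteAddress().getAddress().getHostAddress());
    }
}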


