Rate limiting in high-concurrency scenarios

For a business system, high concurrency means supporting a massive volume of user requests, with QPS hundreds of times higher than usual, or more.
If high concurrency is not taken into account, a system that normally runs well can start failing in strange ways as soon as traffic spikes. In e-commerce, for example, this shows up as lost orders, incorrect inventory deductions, and overselling.

Rate limiting is a form of service degradation: as the name implies, it protects the system by restricting the flow of requests into it.

Configuring rate limits sensibly requires an understanding of the system's throughput, so rate limiting is usually combined with capacity planning and load testing.

When external traffic approaches or reaches the system's maximum threshold, the rate limiter is triggered and degradation measures kick in to keep the system from being overwhelmed. Common degradation strategies include delayed processing, denial of service, and random rejection.

The rate limiting strategy is actually very similar to the thread pool in Java concurrent programming. As we know, when a thread pool is full, different rejection policies can be configured, such as:

- AbortPolicy: discards the task and throws an exception
- DiscardPolicy: discards the task without throwing an exception
- DiscardOldestPolicy: discards the oldest queued task

You can also implement a rejection policy yourself. Java's thread pool is a small feature point in day-to-day development, but the same idea extends to system design and architecture: knowledge transfers and is reused at a larger scale.
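To make the thread pool analogy concrete, here is a minimal, self-contained sketch (class and method names are invented for this example) showing AbortPolicy in action: a pool with one worker and a one-slot queue rejects every further submission by throwing RejectedExecutionException.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {

    // Submits `tasks` slow tasks to a tiny pool and counts how many are rejected.
    static int submitAndCountRejections(int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),            // queue holds one waiting task
                new ThreadPoolExecutor.AbortPolicy());  // throw when pool and queue are full
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                // each task sleeps long enough that later submissions find the pool busy
                pool.execute(() -> {
                    try { Thread.sleep(200); } catch (InterruptedException ignored) { }
                });
            } catch (RejectedExecutionException e) {
                rejected++;  // AbortPolicy signals overload by throwing
            }
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // one task runs, one waits in the queue, the remaining three are rejected
        System.out.println("rejected: " + submitAndCountRejections(5));
    }
}
```

With five submissions, one task occupies the worker, one sits in the queue, and the other three hit the rejection policy, just as requests beyond a system's capacity hit the rate limiter.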

A key question in any rate-limiting scheme is how to decide that the current traffic has reached the configured maximum. There are several implementation strategies; a brief analysis of each follows.

  1. Counter method
    Generally, rate limiting is based on the number of requests per unit time, commonly referred to as QPS. The most direct way to count QPS is to implement a counter.

The counter method is the simplest of the rate-limiting algorithms. Suppose an interface allows no more than 10,000 accesses within 100 seconds. We maintain a counter, and each incoming request increments it by 1.

At each request we check:

- If the counter's value is below the limit and we are still within the 100-second window, the request is allowed through; otherwise it is rejected.
- If the time window has been exceeded, the counter is reset.

The code below uses an AtomicInteger as the counter and can serve as a reference:

import java.util.concurrent.atomic.AtomicInteger;

public class CounterLimiter {

    // start of the current time window
    private long startTime = System.currentTimeMillis();
    // window length in milliseconds (10 seconds)
    private static final long INTERVAL = 10_000;
    // maximum requests allowed per window
    private static final int LIMIT = 100;
    // requests counted in the current window
    private final AtomicInteger requestCount = new AtomicInteger(0);

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // still inside the current time window
        if (now < startTime + INTERVAL) {
            // check whether the limit has been reached
            if (requestCount.get() < LIMIT) {
                requestCount.incrementAndGet();
                return true;
            }
            return false;
        } else {
            // window expired: reset the counter (counting this request) and start a new window
            requestCount.set(1);
            startTime = now;
            return true;
        }
    }
}

The counter strategy can be extended from a single node to a cluster, which makes it suitable for distributed environments.

Single-node rate limiting can use local memory. To extend to cluster-wide limiting, use a shared storage node such as Redis or Memcached: set keys that expire at a fixed time interval, then count the cluster's traffic against them to limit the system as a whole.
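As an illustration of this fixed-window pattern, the sketch below (class and parameter names are invented for this example) simulates the Redis approach with an in-memory map, where `key + window index` plays the role of a Redis key whose TTL equals one window; in a real cluster the map operations would be replaced by Redis commands such as INCR and EXPIRE.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class WindowedCounter {

    // simulates the shared store; in a cluster this would be Redis
    private final ConcurrentHashMap<String, AtomicInteger> store = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final int limit;

    public WindowedCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean tryAcquire(String key) {
        // the window index makes old keys irrelevant, mimicking a Redis TTL
        long window = System.currentTimeMillis() / windowMillis;
        String bucketKey = key + ":" + window;
        AtomicInteger count = store.computeIfAbsent(bucketKey, k -> new AtomicInteger(0));
        // equivalent to Redis INCR followed by a comparison against the limit
        return count.incrementAndGet() <= limit;
    }
}
```

A real implementation would also evict stale keys, which the Redis TTL handles for free.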

The counter strategy has a major drawback: it handles traffic at window boundaries poorly, and the limiting is not smooth.

Consider this scenario: we limit users to no more than 100,000 orders per minute. Right at the boundary between two time windows, 100,000 requests arrive in the final second of one window and another 100,000 in the first second of the next. In other words, within the two seconds spanning the window switch, the system receives 200,000 order requests. This peak may exceed the system's threshold and affect service stability.

The optimization of the counter algorithm is to avoid admitting twice the window limit. This can be achieved with the sliding window algorithm, which interested readers can look into.
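For readers who want a head start, here is a minimal sliding-window sketch (illustrative only; the class name is invented): it keeps the timestamps of admitted requests and evicts those older than the window, so no window-sized interval can ever admit more than the limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowLimiter {

    private final long windowMillis;  // window length
    private final int limit;          // max requests per window
    // timestamps of requests admitted within the current window
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // evict timestamps that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```

Storing one timestamp per request costs memory proportional to the limit; production implementations often approximate this with a small number of sub-window counters instead.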

  2. Leaky bucket and token bucket algorithms
    The leaky bucket algorithm and the token bucket algorithm are more widely used in practice and are often compared with each other.

The leaky bucket algorithm works like its namesake: imagine a bucket of fixed capacity with a small hole drilled in the bottom. By controlling the rate at which water leaks out, we control the rate at which requests are processed, thereby limiting the flow.

The rejection strategy of the leaky bucket algorithm is very simple: requests that exceed the processing rate accumulate in the bucket until it overflows, and the system simply discards the overflow traffic.

The leaky bucket algorithm limits the request rate at the exit, so it has no boundary problem like the counter method above, and the request curve is always smooth.
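The mechanics can be sketched with a bounded queue standing in for the bucket (names are invented; this is an illustration, not a production implementation): `offer` returns false for overflow traffic, and draining the queue at a fixed rate, here left to the caller, plays the role of the leak.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class LeakyBucket {

    // the bounded queue is the bucket; its capacity is the bucket size
    private final BlockingQueue<Runnable> bucket;

    public LeakyBucket(int capacity) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
    }

    // enqueue a request; returns false when the bucket overflows (request dropped)
    public boolean offer(Runnable request) {
        return bucket.offer(request);
    }

    // drain one request; a scheduler would call this at a fixed rate (the leak)
    public Runnable leak() {
        return bucket.poll();
    }
}
```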

One of its core problems is that it filters requests too precisely. As the saying goes, "water that is too clear has no fish," and the same applies to rate limiting: if we limit orders to 100,000 per second, what about request number 100,001? Must it really be rejected?

In most business scenarios, the answer is no. Even with limits in place, we usually still want the system to tolerate a certain amount of burst traffic. This is where the token bucket algorithm comes in.

In the token bucket algorithm, suppose we have a bucket of fixed size whose capacity is related to the configured threshold. Tokens are added to the bucket at a fixed rate; when the bucket is full, extra tokens are discarded, so the number of stored tokens never exceeds the bucket size. When a request arrives, it tries to take a token from the bucket; if the bucket is empty, the request is rejected.
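Before turning to Guava, here is a hand-rolled token bucket sketch (names are invented and it is deliberately simplified): tokens are refilled lazily based on elapsed time, capped at the bucket capacity, and a request succeeds only if a whole token is available.

```java
public class TokenBucket {

    private final long capacity;          // maximum tokens the bucket holds
    private final double refillPerMillis; // refill rate, tokens per millisecond
    private double tokens;                // current token count (fractional while refilling)
    private long lastRefill;              // last refill timestamp

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMillis = tokensPerSecond / 1000.0;
        this.tokens = capacity;  // start full, so bursts up to `capacity` are allowed
        this.lastRefill = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    private void refill() {
        long now = System.currentTimeMillis();
        // add tokens for the elapsed time, never exceeding the capacity
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMillis);
        lastRefill = now;
    }
}
```

Because the bucket starts full, a burst of up to `capacity` requests is admitted immediately; this controlled burstiness is exactly what distinguishes the token bucket from the leaky bucket.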

Have you used Google's Guava open-source toolkit? Guava's rate-limiting utility class, RateLimiter, implements rate limiting based on the token bucket algorithm and is very convenient to use.

RateLimiter adds tokens to the bucket at a configured rate, and a thread may proceed once it has obtained a token. The RateLimiter API can be used directly; its main methods are acquire and tryAcquire: acquire blocks until a permit is available, while tryAcquire is non-blocking.

Here is a simple example:

import com.google.common.util.concurrent.RateLimiter;

public class LimiterTest {

    public static void main(String[] args) {
        // permitsPerSecond = 100: allow up to 100 permits per second
        RateLimiter limiter = RateLimiter.create(100);
        for (int i = 1; i < 200; i++) {
            if (limiter.tryAcquire(1)) {
                System.out.println("request " + i + " accepted");
            } else {
                System.out.println("request " + i + " rejected");
            }
        }
    }
}
 

Comparison of the rate-limiting algorithms

The counter algorithm is relatively simple to implement and is particularly suitable for clusters. However, its window-boundary behavior must be considered, and the sliding window strategy can be used as an optimization. Of course, the choice also depends on the specific rate-limiting scenario.

As for the leaky bucket and token bucket algorithms: the leaky bucket algorithm provides relatively strict limiting, while the token bucket algorithm additionally allows a certain degree of burst traffic. In practice we rarely need to control traffic that precisely, so the token bucket algorithm sees wider use.

If the configured peak is permitsPerSecond = N requests per second, the counter algorithm can let through up to 2N at a window boundary, the leaky bucket algorithm always caps traffic at N, and the token bucket algorithm allows bursts above N without reaching a peak as high as 2N.

Reference: "2020 latest Java basics and detailed video tutorials and learning routes!

Link: https://segmentfault.com/a/1190000038949009

Origin blog.csdn.net/weixin_46699878/article/details/112524965