A soul-searching question: in a high-concurrency scenario, how would you implement rate limiting for your system?

If high concurrency is not taken into account, a business system may run well day to day, but once the amount of concurrency spikes, all kinds of strange business problems surface. In e-commerce, for example, you may see lost user orders, incorrect inventory deductions, and overselling.

Rate limiting is a form of service degradation. As the name implies, it protects the system by limiting the traffic that flows into it.

Configuring rate limits sensibly requires an understanding of the system's throughput, so rate limiting is generally combined with capacity planning and load testing.

When external requests approach or reach the system's maximum threshold, rate limiting kicks in and other degradation measures are taken to keep the system from being overwhelmed. Common degradation strategies include delayed processing, denial of service, and random rejection.

The rate limiting strategy is actually very similar to the thread pool in Java concurrent programming. As we all know, when a thread pool is full, different rejection policies can be configured, such as:

  • AbortPolicy: discards the task and throws an exception
  • DiscardPolicy: discards the task without throwing an exception
  • DiscardOldestPolicy: discards the oldest queued task; you can, of course, also implement a rejection policy yourself
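
To make the analogy concrete, here is a minimal sketch of AbortPolicy in action (class and method names are my own, for illustration): with a single-thread pool and a one-slot queue, once the worker is busy and the queue is full, further submissions are rejected with an exception.

```java
import java.util.concurrent.*;

public class RejectionDemo {
    // Submit n tasks to a deliberately tiny pool and count rejections.
    // Pool of 1 thread, queue of 1 slot, AbortPolicy as rejection handler.
    public static int submitTasks(int n) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch block = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < n; i++) {
            try {
                // each task blocks until released, keeping the worker busy
                pool.execute(() -> {
                    try { block.await(); } catch (InterruptedException ignored) {}
                });
            } catch (RejectedExecutionException e) {
                rejected++;  // AbortPolicy throws instead of queueing
            }
        }
        block.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }
}
```

With 4 submissions, one task occupies the worker thread, one waits in the queue, and the remaining two are rejected.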

Java's thread pool is a small feature in day-to-day development, but the same idea extends to system design and architecture: knowledge can reasonably be transferred and reused.

One of the key points of any rate limiting scheme is how to judge that the current traffic has reached the maximum value we set. There are different implementation strategies; here is a brief analysis of each.

1. Counter method

Generally speaking, rate limits are expressed as the number of requests per unit time, commonly referred to as QPS. The most direct way to count QPS is to implement a counter.

The counter method is the simplest of the rate limiting algorithms. Suppose an interface limits the number of accesses within 100 seconds to no more than 10,000. We maintain a counter, and each time a new request comes in, the counter is incremented by 1.

We then judge:

  • If the counter value is below the limit and we are still within the 100-second window, the request is allowed through; otherwise it is rejected
  • If the window has elapsed, the counter is reset and a new window begins

The following code uses AtomicInteger as the counter and can serve as a reference:

import java.util.concurrent.atomic.AtomicInteger;

public class CounterLimiter { 
    //start of the current time window 
    private static long startTime = System.currentTimeMillis(); 
    //window length in milliseconds 
    private static final int interval = 10000; 
    //max requests allowed per window 
    private static int limit = 100; 
    //request count within the current window 
    private AtomicInteger requestCount = new AtomicInteger(0); 
    //try to pass the rate limiter (simplified: the window reset is not atomic) 
    public boolean tryAcquire() { 
        long now = System.currentTimeMillis(); 
        //inside the current time window 
        if (now < startTime + interval) { 
            //check whether the limit has been reached 
            if (requestCount.get() < limit) { 
                requestCount.incrementAndGet(); 
                return true; 
            } 
            return false; 
        } else { 
            //window elapsed: start a new window, counting this request 
            requestCount.set(1); 
            startTime = now; 
            return true; 
        } 
    } 
} 

The counter strategy can be extended from a single node to a cluster, which makes it well suited to distributed environments.

Single-node rate limiting can keep the counter in memory. To extend this to cluster-wide rate limiting, use a separate storage node such as Redis or Memcached: set an expiration time matching the fixed window, and the cluster's overall traffic can be counted and limited.
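
To illustrate the fixed-window key idea, here is a minimal single-process sketch in which a ConcurrentHashMap stands in for Redis (class and parameter names are my own, for illustration). In a real cluster, the map lookup and increment would be an INCR on a key derived from the window, with an EXPIRE so stale windows are cleaned up automatically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class WindowKeyLimiter {
    // Stand-in for Redis: window key -> counter. Unlike Redis, this map
    // never evicts old keys; Redis would handle that via EXPIRE.
    private final ConcurrentHashMap<Long, AtomicInteger> counters = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final int limit;

    public WindowKeyLimiter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    // Time is passed in explicitly to keep the sketch deterministic.
    public boolean tryAcquire(long nowMillis) {
        // every node in the cluster computes the same key for the same window
        long windowKey = nowMillis / windowMillis;
        AtomicInteger count = counters.computeIfAbsent(windowKey, k -> new AtomicInteger(0));
        return count.incrementAndGet() <= limit;
    }
}
```

Because the key is derived purely from the clock, all nodes sharing the same store increment the same counter, which is what makes the scheme cluster-wide.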

The counter strategy has a big disadvantage: it is not friendly to traffic at window boundaries, and the limiting is not smooth enough.

Consider this scenario: we limit users to no more than 100,000 orders per minute. At the boundary between two windows, 100,000 requests arrive in the last second of the first window and another 100,000 in the first second of the second window. In other words, within the two seconds around the window switch, the system receives 200,000 order requests. This peak may exceed the system threshold and affect service stability.

The optimization of the counter algorithm is to avoid admitting twice the window limit. This can be achieved with the sliding window algorithm; interested readers can dig deeper.
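
As a starting point, here is a minimal sliding-window sketch (names are my own, for illustration): it records the timestamp of each admitted request and evicts entries that have slid out of the window, so at most `limit` requests are admitted in any window-sized interval, and no 2x burst can slip through a boundary.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowLimiter {
    // timestamps (ms) of requests admitted within the current window
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final long windowMillis;
    private final int limit;

    public SlidingWindowLimiter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    // The clock is injected to keep the sketch deterministic;
    // production code would use System.currentTimeMillis().
    public synchronized boolean tryAcquire(long nowMillis) {
        // evict timestamps that have fallen out of the window
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(nowMillis);
            return true;
        }
        return false;
    }
}
```

Storing every timestamp costs memory proportional to the limit; real systems often approximate this with a handful of sub-window buckets instead.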

2. Leaky bucket and token bucket algorithms

The leaky bucket and token bucket algorithms are more widely used in practice and are often compared with each other.

The leaky bucket algorithm can be pictured as an actual leaky bucket: a bucket of fixed capacity with a small hole drilled in the bottom. By controlling the rate at which water leaks out, we control the rate at which requests are processed, thereby limiting traffic.

The rejection strategy of the leaky bucket algorithm is very simple: requests that exceed the current threshold accumulate in the bucket until it overflows, and the system simply discards the overflow traffic.

The leaky bucket algorithm limits the request rate at the outlet. It has no boundary problem like the counter method above, and the request curve is always smooth.
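
Here is a minimal leaky-bucket sketch, with the clock passed in explicitly for clarity (names are my own, for illustration): water drains at a fixed rate according to elapsed time, each accepted request adds one unit of water, and a request arriving when the bucket is full is discarded as overflow.

```java
public class LeakyBucketLimiter {
    private final long capacity;        // maximum water the bucket can hold
    private final double leakPerMilli;  // fixed outflow rate
    private double water = 0;           // current water level
    private long lastLeakMillis;        // last time the leak was applied

    public LeakyBucketLimiter(long capacity, double leakPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
        this.lastLeakMillis = nowMillis;
    }

    // Time is injected so the behavior is deterministic and easy to test.
    public synchronized boolean tryAcquire(long nowMillis) {
        // drain water according to the time elapsed since the last call
        water = Math.max(0, water - (nowMillis - lastLeakMillis) * leakPerMilli);
        lastLeakMillis = nowMillis;
        if (water < capacity) {
            water += 1;   // room left: the request enters the bucket
            return true;
        }
        return false;     // bucket full: overflow traffic is discarded
    }
}
```

No matter how bursty the arrivals are, requests drain out at the fixed leak rate, which is exactly why the output curve stays smooth.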

One of its core problems is that it filters requests too precisely. As the saying goes, "water that is too clear has no fish", and the same applies to rate limiting. If we limit orders to 100,000 per second, what about request number 100,001? Must we reject it?

In most business scenarios, the answer is no. Although traffic is limited, we still want the system to tolerate a certain amount of burst traffic. This is where the token bucket algorithm comes in.

In the token bucket algorithm, suppose we have a bucket of fixed size whose capacity is related to the configured threshold. Tokens are put into the bucket at a fixed rate; when the bucket is full, extra tokens are discarded, so the number of stored tokens never exceeds the bucket size. When a request arrives, it tries to take a token from the bucket; if the bucket is empty, the request is rejected.
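
A minimal token-bucket sketch along the same lines (names are my own, for illustration): the bucket starts full and refills at a fixed rate, and a request passes only if it can take a token. Because tokens accumulate up to the bucket capacity during quiet periods, a burst of up to `capacity` requests can pass at once, which is exactly the controlled burstiness the leaky bucket lacks.

```java
public class TokenBucketLimiter {
    private final long capacity;          // maximum tokens the bucket can hold
    private final double refillPerMilli;  // token refill rate
    private double tokens;                // tokens currently available
    private long lastRefillMillis;        // last time tokens were added

    public TokenBucketLimiter(long capacity, double tokensPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity;   // start full, so an initial burst can pass
        this.lastRefillMillis = nowMillis;
    }

    // Time is injected so the behavior is deterministic and easy to test.
    public synchronized boolean tryAcquire(long nowMillis) {
        // refill according to elapsed time, capped at the bucket capacity
        tokens = Math.min(capacity, tokens + (nowMillis - lastRefillMillis) * refillPerMilli);
        lastRefillMillis = nowMillis;
        if (tokens >= 1) {
            tokens -= 1;      // take a token and let the request through
            return true;
        }
        return false;         // bucket empty: request rejected
    }
}
```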

Have you used Google's Guava open-source toolkit? Guava's rate limiting utility class, RateLimiter, implements rate limiting based on the token bucket algorithm and is very convenient to use.

RateLimiter puts tokens into the bucket at a certain frequency, and a thread can proceed only after it obtains a token. The RateLimiter API can be applied directly; its main methods are acquire and tryAcquire.

acquire blocks until a token is available; tryAcquire is non-blocking.

Here is a simple example:

import com.google.common.util.concurrent.RateLimiter;

public class LimiterTest { 
    public static void main(String[] args) throws InterruptedException { 
        //permitsPerSecond = 100 
        RateLimiter limiter = RateLimiter.create(100); 
        for (int i = 1; i < 200; i++) { 
            if (limiter.tryAcquire(1)) { 
                System.out.println("Request " + i + " accepted"); 
            } else { 
                System.out.println("Request " + i + " rejected"); 
            } 
        } 
    } 
} 

Comparison of rate limiting algorithms

The counter algorithm is relatively simple to implement and is especially suitable for clusters, but the window-boundary case must be considered; the sliding window strategy can be used as an optimization. Of course, it also depends on the specific rate limiting scenario.

As for the leaky bucket and token bucket algorithms: the leaky bucket algorithm provides relatively strict limiting, while the token bucket algorithm additionally allows a certain degree of burst traffic. In actual development we rarely need to control traffic that precisely, so the token bucket algorithm sees wider use.

If the configured peak traffic is permitsPerSecond = N requests per second, the counter algorithm can let through up to 2N around a window boundary, the leaky bucket algorithm always limits traffic to N, and the token bucket algorithm allows bursts above N but never a peak as high as 2N.


Origin blog.csdn.net/m0_46757769/article/details/112577393