[Rate Limiting] 4 Common Rate Limiting Algorithms

1. Background

Service rate limiting protects a service by controlling the rate or number of requests it accepts. In microservices, it is usually used together with circuit breaking and degradation so that a sudden burst of requests does not overload the system and the service keeps running smoothly.

What is the difference between rate limiting and circuit breaking?

Rate limiting takes effect before traffic enters the system: excess traffic is rejected up front.

Circuit breaking is a failure-handling mechanism that takes effect after traffic has entered the system. If the system fails or behaves abnormally, the circuit breaker automatically cuts off requests to keep the failure from spreading and causing a service avalanche.

What is the difference between rate limiting and peak shaving?

Peak shaving smooths traffic: it avoids instantaneous overload by buffering requests and letting the system process them at a gentler pace.

Peak shaving is like a reservoir, which stores the flow and releases it slowly. Rate limiting is like a gate, which rejects the excess flow.

2. Overview of rate limiting

At the start of most microservice architecture designs, for example during technology selection, architects plan the technology stack from a global perspective. Given the current state of the product, should Dubbo or Spring Cloud serve as the underlying framework for microservice governance? Sometimes, to launch, iterate, and deliver quickly, a team simply develops on Spring Boot first and introduces further technology stacks later.

Therefore, specific technical solutions for a business scenario cannot be generalized. They have to be assessed against the current state of the product and the business. For rate limiting in particular, the concrete choice may differ under each of the technical architectures below.

2.1 Dubbo service governance model

Choosing Dubbo as the basic service governance framework suits applications that are mostly internal platforms. Dubbo is built on Netty and, compared with plain HTTP, still has advantages in certain scenarios. If you choose Dubbo, the following options are available for rate limiting.

2.1.1 Dubbo framework-level rate limiting

Dubbo officially provides fairly complete service governance that meets the needs of most development scenarios. For rate limiting, it offers the following mechanisms; for specific configuration, refer to the official manual.

Client-side rate limiting

  • Semaphore-based limiting (through statistics)
  • Connection count limiting (socket->tcp)

Server-side rate limiting

  • Thread pool limiting (with isolation)
  • Semaphore-based limiting (without isolation)
  • Accepted connection count limiting (socket->tcp)

2.1.2 Thread pool settings

Multi-threaded concurrent processing relies on thread pools. Dubbo itself supports four thread pool types. The key parameters of the thread pool, such as its type, the size of the blocking queue, and the number of core threads, can be configured in the provider's <dubbo:protocol> tag. Sizing the thread pool on the provider side achieves a rate limiting effect to a certain extent.
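To make the idea of limiting through a thread pool concrete, here is a minimal sketch that uses only JDK classes. It is not Dubbo's internal thread pool, and the class name, method name, and parameters (`ThreadPoolLimitDemo`, `submit`, `threads`, `queueSize`) are illustrative assumptions rather than Dubbo configuration attributes. It shows how a bounded pool with a bounded queue rejects work once both are full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustration only: plain JDK classes, not Dubbo's internal thread pool.
class ThreadPoolLimitDemo {

    /** Submits `requests` blocking tasks and returns {accepted, rejected}. */
    static int[] submit(int threads, int queueSize, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy()); // reject when threads and queue are full
        int accepted = 0;
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a slow request that occupies a worker
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                rejected++; // this request was limited
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {accepted, rejected};
    }

    public static void main(String[] args) {
        int[] result = submit(2, 2, 6);
        System.out.println("accepted=" + result[0] + ", rejected=" + result[1]);
    }
}
```

With 2 threads and a queue of 2, four of the six slow requests are accepted (two running, two queued) and the remaining two are rejected, which is the limiting effect the thread pool settings rely on.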

2.1.3 Integrate third-party components

If the project is based on Spring Boot, you can consider directly introducing local components or SDKs, such as Hystrix, Guava, or the native Sentinel SDK. If your team is technically strong enough, you can even consider building your own.

2.2 Spring Cloud service governance model

If you use Spring Cloud or Spring Cloud Alibaba as your service governance framework, the framework's own ecosystem already contains rate limiting components that can be used out of the box. Here are some commonly used rate limiting components in the Spring Cloud ecosystem.

2.2.1 Hystrix

Hystrix is a fault-tolerance framework open sourced by Netflix. In the early days of Spring Cloud, it was the component in the Spring Cloud ecosystem used for rate limiting, circuit breaking, and degradation.

Hystrix provides a rate limiting function. In a Spring Cloud architecture, Hystrix can be enabled on the gateway for rate limiting, and each microservice can also enable Hystrix for its own rate limiting.

Hystrix uses thread isolation mode by default, which limits traffic through the number of threads plus the queue size. For specific parameter configuration, refer to the official documentation.

2.2.2 Sentinel

Sentinel is known as the traffic guard of distributed systems and is an important component of the Spring Cloud Alibaba ecosystem. It is a traffic control component for distributed service architectures that takes traffic as its entry point and helps developers ensure the stability of microservices along several dimensions: rate limiting, traffic shaping, circuit breaking and degradation, system load protection, and hotspot protection.

2.3 Gateway-layer rate limiting

As the number of microservices grows and many of them need rate limiting, you can consider limiting at the gateway layer. Generally speaking, rate limiting at the gateway layer targets cross-cutting concerns, such as malicious requests, crawlers, and attacks. Put simply, rate limiting at the gateway provides a layer of protection for the entire system.

3. Four commonly used rate limiting algorithms

3.1 Algorithm Overview

Here are four commonly used rate limiting algorithms:

  1. Fixed window algorithm
  2. Sliding window algorithm
  3. Token bucket algorithm
  4. Leaky bucket algorithm

Whatever the rate limiting component, the underlying algorithms are similar. The general process is:

  1. Count request traffic: record the number or rate of requests, using counters, sliding windows, and so on.
  2. Determine whether the limit is exceeded: compare the current request traffic with the configured limit.
  3. Execute the rate limiting policy: if the request traffic exceeds the limit, apply the policy, such as rejecting the request, delaying processing, or returning an error message.
  4. Update statistics: update the statistics based on how the request was handled, such as incrementing the counter or updating the sliding window data.
  5. Repeat the above steps: keep counting request traffic, checking the limit, applying the policy, and updating statistics.
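The steps above can be sketched as a small single-JVM example. This is only an illustration under assumptions: the interface and class names (`RateLimitPolicy`, `CountingPolicy`, `RateLimitFlowDemo`) are hypothetical and do not belong to any real library, and the policy is a plain counter that never resets.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical names throughout; this is not the API of any real library.
interface RateLimitPolicy {
    /** Returns true if the request is allowed to proceed. */
    boolean tryAcquire();
}

/** Simplest possible policy: a counter with a hard upper bound that never resets. */
class CountingPolicy implements RateLimitPolicy {
    private final long limit;
    private final AtomicLong counter = new AtomicLong();

    CountingPolicy(long limit) {
        this.limit = limit;
    }

    @Override
    public boolean tryAcquire() {
        // Steps 1 and 4: count the request and update the statistic atomically.
        long count = counter.incrementAndGet();
        // Step 2: compare the current traffic with the configured limit.
        if (count > limit) {
            counter.decrementAndGet(); // rejected requests are not counted
            return false;              // Step 3: the caller applies the policy
        }
        return true;
    }
}

class RateLimitFlowDemo {
    /** Step 3 in practice: reject with an error message when the policy says no. */
    static String handle(RateLimitPolicy policy, String request) {
        if (!policy.tryAcquire()) {
            return "429 Too Many Requests";
        }
        return "200 OK: " + request;
    }

    public static void main(String[] args) {
        RateLimitPolicy policy = new CountingPolicy(2);
        // Step 5: the same check runs for every incoming request.
        for (int i = 1; i <= 3; i++) {
            System.out.println(handle(policy, "req-" + i));
        }
    }
}
```

The four algorithms discussed in this article differ mainly in how the counting and checking inside `tryAcquire` are done; the surrounding flow stays the same.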

Note that a concrete rate limiting implementation may be adjusted and optimized for different scenarios and needs, for example by using the token bucket or leaky bucket algorithm.

3.2 Algorithm implementation

Next, we implement the four common rate limiting algorithms above. We use Redis as distributed storage and Redisson as the Redis client.

1. Introduce dependencies

<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson</artifactId>
    <version>3.16.2</version>
</dependency>

2. Obtain RedissonClient in singleton mode

import org.redisson.Redisson;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class RedissonConfig {

    private static final String REDIS_ADDRESS = "redis://127.0.0.1:6379";

    private static volatile RedissonClient redissonClient;

    // Lazily create a single shared client using double-checked locking
    public static RedissonClient getInstance() {
        if (redissonClient == null) {
            synchronized (RedissonConfig.class) {
                if (redissonClient == null) {
                    Config config = new Config();
                    config.useSingleServer().setAddress(REDIS_ADDRESS);
                    redissonClient = Redisson.create(config);
                }
            }
        }
        return redissonClient;
    }

}

3.2.1 Fixed window rate limiting algorithm

Fixed window algorithm (counter algorithm): a relatively simple rate limiting algorithm. It divides time into fixed windows and sets an upper limit on the number of requests allowed in each window. If the number of requests within a window exceeds the limit, rate limiting is triggered.

Algorithm implementation

Implementing a fixed window with Redisson is quite simple. Within each window, we count requests with the incrementAndGet operation. Once the window is over, Redis's key expiration feature resets the count automatically.

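import java.util.concurrent.TimeUnit;

import org.redisson.api.RAtomicLong;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;

public class FixedWindowRateLimiter {

    public static final String KEY = "fixedWindowRateLimiter:";

    // Maximum number of requests allowed per window
    private Long limit;

    // Window size in seconds
    private Long windowSize;

    public FixedWindowRateLimiter(Long limit, Long windowSize) {
        this.limit = limit;
        this.windowSize = windowSize;
    }

    /**
     * Returns true if the request should be rejected (the limit was triggered).
     */
    public boolean triggerLimit(String path) {
        RedissonClient redissonClient = RedissonConfig.getInstance();
        // Take a distributed lock so that concurrent requests cannot initialize the window at inconsistent times
        RLock rLock = redissonClient.getLock(KEY + "LOCK:" + path);
        try {
            // The lock is held with a 100 ms lease and is released automatically after that
            rLock.lock(100, TimeUnit.MILLISECONDS);
            String redisKey = KEY + path;
            RAtomicLong counter = redissonClient.getAtomicLong(redisKey);
            // Count the request
            long count = counter.incrementAndGet();
            // A count of 1 means the window has just been created
            if (count == 1) {
                // Use the key's expiration time as the window
                counter.expire(windowSize, TimeUnit.SECONDS);
            }
            // Limit triggered
            if (count > limit) {
                // Rejected requests are not included in the request count
                counter.decrementAndGet();
                return true;
            }
        } catch (Exception e) {
            // Fail open: on an error the request is allowed through
            e.printStackTrace();
        } finally {
            // The lease may already have expired, so only unlock if this thread still holds the lock
            if (rLock.isHeldByCurrentThread()) {
                rLock.unlock();
            }
        }
        return false;
    }

}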

test:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FixedWindowRateLimiterTest {

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(20, 50, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<>(10));
        FixedWindowRateLimiter fixedWindowRateLimiter = new FixedWindowRateLimiter(10L, 60L);
        // Simulate calls in different windows
        for (int i = 0; i < 3; i++) {
            CountDownLatch countDownLatch = new CountDownLatch(20);
            // 20 concurrent calls
            for (int j = 0; j < 20; j++) {
                threadPoolExecutor.execute(() -> {
                    boolean isLimit = fixedWindowRateLimiter.triggerLimit("/test");
                    System.out.println(isLimit);
                    countDownLatch.countDown();
                });
            }
            countDownLatch.await();
            // Sleep for 1 minute
            TimeUnit.MINUTES.sleep(1);
        }
        // Stop the worker threads so the pool does not keep the JVM alive
        threadPoolExecutor.shutdown();
    }

}

The advantage of the fixed window algorithm is that it is simple to implement and takes up little space, but it has a boundary problem. Because the switch between windows is instantaneous, request handling is not smooth, and traffic may fluctuate violently at the moment the window switches.

For example, if a large number of requests suddenly arrives at 00:02, right when the count is reset, there is no way to limit this burst: requests just before and just after the reset are counted in different windows, so up to twice the limit can pass within a short time.
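The boundary problem can be reproduced with a small in-memory sketch. It is not the Redisson implementation from this section: the class names (`InMemoryFixedWindow`, `FixedWindowBoundaryDemo`) are hypothetical, time is passed in explicitly instead of being read from the clock, and windows are assumed to be aligned to multiples of the window size.

```java
// Hypothetical single-JVM sketch; time is passed in explicitly so the boundary case is reproducible.
class InMemoryFixedWindow {
    private final long limit;
    private final long windowMillis;
    private long windowStart = -1;
    private long count = 0;

    InMemoryFixedWindow(long limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request arriving at nowMillis is allowed. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Assumption: windows are aligned to multiples of windowMillis.
        long currentWindow = nowMillis - (nowMillis % windowMillis);
        if (currentWindow != windowStart) {
            windowStart = currentWindow; // a new window starts: reset the counter
            count = 0;
        }
        if (count >= limit) {
            return false;
        }
        count++;
        return true;
    }
}

class FixedWindowBoundaryDemo {
    /** Sends `requests` requests at the same instant and returns how many were allowed. */
    static int burst(InMemoryFixedWindow limiter, long atMillis, int requests) {
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            if (limiter.tryAcquire(atMillis)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Limit: 10 requests per 60-second window.
        InMemoryFixedWindow limiter = new InMemoryFixedWindow(10, 60_000);
        int before = burst(limiter, 59_000, 15); // last second of the first window
        int after = burst(limiter, 60_500, 15);  // first second of the next window
        System.out.println("allowed within 1.5 seconds: " + (before + after));
    }
}
```

Each burst is capped at 10, but because the counter resets between them, 20 requests pass within 1.5 seconds even though the limit is 10 per minute.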

3.2.2 Sliding window algorithm

To alleviate the burst problem of the fixed window, we can use the sliding window algorithm. TCP flow control in computer networks also uses a sliding window.

Sliding window rate limiting algorithm: divide a large time window into multiple small windows, each with its own count. When a request arrives, check whether the number of requests across the entire window exceeds the limit. The window moves by sliding forward one small window at a time.

For example, suppose the large 1-minute window is divided into 5 small windows of 12 seconds each. Each small window has its own counter, and the window moves forward one cell every 12 seconds.

If a request arrives at 00:01, the window count at that moment is 3+12+9+15=39, so the requests from the preceding cells still count against the limit.

This is why the sliding window can solve the boundary problem. The more cells the window is divided into, the smoother the sliding, and the more accurate the rate limiting.
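The sub-window scheme described above can be sketched in a single JVM as follows. This is an illustration under assumptions, separate from the Redis-based implementation in this section: the class names (`BucketedSlidingWindow`, `SlidingWindowDemo`) are hypothetical, time is passed in explicitly, and the cells are kept in a fixed-size ring of counters.

```java
import java.util.Arrays;

// Hypothetical single-JVM sketch of a sliding window made of small per-cell counters.
class BucketedSlidingWindow {
    private final long limit;
    private final long bucketMillis;
    private final long[] bucketStart; // start time of the period each slot currently holds
    private final long[] counts;      // request count of each slot

    BucketedSlidingWindow(long limit, long windowMillis, int buckets) {
        this.limit = limit;
        this.bucketMillis = windowMillis / buckets;
        this.bucketStart = new long[buckets];
        this.counts = new long[buckets];
        Arrays.fill(bucketStart, -1);
    }

    /** Returns true if the request arriving at nowMillis is allowed. */
    synchronized boolean tryAcquire(long nowMillis) {
        long currentBucketStart = nowMillis - (nowMillis % bucketMillis);
        int slot = (int) ((nowMillis / bucketMillis) % counts.length);
        // The slot still holds an older period: the window has slid past it, so reuse it.
        if (bucketStart[slot] != currentBucketStart) {
            bucketStart[slot] = currentBucketStart;
            counts[slot] = 0;
        }
        // Sum only the cells that still belong to the current window.
        long oldestAllowed = currentBucketStart - bucketMillis * (counts.length - 1);
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (bucketStart[i] >= oldestAllowed) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        counts[slot]++;
        return true;
    }
}

class SlidingWindowDemo {
    /** Sends `requests` requests at the same instant and returns how many were allowed. */
    static int burst(BucketedSlidingWindow limiter, long atMillis, int requests) {
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            if (limiter.tryAcquire(atMillis)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Limit: 10 requests per minute, split into 5 cells of 12 seconds.
        BucketedSlidingWindow limiter = new BucketedSlidingWindow(10, 60_000, 5);
        System.out.println(burst(limiter, 59_000, 15));  // burst just before the minute mark
        System.out.println(burst(limiter, 60_500, 15));  // burst just after it
        System.out.println(burst(limiter, 120_000, 15)); // burst after the old cells have slid out
    }
}
```

Unlike a fixed window, the second burst is rejected entirely, because the cell that holds the first burst is still inside the window. The trade-off is precision: requests are only forgotten one whole cell at a time, which is why more cells give a more accurate limit.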

Algorithm implementation

So how do we implement the sliding window algorithm here? It is simple: we can use Redis's sorted set (zset) structure directly.

We use the timestamp as both score and member. When a request arrives, we add the current timestamp to the sorted set. For requests outside the window, we calculate the window's starting timestamp from the window size and delete everything older. The size of the sorted set is then the number of requests in the window.

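import java.util.concurrent.TimeUnit;

import org.redisson.api.RLock;
import org.redisson.api.RScoredSortedSet;
import org.redisson.api.RedissonClient;

public class SlidingWindowRateLimiter {

    // Use a prefix of its own so the keys do not collide with the fixed window limiter
    public static final String KEY = "slidingWindowRateLimiter:";

    // Maximum number of requests allowed per window
    private Long limit;

    // Window size in seconds
    private Long windowSize;

    public SlidingWindowRateLimiter(Long limit, Long windowSize) {
        this.limit = limit;
        this.windowSize = windowSize;
    }

    /**
     * Returns true if the request should be rejected (the limit was triggered).
     */
    public boolean triggerLimit(String path) {
        RedissonClient redissonClient = RedissonConfig.getInstance();
        // Use a distributed lock so that concurrent requests cannot overwrite each other's window counts
        RLock rLock = redissonClient.getLock(KEY + "LOCK:" + path);
        // Window counter
        RScoredSortedSet<Long> counter = redissonClient.getScoredSortedSet(KEY + path);
        try {
            rLock.lock(200, TimeUnit.MILLISECONDS);
            long currentTimestamp = System.currentTimeMillis();
            // Timestamp at which the window starts
            long windowStartTimestamp = currentTimestamp - windowSize * 1000;
            // Remove timestamps that fall outside the window: [0, windowStartTimestamp)
            counter.removeRangeByScore(0, true, windowStartTimestamp, false);
            // The number of elements in the zset is the request count for the window
            long count = counter.size();
            // Check the limit before recording, so that rejected requests do not occupy the window
            if (count >= limit) {
                System.out.println("[triggerLimit] path:" + path + " count:" + count + " reached limit:" + limit);
                return true;
            }
            // Use the current timestamp as both score and member
            // TODO: under high concurrency the timestamp may not be unique; a unique identifier could be appended
            counter.add(currentTimestamp, currentTimestamp);
        } catch (Exception e) {
            // Fail open: on an error the request is allowed through
            e.printStackTrace();
        } finally {
            // The lease may already have expired, so only unlock if this thread still holds the lock
            if (rLock.isHeldByCurrentThread()) {
                rLock.unlock();
            }
        }
        return false;
    }

}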

There is one more point to improve here. A zset overwrites an entry when the member is the same, so under high concurrency, repeated timestamps may cause some requests to go uncounted. You can alleviate this by using timestamp + random number as the member, or solve it by generating a unique sequence, such as a UUID or a snowflake ID.
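A minimal sketch of the unique-member idea follows. It does not talk to Redis: a `HashSet` stands in for the member uniqueness of a zset, and the helper names (`uniqueMember`, `distinctMembers`) are hypothetical. With Redisson, the same string could serve as the member of an `RScoredSortedSet<String>` while the timestamp stays the score, which is an assumption about how you would wire it in rather than part of the code above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper; a HashSet stands in for the member uniqueness of a Redis zset.
class UniqueMemberDemo {

    /** Builds a member that stays unique even when several requests share a timestamp. */
    static String uniqueMember(long timestampMillis) {
        return timestampMillis + ":" + UUID.randomUUID();
    }

    /** Adds `requests` members that share one timestamp and returns how many are kept. */
    static int distinctMembers(long timestampMillis, int requests, boolean unique) {
        Set<String> members = new HashSet<>();
        for (int i = 0; i < requests; i++) {
            members.add(unique ? uniqueMember(timestampMillis) : String.valueOf(timestampMillis));
        }
        return members.size();
    }

    public static void main(String[] args) {
        long sameMillisecond = 1_700_000_000_000L;
        // Plain timestamps collapse into one member, so two requests go uncounted.
        System.out.println(distinctMembers(sameMillisecond, 3, false));
        // Timestamp plus UUID keeps one member per request.
        System.out.println(distinctMembers(sameMillisecond, 3, true));
    }
}
```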

test:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SlidingWindowRateLimiterTest {

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(30, 50, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<>(10));
        SlidingWindowRateLimiter slidingWindowRateLimiter = new SlidingWindowRateLimiter(10L, 1L);
        // Simulate requests in different time slices
        for (int i = 0; i < 8; i++) {
            CountDownLatch countDownLatch = new CountDownLatch(20);
            for (int j = 0; j < 20; j++) {
                threadPoolExecutor.execute(() -> {
                    boolean isLimit = slidingWindowRateLimiter.triggerLimit("/test");
                    System.out.println(isLimit);
                    countDownLatch.countDown();
                });
            }
            countDownLatch.await();
            // Sleep for 10 seconds
            TimeUnit.SECONDS.sleep(10L);
        }
        // Stop the worker threads so the pool does not keep the JVM alive
        threadPoolExecutor.shutdown();
    }

}

Implementing the sliding window with Redis solves the boundary problem of the fixed window. It also introduces a new problem: because every request within the window is stored, it may take up more memory under high concurrency.

3.2.3 Leaky bucket algorithm

As we can see, counter-based rate limiting behaves like a "sudden stop": once the limit is reached, requests are cut off abruptly. Other rate limiting algorithms handle this more smoothly.

Leaky bucket algorithm: requests flow into the leaky bucket at any speed, like water, and the bucket leaks at a fixed rate.

When the inflow rate is greater than the outflow rate, the bucket fills up, and new incoming requests are discarded.

The two major uses of the leaky bucket algorithm are network traffic shaping (Traffic Shaping) and rate limiting (Rate Limiting).
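The behavior described above can be sketched in memory as a bounded queue that is drained at a fixed rate. This is only an illustration under assumptions: the class names (`LeakyBucketQueue`, `LeakyBucketDemo`) are hypothetical, time is passed in explicitly, and the outflow is computed lazily from the elapsed time rather than by a background task.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical single-JVM sketch: a bounded queue drained at a fixed rate.
class LeakyBucketQueue {
    private final int capacity;
    private final long leakPerSecond;
    private final Queue<String> bucket = new ArrayDeque<>();
    private long lastLeakMillis;

    LeakyBucketQueue(int capacity, long leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = startMillis;
    }

    /** Inflow: accept the request if there is room, otherwise discard it. */
    synchronized boolean offer(String request) {
        if (bucket.size() >= capacity) {
            return false;
        }
        bucket.add(request);
        return true;
    }

    /** Outflow: release at most leakPerSecond requests per elapsed second. */
    synchronized List<String> leak(long nowMillis) {
        long drops = (nowMillis - lastLeakMillis) * leakPerSecond / 1000;
        List<String> released = new ArrayList<>();
        if (drops > 0) {
            // Advance only by the time the whole drops account for, so fractions carry over.
            lastLeakMillis += drops * 1000 / leakPerSecond;
            while (drops-- > 0 && !bucket.isEmpty()) {
                released.add(bucket.poll());
            }
        }
        return released;
    }
}

class LeakyBucketDemo {
    /** Offers `requests` requests at once and returns how many fit into the bucket. */
    static int pour(LeakyBucketQueue bucket, int requests) {
        int accepted = 0;
        for (int i = 0; i < requests; i++) {
            if (bucket.offer("req-" + i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        LeakyBucketQueue bucket = new LeakyBucketQueue(10, 2, 0);
        System.out.println("accepted: " + pour(bucket, 20));       // the bucket holds 10
        System.out.println("released: " + bucket.leak(1000));      // 2 requests per second
        System.out.println("accepted later: " + pour(bucket, 20)); // only the freed room
    }
}
```

However fast requests are poured in, they leave the bucket at 2 per second, and anything that does not fit is discarded.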

Algorithm implementation

We used RScoredSortedSet in the sliding window algorithm. The same structure works here, and old requests can be deleted directly with the ZREMRANGEBYSCORE command.

  • Inflow: when a request comes in, check whether the bucket is full. If it is full, the request is rejected; otherwise the request is put into the bucket.
  • Outflow: to keep the leak rate steady, use a scheduled task that regularly deletes old requests.
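import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.redisson.api.RLock;
import org.redisson.api.RScoredSortedSet;
import org.redisson.api.RSet;
import org.redisson.api.RedissonClient;

public class LeakyBucketRateLimiter {

    private RedissonClient redissonClient = RedissonConfig.getInstance();
    private static final String KEY_PREFIX = "LeakyBucket:";

    /**
     * Bucket capacity
     */
    private Long bucketSize;
    /**
     * Leak rate, in requests per second
     */
    private Long leakRate;

    public LeakyBucketRateLimiter(Long bucketSize, Long leakRate) {
        this.bucketSize = bucketSize;
        this.leakRate = leakRate;
        // Start a scheduled task that runs once per second
        ScheduledExecutorService executorService = Executors.newScheduledThreadPool(1);
        executorService.scheduleAtFixedRate(this::leakWater, 0, 1, TimeUnit.SECONDS);
    }

    /**
     * Leak water
     *
     * @author zzc
     * @date 2023/7/19 17:16
     */
    public void leakWater() {
        RSet<String> pathSet = redissonClient.getSet(KEY_PREFIX + ":pathSet");
        // Iterate over all paths and delete old requests
        for (String path : pathSet) {
            RScoredSortedSet<Long> bucket = redissonClient.getScoredSortedSet(KEY_PREFIX + path);
            // Current time
            long now = System.currentTimeMillis();
            // Delete old requests.
            // Note: this drops every entry older than leakRate seconds, which only
            // approximates a strict outflow of leakRate requests per second.
            bucket.removeRangeByScore(0, true, now - 1000 * leakRate, true);
        }
    }

    /**
     * Returns true if the request should be rejected (the bucket is full).
     */
    public boolean triggerLimit(String path) {
        // Lock to prevent concurrent initialization problems
        RLock rLock = redissonClient.getLock(KEY_PREFIX + "LOCK:" + path);
        try {
            rLock.lock(100, TimeUnit.MILLISECONDS);
            String redisKey = KEY_PREFIX + path;
            RScoredSortedSet<Long> bucket = redissonClient.getScoredSortedSet(redisKey);
            // Keep all paths in a set so the scheduled task can find every bucket
            RSet<String> pathSet = redissonClient.getSet(KEY_PREFIX + ":pathSet");
            pathSet.add(path);
            // Current time
            long now = System.currentTimeMillis();
            // Check whether the bucket is full
            if (bucket.size() < bucketSize) {
                // The bucket is not full: add one element to it
                bucket.add(now, now);
                return false;
            }
            // The bucket is full: trigger the limit
            System.out.println("[triggerLimit] path:" + path + " bucket size:" + bucket.size());
            return true;
        } finally {
            // The lease may already have expired, so only unlock if this thread still holds the lock
            if (rLock.isHeldByCurrentThread()) {
                rLock.unlock();
            }
        }
    }

}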

In the code, we use an RSet to store the paths. This way, one scheduled task handles the leaking of the buckets for all paths, without creating a scheduled task per bucket.

Here I simply use ScheduledExecutorService to start a scheduled task that runs once per second. In a cluster, however, every machine would run its own scheduled task, which wastes resources and is hard to manage. A distributed scheduler such as xxl-job could execute leakWater instead.

test:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class LeakyBucketRateLimiterTest {

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(30, 50, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<>(10));
        LeakyBucketRateLimiter leakyBucketRateLimiter = new LeakyBucketRateLimiter(10L, 1L);
        for (int i = 0; i < 8; i++) {
            CountDownLatch countDownLatch = new CountDownLatch(20);
            for (int j = 0; j < 20; j++) {
                threadPoolExecutor.execute(() -> {
                    boolean isLimit = leakyBucketRateLimiter.triggerLimit("/test");
                    System.out.println(isLimit);
                    countDownLatch.countDown();
                });
            }
            countDownLatch.await();
            // Sleep for 10 seconds
            TimeUnit.SECONDS.sleep(10L);
        }
        // Stop the worker threads of the test pool
        threadPoolExecutor.shutdown();
    }

}

The leaky bucket algorithm can effectively prevent network congestion and is relatively simple to implement.

However, because the outflow rate of the leaky bucket is fixed, when a large number of requests arrives suddenly, the excess can only be discarded. Even if the downstream could handle more traffic, system resources cannot be fully utilized.

3.2.4 Token Bucket Algorithm

Token bucket algorithm: the system adds tokens to the bucket at a fixed rate. Each request must take a token out of the bucket before it is processed, and only requests that obtain a token pass. The token bucket algorithm therefore allows requests to arrive at any rate, as long as there are enough tokens in the bucket.
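As a rough in-memory illustration of this description, the sketch below refills tokens lazily from the elapsed time instead of running a background thread. It is not how Redisson implements its rate limiter: the class names (`InMemoryTokenBucket`, `TokenBucketDemo`) are hypothetical, time is passed in explicitly, and the bucket is assumed to start full.

```java
// Hypothetical single-JVM sketch; tokens are refilled lazily from the elapsed time.
class InMemoryTokenBucket {
    private final long capacity;
    private final long refillPerSecond;
    private long tokens;
    private long lastRefillMillis;

    InMemoryTokenBucket(long capacity, long refillPerSecond, long startMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // assumption: the bucket starts full
        this.lastRefillMillis = startMillis;
    }

    /** Returns true if a token was available for the request arriving at nowMillis. */
    synchronized boolean tryAcquire(long nowMillis) {
        long newTokens = (nowMillis - lastRefillMillis) * refillPerSecond / 1000;
        if (newTokens > 0) {
            // Tokens beyond the capacity are dropped, which bounds the size of a burst.
            tokens = Math.min(capacity, tokens + newTokens);
            // Advance only by the time the whole tokens account for, so fractions carry over.
            lastRefillMillis += newTokens * 1000 / refillPerSecond;
        }
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }
}

class TokenBucketDemo {
    /** Sends `requests` requests at the same instant and returns how many got a token. */
    static int burst(InMemoryTokenBucket bucket, long atMillis, int requests) {
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            if (bucket.tryAcquire(atMillis)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Capacity 10, refilled at 1 token per second.
        InMemoryTokenBucket bucket = new InMemoryTokenBucket(10, 1, 0);
        System.out.println(burst(bucket, 0, 20));      // the full bucket absorbs a burst
        System.out.println(burst(bucket, 3_000, 20));  // only the tokens refilled since then
        System.out.println(burst(bucket, 60_000, 20)); // after idling, refill is capped at capacity
    }
}
```

The first and last bursts show the key property: saved-up tokens let a burst through at once, up to the bucket capacity, while the long-run rate stays bounded by the refill rate.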

Algorithm implementation

Tokens must be issued at a fixed rate, which would normally mean starting a thread that periodically puts tokens into the bucket. Redisson, however, already provides a rate limiter (RRateLimiter), which we use here as the token bucket implementation.

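import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;

public class TokenBucketRateLimiter {

    public static final String KEY = "TokenBucketRateLimiter:";

    /**
     * Threshold: number of permits issued per interval
     */
    private Long limit;

    /**
     * Interval, in seconds, over which `limit` permits are issued
     * (passed to trySetRate as the rate interval)
     */
    private Long tokenRate;

    public TokenBucketRateLimiter(Long limit, Long tokenRate) {
        this.limit = limit;
        this.tokenRate = tokenRate;
    }

    /**
     * Rate limiting check. Returns true if the request should be rejected.
     */
    public boolean triggerLimit(String path) {
        RedissonClient redissonClient = RedissonConfig.getInstance();
        RRateLimiter rateLimiter = redissonClient.getRateLimiter(KEY + path);
        // Initialize: set the rate mode, rate, interval, and interval unit
        rateLimiter.trySetRate(RateType.OVERALL, limit, tokenRate, RateIntervalUnit.SECONDS);
        // tryAcquire returns true when a permit was obtained, so the limit is triggered when it returns false
        return !rateLimiter.tryAcquire();
    }

}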

test:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TokenBucketRateLimiterTest {

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(30, 50, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<>(10));
        TokenBucketRateLimiter tokenBucketRateLimiter = new TokenBucketRateLimiter(10L, 1L);
        for (int i = 0; i < 8; i++) {
            CountDownLatch countDownLatch = new CountDownLatch(20);
            for (int j = 0; j < 20; j++) {
                threadPoolExecutor.execute(() -> {
                    boolean isLimit = tokenBucketRateLimiter.triggerLimit("/test");
                    System.out.println(isLimit);
                    countDownLatch.countDown();
                });
            }
            countDownLatch.await();
            // Sleep for 10 seconds
            TimeUnit.SECONDS.sleep(10L);
        }
        // Stop the worker threads so the pool does not keep the JVM alive
        threadPoolExecutor.shutdown();
    }

}

Summary

In this article, we built distributed implementations of four rate limiting algorithms using the easy-to-use Redisson client.

Of course, the implementation also has shortcomings:

  • Concurrency is handled with distributed locks, which costs performance under high concurrency. It would be better to implement the logic directly in Lua scripts.
  • The calling style could be more elegant, for example annotation-driven calls implemented with AOP; the code design and inheritance hierarchy could also be improved.
  • There is no rejection strategy after limiting, such as throwing an exception, caching the request, or pushing it to MQ for later processing. Rate limiting is only a means; the ultimate goal is to keep the system as stable as possible.

In addition, there are many useful open source rate limiting tools:

  • Guava RateLimiter: rate limiting based on the token bucket algorithm; single-machine only
  • Sentinel: rate limiting based on a sliding window; supports single-machine and cluster modes
  • Gateway rate limiting: many gateways ship their own rate limiting, such as Spring Cloud Gateway and Nginx


Origin blog.csdn.net/sco5282/article/details/131806018