8. "Java Master" teaches you how to easily implement efficient rate limiting with Spring Boot!

Foreword: Java is widely used in back-end development, and its rich open source ecosystem and large community continue to drive the language forward. As Internet applications grow, however, high concurrency and heavy traffic have become increasingly prominent problems, and keeping a system stable and robust is an unavoidable part of Java development. This article starts with the concept of rate limiting, explains its principles and implementation methods in detail, and gives code demonstrations based on the Spring Boot framework to help readers understand how to implement rate limiting effectively.


Introduction to rate limiting

Every online service has a capacity limit. When traffic exceeds that limit, the system may hang or crash, which is why degradation and rate limiting exist. Rate limiting means that under high concurrency, sustained or instantaneous, the system sacrifices some requests or delays their processing so that the service as a whole remains stable and available.

Algorithms

Token bucket, leaky bucket, and counter are the three most commonly used rate limiting algorithms.

Classification

Application level (standalone)

Application-level rate limiting only limits requests within a single application instance; it cannot limit global traffic.

  1. Limit the total amount of a resource
  2. Limit the total number of concurrent requests/connections
  3. Limit the total number of concurrent requests for an interface
  4. Limit the number of requests to an interface within a time window
  5. Smooth the request rate of an interface
  6. Guava RateLimiter

Distributed

Limiting global traffic requires distributed rate limiting and access-layer rate limiting.

  1. Redis with a Lua script
  2. Nginx with a Lua script
  3. OpenResty's open source rate limiting solution
  4. A rate limiting framework such as Sentinel, which provides degradation, rate limiting, and circuit breaking

Option 1: Token Bucket

The token bucket algorithm is the most commonly used algorithm in network traffic shaping and rate limiting. Picture a bucket: the system adds tokens to it at a fixed rate and stops adding once the bucket is full. Each incoming request must take a token from the bucket. A request that obtains a token is processed; a request that finds the bucket empty is rejected.


If no requests arrive for a while, tokens accumulate in the bucket. When a burst of traffic then arrives, it can be processed immediately as long as enough tokens are available. The defining characteristic of the token bucket algorithm is therefore that it allows bursts of traffic.
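The refill-and-take behavior described above can be sketched without any library. The following is a minimal illustration rather than Guava's implementation; the class name TokenBucket, the whole-token integer arithmetic, and the injectable nanoClock parameter (present only so that time can be controlled in a test) are assumptions made for this example. It also assumes tokensPerSecond is between 1 and 1,000,000,000.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket (illustrative sketch, not Guava's implementation). */
final class TokenBucket {
    private final long capacity;          // maximum number of tokens the bucket can hold
    private final long nanosPerToken;     // one token is added every nanosPerToken nanoseconds
    private final LongSupplier nanoClock; // injectable time source (assumption, eases testing)
    private long tokens;                  // tokens currently in the bucket
    private long lastRefillNanos;         // time up to which tokens have been credited

    TokenBucket(long capacity, long tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.nanosPerToken = 1_000_000_000L / tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;           // start full, so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if the bucket has one; returns false (reject) otherwise. */
    synchronized boolean tryAcquire() {
        refill();
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }

    /** Credits the tokens earned since the last refill, never exceeding capacity. */
    private void refill() {
        long earned = (nanoClock.getAsLong() - lastRefillNanos) / nanosPerToken;
        if (earned > 0) {
            tokens = Math.min(capacity, tokens + earned);
            lastRefillNanos += earned * nanosPerToken; // keep the fractional remainder
        }
    }
}
```

In real code the clock would typically be System::nanoTime, for example new TokenBucket(10, 5, System::nanoTime) for a burst of up to 10 requests and a sustained rate of 5 requests per second.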

Example: Guava RateLimiter - SmoothBursty

Guava RateLimiter provides two token bucket implementations: smooth bursty rate limiting (SmoothBursty) and smooth warm-up rate limiting (SmoothWarmingUp).

  • Case 1
package com.example.throttle;

import com.google.common.util.concurrent.RateLimiter;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
public class GuavaLimiterTest {

    @Test
    public void bucket() {
        // Bucket capacity is 1 and 1 token is added per second, i.e. one new token every 1000 ms
        RateLimiter limiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting
            System.out.println(limiter.acquire());
        }
    }
}

// 0.0
// 0.999589
// 0.994164
// 0.993863
// 0.997597

1. RateLimiter.create(1) creates a limiter whose bucket capacity is 1 and which adds 1 token every second;

2. limiter.acquire() consumes one token. If the bucket holds enough tokens, the call returns immediately (return value 0). If the bucket is empty, the call blocks until a token becomes available: with a token interval of 1 second it waits about 1 second and then consumes the token (the test above returns about 0.99 because it takes almost 1 second for a token to appear). In this way the implementation smooths a burst of requests into a fixed request rate. If the caller does not want to wait, it can use tryAcquire, which returns immediately.

  • Case 2 - RateLimiter burst handling:
RateLimiter limiter = RateLimiter.create(5);
System.out.println(limiter.acquire(5));
System.out.println(limiter.acquire(1));
System.out.println(limiter.acquire(1));
System.out.println(limiter.acquire(1));

// Output will be similar to:
0.0
0.98745
0.183553
0.199909

RateLimiter.create(5) creates a bucket with a capacity of 5 that adds 5 tokens every second. Because the token bucket algorithm allows a certain degree of burst, limiter.acquire(5) can consume 5 tokens at once. The next limiter.acquire(1), however, waits almost 1 second before a token is available, and subsequent requests are shaped to the fixed rate of one token every 200 ms.

  • Case 3 - RateLimiter burst handling:
RateLimiter limiter = RateLimiter.create(5);
System.out.println(limiter.acquire(10));
System.out.println(limiter.acquire(1));
System.out.println(limiter.acquire(1));
System.out.println(limiter.acquire(1));

// Output will be similar to:
0.0
1.997428
0.192273
0.200616

As in the previous example, 10 requests burst in during the first second. The token bucket algorithm allows this burst too, by letting the caller consume future tokens. The next limiter.acquire(1) then waits almost 2 seconds before a token is available, and subsequent requests are shaped to the fixed rate.

  • Case 4
RateLimiter limiter = RateLimiter.create(2);
System.out.println(limiter.acquire());
Thread.sleep(2000L);
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());

// Output will be similar to:
0.0
0.0
0.0
0.0
0.499876
0.495799

1. Create a bucket with a capacity of 2 that adds 2 tokens every second;

2. The first limiter.acquire() consumes a token that the bucket can supply immediately (return value 0);

3. The thread then pauses for 2 seconds. The next two limiter.acquire() calls consume the tokens accumulated during the pause, the third is also granted immediately, and the fourth has to wait 500 milliseconds.

The bucket capacity here is 2 (the allowed burst size) because SmoothBursty has a parameter named maxBurstSeconds, whose default value is 1 second, and burst size (bucket capacity) = rate * maxBurstSeconds = 2 * 1 = 2. After the pause, the first two calls consume the accumulated burst allowance, and the third call is charged at the normal rate, which is why the call after it waits 500 ms. The token bucket algorithm therefore lets tokens that went unused for a while be stored in the bucket and spent later on a burst of requests.

SmoothBursty calculates the time at which the next token becomes available from the average rate and the time the last token was issued. It also needs a bucket to store tokens that have gone unused for a while (the number of tokens available for a burst). RateLimiter additionally provides the tryAcquire method for non-blocking or time-limited token consumption.
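The wait times printed in Cases 3 and 4 can be reproduced with a small model of this bookkeeping. The sketch below is a simplified illustration, not Guava's source code: the class name BurstyModel and the method reserve are hypothetical, permits are counted as whole numbers, and the caller passes in the current time in microseconds instead of the limiter sleeping. It assumes maxBurstSeconds is 1, so the bucket capacity equals the rate.

```java
/** Simplified model of "next free time plus stored permits" bookkeeping (not Guava's code). */
final class BurstyModel {
    private final long intervalMicros; // time between two permits, e.g. 200_000 for 5 permits/s
    private final long maxStored;      // bucket capacity = rate * maxBurstSeconds (assumed 1 s)
    private long stored = 0;           // unused permits saved up while the limiter was idle
    private long nextFreeMicros = 0;   // earliest time at which the next request is granted

    BurstyModel(long permitsPerSecond) {
        this.intervalMicros = 1_000_000L / permitsPerSecond;
        this.maxStored = permitsPerSecond;
    }

    /** Reserves permits at time nowMicros and returns how long the caller has to wait. */
    long reserve(int permits, long nowMicros) {
        if (nowMicros > nextFreeMicros) {
            // Idle time is converted into stored permits, capped at the bucket capacity
            long earned = (nowMicros - nextFreeMicros) / intervalMicros;
            stored = Math.min(maxStored, stored + earned);
            nextFreeMicros = nowMicros;
        }
        long waitMicros = nextFreeMicros - nowMicros; // wait caused by earlier requests
        long fromStore = Math.min(permits, stored);   // stored permits are free
        long fresh = permits - fromStore;             // the rest are "future" permits
        stored -= fromStore;
        nextFreeMicros += fresh * intervalMicros;     // the cost is charged to the next caller
        return waitMicros;
    }
}
```

The key design point the model makes visible is that a request never waits for its own permits; it waits for the debt left by the previous request, which is why acquire(10) returns 0.0 and the following acquire(1) waits about 2 seconds.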

Because SmoothBursty allows a certain degree of burst, there is a concern that a sudden surge of traffic could overwhelm the system. In that case a smoother limiter is needed, one that lets the system approach the configured rate gradually after a cold start (the rate is lower at first and then slowly rises to the configured rate). Guava provides SmoothWarmingUp for this requirement, which behaves similarly to the leaky bucket algorithm.

Example: Guava RateLimiter - SmoothWarmingUp

SmoothWarmingUp is created with RateLimiter.create(double permitsPerSecond, long warmupPeriod, TimeUnit unit).

permitsPerSecond is the number of tokens added per second, and warmupPeriod is the time it takes to move from the cold-start rate to the average rate.

RateLimiter limiter = RateLimiter.create(5, 1000, TimeUnit.MILLISECONDS);
for (int i = 0; i < 5; i++) {
    System.out.println(limiter.acquire());
}
Thread.sleep(1000L);
for (int i = 0; i < 5; i++) {
    System.out.println(limiter.acquire());
}

// Output will be similar to:
0.0
0.51767
0.357814
0.219992
0.199984
0.0
0.360826
0.220166
0.199723
0.199555

The rate rises along a trapezoidal curve: at cold start the interval between tokens is relatively large, and it then shrinks gradually until it settles at the average interval (about 200 ms in the output above). Adjusting the warmupPeriod parameter controls how quickly the limiter reaches the smooth fixed rate.

Option 2: Leaky bucket method

Water (requests) first enters the leaky bucket, and the bucket lets water out at a fixed rate (the rate at which the interface responds). When water flows in too fast, the bucket overflows (the access frequency exceeds the interface's response rate) and the excess requests are rejected. The leaky bucket algorithm can thus enforce a hard limit on the data transmission rate.


There are two variables here: the size of the bucket, which determines how much water can be held when traffic suddenly increases (burst), and the size of the hole in the bucket, which determines the outflow rate (rate).

Because the leak rate of a leaky bucket is a fixed parameter, the algorithm cannot let traffic burst up to the port rate even when the network has no resource contention (no congestion). The leaky bucket algorithm is therefore inefficient for traffic with bursty characteristics.
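The two variables, bucket size and leak rate, map directly onto a small class. The following is a minimal sketch under stated assumptions, not code from any library: the names LeakyBucket and tryAccept are hypothetical, water is counted in whole requests, and the nanoClock parameter is injected only so that time can be controlled in a test. It models the bucket as a meter that accepts or rejects requests, rather than as a queue that delays them.

```java
import java.util.function.LongSupplier;

/** Minimal leaky bucket used as a meter (illustrative sketch). */
final class LeakyBucket {
    private final long capacity;          // how much water (pending requests) the bucket holds
    private final long nanosPerLeak;      // one unit of water leaks every nanosPerLeak nanoseconds
    private final LongSupplier nanoClock; // injectable time source (assumption, eases testing)
    private long water = 0;               // current water level
    private long lastLeakNanos;           // time up to which leaking has been accounted for

    LeakyBucket(long capacity, long leaksPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.nanosPerLeak = 1_000_000_000L / leaksPerSecond;
        this.nanoClock = nanoClock;
        this.lastLeakNanos = nanoClock.getAsLong();
    }

    /** Adds one request to the bucket; returns false if the bucket would overflow. */
    synchronized boolean tryAccept() {
        leak();
        if (water < capacity) {
            water++;
            return true;
        }
        return false; // overflow: the request is rejected
    }

    /** Removes the water that has leaked out since the last call, at the fixed rate. */
    private void leak() {
        long leaked = (nanoClock.getAsLong() - lastLeakNanos) / nanosPerLeak;
        if (leaked > 0) {
            water = Math.max(0, water - leaked);
            lastLeakNanos += leaked * nanosPerLeak; // keep the fractional remainder
        }
    }
}
```

However long the bucket has been idle, it never holds more than capacity requests and drains at exactly leaksPerSecond, which is the fixed outflow rate the text describes.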

Comparison between token bucket and leaky bucket

  • The token bucket adds tokens at a fixed rate. Whether a request is processed depends on whether the bucket holds enough tokens; when the number of tokens drops to zero, new requests are rejected;
  • The leaky bucket lets requests flow out at a constant fixed rate, while the inflow rate is arbitrary. When the accumulated incoming requests reach the bucket capacity, new requests are rejected;
  • The token bucket limits the average inflow rate. Bursts are allowed: requests are processed as long as tokens remain, and a request may take several tokens (for example 3 or 4) at once;
  • The leaky bucket limits the outflow to a constant rate (the outflow rate is a fixed value, for example always 1, never 1 one time and 2 the next), thereby smoothing a bursty inflow rate;
  • The token bucket allows a certain degree of burst, whereas the main purpose of the leaky bucket is to smooth the inflow rate;
  • The two algorithms can share the same implementation, just in opposite directions, and with the same parameters they produce the same rate limiting effect.

Option 3: Counter method

Counter-based rate limiting is also common. It is mainly used to limit total concurrency, for example the size of a database connection pool, the size of a thread pool, or the number of concurrent calls to a program. It is also the simplest and crudest algorithm.

Using AtomicInteger

Use an AtomicInteger to count the requests currently being executed. If the count exceeds the threshold, respond to the user right away with a message such as "the system is busy, please try again later" or other business-related information.

Disadvantage: the AtomicInteger approach is blunt. Any request over the threshold is rejected, so requests may be rejected even when the request volume is high only for a moment.

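package com.example.throttle;

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Counter-based rate limiting
 */
@SpringBootTest
public class AtomicIntegerTest {

    @Test
    void simple() throws InterruptedException {
        var atomicInteger = new AtomicInteger(0);
        var throttle = 10;
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            Thread thread = new Thread(() -> {
                try {
                    // Count this request; reject it if the concurrency threshold is exceeded
                    if (atomicInteger.incrementAndGet() > throttle) {
                        System.out.println(Thread.currentThread().getId() + " rejected: too fast");
                        return;
                    }
                    System.out.println(Thread.currentThread().getId() + " processing");
                    Thread.sleep(1000); // simulate business logic
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    // Always undo the increment, for rejected requests as well
                    atomicInteger.decrementAndGet();
                }
            });
            threads.add(thread);
            thread.start();
        }
        // Wait for all workers so the test does not end before they finish
        for (Thread thread : threads) {
            thread.join();
        }
    }
}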

Using Semaphore

Use a Semaphore to control the number of concurrent executions. A request that arrives when all permits are taken enters a waiting queue until it obtains a permit. If more requests are queued than the system can handle, further requests can be rejected.

Advantage over AtomicInteger: under instantaneous high concurrency, requests can wait in the queue instead of being rejected immediately, which shaves traffic peaks.

package com.example.throttle;

import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Rate limiting a single interface with a semaphore
 */
@SpringBootTest
public class SemaphoreTest {

    private final Semaphore semaphore = new Semaphore(10);

    @Test
    void test() throws InterruptedException {
        SemaphoreTest test = new SemaphoreTest();
        for (int i = 0; i < 100; i++) {
            new Thread(test::myMethod).start();
        }
        // Give the worker threads time to finish before the test exits
        Thread.sleep(1500);
    }

    public void myMethod() {
        try {
            // Wait up to 500 ms for a permit; give up if none becomes available
            if (!semaphore.tryAcquire(500, TimeUnit.MILLISECONDS)) {
                System.out.println(Thread.currentThread().getId() + " rejected");
                return;
            }
            try {
                System.out.println(Thread.currentThread().getId() + " processing");
                Thread.sleep(500); // simulate business logic
            } finally {
                semaphore.release(); // always return the permit
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
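The text above mentions rejecting requests once too many are queued. One way to sketch that idea is to count the waiting requests next to the semaphore and refuse to queue beyond a cap. The class below is an illustrative sketch under stated assumptions: the names QueueBoundedLimiter, tryEnter and release are hypothetical, and the waiting counter is an addition for this example, not part of Semaphore itself.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Semaphore-based limiter that also caps how many requests may wait (illustrative sketch). */
final class QueueBoundedLimiter {
    private final Semaphore permits;                          // concurrent executions allowed
    private final int maxWaiting;                             // cap on queued requests
    private final AtomicInteger waiting = new AtomicInteger(); // requests currently queued

    QueueBoundedLimiter(int maxConcurrent, int maxWaiting) {
        this.permits = new Semaphore(maxConcurrent);
        this.maxWaiting = maxWaiting;
    }

    /** Returns true if a permit was obtained; the caller must then call release(). */
    boolean tryEnter(long timeout, TimeUnit unit) {
        if (permits.tryAcquire()) {
            return true;                   // a permit is free, no need to queue
        }
        if (waiting.incrementAndGet() > maxWaiting) {
            waiting.decrementAndGet();
            return false;                  // queue is full: reject immediately
        }
        try {
            return permits.tryAcquire(timeout, unit); // queue up for a limited time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            waiting.decrementAndGet();
        }
    }

    void release() {
        permits.release();
    }
}
```

A caller would wrap its work as: if tryEnter(...) returns true, run the business logic inside try and call release() in finally; otherwise return a "system busy" response.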


Option 4: Limit the number of requests to an interface within a time window

The number of requests an interface receives within a time window can be limited as follows:

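import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// One counter per second, keyed by the epoch second; entries expire after 2 seconds
LoadingCache<Long, AtomicLong> counter =
        CacheBuilder.newBuilder()
                .expireAfterWrite(2, TimeUnit.SECONDS)
                .build(new CacheLoader<Long, AtomicLong>() {
                    @Override
                    public AtomicLong load(Long seconds) {
                        return new AtomicLong(0);
                    }
                });
long limit = 1000;
while (true) {
    // Get the current second
    long currentSeconds = System.currentTimeMillis() / 1000;
    // getUnchecked is safe here because load() never throws
    if (counter.getUnchecked(currentSeconds).incrementAndGet() > limit) {
        System.out.println("Rate limited: " + currentSeconds);
        continue;
    }
    // Business logic
}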

This code counts the total number of requests in each second. The cache expiration time is set to more than 1 second (2 seconds here) so that the counter for the current second cannot expire while it is still in use. The approach is unaffected by how long the business logic takes; it only counts how many requests were made within each second.
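The same per-second counting can be sketched with the JDK alone, which makes the window logic easier to see. The class below is an illustration under stated assumptions, not the Guava-based code above: the names FixedWindowLimiter and tryAcquire are hypothetical, and the millisClock parameter is injected only so that time can be controlled in a test.

```java
import java.util.function.LongSupplier;

/** Fixed one-second window counter (illustrative sketch using only the JDK). */
final class FixedWindowLimiter {
    private final long limit;               // maximum requests allowed per window
    private final LongSupplier millisClock; // injectable time source (assumption, eases testing)
    private long windowSecond = -1;         // epoch second of the current window
    private long count = 0;                 // requests accepted in the current window

    FixedWindowLimiter(long limit, LongSupplier millisClock) {
        this.limit = limit;
        this.millisClock = millisClock;
    }

    /** Returns true if the request fits into the current one-second window. */
    synchronized boolean tryAcquire() {
        long second = millisClock.getAsLong() / 1000;
        if (second != windowSecond) {
            windowSecond = second; // a new window starts: reset the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

One property of fixed windows worth keeping in mind: requests that arrive just before and just after a window boundary are counted in different windows, so up to twice the limit can pass within a very short span. With System::currentTimeMillis as the clock, new FixedWindowLimiter(1000, System::currentTimeMillis) corresponds to the limit of 1000 used above.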

Option 5: Others

1. Use Hystrix to implement rate limiting in Java

Hystrix is an open source latency and fault tolerance library from Netflix. By setting the maximum number of concurrent requests and a timeout, we can control the load on the system and prevent its resources from being exhausted. The following sample code implements rate limiting with Hystrix:

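import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolKey;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RateLimitController {

    // Shared HystrixCommand configuration
    private final HystrixCommand.Setter commandConfig = HystrixCommand.Setter
            .withGroupKey(HystrixCommandGroupKey.Factory.asKey("group1"))
            .andCommandKey(HystrixCommandKey.Factory.asKey("command1"))
            .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("threadPool1"))
            .andCommandPropertiesDefaults(
                    HystrixCommandProperties.Setter()
                            // The semaphore limit below only applies under SEMAPHORE isolation
                            .withExecutionIsolationStrategy(
                                    HystrixCommandProperties.ExecutionIsolationStrategy.SEMAPHORE)
                            .withExecutionIsolationSemaphoreMaxConcurrentRequests(10)
                            .withExecutionTimeoutInMilliseconds(500)
                            .withFallbackIsolationSemaphoreMaxConcurrentRequests(100)
            );

    @GetMapping("/test")
    public String test() {
        // A HystrixCommand instance can be executed only once, so create one per request
        HystrixCommand<String> command = new HystrixCommand<String>(commandConfig) {
            @Override
            protected String run() throws Exception {
                // Execute the business logic
                return "success";
            }

            @Override
            protected String getFallback() {
                // Return the failure message
                return "fail";
            }
        };
        return command.execute();
    }
}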

In the above code, we control the load on the system through a HystrixCommand configuration that sets the maximum number of concurrent requests and the timeout. The business logic runs inside the command's run() method. If it succeeds, its result is returned; if it fails, times out, or is rejected because the concurrency limit has been reached, getFallback() supplies the failure message.

2. Use Redis to implement rate limiting in Java

Redis is an open source in-memory data store, commonly used as a database and a cache. Using a Redis counter with an expiration time, we can implement rate limiting. The following sample code implements rate limiting with Redis:

import java.util.concurrent.TimeUnit;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RateLimitController {

    private static final String COUNTER_KEY = "count";
    private static final long LIMIT = 10;

    // Injected by Spring; values are stored as plain strings so that INCR works
    private final StringRedisTemplate redisTemplate;

    public RateLimitController(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    @GetMapping("/test")
    public String test() {
        // Atomically increment the counter; Redis creates it with value 1 if it does not exist
        Long count = redisTemplate.opsForValue().increment(COUNTER_KEY);

        // First request of the window: let the counter expire after 1 second
        if (count != null && count == 1) {
            redisTemplate.expire(COUNTER_KEY, 1, TimeUnit.SECONDS);
        }

        // If the counter exceeds the maximum value, limit the request
        if (count != null && count > LIMIT) {
            return "fail";
        }

        // Execute the business logic
        return "success";
    }
}

In the above code, Spring injects a StringRedisTemplate, and a Redis counter implements the rate limit. Each request first increments the counter atomically; Redis creates the key with value 1 if it does not exist, and in that case we give the key a 1-second expiration so that the counter resets for the next window. If the counter exceeds the maximum value of 10, the request is rejected; otherwise the business logic is executed.


Origin blog.csdn.net/qq_45704048/article/details/132948283