Source Code Analysis of the RateLimiter Rate Limiter | JD Cloud Technical Team

Author: JD Technology Li Yuliang


Rate limiting scenarios

Rate limiting is generally used in two scenarios in software systems:

Scenario 1: high-concurrency, consumer-facing traffic. Consumer-facing (C-end) systems routinely face massive volumes of user requests; without rate limiting, an instantaneous burst of traffic can overwhelm the system.

Scenario 2: internal task processing. For example, a certain class of tasks must be processed at a controlled speed, or in upstream/downstream calls the downstream system imposes a rate requirement on the upstream caller.

In either scenario the request processing rate needs to be limited: either the rate of individual requests is held roughly constant, or the rate of a batch of requests is held roughly constant, as shown in the figure below:

Commonly used rate-limiting algorithms include:

Algorithm 1: semaphore. Maintain a maximum concurrency (for example a connection count) and fail or wait once concurrency reaches the threshold; a thread pool is a typical example.

Algorithm 2: leaky bucket. Simulate a bucket that leaks at a constant rate; incoming requests that exceed the bucket's capacity overflow and are discarded.

Algorithm 3: token bucket. Tokens are issued into the bucket at a fixed rate; a request first takes a token from the bucket, and only requests holding a token are served.
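As a point of reference before diving into RateLimiter, here is a minimal, hypothetical token-bucket sketch. It is for illustration only; the class and field names are invented and this is not the Guava implementation:

  // A minimal token-bucket sketch for illustration; not the Guava implementation.
  class SimpleTokenBucket {
    private final double capacity;        // maximum tokens the bucket can hold
    private final double refillPerSecond; // token issue rate
    private double tokens;                // current token count
    private long lastRefillNanos = System.nanoTime();

    SimpleTokenBucket(double capacity, double refillPerSecond) {
      this.capacity = capacity;
      this.refillPerSecond = refillPerSecond;
      this.tokens = capacity;
    }

    synchronized boolean tryAcquire(int permits) {
      long now = System.nanoTime();
      // refill tokens in proportion to elapsed time, capped at capacity
      tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillPerSecond);
      lastRefillNanos = now;
      if (tokens >= permits) {
        tokens -= permits;
        return true;   // serve the request
      }
      return false;    // reject (or queue) the request
    }
  }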

The RateLimiter introduced in this article uses the token bucket algorithm. RateLimiter is a lightweight rate-limiting component in Google's Guava library. It consists mainly of two Java source files, RateLimiter.java and SmoothRateLimiter.java, which together contain 301 lines of Java code and 420 lines of comments; the comments outnumber the code and are written in great detail. Much of the following introduction is translated from those comments, and some points are stated more precisely in the original English, so interested readers may want to read this article alongside the original comments for a fuller understanding.

Introduction

To use RateLimiter you only need to depend on the guava jar. The latest version is 31.1-jre, which is also the version whose source code is analyzed in this article.

            <dependency>
                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
                <version>31.1-jre</version>
            </dependency>

Two intuitive usage examples are provided in the source code.

Example 1: a list of tasks is submitted for execution, with the submission rate limited to at most 2 per second.

 final RateLimiter rateLimiter = RateLimiter.create(2.0); // create a RateLimiter that issues 2 permits per second
 void submitTasks(List<Runnable> tasks, Executor executor) {
   for (Runnable task : tasks) {
     rateLimiter.acquire(); // may block and wait here
     executor.execute(task);
   }
 }

Example 2: produce a data stream at a rate of no more than 5 KB per second.

 final RateLimiter rateLimiter = RateLimiter.create(5000.0); // create a RateLimiter that issues 5000 permits per second
 void submitPacket(byte[] packet) {
   rateLimiter.acquire(packet.length);
   networkService.send(packet);
 }

As you can see, RateLimiter is very easy to use: you only construct the rate limiter and call its acquire method; unlike a semaphore, permits never need to be released.

Algorithm Introduction

Before the introduction, let's define a few terms used in RateLimiter:

Permit: represents a token; a request that obtains a permit is allowed to proceed.

Underutilization: permits are issued at a uniform rate, but requests may not arrive uniformly; sometimes there are no requests at all (the resource is underutilized), so the token bucket has a storage mechanism.

Stored permits (storedPermits): permits accumulated while the resource is idle; when a permit is requested, stored permits are consumed first.

Fresh permits (freshPermits): when the stored permits are exhausted, fresh permits are issued in an overdraft fashion: the current request returns immediately, and the end time of this fresh permit becomes the earliest time at which the next permit can be granted.

The figure below gives an example of permit issuance: the rectangle represents the whole token bucket, permits are generated at 1 per second, and the storage bucket inside the token bucket has a capacity of 2.

In this example, the stored permit count at T1 is 0, so a permit request returns 1 fresh permit directly; as time passes the stored count grows to its maximum of 2. When a request for 3 permits arrives at T2, 2 permits are taken from the storage bucket and 1 fresh permit is generated. Another permit request arrives 0.5 s later at T3; since the next permit will only become available after 0.5 s, the request sleeps for 0.5 s before the permit is issued.

The core function of RateLimiter is rate limiting. The first scheme that comes to mind is to remember the time of the last issued permit: when the next permit is requested, if the interval since the last issue is less than 1/QPS, sleep until 1/QPS has elapsed; otherwise issue immediately. However, this approach cannot perceive underutilization. On the one hand, if a permit is requested after a long idle period, the system is probably idle and more permits could be issued to make full use of resources; on the other hand, a long idle period may also mean that the resource handling the requests has gone cold (for example a cache has expired) and processing efficiency will drop.

RateLimiter therefore adds management of underutilization, which appears in the code as stored permits (storedPermits). The stored permit count starts at 0 and grows over time up to the maximum number of stored permits. When permits are acquired, stored permits are consumed first, and fresh permits are then issued based on the time at which the next fresh permit becomes available. Note that RateLimiter remembers the time of the next token issuance, which works like an overdraft: the current acquisition returns immediately, and its cost is recorded against the time at which the next permit can be obtained.
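A small demo of this pay-later behavior; the printed values are approximate and assume no other threads use the limiter:

  // Sketch: the first acquire returns almost immediately; later calls pay for the earlier overdraft.
  RateLimiter limiter = RateLimiter.create(1.0);   // 1 permit per second
  System.out.println(limiter.acquire(5));          // ~0.0  granted now, next free ticket pushed ~5 s out
  System.out.println(limiter.acquire(1));          // ~5.0  waits out the overdraft of the previous call
  System.out.println(limiter.acquire(1));          // ~1.0  back to the stable pace of 1 permit per second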

Code structure and main process

Code structure

The overall class diagram is as follows:

RateLimiter class

The RateLimiter class is the top-level class and the only class exposed to users. It provides factory methods for creating rate limiters: create(double permitsPerSecond) creates a bursty rate limiter, and create(double permitsPerSecond, Duration warmupPeriod) creates a warming-up rate limiter. It also provides the acquire method for obtaining permits and the tryAcquire method for trying to obtain permits. Internally, the class holds a SleepingStopwatch used for sleeping, plus a mutexDoNotUseDirectly field and the mutex() method used for mutual exclusion and locking.
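For example, the two factory methods can be used as follows (the warming-up variant also has a create(double, long, TimeUnit) overload):

  import java.time.Duration;
  import com.google.common.util.concurrent.RateLimiter;

  // a bursty limiter: 10 permits per second, stored permits usable immediately
  RateLimiter bursty = RateLimiter.create(10.0);
  // a warming-up limiter: 10 permits per second, ramping up to full speed over a 3-second warm-up
  RateLimiter warmingUp = RateLimiter.create(10.0, Duration.ofSeconds(3));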

SmoothRateLimiter class

This abstract class inherits from RateLimiter and represents a smooth rate limiter, i.e. one whose limiting rate is smooth. maxPermits and storedPermits hold the maximum and current number of stored permits; stableIntervalMicros is the configured stable interval between permit issues; nextFreeTicketMicros is the time at which the next permit becomes free.

SmoothBursty class

The smooth bursty rate limiter. This class inherits from SmoothRateLimiter; it issues permits at the configured stableIntervalMicros and has a member variable maxBurstSeconds, which is the maximum length of time worth of permits the bucket may store.

SmoothWarmingUp class

The smooth warming-up rate limiter also inherits from SmoothRateLimiter and sits at the same level as SmoothBursty. Its warm-up algorithm takes some effort to understand.

Main process

The main flow of acquiring a permit is as follows:

The main flow computes and updates the stored permit count and the fresh permit count, and derives the waiting time for the current permit request. The SmoothBursty and SmoothWarmingUp algorithms share this main flow; the key difference is their strategy for managing stored permits, and the two strategies are implemented in the two subclasses. SmoothBursty is relatively simple, so it is introduced first, followed by SmoothWarmingUp.

SmoothBursty algorithm

Rate limiter creation

Rate limiters are created through the factory pattern; the source code is as follows:

  public static RateLimiter create(double permitsPerSecond) {
    // permitsPerSecond is the number of permits allowed per second. This method delegates to the one below.
    return create(permitsPerSecond, SleepingStopwatch.createFromSystemTimer());
  }
  // create a SmoothBursty (which stores at most 1 second's worth of permits), then set the rate
  static RateLimiter create(double permitsPerSecond, SleepingStopwatch stopwatch) {
    RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);
    rateLimiter.setRate(permitsPerSecond);
    return rateLimiter;
  }

1. The SmoothBursty constructor is relatively simple:

    SmoothBursty(SleepingStopwatch stopwatch, double maxBurstSeconds) {
      super(stopwatch);
      this.maxBurstSeconds = maxBurstSeconds;
    }

2. rateLimiter.setRate is defined in the parent class RateLimiter:

  public final void setRate(double permitsPerSecond) {
    checkArgument(
        permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
    synchronized (mutex()) {
      doSetRate(permitsPerSecond, stopwatch.readMicros());
    }
  }

This method synchronizes on mutex() to make concurrent calls thread-safe, then calls the subclass's doSetRate method. The value passed as the second parameter, nowMicros, comes from the stopwatch: the creation time of the rate limiter is defined as 0, and the elapsed time since creation is measured from there, so relative time is used throughout.

2.1 The mutex method is implemented as follows:

  // Can't be initialized in the constructor because mocks don't call the constructor.
  // As the comment above notes, lazy initialization is used only because mocks skip the constructor; eager initialization would otherwise be simpler.
  @CheckForNull private volatile Object mutexDoNotUseDirectly;
  // lazy initialization with double-checked locking
  private Object mutex() {
    Object mutex = mutexDoNotUseDirectly;
    if (mutex == null) {
      synchronized (this) {
        mutex = mutexDoNotUseDirectly;
        if (mutex == null) {
          mutexDoNotUseDirectly = mutex = new Object();
        }
      }
    }
    return mutex;
  }

This method uses double-checked locking to lazily initialize the lock object mutexDoNotUseDirectly; the field is declared volatile, and the local variable mutex keeps the fast path to a single volatile read.

2.2 The main body of the doSetRate method is implemented in the SmoothRateLimiter class:

  final void doSetRate(double permitsPerSecond, long nowMicros) {
    // sync stored permits and the next-ticket time to "now"
    resync(nowMicros);
    double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
    this.stableIntervalMicros = stableIntervalMicros;
    doSetRate(permitsPerSecond, stableIntervalMicros);
  }

This method is called when the rate limiter is created, and also whenever setRate is called afterwards to reset the rate.

2.2.1 The resync method refreshes storedPermits and nextFreeTicketMicros based on the current time.

  /** Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time. */
  void resync(long nowMicros) {
    // if nextFreeTicket is in the past, resync to now
    if (nowMicros > nextFreeTicketMicros) {
      double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
      storedPermits = min(maxPermits, storedPermits + newPermits);
      nextFreeTicketMicros = nowMicros;
    }
  }

Conceptually, this method models stored permits accumulating continuously over time; in the implementation, however, the refresh is not continuous but happens lazily, only when needed. If the current time is not later than the next-permit time, neither storedPermits nor nextFreeTicketMicros needs refreshing; otherwise the new stored permit count is the existing count plus (current time - next-permit time) / stored-permit issue interval, capped at the maximum stored permit count. Note that the stored permit count is a double.
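As a worked example, assume a SmoothBursty limiter at 2 permits per second (so coolDownIntervalMicros() = stableIntervalMicros = 500,000 µs) with maxPermits = 2; the numbers below are purely illustrative:

  // hypothetical values:
  //   nowMicros                = 3_000_000   (3 s after creation)
  //   nextFreeTicketMicros     = 1_200_000   (the last grant was fully paid for at 1.2 s)
  //   coolDownIntervalMicros() =   500_000   (2 permits per second)
  //
  // newPermits    = (3_000_000 - 1_200_000) / 500_000 = 3.6
  // storedPermits = min(maxPermits = 2.0, storedPermits + 3.6) = 2.0   (capped at the bucket size)
  // nextFreeTicketMicros = 3_000_000                                   (resynced to "now")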

Using the rate limiter

The commonly used methods of the rate limiter are acquire and tryAcquire.

Let's look at acquire first. There are two common overloads: one takes no parameters and acquires one permit per call, the other takes an integer parameter and acquires multiple permits per call.

  // acquire 1 permit
  public double acquire() {
    return acquire(1);
  }
 
  // acquire multiple permits
  public double acquire(int permits) {
    // reserve 'permits' permits and get the number of microseconds to sleep
    long microsToWait = reserve(permits);
    // this call returns immediately if the value is <= 0, otherwise it sleeps
    stopwatch.sleepMicrosUninterruptibly(microsToWait);
    // return the number of seconds slept
    return 1.0 * microsToWait / SECONDS.toMicros(1L);
  }

As the source above shows, the logic of acquiring permits is simple: reserve the requested permits, and based on the returned value decide whether to sleep and wait. The reservation method is implemented as follows:

 // reserve 'permits' permits
  final long reserve(int permits) {
    checkPermits(permits);
    synchronized (mutex()) {
      return reserveAndGetWaitLength(permits, stopwatch.readMicros());
    }
  }
   
  // reserve 'permits' permits and compute how long the caller must wait
  final long reserveAndGetWaitLength(int permits, long nowMicros) {
    long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
    return max(momentAvailable - nowMicros, 0);
  }
  abstract long reserveEarliestAvailable(int permits, long nowMicros);

reserveEarliestAvailable is an abstract method implemented in SmoothRateLimiter, and it is the core method on the main path. It first draws from the stored permits and returns directly if they suffice; otherwise it takes all remaining stored permits and then computes the additional waiting time required. The logic is as follows:

  final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
    // refresh the stored permits and the next-ticket time
    resync(nowMicros);
    // the return value is the current "next free" time
    long returnValue = nextFreeTicketMicros;
    // stored permits to spend, capped at what is available
    double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
    // fresh permits = required permits - stored permits spent
    double freshPermits = requiredPermits - storedPermitsToSpend;
    // wait time = stored-permit wait time (decided by the subclass) + fresh-permit wait time (count * stable interval)
    long waitMicros =
        storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)
            + (long) (freshPermits * stableIntervalMicros);
    // next available time after the overdraft = nextFreeTicketMicros + waitMicros
    this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
    // decrease the stored permit count
    this.storedPermits -= storedPermitsToSpend;
    return returnValue;
  }

Two points about this method. First, returnValue is the previously computed next free time; as mentioned earlier, RateLimiter uses a pay-later model, so the current call returns based on that earlier time while the cost of this call is pushed onto the earliest free time of the next call. Second, the waiting time attributed to stored permits depends on the subclass: the SmoothBursty algorithm treats stored permits as immediately usable and returns 0, while the SmoothWarmingUp algorithm described later charges stored permits more than the stable rate to model warm-up, using its own algorithm.
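A worked trace under assumed numbers: a SmoothBursty limiter at 2 permits per second (stableIntervalMicros = 500,000, storedPermitsToWaitTime always 0), with storedPermits = 1.0 and nextFreeTicketMicros equal to nowMicros when a request for 3 permits arrives:

  // returnValue          = nextFreeTicketMicros          -> the caller waits 0 and is released immediately
  // storedPermitsToSpend = min(3, 1.0)        = 1.0
  // freshPermits         = 3 - 1.0            = 2.0
  // waitMicros           = 0 + 2.0 * 500_000  = 1_000_000
  // nextFreeTicketMicros += 1_000_000                     -> the NEXT request pays this 1 s overdraft
  // storedPermits        = 0.0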

This completes the walk through the acquire call chain; the tryAcquire method is comparatively simple. The difference is that tryAcquire first checks whether (next permit time - timeout) is later than the current time: if so it returns false immediately; otherwise it reserves the permits, sleeps if necessary, and returns true. The source is as follows:

  public boolean tryAcquire(Duration timeout) {
    return tryAcquire(1, toNanosSaturated(timeout), TimeUnit.NANOSECONDS);
  }

  public boolean tryAcquire(long timeout, TimeUnit unit) {
    return tryAcquire(1, timeout, unit);
  }

  public boolean tryAcquire(int permits) {
    return tryAcquire(permits, 0, MICROSECONDS);
  }

  public boolean tryAcquire() {
    return tryAcquire(1, 0, MICROSECONDS);
  }

  public boolean tryAcquire(int permits, Duration timeout) {
    return tryAcquire(permits, toNanosSaturated(timeout), TimeUnit.NANOSECONDS);
  }

  public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {
    long timeoutMicros = max(unit.toMicros(timeout), 0);
    checkPermits(permits);
    long microsToWait;
    synchronized (mutex()) {
      long nowMicros = stopwatch.readMicros();
      // check whether the timeout is long enough to reach the next permit time
      if (!canAcquire(nowMicros, timeoutMicros)) {
        return false;
      } else {
        microsToWait = reserveAndGetWaitLength(permits, nowMicros);
      }
    }
    // sleep and wait
    stopwatch.sleepMicrosUninterruptibly(microsToWait);
    return true;
  }
  
  // next permit time - timeout <= current time
  private boolean canAcquire(long nowMicros, long timeoutMicros) {
    return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;
  }
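A sketch of typical tryAcquire usage; handleRequest and respondTooManyRequests are hypothetical application methods, not part of Guava:

  // fail fast instead of blocking when no permit becomes available within the timeout
  if (rateLimiter.tryAcquire(1, Duration.ofMillis(100))) {
    handleRequest();              // obtained a permit within 100 ms
  } else {
    respondTooManyRequests();     // e.g. return HTTP 429 to the caller
  }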

SmoothWarmingUp algorithm

The main processing flow of the SmoothWarmingUp algorithm is the same as that of the SmoothBursty algorithm; only the two methods dealing with stored-permit timing are reimplemented. This algorithm is not as intuitive as SmoothBursty, so it is best to understand its logic first and then read the source code.

Algorithm description

The algorithm is described clearly in the source comments. The main idea is that the limiter starts with storedPermits at its maximum value, and as stored permits are consumed the issue rate ramps from slow to fast according to a defined curve until it reaches the configured stable rate, which models the warm-up process. The algorithm involves a little mathematics; if you are not particularly interested, it is enough to grasp the main idea. The algorithm is described in detail below.

Before the algorithm itself, look back at the stored permits of SmoothRateLimiter. Stored permits have a current count and a maximum count, and there are two pieces of rate logic: the rate at which stored permits are produced and the rate at which they are consumed. In the Bursty algorithm the production rate equals the configured stable rate, while consumption is instantaneous (it takes no time); in the WarmingUp algorithm the behavior follows the figure below:

The figure can be read as follows: the time consumed by each stored permit is the area of a right trapezoid, where trapezoid area = (top side + bottom side) / 2 * height. The area of each successive stored permit gets smaller and smaller, until it becomes the rectangular area corresponding to the constant stable rate.

When the rate limiter is initialized, the inputs are the stable rate and the warm-up period, and the cold factor is fixed at 3. The algorithm first computes thresholdPermits = 0.5 * warmupPeriod / stableInterval. It then computes the maximum permit count: knowing the trapezoid's area (the warm-up period), its top side (the cold interval) and its bottom side (the stable interval), we can derive its height, and maxPermits = thresholdPermits + that height, i.e. maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval).

void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
      double oldMaxPermits = maxPermits;
      double coldIntervalMicros = stableIntervalMicros * coldFactor;
      thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
      maxPermits =
          thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
      slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
      if (oldMaxPermits == Double.POSITIVE_INFINITY) {
        // if we don't special-case this, we would get storedPermits == NaN, below
        storedPermits = 0.0;
      } else {
        storedPermits =
            (oldMaxPermits == 0.0)
                ? maxPermits // initial state is cold
                : storedPermits * maxPermits / oldMaxPermits;
      }
    }
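Plugging concrete numbers into this initialization, say a limiter created with a rate of 5 permits per second and a 3-second warm-up period (coldFactor fixed at 3):

  // stableIntervalMicros = 1_000_000 / 5                             = 200_000
  // coldIntervalMicros   = 200_000 * 3                               = 600_000
  // thresholdPermits     = 0.5 * 3_000_000 / 200_000                 = 7.5
  // maxPermits           = 7.5 + 2 * 3_000_000 / (200_000 + 600_000) = 15.0
  // slope                = (600_000 - 200_000) / (15.0 - 7.5)        ≈ 53_333 µs per permit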

In concrete use there are two pieces. One is the production rate of stored permits, fixed at warm-up period / maximum permit count; the source is as follows:

  double coolDownIntervalMicros() {
      return warmupPeriodMicros / maxPermits;
    }

The other is the consumption cost. Following the curve above, the area taken from right to left = trapezoid area + rectangle area, where trapezoid area = (top + bottom) / 2 * height; the source is as follows:

    long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
      double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
      long micros = 0;
      // measuring the integral on the right part of the function (the climbing line)
      if (availablePermitsAboveThreshold > 0.0) {
        double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
        // TODO(cpovirk): Figure out a good name for this variable.
        double length =
            permitsToTime(availablePermitsAboveThreshold)
                + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
        micros = (long) (permitsAboveThresholdToTake * length / 2.0);
        permitsToTake -= permitsAboveThresholdToTake;
      }
      // measuring the integral on the left part of the function (the horizontal line)
      micros += (long) (stableIntervalMicros * permitsToTake);
      return micros;
    }
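Continuing the numbers above (thresholdPermits = 7.5, maxPermits = 15, stableIntervalMicros = 200,000, slope ≈ 53,333), suppose the limiter is fully cold (storedPermits = 15) and 3 permits are taken:

  // availablePermitsAboveThreshold = 15 - 7.5                = 7.5
  // permitsAboveThresholdToTake    = min(7.5, 3)             = 3
  // permitsToTime(7.5)             = 200_000 + 7.5 * 53_333  ≈ 600_000
  // permitsToTime(4.5)             = 200_000 + 4.5 * 53_333  ≈ 440_000
  // micros = 3 * (600_000 + 440_000) / 2                     = 1_560_000   (≈ 1.56 s,
  //          versus 3 * 200_000 = 600_000 µs at the stable rate once warmed up)
  // permitsToTake is now 0, so the horizontal part adds nothing.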

Source code analysis

With the algorithm above understood, the source code below is relatively straightforward.

  static final class SmoothWarmingUp extends SmoothRateLimiter {
    // warm-up period
    private final long warmupPeriodMicros;
    // slope of the permit-cost line
    private double slope;
    // threshold permit count
    private double thresholdPermits;
    // cold factor
    private double coldFactor;

    SmoothWarmingUp(
        SleepingStopwatch stopwatch, long warmupPeriod, TimeUnit timeUnit, double coldFactor) {
      super(stopwatch);
      this.warmupPeriodMicros = timeUnit.toMicros(warmupPeriod);
      this.coldFactor = coldFactor;
    }
    
    // parameter initialization
    @Override
    void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
      double oldMaxPermits = maxPermits;
      double coldIntervalMicros = stableIntervalMicros * coldFactor;
      thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
      maxPermits =
          thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
      slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
      if (oldMaxPermits == Double.POSITIVE_INFINITY) {
        // if we don't special-case this, we would get storedPermits == NaN, below
        storedPermits = 0.0;
      } else {
        storedPermits =
            (oldMaxPermits == 0.0)
                ? maxPermits // initial state is cold
                : storedPermits * maxPermits / oldMaxPermits;
      }
    }

    // compute the wait time when there are storedPermits stored permits and permitsToTake of them are used
    @Override
    long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
      double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
      long micros = 0;
      // measuring the integral on the right part of the function (the climbing line)
      if (availablePermitsAboveThreshold > 0.0) {
        double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
        // TODO(cpovirk): Figure out a good name for this variable.
        double length =
            permitsToTime(availablePermitsAboveThreshold)
                + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
        micros = (long) (permitsAboveThresholdToTake * length / 2.0);
        permitsToTake -= permitsAboveThresholdToTake;
      }
      // measuring the integral on the left part of the function (the horizontal line)
      micros += (long) (stableIntervalMicros * permitsToTake);
      return micros;
    }
    // time per permit = stable interval + permit count * slope
    private double permitsToTime(double permits) {
      return stableIntervalMicros + permits * slope;
    }
    // the cool-down interval is fixed at warm-up period / maximum permits
    @Override
    double coolDownIntervalMicros() {
      return warmupPeriodMicros / maxPermits;
    }
  }

Closing thoughts

Sleep behavior and relative time

RateLimiter internally uses a Stopwatch (wrapped in SleepingStopwatch) to measure relative time: when the RateLimiter is created the time is defined as 0, and it accumulates from there. The sleeps are uninterruptible, i.e. they are not cut short by thread interrupts.

Double-precision permit counts

The permit-count parameters in RateLimiter's public API are integers, but internally the calculations use double-precision floating point, which supports fractional permit counts. On the one hand floating point loses precision, and on the other hand it is harder to reason about; whether integer arithmetic could be used instead is worth considering.

Single-node only

RateLimiter's algorithms only support single-node rate limiting. To support cluster-wide limiting, one approach is to derive each node's limit from the cluster limit according to its load-balancing weight and then apply single-node limiting; another is to follow the ideas in this component but manage the counters centrally in middleware such as Redis, at some cost in performance and stability.
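A minimal sketch of the first approach; the cluster budget and weight values are assumptions for illustration:

  // derive each node's local limit from a cluster-wide budget and its load-balancing weight
  double clusterPermitsPerSecond = 1000.0;   // cluster-wide budget (assumed)
  double thisNodeWeight = 0.25;              // this node receives ~25% of the traffic (assumed)
  RateLimiter nodeLimiter = RateLimiter.create(clusterPermitsPerSecond * thisNodeWeight);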

Extensibility

RateLimiter offers limited extension points. The built-in SmoothBursty and SmoothWarmingUp classes are not public, so they cannot be instantiated directly or have their parameters tuned, for example disabling permit storage or adjusting the warm-up cold factor. Such scenarios require subclassing SmoothRateLimiter and reimplementing the stored-permit production and consumption algorithms, which are designed to be easy to replace; copying the whole source for secondary modification is also an option.



Source: my.oschina.net/u/4090830/blog/8785196