3. Sentinel source code analysis: how is QPS flow control implemented?

I finally got this source-code analysis written this week; no matter how busy the week gets, I try not to break the streak.

Previous posts in the Sentinel source code analysis series:

1. Sentinel source code analysis: what does FlowRuleManager do when loading rules?

2. Sentinel source code analysis: how does Sentinel collect traffic statistics?


Last time we walked through Sentinel's whole flow-control process using the concurrent-thread-count mode. In this article we look at how Sentinel implements QPS-based flow control.

First, a minimal demo; our code walk starts from it:

public static void main(String[] args) {
    List<FlowRule> rules = new ArrayList<FlowRule>();
    FlowRule rule1 = new FlowRule();
    rule1.setResource("abc"); 
    rule1.setCount(20);
    rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
    rule1.setLimitApp("default");
    rules.add(rule1);
    FlowRuleManager.loadRules(rules);

    Entry entry = null;

    try {
        entry = SphU.entry("abc");
        // do something
    } catch (BlockException e1) {

    } catch (Exception e2) {
        // biz exception
    } finally {
        if (entry != null) {
            entry.exit();
        }
    }
}

In this example we first create a FlowRule instance and then call FlowRuleManager.loadRules to load the rule. This part of the code is the same as in the concurrency-based flow control; if you want the details, see the earlier article "1. Sentinel source code analysis: what does FlowRuleManager do when loading rules?". Here we focus on what is different.

When loadRules is called, FlowRuleManager creates a rater instance for each rule:

FlowRuleUtil#buildFlowRuleMap

//Set the shaping behavior: fast-reject, Warm Up, or uniform queueing; the default is DefaultController
TrafficShapingController rater = generateRater(rule);
rule.setRater(rater);

Let's step into generateRater to see how the instance is created:

FlowRuleUtil#generateRater

private static TrafficShapingController generateRater(/*@Valid*/ FlowRule rule) {
    if (rule.getGrade() == RuleConstant.FLOW_GRADE_QPS) {
        switch (rule.getControlBehavior()) {
            case RuleConstant.CONTROL_BEHAVIOR_WARM_UP:
                //warmUpPeriodSec defaults to 10
                return new WarmUpController(rule.getCount(), rule.getWarmUpPeriodSec(),
                    ColdFactorProperty.coldFactor);
            case RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER:
                //rule.getMaxQueueingTimeMs() defaults to 500
                return new RateLimiterController(rule.getMaxQueueingTimeMs(), rule.getCount());
            case RuleConstant.CONTROL_BEHAVIOR_WARM_UP_RATE_LIMITER:
                return new WarmUpRateLimiterController(rule.getCount(), rule.getWarmUpPeriodSec(),
                    rule.getMaxQueueingTimeMs(), ColdFactorProperty.coldFactor);
            case RuleConstant.CONTROL_BEHAVIOR_DEFAULT:
            default:
                // Default mode or unknown mode: default traffic shaping controller (fast-reject).
        }
    }
    return new DefaultController(rule.getCount(), rule.getGrade());
}

This method shows that when the grade is QPS, a ControlBehavior property can be set to choose the flow-control behavior: fast-reject, Warm Up, or uniform queueing.

All the subsequent rate-limiting work happens in FlowSlot. If you are not familiar with Sentinel's overall processing chain, see "2. Sentinel source code analysis: how does Sentinel collect traffic statistics?", which covers the whole pipeline. This article only looks at the parts of FlowSlot that differ.

Next, let's look at how QPS rate limiting is implemented inside FlowSlot.

FlowSlot#entry

public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count,
                  boolean prioritized, Object... args) throws Throwable {
    checkFlow(resourceWrapper, context, node, count, prioritized);

    fireEntry(context, resourceWrapper, node, count, prioritized, args);
}

void checkFlow(ResourceWrapper resource, Context context, DefaultNode node, int count, boolean prioritized)
    throws BlockException {
    checker.checkFlow(ruleProvider, resource, context, node, count, prioritized);
}

When FlowSlot is instantiated, it creates a FlowRuleChecker instance as its checker. The checkFlow method above delegates to FlowRuleChecker's checkFlow method, and the ruleProvider argument is used to look up the FlowRules registered for the given resource.

Let's step into FlowRuleChecker's checkFlow method:

FlowRuleChecker#checkFlow

public void checkFlow(Function<String, Collection<FlowRule>> ruleProvider, ResourceWrapper resource,
                      Context context, DefaultNode node, int count, boolean prioritized) throws BlockException {
    if (ruleProvider == null || resource == null) {
        return;
    }
    //Return all the rules registered in FlowRuleManager for this resource
    Collection<FlowRule> rules = ruleProvider.apply(resource.getName());
    if (rules != null) {
        for (FlowRule rule : rules) {
            //If the current request cannot pass, throw a FlowException
            if (!canPassCheck(rule, context, node, count, prioritized)) {
                throw new FlowException(rule.getLimitApp(), rule);
            }
        }
    }
}

Here the ruleProvider is called to fetch all the FlowRules for the resource; the rules are then traversed and each is checked with canPassCheck. If a rule is not satisfied, a FlowException is thrown.

canPassCheck delegates to passLocalCheck in the common local (non-cluster) case, so let's go straight to it:

private static boolean passLocalCheck(FlowRule rule, Context context, DefaultNode node, int acquireCount,
                                      boolean prioritized) {
    //Select the node to measure against
    Node selectedNode = selectNodeByRequesterAndStrategy(rule, context, node);
    if (selectedNode == null) {
        return true;
    }
    //Check against the configured rule's rater
    return rule.getRater().canPass(selectedNode, acquireCount, prioritized);
}

This method first selects the appropriate node, then calls the rater's canPass method to decide whether to block the request.

There are four raters: DefaultController, RateLimiterController, WarmUpController, and WarmUpRateLimiterController. Let's analyze them one by one.

DefaultController implements the fast-reject strategy, which we analyzed in the previous article, so let's look at the other three.

RateLimiterController: uniform queueing

Its central idea is to let requests pass at a fixed time interval, i.e. one at a time at an even pace. When a request arrives, if the time elapsed since the last passed request is no less than the configured interval, the current request passes; otherwise the request's expected pass time is calculated. If the expected wait is within the configured timeout, the request queues and waits until its turn; if the expected wait exceeds the maximum queueing time, the request is rejected outright.

This approach suits bursty traffic: when a spike of requests arrives, we may not want to let them all through at once, which could overwhelm the system; instead we want the system to work through them gradually at a steady rate, acting as "traffic shaping" (load leveling), rather than rejecting everything above the threshold.

To use this strategy, set rule1.setControlBehavior(RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER) when building the FlowRule instance.

When the rater is instantiated, FlowRuleUtil#generateRater creates the instance with:

new RateLimiterController(rule.getMaxQueueingTimeMs(), rule.getCount());

maxQueueingTimeMs defaults to 500, and count is 20 in our example.

Let's look at how the canPass method implements the rate limiting:

public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    // Pass when acquire count is less or equal than 0.
    if (acquireCount <= 0) {
        return true;
    }
    // Reject when count is less or equal than 0.
    // Otherwise, the costTime will be max of long and waitTime will overflow in some cases.
    if (count <= 0) {
        return false;
    }

    long currentTime = TimeUtil.currentTimeMillis();
    // Calculate the interval between every two requests,
    // i.e. spread the requests evenly across one second.
    long costTime = Math.round(1.0 * (acquireCount) / count * 1000);

    //latestPassedTime is the time the last request passed
    // Expected pass time of this request.
    long expectedTime = costTime + latestPassedTime.get();
    //If the expected pass time (last pass time + interval) is not after the current time, pass
    if (expectedTime <= currentTime) {
        // Contention may exist here, but it's okay.
        latestPassedTime.set(currentTime);
        return true;
    } else {
        // Calculate how long this request would have to wait (bounded by maxQueueingTimeMs)
        long waitTime = costTime + latestPassedTime.get() - TimeUtil.currentTimeMillis();

        //If the wait would exceed maxQueueingTimeMs, reject
        if (waitTime > maxQueueingTimeMs) {
            return false;
        } else {
            //Atomically add this request's interval to the last pass time
            long oldTime = latestPassedTime.addAndGet(costTime);
            try {
                waitTime = oldTime - TimeUtil.currentTimeMillis();
                //Check again whether the wait exceeds maxQueueingTimeMs
                if (waitTime > maxQueueingTimeMs) {
                    //If so, roll back the last pass time and reject
                    latestPassedTime.addAndGet(-costTime);
                    return false;
                }
                //If there is still time to wait, sleep until this request's slot
                // in race condition waitTime may <= 0
                if (waitTime > 0) {
                    Thread.sleep(waitTime);
                }
                return true;
            } catch (InterruptedException e) {
            }
        }
    }
    return false;
}

This method first computes costTime, the interval that spreads requests evenly across one second. For example, with count set to 10, ten requests pass per second uniformly, i.e. the average interval between requests is 1000 / 10 = 100 ms.

But there is a small bug here: if count is set very large, say 10000, then costTime rounds down to 0, and interval-based QPS limiting effectively stops working.
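
A quick standalone check (class and method names are my own, not from the Sentinel source) makes the rounding behavior concrete:

```java
public class CostTimeRounding {
    // Mirrors the interval formula in RateLimiterController#canPass:
    // costTime = round(acquireCount / count * 1000) milliseconds
    static long costTime(int acquireCount, double count) {
        return Math.round(1.0 * acquireCount / count * 1000);
    }

    public static void main(String[] args) {
        System.out.println(costTime(1, 20));    // 50 ms between requests
        System.out.println(costTime(1, 1000));  // 1 ms
        System.out.println(costTime(1, 10000)); // 0 ms: the interval collapses
    }
}
```

With count = 10000, expectedTime equals latestPassedTime, so the pacing check essentially always passes.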

costTime is then added to the last pass time to get the expected pass time. If that is later than the current time, the request is arriving too fast: costTime is added atomically to latestPassedTime, and the thread sleeps until its slot before passing.

There is a subtle detail here: if several requests arrive at the same time, each thread's addAndGet atomically adds costTime to latestPassedTime in sequence, and the returned oldTime is different for each thread. So each thread sleeps for a different duration, and they do not all wake up at the same moment.
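
A small simulation (my own sketch, not Sentinel code) shows that concurrent addAndGet calls hand each thread a distinct reserved pass time:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

public class DistinctSlots {
    // Simulates n threads reserving pass times the way
    // RateLimiterController does with latestPassedTime.addAndGet(costTime).
    static int distinctSlots(int n, long costTime) throws InterruptedException {
        AtomicLong latestPassedTime = new AtomicLong(0);
        Set<Long> slots = new ConcurrentSkipListSet<>();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            // Each thread atomically reserves its own pass time
            threads[i] = new Thread(() -> slots.add(latestPassedTime.addAndGet(costTime)));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return slots.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Ten concurrent arrivals with a 50 ms interval (count = 20):
        // each thread gets a distinct reserved time 50, 100, ..., 500.
        System.out.println(distinctSlots(10, 50)); // 10
    }
}
```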

WarmUpController: cold-start rate limiting

When a system has been running at a low load for a long time and traffic suddenly surges, pulling it straight up to the high watermark can overwhelm it instantly. With a "cold start" (warm-up), the allowed rate increases slowly, gradually rising to the configured ceiling over a set period, giving the cold system time to warm up instead of being crushed.

//Defaults to 3
private int coldFactor;
//Token count at the warning (inflection) point
protected int warningToken = 0;
//Maximum token count
private int maxToken;
//Slope of the warm-up line
protected double slope;
//Accumulated token count
protected AtomicLong storedTokens = new AtomicLong(0);
//Time tokens were last refilled
protected AtomicLong lastFilledTime = new AtomicLong(0);

public WarmUpController(double count, int warmUpPeriodInSec, int coldFactor) {
    construct(count, warmUpPeriodInSec, coldFactor);
}

private void construct(double count, int warmUpPeriodInSec, int coldFactor) {

    if (coldFactor <= 1) {
        throw new IllegalArgumentException("Cold factor should be larger than 1");
    }

    this.count = count;
    //Defaults to 3
    this.coldFactor = coldFactor;

    // thresholdPermits = 0.5 * warmupPeriod / stableInterval.
    // 10*20/2 = 100
    // warningToken = 100;
    warningToken = (int) (warmUpPeriodInSec * count) / (coldFactor - 1);
    // / maxPermits = thresholdPermits + 2 * warmupPeriod /
    // (stableInterval + coldInterval)
    // maxToken = 200
    maxToken = warningToken + (int) (2 * warmUpPeriodInSec * count / (1.0 + coldFactor));

    // slope
    // slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits
    // - thresholdPermits);
    slope = (coldFactor - 1.0) / count / (maxToken - warningToken);
}

Here is a diagram to illustrate:

The X axis represents the number of storedPermits, and the Y axis represents the time required to acquire a permit.

Suppose permitsPerSecond is 10; then stableInterval is 100 ms, and coldInterval is three times that (coldFactor is 3), i.e. 300 ms. That is, when storedPermits reaches maxPermits, the system is at its coldest and acquiring a permit takes 300 ms; when storedPermits is below thresholdPermits, it takes only 100 ms.

Acquiring "cold" permits thus requires a longer wait, which throttles bursts and achieves the goal of warming the system up.

Mapping this to our code: maxToken corresponds to maxPermits in the figure, warningToken corresponds to thresholdPermits, and slope represents how fast the acquisition cost falls per permit.
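
Plugging the demo rule's values (count = 20, warmUpPeriodInSec = 10, coldFactor = 3) into the same formulas reproduces the numbers in the comments above; this is a standalone sketch, not Sentinel code:

```java
public class WarmUpParams {
    // Same formulas as WarmUpController#construct
    static int warningToken(double count, int warmUpPeriodInSec, int coldFactor) {
        return (int) (warmUpPeriodInSec * count) / (coldFactor - 1);
    }
    static int maxToken(double count, int warmUpPeriodInSec, int coldFactor) {
        return warningToken(count, warmUpPeriodInSec, coldFactor)
            + (int) (2 * warmUpPeriodInSec * count / (1.0 + coldFactor));
    }
    static double slope(double count, int warmUpPeriodInSec, int coldFactor) {
        int warning = warningToken(count, warmUpPeriodInSec, coldFactor);
        int max = maxToken(count, warmUpPeriodInSec, coldFactor);
        return (coldFactor - 1.0) / count / (max - warning);
    }

    public static void main(String[] args) {
        // Demo rule: count = 20, warm-up 10 s, cold factor 3
        System.out.println(warningToken(20, 10, 3)); // 100
        System.out.println(maxToken(20, 10, 3));     // 200
        System.out.println(slope(20, 10, 3));        // ~0.001
    }
}
```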

Next let's turn to WarmUpController's canPass method:

WarmUpController#canPass

public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    //Get the QPS passed in the current window
    long passQps = (long) node.passQps();
    //Get the QPS passed in the previous window
    long previousQps = (long) node.previousPassQps();
    //Bring storedTokens and lastFilledTime up to date
    syncToken(previousQps);

    // If we are above the warning line, compute the allowed QPS along the slope
    long restToken = storedTokens.get();
    if (restToken >= warningToken) {
        //Compute the allowed QPS from the distance between restToken and the warning line:
        //the closer to the warning line, the "warmer" the system, so more QPS is released
        long aboveToken = restToken - warningToken;
        //The highest QPS reachable in the current state
        // current interval = restToken*slope+1/count
        double warningQps = Math.nextUp(1.0 / (aboveToken * slope + 1.0 / count));

        // Pass only if this request would not exceed it
        if (passQps + acquireCount <= warningQps) {
            return true;
        }
    } else {
        // count is the maximum reachable QPS
        if (passQps + acquireCount <= count) {
            return true;
        }
    }
    return false;
}

In this method, syncToken(previousQps) first brings storedTokens up to date; the value is then compared against the warning line. If storedTokens is still above the warning line, the allowed QPS is computed from its distance to the warning line via the slope: the larger storedTokens is, the lower the allowed QPS.

If storedTokens is already below the warning line, the system is warmed up, and the check is made directly against count.
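
With the demo-derived values (count = 20, slope = 0.001, warningToken = 100, maxToken = 200), we can sketch how the allowed QPS climbs as the system warms up; the class below is illustrative, not part of Sentinel:

```java
public class WarmUpQpsCurve {
    // Same formula as WarmUpController#canPass:
    // warningQps = 1 / (aboveToken * slope + 1 / count)
    static double warningQps(long aboveToken, double slope, double count) {
        return Math.nextUp(1.0 / (aboveToken * slope + 1.0 / count));
    }

    public static void main(String[] args) {
        double count = 20, slope = 0.001; // derived from the demo rule
        // Coldest state: restToken = maxToken = 200, so aboveToken = 100
        System.out.println(warningQps(100, slope, count)); // ~6.67 QPS
        // Halfway warmed up: aboveToken = 50
        System.out.println(warningQps(50, slope, count));  // ~10 QPS
        // At the warning line: aboveToken = 0, the full count is allowed
        System.out.println(warningQps(0, slope, count));   // ~20 QPS
    }
}
```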

WarmUpController#syncToken

protected void syncToken(long passQps) {
    long currentTime = TimeUtil.currentTimeMillis();
    //Truncate to whole seconds
    currentTime = currentTime - currentTime % 1000;
    long oldLastFillTime = lastFilledTime.get();
    if (currentTime <= oldLastFillTime) {
        return;
    }

    // Old token count
    long oldValue = storedTokens.get();
    // Compute the new token count (see coolDownTokens below)
    long newValue = coolDownTokens(currentTime, passQps);

    if (storedTokens.compareAndSet(oldValue, newValue)) {
        // Subtract the previous window's QPS from the token count, then store the new value
        long currentValue = storedTokens.addAndGet(0 - passQps);
        if (currentValue < 0) {
            storedTokens.set(0L);
        }
        lastFilledTime.set(currentTime);
    } 
}

syncToken obtains the new token count from coolDownTokens, stores it into storedTokens via CAS, then subtracts the previous window's QPS from storedTokens and updates lastFilledTime.

Actually I have a doubt here: the subtraction of the previous window's QPS from storedTokens is not otherwise bounded. If the system processes requests very quickly, a few windows of subtraction can push storedTokens below warningToken directly, so within the intended warm-up period, wouldn't the cold-start effect be lost?

private long coolDownTokens(long currentTime, long passQps) {
    long oldValue = storedTokens.get();
    long newValue = oldValue;

    // Precondition for adding tokens:
    // token consumption is well below the warning line
    if (oldValue < warningToken) {
        // Refill tokens at count per second
        newValue = (long) (oldValue + (currentTime - lastFilledTime.get()) * count / 1000);
    } else if (oldValue > warningToken) {
        //Still in the cold-start phase:
        // if the passing QPS exceeds count/coldFactor, tokens are being consumed
        // faster than the cool-down rate, so no tokens are added; otherwise refill
        if (passQps < (int) count / coldFactor) {
            newValue = (long) (oldValue + (currentTime - lastFilledTime.get()) * count / 1000);
        }
    }
    return Math.min(newValue, maxToken);
}

This method handles adding tokens: if traffic is light or the system is already warmed up, tokens are refilled at count per second; if the system is still in the warm-up phase and consuming faster than count/coldFactor, no tokens are added.
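
The refill arithmetic can be sketched with the demo numbers (count = 20, maxToken = 200); the class and method names here are illustrative, not from Sentinel:

```java
public class CoolDown {
    // Same refill formula as WarmUpController#coolDownTokens,
    // with the result capped at maxToken
    static long refill(long oldValue, long elapsedMs, double count, long maxToken) {
        long newValue = (long) (oldValue + elapsedMs * count / 1000);
        return Math.min(newValue, maxToken);
    }

    public static void main(String[] args) {
        // One second idle at count = 20 adds 20 tokens
        System.out.println(refill(0, 1000, 20, 200));   // 20
        // Near the cap, the result is clamped to maxToken
        System.out.println(refill(190, 1000, 20, 200)); // 200
    }
}
```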

WarmUpRateLimiterController combines cold start with uniform queueing. The code is straightforward, and with the analysis above you should have no trouble following it, so I won't go through it here.


Origin juejin.im/post/5d799a41f265da039b24c57a