4. Sentinel source code analysis - How does Sentinel implement degradation?

Happy Mid-Autumn Festival! On this full-moon night I feel like writing a source code analysis to express how happy I am ~

Sentinel source code analysis series:

1. Sentinel source code analysis - What does FlowRuleManager do when loading rules?

2. Sentinel source code analysis - How does Sentinel collect traffic statistics?

3. Sentinel source code analysis - How is QPS flow control implemented?


In my second article, 2. Sentinel source code analysis - How does Sentinel collect traffic statistics?, I described what Sentinel's main flow looks like. Building on that, the degradation process can be summarized as follows:
1. Set the degrade strategy: degrade by average response time, exception ratio, or exception count
2. Create a chain of slots for the resource
3. Call the slots in turn, and degrade according to the configured strategy type

Let's look at an example, which makes it easy to set breakpoints and trace the flow yourself:

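private static final String KEY = "abc";
private static final int threadCount = 100;
private static int seconds = 60 + 40;
// Counters used by the worker threads below (java.util.concurrent.atomic.AtomicInteger)
private static AtomicInteger pass = new AtomicInteger();
private static AtomicInteger block = new AtomicInteger();
private static AtomicInteger total = new AtomicInteger();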

public static void main(String[] args) throws Exception {
         
    List<DegradeRule> rules = new ArrayList<DegradeRule>();
    DegradeRule rule = new DegradeRule();
    rule.setResource(KEY);
    // set threshold rt, 10 ms
    rule.setCount(10);
    rule.setGrade(RuleConstant.DEGRADE_GRADE_RT);
    rule.setTimeWindow(10);
    rules.add(rule);
    DegradeRuleManager.loadRules(rules);

    for (int i = 0; i < threadCount; i++) {
        Thread entryThread = new Thread(new Runnable() {

            @Override
            public void run() {
                while (true) {
                    Entry entry = null;
                    try {
                        TimeUnit.MILLISECONDS.sleep(5);
                        entry = SphU.entry(KEY);
                        // token acquired
                        pass.incrementAndGet();
                        // sleep 600 ms, as rt
                        TimeUnit.MILLISECONDS.sleep(600);
                    } catch (Exception e) {
                        block.incrementAndGet();
                    } finally {
                        total.incrementAndGet();
                        if (entry != null) {
                            entry.exit();
                        }
                    }
                }
            }
        });
        entryThread.setName("working-thread");
        entryThread.start();
    }
}

The rest of the flow is basically the same as the main Sentinel flow described in the second article. All of Sentinel's degrade strategies are handled in DegradeSlot.

DegradeSlot

public class DegradeSlot extends AbstractLinkedProcessorSlot<DefaultNode> {
    @Override
    public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, boolean prioritized, Object... args)
        throws Throwable {
        DegradeRuleManager.checkDegrade(resourceWrapper, context, node, count);
        fireEntry(context, resourceWrapper, node, count, prioritized, args);
    }
}

DegradeSlot simply delegates the degrade check to DegradeRuleManager, so we go straight to the DegradeRuleManager.checkDegrade method.

DegradeRuleManager#checkDegrade

public static void checkDegrade(ResourceWrapper resource, Context context, DefaultNode node, int count)
    throws BlockException {
    // Look up the degrade rules registered for this resource
    Set<DegradeRule> rules = degradeRules.get(resource.getName());
    if (rules == null) {
        return;
    }
    
    for (DegradeRule rule : rules) {
        if (!rule.passCheck(context, node, count)) {
            throw new DegradeException(rule.getLimitApp(), rule);
        }
    }
}

The logic of this method is clear: it first looks up the registered degrade rules by resource name, then iterates over the rule set and calls passCheck on each rule. If any rule returns false, a DegradeException is thrown and the request is degraded.

DegradeRule#passCheck

public boolean passCheck(Context context, DefaultNode node, int acquireCount, Object... args) {
    // The breaker is already open: return false so the request is degraded right away
    if (cut.get()) {
        return false;
    }
    // Degrade decisions are based on the resource's global ClusterNode
    ClusterNode clusterNode = ClusterBuilderSlot.getClusterNode(this.getResource());
    if (clusterNode == null) {
        return true;
    }
    // Strategy: degrade by response time
    if (grade == RuleConstant.DEGRADE_GRADE_RT) {
        // Average response time of the node
        double rt = clusterNode.avgRt();
        if (rt < this.count) {
            passCount.set(0);
            return true;
        }
        // rtSlowRequestAmount defaults to 5
        // Sentinel will degrade the service only if count exceeds.
        if (passCount.incrementAndGet() < rtSlowRequestAmount) {
            return true;
        }
        // Strategy: degrade by exception ratio
    } else if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_RATIO) {
        double exception = clusterNode.exceptionQps();
        double success = clusterNode.successQps();
        double total = clusterNode.totalQps();
        // If total amount is less than minRequestAmount, the request will pass.
        if (total < minRequestAmount) {
            return true;
        }

        // In the same aligned statistic time window,
        // "success" (aka. completed count) = exception count + non-exception count (realSuccess)
        double realSuccess = success - exception;
        if (realSuccess <= 0 && exception < minRequestAmount) {
            return true;
        }

        if (exception / success < count) {
            return true;
        }
        // Strategy: degrade by exception count
    } else if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_COUNT) {
        double exception = clusterNode.totalException();
        if (exception < count) {
            return true;
        }
    }
    // Open the breaker and schedule a reset after the configured time window
    if (cut.compareAndSet(false, true)) {
        ResetTask resetTask = new ResetTask(this);
        pool.schedule(resetTask, timeWindow, TimeUnit.SECONDS);
    }

    return false;
}

This method first reads the value of cut; if it is true, the request is degraded immediately. Otherwise it obtains the resource's global ClusterNode and then evaluates one of three degrade strategies. If the strategy decides to degrade, cut is set to true and a ResetTask is scheduled to run after timeWindow seconds.
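To make the role of cut easier to see in isolation, here is a minimal sketch of the "flip a flag, schedule a reset" pattern. It is not Sentinel code: the class CutFlagSketch, its methods, and the millisecond time window are assumptions made for illustration, and the reset here only clears the flag.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the cut / reset part of a degrade rule; not Sentinel code.
class CutFlagSketch {
    // Daemon thread so the scheduler never keeps the JVM alive.
    private static final ScheduledExecutorService POOL =
        Executors.newScheduledThreadPool(1, r -> {
            Thread t = new Thread(r, "cut-reset-sketch");
            t.setDaemon(true);
            return t;
        });

    private final AtomicBoolean cut = new AtomicBoolean(false);
    private final long timeWindowMs;

    CutFlagSketch(long timeWindowMs) {
        this.timeWindowMs = timeWindowMs;
    }

    // Called when a strategy decides the resource must be degraded.
    // Always returns false, mirroring the final "return false" of passCheck.
    boolean trip() {
        // Only the thread that flips false -> true schedules the reset.
        if (cut.compareAndSet(false, true)) {
            POOL.schedule(() -> cut.set(false), timeWindowMs, TimeUnit.MILLISECONDS);
        }
        return false;
    }

    // While this is true, every request is rejected without looking at statistics.
    boolean isCut() {
        return cut.get();
    }
}
```

The compareAndSet guard is what keeps many concurrent requests from scheduling many reset tasks: only one thread wins the false-to-true transition.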

Degrading by response time: DEGRADE_GRADE_RT

if (grade == RuleConstant.DEGRADE_GRADE_RT) {
    // Average response time of the node
    double rt = clusterNode.avgRt();
    if (rt < this.count) {
        passCount.set(0);
        return true;
    }
    // rtSlowRequestAmount defaults to 5
    // Sentinel will degrade the service only if count exceeds.
    if (passCount.incrementAndGet() < rtSlowRequestAmount) {
        return true;
    } 
}

When degrading by response time, the rule reads the average response time of the clusterNode. If the average response time is not less than the configured count (in milliseconds), passCount is incremented by 1; once passCount reaches rtSlowRequestAmount (5 by default), the request is degraded.

So a long average response time does not degrade the resource immediately. Following the code above, the first four consecutive requests that see a slow average still pass, and the fifth one is degraded. Any request that sees an average below count resets passCount to 0.
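The counting behaviour can be checked with a small standalone sketch. It is not Sentinel code: RtCheckSketch is a hypothetical class that copies only the RT branch shown above, and the average response time is passed in by the caller instead of being read from a ClusterNode.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the RT branch of passCheck; not Sentinel code.
class RtCheckSketch {
    private final double count;             // RT threshold in milliseconds
    private final int rtSlowRequestAmount;  // 5 by default according to the article
    private final AtomicLong passCount = new AtomicLong();

    RtCheckSketch(double count, int rtSlowRequestAmount) {
        this.count = count;
        this.rtSlowRequestAmount = rtSlowRequestAmount;
    }

    // avgRt is supplied by the caller; in the article it comes from clusterNode.avgRt().
    boolean passCheck(double avgRt) {
        if (avgRt < count) {
            passCount.set(0);   // a fast average resets the streak
            return true;
        }
        // Slow average: tolerate it until the streak reaches rtSlowRequestAmount.
        return passCount.incrementAndGet() < rtSlowRequestAmount;
    }
}
```

With count = 10 and rtSlowRequestAmount = 5, four consecutive calls to passCheck(600) return true and the fifth returns false. A single passCheck(5) in between starts the streak over.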

Let's step into the clusterNode's avgRt method to see how the average response time is obtained.

clusterNode is an instance of StatisticNode.
StatisticNode#avgRt

public double avgRt() {
    // Number of calls that completed within the current time window
    long successCount = rollingCounterInSecond.success();
    if (successCount == 0) {
        return 0;
    }
    // Total response time within the window divided by the number of completed calls
    return rollingCounterInSecond.rt() * 1.0 / successCount;
}

This method first asks rollingCounterInSecond for the number of successful calls, then for the total response time within the time window, and divides the total response time by the number of successful calls to get the average response time per call.

In 1. Sentinel source code analysis - What does FlowRuleManager do when loading rules?, I explained in detail how rollingCounterInMinute inside StatisticNode works; rollingCounterInMinute keeps statistics over a one-minute time window. Now we turn to rollingCounterInSecond, which keeps statistics over a one-second time window.

rollingCounterInSecond is initialized inside StatisticNode:

private transient volatile Metric rollingCounterInSecond = new ArrayMetric(SampleCountProperty.SAMPLE_COUNT,
    IntervalProperty.INTERVAL);

Two parameters are passed in here: the value of SampleCountProperty.SAMPLE_COUNT is 2, and the value of IntervalProperty.INTERVAL is 1000.

We step into the ArrayMetric constructor:

private final LeapArray<MetricBucket> data;
public ArrayMetric(int sampleCount, int intervalInMs) {
    this.data = new OccupiableBucketLeapArray(sampleCount, intervalInMs);
}

When an ArrayMetric instance is created, it creates an OccupiableBucketLeapArray instance and assigns it to data.

OccupiableBucketLeapArray

public OccupiableBucketLeapArray(int sampleCount, int intervalInMs) {
    // This class is the original "CombinedBucketArray".
    super(sampleCount, intervalInMs);
    this.borrowArray = new FutureBucketLeapArray(sampleCount, intervalInMs);
}

OccupiableBucketLeapArray extends the abstract class LeapArray, and its constructor calls the parent class constructor:
LeapArray

public LeapArray(int sampleCount, int intervalInMs) {
    AssertUtil.isTrue(sampleCount > 0, "bucket count is invalid: " + sampleCount);
    AssertUtil.isTrue(intervalInMs > 0, "total time interval of the sliding window should be positive");
    // intervalInMs must be an integer multiple of sampleCount
    AssertUtil.isTrue(intervalInMs % sampleCount == 0, "time span needs to be evenly divided");
    // Time span of each small window (bucket)
    this.windowLengthInMs = intervalInMs / sampleCount;
    // Total length of the sliding window
    this.intervalInMs = intervalInMs;
    // Number of buckets
    this.sampleCount = sampleCount;

    this.array = new AtomicReferenceArray<>(sampleCount);
}

During initialization, OccupiableBucketLeapArray also creates a FutureBucketLeapArray instance and assigns it to borrowArray.

FutureBucketLeapArray extends LeapArray as well:

public FutureBucketLeapArray(int sampleCount, int intervalInMs) {
    // This class is the original "BorrowBucketArray".
    super(sampleCount, intervalInMs);
}

It is initialized simply by calling the constructor of its superclass LeapArray.

That completes the creation of rollingCounterInSecond.

Back in StatisticNode: when avgRt is called, it calls rollingCounterInSecond.success() to get the number of successful calls in the current time window:

ArrayMetric#success

public long success() {
    // Create or refresh the current time window
    data.currentWindow();
    long success = 0;
    // Collect the buckets that are still valid in the window
    List<MetricBucket> list = data.values();
    for (MetricBucket window : list) {
        success += window.success();
    }
    return success;
}

data here is a LeapArray. LeapArray holds an array that records the time windows; here the windows are second-based and the array has size 2. The structure of data is the same as in the diagram from 1. Sentinel source code analysis - What does FlowRuleManager do when loading rules?

The difference is that here the array has only two WindowWrap elements. Each WindowWrap wraps a MetricBucket object that holds the statistics, such as pass count, block count, and exception count ~

Calling data.currentWindow() ends up in LeapArray's currentWindow method, with the current time passed in as timeMillis:
LeapArray#currentWindow

public WindowWrap<T> currentWindow(long timeMillis) {
    if (timeMillis < 0) {
        return null;
    }
    // Work out which slot of the array the timestamp falls into
    int idx = calculateTimeIdx(timeMillis);
    // Calculate current bucket start time.
    long windowStart = calculateWindowStart(timeMillis);

    while (true) {
        // Read the existing bucket in that slot
        WindowWrap<T> old = array.get(idx);
        if (old == null) {
           
            WindowWrap<T> window = new WindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
            if (array.compareAndSet(idx, null, window)) {
                // Successfully updated, return the created bucket.
                return window;
            } else {
                // Contention failed, the thread will yield its time slice to wait for bucket available.
                Thread.yield();
            }
            // If the slot's window start time equals the computed start time,
            // this is the window we are looking for, so return it directly
        } else if (windowStart == old.windowStart()) {
             
            return old;
        } else if (windowStart > old.windowStart()) { 
            // The bucket in the slot is older than the current window start, so reset it to the new start time
            if (updateLock.tryLock()) {
                try {
                    // Successfully get the update lock, now we reset the bucket.
                    return resetWindowTo(old, windowStart);
                } finally {
                    updateLock.unlock();
                }
            } else {
                // Contention failed, the thread will yield its time slice to wait for bucket available.
                Thread.yield();
            }
        } else if (windowStart < old.windowStart()) {
            // Should not go through here, as the provided time is already behind.
            return new WindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
        }
    }
}

I will only describe this method briefly here, since it was explained in detail in the first article of this series.

The method computes an index into array from the current timestamp and looks up the corresponding slot. If the slot is empty, a new window is installed with CAS; if the slot already holds the window for the current time, it is returned directly; if the window in the slot has expired, it is reset to the current window and its stale data is cleared.

Here I reuse the example from 1. Sentinel source code analysis - What does FlowRuleManager do when loading rules?:

1. Suppose the buckets in array look like this:
  NULL      B4
|_______|_______|
800     1000    1200   
    ^
   time=888
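The slot that the current time maps to happens to be empty, so a new bucket is installed with CAS.

2. If array already holds data and the window start time in the slot equals the current window start time, the window is returned directly: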
      B3      B4
 ||_______|_______||___
800     1000    1200  timestamp
      ^
    time=888

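3. Suppose the current time is 1676 and the window in the corresponding slot starts earlier than the current window start time. Then the lock is taken, the slot's window start time is set to the current window start time, and the data in the slot is reset: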
   (old)
             B0      
 |_______||_______|
 ...    1200     1400
    ^
  time=1676
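The slot index and window start used in these three cases come down to two lines of arithmetic. The sketch below is a standalone illustration based on my reading of LeapArray, not the Sentinel class itself; WindowMathSketch is a hypothetical name, and only the two calculations are reproduced.

```java
// Standalone sketch of the bucket arithmetic; names mirror LeapArray but this is not Sentinel code.
class WindowMathSketch {
    private final int windowLengthInMs;
    private final int sampleCount;

    WindowMathSketch(int sampleCount, int intervalInMs) {
        // Same split as in the LeapArray constructor shown earlier.
        this.windowLengthInMs = intervalInMs / sampleCount;
        this.sampleCount = sampleCount;
    }

    // Which slot of the circular array the timestamp falls into.
    int calculateTimeIdx(long timeMillis) {
        long timeId = timeMillis / windowLengthInMs;
        return (int) (timeId % sampleCount);
    }

    // Start time of the bucket that contains the timestamp.
    long calculateWindowStart(long timeMillis) {
        return timeMillis - timeMillis % windowLengthInMs;
    }
}
```

In the diagrams above each bucket spans 200 ms, so time=888 belongs to the bucket starting at 800 and time=1676 to the bucket starting at 1600. With the rollingCounterInSecond settings from this article (2 buckets over 1000 ms) each bucket spans 500 ms, so 888 maps to slot 1 with window start 500.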

Back in ArrayMetric's success method, the next call is data.values(), which passes the current time to the method below:
LeapArray#values

public List<T> values(long timeMillis) {
    if (timeMillis < 0) {
        return new ArrayList<T>();
    }
    int size = array.length();
    List<T> result = new ArrayList<T>(size);

    for (int i = 0; i < size; i++) {
        WindowWrap<T> windowWrap = array.get(i);
        if (windowWrap == null || isWindowDeprecated(timeMillis, windowWrap)) {
            continue;
        }
        result.add(windowWrap.value());
    }
    return result;
}

This method collects all MetricBucket objects that are still valid and returns them.
The success method then sums up the success count of each returned MetricBucket.

Next, let's look at ArrayMetric's rt method:

public long rt() {
    data.currentWindow();
    long rt = 0;
    // Collect the statistics of the current time window
    List<MetricBucket> list = data.values();
    // Sum the response times recorded in the current time window
    for (MetricBucket window : list) {
        rt += window.rt();
    }
    return rt;
}

This method is similar to the success method above: it sums the rt values of all valid MetricBucket objects and returns the total.
The average response time is then the total returned by rt divided by the number of successful calls.
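Putting success, rt, and values together, the following single-threaded sketch shows how an average can be computed over buckets that are still inside the sliding window. It is a simplification and not Sentinel code: RollingRtSketch and Bucket are hypothetical names, CAS and locking are left out, timestamps are passed in explicitly and assumed not to go backwards, and the expiry rule (a bucket is dropped once it started more than one full interval ago) reflects my reading of LeapArray.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, single-threaded stand-in for ArrayMetric + LeapArray; not Sentinel code.
class RollingRtSketch {
    // One bucket: its start time plus the two counters avgRt needs.
    static final class Bucket {
        final long windowStart;
        long success;
        long rt;

        Bucket(long windowStart) {
            this.windowStart = windowStart;
        }
    }

    private final int windowLengthInMs;
    private final int intervalInMs;
    private final Bucket[] array;

    RollingRtSketch(int sampleCount, int intervalInMs) {
        this.windowLengthInMs = intervalInMs / sampleCount;
        this.intervalInMs = intervalInMs;
        this.array = new Bucket[sampleCount];
    }

    // Find or (re)create the bucket for timeMillis.
    Bucket currentWindow(long timeMillis) {
        int idx = (int) ((timeMillis / windowLengthInMs) % array.length);
        long windowStart = timeMillis - timeMillis % windowLengthInMs;
        Bucket old = array[idx];
        if (old == null || windowStart > old.windowStart) {
            old = new Bucket(windowStart);  // empty slot or stale bucket: start fresh
            array[idx] = old;
        }
        return old;
    }

    // Record one completed call that took rtMs milliseconds.
    void addCompleted(long timeMillis, long rtMs) {
        Bucket bucket = currentWindow(timeMillis);
        bucket.success++;
        bucket.rt += rtMs;
    }

    // Buckets that still belong to the sliding window ending at timeMillis.
    List<Bucket> values(long timeMillis) {
        List<Bucket> result = new ArrayList<>();
        for (Bucket bucket : array) {
            if (bucket == null || timeMillis - bucket.windowStart > intervalInMs) {
                continue;   // empty or expired
            }
            result.add(bucket);
        }
        return result;
    }

    // Total response time divided by the number of completed calls, 0 if there were none.
    double avgRt(long timeMillis) {
        currentWindow(timeMillis);
        long success = 0;
        long rt = 0;
        for (Bucket bucket : values(timeMillis)) {
            success += bucket.success;
            rt += bucket.rt;
        }
        return success == 0 ? 0 : rt * 1.0 / success;
    }
}
```

For example, with 2 buckets over 1000 ms, recording a 600 ms call at t=100 and a 400 ms call at t=600 gives an average of 500 at t=700. At t=1200 the first bucket has been recycled, so only the 400 ms call remains.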

Now we go back to the response-time strategy in DegradeRule's passCheck method:

if (grade == RuleConstant.DEGRADE_GRADE_RT) {
    // Average response time of the node
    double rt = clusterNode.avgRt();
    if (rt < this.count) {
        passCount.set(0);
        return true;
    }
    // rtSlowRequestAmount defaults to 5
    // Sentinel will degrade the service only if count exceeds.
    if (passCount.incrementAndGet() < rtSlowRequestAmount) {
        return true;
    }
}
// ... (rest omitted)
return false;

If the average response time is less than the configured count, passCount is reset and true is returned, which means no exception is thrown. If the average response time stays at or above count for 5 consecutive checks, false is returned and a degrade exception is thrown.

Degrading by exception ratio: DEGRADE_GRADE_EXCEPTION_RATIO

if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_RATIO) {
    // Exceptions per second
    double exception = clusterNode.exceptionQps();
    // Completed calls per second
    double success = clusterNode.successQps();
    // Total calls per second
    double total = clusterNode.totalQps();
    // If total amount is less than minRequestAmount, the request will pass.
    // minRequestAmount is 5 here, so fewer than 5 calls never trigger degradation
    if (total < minRequestAmount) {
        return true;
    }

    // In the same aligned statistic time window,
    // "success" (aka. completed count) = exception count + non-exception count (realSuccess)
    double realSuccess = success - exception;
    if (realSuccess <= 0 && exception < minRequestAmount) {
        return true;
    }

    if (exception / success < count) {
        return true;
    } 
}
// ... (rest omitted)
return false;

This branch reads the success QPS and the exception QPS, runs the sanity checks, and then computes the ratio exception / success. If the ratio is less than count it returns true; otherwise it returns false and a degrade exception is thrown.
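The three checks are easier to follow with concrete numbers. The sketch below copies only the decision logic into a pure function; ExceptionRatioSketch is a hypothetical class, not Sentinel code, and the QPS values are passed in as plain arguments instead of being read from a ClusterNode.

```java
// Hypothetical stand-in for the exception-ratio branch of passCheck; not Sentinel code.
class ExceptionRatioSketch {
    // Returns true when the request may pass, false when the rule should trip.
    static boolean passCheck(double exception, double success, double total,
                             int minRequestAmount, double count) {
        if (total < minRequestAmount) {
            return true;    // too little traffic to judge
        }
        // "success" counts every completed call, including the ones that threw.
        double realSuccess = success - exception;
        if (realSuccess <= 0 && exception < minRequestAmount) {
            return true;    // every completed call failed, but there were only a few of them
        }
        return exception / success < count;
    }
}
```

With minRequestAmount = 5 and count = 0.5: 2 exceptions out of 10 completed calls pass (ratio 0.2), 6 out of 10 trip the rule (ratio 0.6), and 3 exceptions out of 3 completed calls still pass because fewer than 5 exceptions were seen.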

Next, let's look at the exceptionQps method:
StatisticNode#exceptionQps

public double exceptionQps() {
    return rollingCounterInSecond.exception() / rollingCounterInSecond.getWindowIntervalInSec();
}

rollingCounterInSecond.getWindowIntervalInSec returns the length of the time window in seconds; here it returns 1.
ArrayMetric#exception

public long exception() {
    data.currentWindow();
    long exception = 0;
    List<MetricBucket> list = data.values();
    for (MetricBucket window : list) {
        exception += window.exception();
    }
    return exception;
}

This method works the same way as the success and rt methods analyzed above, so a quick read is enough.

Degrading by exception count: DEGRADE_GRADE_EXCEPTION_COUNT

if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_COUNT) {
    double exception = clusterNode.totalException();
    if (exception < count) {
        return true;
    }
}

Degrading by exception count is very straightforward: the rule checks whether the total number of exceptions has reached count. If it is still below count the request passes; otherwise it is degraded.

That wraps up how degradation is implemented ~ ~

Origin www.cnblogs.com/luozhiyun/p/11517918.html