Sentinel source code analysis 10: circuit breaking and degradation (DegradeSlot)

Feel free to follow github.com/hsfxuebao; I hope it is helpful to you. If you find it useful, please give it a Star.

1. Overview

DegradeSlot is the slot that implements circuit breaking for service degradation:

  • During entry, if the breaker is in the OPEN state, it checks whether the circuit breaking period has elapsed and whether the switch to HALF_OPEN succeeds. If both hold, the request passes; otherwise it is rejected with a DegradeException.

  • If the breaker is already in the HALF_OPEN state, a DegradeException is thrown directly.

 

2. Circuit breakers

Sentinel has two types of circuit breaker, ExceptionCircuitBreaker and ResponseTimeCircuitBreaker. Both extend AbstractCircuitBreaker, which implements CircuitBreaker.

If a non-BlockException (a business exception or any other Throwable) is thrown while the request is being processed, exit checks whether the errors have reached the configured error count or error ratio.

If a call takes longer than the configured maximum response time, it is counted as a slow call, and the breaker is also triggered once the proportion of slow calls exceeds the configured threshold.

The purpose of tripping is to move the breaker into the OPEN or HALF_OPEN state, so that the tryPass check can then either let a request pass or reject it with an exception.

With the configuration items of the Dashboard panel in mind, take a look at the circuit breaker interface:

public interface CircuitBreaker {

    /**
     * Returns the degrade (circuit breaking) rule bound to this breaker.
     */
    DegradeRule getRule();

    /**
     * Returns true if the request may pass; false means it is rejected (degraded).
     */
    boolean tryPass(Context context);

    /**
     * Current state of the circuit breaker.
     */
    State currentState();

    /**
     * Callback invoked when a request that was allowed to pass has completed.
     */
    void onRequestComplete(Context context);

    /**
     * Circuit breaker state.
     */
    enum State {
      
        OPEN,
        
        HALF_OPEN,
       
        CLOSED
    }
}

With the circuit breaking rules in mind, the circuit breaking flow is explained in detail below.

3. The processing flow of the circuit breaker mechanism

When the trigger condition is met (for example, more than 20% of the requests handled by an interface within one second end in an exception; the actual rule is configured by the user), the circuit breaker opens. While it is open, every call to the interface within the next X seconds is blocked and fails fast (service degradation).

After X seconds, the next request to the interface puts the breaker into the half-open state:

  • If that request succeeds, the breaker returns to the normal (closed) state
  • If that request fails, the breaker returns to the open state and blocks calls for another X seconds
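
The flow above can be sketched as a small state machine. This is a minimal, single-threaded illustration and not Sentinel's implementation: the class name SimpleBreaker, the trip method and the explicit now parameter (a timestamp in milliseconds) are assumptions made for the example, and the statistics that decide when to trip are left out.

```java
// Hypothetical sketch of the open / half-open / closed flow; not Sentinel code.
final class SimpleBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long recoveryMillis; // the "X seconds" from the text, in ms
    private State state = State.CLOSED;
    private long retryAt;              // earliest time a probe request is allowed

    SimpleBreaker(long recoveryMillis) {
        this.recoveryMillis = recoveryMillis;
    }

    State state() {
        return state;
    }

    // Called by the (omitted) statistics logic once the trigger condition is met.
    void trip(long now) {
        state = State.OPEN;
        retryAt = now + recoveryMillis;
    }

    // Entry check: returns true if the request may proceed.
    boolean tryPass(long now) {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && now >= retryAt) {
            state = State.HALF_OPEN; // this request becomes the probe
            return true;
        }
        return false; // still inside the blocking window, or a probe is in flight
    }

    // Exit callback: reports how a request that passed has ended.
    void onComplete(boolean success, long now) {
        if (state != State.HALF_OPEN) {
            return;
        }
        if (success) {
            state = State.CLOSED; // probe succeeded: back to normal
        } else {
            trip(now);            // probe failed: block for another window
        }
    }
}
```

Passing the time in explicitly keeps the sketch deterministic; a real breaker would read a clock and would need atomic state updates to be safe under concurrency.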

4. Circuit breaker states

The breaker has three states: CLOSED, OPEN and HALF_OPEN. Note that the transition from OPEN to HALF_OPEN only happens during the breaker check performed on entry.

5. Source code analysis

Sentinel implements circuit breaking in DegradeSlot, the last slot in the chain of responsibility:

@SpiOrder(-1000)
public class DegradeSlot extends AbstractLinkedProcessorSlot<DefaultNode> {

    @Override
    public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count,
                      boolean prioritized, Object... args) throws Throwable {
        // Run the circuit breaker check before firing the subsequent slots
        performChecking(context, resourceWrapper);

        fireEntry(context, resourceWrapper, node, count, prioritized, args);
    }

    void performChecking(Context context, ResourceWrapper r) throws BlockException {
        // Look up all CircuitBreakers registered for the resource name
        List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
        if (circuitBreakers == null || circuitBreakers.isEmpty()) {
            return;
        }
        for (CircuitBreaker cb : circuitBreakers) {
            // cb.tryPass only checks the breaker state (closed, open or half-open)
            if (!cb.tryPass(context)) {
                throw new DegradeException(cb.getRule().getLimitApp(), cb.getRule());
            }
        }
    }

    @Override
    public void exit(Context context, ResourceWrapper r, int count, Object... args) {
        Entry curEntry = context.getCurEntry();
        // If another slot has already raised a BlockException, just call fireExit and skip the circuit breaking logic
        if (curEntry.getBlockError() != null) {
            fireExit(context, r, count, args);
            return;
        }
        // Look up all CircuitBreakers registered for the resource name
        List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
        if (circuitBreakers == null || circuitBreakers.isEmpty()) {
            fireExit(context, r, count, args);
            return;
        }

        if (curEntry.getBlockError() == null) {
            // Invoke onRequestComplete() on each CircuitBreaker
            for (CircuitBreaker circuitBreaker : circuitBreakers) {
                circuitBreaker.onRequestComplete(context);
            }
        }

        fireExit(context, r, count, args);
    }
}

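On entering DegradeSlot, the slot only checks whether the breaker is open and, if the retry timeout has elapsed, switches it to the half-open state; it then immediately returns whether the request may pass. The actual decision to open the breaker is made in exit(), because that method is called after the business method has run, and the breaker needs the business exceptions or the execution time of the business method to decide whether to open.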

First, look at the entry() method of DegradeSlot, which calls CircuitBreaker.tryPass(). CircuitBreaker has two implementations, ExceptionCircuitBreaker and ResponseTimeCircuitBreaker, both of which extend AbstractCircuitBreaker.

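entry() actually ends up in AbstractCircuitBreaker.tryPass(), which does only one thing: if the breaker is open but the retry interval has already elapsed, it switches the breaker to the half-open state.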

public abstract class AbstractCircuitBreaker implements CircuitBreaker {

    @Override
    public boolean tryPass(Context context) {
        if (currentState.get() == State.CLOSED) {
            return true;
        }
        if (currentState.get() == State.OPEN) {
            // The breaker is open: if the retry interval has elapsed, try to switch to half-open
            return retryTimeoutArrived() && fromOpenToHalfOpen(context);
        }
        return false;
    }
}
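
The helpers retryTimeoutArrived() and fromOpenToHalfOpen() are not shown above. The following sketch illustrates one plausible way such a check can guarantee that only a single request is let through as the probe when many threads arrive at once: a compare-and-set on the state. The class ProbeDemo, the nextRetryTimestamp field and the explicit now parameter are assumptions for illustration, not Sentinel's actual code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a CAS-guarded OPEN -> HALF_OPEN transition.
final class ProbeDemo {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final AtomicReference<State> currentState = new AtomicReference<>(State.OPEN);
    private final long nextRetryTimestamp; // assumed field: when probing becomes allowed

    ProbeDemo(long nextRetryTimestamp) {
        this.nextRetryTimestamp = nextRetryTimestamp;
    }

    boolean retryTimeoutArrived(long now) {
        return now >= nextRetryTimestamp;
    }

    boolean fromOpenToHalfOpen() {
        // Only one caller can win this compare-and-set.
        return currentState.compareAndSet(State.OPEN, State.HALF_OPEN);
    }

    boolean tryPass(long now) {
        State s = currentState.get();
        if (s == State.CLOSED) {
            return true;
        }
        if (s == State.OPEN) {
            return retryTimeoutArrived(now) && fromOpenToHalfOpen();
        }
        return false; // HALF_OPEN: a probe is already in flight
    }

    // Starts the given number of threads against one open breaker and counts how many pass.
    static int countPassed(int threads) {
        ProbeDemo breaker = new ProbeDemo(0L);
        AtomicInteger passed = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (breaker.tryPass(1L)) {
                    passed.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return passed.get();
    }
}
```

Calling ProbeDemo.countPassed(8) lets eight threads race for the probe slot; because the transition is a single compare-and-set, exactly one of them passes and the rest are rejected.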

exit() calls the onRequestComplete() method of ExceptionCircuitBreaker and ResponseTimeCircuitBreaker.

5.1 ExceptionCircuitBreaker

Let's start with the simpler ExceptionCircuitBreaker, which backs the exception ratio and exception count strategies. The code is as follows:

public class ExceptionCircuitBreaker extends AbstractCircuitBreaker {
   
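    // Strategy: exception ratio or exception count
    private final int strategy;
    // Minimum number of requests
    private final int minRequestAmount;
    // Threshold (a ratio or a count, depending on the strategy)
    private final double threshold;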

    private final LeapArray<SimpleErrorCounter> stat;

    @Override
    public void onRequestComplete(Context context) {
        Entry entry = context.getCurEntry();
        if (entry == null) {
            return;
        }
        Throwable error = entry.getError();
        // Error counter of the current time window
        SimpleErrorCounter counter = stat.currentWindow().value();
        // Increment the error count
        if (error != null) {
            counter.getErrorCount().add(1);
        }
        // Increment the total count
        counter.getTotalCount().add(1);

        handleStateChangeWhenThresholdExceeded(error);
    }

    private void handleStateChangeWhenThresholdExceeded(Throwable error) {
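        // The breaker is already open: nothing to do
        if (currentState.get() == State.OPEN) {
            return;
        }

        // The breaker is half-open
        if (currentState.get() == State.HALF_OPEN) {
            if (error == null) {
                // The probe request completed without an exception: close the breaker
                fromHalfOpenToClose();
            } else {
                // The probe request raised an exception: open the breaker again
                fromHalfOpenToOpen(1.0d);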
            }
            return;
        }

        // Collect the counters of all windows
        List<SimpleErrorCounter> counters = stat.values();
        long errCount = 0;
        long totalCount = 0;
        for (SimpleErrorCounter counter : counters) {
            errCount += counter.errorCount.sum();
            totalCount += counter.totalCount.sum();
        }
        // Skip circuit breaking when the total request count is below minRequestAmount, which is configured in the degrade rule
        if (totalCount < minRequestAmount) {
            return;
        }
        double curCount = errCount;
        if (strategy == DEGRADE_GRADE_EXCEPTION_RATIO) {
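            // For the error ratio strategy, convert the count into a ratio over the window
            curCount = errCount * 1.0d / totalCount;
        }
        // Open the breaker if the error ratio or error count exceeds the threshold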
        if (curCount > threshold) {
            transformToOpen(curCount);
        }
    }
}

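ExceptionCircuitBreaker is invoked after the business method has run and does the following:

  • If the breaker is half-open:

    • the request completed without an exception: close the breaker
    • the request raised an exception: open the breaker again
  • The degrade rule configured in the Sentinel Dashboard specifies a minimum number of requests; if the total request count is below it, no circuit breaking is done

  • If the error ratio or error count exceeds the threshold, open the breaker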
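
To make the threshold rule concrete, here is a small worked sketch of the decision taken in the closed state. It mirrors the comparison in handleStateChangeWhenThresholdExceeded but is not Sentinel code: the class ExceptionThresholdDemo, the Strategy enum and the shouldOpen method are hypothetical names used only for illustration.

```java
// Hypothetical sketch of the closed-state decision for exception-based breaking.
final class ExceptionThresholdDemo {
    enum Strategy { ERROR_RATIO, ERROR_COUNT } // stand-ins for the strategy constants

    static boolean shouldOpen(Strategy strategy, long errCount, long totalCount,
                              int minRequestAmount, double threshold) {
        // Too few requests in the window: never open, whatever the error rate.
        if (totalCount < minRequestAmount) {
            return false;
        }
        double current = errCount;
        if (strategy == Strategy.ERROR_RATIO) {
            current = errCount * 1.0d / totalCount;
        }
        // Strictly greater: a value exactly at the threshold does not open the breaker.
        return current > threshold;
    }
}
```

For example, with minRequestAmount = 5 and a ratio threshold of 0.2, 3 errors out of 10 requests (ratio 0.3) open the breaker, 2 out of 10 (ratio exactly 0.2) do not, and 4 errors out of 4 requests do not either because the request count is below the minimum.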

5.2 ResponseTimeCircuitBreaker

Next, ResponseTimeCircuitBreaker:

public class ResponseTimeCircuitBreaker extends AbstractCircuitBreaker {

    private static final double SLOW_REQUEST_RATIO_MAX_VALUE = 1.0d;

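    // Maximum allowed RT; requests slower than this count as slow calls
    private final long maxAllowedRt;
    // Maximum ratio of slow requests
    private final double maxSlowRequestRatio;
    // Minimum number of requests
    private final int minRequestAmount;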

    private final LeapArray<SlowRequestCounter> slidingCounter;

    public ResponseTimeCircuitBreaker(DegradeRule rule) {
        this(rule, new SlowRequestLeapArray(1, rule.getStatIntervalMs()));
    }

    ResponseTimeCircuitBreaker(DegradeRule rule, LeapArray<SlowRequestCounter> stat) {
        super(rule);
        AssertUtil.isTrue(rule.getGrade() == RuleConstant.DEGRADE_GRADE_RT, "rule metric type should be RT");
        AssertUtil.notNull(stat, "stat cannot be null");
        this.maxAllowedRt = Math.round(rule.getCount());
        this.maxSlowRequestRatio = rule.getSlowRatioThreshold();
        this.minRequestAmount = rule.getMinRequestAmount();
        this.slidingCounter = stat;
    }

    @Override
    public void resetStat() {
        // Reset current bucket (bucket count = 1).
        slidingCounter.currentWindow().value().reset();
    }

    @Override
    public void onRequestComplete(Context context) {
        SlowRequestCounter counter = slidingCounter.currentWindow().value();
        Entry entry = context.getCurEntry();
        if (entry == null) {
            return;
        }
        long completeTime = entry.getCompleteTimestamp();
        if (completeTime <= 0) {
            completeTime = TimeUtil.currentTimeMillis();
        }
        long rt = completeTime - entry.getCreateTimestamp();
        if (rt > maxAllowedRt) {
            counter.slowCount.add(1);
        }
        counter.totalCount.add(1);

        handleStateChangeWhenThresholdExceeded(rt);
    }

    private void handleStateChangeWhenThresholdExceeded(long rt) {
        if (currentState.get() == State.OPEN) {
            return;
        }
        
        if (currentState.get() == State.HALF_OPEN) {
            // In detecting request
            // TODO: improve logic for half-open recovery
            if (rt > maxAllowedRt) {
                fromHalfOpenToOpen(1.0d);
            } else {
                fromHalfOpenToClose();
            }
            return;
        }

        List<SlowRequestCounter> counters = slidingCounter.values();
        long slowCount = 0;
        long totalCount = 0;
        for (SlowRequestCounter counter : counters) {
            slowCount += counter.slowCount.sum();
            totalCount += counter.totalCount.sum();
        }
        if (totalCount < minRequestAmount) {
            return;
        }
        double currentRatio = slowCount * 1.0d / totalCount;
        if (currentRatio > maxSlowRequestRatio) {
            transformToOpen(currentRatio);
        }
        if (Double.compare(currentRatio, maxSlowRequestRatio) == 0 &&
                Double.compare(maxSlowRequestRatio, SLOW_REQUEST_RATIO_MAX_VALUE) == 0) {
            transformToOpen(currentRatio);
        }
    }

    static class SlowRequestCounter {
        private LongAdder slowCount;
        private LongAdder totalCount;

        public SlowRequestCounter() {
            this.slowCount = new LongAdder();
            this.totalCount = new LongAdder();
        }

        public LongAdder getSlowCount() {
            return slowCount;
        }

        public LongAdder getTotalCount() {
            return totalCount;
        }

        public SlowRequestCounter reset() {
            slowCount.reset();
            totalCount.reset();
            return this;
        }

        @Override
        public String toString() {
            return "SlowRequestCounter{" +
                "slowCount=" + slowCount +
                ", totalCount=" + totalCount +
                '}';
        }
    }

    static class SlowRequestLeapArray extends LeapArray<SlowRequestCounter> {

        public SlowRequestLeapArray(int sampleCount, int intervalInMs) {
            super(sampleCount, intervalInMs);
        }

        @Override
        public SlowRequestCounter newEmptyBucket(long timeMillis) {
            return new SlowRequestCounter();
        }

        @Override
        protected WindowWrap<SlowRequestCounter> resetWindowTo(WindowWrap<SlowRequestCounter> w, long startTime) {
            w.resetTo(startTime);
            w.value().reset();
            return w;
        }
    }
}

The code is fairly simple, so it is not explained line by line.
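
One detail that is easy to miss is the last check in handleStateChangeWhenThresholdExceeded: a ratio can never be strictly greater than 1.0, so a rule with a slow ratio threshold of 1.0 would never trip without the extra equality branch. The sketch below isolates that decision; the class SlowRatioDemo and the shouldOpen method are hypothetical names, not part of Sentinel.

```java
// Hypothetical sketch of the closed-state decision for slow-call-based breaking.
final class SlowRatioDemo {
    static boolean shouldOpen(long slowCount, long totalCount,
                              int minRequestAmount, double maxSlowRequestRatio) {
        // Too few requests in the window: never open.
        if (totalCount < minRequestAmount) {
            return false;
        }
        double currentRatio = slowCount * 1.0d / totalCount;
        if (currentRatio > maxSlowRequestRatio) {
            return true;
        }
        // Special case: with a threshold of 1.0, open when every request was slow.
        return Double.compare(currentRatio, maxSlowRequestRatio) == 0
                && Double.compare(maxSlowRequestRatio, 1.0d) == 0;
    }
}
```

With minRequestAmount = 5, ten slow calls out of ten trip a rule whose threshold is 1.0, while five slow calls out of ten do not trip a rule whose threshold is 0.5, because equality only counts at the maximum value.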

6. Rule parameters

For these parameters, here is the table from the official documentation.

Field | Description | Default
resource | Resource name, i.e. the object the rule applies to |
grade | Circuit breaking strategy: slow call ratio, exception ratio or exception count | slow call ratio
count | In slow call ratio mode, the RT threshold for a slow call (a call above it counts as slow); in exception ratio or exception count mode, the corresponding threshold |
timeWindow | Circuit breaking duration, in seconds |
minRequestAmount | Minimum number of requests needed to trigger circuit breaking; below this value the breaker does not open even if the exception ratio exceeds the threshold (introduced in 1.7.0) | 5
statIntervalMs | Statistics interval in ms, e.g. 60*1000 for one minute (introduced in 1.8.0) | 1000 ms
slowRatioThreshold | Slow call ratio threshold, only effective in slow call ratio mode (introduced in 1.8.0) |



Origin juejin.im/post/7150475442263162893