Problems encountered when using Hystrix in the Zuul gateway

Zuul's default isolation strategy is semaphore isolation, and the default maximum number of semaphores is 100

For the difference between semaphore isolation and thread isolation, see: https://blog.csdn.net/liaojiamin0102/article/details/94394956
The default is set in the source as follows:

// The ZuulProperties class holds the Hystrix-related configuration
private HystrixSemaphore semaphore = new HystrixSemaphore();
@Data
@AllArgsConstructor
@NoArgsConstructor
public static class HystrixSemaphore {
	/**
	* The maximum number of total semaphores for Hystrix.
	*/
	private int maxSemaphores = 100;	
}

In Zuul, isolation is per service: each service gets one semaphore, and there is no per-endpoint isolation

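The first thing to watch is the size of Zuul's own thread pool, the size of the backend services' thread pools, and the size of the isolation semaphore or isolation thread pool, so that no single pool gets used up.
By default, all services share one thread pool.
To give each service its own thread pool, configure the following: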

zuul.threadPool.useSeparateThreadPools=true

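However, if the following prefix is also configured, the thread pool key must include the prefix when you configure the pool size, otherwise the size setting is not picked up.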

zuul.threadPool.threadPoolKeyPrefix:zhenai
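To make the prefix rule concrete, here is a minimal Java sketch of how the pool key and the matching Hystrix property name would be derived. It assumes the key is simply the prefix followed directly by the service id, which is how I read the setter in Spring Cloud Netflix. The service id user-service and the helper names are hypothetical.

```java
// Sketch only: mirrors the naming rule described above, not Zuul's actual code.
class ThreadPoolKeySketch {

    /** Assumed rule: thread pool key = configured prefix + service id. */
    static String threadPoolKey(String prefix, String serviceId) {
        return prefix + serviceId;
    }

    /** Name of the Hystrix property that sizes the pool with that key. */
    static String coreSizeProperty(String prefix, String serviceId) {
        return "hystrix.threadpool." + threadPoolKey(prefix, serviceId) + ".coreSize";
    }

    public static void main(String[] args) {
        // "user-service" is a hypothetical service id; "zhenai" is the prefix from the text.
        System.out.println(coreSizeProperty("zhenai", "user-service"));
        // With an empty prefix, the service id alone is the key.
        System.out.println(coreSizeProperty("", "user-service"));
    }
}
```

Under this assumption, sizing the pool for user-service means setting hystrix.threadpool.zhenaiuser-service.coreSize rather than hystrix.threadpool.user-service.coreSize.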

Zuul re-wraps some Hystrix configuration names, so the corresponding native Hystrix settings stop working

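Zuul sets the Hystrix parameters through a Setter built from ZuulProperties, so the following have to be configured through Zuul's own properties:
- Isolation strategy: zuul.ribbonIsolationStrategy=SEMAPHORE
- Default semaphore size: zuul.semaphore.maxSemaphores=20
- Semaphore size for a specific service: zuul.eureka.<serviceId>.semaphore.maxSemaphores=20
The native hystrix.command.default.execution.isolation.strategy and hystrix.command.default.execution.isolation.semaphore.maxConcurrentRequests settings have no effect, because the Zuul properties above override them.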
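The lookup order can be sketched in plain Java. This is only a model under the assumption that a per-service value, when present, wins over the global zuul.semaphore.maxSemaphores value, which in turn defaults to 100. The class name and the service ids order-service and user-service are hypothetical.

```java
import java.util.Properties;

// Sketch only: models the lookup order described above, not Zuul's actual code.
class SemaphoreSizeSketch {

    static int maxSemaphores(Properties props, String serviceId) {
        // Global default from ZuulProperties (100 unless overridden).
        String global = props.getProperty("zuul.semaphore.maxSemaphores", "100");
        // A per-service value, when present, wins over the global one.
        String value = props.getProperty(
                "zuul.eureka." + serviceId + ".semaphore.maxSemaphores", global);
        return Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("zuul.semaphore.maxSemaphores", "20");
        props.setProperty("zuul.eureka.order-service.semaphore.maxSemaphores", "5");
        // The native Hystrix key is set here but never read by the lookup.
        props.setProperty(
                "hystrix.command.default.execution.isolation.semaphore.maxConcurrentRequests",
                "500");

        System.out.println(maxSemaphores(props, "order-service")); // 5
        System.out.println(maxSemaphores(props, "user-service"));  // 20
    }
}
```

The native maxConcurrentRequests entry is deliberately present and ignored, which is the behavior to expect when requests go through Zuul.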

With semaphore isolation, the Hystrix timeout does not cut the call short

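When we use thread pool isolation, there is an extra thread pool layer, implemented with RxJava, so the Hystrix timeout is supported directly. With semaphore isolation the Hystrix timeout has no effect on the call itself, but the timeout of Ribbon or of the socket still applies, and the semaphore is released once that timeout fires.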

Even so, with semaphore isolation you still have to pay attention to the Hystrix timeout, because it determines when the semaphore is released

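First, look at how the Hystrix semaphore is implemented:
The semaphores are kept in AbstractCommand, in a ConcurrentHashMap whose key is the CommandKey and whose value is a TryableSemaphore (TryableSemaphoreActual), the counter implementation. The counter is a Java AtomicInteger: tryAcquire does an atomic incrementAndGet, and if the result is greater than the configured maxConcurrentRequests the acquire fails and the request is rejected (it goes to the fallback) instead of blocking.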
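A minimal sketch of such a counter-based semaphore, written with java.util.concurrent only. It follows the idea of TryableSemaphoreActual but is a simplification, and the class name CountingTryableSemaphore is hypothetical. Note that tryAcquire returns false right away instead of waiting, which is why a rejected request ends up in the fallback branch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: a simplified counter-based semaphore, not the Hystrix class itself.
class CountingTryableSemaphore {
    private final int maxConcurrentRequests;
    private final AtomicInteger count = new AtomicInteger(0);

    CountingTryableSemaphore(int maxConcurrentRequests) {
        this.maxConcurrentRequests = maxConcurrentRequests;
    }

    /** Never blocks: returns false immediately when the limit is already reached. */
    boolean tryAcquire() {
        if (count.incrementAndGet() > maxConcurrentRequests) {
            count.decrementAndGet(); // undo the increment, this caller was rejected
            return false;
        }
        return true;
    }

    /** Gives one permit back. */
    void release() {
        count.decrementAndGet();
    }

    int permitsUsed() {
        return count.get();
    }

    public static void main(String[] args) {
        CountingTryableSemaphore semaphore = new CountingTryableSemaphore(2);
        System.out.println(semaphore.tryAcquire()); // true
        System.out.println(semaphore.tryAcquire()); // true
        System.out.println(semaphore.tryAcquire()); // false
        semaphore.release();
        System.out.println(semaphore.tryAcquire()); // true
    }
}
```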

// In the AbstractCommand class
 /* each circuit has a semaphore to restrict concurrent fallback execution */
    protected static final ConcurrentHashMap<String, TryableSemaphore> executionSemaphorePerCircuit = new ConcurrentHashMap<String, TryableSemaphore>();

// Also in AbstractCommand
private Observable<R> applyHystrixSemantics(final AbstractCommand<R> _cmd) {
        // mark that we're starting execution on the ExecutionHook
        // if this hook throws an exception, then a fast-fail occurs with no fallback.  No state is left inconsistent
     .....
            if (executionSemaphore.tryAcquire()) {
                try {
                    /* used to track userThreadExecutionTime */
                    executionResult = executionResult.setInvocationStartTime(System.currentTimeMillis());
                    return executeCommandAndObserve(_cmd)
                            .doOnError(markExceptionThrown)
                            .doOnTerminate(singleSemaphoreRelease)
                            .doOnUnsubscribe(singleSemaphoreRelease);
                } catch (RuntimeException e) {
                    return Observable.error(e);
                }
            } else {
                return handleSemaphoreRejectionViaFallback();
            }
        } else {
            return handleShortCircuitViaFallback();
        }
    }

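Look at the doOnTerminate(singleSemaphoreRelease) line in applyHystrixSemantics. Suppose you configure a 1 s timeout, for example:
- hystrix.command.default.execution.timeout.enabled=true
- hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=1000
With this configuration a semaphore is held for at most 1 s: once 1 s has passed, Hystrix releases the semaphore whether or not the socket has timed out.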
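The consequence is easier to see with numbers. The following Java sketch is a small logical-time model, not Hystrix code: it assumes a permit is given back at the Hystrix timeout or when the backend call ends, whichever comes first, while the backend call keeps running until it really finishes. All names and the figures in main are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a logical-time model of permits versus real in-flight backend calls.
class SemaphoreTimeoutModel {

    /** Peak number of backend calls running at once, with one arrival every arrivalGapMs. */
    static int peakInFlight(int permits, long hystrixTimeoutMs, long backendDurationMs,
                            long arrivalGapMs, int requests) {
        List<Long> permitReleaseTimes = new ArrayList<>(); // when each held permit is given back
        List<Long> backendEndTimes = new ArrayList<>();    // when each backend call really ends
        int peak = 0;
        for (int i = 0; i < requests; i++) {
            final long now = i * arrivalGapMs;
            permitReleaseTimes.removeIf(t -> t <= now);
            backendEndTimes.removeIf(t -> t <= now);
            if (permitReleaseTimes.size() < permits) {
                // Accepted: the permit comes back at the timeout or at completion, whichever is first.
                permitReleaseTimes.add(now + Math.min(hystrixTimeoutMs, backendDurationMs));
                backendEndTimes.add(now + backendDurationMs);
                peak = Math.max(peak, backendEndTimes.size());
            }
            // Otherwise the request is rejected and never reaches the backend.
        }
        return peak;
    }

    public static void main(String[] args) {
        // 2 permits, backend takes 5 s, one request every 500 ms, 20 requests.
        System.out.println(peakInFlight(2, 1000, 5000, 500, 20)); // 10
        System.out.println(peakInFlight(2, 5000, 5000, 500, 20)); // 2
    }
}
```

With a 1 s Hystrix timeout and a backend that takes 5 s, two permits let ten calls pile up in this model, so the Hystrix timeout should not be much shorter than the Ribbon or socket timeout if the semaphore is meant to cap real concurrency.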

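In Zuul, with thread pool isolation the backend call runs asynchronously on the isolation pool, but the Zuul thread still blocks waiting for it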

The source code shows this:

[Figure: hystrix3, source code screenshot]

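What gets called is the Hystrix command's execute method, which the official Hystrix documentation describes as follows:

execute(): blocks, then returns the single response received from the dependency (or throws an exception in case of an error)
execute is a blocking method. In other words, if the thread pool size and the timeout are not set sensibly, Zuul's own threads can still be used up, and the protection of the backend services is lost.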
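The blocking behavior can be sketched with java.util.concurrent alone. This is not Hystrix code: a plain ExecutorService stands in for the isolation thread pool, and the execute helper below imitates submitting the work and then waiting on the Future. The class name is hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: plain java.util.concurrent standing in for Hystrix thread isolation.
class BlockingExecuteSketch {

    /** Runs the call on the isolation pool, but the caller waits for the result. */
    static String execute(ExecutorService isolationPool, Callable<String> backendCall) {
        Future<String> future = isolationPool.submit(backendCall);
        try {
            return future.get(); // the calling thread is parked here until the call completes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService isolationPool = Executors.newFixedThreadPool(1);
        try {
            String caller = Thread.currentThread().getName();
            String worker = execute(isolationPool, () -> Thread.currentThread().getName());
            // The work ran on a pool thread, yet the caller could not continue until it finished.
            System.out.println("caller=" + caller + ", worker=" + worker);
        } finally {
            isolationPool.shutdown();
        }
    }
}
```

Each in-flight request therefore occupies two threads, one in the isolation pool and one Zuul request thread, so the Zuul pool has to be sized with the isolation pools and timeouts in mind.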

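As I understand it, most of Zuul's complexity comes from its integration with Hystrix and Ribbon, which makes timeouts, threads, and isolation complicated to configure. The documentation does not explain this clearly, and in many places you have to read the source code to find the reason.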
