Study Notes: The Hystrix Circuit Breaker

Hystrix Principles

Background
In a distributed system, dependencies between services are very common; a single business call usually relies on several underlying services. As shown in the figure, with synchronous calls, when the inventory service is unavailable the request threads of the merchandise service block. If a large number of requests are calling the inventory service, the merchandise service may eventually exhaust all of its resources and be unable to serve anything. This unavailability can then propagate upward along the call chain, a phenomenon known as the avalanche effect.

Common causes of the avalanche effect
Hardware failure: a server goes down, a data center loses power, a fiber line is cut, and so on.
Traffic surge: abnormal traffic, or retries that multiply the load.
Cache penetration: typically occurs when an application restarts and every cache entry is empty, or when a large number of cache entries expire within a short time. The flood of cache misses sends requests straight to the back-end service, overloading the provider and making the service unavailable.
Program bugs: logic that leaks memory, long JVM Full GC pauses, and so on.
Synchronous waiting: services call each other synchronously, and the waiting exhausts resources.

Strategies for coping with the avalanche effect
Different causes of the avalanche effect call for different strategies; no single strategy fits every scenario. For reference:

Hardware failure: multi-data-center disaster recovery, geographically distributed active-active deployment, and so on.
Traffic surge: automatic scaling of services, flow control (rate limiting, disabling retries), and so on.
Cache penetration: cache preloading, asynchronous cache loading, and so on.
Program bugs: fix the bug and release resources promptly.
Synchronous waiting: resource isolation, decoupling through a message queue (MQ), failing fast when a called service is unavailable, and so on. Resource isolation generally means using a separate thread pool for each kind of service call; failing fast on an unavailable service is usually implemented by combining the circuit breaker pattern with a timeout mechanism.

In summary, an application that cannot isolate itself from failures in its dependencies is itself at risk of collapse. To build stable, reliable distributed systems, a service therefore needs the ability to protect itself: when a service it depends on becomes unavailable, it should switch into self-protection so that no avalanche occurs. This article focuses on using Hystrix to solve the avalanche problem caused by synchronous waiting.
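To make "resource isolation plus timeout" concrete, here is a minimal sketch that uses only the JDK, not Hystrix. The class name FailFastCall, the pool name, and the pool size are assumptions made for illustration: the dependency call runs on its own small pool, and the caller gives up after a timeout and returns a fallback value instead of blocking indefinitely.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FailFastCall {
    // Dedicated pool for one dependency (hypothetical "inventory" service).
    // Daemon threads let this demo JVM exit even if a worker is still busy.
    private static final ExecutorService INVENTORY_POOL =
            Executors.newFixedThreadPool(2, r -> {
                Thread t = new Thread(r, "inventory-pool");
                t.setDaemon(true);
                return t;
            });

    /** Runs the call on the dedicated pool; returns the fallback on timeout or error. */
    public static <T> T callWithTimeout(Callable<T> remoteCall, long timeoutMs, T fallback) {
        Future<T> future = INVENTORY_POOL.submit(remoteCall);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not stay blocked
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the call itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    public static void main(String[] args) {
        Integer fast = callWithTimeout(() -> 42, 1000, -1);
        Integer slow = callWithTimeout(() -> { Thread.sleep(2000); return 42; }, 100, -1);
        // fast holds the real result; slow holds the fallback because the call timed out
        System.out.println(fast + " " + slow);
    }
}
```

The calling thread waits at most timeoutMs, so a slow dependency can tie up only the threads of its own pool.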

A first look at Hystrix
Hystrix [hɪst'rɪks] means porcupine, an animal whose back is covered with quills that give it the ability to protect itself. The Hystrix discussed here is a fault-tolerance framework open-sourced by Netflix, and it has the same ability to protect itself. To see how it achieves fault tolerance and self-protection, let's look at how Hystrix is designed and implemented.

Hystrix design goals:

Provide protection from, and control over, latency and failure in dependencies, which are usually accessed over the network
Stop cascading failures
Fail fast and recover rapidly
Fall back and degrade gracefully
Provide near real-time monitoring and alerting

Design principles Hystrix follows:

Prevent any single dependency from exhausting resources (threads)
Shed load and fail fast instead of queuing
Provide fallbacks wherever feasible to protect users from failure
Use isolation techniques (such as the bulkhead, swimlane, and circuit breaker patterns) to limit the impact of any one dependency
Ensure faults are discovered promptly through near real-time metrics, monitoring, and alerting
Ensure faults are recovered from promptly by allowing configuration properties to be changed dynamically
Protect against failures in the entire dependency client execution, not just in network traffic

How does Hystrix achieve these design goals?

Using the command pattern, every call to an external service (or dependency) is wrapped in a HystrixCommand or HystrixObservableCommand object, and the object is executed in a separate thread;
Each dependency maintains its own thread pool (or semaphore); when the pool is exhausted, requests are rejected rather than queued.
Successes, failures, timeouts, and thread rejections are recorded.
When a service's error percentage exceeds a threshold, the circuit breaker opens automatically and stops all requests to that service for a period of time.
Fallback logic runs when a request fails, is rejected, times out, or is short-circuited.
Metrics are monitored and configuration can be changed in near real time.

Getting started with Hystrix
A simple Hystrix example
Before digging into how Hystrix works, let's look at a simple example.

Step one: extend HystrixCommand to implement your own command. Configure the parameters the request needs in the command's constructor, and put the code that actually sends the request in run(). The code is as follows:

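import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolProperties;
// Logger and LoggerFactory come from the project's logging library (import not shown in the original)

public class QueryOrderIdCommand extends HystrixCommand<Integer> {
    private final static Logger logger = LoggerFactory.getLogger(QueryOrderIdCommand.class);
    private final OrderServiceProvider orderServiceProvider;

    public QueryOrderIdCommand(OrderServiceProvider orderServiceProvider) {
        super(Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("orderService"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("queryByOrderId"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // the breaker evaluates the error rate only after at least 10 requests
                        .withCircuitBreakerRequestVolumeThreshold(10)
                        // 5 seconds after tripping, the breaker becomes half-open and lets some traffic retry
                        .withCircuitBreakerSleepWindowInMilliseconds(5000)
                        // open the breaker once the error rate reaches 50%
                        .withCircuitBreakerErrorThresholdPercentage(50)
                        .withExecutionTimeoutEnabled(true))
                .andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties
                        .Setter().withCoreSize(10)));
        this.orderServiceProvider = orderServiceProvider;
    }

    // The actual remote call; executed on a thread from the command's thread pool
    @Override
    protected Integer run() {
        return orderServiceProvider.queryByOrderId();
    }

    // Returned when run() fails, times out, is rejected, or is short-circuited
    @Override
    protected Integer getFallback() {
        return -1;
    }

}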

Step two: call the HystrixCommand's execute() method to issue the actual request.

@Test
public void testQueryByOrderIdCommand() {
    Integer r = new QueryOrderIdCommand(orderServiceProvider).execute();
    logger.info("Result: {}", r);
}

Hystrix processing flow
The Hystrix flowchart is as follows:

Image source: the official Hystrix wiki, https://github.com/Netflix/Hystrix/wiki

The complete Hystrix workflow is as follows:

1. Construct a HystrixCommand or HystrixObservableCommand object to wrap the request, configuring the parameters the request needs in the constructor;
2. Execute the command. Hystrix provides four methods for executing a command, described in detail later;
3. Check whether the response can be served from the cache. If caching is enabled and a cached response is available, it is returned directly. Hystrix supports request caching, but the user has to enable it explicitly;
4. Check whether the circuit breaker is open; if it is, jump to step 8;
5. Check whether the thread pool, queue, or semaphore is full; if it is, jump to step 8;
6. Execute HystrixObservableCommand.construct() or HystrixCommand.run(). If execution fails or times out, jump to step 8; otherwise, jump to step 9;
7. Update the circuit breaker's monitoring metrics;
8. Run the fallback logic;
9. Return the response.

As the flowchart shows, when the thread pool, queue, or semaphore is full in step 5, the logic of step 7 still runs and updates the circuit breaker's statistics; likewise, step 6 updates the statistics whether it succeeds or not.
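The order of the checks in this workflow can be sketched in plain Java. This is a simplified illustration under stated assumptions, not Hystrix's implementation: the class CommandFlowSketch, its fields, and the single shared cache map are all hypothetical, and the circuit breaker is reduced to a flag that a caller sets by hand.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class CommandFlowSketch {
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();
    private final Semaphore capacity;             // stands in for pool/queue/semaphore capacity
    public volatile boolean circuitOpen = false;  // stands in for the circuit breaker state
    public final AtomicInteger successes = new AtomicInteger();
    public final AtomicInteger failures = new AtomicInteger();
    public final AtomicInteger rejections = new AtomicInteger();

    public CommandFlowSketch(int maxConcurrent) {
        this.capacity = new Semaphore(maxConcurrent);
    }

    public Integer execute(String cacheKey, Callable<Integer> run, Integer fallback) {
        Integer cached = cache.get(cacheKey);     // step 3: cached response available?
        if (cached != null) return cached;
        if (circuitOpen) return fallback;         // step 4: breaker open, go to fallback
        if (!capacity.tryAcquire()) {             // step 5: no capacity left
            rejections.incrementAndGet();         // step 7: record the rejection
            return fallback;                      // step 8
        }
        try {
            Integer result = run.call();          // step 6: execute the real call
            successes.incrementAndGet();          // step 7: record the success
            if (result != null) cache.put(cacheKey, result);
            return result;                        // step 9
        } catch (Exception e) {
            failures.incrementAndGet();           // step 7: record the failure
            return fallback;                      // step 8
        } finally {
            capacity.release();
        }
    }
}
```

Every early exit ends in the fallback value, which is the property the workflow is designed around: the caller always gets a response quickly.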

Execution methods
Hystrix provides four methods for executing a command. execute() and queue() apply to HystrixCommand objects, while observe() and toObservable() apply to HystrixObservableCommand objects.

execute()
Executes run() synchronously, blocking the caller; only a single value can be returned. Hystrix takes a thread from the thread pool to execute run() and waits for the return value.

queue()
Executes run() asynchronously, without blocking; only a single value can be returned. Calling queue() returns a Future immediately. The result of run() is obtained with Future.get(), which does block. On success, Future.get() returns the single value. On failure, if no fallback has been overridden, Future.get() throws an exception.

observe()
Executes run()/construct() before any subscriber is registered; multiple values can be emitted, depending on the source. Calling observe() returns a hot Observable: the call itself triggers run()/construct(), whether or not a subscriber exists.

If the command extends HystrixCommand, Hystrix takes a thread from the thread pool to execute run() without blocking the caller; if it extends HystrixObservableCommand, construct() is executed on the calling thread, which blocks.

Using observe():

Call observe(), which returns an Observable;
Call subscribe() on that Observable to register for events and obtain the results.

toObservable()
Executes run()/construct() only after a subscriber is registered; multiple values can be emitted, depending on the source. Calling toObservable() returns a cold Observable: the call does not trigger run()/construct() by itself, and execution starts only when a subscriber subscribes to the Observable.

If the command extends HystrixCommand, Hystrix takes a thread from the thread pool to execute run() without blocking, and the calling thread does not have to wait for run(); if it extends HystrixObservableCommand, construct() is executed on the calling thread, which waits for construct() to finish before continuing.

Using toObservable():

Call toObservable(), which returns an Observable;
Call subscribe() on that Observable to register for events and obtain the results.

Note that HystrixCommand also supports toObservable() and observe(), but even when a HystrixCommand is converted to an Observable it can emit only a single value. Only HystrixObservableCommand supports emitting multiple values.

How the methods relate to each other

execute() actually calls queue().get()
queue() actually calls toObservable().toBlocking().toFuture()
observe() actually calls toObservable() to obtain a cold Observable, then creates a ReplaySubject that subscribes to it, turning the source into a hot Observable. That is why calling observe() automatically triggers run()/construct().
Hystrix always returns the response as an Observable; the different execution methods merely convert it accordingly.
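The hot/cold distinction and the way the four methods build on one another can be imitated with JDK types alone. The sketch below is an analogy, not RxJava or Hystrix code: the class ExecutionStyles is hypothetical, a Supplier of CompletableFuture plays the role of the cold Observable, and an already started CompletableFuture plays the role of the hot one (like a ReplaySubject, it keeps the result for late readers).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ExecutionStyles {
    public final AtomicInteger runCount = new AtomicInteger(); // how often the body has run
    private final Supplier<Integer> run;

    public ExecutionStyles(Supplier<Integer> body) {
        this.run = () -> { runCount.incrementAndGet(); return body.get(); };
    }

    // Analogue of toObservable(): cold. Creating it runs nothing;
    // the work starts only when get() is called on the returned supplier.
    public Supplier<CompletableFuture<Integer>> cold() {
        return () -> CompletableFuture.supplyAsync(run);
    }

    // Analogue of observe(): hot. It "subscribes" to the cold source immediately,
    // so the work starts even if nobody ever reads the result.
    public CompletableFuture<Integer> hot() {
        return cold().get();
    }

    // Analogue of queue(): start the cold source and hand back a Future.
    public Future<Integer> queue() {
        return cold().get();
    }

    // Analogue of execute(): queue().get(), blocking until the value arrives.
    public Integer execute() {
        try {
            return queue().get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Calling cold() alone leaves runCount at zero, while hot() raises it to one without any reader, which mirrors the difference between toObservable() and observe().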

Hystrix fault tolerance
Hystrix helps control the interactions between distributed services by adding latency-tolerance and fault-tolerance logic. It does this by isolating the points of access between services, stopping cascading failures across them, and providing fallback options, all of which improve the overall resilience of the system. Hystrix mainly provides the following fault-tolerance mechanisms:

Resource isolation
Circuit breaking
Fallback (degradation)

Each of these is described in detail below.

Resource isolation
Resource isolation here mainly means thread isolation. Hystrix provides two thread isolation modes: thread pools and semaphores.

Thread isolation: thread pool
Hystrix uses the command pattern to decouple the object that sends a request from the object that executes it, wrapping each type of business request in a corresponding command. For example, when the order service queries goods, the goods query request becomes a goods Command; when the goods service checks inventory, the inventory query request becomes an inventory Command. Each type of Command is configured with its own thread pool. When a Command is created for the first time, a thread pool is created from its configuration and placed in a ConcurrentHashMap, for example for the goods Command:

final static ConcurrentHashMap<String, HystrixThreadPool> threadPools = new ConcurrentHashMap<String, HystrixThreadPool>();
...
if (!threadPools.containsKey(key)) {
    threadPools.put(key, new HystrixThreadPoolDefault(threadPoolKey, propertiesBuilder));
}

Goods query Commands created later reuse the thread pool that already exists. After thread pool isolation, the service dependencies look like this:

Separating the thread that sends a request from the thread that executes it effectively prevents cascading failures. When a thread pool or its request queue is saturated, Hystrix rejects further requests so that the requesting thread fails fast, which keeps the failure from spreading.
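The same idea can be tried out with the JDK's ThreadPoolExecutor. The sketch below is an illustration, not Hystrix code: the class PoolPerDependency, the keys "inventory" and "goods", and the pool size of 1 are assumptions. Each key gets its own pool with no queue, so a saturated pool rejects new work immediately while other pools are unaffected.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolPerDependency {
    private static final ConcurrentHashMap<String, ThreadPoolExecutor> POOLS =
            new ConcurrentHashMap<>();

    // One pool per key, created on first use and reused afterwards.
    public static ThreadPoolExecutor poolFor(String key, int size) {
        return POOLS.computeIfAbsent(key, k -> new ThreadPoolExecutor(
                size, size, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(),               // no queue: hand off or reject
                r -> {
                    Thread t = new Thread(r, "pool-" + k);
                    t.setDaemon(true);                  // let the demo JVM exit
                    return t;
                },
                new ThreadPoolExecutor.AbortPolicy())); // reject when all threads are busy
    }

    public static <T> T call(String key, int size, Callable<T> task, T fallback) {
        Future<T> future;
        try {
            future = poolFor(key, size).submit(task);
        } catch (RejectedExecutionException e) {
            return fallback;                            // pool saturated: fail fast
        }
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        }
    }

    // Occupies the only "inventory" thread, then calls both dependencies.
    public static int[] demo() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        poolFor("inventory", 1).submit(() -> { started.countDown(); release.await(); return 0; });
        try {
            started.await();                            // wait until the worker is busy
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int rejected = call("inventory", 1, () -> 1, -1); // pool is full, fallback expected
        int other = call("goods", 1, () -> 2, -1);        // separate pool, unaffected
        release.countDown();
        return new int[] { rejected, other };
    }
}
```

In demo(), the blocked inventory worker causes only inventory calls to be rejected; the goods call still succeeds because it runs on a different pool.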


Advantages and disadvantages of thread pool isolation
Advantages:

The application is protected from failures in its dependencies: the thread pool for a given dependency can fill up without affecting the rest of the application.
When a new client library is introduced, any problem is confined to that library and does not affect anything else.
When a failed dependency recovers, the application returns to normal performance immediately.
When the application is misconfigured, the thread pool's health quickly reveals it (through increased errors, latency, timeouts, rejections, and so on), and the misconfiguration can be corrected in real time through dynamic properties.
If a service's performance characteristics change and need real-time tuning, such as raising or lowering the timeout or changing the number of retries, this can be done through the thread pool's metrics and dynamic properties without affecting other calls.
Beyond isolation, the dedicated thread pools provide built-in concurrency, which makes it possible to build asynchronous facades on top of synchronous calls and supports asynchronous programming (Hystrix builds on the RxJava asynchronous framework).

Note: although thread pools provide thread isolation, the underlying client code must still set timeouts and/or respond to thread interruption, so that it cannot block indefinitely and keep the thread pool saturated.

Disadvantages:

The main disadvantage of thread pools is added computational overhead. Every command executes on a separate thread, which adds queuing, scheduling, and context-switching costs. Using Hystrix therefore means accepting this cost in exchange for the benefits it provides.

In most cases the overhead of a thread pool is small enough to have no significant cost or performance impact. But for dependencies with very low latency, such as a service backed only by an in-memory cache, the overhead is more noticeable and thread pool isolation is not a good fit; a lighter-weight approach such as semaphore isolation should be considered.

Thread isolation: semaphore
As mentioned above, when a dependency has extremely low latency, the overhead of thread pool isolation outweighs its benefits. In that case semaphore isolation can be used instead: the number of permits caps the concurrency of calls to any given dependency. The following diagram shows the main difference between thread pool isolation and semaphore isolation:

Image source: the official Hystrix wiki, https://github.com/Netflix/Hystrix/wiki

With a thread pool, the thread that sends the request and the thread that executes the dependency call are different threads; with a semaphore, they are the same thread, namely the one that initiated the request. Here is an example of a command that uses semaphore isolation:

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;
import com.netflix.hystrix.HystrixCommandProperties;

public class QueryByOrderIdCommandSemaphore extends HystrixCommand<Integer> {
    private final static Logger logger = LoggerFactory.getLogger(QueryByOrderIdCommandSemaphore.class);
    private final OrderServiceProvider orderServiceProvider;

    public QueryByOrderIdCommandSemaphore(OrderServiceProvider orderServiceProvider) {
        super(Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("orderService"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("queryByOrderId"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        // the breaker evaluates the error rate only after at least 10 requests
                        .withCircuitBreakerRequestVolumeThreshold(10)
                        // 5 seconds after tripping, the breaker becomes half-open and lets some traffic retry
                        .withCircuitBreakerSleepWindowInMilliseconds(5000)
                        // open the breaker once the error rate reaches 50%
                        .withCircuitBreakerErrorThresholdPercentage(50)
                        // run on the calling thread, guarded by a semaphore
                        .withExecutionIsolationStrategy(HystrixCommandProperties.ExecutionIsolationStrategy.SEMAPHORE)
                        // maximum number of concurrent requests
                        .withExecutionIsolationSemaphoreMaxConcurrentRequests(10)));
        this.orderServiceProvider = orderServiceProvider;
    }

    @Override
    protected Integer run() {
        return orderServiceProvider.queryByOrderId();
    }

    @Override
    protected Integer getFallback() {
        return -1;
    }

}

Because Hystrix uses thread pool isolation by default, semaphore isolation has to be requested explicitly by setting execution.isolation.strategy to ExecutionIsolationStrategy.SEMAPHORE; the number of permits is also configurable and defaults to 10. Before a client can call the dependency it must first acquire a permit. Because permits are limited, once the number of concurrent requests exceeds the number of permits, further requests are rejected immediately and go to the fallback logic.

Semaphore isolation works mainly by capping the number of concurrent requests, which keeps request threads from blocking on a large scale and thereby achieves rate limiting and avalanche prevention.
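The mechanism is small enough to sketch with java.util.concurrent.Semaphore. This is an illustration of the technique, not Hystrix's own code; the class name SemaphoreIsolation and its call method are hypothetical. The dependency runs on the calling thread, and a request that cannot obtain a permit goes straight to the fallback.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class SemaphoreIsolation {
    private final Semaphore permits;

    public SemaphoreIsolation(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    public <T> T call(Supplier<T> dependency, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get();       // over the limit: reject without waiting
        }
        try {
            return dependency.get();     // runs on the caller's own thread
        } catch (RuntimeException e) {
            return fallback.get();       // the call failed
        } finally {
            permits.release();           // always hand the permit back
        }
    }
}
```

Because tryAcquire() never blocks, an overloaded dependency costs the extra callers only the time it takes to run the fallback. The caller that already holds a permit, however, still waits for the dependency itself.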

Thread isolation summary
Both thread pools and semaphores provide thread isolation, but each has its own strengths, weaknesses, and suitable scenarios. A comparison:

Isolation	Thread switch	Async support	Timeout support	Circuit breaking	Rate limiting	Overhead
Semaphore	No	No	No	Yes	Yes	Small
Thread pool	Yes	Yes	Yes	Yes	Yes	Large

Both thread pools and semaphores support circuit breaking and rate limiting. Compared with a thread pool, a semaphore needs no thread switch, which avoids unnecessary overhead. However, semaphores do not support asynchronous execution or timeouts: when the requested service is unavailable, requests beyond the permit limit return immediately, but a thread that already holds a permit can only wait until the service responds or its own call times out, so long waits are possible. In thread pool mode, when the service does not respond within the configured time, Hystrix interrupts the worker thread and the call returns immediately.

Circuit breaking
Introduction to the circuit breaker
In everyday life you may have noticed that household circuits usually have a fuse box: when the circuit is overloaded, the fuse blows automatically to protect the wiring and the appliances on it. This is a familiar example of a circuit breaker. The circuit breaker in Hystrix plays a similar role. While running, Hystrix reports the success, failure, timeout, and rejection of each call to the circuit breaker for the corresponding commandKey. The breaker maintains these statistics and uses them to decide whether to open. Once it is open, subsequent requests are short-circuited and return immediately. After an interval (5s by default) the breaker tries the half-open state and lets some requests through, which amounts to a health check on the dependency; if the request succeeds, the breaker closes.

Circuit breaker configuration
The circuit breaker has the following six parameters:

1. circuitBreaker.enabled

Whether the circuit breaker is enabled. The default is true.

2. circuitBreaker.forceOpen

Forces the breaker open and keeps it open, regardless of its actual state. The default is false.

3. circuitBreaker.forceClosed

Forces the breaker closed and keeps it closed, regardless of its actual state. The default is false.

4. circuitBreaker.errorThresholdPercentage

The error rate threshold; the default is 50%. For example, if 100 requests arrive within the statistical window (10s) and 54 of them time out or throw exceptions, the error rate for that window is 54%, which exceeds the default of 50%, so the breaker opens.

5. circuitBreaker.requestVolumeThreshold

The default is 20. errorThresholdPercentage is evaluated only when at least 20 requests have arrived within the window. For example, if only 19 requests arrive in the window and all of them fail, the error rate is 100% but the breaker does not open, because the total number of requests is below 20.

6. circuitBreaker.sleepWindowInMilliseconds

How long the breaker stays open before a half-open probe is attempted; the default is 5000ms. For example, 5000ms after the breaker opens, it lets some trial traffic through to test whether the dependency has recovered.
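The interplay of requestVolumeThreshold and errorThresholdPercentage comes down to a few lines of arithmetic. The helper below is a hypothetical sketch written for this note (TripDecision and shouldOpen are not Hystrix names); it treats an error rate equal to the threshold as enough to open the breaker, which is an assumption about the boundary case.

```java
public class TripDecision {
    /**
     * Decides whether the breaker should open for one statistical window.
     * Returns false when too few requests were seen to judge the error rate.
     */
    public static boolean shouldOpen(int totalRequests, int failedRequests,
                                     int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false; // not enough traffic: the error rate is not evaluated
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // 100 requests with 54 failures: 54% against a 50% threshold
        System.out.println(shouldOpen(100, 54, 20, 50)); // true
        // 19 requests, all failed: below the volume threshold of 20
        System.out.println(shouldOpen(19, 19, 20, 50));  // false
    }
}
```

The two calls in main reproduce the examples given for parameters 4 and 5: high error rate with enough traffic opens the breaker, while a 100% error rate on only 19 requests does not.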

How the circuit breaker works
The following figure shows how HystrixCircuitBreaker works:

Image source: the official Hystrix wiki, https://github.com/Netflix/Hystrix/wiki

The detailed workflow of the circuit breaker is as follows:

Step one: call allowRequest() to decide whether the request may be submitted to the thread pool

If the breaker is forced open (circuitBreaker.forceOpen is true), the request is not allowed and the call returns.
If the breaker is forced closed (circuitBreaker.forceClosed is true), the request is allowed. The breaker's actual state is ignored: it still maintains its statistics and switch state, but they have no effect.

Step two: call isOpen() to determine whether the breaker is open

If the breaker is open, go to step three; otherwise continue;
If the total number of requests in the window is below circuitBreaker.requestVolumeThreshold, allow the request; otherwise continue;
If the error rate in the window is below circuitBreaker.errorThresholdPercentage, allow the request. Otherwise, open the breaker and go to step three.

Step three: call allowSingleTest() to decide whether a single request may pass through to check whether the dependency has recovered

If the breaker is open and the time since it opened (or since the last trial request was let through) exceeds circuitBreaker.sleepWindowInMilliseconds, the breaker enters the half-open state and allows one trial request through; otherwise the request is not allowed.
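The open, half-open, and closed transitions in these steps can be sketched as a small state holder. This is a simplified illustration modeled on the description above, not the HystrixCircuitBreaker source: the class BreakerSketch, the trip() and markSuccess() methods, and the injectable clock are assumptions made so the behavior can be tested without real waiting.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class BreakerSketch {
    private final long sleepWindowMs;
    private final LongSupplier clock;  // current time in milliseconds (injectable for tests)
    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong openedOrLastTestedAt = new AtomicLong();
    public volatile boolean forceOpen = false;
    public volatile boolean forceClosed = false;

    public BreakerSketch(long sleepWindowMs, LongSupplier clock) {
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
    }

    // Step one: forced states win; otherwise pass when closed or when a probe is due.
    public boolean allowRequest() {
        if (forceOpen) return false;
        if (forceClosed) return true;
        return !open.get() || allowSingleTest();
    }

    // Step three: after the sleep window, let exactly one trial request through.
    private boolean allowSingleTest() {
        long last = openedOrLastTestedAt.get();
        long now = clock.getAsLong();
        // compareAndSet ensures only one caller wins the probe for this window
        return open.get() && now > last + sleepWindowMs
                && openedOrLastTestedAt.compareAndSet(last, now);
    }

    // Called when the statistics say the error threshold was crossed.
    public void trip() {
        if (open.compareAndSet(false, true)) {
            openedOrLastTestedAt.set(clock.getAsLong());
        }
    }

    // Called when a trial request succeeds: close the breaker again.
    public void markSuccess() {
        open.set(false);
    }
}
```

With a 5000ms sleep window, a breaker tripped at time 0 rejects a request at 1000ms, admits one probe at 5001ms, rejects the next request in the same window, and admits everything again once markSuccess() has been called.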
In addition, to provide the data for these decisions, each circuit breaker maintains 10 buckets by default, one per second; when a new bucket is created, the oldest one is discarded. Each bucket keeps counters for successes, failures, timeouts, and rejections, and Hystrix is responsible for collecting and aggregating them.
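A rolling window of buckets can be sketched with a few arrays used as a ring. The class below is an illustration of the idea, not Hystrix's metrics code: RollingCounter and its methods are hypothetical, only successes and failures are tracked, and the clock is injected so the window can be advanced in a test.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

public class RollingCounter {
    private final int bucketCount;
    private final long bucketMs;
    private final LongSupplier clock;   // current time in ms, assumed non-negative
    private final long[] bucketStart;   // start time of the bucket stored in each slot
    private final long[] successes;
    private final long[] failures;

    public RollingCounter(int bucketCount, long bucketMs, LongSupplier clock) {
        this.bucketCount = bucketCount;
        this.bucketMs = bucketMs;
        this.clock = clock;
        this.bucketStart = new long[bucketCount];
        this.successes = new long[bucketCount];
        this.failures = new long[bucketCount];
        Arrays.fill(bucketStart, Long.MIN_VALUE); // no slot holds a live bucket yet
    }

    // Finds the slot for "now"; a slot holding an expired bucket is reset and reused.
    private int currentBucket() {
        long now = clock.getAsLong();
        long start = now - (now % bucketMs);
        int index = (int) ((now / bucketMs) % bucketCount);
        if (bucketStart[index] != start) {
            bucketStart[index] = start;
            successes[index] = 0;
            failures[index] = 0;
        }
        return index;
    }

    public synchronized void recordSuccess() { successes[currentBucket()]++; }

    public synchronized void recordFailure() { failures[currentBucket()]++; }

    public synchronized long totalRequests() { return sum(successes) + sum(failures); }

    public synchronized long failedRequests() { return sum(failures); }

    // Adds up only the buckets that still fall inside the rolling window.
    private long sum(long[] counters) {
        long now = clock.getAsLong();
        long oldestStart = now - (now % bucketMs) - (bucketCount - 1) * bucketMs;
        long total = 0;
        for (int i = 0; i < bucketCount; i++) {
            if (bucketStart[i] >= oldestStart) total += counters[i];
        }
        return total;
    }
}
```

With 10 buckets of 1000ms each, three failures recorded at time 0 still count at 9500ms but drop out of the totals at 10000ms, when their bucket leaves the window.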


Circuit breaker test
1. The command under test is QueryOrderIdCommand

2. Configure orderServiceProvider with a 500ms timeout and no retries

<dubbo:reference id="orderServiceProvider" interface="com.huang.provider.OrderServiceProvider"
timeout="500" retries="0" />

3. The OrderServiceProviderImpl implementation is very simple: for the first 10 requests, the server sleeps for 600ms so that the client call times out.

@Service
public class OrderServiceProviderImpl implements OrderServiceProvider {
private final static Logger logger = LoggerFactory.getLogger(OrderServiceProviderImpl.class);
private AtomicInteger OrderIdCounter = new AtomicInteger(0);

@Override
public Integer queryByOrderId() {
    int c = OrderIdCounter.getAndIncrement();
    if (logger.isDebugEnabled()) {
        logger.debug("OrderIdCounter:{}", c);
    }
    if (c < 10) {
        try {
            Thread.sleep(600);
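        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of silently swallowing the interruption
            Thread.currentThread().interrupt();
        }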
    }
    return c;
}

@Override
public void reset() {
    OrderIdCounter.getAndSet(0);
}

}

4. Unit test code

@Test
public void testExecuteCommand() throws InterruptedException {
    orderServiceProvider.reset();
    int i = 1;
    for (; i < 15; i++) {
        HystrixCommand<Integer> command = new QueryOrderIdCommand(orderServiceProvider);
        Integer r = command.execute();
        String method = r == -1 ? "fallback" : "run";
        logger.info("call {} times,result:{},method:{},isCircuitBreakerOpen:{}", i, r, method, command.isCircuitBreakerOpen());
    }
    // Wait 6s so that the circuit breaker enters the half-open state
    Thread.sleep(6000);
    for (; i < 20; i++) {
        HystrixCommand<Integer> command = new QueryOrderIdCommand(orderServiceProvider);
        Integer r = command.execute();
        String method = r == -1 ? "fallback" : "run";
        logger.info("call {} times,result:{},method:{},isCircuitBreakerOpen:{}", i, r, method, command.isCircuitBreakerOpen());
    }
}

5. Output

2018-02-07 11:38:36,056 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 1 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:36,564 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 2 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:37,074 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 3 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:37,580 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 4 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:38,089 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 5 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:38,599 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 6 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:39,109 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 7 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:39,622 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 8 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:40,138 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 9 times,result:-1,method:fallback,isCircuitBreakerOpen:false
2018-02-07 11:38:40,647 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 10 times,result:-1,method:fallback,isCircuitBreakerOpen:true
2018-02-07 11:38:40,651 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 11 times,result:-1,method:fallback,isCircuitBreakerOpen:true
2018-02-07 11:38:40,653 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 12 times,result:-1,method:fallback,isCircuitBreakerOpen:true
2018-02-07 11:38:40,656 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 13 times,result:-1,method:fallback,isCircuitBreakerOpen:true
2018-02-07 11:38:40,658 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:36 call 14 times,result:-1,method:fallback,isCircuitBreakerOpen:true
2018-02-07 11:38:46,671 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:44 call 15 times,result:10,method:run,isCircuitBreakerOpen:false
2018-02-07 11:38:46,675 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:44 call 16 times,result:11,method:run,isCircuitBreakerOpen:false
2018-02-07 11:38:46,680 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:44 call 17 times,result:12,method:run,isCircuitBreakerOpen:false
2018-02-07 11:38:46,685 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:44 call 18 times,result:13,method:run,isCircuitBreakerOpen:false
2018-02-07 11:38:46,691 INFO [main] com.huang.test.command.QueryByOrderIdCommandTest:testExecuteCommand:44 call 19 times,result:14,method:run,isCircuitBreakerOpen:false

Requests 1-9 time out and go through the fallback logic;

Requests 10-14: the circuit breaker opens, and requests fail fast straight into the fallback logic;

Requests 15-19: the circuit breaker enters the half-open state and lets a trial request through; the call succeeds, the breaker closes, and subsequent requests return to normal.

Fallback (degradation)
Degradation usually means that during traffic peaks, less important business is suspended so that core services keep running, or that when a service is unavailable, backup logic is executed so the call fails fast or returns quickly and the main business is not affected. In Hystrix, fallback mainly provides fault tolerance: it ensures the current service is not affected by failures in the services it depends on, which makes the service more robust. To support fallback, override getFallback() in HystrixCommand or resumeWithFallback() in HystrixObservableCommand.

Hystrix runs the fallback logic in the following cases:

construct() or run() throws an exception
The circuit breaker is open and short-circuits the command
The command's thread pool and queue, or its semaphore, is at capacity and the command is rejected
The command times out

Fallback patterns
Fail Fast
Failing fast is the most common way to execute a command: the command does not override the fallback logic. If any kind of failure occurs during execution, the exception is thrown directly.

Fail Silent
Failing silently means the fallback handles the failure by returning null, an empty Map, an empty List, or a similar empty response.

// In a command that returns a single Integer
@Override
protected Integer getFallback() {
    return null;
}

// In a command that returns a List
@Override
protected List getFallback() {
    return Collections.emptyList();
}

// In a HystrixObservableCommand
@Override
protected Observable resumeWithFallback() {
    return Observable.empty();
}

Fallback: Static
A static fallback returns a fixed default value. Unlike failing silently, this does not cause the feature to be dropped; instead it results in default behavior. For example, if the application runs different logic depending on whether a command returns true or false, then when the command fails the default is true.

@Override
protected Boolean getFallback() {
    return true;
}

@Override
protected Observable resumeWithFallback() {
    return Observable.just(true);
}

Fallback: Stubbed
When the command returns a compound object containing multiple fields, a stubbed fallback is appropriate.

@Override
protected MissionInfo getFallback() {
    return new MissionInfo("missionName", "error");
}

Fallback: Cache via Network
Sometimes, when a call to a dependency fails, a stale version of the data can be read from a cache service (such as Redis). Because this makes another remote call, it is advisable to wrap it in its own Command with a different ThreadPoolKey, so that it is isolated from the main thread pool.

@Override
protected Integer getFallback() {
    return new RedisServiceCommand(redisService).execute();
}

Primary + Secondary with Fallback
Sometimes a system has two behaviors, primary and secondary, or primary and failover. The primary and secondary logic call different networks and services, so they need to be wrapped in separate Commands and isolated with thread pools. To switch between them, both can be wrapped in the run() method of a facade HystrixCommand, with the switch driven by a setting from a configuration center. Because the primary and secondary logic are already isolated by the thread pools of their own HystrixCommands, the facade HystrixCommand can use semaphore isolation; thread pool isolation would only add unnecessary overhead. The diagram is as follows:

Image source: the official Hystrix wiki, https://github.com/Netflix/Hystrix/wiki

The primary/secondary pattern has many uses. For example, when a system is upgraded with a new feature and the new version turns out to have a problem, the switch can be used to fall back to the old version. Sample code is as follows:

public class CommandFacadeWithPrimarySecondary extends HystrixCommand<String> {

private final static DynamicBooleanProperty usePrimary = DynamicPropertyFactory.getInstance().getBooleanProperty("primarySecondary.usePrimary", true);

private final int id;

public CommandFacadeWithPrimarySecondary(int id) {
    super(Setter
            .withGroupKey(HystrixCommandGroupKey.Factory.asKey("SystemX"))
            .andCommandKey(HystrixCommandKey.Factory.asKey("PrimarySecondaryCommand"))
            .andCommandPropertiesDefaults(
                    // The primary and secondary commands are already thread-pool isolated,
                    // so semaphore isolation is enough for the facade command
                    HystrixCommandProperties.Setter()
                            .withExecutionIsolationStrategy(ExecutionIsolationStrategy.SEMAPHORE)));
    this.id = id;
}

@Override
protected String run() {
    if (usePrimary.get()) {
        return new PrimaryCommand(id).execute();
    } else {
        return new SecondaryCommand(id).execute();
    }
}

@Override
protected String getFallback() {
    return "static-fallback-" + id;
}

@Override
protected String getCacheKey() {
    return String.valueOf(id);
}

private static class PrimaryCommand extends HystrixCommand<String> {

    private final int id;

    private PrimaryCommand(int id) {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("SystemX"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("PrimaryCommand"))
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("PrimaryCommand"))
                .andCommandPropertiesDefaults(
                        HystrixCommandProperties.Setter().withExecutionTimeoutInMilliseconds(600)));
        this.id = id;
    }

    @Override
    protected String run() {
        return "responseFromPrimary-" + id;
    }

}

private static class SecondaryCommand extends HystrixCommand<String> {

    private final int id;

    private SecondaryCommand(int id) {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("SystemX"))
                .andCommandKey(HystrixCommandKey.Factory.asKey("SecondaryCommand"))
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("SecondaryCommand"))
                .andCommandPropertiesDefaults(
                        HystrixCommandProperties.Setter().withExecutionTimeoutInMilliseconds(100)));
        this.id = id;
    }

    @Override
    protected String run() {
        return "responseFromSecondary-" + id;
    }

}

public static class UnitTest {

    @Test
    public void testPrimary() {
        HystrixRequestContext context = HystrixRequestContext.initializeContext();
        try {
            ConfigurationManager.getConfigInstance().setProperty("primarySecondary.usePrimary", true);
            assertEquals("responseFromPrimary-20", new CommandFacadeWithPrimarySecondary(20).execute());
        } finally {
            context.shutdown();
            ConfigurationManager.getConfigInstance().clear();
        }
    }

    @Test
    public void testSecondary() {
        HystrixRequestContext context = HystrixRequestContext.initializeContext();
        try {
            ConfigurationManager.getConfigInstance().setProperty("primarySecondary.usePrimary", false);
            assertEquals("responseFromSecondary-20", new CommandFacadeWithPrimarySecondary(20).execute());
        } finally {
            context.shutdown();
            ConfigurationManager.getConfigInstance().clear();
        }
    }
}

}
In general, it is recommended to override getFallback or resumeWithFallback to provide your own fallback logic, and it is not recommended to perform any operation that may itself fail inside the fallback logic.
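The reason for keeping fallbacks failure-free can be illustrated with a small dependency-free sketch. The helper `executeWithFallback` and the class `SafeFallbackSketch` are hypothetical names chosen for this illustration, not Hystrix API; the sketch only assumes that a fallback runs after the primary logic has thrown.

```java
import java.util.function.Supplier;

final class SafeFallbackSketch {

    // Hypothetical helper, NOT Hystrix API: run the primary logic and,
    // if it throws, return whatever the fallback produces.
    // If the fallback also throws, nothing is left to catch it.
    static <T> T executeWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException primaryFailure) {
            return fallback.get();
        }
    }

    // Simulates a dependency that is currently down.
    static String failingCall() {
        throw new IllegalStateException("service unavailable");
    }

    // Safe: the fallback is a constant and cannot fail.
    static String safeResult() {
        return executeWithFallback(SafeFallbackSketch::failingCall, () -> "default");
    }

    // Risky: the fallback performs another call that can fail,
    // so the caller ends up with an exception after all.
    static String riskyResult() {
        return executeWithFallback(SafeFallbackSketch::failingCall, SafeFallbackSketch::failingCall);
    }
}
```

In this sketch `safeResult` always yields a value, while `riskyResult` propagates the fallback's exception to the caller, which defeats the purpose of degrading gracefully.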

Summary
This article described Hystrix and how it works, including thread pool isolation, semaphore isolation, and the circuit breaker (fuse), and showed how to use Hystrix's fault-tolerance techniques, such as resource isolation, circuit breaking, and service degradation, to improve overall system robustness.

Origin blog.csdn.net/ninth_spring/article/details/104009699