Trying Out Hystrix Circuit Breaking
When a service has many external dependencies and one of them becomes unavailable, the whole cluster can be affected: for example, timeouts cause a large number of requests to block, so that new external requests can no longer get in. Hystrix is very useful in this situation.
To that end, I studied the Hystrix framework and recorded my first steps with it here.
I. How It Works
Based on the official website and related blog posts, the working mechanism can be summarized briefly. The general process is as follows:
request -> check whether the circuit breaker is open -> call the service -> on an exception, run the fallback and increment the failure count -> end
The following is the main flow chart
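graph LR
A(Request)-->B{Circuit breaker open?}
B --> | open | D[fallback logic]
B --> | closed | E[Thread pool / Semaphore]
E --> F{Thread pool full / no semaphore available}
F --> | yes | D
F --> | no | G{Executed successfully on a new thread / the calling thread}
G --> | yes | I(End)
G --> | no | D
D --> I(End)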
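The flow above can be sketched in plain Java. This is only a toy illustration under simplifying assumptions, not Hystrix's implementation: `ToyBreaker` and its members are hypothetical names, and it opens after a fixed number of consecutive failures instead of using a failure rate over a time window.

```java
import java.util.function.Supplier;

/** Toy circuit breaker (hypothetical names; not Hystrix code). */
class ToyBreaker {
    private final int maxFailures; // consecutive failures before the circuit opens
    private int failures = 0;

    ToyBreaker(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    boolean isOpen() {
        return failures >= maxFailures;
    }

    /** request -> breaker open? -> call service -> on exception: fallback, failure count +1 */
    <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuited: the service is not called at all
        }
        try {
            T result = call.get();
            failures = 0; // a success resets the consecutive-failure count
            return result;
        } catch (RuntimeException e) {
            failures++;
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        ToyBreaker breaker = new ToyBreaker(2);
        for (int i = 0; i < 4; i++) {
            String result = breaker.<String>execute(
                    () -> { throw new IllegalStateException("service down"); },
                    () -> "fallback");
            System.out.println(result + ", open=" + breaker.isOpen());
        }
    }
}
```

Once the breaker is open, `execute` returns the fallback without touching the failing service, which is what protects the caller's threads. This toy never closes again; recovery in Hystrix is governed by the sleep window setting.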
Hystrix mainly provides two isolation strategies: one based on a thread pool, the other based on acquiring a semaphore
Thread pool mode: supports asynchronous execution, timeout settings, and limiting concurrency
Semaphore mode: runs on the calling thread, with no asynchronous execution and no timeout; it supports limiting concurrency and has lower overhead
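The difference between the two modes can be sketched with plain JDK classes. This is a hedged illustration, not how Hystrix is implemented: `ToyIsolation`, `viaThreadPool`, and `viaSemaphore` are hypothetical names. The thread-pool variant can give up on a slow call because the work runs on another thread; the semaphore variant runs on the calling thread and can only reject when the limit is reached.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

/** Toy versions of the two isolation modes (hypothetical names; not Hystrix code). */
class ToyIsolation {
    /** Thread-pool style: run on a pool thread and give up after timeoutMillis. */
    static <T> T viaThreadPool(ExecutorService pool, Callable<T> call,
                               long timeoutMillis, Supplier<T> fallback) {
        Future<T> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback.get(); // the pool refused the task
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; the caller moves on
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get(); // the call itself threw
        }
    }

    /** Semaphore style: run on the calling thread, reject at once if no permit is free. */
    static <T> T viaSemaphore(Semaphore permits, Supplier<T> call, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // concurrency limit reached
        }
        try {
            return call.get(); // no timeout possible: we are the calling thread
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            String slow = ToyIsolation.<String>viaThreadPool(
                    pool, () -> { Thread.sleep(2000); return "late"; }, 100, () -> "fallback");
            System.out.println(slow);
        } finally {
            pool.shutdownNow();
        }
        System.out.println(ToyIsolation.<String>viaSemaphore(
                new Semaphore(1), () -> "ok", () -> "fallback"));
    }
}
```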
With these basic concepts in place, we can move on to trying it out
II. Trying It Out
1. Add the dependency
<dependency>
<groupId>com.netflix.hystrix</groupId>
<artifactId>hystrix-core</artifactId>
<version>1.5.12</version>
</dependency>
2. Basic usage
According to the official documentation, two kinds of command are supported: HystrixObservableCommand, based on the observer pattern, and the basic HystrixCommand. Let's start with a simple example of the latter
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolKey;
import com.netflix.hystrix.HystrixThreadPoolProperties;
import com.netflix.hystrix.exception.HystrixRuntimeException;
import org.junit.Test;

import java.io.IOException;
import java.util.Map;

public class HystrixConfigTest extends HystrixCommand<String> {
    private final String name;

    public HystrixConfigTest(String name, boolean ans) {
        // Note that for the same task,
        super(Setter.withGroupKey(
                // CommandGroup is the one required setting for every command; when no ThreadPoolKey is given, its literal value is used to tell apart the thread pools/semaphores of different dependencies
                HystrixCommandGroupKey.Factory.asKey("CircuitBreakerTestGroup"))
                // Each CommandKey represents one dependency abstraction; the same dependency should use the same CommandKey name. Dependency isolation is essentially isolation of the dependencies that share a CommandKey.
                .andCommandKey(HystrixCommandKey.Factory.asKey("CircuitBreakerTestKey_" + ans))
                // CommandGroup separates dependencies of the same business; for different remote calls of the same dependency (e.g. one to redis, one over http), HystrixThreadPoolKey can be used to isolate them
                .andThreadPoolKey(HystrixThreadPoolKey.Factory.asKey("CircuitBreakerTest_" + ans))
                .andThreadPoolPropertiesDefaults( // configure the thread pool
                        HystrixThreadPoolProperties.Setter()
                                .withCoreSize(12) // number of threads in the pool; keep it large enough that the pool does not fill up while the circuit is still closed
                )
                .andCommandPropertiesDefaults( // configure the circuit breaker
                        HystrixCommandProperties.Setter()
                                .withCircuitBreakerEnabled(true)
                                .withCircuitBreakerRequestVolumeThreshold(3)
                                .withCircuitBreakerErrorThresholdPercentage(80)
                                // .withCircuitBreakerForceOpen(true) // when true, every request is rejected and goes straight to the fallback
                                // .withCircuitBreakerForceClosed(true) // when true, errors are ignored
                                // .withExecutionIsolationStrategy(HystrixCommandProperties.ExecutionIsolationStrategy.SEMAPHORE) // semaphore isolation
                                .withExecutionIsolationSemaphoreMaxConcurrentRequests(20)
                                .withExecutionTimeoutEnabled(true)
                                .withExecutionTimeoutInMilliseconds(200)
                                .withCircuitBreakerSleepWindowInMilliseconds(1000) // how long the circuit stays open before a trial request is allowed
                        // .withExecutionTimeoutInMilliseconds(5000)
                )
        );
        this.name = name;
    }

    @Override
    protected String run() throws Exception {
        System.out.println("running run():" + name + " thread: " + Thread.currentThread().getName());
        int num = Integer.parseInt(name);
        if (num % 2 == 0 && num < 10) { // return immediately
            return name;
        } else if (num < 40) {
            Thread.sleep(300); // longer than the 200 ms execution timeout configured above
            return "sleep+" + name;
        } else { // originally meant to simulate a timeout with an infinite loop; as written it returns immediately
            return name;
        }
    }

    //
    // @Override
    // protected String getFallback() {
    //     Throwable t = this.getExecutionException();
    //     if (t instanceof HystrixRuntimeException) {
    //         System.out.println(Thread.currentThread() + " --> " + ((HystrixRuntimeException) t).getFailureType());
    //     } else if (t instanceof HystrixTimeoutException) {
    //         System.out.println(t.getCause());
    //     } else {
    //         t.printStackTrace();
    //     }
    //     System.out.println(Thread.currentThread() + " --> ----------over------------");
    //     return "CircuitBreaker fallback: " + name;
    // }

    public static class UnitTest {
        @Test
        public void testSynchronous() throws IOException, InterruptedException {
            for (int i = 0; i < 50; i++) {
                if (i == 41) {
                    Thread.sleep(2000);
                }
                try {
                    System.out.println("===========" + new HystrixConfigTest(String.valueOf(i), i % 2 == 0).execute());
                } catch (HystrixRuntimeException e) {
                    System.out.println(i + " : " + e.getFailureType() + " >>>> " + e.getCause() + " <<<<<");
                } catch (Exception e) {
                    System.out.println("HystrixBadRequestException thrown by run() is caught here: " + e.getCause());
                }
            }
            System.out.println("------ listing the current threads ---------");
            Map<Thread, StackTraceElement[]> map = Thread.getAllStackTraces();
            for (Thread thread : map.keySet()) {
                System.out.println("--->name-->" + thread.getName());
            }
            System.out.println("thread num: " + map.size());
            System.in.read();
        }
    }
}
Usage is fairly simple. The general steps are:
- Extend the HystrixCommand class
- Provide a constructor that specifies the various configuration settings
- Implement the run method, which holds the call that the circuit breaker guards
Writing the code above is easy enough, but several things are harder to get right
- What does each configuration item actually mean, and how does it take effect?
- What if certain exceptions should not count toward the circuit breaker?
- How can monitoring data be obtained?
- How can the various cases be simulated (timeout? service exception? circuit open? thread pool full? no semaphore available? retry while the circuit is half-open?)
3. What I learned from testing
After adding, removing, and changing parts of the code above, I think I understand the following points, though I am not sure they are all correct
a. Configuration related
- groupKey determines which thread pool or semaphore is used, that is, one group corresponds to one thread pool (or semaphore)
- commandKey is very important; it is used to tell business calls apart
- Put simply, a group is like an app that provides services, and a command corresponds to one service provided by that app. An app can have multiple services. Here, all requests of one app go into one thread pool (or share one semaphore)
- Enable the circuit breaker, set the minimum number of requests (within 10s) required before it can trip, and set the condition for opening it (failure rate)
- Set the isolation strategy (thread pool or semaphore)
- Set the retry time (by default, 5s after the circuit opens, a trial request is let through to see whether the service has recovered)
- Set the thread pool size, the semaphore size, and the queue size
- Set the timeout, and whether timeouts are enabled
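How the request-volume threshold, failure-rate threshold, and retry time interact can be sketched as two small checks. This is an illustrative simplification based on my understanding of the settings, not Hystrix source code: `ToyTripRule`, `shouldOpen`, and `allowTrial` are hypothetical names, and the counts are assumed to come from the current statistics window.

```java
/** Toy trip rule for a circuit breaker (hypothetical names; not Hystrix code). */
class ToyTripRule {
    /** Open only if enough requests were seen AND the failure rate reaches the threshold. */
    static boolean shouldOpen(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false; // too few requests in the window to judge the dependency
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    /** After the sleep window has passed, a trial request may be let through. */
    static boolean allowTrial(long openedAtMillis, long nowMillis, long sleepWindowMillis) {
        return nowMillis > openedAtMillis + sleepWindowMillis;
    }

    public static void main(String[] args) {
        // Same thresholds as the test command above: volume 3, error rate 80%, sleep window 1000 ms
        System.out.println(shouldOpen(2, 2, 3, 80));
        System.out.println(shouldOpen(5, 4, 3, 80));
        System.out.println(allowTrial(0L, 1500L, 1000L));
    }
}
```

With a volume threshold of 3, two failures out of two requests do not open the circuit, because the sample is considered too small; four failures out of five do.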
b. Usage related
The run method is the core: it executes the service call. If some failures should not count toward the circuit breaker's failure rate (for example, the caller used the service incorrectly, so an exception is thrown inside the service although the service itself is healthy), the calling logic needs to be wrapped so that those exceptions are rethrown as HystrixBadRequestException
For example
@Override
protected String run() {
    try {
        return func.apply(route, parameterDescs);
    } catch (Exception e) {
        if (exceptionExcept(e)) {
            // An exception we do not care about: keep it out of the circuit-breaker logic
            throw new HystrixBadRequestException("unexpected exception!", e);
        } else {
            throw e;
        }
    }
}
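The snippet above calls `exceptionExcept(e)` without showing it. A minimal sketch of what such a predicate might look like follows; the class name `IgnoredExceptions` and the choice of exception types are assumptions made purely for illustration, and a real project would list whatever exceptions signal a caller mistake in its own code base.

```java
/** Hypothetical helper: decides which exceptions should NOT count as service failures. */
class IgnoredExceptions {
    static boolean exceptionExcept(Throwable e) {
        // Assumption for illustration: caller mistakes surface as these JDK exception types
        return e instanceof IllegalArgumentException
                || e instanceof UnsupportedOperationException;
    }

    public static void main(String[] args) {
        System.out.println(exceptionExcept(new IllegalArgumentException("bad parameter")));
        System.out.println(exceptionExcept(new java.io.IOException("connection reset")));
    }
}
```

Keeping the decision in one predicate means the set of ignored exceptions can change without touching the command's run method.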
c. How to get the reason for failure
When a failure occurs, Hystrix wraps the original exception in a HystrixRuntimeException, so the caller can handle it as follows
try {
    System.out.println("===========" + new HystrixConfigTest(String.valueOf(i), i % 2 == 0).execute());
} catch (HystrixRuntimeException e) {
    System.out.println(i + " : " + e.getFailureType() + " >>>> " + e.getCause() + " <<<<<");
} catch (Exception e) {
    System.out.println("HystrixBadRequestException thrown by run() is caught here: " + e.getCause());
}
When fallback logic is defined, the exception is not thrown to the caller, so the exception information has to be obtained inside the fallback method
// Get the exception information
Throwable t = this.getExecutionException();
The next step is to find the cause of the failure; FailureType indicates the root cause
((HystrixRuntimeException) t).getFailureType()
d. How to get statistics
Hystrix provides a set of monitoring plug-ins, but most teams already have their own monitoring and statistics, so this data has to be adapted to fit them. I have not yet figured out how to handle these statistics gracefully.
4. Summary
The main goal was to see how Hystrix can be used. My overall impression is that the design is interesting, but there are too many configuration parameters, many of which I do not yet fully understand.
In addition, it is not very easy to use when special cases (such as monitoring, alerting, or filtering particular exceptions) need to be handled. The main reason is that I do not yet clearly understand the framework's internal working mechanism.
III. Other
Personal blog: Z+|blog
A personal blog built on hexo + github pages that collects all the posts from my study and work. Visitors are welcome
Statement
Better to have no books at all than to believe everything in them. The above is purely my own opinion. Given my average ability and limited knowledge, if you find bugs or have better suggestions, criticism and corrections are welcome at any time.
- Weibo address: Xiaohuihui Blog
- QQ: A gray gray / 3302797840