6-Hystrix fault tolerance in Spring Cloud (Part 1)
- 1 Introduction
- 2. Hystrix fault tolerance processing
- 2.1 Project construction (Ribbon integrates Hystrix)
- 2.2 Service downgrade
- 2.3 Service circuit breaker
- 2.4 Fuzzy use (fuse? Downgrade?)
- 3. Feign integrates Hystrix fault tolerance processing
1 Introduction
1.1 About avalanches
1.1.1 What is a catastrophic avalanche?
- Microservices call one another. When one service in a call chain fails, every service that depends on it fails in turn, until the entire chain becomes inaccessible.
1.1.2 Causes of service avalanche
- The service provider is unavailable.
For example: hardware failure, program bugs, cache breakdown, or excessive concurrent requests (such as Double Eleven).
Cache breakdown usually occurs when the caching application is restarted and all caches are cleared, or when a large number of cache entries expire within a short period. The resulting flood of cache misses sends requests straight to the backend, overloading the service provider until it becomes unavailable.
- The service caller is unavailable.
For example: resource exhaustion caused by blocking synchronous requests.
- Retries increase traffic. For example:
  - User retry:
  After the service provider becomes unavailable, users cannot tolerate the long wait on the page and keep refreshing it or even resubmitting the form.
  - Code retry logic:
  The calling side of the service contains a lot of retry logic that fires after service exceptions.
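To get a feel for why retries make an outage worse, here is a rough worst-case estimate: if every layer in a call chain retries a failed call, the attempts multiply. This is illustrative arithmetic only; the class and method names are hypothetical and are not part of the project.

```java
// Illustrative only: worst-case number of calls that reach a failing provider
// when every layer in the chain retries on failure.
class RetryAmplification {

    /**
     * @param layers          number of layers that retry (user, consumer, ...)
     * @param retriesPerLayer extra attempts each layer makes after a failure
     * @return worst-case attempts hitting the provider for one original request
     */
    static long worstCaseAttempts(int layers, int retriesPerLayer) {
        long attempts = 1;
        for (int i = 0; i < layers; i++) {
            // Each layer repeats everything below it (1 + retries) times.
            attempts *= (1 + retriesPerLayer);
        }
        return attempts;
    }

    public static void main(String[] args) {
        // A user who refreshes twice, on top of a consumer that retries twice.
        System.out.println(worstCaseAttempts(2, 2)); // prints 9
    }
}
```

So a provider that is already struggling can receive several times its normal load, which is one reason the calling side needs protection.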
1.1.3 How to prevent catastrophic avalanche effects
- Service downgrade
- Service circuit breaker
- Request caching
1.2 Previous introduction
- This article continues from the previous ones, so some clusters and services are not introduced again below. For example, the Eureka cluster started in the earlier articles is reused here. The previous articles are:
- 1-Eureka service registration and discovery and Eureka cluster construction (practical) .
- 2-Spring cloud's Eureka quickly eliminates invalid services
- 3-Building Ribbon load balancing in Spring cloud - practical operation on the server (Part 1)
- 4-Building Ribbon load balancing in Spring cloud - practical operation on the server (Part 2)
- 5-Usage of Feign in Spring cloud - practical operation on the server .
2. Hystrix fault tolerance processing
2.1 Project construction (Ribbon integrates Hystrix)
2.1.1 Project structure
- To keep the overall project in order and the hierarchy clear, this part is built as a new Module. You can think of it as an upgraded version of dog-consumer-80 that simply adds Hystrix on top of Ribbon.
- The project structure is as follows:
2.1.2 pom file
- As follows:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>com.liu.susu</groupId>
        <artifactId>dog-cloud-parent</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <artifactId>dog-consumer-ribbon-hystrix-80</artifactId>
    <packaging>jar</packaging>
    <name>dog-consumer-ribbon-hystrix-80</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.liu.susu</groupId>
            <artifactId>dog-po</artifactId>
            <version>${project.version}</version>
        </dependency>
        <!-- version follows ${spring-boot.version} -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Ribbon dependencies: Ribbon is a client-side load balancer and needs to be integrated with Eureka -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-eureka</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-ribbon</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-config</artifactId>
        </dependency>
        <!-- Hystrix dependencies -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-hystrix</artifactId>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
2.1.3 yml file
- As follows:
server:
  port: 80
spring:
  application:
    name: dog-consumer-ribbon-hystrix
eureka:
  client:
    # register the client in the Eureka service list
    register-with-eureka: false # false means this client does not register itself with the registry
    service-url:
      defaultZone: http://IP1:2886/eureka/,http://IP2:2886/eureka,http://IP3:2886/eureka/
2.1.4 Configuration class
- as follows:
2.1.5 Startup class
- as follows:
2.1.6 controller
- As follows:
package com.liu.susu.controller;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

/**
 * @Description
 * @Author susu
 */
@RestController
public class DogConsumerController {

    private static final String REST_URL_PREFIX = "http://DOG-PROVIDER";

    @Autowired
    private RestTemplate restTemplate;

    @RequestMapping(value = "/consumer/dog/hello")
    public String hello(){
        System.out.println("==============================");
        String url = REST_URL_PREFIX + "/dog/hello";
        return restTemplate.getForObject(url, String.class);
    }

    /**
     * http://localhost:80/consumer/dog/getDogByNum/1
     * http://localhost/consumer/dog/getDogByNum/1
     */
    @RequestMapping("/consumer/dog/getDogByNum/{dogNum}")
    public Object getDogByNum(@PathVariable("dogNum") Long dogNum){
        String url = REST_URL_PREFIX + "/dog/getDogByNum/" + dogNum;
        return restTemplate.getForObject(url, Object.class);
    }
}
2.1.7 Start to ensure the service is available
- The Eureka cluster and the service providers from the previous articles are still running, so first check that they are up, as follows:
- Start the newly created service consumer and access it, as follows:
- Once everything works, we can go on to simulate service degradation and circuit breaking.
2.1.8 Demo remote service is not available
2.1.8.1 Simple demonstration
- Now stop all the service providers. (Right after they are stopped they still appear in the registry; how long depends on the configured heartbeat time.)
- Then access the consumer again and watch how the response changes:
  - First:
  Connection refused (Connection refused)
  (the service is not available)
  - Then, on a later request:
  No instances available
  This is expected, because you have stopped the service.
2.1.8.2 Normal and abnormal remote service outcomes
- The remote service is normal
  - A result is returned
  - An exception is passed back
- The remote service is in error
  - Remote service unavailable (access refused)
  - Remote service response timed out
2.2 Service downgrade
2.2.1 Service downgrade in brief
- Service degradation means that when a service fails or behaves abnormally, some less important services are temporarily shut down, or some simple cached data is returned, so that the core services stay stable and keep running normally. Service degradation can effectively reduce response time and error rate and improve system availability.
That is: service degradation is about allocating the system's overall resources sensibly. Distinguish core services from non-core services, estimate the access latency and failure modes of a given service, and provide a way to sidestep them. It is a global consideration based on overall load.
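The idea of returning fallback data instead of an error can be sketched in plain Java, independent of Hystrix. This is only a minimal illustration under my own assumptions: `FallbackSketch`, `callWithFallback` and `pause` are hypothetical names, and Hystrix itself does considerably more (thread isolation, metrics and so on).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical sketch: run a call with a time limit and degrade to
// fallback data when the call is too slow or throws.
class FallbackSketch {

    static <T> T callWithFallback(Supplier<T> remoteCall, long timeoutMillis, Supplier<T> fallback) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Too slow, or the call threw: give up on it and return fallback data.
            future.cancel(true);
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        }
    }

    // Stand-in for a slow remote service.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        String fast = callWithFallback(() -> "hello from provider", 1000,
                () -> "Server busy, please try again later!");
        String slow = callWithFallback(() -> { pause(500); return "too late"; }, 100,
                () -> "Server busy, please try again later!");
        System.out.println(fast);
        System.out.println(slow);
    }
}
```

The caller always gets an answer within the time limit, which is the point of a downgrade: a less useful response is better than a blocked thread.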
2.2.2 Simulating service degradation
- Next, we simulate a timeout.
- Without any downgrade handling, if we deliberately introduce a delay, Spring Cloud will by default keep waiting on the remote call no matter how long it takes. You can try it yourself; I won't demonstrate it here.
- The core code for simulating a service timeout with downgrade handling is as follows:
- Startup class
@EnableHystrix // enable Hystrix fault tolerance
- The method in the controller
@HystrixCommand(fallbackMethod = "errorHello")
/**
 * @EnableHystrix   annotation on the startup class; enables Hystrix fault tolerance
 * @HystrixCommand  marks the current method as one that needs fault-tolerant handling
 * @EnableHystrix combined with @HystrixCommand applies a default remote-call timeout of 1 second
 */
@RequestMapping(value = "/consumer/dog/hello")
@HystrixCommand(fallbackMethod = "errorHello")
public String hello(){
    System.out.println("==============================");
    // By default, Spring Cloud waits on a remote call no matter how long it takes
    try {
        Thread.sleep(3000);
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
    String url = REST_URL_PREFIX + "/dog/hello";
    return restTemplate.getForObject(url, String.class);
}

/**
 * Fallback (downgrade) method
 * 1. The return type must match that of the corresponding service method
 * 2. The parameters must match those of the corresponding service method
 * 3. Returned content: the fallback data returned when remote access fails (for example, on timeout)
 */
public String errorHello(){
    return "Server busy, please try again later!";
}
2.2.3 See the effect
-
as follows:
2.3 Service circuit breaker
2.3.1 Service circuit breaker in brief
- A service circuit breaker cuts off calls to a service when that service fails or behaves abnormally, so that no further requests are sent to it. This stops the fault from spreading and lets the service recover quickly. A circuit breaker usually checks service availability against a policy: if the response time or error rate of a service exceeds a threshold, the breaker trips and calls are cut off, and the connection is then retried periodically until the service is back to normal.
- That is: a circuit break is generally caused by the failure of one particular (downstream) service, whereas service degradation is generally considered from the perspective of overall load.
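The trip, wait and retry cycle described above can be sketched as a small state machine in plain Java. This is a simplified illustration, not how Hystrix is implemented: it counts consecutive failures instead of an error percentage over a rolling window, and `SimpleCircuitBreaker` and its members are hypothetical names. The clock is passed in so the behaviour can be checked without real waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified circuit breaker (assumption: consecutive-failure counting):
// CLOSED -> OPEN after N consecutive failures,
// OPEN -> HALF_OPEN once the sleep window has passed,
// HALF_OPEN -> CLOSED on success, HALF_OPEN -> OPEN on failure.
class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long sleepWindowMillis;
    private final LongSupplier clock;

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (!allowRequest()) {
            return fallback.get(); // circuit open: skip the remote call entirely
        }
        try {
            T result = remoteCall.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized boolean allowRequest() {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt >= sleepWindowMillis) {
                // Sleep window over: let a trial request through.
                // Simplification: concurrent callers are not limited to one trial.
                state = State.HALF_OPEN;
                return true;
            }
            return false;
        }
        return true;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3, 3000, System::currentTimeMillis);
        for (int i = 0; i < 4; i++) {
            String result = breaker.<String>call(
                    () -> { throw new IllegalStateException("provider down"); },
                    () -> "Server busy, please try again later!");
            System.out.println(result + " (state: " + breaker.state() + ")");
        }
    }
}
```

Once the breaker is open, the fallback is returned immediately and the failing provider is left alone until the sleep window has passed.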
2.3.2 Simulate service circuit breaker 1 (stop service)
2.3.2.1 Downgrade code
- As follows:
@RequestMapping("/consumer/dog/getDogByNum/{dogNum}")
@HystrixCommand(fallbackMethod = "errorGetDogByNum",commandProperties = {
        // Whether the circuit breaker is enabled
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ENABLED,value = "true"),
        // Minimum number of requests within one time window before the circuit breaker is allowed to trip
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD,value = "3"),
        // Percentage of failed remote calls within one time window; the circuit opens when it is reached
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE,value = "20"),
        // After the circuit opens, how many milliseconds to go without making remote calls
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS,value = "3000")
})
public Object getDogByNum(@PathVariable("dogNum") Long dogNum){
    System.out.println("Testing the circuit breaker locally....");
    String url = REST_URL_PREFIX + "/dog/getDogByNum/" + dogNum;
    return restTemplate.getForObject(url, Object.class);
}

public Object errorGetDogByNum(Long dogNum){
    System.out.println("Circuit breaker fallback, dogNum is: ===>"+dogNum);
    return "Circuit open: server busy, please try again later!";
}
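To make the two thresholds concrete, here is a small sketch of the trip condition as I understand it from the Hystrix configuration documentation: the circuit opens only when the window has seen at least the request volume threshold and the error percentage has reached the configured value. `TripCondition` and `shouldOpen` are hypothetical names, and the real implementation works on rolling-window metrics rather than two plain counters.

```java
// Hypothetical sketch of the trip condition, using plain counters
// in place of Hystrix's rolling-window metrics.
class TripCondition {

    static boolean shouldOpen(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        // Too few requests in the window: never trip, whatever the error rate.
        if (totalRequests < requestVolumeThreshold) {
            return false;
        }
        int errorPercentage = totalRequests == 0 ? 0 : failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // Values from the annotation above: volume threshold 3, error threshold 20%.
        System.out.println(shouldOpen(2, 2, 3, 20));  // prints false: only 2 requests so far
        System.out.println(shouldOpen(3, 1, 3, 20));  // prints true: 33% >= 20%
        System.out.println(shouldOpen(10, 1, 3, 20)); // prints false: 10% < 20%
    }
}
```

With values this low the breaker trips almost immediately, which is convenient for a demo; production settings are normally much higher.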
2.3.2.2 See the effect
-
Stop the service provider as follows:
2.3.3 Simulate service circuit breaker 2 (service provider throws an exception)
2.3.3.1 Modify service provider
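- Modify the code as follows:
System.out.println("Entering the service provider, simulating an exception....");
// Fails with ArithmeticException when dogNum is 0 (assuming an integer type)
System.out.println(2 / dogNum);
- Start the service. For convenience, start it locally.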
2.3.3.2 See the effect
2.3.3.2.1 Comment out the fuse processing
- A bare 500 is a headache: I still don't know what caused it and have to open the service provider's log, as follows:
2.3.3.2.2 Add fuse processing
- The service provider log is as follows:
- The service consumer
- The page: from a customer's point of view, this looks much friendlier
2.4 Fuzzy use (fuse? Downgrade?)
- A blurred concept: a service circuit breaker can be seen as an enhanced version of service degradation. So if the error occurs on the consumer side, the circuit breaker configuration handles it as well, with an effect similar to the above.
- The following shows the circuit breaker handling an error raised on the consumer side:
2.4.1 Code (server code is normal, consumer code simulates exception)
- As follows:
@RequestMapping("/consumer/dog/getDogByNum/{dogNum}")
@HystrixCommand(fallbackMethod = "errorGetDogByNum",commandProperties = {
        // Whether the circuit breaker is enabled
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ENABLED,value = "true"),
        // Minimum number of requests within one time window before the circuit breaker is allowed to trip
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD,value = "3"),
        // Percentage of failed calls within one time window; the circuit opens when it is reached
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE,value = "20"),
        // After the circuit opens, how many milliseconds to go without making remote calls
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS,value = "3000")
})
public Object getDogByNum(@PathVariable("dogNum") Long dogNum){
    System.out.println("Testing the circuit breaker locally....");
    // Simulated consumer-side error: fails when dogNum is 0
    System.out.println(2 / dogNum);
    String url = REST_URL_PREFIX + "/dog/getDogByNum/" + dogNum;
    return restTemplate.getForObject(url, Object.class);
}

public Object errorGetDogByNum(Long dogNum){
    System.out.println("Circuit breaker fallback, dogNum is: ===>"+dogNum);
    return "Circuit open: server busy, please try again later!";
}
2.4.2 Effect
2.4.2.1 Effect when fuse is not turned on
- as follows:
2.4.2.2 Effect after turning on fuse
- First test a dogNum that does not cause an error, as follows:
- Then test a value that does cause an error, as follows:
- Requesting again should no longer show an error, as follows:
3. Feign integrates Hystrix fault tolerance processing
- For convenience, the code is modified directly in the dog-api project. There are two ways to do it:
- Look at the original code first
3.1 Core code
3.1.1 Implementation classes and annotations
3.1.1.1 Method 1
- Add the implementation class, as follows
- Modify the annotation as follows
@FeignClient(value = "DOG-PROVIDER",fallback = DogClientApiImpl.class) // Method 1
3.1.1.2 Method 2
- Add the implementation class as follows:
package com.liu.susu.api;

import com.liu.susu.pojo.Dog;
import feign.hystrix.FallbackFactory;
import org.springframework.stereotype.Component;

import java.util.List;

/**
 * @Description
 * @Author susu
 */
@Component
public class DogClientApiFallbackFactory implements FallbackFactory<DogClientApi> {

    @Override
    public DogClientApi create(Throwable throwable) {
        return new DogClientApi() {
            @Override
            public String hello() {
                System.out.println("Entering DogClientApiFallbackFactory service downgrade--->hello");
                return "DogClientApiFallbackFactory service downgrade handling, please try again later";
            }

            @Override
            public Object getDogByNum(Long dogNum) {
                return null;
            }

            @Override
            public List<Dog> getAllDog() {
                return null;
            }
        };
    }
}
-
Modify the annotation as follows:
@FeignClient(value = "DOG-PROVIDER",fallbackFactory = DogClientApiFallbackFactory.class) // Method 2
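The practical difference between the two methods is that a fallback factory receives the Throwable that caused the failure, so the fallback can log or report the cause, while a plain fallback class cannot see it. The plain-Java sketch below illustrates only that difference; `GreetingClient`, `StaticFallback`, `CauseAwareFallbackFactory` and `invoke` are hypothetical stand-ins, not Feign or Hystrix types.

```java
import java.util.function.Function;

// Hypothetical stand-ins for the two Feign fallback styles.
class FallbackFactorySketch {

    interface GreetingClient {
        String hello();
    }

    // Method 1 style: a fixed fallback object that cannot see why the call failed.
    static class StaticFallback implements GreetingClient {
        @Override
        public String hello() {
            return "fallback";
        }
    }

    // Method 2 style: the fallback is created per failure and receives the cause.
    static class CauseAwareFallbackFactory implements Function<Throwable, GreetingClient> {
        @Override
        public GreetingClient apply(Throwable cause) {
            return () -> "fallback (cause: " + cause.getMessage() + ")";
        }
    }

    // Calls the real client and degrades to the factory's fallback on failure.
    static String invoke(GreetingClient real, Function<Throwable, GreetingClient> factory) {
        try {
            return real.hello();
        } catch (RuntimeException e) {
            return factory.apply(e).hello();
        }
    }

    public static void main(String[] args) {
        GreetingClient broken = () -> { throw new IllegalStateException("connection refused"); };
        System.out.println(invoke(broken, cause -> new StaticFallback()));
        System.out.println(invoke(broken, new CauseAwareFallbackFactory()));
    }
}
```

In the factory above the `throwable` argument goes unused, so both methods behave the same in this demo; logging it is usually the reason to prefer Method 2.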
3.1.2 Consumer yml
- Configure the consumer's yml file (don't forget!!!)
feign:
  hystrix:
    enabled: true
3.1.3 See the effect
- Start the service and see the effect
- Then disconnect the service provider and see the effect
- Okay, that's it. There is another way that achieves the same effect. If you are interested, go and try it yourself.