Detailed explanation and practice of Hystrix---SpringCloud component (4)

1. Introduction to Hystrix

Hystrix means porcupine, an animal whose whole body is covered with quills and which is not easy to mess with. The name fits: Hystrix is a protective mechanism.
Hystrix is also a Netflix component.
Home page: https://github.com/Netflix/Hystrix/

Hystrix works like a switching device, similar to a fuse. A Hystrix fuse is installed on the consumer side. When Hystrix detects that a service is failing, the fuse opens and the access link to that service is disconnected. However, Hystrix neither blocks the consumer nor throws an exception to it. Instead, it returns an expected alternative response (fallback) to the consumer. Through its circuit breaking and degradation functions, Hystrix prevents service avalanches while still taking user experience into account. Hystrix is therefore a defense mechanism for the system.

2. Avalanche problem

Suppose one of the services our application depends on, Dependency-I, fails. The part of our application that calls Dependency-I then fails as well, and its requests block. At this point, other business functions appear to be unaffected.

For example, if an exception occurs in microservice I, the request blocks and the user gets no response, so the Tomcat thread is not released. As more user requests arrive, more and more threads become blocked.

The number of threads and the level of concurrency a server supports are limited. If requests stay blocked, server resources are eventually exhausted and all other services become unavailable, producing an avalanche effect.

This is like a car production line that builds different models from different parts. If a certain part is unavailable for any reason, the car that needs it cannot be assembled, and assembly waits until the part arrives. If many models need that part, the entire factory ends up waiting and all production is paralyzed. The impact of a single part keeps expanding.
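
The thread exhaustion described above can be reproduced with plain java.util.concurrent. This is a minimal sketch, not code from the article's project: the class name AvalancheDemo and the method healthyRequestStarved are hypothetical, and a small fixed thread pool stands in for the Tomcat worker threads. Every shared thread blocks on a failing dependency, so a request that never touches that dependency still cannot get a thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AvalancheDemo {

    /** Returns true if a healthy request gets no thread because all shared threads are blocked. */
    static boolean healthyRequestStarved(int sharedThreads) {
        ExecutorService shared = Executors.newFixedThreadPool(sharedThreads);
        CountDownLatch dependencyRecovers = new CountDownLatch(1);
        try {
            // Each shared thread picks up a request that blocks on the failing dependency.
            for (int i = 0; i < sharedThreads; i++) {
                shared.submit(() -> {
                    dependencyRecovers.await();
                    return "slow";
                });
            }
            // This request does not use the failing dependency at all.
            Future<String> healthy = shared.submit(() -> "fast");
            try {
                healthy.get(200, TimeUnit.MILLISECONDS);
                return false; // it got a thread in time
            } catch (TimeoutException e) {
                return true;  // it is still waiting in the queue
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            dependencyRecovers.countDown(); // unblock the stuck requests
            shared.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy request starved: " + healthyRequestStarved(2));
    }
}
```

The healthy request is not slow in itself; it is starved because it shares threads with the blocked calls.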

3. Service degradation, thread isolation, principles


Hystrix allocates a small thread pool for each dependent service call. If the thread pool is full, the call is rejected immediately. By default no queuing is used, which speeds up failure detection.

A user request no longer accesses the service directly; it accesses it through an idle thread in the thread pool. If the thread pool is full or the request times out, the call is degraded: an error message or an alternative result is returned to the user.

Although service degradation causes the request to fail, it does not cause blocking, and at most it consumes the thread resources allocated to that service rather than exhausting the resources of the entire container. The impact of the failure is isolated within the thread pool.
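
The isolation idea above can be sketched with plain java.util.concurrent, without Hystrix itself. This is a minimal sketch under stated assumptions, not Hystrix's implementation: the class name IsolatedCommand, its execute method, and the pool size and timeout values are hypothetical. A SynchronousQueue gives the pool no waiting room, so when every thread is busy the call is rejected at once and the fallback is returned. A timeout or an exception leads to the fallback as well.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class IsolatedCommand {
    private final ExecutorService pool;
    private final long timeoutMillis;

    IsolatedCommand(int poolSize, long timeoutMillis) {
        // Small dedicated pool with no queue: a full pool rejects new calls immediately.
        this.pool = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(),
                r -> {
                    Thread t = new Thread(r, "isolated-call");
                    t.setDaemon(true); // do not keep the JVM alive for the demo
                    return t;
                },
                new ThreadPoolExecutor.AbortPolicy());
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the call on the dedicated pool; returns the fallback on rejection, timeout, or error. */
    <T> T execute(Callable<T> call, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback; // pool is full: degrade without waiting
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow call and free its thread
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the call itself failed
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        IsolatedCommand command = new IsolatedCommand(4, 200);
        System.out.println(command.execute(() -> "real data", "fallback"));
        System.out.println(command.execute(() -> {
            Thread.sleep(5000); // simulated slow provider
            return "late data";
        }, "fallback"));
        command.shutdown();
    }
}
```

The caller's own thread waits at most for the timeout, and a misbehaving dependency can tie up only the threads of its own pool.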

3.1. Service downgrade practice (implemented on the basis of feign)

Steps:

1. Add Hystrix dependency
2. Enable circuit breaker function in yml
3. Write downgrade logic

1. Add Hystrix dependency

Feign already integrates Hystrix by default, so there is no need to add the dependency separately.

2. Enable the circuit breaker function in yml

Enable the circuit breaker configuration in the feign-hystrix-consumer-8080 service:

feign:
  hystrix:
    enabled: true # Enable Feign's circuit breaker function


3. Write downgrade logic

  1. Add the DepartServiceFallback class to feign-provider-api. It implements the Feign interface, and each method contains the specific fallback logic.


import java.util.List;

import org.springframework.stereotype.Component;

/**
 * @ClassName: DepartServiceFallback
 * @Description: Fallback (failure callback) class
 * @Author: wang xiao le
 * @Date: 2023/05/11 21:56
 **/
@Component
public class DepartServiceFallback implements DepartService {

    @Override
    public boolean saveDepart(DepartVO depart) {
        return false;
    }

    @Override
    public boolean removeDepartById(int id) {
        return false;
    }

    @Override
    public boolean modifyDepart(DepartVO depart) {
        return false;
    }

    @Override
    public DepartVO getDepartById(int id) {
        // Return a placeholder object so the caller can see that the query was degraded
        DepartVO departVO = new DepartVO();
        departVO.setId(id);
        departVO.setName("Query failed");
        return departVO;
    }

    @Override
    public List<DepartVO> listAllDeparts() {
        // Callers must be prepared for null when the list query is degraded
        return null;
    }
}
  2. In DepartService, specify the fallback class just written through the fallback attribute of @FeignClient.


// Note: the interface name and method names can be chosen freely
// The value parameter specifies the name of the provider microservice to access
//@FeignClient(url ="http://127.0.0.1:8081", value="abcmsc-provider-depart", path = "/provider/depart")
//@FeignClient(url ="${feign.client.url}", value="abcmsc-provider-depart", path = "/provider/depart")
// The fallback attribute names the class whose methods run when a call is degraded
@FeignClient(value = "feign-provider", path = "/provider/depart", fallback = DepartServiceFallback.class)
public interface DepartService {

    @PostMapping("/save")
    boolean saveDepart(@RequestBody DepartVO depart);

    @DeleteMapping("/del/{id}")
    boolean removeDepartById(@PathVariable("id") int id);

    @PutMapping("/update")
    boolean modifyDepart(@RequestBody DepartVO depart);

    @GetMapping("/get/{id}")
    DepartVO getDepartById(@PathVariable("id") int id);

    @GetMapping("/list")
    List<DepartVO> listAllDeparts();
}

4. Restart the test

We restart only the consumer and do not start the provider service, which simulates provider downtime.
Accessing http://localhost:8080/consumer/depart/get/1 displays the result of the fallback.

4. Service circuit breaker (Circuit Breaker), principle

4.1. Fusing principle

Although isolation can prevent cascading failures, every request to service I (the faulty service) still has to wait several seconds before falling back, which is clearly a waste of system resources.

Therefore, when Hystrix determines that a dependent service has a high failure rate, it breaks the circuit for it: requests to the failing service are intercepted and fail fast instead of blocking and waiting, just as a circuit breaker trips to protect an electrical circuit.

The fuse is also called a circuit breaker (Circuit Breaker).
Hystrix's circuit breaker state machine model has 3 states:

  • Closed: the circuit breaker is closed and all requests pass through normally.
  • Open: the circuit breaker is open and all requests are degraded. Hystrix counts requests; when the percentage of failed requests within a certain period reaches the threshold, the breaker trips and opens. The default failure threshold is 50%, and at least 20 requests are required.
  • Half Open: the open state is not permanent. After opening, the breaker enters a sleep window (5 seconds by default) and then automatically moves to the half-open state. One request is then let through. If that request succeeds, the breaker closes; otherwise it stays open and the 5-second sleep timer starts again.
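
The three states above can be sketched as a small state machine. This is a simplified illustration, not Hystrix's actual implementation: the class SimpleCircuitBreaker and its methods are hypothetical, failures are counted since the breaker last closed rather than over a rolling time window, and the clock is injected so the sleep window can be exercised without waiting.

```java
import java.util.function.LongSupplier;

class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // minimum requests before the error rate is judged
    private final int errorThresholdPercentage; // error percentage that trips the breaker
    private final long sleepWindowMillis;       // time to stay open before a trial request
    private final LongSupplier clock;           // injectable time source in milliseconds

    private State state = State.CLOSED;
    private int total;
    private int failures;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    /** Returns true if the request may proceed; false means go straight to the fallback. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // sleep window elapsed: let exactly one trial request through
            return true;
        }
        return state == State.CLOSED;
    }

    synchronized void recordSuccess() {
        if (state == State.HALF_OPEN) {
            close(); // trial request succeeded
        } else if (state == State.CLOSED) {
            total++;
        }
    }

    synchronized void recordFailure() {
        if (state == State.HALF_OPEN) {
            trip(); // trial request failed: reopen and restart the sleep timer
            return;
        }
        if (state != State.CLOSED) {
            return;
        }
        total++;
        failures++;
        if (total >= requestVolumeThreshold && failures * 100 >= errorThresholdPercentage * total) {
            trip();
        }
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
    }

    private void close() {
        state = State.CLOSED;
        total = 0;
        failures = 0;
    }

    public static void main(String[] args) {
        long[] now = {0};
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(20, 50, 5000, () -> now[0]);
        for (int i = 0; i < 20; i++) {
            breaker.recordFailure();
        }
        System.out.println(breaker.state());        // OPEN
        now[0] = 5000;
        System.out.println(breaker.allowRequest()); // true (trial request)
        breaker.recordSuccess();
        System.out.println(breaker.state());        // CLOSED
    }
}
```

A caller would check allowRequest() before each remote call, return the fallback when it is false, and report the outcome of each real call through recordSuccess() or recordFailure().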

4.2. Hands-on practice

To control precisely whether a request succeeds or fails, add a piece of logic to the provider handler in feign-provider-modules:


    @GetMapping("/get/{id}")
    public DepartVO getHandle(@PathVariable("id") int id) {
        // Force a failure for id 1 so the circuit breaker can be tested
        if (id == 1) {
            throw new RuntimeException("Too busy");
        }
        return new DepartVO();
    }

This way, a request with id 1 always fails, and a request with any other id succeeds.

We prepare two request windows:

  • One request: http://localhost:8080/consumer/1, which always fails
  • One request: http://localhost:8080/consumer/2, which always succeeds

The default request threshold for tripping the breaker is 20, which is not easy to reach, and the default sleep window of 5 seconds is too short to observe. For ease of testing, we modify the circuit breaker policy in the consumer service configuration:

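feign:
  hystrix:
    enabled: true # Enable Feign's circuit breaker function
hystrix:
  command:
    default:
      execution.isolation.thread.timeoutInMilliseconds: 2000
      circuitBreaker:
        errorThresholdPercentage: 50 # Error percentage threshold that trips the breaker, default 50%
        sleepWindowInMilliseconds: 10000 # Sleep window after the breaker trips, default 5 seconds
        requestVolumeThreshold: 10 # Minimum number of requests before the breaker can trip, default 20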

Explanation:

  • requestVolumeThreshold: the minimum number of requests needed before the circuit breaker can trip. The default is 20; here it is set to 10 so that it is easy to trigger.
  • errorThresholdPercentage: the minimum percentage of failed requests that trips the circuit breaker. The default is 50%.
  • sleepWindowInMilliseconds: the sleep window. The default is 5000 milliseconds; here it is set to 10000 milliseconds to make the circuit breaker behavior easier to observe.
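
How the two thresholds combine can be checked with a little arithmetic. The helper below is a hypothetical illustration, not a Hystrix API: shouldTrip and its parameter names are made up for this sketch, and the request counts are assumed to come from a single statistics window.

```java
class TripCheck {

    /** True only when there are enough requests and the error percentage reaches the threshold. */
    static boolean shouldTrip(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests < requestVolumeThreshold) {
            return false; // too few requests to judge the error rate
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // requestVolumeThreshold = 10 and errorThresholdPercentage = 50, as configured above
        System.out.println(shouldTrip(9, 9, 10, 50));   // false: only 9 requests so far
        System.out.println(shouldTrip(10, 10, 10, 50)); // true: 10 requests, all failed
        System.out.println(shouldTrip(10, 4, 10, 50));  // false: 40% failed
    }
}
```

Under these settings, nine consecutive failures are not enough to open the breaker, which is why the test needs about 10 failing requests.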

When we repeatedly send the request with id 1 (about 10 times), the circuit breaker trips. It enters the open state and all requests are degraded.


Now, when you send the request with id 2, you will find that it also fails, and that the failure returns very quickly, in only about 20 milliseconds.


5. Hystrix core source code analysis


Origin blog.csdn.net/weixin_43811057/article/details/130630265