Spring Cloud 8: Introduction to and use of Hystrix circuit breaking and service degradation

One, Concepts

1. Avalanche effect

Each HTTP request a service sends occupies a thread in that service. When a downstream service hangs or the network is unreachable, that thread usually blocks until the request times out. Under even moderate concurrency, these blocked threads take up a lot of resources and can easily exhaust the resources of the machine hosting the calling microservice, causing it to hang as well.

If a service provider responds very slowly, a consumer calling it waits until the provider responds or the call times out. Under high concurrency, if nothing is done about this, the consumer's resources are exhausted and the entire system may collapse, as the crash of each layer brings down the layer above it.

Avalanche: the phenomenon of cascading failures caused by the failure of a basic service. The unavailability of a provider makes its consumers unavailable, and the unavailability is amplified step by step. Like a snowball, more and more services become unavailable and the impact keeps growing.

In short: a cascading failure caused by the failure of a basic service is an avalanche.

2. Fault tolerance mechanism

  • Set a timeout for network requests:

A timeout must be set for every network request. Calls normally respond within tens of milliseconds. If the service is unavailable or the network has a problem, the response time becomes very long and can grow to tens of seconds.

Each call corresponds to a thread or process. If the response takes a long time, the thread is not released for a long time, and each thread holds system resources, including CPU and memory. The more threads that cannot be released, the more resources are consumed, which eventually crashes the system.

Therefore, a timeout must be set so that resources are released as soon as possible.
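As a minimal sketch of the point above, the following plain-JDK example sets both a connect timeout and a read timeout on an outbound HTTP call. The class name TimeoutHttp and the URL are assumptions made for illustration; the article itself uses RestTemplate, which is configured differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Sketch: give every outbound HTTP call explicit connect and read timeouts.
final class TimeoutHttp {

    // Creates (but does not yet connect) a connection with both timeouts set.
    // HttpURLConnection defaults to 0 for both, which means "wait indefinitely".
    static HttpURLConnection open(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectTimeoutMs); // max time to establish the connection
            conn.setReadTimeout(readTimeoutMs);       // max time to wait for response data
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical provider address, used only for illustration.
        HttpURLConnection conn = open("http://euk-client1:7002/getOrder", 1000, 1000);
        System.out.println(conn.getConnectTimeout() + " ms / " + conn.getReadTimeout() + " ms");
    }
}
```

With these two values set, a thread blocked on a dead provider is released after roughly the configured time instead of waiting indefinitely.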

  • Use circuit breaker mode:

Think of the circuit breaker at home that trips. If there is a short circuit, or high-power appliances push the load beyond what the circuit can carry, the breaker trips. If it did not trip, the circuit would burn out and the damage would spread to other households, leaving them without power too. Tripping keeps the circuit safe. Once the short circuit or the overload has been resolved, the breaker is closed again.

A fault in the circuit of your own home then does not affect the circuits of every household in the community.

3. Circuit breaker

If a large number of requests to a microservice time out (which usually indicates that the service is unavailable), there is no point in letting new requests reach the service; doing so only wastes resources. For example, with a timeout of 1s, if a large number of requests within a short period fail to respond within 1s, there is no need to keep requesting the dependent service.

  1. The circuit breaker is a proxy for operations that are prone to failure. This proxy counts the failures within a period of time and, based on that count, decides whether to call the dependent service normally or to return directly.

  2. The circuit breaker enables fast failure. If it detects many similar errors (such as timeouts) within a period of time, it forces calls to the service to fail fast for a while, that is, it stops requesting the called service. Consumers then no longer waste resources waiting for long timeouts.

  3. The circuit breaker can also automatically detect whether the dependent service has recovered. If the dependent service is back to normal, requests to it resume. When the circuit breaker tries to close again is determined by the reset time.

    In this way, microservices "self-heal": when the dependent service is unavailable, the circuit breaker opens so that calls fail fast, which prevents an avalanche. When the dependent service returns to normal, requests resume.

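The logic of the circuit breaker state transitions:

Closed state: Under normal conditions the circuit breaker is closed, and the dependent service can be requested normally.

Open state: When the request failure rate reaches a certain threshold within a period of time, the circuit breaker opens. Requests no longer reach the dependent service; the caller returns directly and no real call is made. After the reset time has elapsed, the circuit breaker enters half-open mode.

Half-open state: After being open for a period of time, the circuit breaker automatically enters "half-open mode". In this state it allows one request through to the dependent service. If that request succeeds (or the success ratio reaches a certain level), the circuit breaker closes and normal access resumes. Otherwise it stays open.

An open circuit breaker ensures that callers get a fast result when calling a failing service, avoids large numbers of synchronous waits, and reduces the caller's resource consumption. After being open for a while, the circuit breaker keeps probing the results of requests to decide whether it can close and restore normal calls.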
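The three states and their transitions can be sketched in plain Java as follows. This is a simplified illustration, not Hystrix's implementation: the class name SimpleCircuitBreaker is hypothetical, it counts consecutive failures instead of a failure rate, and the clock is injected only so that the behavior can be checked without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Minimal circuit breaker sketch: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;  // consecutive failures that open the breaker
    private final long resetTimeoutMs;   // how long to stay OPEN before a trial call
    private final LongSupplier clock;    // current time in milliseconds

    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long resetTimeoutMs, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMs = resetTimeoutMs;
        this.clock = clock;
    }

    synchronized State state() {
        // Once the reset time has passed, an OPEN breaker allows a trial request.
        if (state == State.OPEN && clock.getAsLong() - openedAt >= resetTimeoutMs) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get();       // fail fast: no real call is made
        }
        try {
            T result = remoteCall.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        failures = 0;
        state = State.CLOSED;            // a successful trial call closes the breaker
    }

    private synchronized void onFailure() {
        failures++;
        // A failed trial call, or too many failures while CLOSED, opens the breaker.
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

For brevity the sketch does not limit the half-open state to exactly one in-flight trial request; a production breaker would.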

4. Downgrade

Degradation means appropriately abandoning some services when overall resources are insufficient, so that the main resources go to the core services; the closed services are re-enabled once the difficulty has passed. This ensures the stability of the core services of the system. When a service is stopped, calls to it automatically enter the fallback, which replaces the main method.

The fallback method executes in place of the main method and returns a result, degrading the failed service. When the number of failed calls to a service exceeds the threshold of the circuit breaker within a period of time, the circuit breaker opens; no real calls are made, and requests fail fast and execute the fallback logic directly. Service degradation protects the logic of the service caller.
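A rough sketch of caller-side degradation, under the assumption that an operator or a load monitor decides which non-core services to switch off. The class DegradeSwitch and the service names are hypothetical and not part of Hystrix.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch: non-core services can be switched off under load, and calls to
// them go straight to the fallback instead of the main method.
final class DegradeSwitch {
    private final Set<String> degraded = ConcurrentHashMap.newKeySet();

    void degrade(String service) { degraded.add(service); }     // abandon the service for now
    void restore(String service) { degraded.remove(service); }  // re-enable it later

    <T> T call(String service, Supplier<T> mainMethod, Supplier<T> fallback) {
        if (degraded.contains(service)) {
            return fallback.get();   // switched off: skip the main method entirely
        }
        try {
            return mainMethod.get();
        } catch (RuntimeException e) {
            return fallback.get();   // main method failed: degrade this single call
        }
    }
}
```

For example, after `degrade("recommendation")` every call registered under that name returns its fallback until `restore("recommendation")` is invoked, while core services keep running normally.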

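Circuit breaking and degradation:

What they have in common:
	1. Both aim to prevent the system from crashing and to ensure the availability and reliability of the main functions.
	2. In both cases users find that some functions cannot be used.
Differences:
	1. Circuit breaking is triggered by a failure in a downstream service; the service that is cut off caused the trouble itself.
	2. Degradation is triggered by the caller out of load considerations; the degraded service is abandoned through no fault of its own.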

Two, Hystrix

1. What is Hystrix

Hystrix is a fault-tolerance component that implements the timeout mechanism and the circuit breaker pattern.

Hystrix is an open source library from Netflix used to isolate access to remote systems, services, and third-party libraries to prevent cascading failures, thereby improving system availability and fault tolerance. Its main functions are:

  1. Protection mechanism: provides protection and control for the system when dependent services suffer high latency or failure.
  2. Prevent avalanches.
  3. Request wrapping: HystrixCommand (or HystrixObservableCommand) wraps the logic that calls a dependency, and with the default thread isolation each command runs in a separate thread.
  4. Tripping mechanism: when the failure rate of a service reaches a certain threshold, Hystrix trips automatically and stops requesting the service for a period of time.
  5. Resource isolation: Hystrix maintains a small thread pool for each dependency. If the thread pool is full, requests to that dependency are rejected immediately instead of waiting in a queue, which speeds up failure detection and prevents cascading failures.
  6. Fast failure (fail fast) and fast recovery. The key point: the service is not actually requested and the caller does not wait for an exception to occur before returning; the call fails directly.
  7. Monitoring: Hystrix monitors operating metrics and configuration changes in near real time, supporting monitoring, alerting, and operational control.
  8. Fallback mechanism: when a request fails, times out, or is rejected, or when the circuit breaker is open, the fallback logic is executed. We write the fallback logic ourselves to provide graceful service degradation.
  9. Self-repair: after the circuit breaker has been open for a period of time, it automatically enters the "half-open" state; it switches between the open, closed, and half-open states as described earlier.
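The resource isolation in item 5 can be approximated with a plain JDK thread pool. The sketch below is an illustration of the idea, not Hystrix code; the class name Bulkhead is hypothetical. It uses a fixed-size pool with a SynchronousQueue so that a request is rejected immediately when every thread is busy.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch: one small, dedicated thread pool per dependency.
final class Bulkhead {
    private final ThreadPoolExecutor pool;

    Bulkhead(int poolSize) {
        // SynchronousQueue stores no tasks: a task is either handed to an idle
        // thread or the submission is rejected by AbortPolicy.
        pool = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), new ThreadPoolExecutor.AbortPolicy());
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        Callable<T> task = remoteCall::get;
        Future<T> future;
        try {
            future = pool.submit(task);
        } catch (RejectedExecutionException e) {
            return fallback.get();       // pool is full: reject at once, no queuing
        }
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get();       // the remote call itself failed
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Because each dependency gets its own Bulkhead, a slow dependency can only exhaust its own threads; callers of other dependencies are unaffected.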

2. Use Hystrix with RestTemplate

2.1, Eureka registration center eureka-server

application.yaml

spring:
  application:
    name: eureka-server
eureka:
  client:
    service-url:
      defaultZone: http://euk-server1:7001/eureka/
  instance:
    hostname: euk-server1
server:
  port: 7001

Startup class:

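import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

// Starts the Eureka registry that the provider and the consumers register with.
@SpringBootApplication
@EnableEurekaServer
public class EurekaServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(EurekaServerApplication.class, args);
    }
}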

2.2, Eureka service provider eureka-provider

application.yaml

spring:
  application:
    name: eureka-provider
eureka:
  client:
    service-url:
      defaultZone: http://euk-server1:7001/eureka/
  instance:
    hostname: euk-client1
server:
  port: 7002

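Order service class

import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OrderController {

    // Port this provider instance listens on, injected from server.port.
    @Value("${server.port}")
    private int value;

    @RequestMapping("/getOrder")
    public String getOrder() {
        return "I am the order service, my port is: " + value;
    }
}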

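Startup class

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

// Registers the provider with the Eureka server on startup.
@SpringBootApplication
@EnableEurekaClient
public class EurekaProviderApplication {

    public static void main(String[] args) {
        SpringApplication.run(EurekaProviderApplication.class, args);
    }
}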

2.3, Eureka service consumer eureka-consumer

First add the Hystrix dependency

<dependency>
     <groupId>org.springframework.cloud</groupId>
     <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
     <version>2.2.5.RELEASE</version>
</dependency>

application.yaml

spring:
  application:
    name: eureka-consumer
eureka:
  client:
    service-url:
      defaultZone: http://euk-server1:7001/eureka/
  instance:
    hostname: euk-client2
server:
  port: 7008

RestTemplate configuration class RestTemplateConfiguration.java

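import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestTemplateConfiguration {

    // @LoadBalanced lets the RestTemplate resolve service names such as
    // "eureka-provider" through the registry instead of DNS.
    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}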

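The class that fetches the order, OrderController.java, where @HystrixCommand(fallbackMethod = "fallback") names the method that is executed when the call fails and the service is degraded.

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class OrderController {

    @Autowired
    private RestTemplate restTemplate;

    // If the remote call fails or times out, Hystrix invokes fallback() instead.
    @GetMapping("/order")
    @HystrixCommand(fallbackMethod = "fallback")
    public String getOrder() {
        return restTemplate.getForObject("http://eureka-provider/getOrder", String.class);
    }

    // Must have the same parameters and return type as getOrder().
    public String fallback() {
        return "Sorry, the order request failed!";
    }
}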

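Startup class, which enables Hystrix through @EnableHystrix:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.EnableHystrix;

@EnableHystrix
@SpringBootApplication
public class EurekaConsumerApplication {

    public static void main(String[] args) {
        SpringApplication.run(EurekaConsumerApplication.class, args);
    }
}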

2.4, Start everything normally and call the order service: http://euk-client2:7008/order

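2.5, Set a timeout of 1s for calls to the order service, then make the order service sleep for 2s

Set the timeout to 1s on the consumer:

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class OrderController {

    @Autowired
    private RestTemplate restTemplate;

    // The command times out after 1000 ms and then runs fallback().
    @GetMapping("/order")
    @HystrixCommand(fallbackMethod = "fallback", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
    })
    public String getOrder() {
        return restTemplate.getForObject("http://eureka-provider/getOrder", String.class);
    }

    public String fallback() {
        return "Sorry, the order request failed!";
    }
}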

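Make the order service sleep for 2s:

import java.util.concurrent.TimeUnit;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OrderController {

    @Value("${server.port}")
    private int value;

    @RequestMapping("/getOrder")
    public String getOrder() {
        try {
            // Simulate a slow provider: longer than the consumer's 1s timeout.
            TimeUnit.SECONDS.sleep(2);
        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of swallowing the interruption.
            Thread.currentThread().interrupt();
        }
        return "I am the order service, my port is: " + value;
    }
}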

2.6, Then request the order again

http://euk-client2:7008/order

The content of fallback() is returned, which shows that the service has been degraded.
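What happens in this experiment can be imitated without Spring or Hystrix. The sketch below is an approximation built on a plain ExecutorService; the names TimeoutFallbackDemo, callWithTimeout, and slowOrder are made up for illustration. The caller waits at most the timeout, then interrupts the worker and returns the fallback value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class TimeoutFallbackDemo {

    // Runs the call on a separate thread and falls back if it is too slow or fails.
    static String callWithTimeout(Supplier<String> remoteCall, long timeoutMs,
                                  Supplier<String> fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<String> task = remoteCall::get;
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);         // interrupt the worker that is still waiting
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get();       // the call threw an exception
        } finally {
            executor.shutdownNow();
        }
    }

    // Stand-in for the provider's getOrder(), which sleeps before answering.
    static String slowOrder(long sleepMs) {
        try {
            Thread.sleep(sleepMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "order";
    }

    public static void main(String[] args) {
        // Same proportions as the experiment: 1s timeout, 2s provider delay.
        System.out.println(callWithTimeout(() -> slowOrder(2000), 1000, () -> "fallback"));
    }
}
```

Creating an executor per call keeps the sketch short; Hystrix instead reuses a thread pool per dependency.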

3. Use Hystrix with Feign

Keep the eureka-server and eureka-provider from the example above unchanged, and create a new module, eureka-consumer-feign

3.1, First import the Hystrix dependency

<dependency>
      <groupId>org.springframework.cloud</groupId>
      <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>

3.2, Feign comes with Hystrix support, but it is not enabled by default, so enable it first

application.yaml

spring:
  application:
    name: eureka-consumer-feign
eureka:
  client:
    service-url:
      defaultZone: http://euk-server1:7001/eureka/
  instance:
    hostname: localhost
server:
  port: 7009
feign:
  hystrix:
    enabled: true

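3.3, Configure the Feign interface for the remote call and specify the fallback class

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;

// Calls to eureka-provider that fail are answered by OrderFallback.
@FeignClient(name = "eureka-provider", fallback = OrderFallback.class)
public interface OrderInterface {
    @GetMapping("/getOrder")
    String order();
}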

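3.4, Create the fallback class OrderFallback.java

import org.springframework.stereotype.Component;

// Must be a Spring bean and implement the Feign interface it stands in for.
@Component
public class OrderFallback implements OrderInterface {
    @Override
    public String order() {
        return "Sorry, the order request failed!";
    }
}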

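3.5, Call the remote service to fetch the order

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class OrderController {

    @Autowired
    private OrderInterface orderInterface;

    @GetMapping("/order")
    public String order() {
        return orderInterface.order();
    }
}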

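3.6, Configure the startup class

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.openfeign.EnableFeignClients;

@SpringBootApplication
@EnableEurekaClient
@EnableFeignClients
@EnableCircuitBreaker
public class EurekaConsumerFeignApplication {

    public static void main(String[] args) {
        SpringApplication.run(EurekaConsumerFeignApplication.class, args);
    }
}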

3.7, Call the service

Because the default Hystrix timeout is 1s and the getOrder service we call sleeps for 2 seconds, the call goes to the fallback.

4. Use FallbackFactory to capture the exception behind a fallback

Make the following changes to the example above:

OrderFallbackFactory.java

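import feign.hystrix.FallbackFactory;
import org.springframework.stereotype.Component;

// Receives the exception that triggered the fallback and passes it on.
@Component
public class OrderFallbackFactory implements FallbackFactory<OrderFallback> {
    @Override
    public OrderFallback create(Throwable throwable) {
        return new OrderFallback(throwable);
    }
}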

OrderFallback.java

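import org.springframework.stereotype.Component;

@Component
public class OrderFallback implements OrderInterface {

    // Cause of the failed call; null when created through the no-arg constructor.
    private Throwable throwable;

    public OrderFallback() {
    }

    public OrderFallback(Throwable throwable) {
        this.throwable = throwable;
    }

    @Override
    public String order() {
        if (throwable != null) {
            // Expose the captured exception so the caller can see why the call failed.
            return throwable.toString();
        }
        return "Sorry, the order request failed!";
    }
}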

OrderInterface.java

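import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.RequestMapping;

//@FeignClient(name = "eureka-provider", fallback = OrderFallback.class)
@FeignClient(name = "eureka-provider", fallbackFactory = OrderFallbackFactory.class)
public interface OrderInterface {
    @RequestMapping("/getOrder")
    String order();
}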

Invoke the service again. The response now contains the exception information from the Throwable that was passed to the fallback.
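The idea behind the fallback factory can be shown in a few lines of plain Java. This sketch is not the Feign API: the interface OrderClient and the method withFallbackFactory are hypothetical and only illustrate how handing the Throwable to a factory lets the fallback report the cause.

```java
import java.util.function.Function;

// Plain-Java sketch of the fallback-factory idea (not the Feign API itself).
final class FallbackFactorySketch {

    interface OrderClient {
        String order();
    }

    // Wraps a client so that any failure is handed to the factory, which
    // builds a fallback instance that knows the cause of the failure.
    static OrderClient withFallbackFactory(OrderClient target,
                                           Function<Throwable, OrderClient> factory) {
        return () -> {
            try {
                return target.order();
            } catch (RuntimeException cause) {
                return factory.apply(cause).order();
            }
        };
    }

    public static void main(String[] args) {
        OrderClient failing = () -> {
            throw new IllegalStateException("provider timed out");
        };
        OrderClient client = withFallbackFactory(failing,
                cause -> () -> "order failed: " + cause.getMessage());
        System.out.println(client.order());
    }
}
```

A plain fallback class has no access to the exception; the factory variant trades a little extra code for that diagnostic information.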

5, Hystrix configuration

5.1, HystrixProperty

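1. Execution:
Controls the execution of HystrixCommand.run().
Meaning of each property:
execution.isolation.strategy: sets the isolation strategy used to execute HystrixCommand.run(). The default is THREAD.
execution.isolation.thread.timeoutInMilliseconds: configures the execution timeout of HystrixCommand, in milliseconds.
execution.timeout.enabled: configures whether a timeout is applied to the execution of HystrixCommand.run(). The default is true.
execution.isolation.thread.interruptOnTimeout: configures whether HystrixCommand.run() should be interrupted when it times out.
execution.isolation.thread.interruptOnCancel: configures whether HystrixCommand.run() should be interrupted when its execution is cancelled.
execution.isolation.semaphore.maxConcurrentRequests: when the isolation strategy of a HystrixCommand is semaphore, configures the size of the semaphore. Once the number of concurrent requests reaches this value, subsequent requests are rejected.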
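Semaphore isolation, as governed by execution.isolation.semaphore.maxConcurrentRequests, can be pictured with java.util.concurrent.Semaphore. This is a rough illustration rather than Hystrix's own code, and the class name SemaphoreIsolation is hypothetical. Unlike thread isolation, the command runs on the caller's own thread.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Sketch: limit concurrent executions with permits instead of a thread pool.
final class SemaphoreIsolation {
    private final Semaphore permits;

    SemaphoreIsolation(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    // Runs the command on the calling thread if a permit is free;
    // otherwise rejects immediately without waiting.
    <T> T call(Supplier<T> command, Supplier<T> onRejected) {
        if (!permits.tryAcquire()) {
            return onRejected.get();
        }
        try {
            return command.get();
        } finally {
            permits.release();           // always give the permit back
        }
    }
}
```

Because no extra thread is involved, this approach is cheaper than thread isolation, but the command cannot simply be abandoned by the caller when it runs too long.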

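2. Fallback:
Controls the execution of HystrixCommand.getFallback().
fallback.isolation.semaphore.maxConcurrentRequests: sets the maximum number of concurrent requests allowed to execute HystrixCommand.getFallback() from calling threads. Once the maximum is reached, subsequent requests are rejected and an exception is thrown.
fallback.enabled: sets whether the service degradation strategy is enabled; the default is true. If set to false, HystrixCommand.getFallback() is not called to run the degradation logic when a request fails or is rejected.
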
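3. Circuit Breaker: controls the behavior of HystrixCircuitBreaker.
circuitBreaker.enabled: determines whether a circuit breaker is used to track health and to short-circuit requests when commands fail. The default is true.
circuitBreaker.requestVolumeThreshold: sets the minimum number of requests in the rolling window before the circuit breaker can trip. For example, with the default of 20, if only 19 requests arrive within the rolling window (10 seconds by default), the circuit breaker does not open even if all 19 of them fail.
circuitBreaker.sleepWindowInMilliseconds: sets the sleep window after the circuit breaker opens. When the sleep window ends, the circuit breaker is set to "half-open" and a trial request is let through. If the request still fails, the circuit breaker goes back to "open"; if it succeeds, the circuit breaker is set to "closed".
circuitBreaker.errorThresholdPercentage: sets the error percentage at which the circuit breaker opens. The default is 50: within the rolling window, provided the number of requests reaches requestVolumeThreshold, the circuit breaker is set to "open" if the error percentage reaches 50, and otherwise stays "closed".
circuitBreaker.forceOpen: false by default. If set to true, the circuit breaker is forced into the "open" state and rejects all requests. This property takes precedence over forceClosed.
circuitBreaker.forceClosed: false by default. If set to true, the circuit breaker is forced into the "closed" state and accepts all requests. If forceOpen is true, this property has no effect.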
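4. Metrics: properties related to the metrics captured during the execution of HystrixCommand and HystrixObservableCommand.
metrics.rollingStats.timeInMilliseconds: sets the length of the rolling window, in milliseconds. This is how long the circuit breaker collects information when judging health. While collecting metrics, the circuit breaker splits the window into several buckets and accumulates each metric per bucket; each bucket records the metrics for one slice of time. For example, with the default of 10000 milliseconds, the window is split into 10 buckets by default, each recording 1000 milliseconds of metrics.
metrics.rollingStats.numBuckets: sets the number of "buckets" the rolling window is divided into for statistics. The default is 10.
metrics.rollingPercentile.enabled: sets whether command execution latency is tracked and computed with percentiles. The default is true; if set to false, all summary statistics return -1.
metrics.rollingPercentile.timeInMilliseconds: sets the duration of the rolling window for percentile statistics, in milliseconds.
metrics.rollingPercentile.numBuckets: sets the number of buckets used in the rolling window for percentile statistics.
metrics.rollingPercentile.bucketSize: sets the maximum number of executions kept in each "bucket".
metrics.healthSnapshot.intervalInMilliseconds: sets the interval to wait between health snapshots, which affect the state of the circuit breaker.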
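The bucketed rolling window behind metrics.rollingStats.timeInMilliseconds and metrics.rollingStats.numBuckets can be sketched as a ring of counters. This is a simplified illustration with a hypothetical class name, RollingCounter, and an injected clock; Hystrix's real metrics are considerably richer.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

// Sketch: count events over a rolling window that is split into fixed buckets.
final class RollingCounter {
    private final long bucketMs;
    private final long[] counts;       // one slot per bucket, reused as a ring
    private final long[] bucketStart;  // start time of the bucket held in each slot
    private final LongSupplier clock;  // current time in milliseconds

    RollingCounter(long windowMs, int numBuckets, LongSupplier clock) {
        if (numBuckets <= 0 || windowMs % numBuckets != 0) {
            throw new IllegalArgumentException("window must divide evenly into buckets");
        }
        this.bucketMs = windowMs / numBuckets;
        this.counts = new long[numBuckets];
        this.bucketStart = new long[numBuckets];
        Arrays.fill(bucketStart, -1);
        this.clock = clock;
    }

    synchronized void increment() {
        long now = clock.getAsLong();
        long start = now - (now % bucketMs);
        int slot = (int) ((now / bucketMs) % counts.length);
        if (bucketStart[slot] != start) {   // slot still holds an expired bucket
            bucketStart[slot] = start;
            counts[slot] = 0;
        }
        counts[slot]++;
    }

    // Sum of all buckets that still fall inside the rolling window.
    synchronized long sum() {
        long now = clock.getAsLong();
        long currentStart = now - (now % bucketMs);
        long oldestStart = currentStart - bucketMs * (counts.length - 1);
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (bucketStart[i] >= oldestStart && bucketStart[i] <= currentStart) {
                total += counts[i];
            }
        }
        return total;
    }
}
```

With a 10000 ms window and 10 buckets, an event recorded at time 0 stops counting once the clock reaches 10000 ms, because its bucket has then rolled out of the window.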

5. Request Context: settings for how HystrixCommand uses HystrixRequestContext.
requestCache.enabled: configures whether request caching is enabled.
requestLog.enabled: sets whether the execution and events of HystrixCommand are logged to HystrixRequestLog.

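5.2, Default configuration values in the classes

HystrixCommandProperties.java

/* -------------- Metrics ------------------ */
// Length of the rolling statistical window. Default: 10000 ms
private final HystrixProperty<Integer> metricsRollingStatisticalWindowInMilliseconds;
// Number of buckets in the statistical window. Default: 10, that is, one bucket per second
private final HystrixProperty<Integer> metricsRollingStatisticalWindowBuckets;
// Whether percentile (latency) statistics are tracked. Default: true
private final HystrixProperty<Boolean> metricsRollingPercentileEnabled;
/* -------------- Circuit breaker ------------------ */
// Request volume threshold for the circuit breaker. Default: 20. At least 20 requests must arrive within
// metricsRollingStatisticalWindowInMilliseconds (10s by default) before the circuit breaker can trip
private final HystrixProperty<Integer> circuitBreakerRequestVolumeThreshold;
// Sleep window. Default: 5 seconds. After blocking requests for 5 seconds the circuit breaker becomes half-open and
// lets one request through; if it succeeds the breaker closes, otherwise it waits for another sleep window
private final HystrixProperty<Integer> circuitBreakerSleepWindowInMilliseconds;
// Whether the circuit breaker is enabled. Default: true
private final HystrixProperty<Boolean> circuitBreakerEnabled;
// Default: 50%. The circuit breaker trips once the error rate reaches 50%
private final HystrixProperty<Integer> circuitBreakerErrorThresholdPercentage;
// Whether to force the circuit breaker open and block all requests. Default: false.
// When true, all requests are rejected and go straight to the fallback
private final HystrixProperty<Boolean> circuitBreakerForceOpen;
// Whether to force the circuit breaker closed so that it ignores errors. Default: false
private final HystrixProperty<Boolean> circuitBreakerForceClosed;
/* -------------- Semaphores ------------------ */
// Maximum concurrency of command execution when semaphore isolation is used. Default: 10
private final HystrixProperty<Integer> executionIsolationSemaphoreMaxConcurrentRequests;
// Maximum concurrency of fallback (degradation) execution. Default: 10
private final HystrixProperty<Integer> fallbackIsolationSemaphoreMaxConcurrentRequests;
/* -------------- Other ------------------ */
// Isolation strategy for command execution. Default: thread isolation, ExecutionIsolationStrategy.THREAD
private final HystrixProperty<ExecutionIsolationStrategy> executionIsolationStrategy;
// Execution timeout when thread isolation is used. Default: 1 second
private final HystrixProperty<Integer> executionIsolationThreadTimeoutInMilliseconds;
// Thread pool key, which decides in which thread pool the command runs
private final HystrixProperty<String> executionIsolationThreadPoolKeyOverride;
// Whether the fallback (degradation) strategy is enabled. Default: true
private final HystrixProperty<Boolean> fallbackEnabled;
// With thread isolation, whether to interrupt (Thread.interrupt()) the thread of a command that timed out. Default: true
private final HystrixProperty<Boolean> executionIsolationThreadInterruptOnTimeout;
// Whether the request log is enabled. Default: true
private final HystrixProperty<Boolean> requestLogEnabled;
// Whether request caching is enabled. Default: true
private final HystrixProperty<Boolean> requestCacheEnabled;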

HystrixThreadPoolProperties.java

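/* Thread pool size. Default: 10 */
private final HystrixProperty<Integer> corePoolSize;
/* Length of the waiting queue of the thread pool. Default: -1, which is also the recommended value and means requests are rejected immediately instead of waiting. Tests show that direct rejection combined with a suitably sized, non-shrinking thread pool is the most efficient, so changing this value is not recommended. With a non-shrinking thread pool, the queueSizeRejectionThreshold and keepAliveTimeMinutes parameters have no effect. */
private final HystrixProperty<Integer> maxQueueSize;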
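To see why a maxQueueSize of -1 means "reject instead of wait", it helps to look at the kind of queue such a value maps to. The helper below is a hypothetical sketch of that mapping using JDK queues only, under the assumption that a non-positive size selects a SynchronousQueue and a positive size a bounded LinkedBlockingQueue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Sketch: choose the work queue of a thread pool from a maxQueueSize setting.
final class QueueChoice {

    static BlockingQueue<Runnable> queueFor(int maxQueueSize) {
        if (maxQueueSize <= 0) {
            // Holds no elements: a task is handed to an idle thread or refused.
            return new SynchronousQueue<>();
        }
        // Tasks may wait here until a thread becomes free, up to the given limit.
        return new LinkedBlockingQueue<>(maxQueueSize);
    }
}
```

With the SynchronousQueue variant, an offer made while no worker thread is waiting fails at once, which is what turns a full thread pool into an immediate rejection.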

6. Hystrix Dashboard monitoring system

Hystrix provides near-real-time call monitoring (Hystrix Dashboard). Hystrix continuously records the execution information of all requests issued through Hystrix and presents it to users as statistical reports and charts, including how many requests are executed per second, how many succeed, how many fail, and so on.

Netflix implements the monitoring of these metrics through the hystrix-metrics-event-stream project. Spring Cloud also integrates Hystrix Dashboard, which turns the monitoring data into a visual interface so that users can directly see the state of services and clusters. In practice it is often used together with Turbine.

Setting up Hystrix Dashboard is simple and takes three steps:

6.1, add Hystrix Dashboard dependency

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
</dependency>

6.2, Set the port in application.yaml

server:
  port: 7010

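6.3, Set up the startup class

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;

@SpringBootApplication
@EnableHystrixDashboard
public class HystrixDashboardApplication {

    public static void main(String[] args) {
        SpringApplication.run(HystrixDashboardApplication.class, args);
    }
}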

Visit http://localhost:7010/hystrix

Three, Closing notes

The example code used in this article has been uploaded to gitee:

https://gitee.com/songbozhao/spring-cloud-hystrix-test

Origin blog.csdn.net/u013277209/article/details/110975722