Introduction to Spring Cloud, with a detailed look at Ribbon load balancing and Hystrix service fault tolerance

Overview of Spring Cloud

Introduction

Spring Cloud is a framework for developing microservice architectures, built on Spring Boot. It provides a simple development model for the operations a microservice architecture involves: configuration management, service governance, circuit breakers, intelligent routing, micro-proxies, control buses, global locks, leadership election, distributed sessions, and cluster state management.

Spring Cloud consists of multiple sub-projects, each targeting a different open source product used in distributed systems, and new ones may be added over time. The main ones are described below.

  • Spring Cloud Config: a configuration management tool that can store configuration content in Git. It externalizes application configuration and supports refreshing configuration on clients, encrypting and decrypting configuration content, and so on.

  • Spring Cloud Netflix: the core components, integrating several Netflix OSS open source projects

    • Eureka: the service governance component, including the service registry and the service registration and discovery mechanism.
    • Ribbon: the service invocation component that provides client-side load balancing.
    • Hystrix: a fault-tolerance component that implements the circuit breaker pattern. It helps services tolerate latency in their dependencies and provides strong fault tolerance against failures.
    • Feign: a declarative service call component based on Ribbon and Hystrix.
    • Zuul: a gateway component that provides functions such as intelligent routing and access filtering.
  • Spring Cloud Bus: an event and message bus used to propagate state changes or events across the cluster and trigger follow-up processing, such as dynamically refreshing configuration.


Release Notes

Unlike most other projects in the Spring community, Spring Cloud is not a single self-contained project. It is a large umbrella project with many sub-projects, and can be seen as a comprehensive suite of solutions for microservice architectures. Each sub-project is updated and iterated independently and maintains its own release version number, so each Spring Cloud release contains sub-projects at different versions. To manage the list of sub-project versions in each release, and to avoid confusing the Spring Cloud version with the version numbers of its sub-projects, releases are identified by name rather than by number.

The release names are taken from London Underground stations, and their alphabetical order matches the chronological order of the releases. For example, the earliest release is Angel and the second is Brixton.

When a release has accumulated enough changes, or a fix for a serious bug is available, a "service release" is published. It is labeled SRX, where X is an increasing number, so Brixton.SR5 is the fifth service release of Brixton.
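
The naming scheme above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the class `ReleaseTrainVersion` is hypothetical (it is not part of Spring Cloud), and any qualifier that does not start with `SR`, such as `RELEASE`, is treated as service release 0.

```java
// Hypothetical helper for illustration only; not a Spring Cloud class.
final class ReleaseTrainVersion implements Comparable<ReleaseTrainVersion> {
    final String train;        // release name, e.g. "Brixton"
    final int serviceRelease;  // X in "SRX"; 0 when there is no SR qualifier

    private ReleaseTrainVersion(String train, int serviceRelease) {
        this.train = train;
        this.serviceRelease = serviceRelease;
    }

    // Parses strings such as "Brixton.SR5" or "Brixton.RELEASE".
    static ReleaseTrainVersion parse(String version) {
        String[] parts = version.split("\\.");
        int sr = 0;
        if (parts.length > 1 && parts[1].startsWith("SR")) {
            sr = Integer.parseInt(parts[1].substring(2));
        }
        return new ReleaseTrainVersion(parts[0], sr);
    }

    // Alphabetical order of the name gives chronological order of the releases;
    // within one release, a higher SR number is newer.
    @Override
    public int compareTo(ReleaseTrainVersion other) {
        int byName = train.compareTo(other.train);
        return byName != 0 ? byName : Integer.compare(serviceRelease, other.serviceRelease);
    }
}
```

With this sketch, `ReleaseTrainVersion.parse("Brixton.SR5")` has the name `Brixton` and service release 5, and any Angel version sorts before any Brixton version.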

Spring Cloud Version | Spring Boot Version
Hoxton               | 2.2.x
Greenwich            | 2.1.x
Finchley             | 2.0.x
Edgware              | 1.5.x
Dalston              | 1.5.x

The difference between Spring Cloud and Spring Boot:

  • They serve different purposes:

    • Spring Boot is a rapid development framework that replaces XML configuration with annotations. It embeds a Servlet container so that the application runs as an ordinary Java application. Its job is to provide default configuration and simplify the configuration process.
    • Spring Cloud is a collection of frameworks built on Spring Boot. Its job is to provide comprehensive management for microservices.
  • They are used differently:

    • Spring Boot can be used on its own.
    • Spring Cloud can only be used on top of Spring Boot.
  • Both Spring Boot and Spring Cloud come from the Spring ecosystem, but they were designed with different goals:

    • Spring Boot is designed to simplify configuration and improve efficiency when developing an individual microservice.
    • Spring Cloud is designed to manage the microservices that make up one project.

    The two therefore solve different problems, even though Spring Cloud builds on Spring Boot.


Load Balancing Ribbon

Ribbon is a client-side load balancing tool.

In a real environment, microservices are usually deployed as clusters, so the service list contains multiple instances of the same service. A load balancing algorithm is then needed to pick one instance from the list for each call.
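
To make the idea concrete, here is a minimal round-robin selector in plain Java. It is a sketch of the general technique only, not Ribbon's implementation; the class `RoundRobinChooser` and the `host:port` strings are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of Ribbon.
final class RoundRobinChooser {
    private final AtomicInteger counter = new AtomicInteger();

    // Returns the next instance in rotation; safe to call from several threads.
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Called repeatedly with the same two-element list, the chooser alternates between the two instances. Ribbon packages this kind of logic behind its load balancing rules, so application code does not have to write it.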

Ribbon configuration items

Global configuration uses the form ribbon.<key>=<value>:

ribbon:
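  ReadTimeout: 2500    # read (data transfer) timeout in ms; default 1000
  ConnectTimeout: 500    # connection timeout in ms; default 1000
  OkToRetryOnAllOperations: false    # whether to retry on all failed requests (connection errors and request errors); default false
  MaxAutoRetriesNextServer: 1    # maximum number of other instances to retry on, not counting the first call; default 1
  MaxAutoRetries: 0    # number of retries on the same instance, not counting the first call; default 0
  NFLoadBalancerRuleClassName: com.netflix.loadbalancer.RandomRule    # switch the load balancing rule to random; the default is round robin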

Per-service configuration uses the form <service name>.ribbon.<key>=<value>:

serverName:    # settings for one service only; replace serverName with the actual service name
  ribbon:
    ConnectTimeout: 5000
    ReadTimeout: 5000
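
As a rough way to reason about how the timeout and retry settings combine, the sketch below estimates how many attempts one logical call can make and how long it can take in the worst case. It assumes that retries are enabled and that every attempt runs into both the connect timeout and the read timeout. The class and method names are hypothetical and not part of Ribbon.

```java
// Hypothetical helper for illustration; not part of Ribbon.
final class RibbonTimeoutEstimate {

    // Each tried instance gets 1 call plus maxAutoRetries retries,
    // and 1 + maxAutoRetriesNextServer instances can be tried.
    static int maxAttempts(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
    }

    // Upper bound in milliseconds when every attempt hits both timeouts.
    static long worstCaseMillis(int connectTimeout, int readTimeout,
                                int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (long) (connectTimeout + readTimeout)
                * maxAttempts(maxAutoRetries, maxAutoRetriesNextServer);
    }
}
```

For the global settings shown above (ConnectTimeout 500, ReadTimeout 2500, MaxAutoRetries 0, MaxAutoRetriesNextServer 1), this gives 2 attempts and an upper bound of 6000 ms. Any timeout wrapped around the call, such as a Hystrix timeout, generally needs to be longer than this bound, otherwise the retries are cut short.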

Getting Started Case

  1. Dependency import

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-ribbon</artifactId>
    </dependency>
    
  2. Add the @LoadBalanced annotation to the configuration method of RestTemplate:

    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
    
  3. Call directly by service name

    @GetMapping("user/{id}")
    public User getUserById(@PathVariable long id) {
        // The host part is the service name, not a real address
        String url = "http://user-service" + "/user/" + id;
        User user = restTemplate.getForObject(url, User.class);
        return user;
    }
    

Principle of load balancing

  1. The @LoadBalanced annotation marks the RestTemplate as one that should be configured to use a LoadBalancerClient.
  2. Ribbon registers a LoadBalancerClient bean, so the @ConditionalOnBean(LoadBalancerClient.class) condition on the auto-configuration class LoadBalancerAutoConfiguration is satisfied.
  3. The auto-configuration adds a LoadBalancerInterceptor to the RestTemplate. This interceptor intercepts each request, uses the service ID to obtain the list of addresses for the service, and then uses the load balancing algorithm to select one address to call.
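
The effect of step 3 can be illustrated without Spring. The sketch below treats the host part of the URL as a service ID, looks up its instances, picks one in rotation, and rebuilds the URL with the real address. The class `ServiceUriResolver`, the in-memory registry map and the addresses are all assumptions made for the example; the real interceptor delegates these steps to the LoadBalancerClient.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for what the interceptor achieves; not Spring Cloud code.
final class ServiceUriResolver {
    private final Map<String, List<String>> registry; // service ID -> "host:port" entries
    private int next = 0;

    ServiceUriResolver(Map<String, List<String>> registry) {
        this.registry = registry;
    }

    // Replaces the service ID in the URI with one concrete instance address.
    synchronized URI resolve(URI original) {
        String serviceId = original.getHost();
        List<String> instances = registry.get(serviceId);
        if (instances == null || instances.isEmpty()) {
            throw new IllegalStateException("no instances for " + serviceId);
        }
        // Simple rotation; any load balancing algorithm could be used here.
        String instance = instances.get(Math.floorMod(next++, instances.size()));
        String query = original.getRawQuery() == null ? "" : "?" + original.getRawQuery();
        return URI.create(original.getScheme() + "://" + instance
                + original.getRawPath() + query);
    }
}
```

With a registry that maps `user-service` to `127.0.0.1:8081` and `127.0.0.1:8082`, resolving `http://user-service/user/7` twice yields `http://127.0.0.1:8081/user/7` and then `http://127.0.0.1:8082/user/7`.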



Service Fault Tolerance Hystrix

Concepts

Service failure avalanche effect:

In a microservice architecture the system is split into individual services by business function, and services call each other through RPC. In Spring Cloud, calls can be made with RestTemplate + LoadBalancerClient or with Feign. To ensure high availability, each service is usually deployed as a cluster. Even so, because of network problems or faults in the service itself, no service can be guaranteed to be 100% available. If a service has a problem, calls to it block the calling threads. Because services depend on one another, the fault propagates to the callers and can have catastrophic consequences for the whole microservice system. This is the avalanche effect of service faults.


Service downgrade:

In a distributed system, errors occur when multiple services call each other. If these errors are not handled gracefully, the whole project can be paralyzed, which must never happen. Service downgrade addresses this problem. It works like a try-catch block: when an error occurs, the code in the catch branch runs instead.

When server load rises sharply, some services and pages are deliberately downgraded according to the current business situation and traffic, which frees server resources and keeps the core tasks running normally.

Although service downgrade means the request fails, it does not block. At most it affects the resources in the thread pool assigned to the dependent service; calls to other services are not affected.
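
The try-catch analogy can be written down directly. The following is a minimal sketch in plain Java, with a hypothetical helper class `Fallbacks`; it is not how Hystrix is implemented, only the shape of the idea.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Hystrix.
final class Fallbacks {

    // Runs the primary call; if it throws, returns the downgraded result instead.
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The caller still gets a prompt answer rather than an error page.
            return fallback.get();
        }
    }
}
```

A call such as `Fallbacks.callWithFallback(() -> remoteLookup(id), () -> cachedDefault())` (both method names are placeholders) returns the live result when the remote call works and the default when it throws.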


Service circuit breaking:

Circuit breaking is a protective measure that prevents the whole system from failing when a service is overloaded for some reason, which is why it is often called overload protection. In many cases a system starts with only local, small-scale failures, but for various reasons the failure spreads further and further and eventually has global consequences.

In a distributed system, when the number of failed calls to a service reaches a threshold within a certain period, the circuit breaker opens and subsequent calls go straight to the downgrade method instead of the remote service. After a sleep period the breaker enters a "half-open" state and lets a trial call through. If the call succeeds, the breaker closes and all calls pass again; if it fails, the breaker opens again.

Applicable scenario: preventing an application from repeatedly calling remote services or shared resources that are likely to fail.


Hystrix

Hystrix is a latency and fault-tolerance library open sourced by Netflix. It isolates access to remote services and third-party libraries in order to prevent cascading failures.

Hystrix tackles the avalanche problem mainly through:

  • Thread isolation
  • Service circuit breaking

Hystrix configuration items

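hystrix:
  command:
    default:    # global default configuration
      execution:    # thread isolation settings
        timeout:
          enabled: true    # whether to apply a timeout to method execution; default true, usually left unchanged
        isolation:
          strategy: THREAD    # isolation strategy; THREAD (thread pool) is the default, the alternative is SEMAPHORE
          thread:
            timeoutInMilliseconds: 10000    # execution timeout; default 1000 ms, tune it for the actual scenario
      circuitBreaker:    # circuit breaker settings
        requestVolumeThreshold: 10    # minimum number of requests before the breaker can trip; default 20
        sleepWindowInMilliseconds: 10000    # sleep window in ms; default 5000
        errorThresholdPercentage: 50    # minimum percentage of failed requests that trips the breaker; default 50
    serverName:    # settings for one service only; replace serverName with the actual name (Hystrix matches it against the command key)
      execution:
        timeout:
          enabled: true
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 10000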

Thread isolation

Principle

Hystrix allocates a small, independent thread pool to each dependent service. A user request no longer calls the service directly on its own thread; it is handed to an idle thread in that pool. If the pool is full, the call is rejected immediately; otherwise a pool thread processes the request. The calling thread can apply a timeout while it waits for the result.
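
The principle can be approximated with the JDK's own executor classes. The sketch below is not Hystrix's implementation: the class `IsolatedCaller`, its parameters and the fallback handling are assumptions made for the example. It uses a fixed-size pool with no queue, so a call is rejected at once when every thread is busy, and the calling thread waits for the result with a timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: one instance per dependent service.
final class IsolatedCaller {
    private final ExecutorService pool;
    private final long timeoutMillis;

    IsolatedCaller(int poolSize, long timeoutMillis) {
        // SynchronousQueue holds no tasks: if no thread is free, submit() is rejected.
        this.pool = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), new ThreadPoolExecutor.AbortPolicy());
        this.timeoutMillis = timeoutMillis;
    }

    // Runs the call on a pool thread; returns the fallback on rejection, timeout or error.
    <T> T call(Callable<T> remoteCall, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(remoteCall);
        } catch (RejectedExecutionException e) {
            return fallback; // pool exhausted: fail fast instead of queueing
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so the thread is freed
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the remote call itself failed
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Because each dependency has its own small pool, a slow dependency can exhaust only its own threads. Calls to other dependencies keep their own pools and are unaffected.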


Getting Started Case

  1. Dependency import

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    </dependency>
    
  2. Add the @EnableCircuitBreaker annotation to the startup class to enable the circuit breaker

    Note: Combined annotation @SpringCloudApplication = @SpringBootApplication + @EnableDiscoveryClient + @EnableCircuitBreaker

    // @SpringBootApplication
    // @EnableDiscoveryClient
    // @EnableCircuitBreaker
    @SpringCloudApplication
    public class ConsumerApplication {

        public static void main(String[] args) {
            SpringApplication.run(ConsumerApplication.class, args);
        }

        @Bean
        @LoadBalanced
        public RestTemplate restTemplate() {
            return new RestTemplate();
        }
    }
    
  3. Write downgrade logic

    When a call to the target service fails, an error response is returned. The downgrade logic for the failure case has to be written in advance, and the @HystrixCommand annotation specifies which method handles the downgrade.

    @GetMapping("user/{id}")
    @HystrixCommand(fallbackMethod = "getUserByIdFallback")    // specify the downgrade method
    public User getUserById(@PathVariable long id) {
        // Use the service ID in place of the real address
        String url = "http://user-service" + "/user/" + id;
        User user = restTemplate.getForObject(url, User.class);
        return user;
    }

    /**
     * Downgrade method. Its parameters and return type must match the original method.
     */
    private User getUserByIdFallback(long id) {
        User user = new User(1L, "I am the fallback", 18, new Date());
        return user;
    }
    
  4. Test Results:

    • When the called service is working normally, the request behaves as before.

    • When the called service is down, the call returns the value produced by the downgrade method.


Default Fallback

You can add the @DefaultProperties annotation to a class to configure a default fallback for every @HystrixCommand method in it.

  • The default fallback method must have the same return type as the calling method and must take no parameters.
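import java.util.Date;

import com.netflix.hystrix.contrib.javanica.annotation.DefaultProperties;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
@RequestMapping("/consumer")
@DefaultProperties(defaultFallback = "defaultFallback")    // class-level default fallback
public class ConsumerController {

    @Autowired  // inject the RestTemplate
    private RestTemplate restTemplate;

    @GetMapping("user/{id}")
    @HystrixCommand    // no fallbackMethod given, so the class-level default applies
    public User getUserById(@PathVariable long id) {
        // Use the service ID in place of the real address
        String url = "http://user-service" + "/user/" + id;
        User user = restTemplate.getForObject(url, User.class);
        return user;
    }

    /**
     * Method-specific downgrade method. Its parameters and return type match the original method.
     * It is not used here because @HystrixCommand above does not name it.
     */
    private User getUserByIdFallback(long id) {
        User user = new User(1L, "I am the fallback", 18, new Date());
        return user;
    }

    /**
     * Default downgrade method. Its return type must match the calling method,
     * and it must not take parameters.
     */
    private User defaultFallback() {
        User user = new User(1L, "I am the default fallback", 18, new Date());
        return user;
    }
}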

Service circuit breaker

If calls to a service instance fail frequently, the instance can be isolated temporarily. This is circuit breaking.

Circuit breaker principle

The term circuit breaker is sometimes also translated as "fuse".

An electrical circuit breaker is a switching device that protects a circuit from overload. When a short circuit occurs, the breaker cuts off the faulty circuit in time to prevent serious consequences such as overload, overheating and even fire.

In a distributed architecture the circuit breaker pattern plays a similar role. When a service unit fails (like a short circuit in an appliance), the breaker's fault monitoring returns an error response to the caller (like a fuse blowing) instead of making it wait for a long time. The calling thread is therefore not tied up for a long time by the faulty service, which keeps the fault from spreading through the distributed system.

Hystrix models the circuit breaker as a state machine with 3 states:

  • Closed: the circuit breaker is closed, and all requests go through normally.

  • Open: the circuit breaker is open, and all requests are downgraded.

    Hystrix keeps statistics on requests. When the percentage of failed requests within the statistical window (10 seconds by default) reaches the threshold, the breaker trips and opens fully. The default failure threshold is 50%, and the window must contain at least 20 requests.

    For example, if only 19 requests arrive within the window, the breaker does not open even if all of them fail. Once there are 20 requests and at least 50% of them fail (10 or more failures), the breaker trips.

    After tripping, the breaker stays in the Open state for the sleep window (5 seconds by default), and all requests are downgraded directly during this time.

  • Half Open: the Open state is not permanent. When the sleep window (5 seconds by default) is over, the breaker automatically enters the half-open state and lets one request through. If that request succeeds, the breaker closes; otherwise it stays open and the 5-second sleep window starts again.

Conditions for the circuit breaker to enter the Open state (both must hold within the statistical window):

  1. The number of requests reaches the threshold, 20 by default
  2. The proportion of failed requests reaches the threshold, 50% by default
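
The states and trip conditions described above can be put together in a small state machine. This is a simplified sketch and not Hystrix's code: the class `SimpleCircuitBreaker` is hypothetical, it counts requests cumulatively since the last reset instead of using a rolling time window, and it takes the clock as a parameter so that the sleep window can be simulated.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified model of the Closed / Open / Half Open state machine.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // e.g. 20
    private final int errorThresholdPercentage; // e.g. 50
    private final long sleepWindowMillis;       // e.g. 5000
    private final LongSupplier clock;           // current time in ms

    private State state = State.CLOSED;
    private int total;
    private int failed;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    // True if the caller may try the real service; false means downgrade directly.
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // sleep window over: release one trial request
            return true;
        }
        return state == State.CLOSED;
    }

    synchronized void recordSuccess() {
        if (state == State.HALF_OPEN) {
            state = State.CLOSED; // trial request was healthy
            total = 0;
            failed = 0;
        } else if (state == State.CLOSED) {
            total++;
        }
    }

    synchronized void recordFailure() {
        if (state == State.HALF_OPEN) {
            trip(); // trial request failed: open again and restart the sleep window
        } else if (state == State.CLOSED) {
            total++;
            failed++;
            // Both conditions must hold: enough requests and a high enough failure ratio.
            if (total >= requestVolumeThreshold
                    && failed * 100 >= errorThresholdPercentage * total) {
                trip();
            }
        }
    }

    synchronized State state() {
        return state;
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        total = 0;
        failed = 0;
    }
}
```

With thresholds of 20 requests and 50%, 19 consecutive failures leave the breaker Closed and the 20th opens it. Once the simulated clock has advanced by the sleep window, the next `allowRequest()` returns true and moves the breaker to Half Open, and a recorded success closes it again.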

Further reading

Comparison between Sentinel and Hystrix

Sentinel is open sourced by the Alibaba middleware team. It is a lightweight, highly available traffic control component for distributed service architectures. Taking traffic as its entry point, it helps protect service stability along several dimensions, including traffic control, circuit breaking and downgrade, and system load protection.

Comparison item | Sentinel | Hystrix
Isolation policy | Semaphore isolation | Thread pool isolation / semaphore isolation
Circuit breaker downgrade strategy | Based on response time or failure ratio | Based on failure ratio
Real-time metrics implementation | Sliding window | Sliding window (based on RxJava)
Rule configuration | Supports multiple data sources | Supports multiple data sources
Extensibility | Multiple extension points | Plug-in form
Annotation-based support | Supported | Supported
Rate limiting | Based on QPS; supports rate limiting based on call relationships | Not supported
Traffic shaping | Supports slow start and uniform-rate modes | Not supported
System load protection | Supported | Not supported
Console | Works out of the box; can configure rules, view second-level monitoring, discover machines, etc. | Incomplete
Adaptation to common frameworks | Servlet, Spring Cloud, Dubbo, gRPC, etc. | Servlet, Spring Cloud Netflix

Origin blog.csdn.net/footless_bird/article/details/128469102