[Microservice Architecture] Spring Cloud: Hystrix (c)

First, the avalanche effect

  In a microservice architecture, completing a single job may require calls to several dependent microservice modules. Because of network problems or faults in a service itself, no service can be guaranteed to be 100% available. If one service fails, the threads calling it block. If a large number of requests arrive at that moment, the Servlet container's thread resources are exhausted and the calling service is paralyzed. Because services depend on one another, the paralysis spreads rapidly and can have serious consequences for the entire microservice system. This is the "avalanche" effect of service failure.

Stages in the formation of a service avalanche

1, The service provider becomes unavailable

  • Hardware failure. Hardware damage takes the server host down, or a network hardware failure makes the service provider unreachable.
  • Program bugs
  • Cache breakdown. This generally happens when the cache application restarts and all cached entries are cleared, or when a large number of cache entries expire within a short time. The resulting flood of cache misses sends requests straight to the back end, overloading the service provider and making the service unavailable.
  • A large volume of user requests

2, Retries increase traffic

  • User retries
  • Retry logic in code

3, The service caller becomes unavailable

  • Resource exhaustion caused by synchronous waiting. With synchronous calls, a large number of waiting threads tie up system resources.

Second, responding to service avalanches

Different causes of service avalanches call for different responses.

1, Flow control

  • Rate limiting at the gateway
  • Limiting through user interaction, for example: 1. showing a loading animation so that users are more willing to wait; 2. adding a forced wait to the submit button.
  • Disabling retries

2, Improved caching

  • Cache preloading
  • Changing synchronous refresh to asynchronous refresh

3, Automatic service scaling

  • AWS Auto Scaling

4, Degradation on the service caller side

  • Resource isolation, mainly isolating the thread pools used to call services.
  • Classifying dependent services. Depending on the business, dependent services are divided into strong and weak dependencies: if a strong dependency is unavailable, the current business is interrupted, whereas an unavailable weak dependency does not interrupt it.
  • Failing fast on calls to unavailable services. This is usually implemented with a timeout mechanism, a circuit breaker, and a fallback method that runs once the breaker has opened.
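
The timeout part of failing fast can be sketched in plain Java. This is a minimal illustration, not Hystrix code: the class TimeoutCaller, the method callWithTimeout, and its parameters are hypothetical names introduced here. It assumes the dependency call is submitted to an ExecutorService and that a precomputed fallback value is acceptable when the call is slow or fails.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long a caller waits for a dependency. */
final class TimeoutCaller {
    private TimeoutCaller() {}

    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call,
                                 long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(call);
        try {
            // Wait at most timeoutMillis instead of blocking indefinitely.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);          // interrupt the slow call
            return fallback;              // fail fast with the degraded result
        } catch (ExecutionException e) {
            return fallback;              // the call itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }
}
```

The caller's thread is held for at most the timeout, so a slow dependency cannot pin it indefinitely.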

Third, using Hystrix to prevent service avalanches

3.1, Hystrix design principles include:

  • Resource isolation (do not put all your eggs in one basket)
  • Circuit breaking (cut losses in time)
  • Command pattern

Resource isolation

  In a highly service-oriented system, one piece of business logic often depends on several service modules. For example, in the left-hand figure below, viewing product details depends on three service modules: product description, price lookup, and product reviews. Calls to all three dependencies share the calling service's thread pool, so if one module becomes unavailable, every thread in the pool ends up blocked waiting for a response, causing a service avalanche.

  Hystrix isolates resources by allocating a separate thread pool to each dependent service, which keeps the whole service from avalanching. In the right-hand figure below, even if one service module fails, only the threads assigned to it are left waiting, and calls to the other dependencies are unaffected.

  The cost is the performance overhead of maintaining multiple thread pools.
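
The isolation idea can be sketched without Hystrix using one small, bounded thread pool per dependency. The class IsolatedPools and its methods are hypothetical names for illustration. The sketch assumes a pool with no queue, so that a saturated dependency rejects new work immediately rather than queuing it.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one bounded thread pool per dependent service. */
final class IsolatedPools {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerDependency;

    IsolatedPools(int threadsPerDependency) {
        this.threadsPerDependency = threadsPerDependency;
    }

    private ExecutorService poolFor(String dependency) {
        // SynchronousQueue holds no tasks: when every thread is busy,
        // the default AbortPolicy rejects the submission.
        return pools.computeIfAbsent(dependency, name -> new ThreadPoolExecutor(
                threadsPerDependency, threadsPerDependency,
                0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>()));
    }

    /** Runs the call on the dependency's own pool, or returns the fallback if that pool is full. */
    <T> Future<T> submit(String dependency, Callable<T> call, T fallback) {
        try {
            return poolFor(dependency).submit(call);
        } catch (RejectedExecutionException e) {
            return CompletableFuture.completedFuture(fallback);
        }
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }
}
```

With this layout, a stalled "reviews" dependency can exhaust only its own threads. A call submitted under "price" still finds a free thread in its own pool.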

Circuit breaker

  The circuit breaker pattern defines how the breaker switches between its open and closed states.

  Define a service module's health as its failure ratio: the number of failed requests / total requests.

  For the circuit breaker to work, we need to set a threshold on this ratio:

  1. When the breaker is closed, requests pass straight through it to the back end;
  2. While the failure ratio stays below the configured threshold, the breaker remains closed; otherwise it switches to the open state;
  3. When the breaker is open, requests are not allowed through and fail immediately;
  4. After the breaker has been open for a period of time (the breaker time window), it automatically enters the half-open state and lets one request through. If that request succeeds, the breaker returns to the closed state; if it fails, the breaker stays open.

  The circuit breaker ensures that a caller invoking a failing service gets a result quickly, which avoids large numbers of synchronous waits and invalid requests that would hurt system throughput. After a period of time the breaker also probes with a request and checks its result, which gives the service call a chance to recover.
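
The closed, open, and half-open transitions can be sketched as a small state machine. This is a simplified illustration, not Hystrix's implementation: the class SimpleCircuitBreaker, the minimumRequests guard, and the injected clock are assumptions made here so the behavior is easy to follow and test. It counts all requests since the last state change, whereas a production breaker would typically use a rolling time window.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of the closed / open / half-open state machine. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final double failureThreshold; // e.g. 0.5 opens at a 50% failure ratio
    private final int minimumRequests;     // do not judge on too few samples
    private final long openWindowMillis;   // how long to stay open before probing
    private final LongSupplier clock;      // injected so time can be controlled

    private State state = State.CLOSED;
    private int total;
    private int failed;
    private long openedAt;

    SimpleCircuitBreaker(double failureThreshold, int minimumRequests,
                         long openWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.minimumRequests = minimumRequests;
        this.openWindowMillis = openWindowMillis;
        this.clock = clock;
    }

    synchronized State state() { return state; }

    /** Returns true if the caller may send a request through the breaker. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openWindowMillis) {
            state = State.HALF_OPEN;       // window elapsed: let exactly one probe through
            return true;
        }
        return state == State.CLOSED;      // OPEN and HALF_OPEN reject everything else
    }

    synchronized void recordSuccess() {
        if (state == State.HALF_OPEN) { reset(State.CLOSED); return; } // probe succeeded
        total++;
    }

    synchronized void recordFailure() {
        if (state == State.HALF_OPEN) { reset(State.OPEN); return; }   // probe failed
        total++;
        failed++;
        if (state == State.CLOSED && total >= minimumRequests
                && (double) failed / total >= failureThreshold) {
            reset(State.OPEN);
        }
    }

    private void reset(State next) {
        state = next;
        total = 0;
        failed = 0;
        if (next == State.OPEN) openedAt = clock.getAsLong();
    }
}
```

A caller would check allowRequest() before each call, run the fallback when it returns false, and report the outcome with recordSuccess() or recordFailure().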

Command pattern

  Hystrix uses the command pattern (by extending the HystrixCommand class) to wrap the actual service invocation logic (the run method) and to add fallback logic (getFallback) that runs after a service call fails. The command's constructor can also define the thread pool and circuit breaker parameters for the current service.
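
The shape of this pattern can be shown with a dependency-free sketch. SketchCommand below is not Hystrix's HystrixCommand. It is a hypothetical stand-in that borrows the run and getFallback method names from the text and leaves out thread pools, timeouts, and circuit breaking. PriceLookupCommand and its serviceUp flag are likewise invented for illustration.

```java
/** Hypothetical stand-in for a command base class; not the Hystrix API. */
abstract class SketchCommand<T> {
    /** The actual service invocation logic. */
    protected abstract T run() throws Exception;

    /** Fallback logic used when run() fails; subclasses override it to degrade gracefully. */
    protected T getFallback() {
        throw new UnsupportedOperationException("no fallback defined");
    }

    /** Executes the command, switching to the fallback if run() throws. */
    final T execute() {
        try {
            return run();
        } catch (Exception e) {
            return getFallback();
        }
    }
}

/** Example command wrapping a simulated price lookup. */
final class PriceLookupCommand extends SketchCommand<String> {
    private final boolean serviceUp; // simulates the remote service's availability

    PriceLookupCommand(boolean serviceUp) {
        this.serviceUp = serviceUp;
    }

    @Override
    protected String run() {
        if (!serviceUp) {
            throw new IllegalStateException("price service unavailable");
        }
        return "price=42";
    }

    @Override
    protected String getFallback() {
        return "price=unknown";
    }
}
```

Callers only ever invoke execute(), so the invocation and its degraded alternative travel together in one object.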

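3.2, How Hystrix responds

  • Isolate every dependency: wrap every call to a dependency in a HystrixCommand or HystrixObservableCommand
  • Set a time threshold for calls to each dependency; a call that exceeds it is treated as timed out immediately
  • Maintain a small thread pool (or semaphore) for each dependency; if it is full, requests to that dependency are rejected immediately
  • Evaluate each service module's health; when its failure ratio exceeds the configured threshold, trip the breaker and send requests for that dependency straight to the fallback (implemented by the developer)
  • Once the breaker has tripped, release one probe request after the time window to decide whether to restore the service
  • Take effect in near real time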

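3.3, Hystrix's internal processing logic

  1. Construct the Hystrix Command object and call its execution method.
  2. Hystrix checks whether the circuit breaker for the current service is open. If it is, the fallback method getFallback is executed.
  3. If the breaker is closed, Hystrix checks whether the current service's thread pool can accept new requests. If the pool is full, getFallback is executed.
  4. If the thread pool accepts the request, Hystrix starts executing the actual service call logic in the run method.
  5. If the service call fails, getFallback is executed and the result is reported to Metrics to update the service's health.
  6. If the service call times out, getFallback is executed and the result is reported to Metrics to update the service's health.
  7. If the service call succeeds, the normal result is returned.
  8. If the fallback method getFallback succeeds, the fallback result is returned.
  9. If the fallback method getFallback fails, an exception is thrown.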
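
The numbered steps can be condensed into one method. This is a sketch under stated assumptions rather than Hystrix's actual code: FlowSketch, Outcome, and Result are hypothetical names, the breaker state is passed in as a plain boolean, and the returned Outcome stands in for the report that would go to Metrics.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical walk-through of the processing steps; not Hystrix code. */
final class FlowSketch {
    enum Outcome { SUCCESS, SHORT_CIRCUITED, REJECTED, FAILED, TIMED_OUT }

    static final class Result<T> {
        final T value;
        final Outcome outcome; // what a real implementation would report to metrics
        Result(T value, Outcome outcome) { this.value = value; this.outcome = outcome; }
    }

    private FlowSketch() {}

    static <T> Result<T> execute(boolean breakerOpen, ExecutorService pool,
                                 Callable<T> run, Supplier<T> getFallback,
                                 long timeoutMillis) {
        // Step 2: an open breaker goes straight to the fallback.
        if (breakerOpen) {
            return new Result<>(getFallback.get(), Outcome.SHORT_CIRCUITED);
        }
        Future<T> future;
        try {
            future = pool.submit(run);            // Steps 3-4: the pool accepts and runs the call
        } catch (RejectedExecutionException e) {  // Step 3: the pool is full
            return new Result<>(getFallback.get(), Outcome.REJECTED);
        }
        try {
            // Step 7: the call finished in time.
            return new Result<>(future.get(timeoutMillis, TimeUnit.MILLISECONDS), Outcome.SUCCESS);
        } catch (TimeoutException e) {            // Step 6: the call timed out
            future.cancel(true);
            return new Result<>(getFallback.get(), Outcome.TIMED_OUT);
        } catch (ExecutionException e) {          // Step 5: the call failed
            return new Result<>(getFallback.get(), Outcome.FAILED);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return new Result<>(getFallback.get(), Outcome.FAILED);
        }
        // Steps 8-9: getFallback.get() either supplies the degraded value
        // returned above, or throws, and that exception reaches the caller.
    }
}
```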

Origin www.cnblogs.com/iUtopia/p/11510371.html