1. Why do we need circuit breaking (fuse) and degradation (downgrade)
(1) Background
Circuit breaking and degradation are common solutions when system load is too high, abnormal conditions occur, or traffic bursts arrive.
In a distributed system, a service depends on several other services, and any of those calls may fail, for example through a timeout or an exception. The question is how to ensure that a problem in one dependency does not bring down the whole service.
For example, a microservice with complex business logic may start timing out under heavy load.
Internal causes: a bug that produces an infinite loop, a slow query, or flawed program logic that exhausts memory.
External causes: hacker attacks, promotions, or third-party systems that respond slowly.
(2) Solution
The core idea for handling interface-level failures is to protect the core business first and to serve the vast majority of users first. For example, login is critical: when traffic is too high, the registration feature can be switched off to free up resources for login.
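The login-versus-registration example can be sketched in Java as follows. This is only an illustration: the class name FeatureGate, the load figure, and the 0.8 threshold are assumptions, not something prescribed by the text. Features are tagged as core or non-core, and non-core features are rejected once the reported load reaches the threshold.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: sheds non-core features when the reported load is high.
final class FeatureGate {
    enum Priority { CORE, NON_CORE }

    private final Map<String, Priority> features = new ConcurrentHashMap<>();
    private final double loadThreshold;   // assumed scale: 0.0 (idle) to 1.0 (saturated)
    private volatile double currentLoad;  // fed by whatever monitoring is in place

    FeatureGate(double loadThreshold) {
        this.loadThreshold = loadThreshold;
    }

    void register(String feature, Priority priority) {
        features.put(feature, priority);
    }

    void reportLoad(double load) {
        this.currentLoad = load;
    }

    // Core features are always allowed; non-core ones only while load is below
    // the threshold. Unregistered features are treated as non-core.
    boolean isAllowed(String feature) {
        Priority priority = features.getOrDefault(feature, Priority.NON_CORE);
        return priority == Priority.CORE || currentLoad < loadThreshold;
    }
}
```

A request handler would call isAllowed("registration") before doing any work and return a "temporarily unavailable" response when it is false.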
(3) Strategies
Circuit breaking, degradation, rate limiting, and queuing.
2. What is circuit breaking (fuse)
Circuit breaking is usually triggered by a fault or exception in a service. It works like a fuse in the real world: when an abnormal condition is triggered, calls to the service are cut off immediately instead of waiting for each one to time out, which prevents the failure from spreading to the entire system.
It is a protective measure, a form of overload protection. Suppose function X of service A depends on an interface of service B. When that interface responds very slowly, function X slows down as well, and service A's threads end up stuck in function X. The other functions of service A then slow down or hang too. This is where circuit breaking comes in: service A stops requesting that interface of service B and degrades directly instead.
3. What is degradation (downgrade)
When server pressure surges, some services and pages are strategically degraded based on the current business situation and traffic. This relieves pressure on server resources and keeps the core business running normally, while ensuring that most customers still receive correct responses.
Automatic degradation: triggered by timeouts, failure counts, faults, or rate limiting.
(1) A configured timeout is exceeded (an asynchronous mechanism probes whether the service has recovered);
(2) Calls to an unstable API fail a certain number of times (an asynchronous mechanism probes whether the service has recovered);
(3) A call to a remote service fails (DNS failure, HTTP error status code, network failure, RPC exception), which triggers degradation directly.
Manual degradation: during flash sales or Double 11 (Singles' Day) promotions, non-essential services are degraded by hand.
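The two kinds of degradation can be sketched together in plain Java. The class name Degrader and its methods are hypothetical, chosen only for illustration. A manual switch skips the remote call entirely, while a timeout or a failure of the call degrades automatically to a prepared value. The asynchronous recovery probe mentioned above is left out of this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch: degrade via a manual switch, or automatically on timeout/failure.
final class Degrader {
    private final AtomicBoolean manualSwitch = new AtomicBoolean(false);
    private final long timeoutMillis;

    Degrader(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Operators would flip this before a flash sale or a large promotion.
    void setManuallyDegraded(boolean degraded) {
        manualSwitch.set(degraded);
    }

    <T> T call(Supplier<T> remoteCall, T degradedValue) {
        if (manualSwitch.get()) {
            return degradedValue; // manual degradation: the remote call is skipped
        }
        try {
            // Run the call on another thread and wait at most timeoutMillis.
            // Note: on timeout the underlying task keeps running; it is not cancelled.
            return CompletableFuture.supplyAsync(remoteCall)
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return degradedValue;
        } catch (Exception e) {
            // TimeoutException (too slow) or ExecutionException (call failed):
            // automatic degradation
            return degradedValue;
        }
    }
}
```

For example, new Degrader(200).call(() -> fetchPrice(id), cachedPrice) would return cachedPrice when the hypothetical fetchPrice takes longer than 200 ms or throws.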
4. Similarities and differences between circuit breaking and degradation
Similarities:
1) Both are motivated by availability and reliability, and aim to prevent the system from crashing.
2) In both cases the end user finds that some features are temporarily unavailable.
Differences:
1) Circuit breaking is generally caused by the failure of a downstream service, whereas degradation is generally decided from the load of the system as a whole and is controlled by the caller.
2) The triggers differ, as explained above.
5. From circuit breaking to degradation: how the process works
Hystrix provides the following key parameters for configuring a circuit breaker:
circuitBreaker.requestVolumeThreshold // minimum number of requests in the rolling window before the breaker can trip, default 20
circuitBreaker.sleepWindowInMilliseconds // how long to wait before checking again whether the breaker should stay open, default 5000 (5 s)
circuitBreaker.errorThresholdPercentage // error rate threshold, default 50%
Taken together, the three parameters mean the following:
When at least 20 requests arrive within the rolling window and 50% or more of them fail, the circuit breaker opens. While it is open, calls to the service fail immediately without invoking the remote service. After 5 seconds the trigger condition is tested again to decide whether the breaker closes or stays open.
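To make the interplay of the three parameters concrete, here is a deliberately simplified breaker in plain Java. It is a sketch under stated assumptions, not Hystrix's implementation: the class name SimpleCircuitBreaker is invented, requests are counted in fixed batches rather than in a time-based rolling window, and the re-test after the sleep window is modelled as a single trial call whose outcome closes or re-opens the breaker.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified sketch of the closed/open/half-open cycle; not Hystrix's implementation.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    // Thrown when the breaker is open and the call is rejected without being attempted.
    static final class CircuitOpenException extends RuntimeException {
        CircuitOpenException() { super("circuit open: failing fast"); }
    }

    private final int requestVolumeThreshold;    // e.g. 20
    private final int errorThresholdPercentage;  // e.g. 50
    private final long sleepWindowMillis;        // e.g. 5000
    private final LongSupplier clock;            // injectable so tests can control time

    private State state = State.CLOSED;
    private int total;
    private int failures;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized State state() { return state; }

    <T> T call(Supplier<T> remoteCall) {
        synchronized (this) {
            if (state == State.OPEN) {
                if (clock.getAsLong() - openedAt < sleepWindowMillis) {
                    throw new CircuitOpenException(); // remote service is not touched
                }
                // Sleep window elapsed: let a trial call through.
                // Simplification: concurrent callers may all pass while HALF_OPEN.
                state = State.HALF_OPEN;
            }
        }
        try {
            T result = remoteCall.get();
            record(true);
            return result;
        } catch (RuntimeException e) {
            record(false);
            throw e;
        }
    }

    private synchronized void record(boolean success) {
        if (state == State.OPEN) {
            return; // late result from a call that started before the trip
        }
        if (state == State.HALF_OPEN) {
            if (success) { close(); } else { trip(); }
            return;
        }
        total++;
        if (!success) { failures++; }
        if (total >= requestVolumeThreshold) {
            if (failures * 100 >= errorThresholdPercentage * total) {
                trip();
            } else {
                total = 0;   // healthy batch: start counting a new one
                failures = 0;
            }
        }
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        total = 0;
        failures = 0;
    }

    private void close() {
        state = State.CLOSED;
        total = 0;
        failures = 0;
    }
}
```

With the defaults quoted above, the breaker would be created as new SimpleCircuitBreaker(20, 50, 5000, System::currentTimeMillis).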
There is a key point here: once the breaker has tripped, subsequent calls no longer reach the microservice. Whether the microservice is skipped or a call to it throws an exception, the raw error should not be passed straight to the user, so circuit breaking is usually paired with a degradation strategy. Degradation here means that once a service has been short-circuited and is no longer called, the client can supply its own local fallback callback that returns a default value.
The level of service drops, but the result is at least usable, which is better than failing outright. Whether this is appropriate depends, of course, on the business scenario.
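The local fallback idea can be sketched in plain Java. In Hystrix itself the fallback is supplied by overriding getFallback() on a command; the version below avoids the library so that it stays self-contained. The names Fallbacks, RecommendationClient, and fetchRemote, as well as the default items, are hypothetical.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper: run the primary call, and on any runtime failure
// (including a breaker's fail-fast rejection) return a locally prepared value.
final class Fallbacks {
    private Fallbacks() { }

    static <T> T withFallback(Supplier<T> primary, Function<RuntimeException, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.apply(e);
        }
    }
}

// Hypothetical client showing how a caller might use the helper.
final class RecommendationClient {
    // Stand-in for a remote call that is currently failing or short-circuited.
    static List<String> fetchRemote(String userId) {
        throw new IllegalStateException("circuit open");
    }

    // Returns live recommendations when possible, otherwise a fixed default list.
    static List<String> recommend(String userId) {
        return Fallbacks.withFallback(
                () -> fetchRemote(userId),
                e -> List.of("bestseller-1", "bestseller-2"));
    }
}
```

The user still sees a list, just a less personalised one. A fallback like this suits read paths where a default is acceptable; for operations such as payments, returning a default value would usually be wrong.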