What is a service circuit breaker? What is service degradation?

Reprinted from the comic: What is a service circuit breaker?

 

What is a service circuit breaker

The concept of circuit breaking comes from the circuit breaker in electrical engineering. In Internet systems, when a downstream service responds slowly or fails because of excessive load, the upstream service can temporarily cut off its calls to that downstream service in order to protect the overall availability of the system.

 

This practice of sacrificing a part to preserve the whole is called circuit breaking.

 

What would happen to our system if no circuit-breaking measures were taken? Let's look at an example.

 

Suppose the system contains three services: A, B, and C. Service A is upstream, service B is midstream, and service C is downstream. Their call chain is as follows:

Service A -> Service B -> Service C

Once downstream service C becomes unavailable for some reason, requests to it pile up and the request threads of service B block while waiting for it. Service B's thread resources are gradually exhausted, so service B becomes unavailable as well. Service A then becomes unavailable in turn, and the entire call chain is dragged down.

A cascading failure along a call chain like this is called an avalanche.

 

 

As the sayings go, scrape the bone to cure the poison, and a brave man cuts off his own wrist. This is where the circuit breaker mechanism is needed to save the entire system. Its general process follows the same idea as the exam strategy from the original comic.

Two points need to be explained here:

 

1. Opening the circuit breaker

Within a fixed time window, when the proportion of interface calls that time out reaches a threshold, the circuit breaker opens. Once it is open, subsequent calls to the service interface no longer go over the network; instead, a local default method is executed directly, which achieves the effect of service degradation.

2. Recovering from the open state

Circuit breaking cannot be permanent. After a specified time has elapsed, the service recovers from the open state and accepts remote calls from the caller again.
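The two points above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular library does it: the class name SimpleBreaker, its parameters, and the min_calls guard (which keeps a single early failure from opening the breaker) are all assumptions made for the example, and any exception from the remote call is treated as a timeout or failure.

```python
import time


class SimpleBreaker:
    """Opens when the failure ratio in a fixed window reaches a threshold."""

    def __init__(self, threshold=0.5, window=10.0, recovery=5.0,
                 min_calls=4, clock=time.monotonic):
        self.threshold = threshold   # failure ratio that opens the breaker
        self.window = window         # length of the counting window, seconds
        self.recovery = recovery     # how long the breaker stays open, seconds
        self.min_calls = min_calls   # assumed guard: ignore tiny samples
        self.clock = clock           # injectable so the sketch can be tested
        self.window_start = clock()
        self.calls = 0
        self.failures = 0
        self.opened_at = None        # None means the breaker is closed

    def _reset(self, now):
        self.window_start = now
        self.calls = 0
        self.failures = 0

    def call(self, remote, fallback):
        now = self.clock()
        if self.opened_at is not None:
            if now - self.opened_at < self.recovery:
                return fallback()    # open: skip the network entirely
            self.opened_at = None    # recovery time elapsed: accept calls again
            self._reset(now)
        if now - self.window_start >= self.window:
            self._reset(now)         # start a new fixed window
        self.calls += 1
        try:
            return remote()
        except Exception:
            self.failures += 1
            if (self.calls >= self.min_calls
                    and self.failures / self.calls >= self.threshold):
                self.opened_at = now
            return fallback()        # degrade to the local default
```

A caller would wrap each remote call as breaker.call(remote_fn, local_default_fn), so the degraded result is returned both when the call fails and while the breaker is open.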


Practical application of service circuit breaking

 

 

Spring Cloud Hystrix is built on Hystrix, Netflix's open source framework, and implements a series of service protection features such as circuit breaking and thread isolation.

 

To implement the circuit breaker mechanism, Hystrix defines three states:

 

1. Closed state (Closed)

When the service is healthy, the circuit breaker stays closed and places no restrictions on the caller's calls.

 

 

2. Open state (Open)

When the error rate of interface calls within a fixed time window (10 seconds by default in Hystrix) reaches a threshold (50% by default in Hystrix), and the window has seen enough requests (20 by default in Hystrix), the circuit breaker enters the open state. Once it is open, subsequent calls to the service interface no longer go over the network; the local fallback method is executed directly.

 

 

3. Half-open state (Half-Open)

After the circuit breaker has been open for a period of time (5 seconds by default in Hystrix), it enters the half-open state. In this state the circuit breaker tries to restore service calls: it lets a limited amount of traffic through to the service and monitors the success rate. If the success rate meets expectations, the service is considered recovered and the circuit breaker returns to the closed state; if the success rate is still low, it goes back to the open state.
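The three states can be modelled as a small state machine. The sketch below is written in plain Python and is not Hystrix's implementation: the names CircuitBreaker, allow, and record are hypothetical, the window is a simple sliding list of outcomes, and the half-open state lets exactly one trial call through, a simplification chosen to keep the example short. Only the default values mirror the Hystrix defaults quoted above.

```python
import time
from collections import deque
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitBreaker:
    def __init__(self, error_threshold=0.5, window=10.0, sleep=5.0,
                 min_requests=20, clock=time.monotonic):
        self.error_threshold = error_threshold  # error rate that opens the breaker
        self.window = window                    # statistics window, seconds
        self.sleep = sleep                      # time spent open before a trial
        self.min_requests = min_requests        # minimum sample size in the window
        self.clock = clock
        self.state = State.CLOSED
        self.events = deque()                   # (timestamp, succeeded) pairs
        self.opened_at = 0.0

    def allow(self):
        """Return True if the caller may try the remote call."""
        if self.state is State.OPEN and self.clock() - self.opened_at >= self.sleep:
            self.state = State.HALF_OPEN        # sleep over: permit one trial call
            return True
        return self.state is State.CLOSED

    def record(self, succeeded):
        """Report the outcome of a call that allow() permitted."""
        now = self.clock()
        if self.state is State.HALF_OPEN:
            if succeeded:
                self.state = State.CLOSED       # trial succeeded: service recovered
                self.events.clear()
            else:
                self._open(now)                 # trial failed: back to open
            return
        self.events.append((now, succeeded))
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()               # drop outcomes older than the window
        errors = sum(1 for _, ok in self.events if not ok)
        if (len(self.events) >= self.min_requests
                and errors / len(self.events) >= self.error_threshold):
            self._open(now)

    def _open(self, now):
        self.state = State.OPEN
        self.opened_at = now

    def call(self, remote, fallback):
        if not self.allow():
            return fallback()                   # open or half-open and busy: fall back
        try:
            result = remote()
        except Exception:
            self.record(False)
            return fallback()
        self.record(True)
        return result
```

The sketch is single-threaded; a production breaker would also need to make the state changes safe under concurrent calls.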

 

 

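The transitions between the three states are as follows:

Closed -> Open: the error rate within the time window reaches the threshold
Open -> Half-Open: the open period elapses
Half-Open -> Closed: the trial calls succeed
Half-Open -> Open: the trial calls still fail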

What is service degradation


Service degradation means giving less important services low-priority treatment; put plainly, it frees up as many system resources as possible for high-priority services. Resources are limited, but requests are not. If services are not degraded during peak concurrency, overall performance will certainly suffer, and in severe cases the system may go down and some important services may become unavailable. Therefore, during peak periods, some services must be degraded to keep the site's core functions available.

Means of service degradation
Rejecting requests
Identify which application a request comes from, reject requests from low-priority applications during peak hours, and keep the core applications running normally. You can also reject requests at random and immediately return a "server busy" response so that too many requests are not handled at once; this is especially common in e-commerce flash sales.
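Both rejection strategies can be sketched as a small admission check. Everything here is an assumption made for illustration: the "core" and "low" priority labels, the load value between 0.0 and 1.0, the 0.8 and 0.9 thresholds, and the function names admit and handle are hypothetical and would need tuning for a real system.

```python
import random

SERVER_BUSY = "server busy"


def admit(priority, load, rng=random.random):
    """Decide whether to serve a request at the current load (0.0 to 1.0).

    priority is "core" or "low", a hypothetical label for the calling application.
    """
    if load < 0.8:                      # assumed peak threshold: below it, serve all
        return True
    if priority == "low":
        return False                    # peak: reject low-priority applications
    # Core requests: above 0.9 load, randomly reject a growing share (up to half).
    reject_probability = max(0.0, (load - 0.9) / 0.1) * 0.5
    return rng() >= reject_probability


def handle(priority, load, serve, rng=random.random):
    """Serve the request if admitted, otherwise answer 'server busy' at once."""
    return serve() if admit(priority, load, rng) else SERVER_BUSY
```

Returning the busy response before doing any real work is the point of the design: a rejected request costs almost nothing, which leaves capacity for the requests that are admitted.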

Shutting down services
During peak periods, some rarely used or marginal services can be shut down to free up resources for core services. For example, during Double 11 every year, Taobao shuts down some services unrelated to its core ordering business, such as reviews and delivery confirmation, so that users can place orders and pay normally. Requests to those services basically just get a "server busy" response.
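Shutting services down is often done with runtime switches rather than by stopping processes. The following is a minimal sketch under that assumption; the class DegradationSwitches, the dispatch function, and the service names are hypothetical and are not taken from any real system.

```python
class DegradationSwitches:
    """Runtime on/off switches for non-core services (names are hypothetical)."""

    def __init__(self, non_core):
        self.enabled = {name: True for name in non_core}

    def enter_peak(self):
        for name in self.enabled:
            self.enabled[name] = False   # shut down every non-core service

    def leave_peak(self):
        for name in self.enabled:
            self.enabled[name] = True    # restore them after the peak

    def is_on(self, name):
        # Services not registered as non-core count as core and always stay on.
        return self.enabled.get(name, True)


def dispatch(switches, service, handler):
    """Run the handler unless the service is currently switched off."""
    if not switches.is_on(service):
        return "server busy"
    return handler()


switches = DegradationSwitches(["reviews", "delivery_confirmation"])
switches.enter_peak()
review_result = dispatch(switches, "reviews", lambda: "review saved")
order_result = dispatch(switches, "order", lambda: "order placed")
```

Here the review request is answered with the busy message while the order request is still handled, which matches the idea of giving up marginal services to protect ordering and payment.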


Origin blog.csdn.net/qq_39436397/article/details/115003188