Microservice Protection -- Getting to Know Sentinel (Avalanche Problem, Quick Start Sentinel)

Hello everyone, today we are going to learn about Alibaba's open source flow control and circuit breaker degradation framework – Sentinel.

1. Avalanche problems and solutions

First, let's understand the avalanche problem and its solutions. The reason we study microservice protection is to deal with cascading failures like the avalanche problem.

1.1 What is an avalanche problem?

So what is an avalanche problem? Let's take a look at this scenario.

image-20230206151805010

The figure above shows some of the services in our microservice system. Business logic in microservices is often complex: a single business may rely on multiple other microservices.

For example, service A has a business that depends on service B.

image-20230206152030612

Service A may also have other businesses, for example one that depends on service D.

image-20230206152211277

Now suppose that service D fails!

image-20230206152241457

Can the business in service A that depends on service D still be accessed normally?

Obviously not: when it calls service D, it must wait for service D's result, and since service D has failed, no result ever comes back. The request blocks right here.

This blocks the business inside service A, and a blocked request never releases its server connection.

The businesses in service A that depend on service B or C are not affected for now. But think about it: if there is a first request that depends on service D, there will be a second and a third.

As time goes by, more and more requests in service A depend on service D, and none of them release their connections. Won't all of service A's connections eventually be occupied?

image-20230206153109233

The final result is that Tomcat's resources are exhausted, because they are limited. If new requests come in at this point, even those that do not depend on service D at all cannot get in.

So service A has now effectively failed as well.

image-20230206153333120

So one faulty service has caused the services that depend on it to fail as well. And keep in mind that in real microservice systems, the call relationships are far more complicated than this!

image-20230206153527861

So if service A is brought down because of service D, other services that also depend on service D will eventually be brought down too.

And if yet other services depend on service A, the businesses that rely on service A will be brought down in turn.

image-20230206153811353

In the end, more and more services fail, and the entire microservice cluster becomes unavailable. Isn't this an avalanche?

So what is an avalanche problem?

This is the avalanche problem: one service's failure in the microservice call chain causes all the microservices along the chain to become unavailable.

This is very scary, so in microservices the avalanche problem is one that must be solved!

So what is the solution?

1.2 Solutions to Avalanche Problems

There are four common solutions to avalanche problems:

1.2.1 Timeout processing

Set a timeout: if a request gets no response within a certain period, an error message is returned instead of waiting endlessly.

image-20230206154600501

For example, we have services B and C as well as service A. Some businesses in service A depend on service B, and some depend on service C.

image-20230206154848632

If service C fails, all the businesses in service A that depend on it will block, and over time service A will fail too.

So what does timeout processing change? A timeout is added to the call, for example one second.

image-20230206155045703

That is, when a business in service A depends on service C, it waits at most one second. If service C has failed and the wait exceeds one second, the request is terminated immediately and an error message is returned to the user: sorry, the request failed.

The request is thereby released and no longer occupies Tomcat resources, which alleviates the avalanche problem to some extent.

But why does this only alleviate the avalanche problem rather than solving it 100%?

Think about it: a request waits at most one second before its resources are released, so roughly speaking resources are freed at a rate of one per second.

But suppose new requests arrive at two per second. The release rate cannot keep up with the arrival rate, so sooner or later service A's resources will still be exhausted. The timeout therefore only mitigates the problem; it does not fundamentally solve it.

1.2.2 Bulkhead mode

Limit the number of threads each business may use, so that no single business can exhaust the entire Tomcat's resources. This is why it is also called thread isolation.

The bulkhead pattern is a design borrowed from ship cabins in real life.

image-20230206160601528

Large ships use partitions to divide the hull into independent small compartments; the partitions are the bulkheads.

Because these compartments are isolated from each other, if part of the hull hits an iceberg and leaks, at worst only that compartment floods.

The other compartments are unaffected, so the ship can stay afloat even with some water in one cabin. This improves the disaster tolerance of the whole ship.

So how does this pattern carry over into programming?

image-20230206161629622

Still in the case just now, service A can be regarded as the entire ship, but how can we prevent the entire Tomcat from hanging up?

We divide Tomcat's resources, that is, its threads, into independent thread pools.

image-20230206162049499

For example, business 1 is allocated a pool of 10 threads, and business 2 another pool of 10 threads.

Now a request for business 1 comes in; it depends on service B.

image-20230206162156606

It can use at most 10 threads to access service B. Business 2, in turn, depends on service C.

image-20230206162314892

It, too, can use at most 10 threads.

Now if service C fails, business 2 blocks and occupies threads, but at most 10 of them.

The Tomcat resources it can consume are therefore bounded: the fault is isolated to those 10 threads.

This is why the pattern is also called thread isolation; it prevents the entire Tomcat's resources from being exhausted.

Of course, this model does solve the problems left by the timeout processing scheme, but there may be some waste of resources.

For example, even when service C is down, requests still try to access it and tie up those 10 threads in the meantime. Isn't that a waste?

1.2.3 Circuit breaker degradation

A circuit breaker counts the proportion of exceptions in a business's execution. If the threshold is exceeded, the business is broken and all requests to it are intercepted.

In this mode there is a circuit breaker component that tracks the exception ratio of a business, that is, the ratio of failed requests to total requests. If the ratio exceeds the threshold, the business is broken, which means all requests trying to access it are intercepted.

image-20230206170958243

The diagram shows the situation described before: an avalanche occurs because all of service A's business requests get stuck here and resources are exhausted.

So how does the circuit breaker solve this?

image-20230206171301335

It tracks the businesses in service A. For example, a business in service A calls service D: the first call succeeds, but the next two fail. The circuit breaker computes the exception ratio: with three requests and two failures, the ratio is about 67%.

Assuming our threshold is 50%, this exceeds it, so the circuit breaks.

image-20230206171541212

Once the circuit is broken, the business in service A that depends on service D can no longer reach service D.

As soon as a request targets service D it is rejected immediately, so resources are released quickly and can never be exhausted. That solves the resource exhaustion problem. And since we already know service D is faulty, no request is allowed to try it at all, so the waste of resources disappears as well.

1.2.4 Flow control

Limit the QPS of access to a business to avoid service failures caused by traffic surges.

What does that mean? Suppose a protected service can withstand a maximum QPS of 2, that is, it can process at most two requests per second. If countless requests suddenly pour in, it obviously cannot withstand them and will be overwhelmed.

And once this service fails, the services that depend on it fail too, and again we get an avalanche. Therefore we must do our best to avoid service failures caused by excessive traffic.

So what should we do? This is where Sentinel comes in.

image-20230206180418742

Even if countless requests really do come in, Sentinel releases them at a rate the service can bear, so our microservice can handle them calmly.

image-20230206183222410

This will prevent it from malfunctioning. If it does not malfunction, the fault will not be transmitted, and there will be no avalanche problem.

So you see, we have nipped the avalanche problem in the bud. Flow control is thus avalanche prevention.

2. Comparison of service protection technologies

Above, we learned about the avalanche problem and its common solutions. The best way to implement these solutions is, of course, to use an existing framework.

So now we are going to compare common frameworks for implementing service protection and their differences. Here, we mainly compare Sentinel and Hystrix.

Hystrix, produced by Netflix, was the popular and recommended choice in Spring Cloud for several years. But after Netflix announced it would stop upgrading and maintaining Hystrix, it has gradually declined.

People went looking for a replacement, and at that point Alibaba open sourced Sentinel, which became the service protection component of Spring Cloud Alibaba and is now widely used in Chinese Internet companies.

image-20230207102337924

Looking at the table above, we mainly focus on the red part.

2.1 Isolation strategy

Both Sentinel and Hystrix support semaphore isolation, and Hystrix also supports thread pool isolation. By default, Sentinel uses the semaphore while Hystrix uses the thread pool.

What is the difference between these two types of isolation?

You may be familiar with the thread pool, because when we talk about avalanche problem solutions, the bulkhead mode is an example of thread pool isolation.

We said that with thread pool isolation, after a business request enters Tomcat, an independent thread pool is created for each isolated business, so each business naturally has its own threads.

This means many more threads than Tomcat's direct processing model; you can think of the thread count as multiplying with the number of isolated businesses. Isolation is better this way, but as the thread count grows the CPU pays extra context switching costs, so overall service performance takes a certain hit.

So what does semaphore isolation do instead?

With semaphore isolation, when a business request enters Tomcat, no independent thread pool is created. Instead, a count is kept of how many threads the current business is already using, with a limit of, say, 10: once 10 threads are in use and a new request for that business tries to acquire one, it is stopped.

In other words, it limits the number of threads each business may use, but the pool is still Tomcat's default thread pool; no new threads or thread pools are created.

This avoids the cost of creating threads, achieving isolation without hurting performance. The isolation is a little weaker than with thread pools, though, because everything still shares one pool: it is like one big pot of rice from which everyone serves themselves a separate bowl.

This is the difference between the two isolation methods.

2.2 Circuit breaker degradation

Circuit breaker degradation counts the proportion of exceptions and breaks the circuit once that proportion exceeds the threshold. In Sentinel, however, besides the proportion of abnormal requests, the proportion of slow calls can also be counted.

What is a slow call?

A slow call is a request whose execution takes an unusually long time. If a business is slow most of the time, it probably has a problem that could drag down the entire service, so it can be cut off as well.

Hystrix, by default, breaks the circuit based only on exceptions, so Sentinel's circuit breaking strategies are richer.

2.3 Rate limiting

Rate limiting is what we called flow control. Sentinel supports rate limiting based on QPS and on call relationships, and can even limit by hotspot parameters; there are many ways to limit flow.

Hystrix has no dedicated rate limiting control; it relies on the thread pool. If the pool is set to 10, at most 10 requests run concurrently, and that is its only way of limiting flow. Its rate limiting capability is therefore relatively weak.

2.4 Traffic Shaping

Traffic shaping turns bursty traffic into stable, uniform traffic. How?

It can support slow start, that is, preheating mode, as well as uniform queuing and so on.

So in this way, fluctuating requests become uniform requests, and it will be easier for my microservices to handle.

Such functionality is not supported in Hystrix.

2.5 Console

The console is the UI: a visual interface that makes it easy to view and operate the system.

Sentinel ships with an out-of-the-box console in which you can not only monitor microservices and check their running status, but also configure degradation rules, which take effect dynamically as soon as they are saved.

Hystrix's console only shows service status; it cannot modify rules dynamically.

3. Understanding Sentinel and installation

Next we will get to know Sentinel and install its console.

3.1. First introduction to Sentinel

First of all, we all know that Sentinel is a microservice flow control component open sourced by Alibaba.

Official website address: https://sentinelguard.io/zh-cn/index.html

image-20230215113756822

Then the following is an introduction to some features of Sentinel:

• Rich application scenarios: Sentinel has handled the core scenarios of Alibaba's Double Eleven traffic peaks over the past 10 years, such as flash sales (keeping burst traffic within what system capacity can bear), message peak shaving and valley filling, cluster flow control, and real-time circuit breaking of unavailable downstream applications.

• Complete real-time monitoring: Sentinel provides real-time monitoring. In the console you can see second-level metrics for a single connected machine, and even the aggregated status of clusters of fewer than 500 machines.

• Extensive open source ecosystem: Sentinel provides out-of-the-box integration modules for other open source frameworks/libraries, such as Spring Cloud, Dubbo, and gRPC. You only need to introduce the corresponding dependency and do some simple configuration to connect to Sentinel quickly.

• Complete SPI extension points: Sentinel provides easy-to-use, complete SPI extension interfaces. You can quickly customize logic by implementing them, for example custom rule management or adapting dynamic data sources.

3.2 Install Sentinel

1. Download

Sentinel officially provides a UI console to facilitate us to set current limit on the system. You can download it at [ Releases · alibaba/Sentinel (github.com) ](https://github.com/alibaba/Sentinel/releases).

image-20230215114505133

2. Run

Place the jar package in any directory whose path contains no Chinese characters, open a terminal there, and run:

image-20230215114823457

java -jar sentinel-dashboard-1.8.6.jar

If you want to modify Sentinel's default port, account, and password, you can configure it through the following:

Configuration item                 Default    Description
server.port                        8080       service port
sentinel.dashboard.auth.username   sentinel   default username
sentinel.dashboard.auth.password   sentinel   default password

For example, to modify the port:

java -Dserver.port=8090 -jar sentinel-dashboard-1.8.6.jar

3.3 Access

Visit http://localhost:8080 and you will see the Sentinel console. The default account and password are both sentinel.

image-20230217152351557

4. Microservice integration with Sentinel

To use Sentinel, it must be integrated with microservices. Here we use the Spring Cloud project from before; anyone who needs it can download it:

springCloud case demonstration: springcloud learning demonstration (gitee.com)

The project structure is as follows:

image-20230217155001204

Remember to start Nacos first:

image-20230217161211340

Then start these three projects:

image-20230217164625461

Now that our microservices are ready, the next step is to integrate Sentinel.

We will explain based on order-service.

4.1 Introducing dependencies

<!--sentinel-->
<dependency>
    <groupId>com.alibaba.cloud</groupId> 
    <artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
</dependency>

4.2 Configuration Console

Modify the application.yaml file and add the following content:

image-20230217165817510
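The screenshot shows the configuration. For reference, a minimal sketch of what it typically looks like (assuming the console from section 3.2 is running on localhost:8080):

spring:
  cloud:
    sentinel:
      transport:
        dashboard: localhost:8080 # address of the Sentinel console started earlier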

After finishing, restart the service!!!

4.3 Access any endpoint (interface) of order-service

Open a browser and visit http://localhost:8088/order/101 to trigger Sentinel's monitoring.

Then visit the sentinel console to see the effect:

image-20230217170038899

5. Rate limiting rules

Next we are going to learn how to use Sentinel to solve the avalanche problem we mentioned before.

We mentioned four solutions when explaining the avalanche problem, and Sentinel mainly implements three of them.

They are:

  1. Rate limiting
  2. Thread isolation
  3. Circuit breaker degradation

5.1 Quick Start

First, let's learn the basic usage of Sentinel's rate limiting through a quick start.

Let’s first understand a concept.

Cluster point link:

When a request enters a microservice, it first reaches the DispatcherServlet and then goes through the Controller, Service, and Mapper. Such a call chain is called a cluster point link. Every monitored interface in that link is a resource.

By default, Sentinel monitors every endpoint of Spring MVC (that is, every controller method), so each Spring MVC endpoint is a resource in the call link.

For example, the endpoint in the OrderController in the order-service we just accessed: /order/{orderId}

image-20230217171312093

Flow control, circuit breaker, etc. are all set for the resources in the cluster point link, so we can click the button behind the corresponding resource to set the rules:

  • Flow control: rate limiting
  • Downgrade: circuit breaker degradation
  • Hotspot: hotspot parameter limiting, a special kind of rate limiting
  • Authorization: permission control over requests

This means we can apply operations such as degradation to any of these resources.

So how to do it specifically?

For example, if we click the flow control button behind the resource /order/{orderId}, a form pops up.

image-20230217171558742

This form has several fields to fill in.

The first is the resource name. Since you clicked the button for /order/{orderId}, that is the default resource name, meaning this flow control rule targets that request.

The second is the source, meaning which callers' requests should be limited. The default value limits all incoming requests regardless of origin; in general we do not limit by source.

The third is the threshold type; usually QPS is chosen. QPS is the number of requests per second, and the single-machine threshold below it is the upper limit on that QPS: a value of 1 means only one request per second is allowed, and excess requests are intercepted with an error. How high should the threshold be set in practice? Set it to the maximum concurrency your interface can handle, and you find that out through stress testing.

5.1.1 Demonstration

Next, we set a flow control rule for the resource /order/{orderId}: QPS cannot exceed 5. Then we use JMeter to run a stress test.

image-20230220113421407
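Rules created in the console need no code at all. For reference, the same rule can also be registered programmatically through Sentinel's rule API. A minimal sketch (the FlowRuleConfig class name is made up for illustration, and this assumes a Spring Boot 2.x project where javax.annotation is available):

// hypothetical config class in order-service
import java.util.Collections;
import javax.annotation.PostConstruct;
import org.springframework.stereotype.Component;
import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;

@Component
public class FlowRuleConfig {

    @PostConstruct
    public void initFlowRules() {
        FlowRule rule = new FlowRule();
        rule.setResource("/order/{orderId}");       // resource name from the cluster point link
        rule.setGrade(RuleConstant.FLOW_GRADE_QPS); // threshold type: QPS
        rule.setCount(5);                           // single-machine threshold
        FlowRuleManager.loadRules(Collections.singletonList(rule));
    }
}

One difference worth knowing: rules added in the console are held in memory and disappear when the application restarts, while rules loaded in code are re-registered on every startup.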

Use JMeter:

image-20230220115344481

We can see that requests were sent, but only five passed each second; the others failed.

We can also take a look at the Sentinel console.

image-20230220115632355

5.2 Flow control mode

Through the quick start, we have learned Sentinel's basic usage. Next, let's look at the advanced rate limiting configuration.

When adding a flow-limiting rule, click Advanced Options to choose from three flow-control modes :

image-20230220144633433

  • Direct: count requests on the current resource and limit the current resource itself when the threshold is triggered. This is the default mode.
  • Association: count requests on another resource related to the current one; when that related resource triggers the threshold, the current resource is limited. That is, with two resources A and B, A triggers the threshold but the limiting is applied to B.
  • Link: count only the requests that reach this resource via a specified entry link; when the threshold is triggered, only that link is limited, not the whole resource. For example, with three resources A, B, and C, where both A and B call C: when counting on C, only requests arriving from A are counted, and those from B are ignored.

Direct mode needs no further introduction. Let's focus on association mode and link mode: under what circumstances should each be used?

5.2.1 Association mode

Usage scenario: we have a user payment business that must update the order status once the user has paid. At the same time, other users may be querying orders. The query and update operations compete for the same database locks; if writes are too frequent they inevitably slow down reads, and vice versa.

Business-wise, the payment (order update) business clearly has the higher priority, and the query business the lower one. So what we want is: when the update business triggers the threshold, limit the query business, so that queries cannot interfere with updates.

Case:

Next, we use a case to demonstrate how to use the association mode.

Let's first add two interfaces to the code.

image-20230220150458021

Restart the service and check the cluster point link of the sentinel console:

image-20230220150946774

After a restart the previous records are gone, so we have to hit the endpoints again:

localhost:8082/order/query

localhost:8082/order/update

image-20230220151124405

Now that query and update both appear in the Sentinel console, our requirement is: when the QPS of update reaches 5, limit the flow of query.

So which resource should the flow control rule be added to?

Remember: the rule is added to whichever resource is being limited.

So we add the flow control rule to query.

image-20230220152214404
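For reference, the same association rule expressed with the rule API (a sketch, reusing the FlowRuleConfig idea from the quick start):

// inside an init method, as in the quick-start sketch
FlowRule rule = new FlowRule();
rule.setResource("/order/query");               // the resource that gets limited
rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
rule.setCount(5);                               // threshold measured on the related resource
rule.setStrategy(RuleConstant.STRATEGY_RELATE); // association mode
rule.setRefResource("/order/update");           // the watched, higher-priority resource
FlowRuleManager.loadRules(Collections.singletonList(rule));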

Next, we use JMeter to do some testing.

image-20230220152949100

The update requests sent by JMeter pass without being limited. Now let's look at the query:

image-20230220153141146

It is indeed restricted.

In conclusion:

We can use association mode when the following two conditions are met:

  1. Two resources compete with each other
  2. One has higher priority and the other lower priority

5.2.2 Link mode

Next let's learn about link mode, directly through an example.

For example, I have two request links:

  • /test1 --> /common

  • /test2 --> /common

One accesses common from test1, and the other accesses common from test2.

And we have such a configuration below.

image-20230220155606813

The configured resource name is common, and the flow control mode is link, with test2 as the entry resource.

So what does the entry resource mean?

It means that when doing the limiting statistics, only requests entering common from test2 are counted; requests coming in from test1 are ignored.

So this kind of statistics is a kind of statistics on the source of requests.

The question is: under what circumstances should we use this mode?

Say I have an order-query business and an order-creation business, and both need to query goods. Doesn't that form two links?

One from order query to goods query, and one from order creation to goods query. Order queries often have high concurrency, while the goods query has its own concurrency ceiling. If the order-query traffic is too high, it will inevitably affect the order-creation business. So how do we set a limit on just the goods queries coming from the query business?

Steps:

  1. Add a queryGoods method in OrderService, without implementing any business logic

  2. In OrderController, modify the /order/query endpoint so it calls the queryGoods method of OrderService

  3. Add an /order/save endpoint in OrderController that also calls the queryGoods method of OrderService

  4. Add a flow rule for queryGoods: requests entering queryGoods from /order/query must be limited to a QPS below 2

Now that we know the steps, let’s implement it:

First, in the order-service service, add a queryGoods method to the OrderService class

image-20230220162150603
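The screenshot shows the added method; a minimal sketch of what it might look like (the log text is an assumption):

// order-service: OrderService
public void queryGoods() {
    // no real business logic; we only need a resource to attach rules to
    System.err.println("Querying goods");
}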

In the OrderController of order-service, modify the business logic of the /order/query endpoint:

image-20230220162350687

Then in the OrderController of order-service, modify the /order/save endpoint to simulate adding a new order:

image-20230220162556957
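Since the code is shown as screenshots, here is a sketch of the two endpoints (the paths come from the steps above; the method bodies and return values are assumptions):

// order-service: OrderController
@GetMapping("/query")
public String queryOrder() {
    orderService.queryGoods();            // link 1: /order/query -> goods
    System.out.println("Querying order");
    return "Order query succeeded";
}

@GetMapping("/save")
public String saveOrder() {
    orderService.queryGoods();            // link 2: /order/save -> goods
    System.out.println("Creating order");
    return "Order created";
}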

By default, the methods in OrderService are not monitored by Sentinel. We need to mark the methods to be monitored through annotations.

Add the @SentinelResource annotation to the queryGoods method of OrderService:

image-20230220162842890
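A sketch of the annotated method (assuming the resource name goods, which matches what appears in the console below):

import com.alibaba.csp.sentinel.annotation.SentinelResource;

@SentinelResource("goods") // "goods" becomes the resource name in the cluster point link
public void queryGoods() {
    System.err.println("Querying goods");
}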

In link mode we monitor two links with different entry points. However, by default Sentinel assigns the same root resource to all requests entering Spring MVC, which makes link mode ineffective.

We need to turn off this context unification for Spring MVC by modifying the application.yml file of the order-service service:

image-20230220163243591
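The added configuration typically looks like this (a sketch of the property shown in the screenshot):

spring:
  cloud:
    sentinel:
      web-context-unify: false # stop merging all endpoints under one root context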

Restart the service and access /order/query and /order/save. You can see that new resources have appeared in sentinel’s cluster point link rules:

image-20230220165021883

We can see that order/save and order/query have become two independent links. Before context unification was turned off, they were two sub-links under the same root link.

With that solved, we can add a flow control rule to the goods resource. Two goods nodes appear above; pick either one, since they are the same resource.

image-20230220170006945

With that, the link rule is configured. Next we test with JMeter. (For now we only limit query.)

image-20230220170306967

You can see 200 users here, sending requests over 50 seconds, i.e. a QPS of 4, which exceeds the threshold of 2 that we set.

One HTTP request accesses /order/save:

image-20230220170339124

The other HTTP request accesses /order/query:

image-20230220170349160

The results of the run:

save:

image-20230220172122074

query:

image-20230220172141809

We can see that save is not affected at all, while the query requests fail. This is rate limiting based on the request's origin.

5.3 Flow control effect

We have already learned the flow control mode before, and next we will learn the flow control effect. What is the flow control effect?

The flow control effect refers to the measure taken once requests reach the flow control threshold. There are three types:

  • Fail fast: once the threshold is reached, new requests are rejected immediately and a FlowException is thrown. This is the default. That is why in our earlier tests, a service that triggered limiting returned status code 429 with a message saying the request was blocked by flow limiting.

  • Warm up: similar to fail fast in that requests over the threshold are rejected with an exception, but different in that the threshold changes dynamically, gradually rising from a small initial value to the maximum threshold.

  • Queue and wait: unlike the other two, it does not reject immediately. All requests are queued and executed in order, with the interval between two requests never smaller than a specified time. Requests whose expected waiting time exceeds the maximum allowed are rejected with an exception.

5.3.1 warm up: warm-up mode

Let's first look at warm up, the warm-up mode.

In warm-up mode, requests over the threshold are also rejected with an exception, but the special thing is that the threshold is not static. This scheme deals with service cold starts. So what is a cold start?

It's like a person: before strenuous exercise you do some stretching to warm up, otherwise you can easily pull a muscle.

The same goes for a server. Its maximum QPS may be, say, 10, but if you hit it with full QPS the instant it starts, it may be knocked over before it can even react. A freshly started service cannot immediately sustain its full QPS.

How to do it?

In warm-up mode, the initial request threshold = maxThreshold (maximum threshold) / coldFactor (cold start factor). After a specified warm-up period, it gradually rises to maxThreshold. The default coldFactor is 3.

Warm-up mode exists to get past the cold start moment without excessive concurrency causing failures.

For example, if I set the maxThreshold of QPS to 10 and the warm-up time to 5 seconds, then the initial threshold is 10 / 3, which is 3, and then gradually increases to 10 after 5 seconds.

image-20230220180143994

Case:

We set a flow limit for the resource /order/{orderId}: maximum QPS 10, with the warm up effect and a warm-up time of 5 seconds.

image-20230220194556055
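For reference, the equivalent rule via the API (a sketch, reusing the init-method idea from the quick start):

// inside an init method, as in the quick-start sketch
FlowRule rule = new FlowRule();
rule.setResource("/order/{orderId}");
rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
rule.setCount(10);                                              // maxThreshold
rule.setControlBehavior(RuleConstant.CONTROL_BEHAVIOR_WARM_UP); // warm up effect
rule.setWarmUpPeriodSec(5);                                     // ramp from 10/3 ≈ 3 up to 10 over 5s
FlowRuleManager.loadRules(Collections.singletonList(rule));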

JMeter test:

image-20230220194634140

QPS is 10.

image-20230220194837355

Right after startup, most requests failed and only 3 per second succeeded, showing that QPS was limited to 3:

As time goes by, the success rate gets higher and higher:

image-20230220194903517

Go to the Sentinel console to view real-time monitoring:

image-20230220194928287

After a while:

image-20230220195039828

5.3.2 Waiting in line

The fail fast and warm up effects we discussed reject excess requests and throw exceptions, while queue-and-wait lets all requests enter a queue and executes them sequentially at the interval allowed by the threshold.

Subsequent requests must wait for previous executions to complete, and will be rejected if the expected waiting time for the request exceeds the maximum duration.

Here's an example:

If QPS = 5 (five requests per second), then one queued request is processed every 200ms, and Sentinel enforces this interval strictly: the next request starts no sooner than 200ms after the previous one.

Even if the previous request finishes in less than 200 milliseconds, the next one still waits out the full 200 milliseconds.

From this we can compute the expected waiting time: if there are already 5 requests ahead of mine, how long do I expect to wait? 5 × 200ms = 1 second. That is the expected waiting time.

With timeout = 2000, any request whose expected waiting time exceeds 2000ms is rejected and an exception is thrown.

Now suppose 10 requests arrive simultaneously in the first second, but only 1 arrives in the second second. The QPS curve then looks like this:

image-20230221135854753

If you use the queue mode for flow control, all incoming requests must be queued and executed at a fixed interval of 200ms, and the QPS will become very smooth:

image-20230221135931069

A smooth QPS curve is more friendly to the server.

Next let's implement a case: set a flow limit for the resource /order/{orderId}, with maximum QPS 10, the queue-and-wait flow control effect, and a timeout of 5s.

image-20230221140247042
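Again for reference, the equivalent rule via the API (a sketch):

// inside an init method, as in the quick-start sketch
FlowRule rule = new FlowRule();
rule.setResource("/order/{orderId}");
rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
rule.setCount(10);                                                   // one request per 100ms
rule.setControlBehavior(RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER); // queue and wait
rule.setMaxQueueingTimeMs(5000);                                     // reject if expected wait > 5s
FlowRuleManager.loadRules(Collections.singletonList(rule));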

Then we test with JMeter:

image-20230221140452265

The QPS is 15, which exceeds the 10 we set.

With the earlier fail-fast or warm-up modes, the excess requests would simply be rejected with errors.

But let’s look at the results of queue mode:

image-20230221140714503

All passed.

Go to sentinel to view the real-time monitored QPS curve:

image-20230221141247920

Look at the middle segment: the QPS is very smooth, held steadily at 10. We sent 15 per second but only 10 passed, so where did the extra requests go? They are waiting in the queue to be executed, which is why the response time (waiting time) keeps growing.

When the queue is full, some requests will fail.

5.4 Hotspot parameter rate limiting

5.4.1 Global parameter rate limiting

In this chapter we look at a special kind of rate limiting: hotspot parameter limiting. What is special about it?

The rate limiting we learned earlier counts all requests entering a resource when computing its QPS, and then checks whether that QPS exceeds the threshold.

Hotspot parameter limiting instead counts requests with the same parameter value separately, and checks each count against the threshold. What does that mean?

For example, I currently have a resource that queries products based on ID.

image-20230223164126458

There are now 4 requests coming in.

image-20230223164221953

With the original statistics, the QPS is 4. With hotspot parameter statistics, requests are judged by parameter value: the first three requests have ID 1, while the last one has ID 2.

image-20230223164433226

So the QPS is counted separately: the count for ID 1 is 3, and the count for ID 2 is 1.

Configuration example:

image-20230223164652755

The key is these three settings. The first is the parameter index: 0 means index 0 in the resource's parameter list, i.e. the first parameter.

The single-machine threshold is 5 and the statistics window length is 1 second; together they mean at most five requests per second.

The whole configuration means: do statistics on parameter 0 (the first parameter) of the hot resource, and the number of requests with the same parameter value may not exceed 5 per second.

5.4.2 Hotspot parameter rate limiting

In the configuration just now, the goods-query interface treats all goods equally, limiting each to a QPS of 5.

In real development, some goods are hot items, such as flash sale goods, and we want their QPS limit to be different from (and higher than) that of other goods. For that you configure the advanced options of hotspot parameter limiting:

image-20230223165310702

Combined with the previous configuration, this means: limit flow on parameter 0 (a long), so that the QPS for any one parameter value may not exceed 5 per second, with two exceptions (a rule sketch follows the list):

• If the parameter value is 100, the QPS allowed per 1 second is 10

• If the parameter value is 101, the QPS allowed per 1 second is 15
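A sketch of this configuration via the hotspot rule API (the resource name hot is an assumption; it must match a @SentinelResource-marked method, and the class type must match the parameter's Java type):

import java.util.Arrays;
import java.util.Collections;
import com.alibaba.csp.sentinel.slots.block.flow.param.ParamFlowItem;
import com.alibaba.csp.sentinel.slots.block.flow.param.ParamFlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.param.ParamFlowRuleManager;

// inside an init method, as in the earlier sketches
ParamFlowRule rule = new ParamFlowRule("hot")
        .setParamIdx(0) // statistics on the first parameter
        .setCount(5);   // default: at most 5 QPS per parameter value

// exceptions for specific parameter values
ParamFlowItem item100 = new ParamFlowItem().setObject("100")
        .setClassType(Long.class.getName()).setCount(10);
ParamFlowItem item101 = new ParamFlowItem().setObject("101")
        .setClassType(Long.class.getName()).setCount(15);
rule.setParamFlowItemList(Arrays.asList(item100, item101));

ParamFlowRuleManager.loadRules(Collections.singletonList(rule));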

Case:

Add hotspot parameter limiting to the resource /order/{orderId} with the following rules:

  • The default hotspot parameter rule is that the number of requests per second should not exceed 2

  • Set an exception for the parameter 102: the number of requests per second should not exceed 4

  • Set an exception for the parameter 103: the number of requests per second should not exceed 10

Note: hotspot parameter flow limiting does not work on default Spring MVC resources. You must mark the resource with the @SentinelResource annotation!!!

So the first thing to do is not to configure rules but to modify the code and add the annotation.

We add the annotation to the /order/{orderId} resource in the OrderController of order-service.

image-20230223181643468
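A sketch of the annotated endpoint (the resource name hot and the method signature are assumptions based on the existing /order/{orderId} endpoint):

import com.alibaba.csp.sentinel.annotation.SentinelResource;

// order-service: OrderController
@SentinelResource("hot") // marks the endpoint as a Sentinel resource named "hot"
@GetMapping("{orderId}")
public Order queryOrderById(@PathVariable("orderId") Long orderId) {
    return orderService.queryOrderById(orderId);
}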

Restart the service.

After accessing this interface, you can see that the hot resource we marked appears:

image-20230223182112995

Don't click the button behind the hot resource here; that page has a bug.

Click the hotspot rules menu in the left menu:

image-20230223182505283

Click to add

image-20230223182527763

Fill out the form

image-20230223182654939

Click Add, then open JMeter.

image-20230223183528386

It contains 3 HTTP requests:

Normal parameter value, QPS threshold 2:

image-20230223183616405

Run result:

image-20230223183646694

Exception parameter value 102, QPS threshold 4:

image-20230223183709675

Run result:

image-20230223183752404

Exception parameter value 103, QPS threshold 10:

image-20230223183835184

Run result:

image-20230223183845745


Origin blog.csdn.net/weixin_53041251/article/details/129237061