Spring Cloud Alibaba reading notes (6): service rate limiting and circuit breaking/degradation

Service rate limiting, circuit breaking, and degradation

Service rate limiting

1. Purpose

Rate limiting protects system stability by capping the number of concurrent accesses, or the number of requests allowed within a time window. By sacrificing availability for a small portion of users, it keeps the service stable and reliable for the majority.

2. Common implementations
  • Add a rate-limiting module at the Nginx layer to limit the average access rate
    • Limit the number of concurrent connections from the same IP address: the limit_conn module (limit_conn_zone / limit_conn directives)
# limit_conn_zone defines a shared memory zone that stores connection state;
# here the zone is named "test", sized at 10 MB, and keyed by client address
limit_conn_zone $binary_remote_addr zone=test:10m;
server {
    listen       80;
    server_name  localhost;

    location / {
        # limit_conn sets the maximum number of concurrent connections per key
        # in the zone; requests over the limit receive a predefined error response.
        # limit_conn_status and limit_conn_log_level are not set explicitly here,
        # so their default values apply.
        limit_conn test 1;
        root   html;
        index  index.html index.htm;
    }
}
  • Limit the number of requests from the same IP within a period of time: the limit_req module (limit_req_zone / limit_req directives)
# limit_req_zone defines a 10 MB shared memory zone named "addr" that stores
# request-frequency state, keyed by client address
# rate=1r/s allows each client key at most one request per second
limit_req_zone $binary_remote_addr zone=addr:10m rate=1r/s;
server {
    listen       80;
    server_name  localhost;

    location / {
        # burst=5 configures a queue of size 5: requests exceeding rate=1r/s
        # can wait in this queue, and once it is full, further requests are
        # rejected immediately with 503.
        # About nodelay:
        #   if set, up to (rate + burst) requests can be handled at once, and any
        #   request beyond (rate + burst) gets 503 immediately, so no request
        #   ever waits in the queue;
        #   if not set, queued requests are delayed and processed in turn.
        limit_req zone=addr burst=5 nodelay;
        root   html;
        index  index.html index.htm;
    }
}
  • Limit per-connection transfer bandwidth with limit_rate
server {
    listen       80;
    server_name  localhost;

    location / {
        # start throttling once 3 MB of the response have been sent
        limit_rate_after 3m;
        # limit_rate caps the speed of data transfer to a client on a single
        # connection, in bytes per second
        limit_rate 512k;
        root   html;
        index  index.html index.htm;
    }
}
  • Size the database connection pool and thread pool to cap total concurrency
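The pool-sizing idea above can be sketched with a counting semaphore that bounds the number of in-flight requests, the same way a fixed-size pool does. A minimal sketch; the class and method names are my own, not from any framework:

```java
import java.util.concurrent.Semaphore;

// Bounds total concurrency the way a fixed-size connection pool or
// thread pool does. Illustrative names only.
class ConcurrencyLimiter {
    private final Semaphore slots;

    ConcurrencyLimiter(int maxConcurrent) {
        this.slots = new Semaphore(maxConcurrent);
    }

    // Try to claim a slot; returns false immediately when all slots are busy.
    boolean enter() {
        return slots.tryAcquire();
    }

    // Release the slot when the request finishes (call from a finally block).
    void exit() {
        slots.release();
    }
}
```

Callers wrap their work in `enter()` / `exit()`; a `false` from `enter()` is the signal to reject or queue the request.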

  • Limit interface access rate with the RateLimiter utility class from the Guava toolkit: it implements rate limiting with the token bucket algorithm. RateLimiter.create(1) creates a limiter, where the parameter is the number of tokens generated per second, and limiter.acquire() obtains tokens, blocking until one is available.
    Tokens can also be obtained with a wait timeout via tryAcquire(int permits, long timeout, TimeUnit unit); a timeout of 0 makes the call non-blocking, returning immediately if no token is available.

  • Traffic shaping, as in the TCP protocol: the goal is to regulate the average rate of data transmission so that sudden traffic bursts do not cause network congestion and packet loss.

3. Common rate-limiting algorithms
Counter algorithm


  • Description: accumulate the number of visits within a fixed period; when the count reaches the configured threshold, trigger the rate-limiting strategy, and reset the count when the next period begins.
  • Application: limiting the number of SMS sends the same user can trigger within one minute.
  • Problem: the boundary problem: requests clustered just before and just after the window boundary (the moment the count resets) can briefly pass through at up to twice the threshold.
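The fixed-window counter above could be sketched as follows (illustrative Java, not from Sentinel or any library); the reset inside tryAcquire is exactly where the boundary problem arises:

```java
// Fixed-window counter: allows at most `limit` requests per window.
class CounterRateLimiter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    CounterRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            // New window: reset the counter. A burst straddling this reset
            // can pass up to 2 * limit requests in a short span.
            windowStart = now;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```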
Sliding window algorithm (flow control)


  • Description: to solve the boundary problem of the counter algorithm, the sliding window was introduced. A fixed time window is divided into multiple small windows, the number of visits is counted in each small window, the window slides forward with time, and expired small windows are discarded. The limit is then enforced on the sum of visits across all small windows currently inside the sliding window.
  • Application: Sentinel implements rate limiting with a sliding window algorithm; in essence this shrinks the counting interval to minimize the boundary problem.
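A minimal sliding-window sketch, keeping exact request timestamps (the limiting case of ever-smaller sub-windows). This is illustrative plain Java, not Sentinel's actual data structure:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window limiter: counts only the requests whose timestamps
// still fall inside the last `windowMillis`.
class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drop timestamps that have slid out of the window.
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```

Production implementations (Sentinel included) keep per-sub-window counters instead of individual timestamps, trading a little precision for O(1) memory.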
Token bucket rate-limiting algorithm


  • Description: the system puts tokens into a fixed-capacity token bucket at a constant rate, discarding tokens once the bucket is full. Each request must obtain a token from the bucket before executing; if no token is available, the rate-limiting strategy is triggered.
  • Scenarios:
    • Request rate greater than token generation rate: tokens in the bucket are consumed quickly and subsequent requests enter the rate-limited state
    • Request rate equal to token generation rate: requests execute in a steady state
    • Request rate lower than token generation rate: system concurrency is low and requests are processed normally; the tokens accumulated in the bucket also absorb sudden bursts of new requests
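The token bucket described above can be sketched like this (illustrative only; Guava's RateLimiter is a production-grade version of the same idea, with different internals):

```java
// Token bucket: tokens accumulate at a fixed rate up to `capacity`;
// each request consumes one token or is rejected.
class TokenBucket {
    private final double capacity;
    private final double tokensPerMilli;
    private double tokens;
    private long last = System.currentTimeMillis();

    TokenBucket(double capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.tokensPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full, so bursts up to `capacity` pass
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Refill at the constant rate, capped at the bucket capacity
        // (tokens that would overflow are discarded).
        tokens = Math.min(capacity, tokens + (now - last) * tokensPerMilli);
        last = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```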
Leaky bucket rate-limiting algorithm


  • Description: this algorithm controls the rate at which data is injected into the network and smooths out bursts of traffic: water leaks out of the bucket at a constant rate no matter how fast it flows in. Most message middleware adopts the leaky-bucket idea: however large the producers' request volume, processing throughput is determined by the consumers.
  • Scenarios:
    • Request rate greater than the leak rate: the incoming data exceeds what the service can currently process, and the rate-limiting strategy is triggered
    • Request rate less than or equal to the leak rate: the server's processing capacity meets the clients' demand, and requests execute normally
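A minimal leaky-bucket sketch, mirroring the token bucket above but inverted: requests add water, which drains at a constant rate. Names and units are my own:

```java
// Leaky bucket: each request adds one unit of "water"; water drains at a
// constant rate; a request is rejected if it would overflow the bucket.
class LeakyBucket {
    private final double capacity;
    private final double leaksPerMilli;
    private double water = 0.0;
    private long last = System.currentTimeMillis();

    LeakyBucket(double capacity, double leaksPerSecond) {
        this.capacity = capacity;
        this.leaksPerMilli = leaksPerSecond / 1000.0;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drain at the constant leak rate since the last call.
        water = Math.max(0.0, water - (now - last) * leaksPerMilli);
        last = now;
        if (water + 1.0 <= capacity) {
            water += 1.0;
            return true;
        }
        return false;
    }
}
```

Unlike the token bucket, this shape allows no bursts: the outflow (processing) rate is constant regardless of how requests arrive, which is why it suits message middleware.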

Service circuit breaking and degradation

  • Circuit breaking:
    • In a microservice architecture, services are fine-grained and request chains are correspondingly long. Under high concurrency, once one service on the chain becomes unavailable, requests are likely to pile up and trigger an avalanche effect. To prevent this, the faulty interface must be temporarily isolated and its connection to external callers cut, i.e. the circuit breaker trips. For a period of time afterwards, callers' requests fail immediately, until the target service returns to normal.
  • Degradation indicators:
    • Average response time: when the average response time of a service interface exceeds the threshold, subsequent requests within a fixed time window are automatically broken
    • Exception ratio: when the ratio of exceptions to total calls on a service interface exceeds the threshold, the interface is automatically degraded and subsequent requests within a fixed time window return immediately
    • Exception count: similar to the exception ratio; when the number of exceptions thrown by the service interface exceeds the threshold, the circuit breaker trips
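The "exception count" indicator above can be sketched as a minimal circuit breaker. This is plain illustrative Java, not Sentinel's actual implementation; the names and the simple half-open policy are my own:

```java
// Minimal circuit breaker: after `maxFailures` failures the breaker opens
// and rejects requests; once `resetMillis` have elapsed it lets one trial
// request through (half-open) before deciding whether to close or re-open.
class CircuitBreaker {
    private final int maxFailures;
    private final long resetMillis;
    private int failures = 0;
    private long openedAt = 0L;
    private boolean open = false;

    CircuitBreaker(int maxFailures, long resetMillis) {
        this.maxFailures = maxFailures;
        this.resetMillis = resetMillis;
    }

    synchronized boolean allowRequest() {
        if (open && System.currentTimeMillis() - openedAt >= resetMillis) {
            // Half-open: permit a trial request; a single failure re-opens.
            open = false;
            failures = maxFailures - 1;
        }
        return !open;
    }

    synchronized void recordFailure() {
        if (++failures >= maxFailures) {
            open = true;
            openedAt = System.currentTimeMillis();
        }
    }

    synchronized void recordSuccess() {
        failures = 0;
    }
}
```

Callers check `allowRequest()` before each call and report the outcome via `recordSuccess()` / `recordFailure()`; while the breaker is open, requests fail fast instead of piling up on the faulty service.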


Origin blog.csdn.net/Nerver_77/article/details/108403161