Architecture - Rate Limiting

What is rate limiting

        In developing highly concurrent systems, three weapons are commonly used to protect the system: caching, degradation, and rate limiting. The purpose of caching is to speed up access and increase the capacity the system can handle; it can be described as the silver bullet against high-concurrency traffic. Degradation temporarily shields a service when it has problems or affects the performance of a core flow, and re-enables it after the peak passes or the problem is resolved. But some scenarios cannot be solved with caching or degradation, such as scarce resources (flash sales, panic buying), write services (comments, orders), and frequent complex queries (the last few pages of a comment list). For these scenarios a means is needed to limit the concurrency / request volume: rate limiting.

       The purpose of rate limiting is to protect the system by limiting the rate of concurrent access/requests, or the number of requests within a time window. Once the limit is reached, the service can deny requests (redirect to an error page or report that the resource is gone), queue them or make them wait (as in flash sales, comments, orders), or degrade (return fallback or default data, such as a product detail page showing "in stock" by default).

       High-concurrency systems commonly apply these limits: limiting the total number of concurrent requests (e.g., database connection pools, thread pools); limiting the number of instantaneous concurrent connections (e.g., nginx's limit_conn module, which caps instantaneous concurrent connections); and limiting the average rate within a time window (e.g., Guava's RateLimiter or nginx's limit_req module, which limit the average rate per second). There are other restrictions too, such as limiting the call rate of a remote interface or the consumption rate of an MQ. Rate limiting can also be based on network connections, network traffic, CPU, or memory load.

       The caching silver bullet mentioned earlier deals with the high traffic and high concurrency of events like 618 and Double Eleven; rate limiting is arguably even more forceful in handling high-concurrency problems, since you need not worry about instantaneous traffic hanging the system or causing an avalanche: in the end some requests are limited rather than the whole service being lost. But rate limits must be assessed carefully, not applied indiscriminately, otherwise normal traffic will hit strange problems and users will complain.

      In practical applications, don't get too tangled up over the algorithm itself, because some rate-limiting implementations are not exactly the algorithms described below. Which technique to use still has to be chosen for the concrete scenario; don't blindly chase the "best" option. Black cat or white cat, the cat that solves the problem is a good cat.

      Because many people have run into rate limiting in practical work and asked how to do it, this article explains the various means in detail. We will study the techniques from the bottom up: rate-limiting algorithms, application-level rate limiting, distributed rate limiting, and rate limiting at the access layer.

Rate-limiting algorithms

1. Token bucket algorithm

The token bucket algorithm uses a bucket of fixed capacity into which tokens are added at a fixed rate. It is described as follows (a minimal Java sketch follows the list):

1. Assume the limit is 2r/s; then tokens are added to the bucket at a fixed rate of one every 500 milliseconds;

2. The bucket holds at most b tokens; when the bucket is full, newly added tokens are discarded or rejected;

3. When a packet of n bytes arrives, n tokens are removed from the bucket and the packet is sent to the network;

4. If fewer than n tokens remain in the bucket, no tokens are removed, and the packet is rate-limited (either discarded or buffered to wait).
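
To make the steps above concrete, here is a minimal token-bucket sketch in Java. It is illustrative only (the class and field names are ours, not from any library), and it refills tokens lazily from elapsed time instead of running a timer thread:

```java
// Token bucket: capacity b, refilled at a fixed rate r; a request taking n
// tokens passes only if at least n tokens are currently in the bucket.
public class TokenBucket {
    private final long capacity;       // b: maximum tokens the bucket can hold
    private final double refillPerMs;  // r: tokens added per millisecond
    private double tokens;             // current token count
    private long lastRefill;           // timestamp of the last lazy refill

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefill = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire(int n) {
        long now = System.currentTimeMillis();
        // Add the tokens accumulated since the last call, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens >= n) {   // enough tokens: remove n and let the request through
            tokens -= n;
            return true;
        }
        return false;        // too few tokens: the request is limited (drop or buffer)
    }
}
```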

 

2. Leaky bucket algorithm

Used as a metering tool (The Leaky Bucket Algorithm as a Meter), the leaky bucket can serve both traffic shaping and traffic policing. The algorithm is described as follows (a minimal Java sketch follows the list):

1. The bucket has a fixed capacity, and water drops flow out at a fixed constant rate;

2. If the bucket is empty, no water flows out;

3. Water drops may flow into the bucket at any rate;

4. If the inflowing water exceeds the bucket's capacity, it overflows (is discarded); the bucket's capacity does not change.
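
Under the same assumptions (illustrative names, lazy bookkeeping instead of a background drain thread), a minimal leaky-bucket-as-meter sketch:

```java
// Leaky bucket as a meter: water drains at a constant rate; an arriving drop is
// accepted only if it still fits in the bucket, otherwise it overflows (rejected).
public class LeakyBucket {
    private final double capacity;   // fixed bucket capacity
    private final double leakPerMs;  // constant outflow rate per millisecond
    private double water;            // current water level
    private long lastLeak;           // timestamp of the last lazy drain

    public LeakyBucket(double capacity, double leaksPerSecond) {
        this.capacity = capacity;
        this.leakPerMs = leaksPerSecond / 1000.0;
        this.lastLeak = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drain the water that leaked out since the last call (never below empty).
        water = Math.max(0, water - (now - lastLeak) * leakPerMs);
        lastLeak = now;
        if (water + 1 <= capacity) {  // the drop fits: accept the request
            water += 1;
            return true;
        }
        return false;                 // bucket full: the drop overflows and is discarded
    }
}
```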

Comparison of token bucket and leaky bucket:

1. The token bucket adds tokens to the bucket at a fixed rate; whether a request gets processed depends on whether enough tokens remain in the bucket, and new requests are rejected when the token count drops to zero;

2. The leaky bucket lets requests flow out at a fixed constant rate while the inflow rate is arbitrary; when the accumulated requests exceed the bucket's capacity, new requests are rejected;

3. The token bucket limits the average inflow rate (it allows bursts: requests can be processed as long as tokens remain, including taking 3 or 4 tokens at once), so it permits a degree of burst traffic;

4. The leaky bucket limits the constant outflow rate (i.e., the outflow rate is a fixed constant value, for example always 1, never 1 on one occasion and 2 on the next), thereby smoothing a bursty inflow rate;

5. The token bucket allows a certain degree of burst, while the leaky bucket's main purpose is to smooth the inflow rate;

6. The two algorithms can be viewed as the same thing in opposite directions: with the same parameters, the limiting effect obtained is the same.

 

       Sometimes a plain counter can also be used for rate limiting, mainly to limit the total number of concurrent requests, such as the size of a database connection pool or thread pool, or the number of concurrent flash-sale requests. Whenever the global total number of requests, or the total within a certain time window, reaches a set threshold, further requests are limited. This is simple, crude limiting of a total count, rather than average-rate limiting.
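
A bounded thread pool is already exactly this kind of counter-style limit on total concurrency; a minimal sketch (the pool sizes are arbitrary examples):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    // At most 10 tasks run concurrently and up to 100 wait in the queue;
    // anything beyond that threshold is rejected immediately with a
    // RejectedExecutionException, which the caller can turn into a fallback.
    static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.AbortPolicy());
}
```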

 

Application-level rate limiting

    For any application system there is a limit to the concurrency/requests it can handle, i.e., there is always a TPS/QPS threshold. If the threshold is exceeded, the system will stop responding to user requests or respond very slowly, so we had better add overload protection to prevent a flood of requests from bringing the system down.

 

   If you have used Tomcat, its Connector has, among others, the following parameters (a sketch of setting them follows the list):

    1. acceptCount: if all Tomcat threads are busy responding, new connections enter a queue; if the queue size is exceeded, connections are refused;

    2. maxConnections: the maximum number of instantaneous connections; connections beyond this number wait;

    3. maxThreads: the maximum number of threads Tomcat can start to handle requests; if the request volume stays far greater than the maximum number of threads, the system may hang.

    For detailed configuration, please refer to the official documentation. Similarly, MySQL (e.g., max_connections) and Redis (e.g., tcp-backlog) have comparable configurations to limit the number of connections.
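
The same parameters can also be set programmatically; here is a sketch assuming the embedded Tomcat API (org.apache.catalina.startup.Tomcat), with arbitrary example values:

```java
import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;

public class EmbeddedTomcatLimits {
    public static void main(String[] args) throws Exception {
        Tomcat tomcat = new Tomcat();
        tomcat.setPort(8080);
        Connector connector = tomcat.getConnector();
        connector.setProperty("acceptCount", "100");      // queue size before refusing connections
        connector.setProperty("maxConnections", "8192");  // instantaneous connection cap
        connector.setProperty("maxThreads", "200");       // worker threads handling requests
        tomcat.start();
        tomcat.getServer().await();
    }
}
```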

 

Limiting the total number of resources

 

        If a resource is scarce (such as database connections or threads) and more than one system may use it, the application needs to limit its use. Pooling techniques can cap the total amount of the resource: connection pools, thread pools. For example, if an application is allocated 100 database connections, it can use at most 100 of this resource, and beyond that it can wait or throw an exception.
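
A minimal sketch of such a cap using a semaphore (the quota of 100 mirrors the example above; the class and method names are illustrative):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

public class ResourceQuota {
    private final Semaphore permits = new Semaphore(100);  // at most 100 in use at once

    public <T> T withResource(Callable<T> work) throws Exception {
        if (!permits.tryAcquire()) {
            // Quota exhausted: either wait (permits.acquire()) or fail fast.
            throw new IllegalStateException("resource quota exhausted");
        }
        try {
            return work.call();
        } finally {
            permits.release();  // always return the permit
        }
    }
}
```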

 

Limiting an interface's total concurrency / number of requests

 

      If an interface may receive unexpected traffic, and you worry that too much of it will cause a collapse, as in a panic-buying business, you then need to limit the interface's total concurrency / total number of requests. Because the granularity here is relatively fine, each interface can be given its own threshold.

     This suits rate limiting services where rejection does no business harm, or services that need overload protection, such as panic buying: beyond the limit, either queue users or tell them the goods are gone, which users can accept. Some open platforms likewise cap the trial request volume for calls to a given interface, which can also be implemented with this counter approach. This too is simple, crude rate limiting with no smoothing, so choose it according to the actual situation.
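
A minimal counter sketch for one interface (illustrative names; callers must pair tryEnter() with exit() in a finally block):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class InterfaceLimiter {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final int threshold;  // per-interface concurrency threshold

    public InterfaceLimiter(int threshold) { this.threshold = threshold; }

    public boolean tryEnter() {
        if (inFlight.incrementAndGet() > threshold) {
            inFlight.decrementAndGet();  // over the limit: queue the user or report "sold out"
            return false;
        }
        return true;
    }

    public void exit() {  // call in a finally block once the request is handled
        inFlight.decrementAndGet();
    }
}
```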

 

Limiting an interface's requests within a time window

    

       That is, limit the number of requests within a time window, for example capping an interface/service's requests or call volume per second, per minute, or per day. Some basic services are called by many other systems; for instance, the product detail page service calls the basic product service, and to avoid a large update volume knocking the basic service over, we rate-limit the call volume per second/per minute.
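
One common sketch of a time-window counter keeps one AtomicLong per second in Guava's LoadingCache (the class name and the 2-second expiry are illustrative choices):

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class WindowLimiter {
    // One counter per second; entries expire shortly after their window passes.
    private final LoadingCache<Long, AtomicLong> counters = CacheBuilder.newBuilder()
            .expireAfterWrite(2, TimeUnit.SECONDS)
            .build(new CacheLoader<Long, AtomicLong>() {
                @Override
                public AtomicLong load(Long second) {
                    return new AtomicLong(0);
                }
            });
    private final long limitPerSecond;

    public WindowLimiter(long limitPerSecond) { this.limitPerSecond = limitPerSecond; }

    public boolean tryAcquire() throws Exception {
        long currentSecond = System.currentTimeMillis() / 1000;
        return counters.get(currentSecond).incrementAndGet() <= limitPerSecond;
    }
}
```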

 

 

Smoothly limiting an interface's requests

 

      None of the preceding approaches copes well with burst requests: an instantaneous burst may be admitted in full and cause problems. In some scenarios, therefore, bursts need to be shaped into processing at an average rate (for example, at 5r/s one request is processed every 200 milliseconds, smoothing the rate). Two algorithms fit this scenario: the token bucket and the leaky bucket.
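
Guava's RateLimiter, mentioned earlier, implements exactly this kind of token-bucket-style smoothing; a minimal usage sketch:

```java
import com.google.common.util.concurrent.RateLimiter;

public class SmoothEntry {
    // 5 permits per second: acquire() paces callers to roughly one every 200 ms.
    private static final RateLimiter LIMITER = RateLimiter.create(5.0);

    public static void handle(Runnable request) {
        LIMITER.acquire();  // blocks until a permit is free, smoothing bursts
        request.run();      // use tryAcquire() instead to reject rather than wait
    }
}
```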

 

      Turning to distributed rate limiting: some people worry whether Redis or nginx can hold up when an application's concurrency is very high. Consider the question from several angles: is the traffic really that large; can the distributed rate limiting be sharded with consistent hashing; can it be degraded to application-level rate limiting when concurrency gets too high. There are plenty of countermeasures, adjustable to the actual situation; at JD, for example, Redis+Lua is used to rate-limit panic-buying traffic, and ordinary traffic poses no problem.
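
A sketch of that kind of Redis+Lua limiter via Jedis is below; this is a generic fixed-window script for illustration, not JD's actual implementation:

```java
import java.util.Collections;
import redis.clients.jedis.Jedis;

public class RedisWindowLimiter {
    // INCR the per-second key and set its expiry on first use; the Lua script
    // runs atomically inside Redis, so the two calls cannot race.
    private static final String SCRIPT =
            "local c = redis.call('INCR', KEYS[1]) " +
            "if c == 1 then redis.call('EXPIRE', KEYS[1], ARGV[1]) end " +
            "return c";

    public static boolean allow(Jedis jedis, String resource, long limitPerSecond) {
        String key = resource + ":" + (System.currentTimeMillis() / 1000);
        Long count = (Long) jedis.eval(SCRIPT,
                Collections.singletonList(key),
                Collections.singletonList("2"));  // expire 2s after the window opens
        return count <= limitPerSecond;
    }
}
```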

 

      The distributed rate-limiting scenarios encountered so far are rate limiting in the business layer, not at the traffic entrance; entrance traffic should be limited at the access layer, for which the author generally uses Nginx.
