Panicked: the big-company interviewer's final question was about rate limiting in high-concurrency systems?

The well-known blogger Kaitao (开涛) wrote that there are three weapons for protecting a system when developing for high concurrency: caching, degradation, and rate limiting. Combining that with the author's own experience, this article describes the concepts, algorithms, and common implementations of rate limiting.

Cache

Caching is easy to understand: in a large high-concurrency system, without a cache the database would be overwhelmed within minutes and the system would be paralyzed instantly. Using a cache not only improves access speed and raises concurrent throughput, it is also an effective way to protect the database and the system as a whole. Large websites are mostly "read"-heavy, so caching comes to mind naturally. But in large "write"-heavy systems the cache also plays a very important role: accumulating data and writing it in batches, in-memory queues used as buffers (producer/consumer), and HBase's write path are all measures that raise system throughput or protect the system through caching. Even message-oriented middleware can be thought of as a kind of distributed data cache.
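As a minimal illustrative sketch of the batch-write idea (not from the original article; all class and method names are assumptions), producers enqueue records into an in-memory queue and a consumer flushes them to the database in batches:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative write-behind buffer: producers enqueue records,
// a consumer drains and writes them in batches.
public class WriteBuffer
{
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    // Producer side: called on each write request; fails fast when the buffer is full.
    public boolean submit(String record)
    {
        return queue.offer(record);
    }

    // Consumer side: drain up to 100 records at a time and write them in one batch.
    public void flushLoop() throws InterruptedException
    {
        List<String> batch = new ArrayList<>(100);
        while (true)
        {
            batch.add(queue.take());     // block until at least one record exists
            queue.drainTo(batch, 99);    // grab up to 99 more without blocking
            writeBatchToDatabase(batch); // hypothetical batched write
            batch.clear();
        }
    }

    private void writeBatchToDatabase(List<String> batch)
    {
        // placeholder for a real batched INSERT
        System.out.println("flushed " + batch.size() + " records");
    }
}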

Degradation

Service degradation means that when server load surges, some services and pages are downgraded according to the current business situation and traffic, freeing server resources so that the core tasks keep running normally. Degradation is usually tiered: different severity levels get different handling. By service: a service can be rejected, delayed, or sometimes served at random. By scope: a single feature can be cut, or an entire module. In short, degradation requires different strategies for different business needs. The main idea is that a degraded service, although impaired, is still better than nothing. A minimal sketch of a degradation switch follows.
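This is an illustrative sketch only (all names are assumptions, not from the article): a volatile flag, typically driven by a config center or ops console, switches an expensive code path to a cheap fallback.

import java.util.Arrays;
import java.util.List;

// Illustrative degradation switch for a hypothetical recommendation service.
public class RecommendationService
{
    // In practice this flag would be set from a config center or ops console.
    private volatile boolean degraded = false;

    public void setDegraded(boolean degraded)
    {
        this.degraded = degraded;
    }

    public List<String> recommend(String userId)
    {
        if (degraded)
        {
            // Degraded path: skip the expensive personalized computation
            // and return a static list, which is better than nothing.
            return Arrays.asList("top-1", "top-2", "top-3");
        }
        return computePersonalized(userId); // normal path
    }

    private List<String> computePersonalized(String userId)
    {
        // placeholder for the real, resource-heavy logic
        return Arrays.asList("p-1", "p-2");
    }
}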

Rate limiting

Rate limiting can be regarded as a form of degradation: it protects the system by restricting the rate of its inbound and outbound traffic. In general the throughput of a system can be estimated; to keep the system running stably, once traffic reaches the configured threshold, measures must be taken to restrict the flow, for example delaying requests, rejecting them, or rejecting a portion of them.
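As a rough sketch of "reject" versus "delay" handling (illustrative only; the capacity of 100 and all names are assumptions), a semaphore can cap in-flight work:

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Illustrative concurrency cap showing immediate rejection vs. bounded waiting.
public class ConcurrencyLimiter
{
    private final Semaphore permits = new Semaphore(100); // estimated system capacity

    // Reject processing: fail immediately when the system is saturated.
    public boolean tryHandle(Runnable task)
    {
        if (!permits.tryAcquire())
        {
            return false; // rejected
        }
        try { task.run(); } finally { permits.release(); }
        return true;
    }

    // Delay processing: wait up to a deadline for capacity before giving up.
    public boolean handleWithDelay(Runnable task) throws InterruptedException
    {
        if (!permits.tryAcquire(200, TimeUnit.MILLISECONDS))
        {
            return false; // timed out waiting; reject the remainder
        }
        try { task.run(); } finally { permits.release(); }
        return true;
    }
}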

Rate-limiting algorithms

Common rate-limiting algorithms are: the counter, the leaky bucket, and the token bucket.

Counter

The counter is the simplest and crudest algorithm. Suppose, for example, a service can handle at most 100 requests per second. We can set up a one-second sliding window divided into 10 slots, each covering 100 milliseconds; the window slides one slot every 100 milliseconds, and on each slide we record the current total number of requests served. The 10 most recent counts need to be kept, and a LinkedList can hold them. On each slide we check whether the difference between the current count and the oldest count in the LinkedList exceeds 100; if it does, rate limiting is needed.

Clearly, the more slots the sliding window is divided into, the more smoothly the window slides and the more accurate the rate-limiting statistics become.

Sample code is as follows:

import java.util.LinkedList;

public class Counter
{
    // Number of service accesses; in a distributed system this counter could
    // live in Redis so all nodes share it. It is assumed to be incremented
    // elsewhere on every request.
    private volatile long counter = 0L;

    // A LinkedList records the 10 slots of the sliding window.
    private final LinkedList<Long> ll = new LinkedList<Long>();

    public static void main(String[] args) throws InterruptedException
    {
        Counter c = new Counter();
        c.doCheck();
    }

    private void doCheck() throws InterruptedException
    {
        while (true)
        {
            ll.addLast(counter);

            if (ll.size() > 10)
            {
                ll.removeFirst();
            }

            // Compare the newest and the oldest snapshot; they are one second apart.
            if ((ll.peekLast() - ll.peekFirst()) > 100)
            {
                // over the limit: apply rate limiting here
            }

            Thread.sleep(100);
        }
    }
}

Leaky Bucket Algorithm

The leaky bucket algorithm is a very common rate-limiting algorithm that can be used to implement traffic shaping and traffic policing. The schematic diagram on Wikipedia illustrates it well (figure omitted here).

The main ideas of the leaky bucket algorithm are as follows:

  • The bucket has a fixed capacity, and water drips out at a fixed constant rate;
  • If the bucket is empty, no water flows out;
  • Water can flow into the bucket at any rate;
  • If the inflow exceeds the bucket's capacity, the excess water overflows (is discarded); the bucket's capacity never changes.

The leaky bucket algorithm is straightforward to implement. In a single-machine system it can be implemented with a queue (in .NET, TPL Dataflow handles this kind of problem well); in a distributed environment, message-oriented middleware or Redis are possible options.
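A minimal single-machine sketch of a leaky bucket follows (illustrative only; names and parameters are assumptions, and overflowing requests are simply rejected rather than queued):

// Illustrative leaky bucket: a counter that drains at a fixed rate.
public class LeakyBucket
{
    private final long capacity;    // maximum amount of "water" the bucket holds
    private final double leakRate;  // requests drained per second (fixed outflow)
    private double water = 0;       // current amount of water in the bucket
    private long lastLeakNanos = System.nanoTime();

    public LeakyBucket(long capacity, double leakRatePerSecond)
    {
        this.capacity = capacity;
        this.leakRate = leakRatePerSecond;
    }

    // Returns true if the request may proceed, false if it overflows.
    public synchronized boolean tryAcquire()
    {
        long now = System.nanoTime();
        // Drain the bucket according to how much time has passed.
        double leaked = (now - lastLeakNanos) / 1e9 * leakRate;
        water = Math.max(0, water - leaked);
        lastLeakNanos = now;

        if (water + 1 > capacity)
        {
            return false; // bucket full: overflow, discard the request
        }
        water += 1;
        return true;
    }
}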

Token bucket algorithm

The token bucket algorithm uses a bucket of fixed capacity that stores tokens, with tokens added to the bucket at a fixed rate. The algorithm can be described with a few basic rules:

  • Tokens are placed into the bucket at a fixed rate, for example 10 per second.
  • The bucket stores at most b tokens; when the bucket is full, newly added tokens are discarded or rejected.
  • When a packet of n bytes arrives, n tokens are removed from the bucket, and the packet is then sent to the network.
  • If fewer than n tokens are in the bucket, no tokens are removed, and the packet is rate-limited (either discarded or buffered to wait).

As shown in the schematic (figure omitted): the algorithm controls the output rate by controlling the rate at which tokens are issued, i.e. the "to network" rate in the diagram. Here "to network" can be understood as a message handler, a piece of business logic, or an RPC call.
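A minimal token bucket sketch along the same lines (illustrative only; names and parameters are assumptions):

// Illustrative token bucket refilled lazily based on elapsed time.
public class TokenBucket
{
    private final long capacity;     // b: maximum tokens the bucket can hold
    private final double refillRate; // tokens added per second
    private double tokens;           // current number of tokens
    private long lastRefillNanos = System.nanoTime();

    public TokenBucket(long capacity, double refillRatePerSecond)
    {
        this.capacity = capacity;
        this.refillRate = refillRatePerSecond;
        this.tokens = capacity;
    }

    // Try to take n tokens; returns false (rate-limited) if fewer than n are available.
    public synchronized boolean tryAcquire(int n)
    {
        long now = System.nanoTime();
        // Refill according to elapsed time, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillRate);
        lastRefillNanos = now;

        if (tokens < n)
        {
            return false;
        }
        tokens -= n;
        return true;
    }
}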

Comparing the leaky bucket and the token bucket

The token bucket can be adjusted at runtime to control the data-processing rate, which helps with bursty traffic. Increasing the rate at which tokens are issued raises the overall processing speed; slowing down token issuance, or increasing the number of tokens each request must acquire, lowers the overall processing speed. The leaky bucket cannot do this: its outflow rate is fixed, so the program's processing rate is fixed as well.

Overall, the token bucket algorithm is better, but the implementation is more complex.

Rate-limiting implementations

Guava

Guava is a Google open-source project containing core Java libraries that many Google projects depend on. Its RateLimiter provides token bucket implementations: smooth bursty rate limiting (SmoothBursty) and smooth warm-up rate limiting (SmoothWarmingUp).

1. Steady rate:

Create a rate limiter and set the number of tokens placed per second to 2. The returned RateLimiter object guarantees that no more than 2 tokens are handed out per second, placed at a fixed rate, which achieves smooth output.

import com.google.common.util.concurrent.RateLimiter;

public void test()
{
    /**
     * Create a rate limiter that places 2 tokens per second, i.e. a rate of
     * 2 messages per second. The returned RateLimiter guarantees that no more
     * than 2 tokens are handed out in any one second, placed at a fixed rate,
     * achieving smooth output.
     */
    RateLimiter r = RateLimiter.create(2);

    while (true)
    {
        /**
         * acquire() takes one token and returns the time spent waiting for it.
         * If the bucket has no token, it blocks until one is available.
         * acquire(N) takes N tokens at once.
         */
        System.out.println(r.acquire());
    }
}

Running the code above prints a value roughly every 0.5 seconds (output figure omitted). A token must be obtained before each piece of data is processed or each interface call is made, which achieves smooth output. The return value of acquire() is the time spent waiting for the token; if bursty traffic needs special treatment, a threshold can be applied to this return value and different cases handled accordingly, for example discarding requests that have waited too long.
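To reject rather than wait, RateLimiter also offers tryAcquire, which bounds the wait time (the surrounding handler code here is an illustrative assumption):

import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.RateLimiter;

// Illustrative handler that drops requests instead of blocking on the limiter.
public class RejectingLimiter
{
    private static final RateLimiter LIMITER = RateLimiter.create(2);

    // Hypothetical request handler, used only for illustration.
    public static void handle(String request)
    {
        // Wait at most 100 ms for a token; if none arrives, drop the request.
        if (LIMITER.tryAcquire(100, TimeUnit.MILLISECONDS))
        {
            System.out.println("processing " + request);
        }
        else
        {
            System.out.println("rejected " + request); // expired, discarded
        }
    }
}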

2. Bursty traffic:

A burst can mean suddenly more traffic, or suddenly less. First look at an example with a larger burst, using the same rate as above: two tokens per second. The following code calls the acquire method with an explicit permit count.

System.out.println(r.acquire(2));
System.out.println(r.acquire(1));
System.out.println(r.acquire(1));
System.out.println(r.acquire(1));

Output similar to the following is obtained (figure omitted).

To handle more data at once, more tokens are needed. The code first acquires two tokens, so the next token is obtained not 0.5 seconds later but 1 second later, after which the normal rate resumes. That was a burst of extra traffic; if instead there is a lull with no traffic for a while, consider the following code:

System.out.println(r.acquire(1));
Thread.sleep(2000);
System.out.println(r.acquire(1));
System.out.println(r.acquire(1));
System.out.println(r.acquire(1));

Results similar to the following are obtained (figure omitted).

After the two-second wait, three tokens have accumulated in the bucket, so they can be acquired back-to-back without waiting. In effect, burst handling still keeps the output constant per unit of time. Both behaviors above use RateLimiter's subclass SmoothBursty. The other subclass, SmoothWarmingUp, provides output with a warm-up buffer period:

import java.util.concurrent.TimeUnit;
import com.google.common.util.concurrent.RateLimiter;

/**
 * Create a rate limiter that places 2 tokens per second, i.e. a rate of 2
 * messages per second. The returned RateLimiter guarantees that no more than
 * 2 tokens are handed out in any one second, placed at a fixed rate,
 * achieving smooth output. The warm-up period is set to 3 seconds.
 */
RateLimiter r = RateLimiter.create(2, 3, TimeUnit.SECONDS);

while (true) {
    /**
     * acquire() takes one token and returns the time spent waiting for it.
     * If the bucket has no token, it blocks until one is available.
     * acquire(N) takes N tokens at once.
     */
    System.out.println(r.acquire(1));
    System.out.println(r.acquire(1));
    System.out.println(r.acquire(1));
    System.out.println(r.acquire(1));
}

The output is as follows (figure omitted): because the warm-up period is 3 seconds, the token bucket does not start out at one message per 0.5 seconds. Instead the wait times decrease linearly in a smooth ramp while the output frequency rises, reaching the originally configured rate after 3 seconds and holding that fixed rate from then on.

The intervals circled in red in the original figure add up to exactly 3 seconds. This feature suits scenarios where a freshly started system needs a little time to "warm up".

Nginx

For rate limiting at the access layer, Nginx ships with two modules:

  • The connection-count limiting module ngx_http_limit_conn_module
  • The request limiting module ngx_http_limit_req_module, based on the leaky bucket algorithm

1. ngx_http_limit_conn_module

We often encounter situations such as abnormal server traffic and excessive load. For malicious high-volume access, which wastes bandwidth, pressures the server, and hurts the business, limiting the number of connections and the concurrency from a single IP is a common countermeasure.

The ngx_http_limit_conn_module module meets this need. It can limit the number of connections for each defined key, for example the number of connections from a source IP. Not all connections are counted by the module; only connections with requests currently being processed (whose request headers have been fully read) are counted.

We can implement the restriction by adding the following to the http {} block of nginx.conf:

# Limit each client's concurrent connections; the zone is named "one"
limit_conn_zone $binary_remote_addr zone=one:10m;

# Log level for requests that get rate-limited; the default is error
limit_conn_log_level error;
# Status code returned when rate-limited; the default is 503
limit_conn_status 503;

Then add the following inside server {}:

# Limit each client to 1 concurrent connection
limit_conn one 1;

Then we use ab to simulate concurrent requests:

ab -n 5 -c 5 http://10.23.22.239/index.html

The results make it clear that excess concurrency was locked out; requests beyond the threshold received 503 (output figure omitted).

The configuration above limits concurrency for a single IP only; connections can also be limited per domain name, configured similarly to the per-client-IP limit:

# In the http{} block
limit_conn_zone $server_name zone=perserver:10m;
# In the server{} block
limit_conn perserver 1;

2. ngx_http_limit_req_module

Above we used the ngx_http_limit_conn_module module to limit connection counts. How, then, do we limit the number of requests? That requires the ngx_http_limit_req_module module, which can limit the request-processing frequency for each defined key.

In particular, it can limit the request-processing rate from a single IP address. The limiting method is the leaky bucket algorithm: a fixed number of requests is processed per second and excess requests are delayed. If the request rate exceeds the rate configured for the zone, processing is delayed or requests are discarded, so that all requests are handled at the defined rate.

Configure in the http {} block:

# The zone is named "one" with a size of 10m; the average request rate cannot exceed 1 per second.
limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

Configure in the server {} block:

# Set the bucket size to 5 for each IP
limit_req zone=one burst=5;

The settings above limit request processing for each IP to 1 per second, and the server additionally buffers up to 5 requests per IP; requests beyond those 5 are discarded.

Use ab to simulate a client making 10 consecutive requests:

ab -n 10 -c 10 http://10.23.22.239/index.html

In the results (figure omitted), burst is set to 5: of the 10 requests in total, the first is processed immediately, requests 2 through 6 are stored in the bucket, and since the bucket is then full and nodelay is not set, the remaining 4 requests are discarded.
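If the buffered burst requests should be processed immediately instead of being paced out at the configured rate, the standard nodelay parameter can be added to the server {} directive:

# Process up to 5 burst requests immediately; further excess requests are rejected
limit_req zone=one burst=5 nodelay;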


Source: juejin.im/post/5de22cd66fb9a071b23b73b5