Thoughts on Distributed Rate Limiting Schemes Under a Microservices Architecture

1. Rate Limiting in Microservices

As microservices grow in popularity, the stability of calls between services becomes increasingly important. Caching, rate limiting, and degradation are the three classic weapons for protecting the stability of a microservices system. Caching improves access speed and raises the capacity the system can handle; degradation temporarily shields functionality when a service has problems or would hurt the performance of core flows, reopening it once the problem is solved or the peak has passed. But some scenarios cannot be solved by caching or degradation, such as scarce resources, database write operations, and frequent complex queries. These scenarios need a means of limiting the number of requests: rate limiting.

For example, suppose we design a feature that is about to go live. It consumes some resources and can handle at most 3000 QPS, but what happens if actual traffic runs higher than 3000 QPS? The purpose of rate limiting is to protect the system by limiting concurrent access/requests, or by limiting the rate of requests within a time window; once the limit is reached, the service can reject requests, make them wait, or degrade.

Before learning how to implement a distributed rate limiting framework, we first need to understand two basic rate limiting algorithms.

2. Rate Limiting Algorithms

2.1 Leaky Bucket Algorithm

The idea behind the leaky bucket algorithm is very simple: water (i.e., requests) flows into the bucket, and the bucket leaks water at a constant rate. When water flows in faster than it leaks out, the bucket overflows and the excess requests are rejected. The leaky bucket algorithm can therefore enforce a hard cap on the data transmission rate. A schematic diagram (image sourced from the web, omitted here) illustrates this.
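
To make the idea concrete, here is a minimal sketch of a leaky bucket in Java. It is not code from the article or from any particular library; all names are illustrative, and the "leak" is computed lazily from elapsed time rather than by a background thread.

import java.util.concurrent.TimeUnit;

/** Minimal leaky bucket sketch: requests are water flowing in,
 *  the bucket drains at a constant rate, overflow is rejected. */
public class LeakyBucket {
    private final long capacity;      // bucket size
    private final double leakPerMs;   // constant outflow rate (requests per ms)
    private double water = 0;         // current water level
    private long lastLeakMs = System.currentTimeMillis();

    public LeakyBucket(long capacity, double leakPerMs) {
        this.capacity = capacity;
        this.leakPerMs = leakPerMs;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // drain the bucket according to how much time has passed
        water = Math.max(0, water - (now - lastLeakMs) * leakPerMs);
        lastLeakMs = now;
        if (water + 1 > capacity) {
            return false; // bucket would overflow: reject the request
        }
        water += 1; // the request flows into the bucket
        return true;
    }
}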

2.2 Token Bucket Algorithm

The token bucket algorithm achieves the same effect as the leaky bucket algorithm but works in the opposite direction, and it is easier to understand. Over time, the system adds tokens to the bucket at a constant interval of 1/QPS (if QPS = 100, the interval is 10 ms); in contrast to the leaking bucket, imagine a tap continuously adding water. If the bucket is already full, no more tokens are added. Each new request takes one token; if no token is available, the request can block or be rejected. A schematic diagram (image sourced from the web, omitted here) illustrates this.

2.3 Choosing an Algorithm

The difference between the two is that the leaky bucket algorithm enforces a hard limit on the transmission rate, while the token bucket algorithm limits the average transmission rate yet still allows some degree of burst traffic. Another advantage of the token bucket is that the rate is easy to change: to increase the rate, simply increase the rate at which tokens are placed into the bucket. For these reasons, the core algorithm of a rate limiting framework should be based on the token bucket.

3. Local Rate Limiting

Now that the principle of the token bucket algorithm is clear, how do we implement it in code?

A benefit of local rate limiting is that the token bucket can be implemented with a plain long integer. To stay lock-free, the AtomicLong atomic type is recommended instead: AtomicLong makes CAS increments (putting tokens into the bucket) and CAS decrements (taking tokens out) very convenient while avoiding the overhead of thread context switches. The core CAS acquisition logic is as follows:

// Try to take one token from the bucket via CAS. Despite the name
// (kept from the original source), it returns true on success.
private boolean tryAcquireFailed() {
    long l = bucket.longValue();
    while (l > 0) {
        // CAS: only one thread wins the decrement, no lock needed
        if (bucket.compareAndSet(l, l - 1)) {
            return true;
        }
        // lost the race: re-read the current token count and retry
        l = bucket.longValue();
    }
    // no tokens left: acquisition fails
    return false;
}

From the token bucket algorithm it follows that a scheduled thread must keep putting tokens into the bucket at a constant rate; here the bucket is simply reset to the rule's full limit once per period. That part of the code is as follows:

// Refill the bucket to the rule's limit once per period (e.g. every second);
// scheduledExecutor is assumed to be a ScheduledExecutorService instance.
scheduledExecutor.scheduleAtFixedRate(
    () -> bucket.set(rule.getLimit()),
    rule.getInitialDelay(), rule.getPeriod(), rule.getUnit()
);
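
Putting the two snippets together, here is a minimal self-contained sketch of a local token bucket limiter. It is an illustration of the technique, not SnowJena's actual implementation; the rule object is folded into a plain qps parameter for brevity.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative local token bucket limiter combining CAS acquisition
 *  with a scheduled refill of the bucket once per second. */
public class LocalTokenBucketLimiter {
    private final AtomicLong bucket;

    public LocalTokenBucketLimiter(long qps) {
        this.bucket = new AtomicLong(qps);
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        // refill the bucket back to `qps` tokens every second
        scheduler.scheduleAtFixedRate(() -> bucket.set(qps), 1, 1, TimeUnit.SECONDS);
    }

    /** Take one token via lock-free CAS; false when the bucket is empty. */
    public boolean tryAcquire() {
        long tokens = bucket.longValue();
        while (tokens > 0) {
            if (bucket.compareAndSet(tokens, tokens - 1)) {
                return true;
            }
            tokens = bucket.longValue();
        }
        return false;
    }

    public static void main(String[] args) {
        LocalTokenBucketLimiter limiter = new LocalTokenBucketLimiter(3000);
        if (limiter.tryAcquire()) {
            // handle the request
        } else {
            // reject, queue, or degrade
        }
    }
}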

4. Distributed Rate Limiting Overview

What problems does distributed rate limiting need to solve? I think there are at least the following:

1. Dynamic rules: for example, we want the rate-limiting QPS to be modifiable dynamically, the rate limiting feature to be turned on and off at any time, and the rules to change dynamically with the business.

2. Cluster-wide limiting: for example, applying a unified limit across all instances of a service in a Spring Cloud microservices architecture, to control the flow of subsequent access to the database.

3. Circuit breaking and degradation: for example, when a resource in the call chain becomes unstable (calls time out, or the proportion of errors rises), restrict calls to that resource so that requests fail fast, avoiding cascading failures that would affect other resources.

There are several other optional features, such as real-time monitoring data, gateway flow control, hot-parameter limiting, adaptive system-level limiting, black/white list control, and annotation support; in fact all of these can easily be added as extensions.

5. Distributed Rate Limiting Schemes

I list the following three candidate schemes for distributed rate limiting:

1. Redis token bucket

This is the simplest cluster-wide rate limiting scheme. For local limiting we used an AtomicLong as the token bucket; once there is more than one instance, we can instead use Redis as the shared storage area that every instance reads and writes. Where concurrency control is concerned, a Redis-based distributed lock can be used. The drawback is obvious: taking every single token incurs a network round trip, and network overhead is at least on the order of milliseconds, so the concurrency this approach supports is very limited.
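
As a concrete illustration, here is a sketch of the Redis-backed bucket. Note one deliberate substitution: instead of the distributed lock the article mentions, this sketch relies on Redis's atomic DECR command, which already serializes concurrent decrements on the server. The Jedis client and the key name are assumptions made for the example.

import redis.clients.jedis.Jedis;

/** Sketch of a Redis-backed token bucket shared by all instances.
 *  A scheduled job somewhere in the cluster would refill it each period,
 *  e.g. jedis.set(KEY, String.valueOf(limit)). */
public class RedisTokenBucket {
    private static final String KEY = "rate:limit:tokens"; // hypothetical key

    private final Jedis jedis;

    public RedisTokenBucket(Jedis jedis) {
        this.jedis = jedis;
    }

    public boolean tryAcquire() {
        // DECR is atomic on the Redis server, so two instances can
        // never take the same token; no explicit lock is required.
        long remaining = jedis.decr(KEY);
        if (remaining < 0) {
            jedis.incr(KEY); // bucket was empty: give the over-draw back
            return false;
        }
        return true;
        // Every acquire still costs one network round trip,
        // which is exactly the scheme's main drawback.
    }
}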

2. Uniform QPS distribution

The idea of this scheme is to localize cluster-wide limiting as much as possible. For example, suppose we have two server instances belonging to the same application (the same application name), and the QPS set for the application is 100. The application connects to a console, and the console divides the QPS by the number of instances, dynamically setting each instance's QPS to 50. If the two servers are not identically provisioned, the load balancing layer may distribute traffic according to their capacity, say 70% to one and 30% to the other; facing this situation, the console can apply a weighted QPS allocation strategy instead, as sketched below. Objectively speaking, this is a workable cluster limiting implementation, but it still has a nontrivial problem: the allocation model is built on large-scale traffic trends, and the actual split may not be strictly 50/50 or 70/30. The error is uncontrollable, and it is easy to end up in the awkward situation where a user's consecutive requests to one server get rejected while the other server has plenty of idle quota at that very moment.
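
The weighted allocation itself is simple arithmetic; here is a hypothetical console-side helper, with all names invented for illustration.

/** Sketch: the console splits a global QPS budget across instances
 *  according to their load-balancer weights. */
public class QpsAllocator {
    /** e.g. allocate(100, new double[]{0.7, 0.3}) returns [70, 30] */
    public static long[] allocate(long globalQps, double[] weights) {
        long[] quotas = new long[weights.length];
        for (int i = 0; i < weights.length; i++) {
            quotas[i] = Math.round(globalQps * weights[i]);
        }
        return quotas;
    }
}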

3. Ticket server

The idea of this scheme builds on the Redis token bucket scheme above. How do we avoid paying a network round trip for every single token? The solution is to add a control layer (the ticket server) between the clients and the Redis token bucket: only when a client's remaining token count falls below a threshold does it go to the control layer and take a whole batch of tokens at once. This idea is similar to array growth in the Java collections framework: a threshold is set, and only when it is crossed is the (asynchronous) call triggered. All other token acquisitions go through the local fast path. The scheme still has an error margin, but the error is bounded by at most one batch of tokens. A sketch follows.
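
Here is an illustrative sketch of the client side of that idea. TicketServerClient is a hypothetical interface standing in for the real remote call; the threshold trigger, batch size, and all other names are assumptions, not SnowJena's actual code.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of the ticket-server client: serve tokens locally, and fetch
 *  a batch from the control layer only when running low. */
public class BatchingLimiter {
    private final AtomicLong localTokens = new AtomicLong(0);
    private final AtomicBoolean refilling = new AtomicBoolean(false);
    private final long threshold;  // refill trigger point
    private final long batchSize;  // tokens fetched per round trip
    private final TicketServerClient client;

    public BatchingLimiter(long threshold, long batchSize, TicketServerClient client) {
        this.threshold = threshold;
        this.batchSize = batchSize;
        this.client = client;
    }

    public boolean tryAcquire() {
        // Slow path: below the threshold, trigger one asynchronous batch
        // fetch; only this path pays the network overhead.
        if (localTokens.get() <= threshold && refilling.compareAndSet(false, true)) {
            CompletableFuture
                .supplyAsync(() -> client.requestTokens(batchSize))
                .whenComplete((granted, error) -> {
                    if (error == null) {
                        localTokens.addAndGet(granted);
                    }
                    refilling.set(false);
                });
        }
        // Fast path: purely local CAS decrement, as in local limiting.
        long tokens = localTokens.get();
        while (tokens > 0) {
            if (localTokens.compareAndSet(tokens, tokens - 1)) {
                return true;
            }
            tokens = localTokens.get();
        }
        return false;
    }

    /** Hypothetical remote client for the ticket server. */
    public interface TicketServerClient {
        long requestTokens(long batchSize); // returns how many tokens were granted
    }
}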

6. Open Source Project

The above describes three approaches to distributed rate limiting. Here I recommend a distributed rate limiting project, SnowJena, which implements the ticket server idea: https://github.com/yueshutong/SnowJena

Reading through the project source, I found many clever points in its distributed rate limiting solution. For example, SnowJena internally uses the observer pattern for dynamic rule configuration, the factory pattern to construct rate limiters, and the builder pattern to build rate limiting rules. To solve the problem of health-checking client instances, it combines client heartbeats with Redis key expiration times (pushing the expiry further out each time a heartbeat is sent). Another nice touch is the project's front-end view, which renders QPS curves with ECharts. (Screenshot omitted here.)

Since my ability is limited, the article is bound to contain some mistakes or omissions; criticism and corrections from experts and veterans are welcome!


Original article: https://www.cnblogs.com/yueshutong/p/11110455.html