Some Thoughts on Service Rate Limiting

The perceived value of rate limiting

As a self-protection mechanism, rate limiting is certainly valuable: it guards the system against the uncertainty of the outside world (bursty traffic, more users than expected) when resources run short. But its perceived value is very low, because 99.99% of the time the system runs below the safety line, and it may go a whole year without once coming close to crossing it. It is like the law: always present, but for most people most of the time it barely seems to exist, or at least they do not perceive its existence.

A software system usually hides many bugs, and the most commonly used functions tend to have the fewest. Rarely used functions go unnoticed for a long time and offer few chances to reproduce a problem, so their bugs stay hidden, waiting to erupt. As the software iterates, features nobody pays attention to are easily overlooked by testers and dropped from test coverage. Then, when an unexpected scenario arrives (Murphy's Law), the bug in this neglected piece of logic is awakened and causes a system error or even a crash. Rate limiting is one of those neglected features.

To solve the problem of low perceived value, engineers need to run periodic drills of the rate limiting feature, exercising it repeatedly with unit tests and stress tests. In short, try every way you can to push the limiter over the line, so that the else branch that almost never runs in production actually gets executed.

Rate limiting algorithms

The two basic rate limiting algorithms most common in the industry are the funnel (leaky bucket) algorithm and the token bucket algorithm, and the two are similar. In the funnel algorithm, each request consumes a certain amount of space; the funnel's space is limited, and space is freed again only by leaking, at a limited rate. In the token bucket algorithm, each request consumes one token; the rate at which tokens are produced is limited, and so is the storage capacity of the bucket.
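The token bucket description above can be sketched in a few lines of Python. This is a minimal illustration, not a production limiter: the class name `TokenBucket`, its parameters `rate` and `capacity`, and the injectable `clock` are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Token bucket: tokens are produced at a fixed rate, up to a fixed capacity."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens produced per second
        self.capacity = float(capacity)  # maximum tokens the bucket can store
        self.clock = clock               # injectable so a test can control time
        self.tokens = float(capacity)    # assumption: the bucket starts full
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Add the tokens produced since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the rarely executed "over the line" branch
```

A caller would wrap each request in `if bucket.allow(): handle(request) else: reject(request)`. Passing a fake `clock` makes it easy to force the rejection branch in a unit test.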

The request QPS curve in a production environment looks smooth because most monitoring systems apply statistical smoothing (averaging over a time window). In actual operation, however, traffic has a degree of randomness and volatility, and there will be occasional outlier spikes. These are generally called burst traffic.

A rate limiting algorithm needs to take burst traffic into account: it should tolerate a burst for a short time, but the tolerance is capped and the tolerated period must be short. Both algorithms above have this ability. The tolerance comes from the free space saved up in the funnel and the tokens accumulated in the token bucket; once those savings are used up, the upper limit of tolerance has been reached.
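To make the idea of capped burst tolerance concrete, here is a small sketch of the funnel variant, where the saved-up free space is what absorbs a burst. The names `Funnel`, `capacity`, `leak_rate` and `free` are hypothetical and chosen only for this illustration.

```python
import time


class Funnel:
    """Funnel limiter: requests fill the funnel, leaking frees space again."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = float(capacity)    # total space: the cap on burst tolerance
        self.leak_rate = float(leak_rate)  # space freed per second: the steady rate
        self.clock = clock                 # injectable so a test can control time
        self.free = float(capacity)        # saved-up free space
        self.last = clock()

    def allow(self, size=1.0):
        now = self.clock()
        # Leaking since the last call frees space, but never beyond capacity.
        self.free = min(self.capacity, self.free + (now - self.last) * self.leak_rate)
        self.last = now
        if self.free >= size:
            self.free -= size
            return True
        return False
```

With `capacity=5` and `leak_rate=1`, an idle funnel admits a burst of five requests at once; after that the savings are gone and it admits only about one request per second until traffic drops and space builds up again.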

Distributed rate limiting

If your application is a single process, rate limiting is simple: the request counting can be done in memory. The algorithm has almost no overhead because it is pure in-memory computation. But applications in the Internet world are distributed across multiple nodes, and the request-processing capacity of each node is not necessarily the same. What we need to consider is the overall request-processing capacity of all the nodes together.
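For the single-process case, in-memory counting might look like the sketch below. It uses a fixed one-second window, which is one simple counting scheme among several and an assumption of this example; `WindowCounter` and its parameters are illustrative names.

```python
import threading
import time


class WindowCounter:
    """Fixed-window request counter held entirely in process memory."""

    def __init__(self, limit, window=1.0, clock=time.monotonic):
        self.limit = limit            # max requests admitted per window
        self.window = window          # window length in seconds
        self.clock = clock            # injectable so a test can control time
        self.lock = threading.Lock()  # protects the counter across threads
        self.window_start = clock()
        self.count = 0

    def allow(self):
        with self.lock:
            now = self.clock()
            if now - self.window_start >= self.window:
                # A new window begins: reset the counter.
                self.window_start = now
                self.count = 0
            if self.count < self.limit:
                self.count += 1
                return True
            return False
```

Each call is a lock acquisition plus a couple of comparisons, with no network round trip, which is why the cost is negligible compared with handling the request itself.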

A single process being able to handle 10,000 QPS does not mean the whole system can handle N * 10,000 QPS, because overall capacity is constrained by shared resources. The shared resource is usually the database, but it can also be the CPU and disk shared by multiple processes on the same machine, and overall network bandwidth also limits the request volume.

In this case the request counting has to be done in one central place (rate limiting middleware). Before processing a request, the application must apply to this central manager for traffic (funnel space or a token). Every request therefore costs one network IO between the application and the rate limiting middleware.

For example, we can implement this with Redis + Lua, but Lua performs much worse than C, and such a limiter usually tops out at around 10,000 QPS. You can also use the Redis-Cell module, which is implemented in Rust, and it tops out at about 50,000 QPS. At those figures the instances are already under full load, and in a production environment we do not want them running at full capacity.

How do we rate limit at 100,000 QPS?

A simple idea is to split the rate-limit key into buckets and scale out with a Redis cluster: the client hashes each rate-limit command into a bucket, so the commands are spread across multiple cluster nodes and the pressure is dispersed.
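The key-splitting step can be sketched as follows. This is only an illustration under assumptions: the helper names `sub_key` and `sub_limit` are made up, CRC32 stands in for whatever hash the client actually uses, and an even split of the quota across buckets is assumed. (Redis Cluster itself then maps each resulting key to a node by its own CRC16 hash slot.)

```python
import zlib


def sub_key(limit_key, client_id, buckets):
    """Pick one of `buckets` sub-keys for a rate-limit key, by hashing the client."""
    bucket = zlib.crc32(client_id.encode("utf-8")) % buckets
    return f"{limit_key}:{bucket}"


def sub_limit(total_qps, buckets):
    """Assumed even split: each sub-key enforces an equal share of the total."""
    return total_qps / buckets


# Hypothetical usage: a 100,000 QPS limit split over 8 sub-keys.
key = sub_key("limit:api", "client-42", 8)
share = sub_limit(100_000, 8)  # 12500.0 QPS enforced per sub-key
```

The same client always lands on the same sub-key, so each sub-key sees a stable slice of the traffic. The split is only as even as the client distribution: one very heavy client still hits a single bucket.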

How do we rate limit at one million QPS (1M)?

If we use the method above, the Redis instances and bandwidth required are staggering. We may need dozens of Redis nodes, plus bandwidth on the order of hundreds of megabits per second (roughly 1M requests * 20 bytes * 8 bits), to do this work.
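The back-of-the-envelope numbers can be checked with a few lines of arithmetic. The sketch reads the estimate above as about 20 bytes per rate-limit request and 8 bits per byte, reuses the roughly 50,000 QPS ceiling quoted earlier for a Redis-Cell node, and assumes a hypothetical 50% headroom target; all of these are assumptions, not measurements.

```python
import math

TARGET_QPS = 1_000_000
BYTES_PER_REQUEST = 20    # assumed size of one rate-limit request
NODE_LIMIT_QPS = 50_000   # per-node ceiling quoted earlier, at full load
HEADROOM = 0.5            # hypothetical: run each node at half its ceiling

# Bandwidth in megabits per second: requests * bytes * 8 bits.
bandwidth_mbit = TARGET_QPS * BYTES_PER_REQUEST * 8 / 1_000_000

# Node counts at full load and with headroom.
nodes_full_load = math.ceil(TARGET_QPS / NODE_LIMIT_QPS)
nodes_with_headroom = math.ceil(TARGET_QPS / (NODE_LIMIT_QPS * HEADROOM))

print(bandwidth_mbit, nodes_full_load, nodes_with_headroom)  # 160.0 20 40
```

Under these assumptions the request traffic alone is about 160 Mbit/s, and the cluster needs 20 nodes running flat out or around 40 with headroom, which is where "dozens of nodes" comes from.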

So we have to change our thinking and stop relying on this kind of centralized control. Just as a country with too many people has to be governed separately at the province, city, and county levels, the answer is decentralization.

We distribute the overall QPS among the child nodes by weight, and each child node rate limits on its own in memory. If all nodes are equal, each child node gets 1/n of the QPS.
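The weighted split is simple arithmetic. A minimal sketch, where the function name `allocate_qps` and the node names are invented for the example:

```python
def allocate_qps(total_qps, weights):
    """Split a global QPS budget across nodes in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {node: total_qps * w / total_weight for node, w in weights.items()}


# Equal nodes each get 1/n of the budget.
equal = allocate_qps(1_000_000, {"node-a": 1, "node-b": 1, "node-c": 1, "node-d": 1})

# A node with twice the weight gets twice the share.
weighted = allocate_qps(1_000_000, {"node-a": 1, "node-b": 1, "node-c": 2})
```

Each node then feeds its own share into a local in-memory limiter, so no request has to leave the process to be counted.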

The advantage is that the rate limiting pressure is dispersed and the IO operations become pure in-memory computation, so ultra-high QPS limits are easy to handle. But it also makes the system more complex: a central configuration center is needed to distribute the QPS threshold to each child node, every application child node has to register with that center, and a configuration management console is needed to manage the QPS allocation.
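As a rough picture of what such a configuration center has to do, the toy sketch below keeps a registry of nodes and recomputes every node's threshold whenever one registers or leaves. It is an in-process stand-in under heavy assumptions: `QuotaCenter` and its methods are hypothetical, and a real center would also need networking, persistence, health checks, and a way to push updates to nodes.

```python
class QuotaCenter:
    """Toy configuration center: tracks node weights and hands out QPS thresholds."""

    def __init__(self, total_qps):
        self.total_qps = total_qps
        self.weights = {}  # node id -> weight

    def register(self, node_id, weight=1.0):
        """A node announces itself; every node's threshold is recomputed."""
        self.weights[node_id] = weight
        return self.quotas()

    def unregister(self, node_id):
        """A node leaves; its share is redistributed to the remaining nodes."""
        self.weights.pop(node_id, None)
        return self.quotas()

    def quotas(self):
        total = sum(self.weights.values())
        if total == 0:
            return {}
        return {n: self.total_qps * w / total for n, w in self.weights.items()}
```

Even this toy version shows where the complexity comes from: every membership change alters every node's threshold, so the center must reliably notify all nodes, and the nodes must handle stale thresholds in the meantime.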

Without mature open source software to make this easy to use, such a control center and SDK often require a small team to build. Small and medium-sized companies often cannot afford that, and given how low the perceived value of rate limiting is, the likelihood of it being built becomes even smaller.

Origin blog.csdn.net/weixin_33759269/article/details/91379678