Rate limiting for high-concurrency flash-sale (seckill) systems in Java at large Internet companies

Introduction

High concurrency is one of the factors that must be considered when designing a distributed Internet system architecture. It usually means designing the system so that it can process many requests in parallel at the same time.

Some commonly used indicators related to high concurrency are response time, throughput, QPS (Queries Per Second), and the number of concurrent users.

Response time: The time it takes the system to respond to a request. For example, if the system needs 200 ms to process an HTTP request, those 200 ms are the system's response time.

Throughput: The number of requests processed per unit time.

QPS: The number of requests responded to per second. In the Internet field, the distinction between this indicator and throughput is not that sharp.

Number of concurrent users: The number of users simultaneously making normal use of the system's functions. For example, in an instant messaging system, the number of simultaneously online users to some extent represents the system's number of concurrent users.

Approaches to high concurrency

One approach is to use caching; another is to generate static pages; a third is to optimize our code at the most fundamental level to reduce unnecessary waste of resources:

Don't create new objects too frequently: use the singleton pattern for classes that need only one instance in the whole application, use StringBuffer or StringBuilder for string concatenation, and access utility classes through static methods.

Avoid inefficient idioms: exceptions can be used to control a method's exit, but an Exception must capture its stack trace, which costs performance; unless necessary, do not use instanceof for conditional checks, and prefer plain comparison-based conditions. Use the more efficient classes in Java, for example ArrayList performs better than Vector.
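As a small illustration of the string-concatenation advice above (the loop and variable names are made up for this example):

=================================

// illustrative only: build a string in a loop with StringBuilder
// instead of repeated "+" concatenation, which copies the string each time
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; i++) {
    sb.append(i).append(',');
}
String csv = sb.toString();

=================================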

When developing a high-concurrency flash-sale (seckill) system, there are three sharp tools for protecting the system: caching, degradation, and rate limiting.

The purpose of caching is to increase the system's access speed and enlarge the volume it can handle; it can be described as a silver bullet against heavy concurrent traffic. Degradation means that when a service fails or affects the performance of a core flow, it is temporarily blocked and only reopened after the peak has passed or the problem has been fixed. Finally, some scenarios cannot be solved by caching or degradation, such as scarce resources (flash sales, panic buying), write services (comments, order placement), and frequent complex queries (the last few pages of a comment list); for these there must be a means of limiting the number of concurrent requests, namely rate limiting.

The purpose of rate limiting is to protect the system by capping the rate of concurrent access/requests, or the number of requests within a time window. Once the limit is reached, the system can deny service (redirect to an error page or report that the resource is gone), queue requests or make them wait (flash sales, comments, order placement), or degrade (return fallback or default data; for example, a product detail page may show inventory as available by default).

Rate-limiting methods

Limit the total number of concurrent resources (e.g., database connection pools, thread pools)

Limit the number of instantaneous concurrent connections (e.g., nginx's limit_conn module)

Limit the average rate within a time window (e.g., Guava's RateLimiter and nginx's limit_req module, which limit the average rate per second; see the RateLimiter sketch below)

Limit remote interface call rate

Limit the consumption rate of MQ.

Rate limiting can also be based on the number of network connections, network traffic, CPU or memory load, and so on.
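As an illustration of limiting the average rate in a time window, here is a minimal sketch using Guava's RateLimiter; the rate of 2 permits per second and the demo loop are illustrative, not from the original article:

=================================

import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterDemo {
    // 2 permits per second on average; bursts are smoothed out
    private static final RateLimiter limiter = RateLimiter.create(2.0);

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            double waited = limiter.acquire(); // blocks until a permit is free
            System.out.printf("request %d served after waiting %.2fs%n", i, waited);
        }
        // tryAcquire() rejects immediately instead of waiting
        if (!limiter.tryAcquire()) {
            System.out.println("request rejected: over the rate limit");
        }
    }
}

=================================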

Use caching as the silver bullet first, then rate limiting as the backstop, and the heavy concurrent traffic of 618 and Double Eleven becomes much easier to handle: there is no need to worry that instantaneous traffic will hang the system or cause an avalanche, because at worst some requests are sacrificed rather than the whole service. That said, rate limits need to be evaluated carefully and must not be applied indiscriminately, otherwise strange problems will appear under normal traffic and users will complain.

In actual applications, don't get too entangled in algorithmic questions, because some rate-limiting algorithms are the same thing described in different words. Which rate-limiting technique to use should be chosen according to the actual scenario; don't blindly hunt for the best mode. Whether the cat is black or white, the one that solves the problem is a good cat.

Because many people have asked me at work how to limit rates, this article introduces the various rate-limiting methods in detail. We will work through rate-limiting techniques from three angles: rate-limiting algorithms, application-level rate limiting, and access-layer rate limiting.

Application-level rate limiting

Limiting the total number of concurrent connections/requests

For any application system there is a limit to the number of concurrent requests it can handle, that is, there is always a TPS/QPS threshold. If the threshold is exceeded, the system stops responding to user requests or responds very slowly, so it is best to add overload protection to prevent a flood of requests from crashing the system.

If you have used Tomcat, its Connector configuration has the following parameters (a sample Connector follows the list):

acceptCount: if Tomcat's threads are all busy responding, new connections enter a queue; once the queue is full, further connections are rejected;

maxConnections: the maximum number of simultaneous connections; connections beyond this number queue and wait;

maxThreads: the maximum number of threads Tomcat can start to process requests. If the request volume stays far above what the maximum number of threads can handle, the system may stop responding.
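A sample server.xml Connector combining these parameters might look as follows; the values are illustrative, not recommendations:

=================================

<!-- illustrative values only; tune them for your own workload -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="200"
           maxConnections="8192"
           acceptCount="100"
           connectionTimeout="20000" />

=================================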

For detailed configuration, please refer to the official documentation. In addition, MySQL (e.g., max_connections) and Redis (e.g., tcp-backlog) have similar configurations for limiting the number of connections.

Limiting the total number of resources

If some resources are scarce (such as database connections or threads) and may be used by multiple systems, the application needs to limit them; pooling techniques can cap the total number of resources, e.g., connection pools and thread pools. For example, if an application is allocated 100 database connections, it can use at most 100 of that resource; beyond that it can wait or throw an exception.
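A minimal sketch of capping a scarce resource with a counting semaphore; the 100-permit figure mirrors the example above, and the class and method names are made up:

=================================

import java.util.concurrent.Semaphore;

public class ResourceGuard {
    // at most 100 callers may hold the resource at once
    private static final Semaphore permits = new Semaphore(100);

    public static void useResource(Runnable work) throws InterruptedException {
        permits.acquire(); // waits when all 100 permits are taken
        try {
            work.run();
        } finally {
            permits.release(); // always return the permit
        }
    }
}

=================================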

Limiting the total concurrency/requests of an interface

If an interface may receive sudden bursts of access but you worry that too much traffic will crash it, as in a panic-buying business, then you need to limit the total concurrency/requests of that interface. Because the granularity is relatively fine, a suitable threshold can be set for each interface. In Java you can use an AtomicLong for this kind of rate limiting:

=================================

// assumes: import java.util.concurrent.atomic.AtomicLong;
private final AtomicLong atomic = new AtomicLong();
private static final long LIMIT = 100; // illustrative per-interface threshold

try {
    if (atomic.incrementAndGet() > LIMIT) {
        // reject the request, then return so it is not processed
        return;
    }
    // process the request
} finally {
    atomic.decrementAndGet(); // release the slot whether processed or rejected
}

=================================

This approach is suitable for services that can tolerate rejection without business damage, or that need overload protection, such as panic buying: when the limit is exceeded, either let users queue or tell them the item is sold out, both of which are acceptable. Some open platforms also cap the number of trial requests a user may make to an interface, and this counter method works there too. It is a simple, blunt form of rate limiting with no smoothing, so choose it according to the actual situation.

Rate-limiting algorithms

Token bucket algorithm

The token bucket algorithm uses a fixed-capacity bucket that stores tokens, with tokens added to the bucket at a fixed rate. It can be described as follows (a minimal Java sketch follows the list):

Assume a limit of 2 r/s; then one token is added to the bucket every 500 milliseconds;

At most b tokens can be stored in the bucket; when the bucket is full, newly added tokens are discarded or rejected;

When a packet of n bytes arrives, n tokens are removed from the bucket and the packet is sent to the network;

If fewer than n tokens are in the bucket, no tokens are removed and the packet is rate-limited (either discarded or left waiting in a buffer).
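Here is a minimal Java sketch of the token bucket described above. It assumes on-demand refill based on elapsed time rather than a background timer, and all names are illustrative:

=================================

public class TokenBucket {
    private final long capacity;      // b: the most tokens the bucket can hold
    private final double refillPerMs; // tokens added per millisecond
    private double tokens;            // current token count
    private long lastRefill;          // time of the last refill

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefill = System.currentTimeMillis();
    }

    // take n tokens for an n-byte packet; false means the packet is rate-limited
    public synchronized boolean tryAcquire(int n) {
        long now = System.currentTimeMillis();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens < n) {
            return false; // fewer than n tokens: discard or buffer the packet
        }
        tokens -= n;
        return true;
    }
}

=================================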

Leaky bucket algorithm

When the leaky bucket is used as a metering tool (The Leaky Bucket Algorithm as a Meter), it can be used for traffic shaping and traffic policing. The leaky bucket algorithm can be described as follows (a minimal Java sketch follows the list):

A leaky bucket of fixed capacity leaks water droplets at a constant, fixed rate;

If the bucket is empty, no droplets flow out;

Water droplets can be poured into the leaky bucket at any rate;

If the inflowing droplets exceed the bucket's capacity, the excess overflows (is discarded), while the bucket's capacity remains unchanged.
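And a minimal Java sketch of the leaky bucket used as a meter, under the same assumptions as the token bucket sketch above (on-demand draining, illustrative names):

=================================

public class LeakyBucket {
    private final long capacity;    // bucket capacity
    private final double leakPerMs; // constant outflow, drops per millisecond
    private double water;           // current water level
    private long lastLeak;          // time of the last drain calculation

    public LeakyBucket(long capacity, double leaksPerSecond) {
        this.capacity = capacity;
        this.leakPerMs = leaksPerSecond / 1000.0;
        this.lastLeak = System.currentTimeMillis();
    }

    // pour one drop in; false means the bucket would overflow (request dropped)
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        water = Math.max(0, water - (now - lastLeak) * leakPerMs); // constant-rate drain
        lastLeak = now;
        if (water + 1 > capacity) {
            return false;
        }
        water += 1;
        return true;
    }
}

=================================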

Access-layer rate limiting

The access layer usually refers to the entry point for request traffic. Its main purposes are load balancing, illegal-request filtering, request aggregation, caching, degradation, rate limiting, A/B testing, service-quality monitoring, and so on; you can refer to the author's book "Using Nginx+Lua (OpenResty) to develop high-performance web applications".

For rate limiting at the Nginx access layer, you can use two of Nginx's own modules: the connection-count limiting module ngx_http_limit_conn_module and the request rate limiting module ngx_http_limit_req_module, which implements the leaky bucket algorithm.

For more complex rate-limiting scenarios, you can also use the Lua rate-limiting module lua-resty-limit-traffic provided by OpenResty.

limit_conn limits the total number of network connections for a given key and can limit by dimensions such as IP or domain name. limit_req limits the average request rate for a given key and has two usages: smooth mode (delay) and burst-allowing mode (nodelay); a sample limit_req configuration follows.
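A minimal limit_req configuration sketch; the zone name, rate, and location are illustrative:

=================================

http {
    # key by client IP; average rate 1 request per second
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

    server {
        location /search/ {
            # queue up to 5 excess requests and smooth them out (delay mode);
            # append "nodelay" to serve the burst immediately instead
            limit_req zone=one burst=5;
        }
    }
}

=================================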

ngx_http_limit_conn_module

limit_conn limits the total number of network connections for a given key: you can limit total connections per IP, or limit a domain's total connections by the service's domain name. But remember that not every request connection is counted; only connections that Nginx is processing and whose entire request header has been read are counted. A sample configuration follows.
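A minimal limit_conn configuration sketch keyed by client IP; the zone name, size, and limit are illustrative:

=================================

http {
    # shared zone storing one connection-count state per client IP
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        location /download/ {
            limit_conn addr 2; # at most 2 concurrent connections per IP
        }
    }
}

=================================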

To sum up

The above is my summary of rate limiting for high-concurrency flash-sale systems in Java at large Internet companies, together with the related optimizations, shared here with you. I hope you now know what this topic covers. If you feel you have gained something, please follow, bookmark, and forward. Thank you for your support!

  • 1. Write more code and type more code; good code and solid fundamentals only come with practice

  • 2. You can search Baidu for Tencent Classroom's Turing Academy videos to learn practical Java architecture cases, which are pretty good.


Origin blog.csdn.net/keepfriend/article/details/113849441