Three sharp tools for high-concurrency systems

Table of contents

1. Rate limiting

2. Caching

2.1. Cache usage scenarios

3. Degradation

3.1. What is degradation?

3.2. Service degradation methods

4. Other high-concurrency techniques

4.1. Clustering

4.2. Splitting

4.2.1 Application splitting

4.2.2 Database splitting

4.3. Static page generation

4.4. Peak shaving

4.5. Rate limiting

5. Summary


Three powerful tools to protect high-concurrency systems: rate limiting, circuit breaking with degradation, and caching

  • Rate limiting: control the volume of requests entering the system so that excessive pressure cannot bring it down.
  • Caching: keep frequently used data in memory to reduce pressure on the database and speed up responses.
  • Degradation: when the system cannot take on more requests, deliberately shut down non-essential functions or services so that core functions keep running normally.

1. Rate limiting

        Rate limiting is one of the three tools for protecting high-concurrency systems. It is applied in many scenarios to cap concurrency and request volume. There are many ways to implement it, such as the token bucket algorithm and the leaky bucket algorithm.

Use a rate-limiting strategy to control the rate of user requests.

Several commonly used rate-limiting algorithms:

1) Counter-based rate limiting

If you read the earlier content carefully, you will notice that the example with a threshold of 1,000 requests per second is exactly the idea of counter-based rate limiting. Its essence: once traffic within a time window reaches the configured limit, any further requests in that window are refused until the window ends. For example, suppose you tell your boss that you can handle only 10 tasks per hour; that is your capacity. If the boss assigns all 10 tasks within the first half hour, you have hit your limit, and for the remaining half hour you refuse any new tasks until the next hour begins.
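Below is a minimal, illustrative sketch of counter-based (fixed-window) rate limiting in Java. The class and parameter names are my own; the article itself provides no code.

```java
// A minimal sketch of counter-based (fixed-window) rate limiting.
// Names and parameters are illustrative, not from the original article.
public class CounterRateLimiter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length, e.g. 1000 ms
    private int count = 0;            // requests seen in the current window
    private long windowStart = System.currentTimeMillis();

    public CounterRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now;        // a new window begins: reset the counter
            count = 0;
        }
        if (count >= limit) {
            return false;             // threshold reached: refuse this request
        }
        count++;
        return true;
    }
}
```

One caveat of the fixed window: a burst straddling a window boundary can let through up to twice the limit in a short span, which is why sliding-window variants exist.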

2) Leaky bucket (funnel) rate limiting

Picture a funnel-shaped container: each incoming request is poured in at the top, while the bottom drains at a fixed rate. When requests pour in faster than they drain out, the free space in the funnel shrinks; once it reaches zero, new requests are rejected. This is the pond analogy from the beginning: the inflow rate is random, but the outflow rate is fixed. Once the funnel is full, the system reaches a steady state: because the outflow is fixed, the effective inflow becomes fixed too, so requests pass through at a uniform rate.
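A hedged sketch of the same leaky-bucket idea in Java. The names and the millisecond-based drain are my own simplifications, not the article's code.

```java
// A minimal leaky-bucket sketch (illustrative, not production code).
// Water level = queued requests; it drains at a fixed rate, and requests
// arriving when the bucket is full are rejected.
public class LeakyBucket {
    private final long capacity;        // max water the bucket holds
    private final double leakPerMillis; // fixed outflow rate
    private double water = 0;
    private long lastLeakTime = System.currentTimeMillis();

    public LeakyBucket(long capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerMillis = leakPerSecond / 1000.0;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drain whatever leaked out since the last call.
        water = Math.max(0, water - (now - lastLeakTime) * leakPerMillis);
        lastLeakTime = now;
        if (water + 1 > capacity) {
            return false;               // bucket full: reject the request
        }
        water += 1;                     // accept: one more unit of water
        return true;
    }
}
```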

2. Caching

2.1. Cache usage scenarios

  • Data that is read or accessed frequently

  • Hot data

  • Data behind an I/O bottleneck

  • Data that is expensive to compute

  • Data that does not need real-time updates

The purpose of caching is to reduce calls to back-end services and relieve the pressure on them.

1) CDN cache

  • CDN stands for Content Delivery Network. A CDN is a content-distribution layer built on top of the Internet: it relies on edge servers deployed in many locations, plus the load balancing, content distribution, and scheduling modules of a central platform, so that users fetch the content they need from a nearby node. This reduces network congestion and improves response speed and hit rate. The key CDN technologies are content storage and content distribution.

  • A CDN is itself a cache: it stores data from the back-end application, so user requests can often be served directly from the CDN without reaching the back-end Nginx or the application server (e.g. Tomcat). If the CDN does not have the data, the request falls through to the Nginx cache; if Nginx does not have it either, it finally goes to the back-end application server. A CDN mainly caches static resources.

2) Application cache

        Memory cache

                Data is cached in memory: highly efficient and fast, but the cache is lost when the application restarts.

        Disk cache

                Data is cached on disk: reads are slightly slower than from memory, but the cache survives application restarts.
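As an illustration of the memory cache described above, here is a small LRU cache sketch built on the JDK's LinkedHashMap; the size and names are arbitrary choices of mine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal in-memory cache sketch: fast, but lost on application restart.
public class MemoryCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public MemoryCache(int maxEntries) {
        super(16, 0.75f, true);           // true = access-order (LRU behavior)
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;       // evict least-recently-used entry
    }
}
```

LinkedHashMap's access-order mode evicts the least-recently-used entry once maxEntries is exceeded; wrap the map with Collections.synchronizedMap for concurrent use.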

3) Multi-level cache

Cache data at different levels of the whole application system (multi-level caching) to improve access efficiency, for example: Browser -> CDN -> Nginx -> Redis -> DB (disk, file system)
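A sketch of how such a multi-level read path might look in code. CacheLayer and dbLoader are hypothetical stand-ins for a local map, a Redis client, and the database; the article prescribes no specific API.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical cache-layer abstraction for the sketch below.
interface CacheLayer {
    Optional<String> get(String key);
    void put(String key, String value);
}

// Multi-level read path: local memory -> Redis -> DB, backfilling on a miss.
public class MultiLevelCache {
    private final CacheLayer local;                   // level 1: in-process map
    private final CacheLayer redis;                   // level 2: Redis adapter
    private final Function<String, String> dbLoader;  // level 3: database

    public MultiLevelCache(CacheLayer local, CacheLayer redis,
                           Function<String, String> dbLoader) {
        this.local = local;
        this.redis = redis;
        this.dbLoader = dbLoader;
    }

    public String get(String key) {
        Optional<String> v = local.get(key);          // check memory first
        if (v.isPresent()) return v.get();
        v = redis.get(key);                           // then the shared cache
        if (v.isPresent()) {
            local.put(key, v.get());                  // backfill upper level
            return v.get();
        }
        String fromDb = dbLoader.apply(key);          // finally the database
        redis.put(key, fromDb);
        local.put(key, fromDb);
        return fromDb;
    }
}
```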

As the business keeps growing, the performance of a single server soon hits its bottleneck.

3. Degradation

The ultimate goal of degradation is to keep core services available, even in a lossy (degraded) form. Note that some services cannot be degraded.

3.1. What is degradation?

Service degradation means that when server pressure spikes, some services and pages are strategically degraded according to the current business situation and traffic, freeing server resources and ensuring that core tasks run normally.

3.2. Service degradation methods

Deferred service: handle the work later via a scheduled task or delayed MQ processing.

Page degradation: gray out all clickable buttons on the page, or switch the page to a static one that reads "The system is under maintenance...".

Disabling non-core services: for example, an e-commerce site can switch off recommendations, shipping insurance, returns and refunds, and keep only the core place-order-and-pay flow.

Write degradation: for example, in a flash sale we can update only the cache and return immediately, then deduct the inventory in the DB asynchronously through MQ, guaranteeing eventual consistency. The DB write is thereby degraded to a cache write.

Read degradation: for example, in a multi-level cache setup, if the back-end service has problems, reads can be degraded to read-only cache access. (A minimal sketch follows.)
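A minimal sketch of the read-degradation idea under assumed names (DegradableReader and loadFromDb are mine): if the back-end read fails, serve the last cached value rather than an error.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Read degradation sketch: fall back to the last cached value when the
// back-end read fails. `loadFromDb` stands in for the real data access.
public class DegradableReader {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String read(String key) {
        try {
            String fresh = loadFromDb(key);     // normal path
            cache.put(key, fresh);              // keep the cache warm
            return fresh;
        } catch (RuntimeException backendDown) {
            // Degraded path: serve possibly stale data instead of failing.
            return cache.getOrDefault(key, "service degraded, try later");
        }
    }

    private String loadFromDb(String key) {
        // placeholder for the real DB/service call
        throw new UnsupportedOperationException("wire up real data access");
    }
}
```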

4. Other high-concurrency techniques

4.1. Clustering

  • When a single application can no longer support the incoming traffic, it can be deployed as a cluster; this is also called "horizontal scaling" of the application. Where one server used to provide the service, several more are deployed, and service capacity improves greatly.

  • With multiple servers deployed, there can still be only one user-facing entry point, e.g. www.evanshare.com, so "load balancing" is required. Load balancing is a necessary step once an application is scaled out into a cluster. After cluster deployment, if the user's session state must be preserved, "session sharing" has to be implemented. (A minimal round-robin sketch follows.)
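For illustration only, here is a round-robin selection sketch, the simplest load-balancing policy; in practice this job is usually done by Nginx, LVS, or hardware load balancers rather than application code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin balancing sketch: hand out back-end addresses in turn.
public class RoundRobinBalancer {
    private final List<String> servers;          // e.g. backend addresses
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    public String pick() {
        // floorMod keeps the index non-negative even after counter overflow
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```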

4.2. Splitting

4.2.1 Application splitting

Application splitting: distributed services (microservices)

As the business develops and features accumulate, a monolithic application gradually becomes very large. With many people maintaining such a system, development, testing, and releases all run into serious problems: code conflicts, duplicated code, confused logic, rising code complexity, slower response to new requirements, and growing hidden risks. The application therefore needs to be split along business dimensions and developed in a distributed way.

After the split, calls that used to happen within one process become remote method calls, which requires "remote invocation" technologies such as HttpClient, Hessian, Dubbo, or WebService. (A minimal HTTP example follows.)
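A small example of what such a remote call can look like, using the JDK's built-in HttpClient (Java 11+); the order-service URL and endpoint are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// After splitting, an in-process call becomes a remote call like this one.
public class RemoteOrderClient {
    private final HttpClient client = HttpClient.newHttpClient();

    public String getOrder(String orderId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://order-service/api/orders/" + orderId))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();   // e.g. an order JSON payload
    }
}
```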

As business complexity increases, open-source frameworks such as Dubbo and Spring Cloud can be adopted to improve development and maintenance efficiency.

After the split, scaling out becomes easy: if processing capacity falls short, just "add servers" (turn each split service into a small cluster).

4.2.2 Database splitting

1) Database splitting

Database splitting comes in two forms: "vertical splitting and horizontal splitting (splitting databases and tables)".

"Grouping tables by business dimension, putting tables of one type in one database and other tables in another" is called "vertical splitting".

For example, product, order, and user data that used to share one database can be moved into three: a "product database, an order database, and a user database". Different databases can then be deployed on different servers, which addresses single-machine capacity and performance limits and also removes the I/O contention between the tables.

"Splitting some rows of a table into one database and other rows into another, according to the characteristics and rules of the data rows", is called "horizontal splitting".

In a single-database, single-table setup, as data volume and traffic grow, large tables often become the performance bottleneck, so the database needs to be "horizontally split". (A routing sketch follows.)
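A hedged sketch of horizontal-split routing: hash a row's key to choose its shard. The modulo scheme and shard names are illustrative; real systems often delegate this to sharding middleware instead.

```java
import java.util.List;

// Shard routing sketch: pick the database (shard) for a row by its key.
public class ShardRouter {
    private final List<String> shards; // e.g. ["order_db_0", "order_db_1"]

    public ShardRouter(List<String> shards) {
        this.shards = List.copyOf(shards);
    }

    public String shardFor(long userId) {
        // floorMod keeps the index non-negative for any key value
        int index = (int) Math.floorMod(userId, (long) shards.size());
        return shards.get(index);  // all rows of one user land on one shard
    }
}
```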

2) Read/write splitting + master-slave replication

3) Database optimization

4.3. Static page generation

For data that is accessed heavily but updated infrequently, static HTML pages can be generated on a schedule and served directly to the front end, instead of going through JSP rendering.

Commonly used templating technologies: FreeMarker, Velocity. A scheduled task can, for example, regenerate the static home page every 2 minutes. (A sketch follows.)
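A sketch of such a timed regeneration using FreeMarker plus a JDK scheduler. The template name, data model, and output path are illustrative, and the FreeMarker dependency is assumed to be on the classpath.

```java
import freemarker.template.Configuration;
import freemarker.template.Template;

import java.io.File;
import java.io.FileWriter;
import java.io.Writer;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Regenerate the home page as static HTML every 2 minutes with FreeMarker.
public class HomePageStaticizer {

    public static void main(String[] args) throws Exception {
        Configuration cfg = new Configuration(Configuration.VERSION_2_3_31);
        cfg.setDirectoryForTemplateLoading(new File("templates"));
        cfg.setDefaultEncoding("UTF-8");

        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            try {
                Template tpl = cfg.getTemplate("index.ftl");
                Map<String, Object> model = Map.of("title", "Home"); // demo data
                try (Writer out = new FileWriter("static/index.html")) {
                    tpl.process(model, out);  // render template to static HTML
                }
            } catch (Exception e) {
                e.printStackTrace();          // keep the scheduler alive
            }
        }, 0, 2, TimeUnit.MINUTES);
    }
}
```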

Static pages greatly improve access speed: no database or cache lookup is needed, and the browser can load the HTML page directly.

Static pages also improve the stability of the website: even if the application or database has problems, the static pages remain accessible.

4.4. Peak shaving

   Peak shaving is essentially about delaying user requests and filtering them layer by layer, following the principle that "as few requests as possible should finally land on the database". Several common techniques:

  1. Rate-limiting algorithms: control traffic by limiting the number or rate of requests per unit time. Common algorithms include the token bucket and the leaky bucket.

  2. Caching: keep part of the data in memory to reduce the number of database visits and thus the pressure on the server.

  3. Load balancing: distribute traffic across multiple servers to avoid overloading any single one, improving system availability and stability.

  4. Asynchronous processing: handle time-consuming operations asynchronously (sending emails, generating reports, etc.) to avoid blocking the main thread and improve concurrency; see the sketch after this list.

  5. CDN acceleration: cache static resources on CDN nodes to speed up users' access to them and reduce server pressure.
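To illustrate point 4, here is a toy peak-shaving sketch: a bounded in-process queue stands in for the message queue (in production, RocketMQ or Kafka plays this role), absorbing a burst while a worker drains it at a steady pace.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Peak-shaving sketch: a bounded buffer absorbs the spike, a single worker
// processes at a fixed, sustainable rate, and overflow is rejected.
public class PeakShavingDemo {

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1000);

        // Consumer: drains the queue at a steady pace.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String req = queue.take();
                    Thread.sleep(10);          // simulate steady processing
                    System.out.println("processed " + req);
                }
            } catch (InterruptedException ignored) { }
        });
        worker.setDaemon(true);
        worker.start();

        // Producer: a burst of requests; offer() rejects when the buffer
        // is full instead of letting the spike crush the backend.
        for (int i = 0; i < 5000; i++) {
            if (!queue.offer("req-" + i)) {
                System.out.println("rejected req-" + i); // limited/degraded
            }
        }
        Thread.sleep(3000);  // let the worker drain part of the queue
    }
}
```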

4.5. Rate limiting

        Balance supply and demand through peak-shaving strategies so the system keeps running normally.

       Like a funnel, filter and shrink the volume of data and requests layer by layer. Users who only need to query can find what they need in the cache instead of hitting the database every time; what remains at the end are the customers who actually complete a transaction, which greatly reduces the read and write pressure on the DB.

1) The core idea of hierarchical filtering

  •  Filter out invalid requests as early as possible, at every level.
  •  Use the CDN to filter out the bulk of requests for images and static resources.
  •  Then use a distributed cache such as Redis to filter read requests; this is the typical upstream interception of reads.

2) Basic principles of hierarchical filtering

  •  Shard written data sensibly by time, and filter out requests that are already expired or invalid.
  •  Apply rate-limiting protection to write requests, filtering out those that exceed the system's carrying capacity.
  •  Do not perform strong-consistency checks on the reads involved, to avoid the bottleneck such checks create.
  •  Perform strong-consistency checks on written data, keeping only the last valid data.

        In the end, only valid requests reach the bottom of the "funnel" (the database). For example, only when the user actually reaches the order-and-pay step is strong data consistency required.
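As a concrete example of the rate-limiting layer, here is a short demo with Guava's RateLimiter, a token-bucket implementation. This assumes the Guava library is available; the article itself names no specific tool in this section.

```java
import com.google.common.util.concurrent.RateLimiter;

// Guava RateLimiter demo: requests beyond ~1000/s are rejected before
// they ever reach the cache or database layers.
public class ApiRateLimitDemo {
    private static final RateLimiter LIMITER = RateLimiter.create(1000.0);

    public static String handleRequest(String userId) {
        if (!LIMITER.tryAcquire()) {
            return "429 Too Many Requests"; // filtered out upstream of the DB
        }
        return "OK";  // proceed to the cache/DB lookup
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(handleRequest("user-" + i));
        }
    }
}
```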

5. Summary

High concurrency is one of the factors that must be considered in the architecture design of Internet distributed systems.

There are two main ways to improve a system's concurrency: vertical scaling (Scale Up) and horizontal scaling (Scale Out). Vertical scaling raises concurrency by improving single-machine hardware or single-machine architecture, but single-machine performance always has a ceiling. The ultimate answer for high-concurrency Internet distributed architecture is the latter: horizontal scaling.

At the same time, combine several strategies to divert and limit traffic:

1. For high-concurrency scenarios such as flash sales (seckill), the most basic principle is to intercept requests upstream of the system to reduce downstream pressure. If requests are not intercepted at the front, they are likely to cause read-write lock contention in the database (MySQL, Oracle, etc.), even deadlocks, and eventually avalanche scenarios.

2. Separate static and dynamic resources, and serve static resources through a CDN.

3. Make full use of caches (Redis, etc.) to increase QPS and thus the throughput of the whole cluster.

4. Peak traffic is a major cause of crushed systems, so a message queue such as RocketMQ is needed to absorb the instantaneous traffic spike on one end and push messages out smoothly on the other.

The above are some modest observations from my own work, recorded and shared here. If anything falls short, please point it out; thank you very much.

