Flash-Sale (Seckill) Business Solutions

Disclaimer: This is an original post by the blogger, licensed under CC 4.0 BY-SA. When reproducing it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/yjn1995/article/details/100150461

Overview

  Flash-sale (seckill) business is a landmark line of business for Internet e-commerce companies and a typical high-concurrency scenario. The core problem is that a huge number of users flood in within a very short time, and the resulting instantaneous traffic poses an enormous challenge to database and cache performance.

Technical Challenges Behind the Flash-Sale Business

High Concurrency and High Server Load

  We usually measure a server's throughput in QPS (Queries Per Second). For high-concurrency scenarios on the order of tens of thousands of requests per second, this metric is critical.
  Suppose the average response time for one request is 100 ms, the system has 20 servers, and each server allows at most 500 connections. The theoretical peak QPS of the web system (an idealized calculation) is then 20 × 500 × (1000 / 100) = 100,000. That is, the system could handle 100,000 requests per second, and a flash sale at 50,000 requests/s looks like a "paper tiger".
  In reality, however, when the servers are under high concurrency and high load, network bandwidth gets saturated and the average response time rises sharply. As the number of users grows, database connections and processes multiply and ever more context switches must be handled, so the server load grows heavier and heavier.
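The idealized calculation above can be made explicit. A tiny sketch (the 20 servers, 500 connections, and 100 ms figures are the ones from the example in the text):

```python
# Idealized peak QPS: every connection completes one request per response time.
def peak_qps(servers: int, max_connections: int, avg_response_ms: float) -> float:
    requests_per_connection_per_sec = 1000 / avg_response_ms
    return servers * max_connections * requests_per_connection_per_sec

# Figures from the example: 20 servers, 500 connections each, 100 ms per request.
print(peak_qps(servers=20, max_connections=500, avg_response_ms=100))  # 100000.0
```

As the following paragraphs note, this ideal number collapses in practice once bandwidth saturates and response times climb.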

High Coupling and the Avalanche Effect

  An even more frightening problem arises from coupling: once one application in the system becomes unavailable due to latency, users click even more frequently, and a vicious cycle sets in, the "avalanche". One server goes down, its traffic is redistributed to the machines that are still healthy, those machines become overloaded and go down in turn, and the whole system collapses.

How to Solve the Flash-Sale Bottlenecks

Flash-Sale Architecture Design Principles

Intercept requests upstream, reduce pressure downstream: a flash-sale system faces enormous concurrency, yet very few requests actually succeed. If requests are not intercepted at the front, they pile onto the database and cause read/write lock contention, possibly even deadlock, and ultimately request timeouts.
Make full use of caching: serving reads and writes from a cache greatly improves speed and reduces the pressure on the database.
Message middleware for peak clipping: a message queue can absorb a large burst of concurrent requests. This is asynchronous processing: the backend pulls request messages from the queue at a pace matching its own processing capacity.

Front-End Solution

  • Static pages: make every element on the event page static wherever possible and minimize dynamic elements; let a CDN absorb the traffic peak.
  • No repeated submission: gray out the button after the user submits, prohibiting duplicate submissions.
  • Limit users: allow each user to submit only one request within a given time window; for example, rate-limit by IP.
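The IP-limiting bullet can be sketched as a fixed-window counter. This is a minimal in-process illustration only; in a real deployment the counters would normally live in shared storage such as Redis, and the window length and limit below are arbitrary:

```python
import time
from collections import defaultdict

class IpRateLimiter:
    """Allow at most `limit` requests per IP inside each fixed `window_sec` window."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window_sec = window_sec
        # ip -> [window_start_time, request_count_in_window]
        self.counters = defaultdict(lambda: [0.0, 0])

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        window_start, count = self.counters[ip]
        if now - window_start >= self.window_sec:
            self.counters[ip] = [now, 1]   # start a fresh window for this IP
            return True
        if count < self.limit:
            self.counters[ip][1] = count + 1
            return True
        return False                       # over the limit inside this window

# Example: one request per IP per 10-second window.
limiter = IpRateLimiter(limit=1, window_sec=10)
print(limiter.allow("1.2.3.4", now=100.0))  # True  (first request)
print(limiter.allow("1.2.3.4", now=105.0))  # False (same window)
print(limiter.allow("1.2.3.4", now=111.0))  # True  (new window)
```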

Back-End Solution

Controller Layer

  • Restrict the access frequency of the same user: the measures above block requests coming from the browser, but they cannot stop malicious scripts or other clients. The server-side controller layer must therefore identify each user uniquely and restrict how frequently the same user can access the service.
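One common way to enforce such a per-user frequency limit is a token bucket keyed by user id. A minimal sketch (the refill rate and capacity are illustrative, and a real system would keep this state in shared storage rather than process memory):

```python
import time

class TokenBucket:
    """Per-user token bucket: each request costs one token;
    tokens refill at `rate` per second, up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}  # user_id -> (tokens_left, last_refill_time)

    def allow(self, user_id, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(user_id, (self.capacity, now))
        tokens = min(self.capacity, tokens + (now - last) * self.rate)  # refill
        if tokens >= 1.0:
            self.buckets[user_id] = (tokens - 1.0, now)
            return True
        self.buckets[user_id] = (tokens, now)
        return False

# Example: each user may send one request every 10 seconds on average.
bucket = TokenBucket(rate=0.1, capacity=1)
print(bucket.allow("user42", now=0.0))   # True
print(bucket.allow("user42", now=5.0))   # False (only 0.5 tokens refilled)
print(bucket.allow("user42", now=15.0))  # True
```

Unlike a fixed window, the token bucket smooths the allowed rate instead of resetting it at window boundaries.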

Service Layer

  The measures above intercept only part of the traffic. When the number of flash-sale users is large, even if every user sends just one request, the request volume reaching the service layer is still huge. For example, if 1,000,000 users simultaneously rush to buy 100 phones, the service layer faces concurrent pressure of at least 1,000,000 requests.

  • Use a message queue: write these requests into a message queue as a buffer. The database layer subscribes to the messages and decrements inventory; a request whose decrement succeeds is answered with "flash sale succeeded", and a failed one with "flash sale ended".
  • Use caching: a rush such as the Double 11 flash sale is a typical read-heavy, write-light business. Most requests are queries, so a cache can take much of the query load off the database. The cache can also absorb writes: for example, move the inventory count from the database into Redis, perform all inventory decrements in Redis, and let a background process sync the users' successful flash-sale requests from Redis back to the database.
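The two bullets above can be simulated in-process: requests enter a queue, and a single worker drains it and decrements an in-memory inventory counter, standing in for both the message middleware and the Redis counter. A simplified sketch:

```python
import queue
import threading

STOCK = 3                 # items for sale (illustrative number)
requests = queue.Queue()  # stands in for the message queue
results = {}              # user_id -> "success" | "sold out"

def worker():
    """Single consumer: drains the queue and decrements inventory serially,
    so there is no race on the stock counter."""
    stock = STOCK
    while True:
        user_id = requests.get()
        if user_id is None:          # sentinel: no more requests
            break
        if stock > 0:
            stock -= 1
            results[user_id] = "success"
        else:
            results[user_id] = "sold out"

t = threading.Thread(target=worker)
t.start()
for uid in ["u1", "u2", "u3", "u4", "u5"]:  # 5 users race for 3 items
    requests.put(uid)
requests.put(None)                          # signal end of requests
t.join()
print(results)  # u1..u3 succeed, u4 and u5 see "sold out"
```

Because the queue is FIFO and there is one consumer, the inventory decrement is serialized regardless of how many producers enqueue concurrently, which is exactly the clipping effect described above.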

Database Layer

  The database layer is the most vulnerable one. As a general design principle, requests should be intercepted upstream so that the database layer bears only the access requests "within its capability". With the queues and caches introduced at the service layer above, the database at the bottom can rest easy.
Example: a simple flash-sale system built from message middleware and a cache
Redis is a distributed caching system that supports a variety of data structures; with it we can easily build a powerful flash-sale system.
  We can use Redis's simple key-value structure: the key identifies the flash-sale item, user ids are the values, and the inventory count is the upper bound on how many requests are accepted (the counter behaves like an atomic variable, e.g. an AtomicInteger). For each flash-sale request we insert the user's id with RPUSH key value; once the number of inserted requests reaches the inventory limit, all subsequent insertions are stopped.
  We can then start multiple background worker threads that use LPOP key to read the ids of the users who succeeded, and then operate on the database to place the final order and decrement inventory.
  Of course, the Redis above can be replaced by message middleware such as ActiveMQ or Kafka, or cache and message middleware can be combined: the cache receives and records user requests, and the middleware is responsible for syncing requests from the cache to the database.
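The RPUSH/LPOP flow just described can be mimicked with an in-memory list standing in for the Redis key. Note this is only a sketch of the logic: with a real Redis client, the length check plus RPUSH would need a Lua script or a transaction to stay atomic, which the lock simulates here:

```python
import threading
from collections import deque

STOCK_LIMIT = 2          # inventory: only this many requests may be accepted
spike_list = deque()     # stands in for the Redis list stored under `key`
lock = threading.Lock()  # stands in for the atomicity Redis would provide

def rpush_if_capacity(user_id):
    """Accept the request (RPUSH) only while the list is below the inventory limit."""
    with lock:
        if len(spike_list) >= STOCK_LIMIT:
            return False             # sold out: reject immediately
        spike_list.append(user_id)
        return True

def lpop_worker(orders):
    """Background worker: LPOP winning user ids and write their orders."""
    while True:
        with lock:
            if not spike_list:
                return
            user_id = spike_list.popleft()
        orders.append(user_id)       # a real system would insert an order row here

accepted = [uid for uid in ["u1", "u2", "u3", "u4"] if rpush_if_capacity(uid)]
orders = []
lpop_worker(orders)
print(accepted, orders)  # ['u1', 'u2'] ['u1', 'u2']
```

The key property is that admission is decided at enqueue time: once the list holds as many ids as there is stock, every later request is rejected without ever touching the database.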

Summary

The main coping strategies for flash-sale (high-concurrency) scenarios:

  • Rate limiting: since only a small number of users can succeed in a flash sale, restrict most of the traffic and allow only a small portion through to the back-end services.

  • Peak clipping: a flash-sale system sees an instantaneous influx of a large number of users, so there is a high traffic spike right at the start of the rush. Peak traffic is a major reason systems get overwhelmed, so turning instantaneous high traffic into smooth traffic spread over time is a very important design idea for flash-sale systems. Peak clipping is commonly implemented with message-middleware and caching techniques.

  • Asynchronous processing: a flash-sale system is a high-concurrency system, and asynchronous processing can greatly increase its concurrency; in fact it is one way of implementing peak clipping.

  • In-memory caching: the biggest bottleneck of a flash-sale system is usually database reads and writes. Database access means disk I/O, which performs poorly; if we can move some of the data or business logic into an in-memory cache, efficiency improves dramatically.

  • Scalability: to support more users and higher concurrency, it is best to design the system to scale elastically: when traffic comes, just add machines. This is what Taobao and JD do, adding large numbers of machines to cope with the transaction peaks of events like Double 11.
