High Concurrency - Solution

Concurrency and Parallelism

Concurrency: When multiple threads are running on a system with only one CPU, the CPU cannot truly execute more than one thread at the same time. Instead, CPU time is divided into slices that are assigned to the threads in turn; while one thread's code runs during its slice, the other threads are suspended. We call this way of working concurrency (Concurrent).

Parallelism: When the system has more than one CPU, threads no longer have to share a single processor. While one CPU executes one thread, another CPU can execute a different thread; the two threads do not preempt each other's CPU resources and can run at the same time. This is called parallelism (Parallel).

Difference: Concurrency and parallelism are related but distinct concepts. Parallelism means that two or more events happen at the same instant, while concurrency means that two or more events happen within the same time interval. In a multiprogramming environment, concurrency means that, macroscopically, multiple programs are running during the same period of time; on a single-processor system only one program can execute at any given moment, so microscopically these programs run alternately in a time-shared fashion. If the system has multiple processors, the concurrently executable programs can be distributed across them and executed in parallel, with each processor handling one of the programs, so that multiple programs truly run at the same time.

What is high concurrency?

High concurrency means designing the system so that it can process many requests in parallel at the same time, for example in a flash-sale (seckill) scenario.

What is a seckill (flash sale)?
The seckill scenario typically arises when an e-commerce site runs a promotional event, or when users rush to buy train tickets on the 12306 website during holidays. For scarce or special products, e-commerce sites usually sell a limited quantity starting at an appointed time. Because of the nature of these products, a large number of users flock to the seckill page at that time and try to snap them up simultaneously.

Seckill architecture design concepts
Rate limiting: Since only a small number of users can succeed in the seckill, most of the traffic should be throttled, and only a small portion allowed through to the back-end services (a minimal rate-limiting sketch follows this list).
Peak shaving: A huge number of users pour into a seckill system at once, producing a very high instantaneous peak at the start of the sale. Peak traffic is a major cause of system overload, so turning an instantaneous burst into a smooth flow of traffic spread over a period of time is a key idea in seckill system design. Common peak-shaving techniques include caching and message middleware.
Asynchronous processing: A seckill system is a high-concurrency system, and an asynchronous processing model can greatly increase its throughput. Asynchronous processing is, in effect, one way of implementing peak shaving.
In-memory cache: The biggest bottleneck of a seckill system is usually database reads and writes. Since database access is disk I/O, its performance is low; moving part of the data or business logic into an in-memory cache greatly improves efficiency.
Scalability: To support more users and higher concurrency, it is best to design the system to be elastic and scalable, so that a traffic surge can be handled simply by adding machines. During events such as Double Eleven, Taobao and JD.com add large numbers of machines to cope with the transaction peak.
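To make the rate-limiting idea concrete, here is a minimal, self-contained sketch of a fixed-window counter limiter in Java. The class name and the limit of 1,000 requests per second are illustrative assumptions, not part of the original design; a production system would usually rate-limit at the gateway or middleware level.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Minimal fixed-window rate limiter: allows at most `limit` requests per one-second window.
// Requests beyond the limit are rejected immediately instead of reaching the back end.
public class FixedWindowRateLimiter {
    private final int limit;                       // max requests allowed per window
    private final AtomicInteger counter = new AtomicInteger(0);
    private final AtomicLong windowStart = new AtomicLong(System.currentTimeMillis());

    public FixedWindowRateLimiter(int limit) {
        this.limit = limit;
    }

    public boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long start = windowStart.get();
        // Start a new window once the current one has expired
        // (approximate under heavy contention, which is fine for a sketch).
        if (now - start >= 1000 && windowStart.compareAndSet(start, now)) {
            counter.set(0);
        }
        // Admit the request only while the counter stays within the limit.
        return counter.incrementAndGet() <= limit;
    }

    public static void main(String[] args) {
        FixedWindowRateLimiter limiter = new FixedWindowRateLimiter(1000);
        if (limiter.tryAcquire()) {
            System.out.println("request admitted");
        } else {
            System.out.println("request rejected: too many requests this second");
        }
    }
}
```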

Architecture scheme
General seckill system architecture


Design ideas
Intercept requests upstream to reduce downstream pressure: A seckill system is characterized by massive concurrency, but very few requests actually succeed. If requests are not intercepted at the front end, database read/write lock contention, or even deadlock, is likely, and requests will eventually time out.
Make full use of caching: Caching greatly improves the system's read and write speed.
Message queue: A message queue shaves the peak by buffering a large number of concurrent requests. This is also an asynchronous process: the back-end business pulls request messages from the queue and processes them according to its own capacity (a small producer-consumer sketch follows this list).
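As an illustration of queue-based peak shaving, here is a minimal in-process sketch using a bounded BlockingQueue: requests beyond the queue capacity are rejected immediately, and a worker drains the queue at its own pace. In a real seckill system the queue would be external middleware (for example RabbitMQ); the capacity and timing values below are arbitrary assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class PeakShavingDemo {
    // Bounded queue: once full, new requests fail fast instead of overwhelming the back end.
    private static final BlockingQueue<String> requestQueue = new ArrayBlockingQueue<>(100);

    // Called by the front-facing layer: enqueue the request or reject it immediately.
    static boolean submit(String userId) {
        return requestQueue.offer(userId);   // non-blocking; returns false when the queue is full
    }

    public static void main(String[] args) throws InterruptedException {
        // Background worker consumes requests at its own, steady pace.
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String userId = requestQueue.poll(1, TimeUnit.SECONDS);
                    if (userId != null) {
                        // A real system would reduce inventory in the database here.
                        System.out.println("processing seckill request from user " + userId);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.start();

        // Simulate a burst of incoming requests.
        for (int i = 0; i < 200; i++) {
            if (!submit("user-" + i)) {
                System.out.println("user-" + i + " rejected: system busy");
            }
        }
        Thread.sleep(2000);
        worker.interrupt();
    }
}
```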

Front-end solutions
Browser side (JS):
Page staticization: make all static elements of the activity page static and minimize dynamic elements; absorb the traffic peak through a CDN.
Prohibit repeated submission: gray out the button after the user submits, so the request cannot be submitted again.
User rate limiting: only allow a user to submit a request once within a certain period of time; for example, IP-based rate limiting can be adopted.

Back-end solutions
Server controller layer (gateway layer)
Limit uid (UserID) access frequency: The browser-side measures above already intercept some requests, but against malicious attacks or clients that bypass the page, the server-side control layer must limit the access frequency of the same uid.
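One common way to do this is a per-uid counter with a short expiry in Redis. The sketch below assumes the Jedis client; the key prefix and the limit of 5 requests per 10 seconds are illustrative, not values from the original text.

```java
import redis.clients.jedis.Jedis;

public class UidRateLimiter {
    private static final int WINDOW_SECONDS = 10;  // assumed window length
    private static final int MAX_REQUESTS = 5;     // assumed per-uid limit inside the window

    // Returns true if the uid is still within its allowed access frequency.
    public static boolean allow(Jedis jedis, String uid) {
        String key = "seckill:freq:" + uid;        // hypothetical key naming scheme
        long count = jedis.incr(key);              // atomically count this access
        if (count == 1) {
            jedis.expire(key, WINDOW_SECONDS);     // start the window on the first access
        }
        return count <= MAX_REQUESTS;
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            System.out.println(allow(jedis, "10001") ? "pass" : "too frequent, rejected");
        }
    }
}
```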

Service layer: only a portion of the access requests are intercepted
Even if each user sends only one request, the number of requests reaching the service layer is still very large when many users take part in the seckill. For example, if 1,000,000 users compete for 100 phones at the same time, the concurrent request pressure on the service layer is at least 1,000,000.
Use a message queue to buffer requests: Since the service layer knows there are only 100 phones in stock, there is no need to pass 1,000,000 requests on to the database. These requests can first be written into a message queue; the database layer subscribes to the messages and reduces the inventory. Requests that reduce the inventory successfully are answered with "seckill succeeded", and the rest with "seckill ended".
Use a cache for read requests: A ticket-selling service such as 12306 is a typical read-heavy, write-light service; most requests are queries, so a cache can share the database's load.
Use a cache for write requests: A cache can also handle write requests. For example, we can move the inventory data from the database into a Redis cache, perform all inventory-decrement operations in Redis, and then use a background process to synchronize the seckill requests recorded in Redis back to the database (a minimal decrement sketch follows this list).
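Here is a minimal sketch of decrementing inventory in Redis, again assuming the Jedis client. The decrement is atomic; if the counter goes below zero the stock is already gone and the decrement is rolled back. The key name and the initial stock of 100 are illustrative assumptions.

```java
import redis.clients.jedis.Jedis;

public class RedisStockDemo {
    private static final String STOCK_KEY = "seckill:stock:phone";  // hypothetical key

    // Try to take one unit of stock; returns true only if stock was still available.
    public static boolean tryDecrementStock(Jedis jedis, String key) {
        long remaining = jedis.decr(key);   // atomic decrement in Redis
        if (remaining < 0) {
            jedis.incr(key);                // roll back: stock was already exhausted
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set(STOCK_KEY, "100");    // load the initial inventory into the cache
            boolean ok = tryDecrementStock(jedis, STOCK_KEY);
            System.out.println(ok ? "seckill succeeded" : "seckill ended: sold out");
        }
    }
}
```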

Database layer
The database layer is the most vulnerable layer. When designing the application, requests should be intercepted upstream so that the database layer only handles access requests within its "capacity range". With queues and caches introduced at the service layer, the underlying database can rest easy.

Case: Implementing a simple seckill system using message middleware and cache
Redis is a distributed cache system that supports multiple data structures. We can use Redis to easily implement a powerful seckill system.
We can use Redis's simplest structure: a fixed key holds the list of seckill requests, each user's id is the value, and the inventory quantity is the upper limit, tracked with an atomic counter (AtomicInteger). For each user's seckill request we insert the user id with RPUSH key value; once the number of inserted requests reaches the upper limit, all subsequent insertions are rejected.
We can then start multiple worker threads on the platform that use LPOP key to read the user ids of the seckill winners, and then operate on the database to create the orders and do the final inventory reduction.
Of course, the Redis list above can also be replaced with message middleware such as ActiveMQ or RabbitMQ, or the cache and the message middleware can be combined: the cache system receives and records the user requests, while the message middleware synchronizes the requests in the cache to the database.
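Below is a minimal end-to-end sketch of this case, assuming the Jedis client. A request is accepted only while the accepted-request counter (an AtomicInteger capped at the inventory size) has not reached the limit; accepted user ids are pushed onto a Redis list with RPUSH, and a worker thread pops winners with LPOP and would write the order to the database. Key names and the inventory of 100 are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;
import redis.clients.jedis.Jedis;

public class SimpleSeckill {
    private static final String REQUEST_LIST_KEY = "seckill:requests";  // hypothetical key
    private static final int STOCK = 100;                               // illustrative inventory

    // Counts how many requests have been accepted; capped at the inventory size.
    private static final AtomicInteger accepted = new AtomicInteger(0);

    // Front-facing call: accept the request only while inventory slots remain.
    static boolean submit(Jedis jedis, String userId) {
        if (accepted.incrementAndGet() > STOCK) {
            accepted.decrementAndGet();          // over the limit, roll the counter back
            return false;                        // seckill has ended for this user
        }
        jedis.rpush(REQUEST_LIST_KEY, userId);   // record the winning request
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        try (Jedis producerConn = new Jedis("localhost", 6379);
             Jedis workerConn = new Jedis("localhost", 6379)) {

            // Worker thread: pop winners and create their orders in the database.
            Thread worker = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    String userId = workerConn.lpop(REQUEST_LIST_KEY);
                    if (userId == null) {
                        try { Thread.sleep(50); } catch (InterruptedException e) { break; }
                        continue;
                    }
                    // A real system would reduce inventory and insert the order row here.
                    System.out.println("order created for user " + userId);
                }
            });
            worker.start();

            // Simulate a burst of seckill requests.
            for (int i = 0; i < 1000; i++) {
                submit(producerConn, "user-" + i);
            }
            Thread.sleep(1000);
            worker.interrupt();
            worker.join();
        }
    }
}
```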

Summary:
When dealing with high concurrency, three aspects must be considered first:
1. Storage medium
2. Data consistency
3. Computer hardware

In terms of front-end and back-end handling:

Front-end solutions: prohibit repeated clicks, serve static pages, rate-limit users, prohibit repeated submissions, and similar measures.

Back-end solutions: use a message queue to buffer requests, use caches to serve read requests, use caches to handle write requests, and similar measures; limit the access frequency of each uid, and keep database operations within the capacity of the database layer.

The business logic layer should also be considered: for example, adding business-level locks along the normal flow can separate the different business chains without affecting how each chain operates.



