WeChat high-concurrency capital transaction system design scheme - the technical support behind the tens of billions of red envelopes


Every year on holidays, the number of WeChat red envelopes sent and received will skyrocket, especially on New Year's Eve. What kind of technical support is needed behind such large-scale and high-peak business needs? How to ensure concurrent performance and financial security with a red envelope scale of tens of billions?
Background Introduction
On January 28, 2017, the first day of the first lunar month, WeChat announced the number of WeChat red envelopes sent and received by users on New Year's Eve: 14.2 billion, with a peak of 760,000 per second. How can concurrency performance and financial security be ensured at a scale of tens of billions of red envelopes? This is a major challenge for WeChat. To meet it, and building on an analysis of the industry's "seckill" (flash sale) system solutions, WeChat Red Envelope adopted SET partitioning, serialization of requests through queuing, and two-dimensional splitting of databases and tables, forming a distinctive solution for high concurrency and capital security. Practice has shown that the scheme performs stably: the system ran with zero failures on New Year's Eve.
This article introduces the high-concurrency design of the system behind those tens of billions of red envelopes, covering the two major business characteristics of WeChat red envelopes, the technical difficulties of the WeChat red envelope system, the solutions commonly used for high-concurrency problems, and the high-concurrency solution of the WeChat red envelope system.
Two business characteristics of WeChat red envelopes
WeChat red envelopes (especially red envelopes sent in WeChat groups, that is, group red envelopes) are very similar in business form to ordinary online commodity "seckill" activities.
When a user sends a red envelope in a WeChat group, it is equivalent to listing a product in a "seckill" activity; the users in the group grabbing the red envelope is equivalent to the inventory query in a "seckill" activity; and after a user grabs the red envelope, opening it corresponds to the user's actual "seckill" action.
However, in addition to the above points, WeChat red envelopes also have their own characteristics in terms of business form compared with ordinary commodity "seckill" activities:
First, the WeChat red envelope business has more massive concurrent requirements than ordinary commodity "seckill" activities.
When a user sends a red envelope in a WeChat group, it is equivalent to publishing one product "seckill" activity online. If users in 100,000 WeChat groups are sending red envelopes at the same time, it is equivalent to 100,000 "seckill" activities being released at the same time. When the users in those 100,000 groups grab red envelopes at the same time, a massive number of concurrent requests is generated.
Second, the WeChat red envelope business requires a stricter level of security.
WeChat red envelope business is essentially a capital transaction. WeChat Red Envelope is a merchant of WeChat Pay, which provides capital transfer services.
When a user sends a red envelope, it is equivalent to using WeChat Pay to buy a sum of "money" from the WeChat Red Envelope merchant, with the WeChat group as the delivery address. Once the user's payment succeeds, the red envelope is "delivered" to the WeChat group. After a user in the group opens the red envelope, WeChat Red Envelope transfers the "money" into the WeChat Change balance of the user who opened it.
Fund transactions require a higher level of security than ordinary commodity "seckill" activities. The products in an ordinary "seckill" are provided by the merchant, and the inventory is preset by the merchant. During a "seckill", "overselling" (the quantity actually grabbed exceeds the planned inventory) and "underselling" (the quantity actually grabbed is less than the planned inventory) can be tolerated. For WeChat red envelopes, however, a 100 yuan red envelope sent by a user must never pay out 101 yuan; and if a user sends 100 yuan and only 99 yuan is claimed, the remaining 1 yuan must be refunded exactly to the sender when the red envelope expires after 24 hours. Neither more nor less.
The above are the two major features of the WeChat red envelope business model.
The technical difficulties of the WeChat red envelope system
Before discussing the technical difficulties of the WeChat red envelope system, here is the architecture of a simple, typical commodity "seckill" system, as shown in the following figure.

The system consists of access layer, logical service layer, storage layer and cache. Proxy handles request access, Server carries the main business logic, Cache is used to cache inventory quantities, and DB is used for data persistence.
A "seckill" activity corresponds to one inventory record in the DB. When a user "seckills" the product, the main logic of the system is the operation on that inventory record in the DB. Generally speaking, the DB operation has the following three steps:
Lock Inventory
Insert "Seckill" records
Update Inventory
Locking the inventory prevents "overselling" under concurrent requests. All three steps must be completed in a single transaction (a transaction is a series of operations performed as a single logical unit of work: either all of them take effect or none of them do).
The design difficulty of the "Seckill" system lies in this transaction operation. The commodity inventory is recorded as a row in the DB. When a large number of users "seckill" the same commodity at the same time, the first request to the DB locks this row of inventory records. This lock is occupied by the first request until the first transaction is committed, and all subsequent requests need to be queued. The more users who participate in the "seckill" at the same time, the more requests are sent to the DB concurrently, and the more serious the request queuing is. Therefore, concurrent requests to grab locks are the design difficulties of typical commodity "seckill" systems.
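The three steps above can be sketched as follows. This is a minimal illustration, not the production implementation: the table names, the seckill function, and the use of SQLite are assumptions made for the example. SQLite has no row-level locks, so BEGIN IMMEDIATE (a database-wide write lock) stands in for the row lock that a statement such as SELECT ... FOR UPDATE would take in a server DB.

```python
import sqlite3


def open_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the code
    # below controls BEGIN/COMMIT/ROLLBACK explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)"
    )
    conn.execute("CREATE TABLE seckill_record (item_id INTEGER, user_id INTEGER)")
    conn.execute("INSERT INTO inventory VALUES (1, 2)")  # hypothetical item, 2 units
    return conn


def seckill(conn, item_id, user_id):
    """Return True if the user obtained one unit, False if stock ran out."""
    # Step 1: lock inventory. BEGIN IMMEDIATE is a stand-in for a row lock.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (stock,) = conn.execute(
            "SELECT stock FROM inventory WHERE item_id = ?", (item_id,)
        ).fetchone()
        ok = stock > 0
        if ok:
            # Step 2: insert the "seckill" record.
            conn.execute(
                "INSERT INTO seckill_record VALUES (?, ?)", (item_id, user_id)
            )
            # Step 3: update the inventory.
            conn.execute(
                "UPDATE inventory SET stock = stock - 1 WHERE item_id = ?",
                (item_id,),
            )
        conn.execute("COMMIT")  # the lock is held until this point
        return ok
    except Exception:
        conn.execute("ROLLBACK")  # all three steps succeed or none do
        raise
```

Because the lock is held from step 1 until the commit, every other request for the same item has to wait, which is exactly the queuing problem described above.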
Compared with ordinary commodity "seckill" activities, WeChat red envelope business has the characteristics of massive concurrency and high security level requirements. In the design of the WeChat red envelope system, in addition to concurrent requests for lock grabbing, there are the following two outstanding difficulties:
First, transaction-level operations are of large magnitude. As mentioned above when introducing the characteristics of WeChat red envelope business, under normal circumstances, there will be tens of thousands of WeChat groups sending red envelopes at the same time. This business feature is mapped to the design of the WeChat red envelope system, that is, there are tens of thousands of "concurrent requests to grab locks" at the same time. This makes the pressure on the DB many times greater than that of ordinary single item "inventory" being locked.
Second, the transactional requirements are strict. The WeChat red envelope system is essentially a capital transaction system, which has higher transaction-level requirements than ordinary commodity "seckill" systems.
Common solutions for solving high concurrency problems
Ordinary commodity "seckill" systems generally solve high concurrency problems with the following solutions:
Solution 1: Use memory operations to replace real-time DB transaction operations.
As shown in Figure 2, the "real-time inventory deduction" is moved up into an in-memory cache operation. Once the cache operation succeeds, success is returned to the Server immediately, and the change is then persisted to the DB asynchronously.

The advantage of this scheme is that memory operations are used instead of disk operations, which improves concurrent performance.
However, the drawback is just as obvious: if the memory operation succeeds but DB persistence fails, or if the in-memory cache itself fails, data is lost. This makes the scheme unsuitable for a capital transaction system such as WeChat red envelopes.
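The failure mode can be shown with a small simulation. This is only a sketch under stated assumptions: CachedInventory and its methods are hypothetical names, and plain Python attributes stand in for the cache and the DB.

```python
from collections import deque


class CachedInventory:
    """Toy model of 'deduct in memory first, persist to the DB later'."""

    def __init__(self, stock):
        self.cache_stock = stock  # in-memory cache value
        self.db_stock = stock     # stand-in for the durable DB row
        self.pending = deque()    # deductions not yet written to the DB

    def seckill(self, user_id):
        if self.cache_stock <= 0:
            return False
        self.cache_stock -= 1
        self.pending.append(user_id)
        return True  # success is reported before anything is persisted

    def flush(self):
        """Asynchronous persistence step: write pending deductions to the DB."""
        written = 0
        while self.pending:
            self.pending.popleft()
            self.db_stock -= 1
            written += 1
        return written

    def crash(self):
        """Simulate losing the cache: unflushed deductions simply vanish."""
        lost = len(self.pending)
        self.pending.clear()
        self.cache_stock = self.db_stock  # cache is rebuilt from the DB
        return lost
```

If the cache is lost between a successful seckill call and the next flush, a user has been told "success" while the DB has no trace of it. For merchandise this may be tolerable; for money it is not.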
Solution 2: Use optimistic locks instead of pessimistic locks.
Pessimistic locking is a method of concurrency control in relational database management systems. It prevents a transaction from modifying data in a way that affects other users. If an operation performed by a transaction applies a lock to a row of data, other transactions can perform operations that conflict with the lock only after the transaction releases it. This corresponds to the "concurrent request lock grabbing" behavior in the analysis above.
Optimistic locking assumes that concurrent transactions from multiple users will not affect each other during processing, so each transaction works on its part of the data without taking locks. Before committing a data update, each transaction checks whether other transactions have modified the data since it read it. If they have, the committing transaction is rolled back.
In the commodity "seckill" system, the specific application method of optimistic locking is to maintain a version number in the "inventory" record of the DB. Before updating the "inventory", go to the DB to obtain the current version number. When the transaction that updates the inventory commits, it checks whether the version number has been modified by another transaction. If the version has not been modified, the transaction is committed, and the version number is incremented by 1; if the version number has been modified by other transactions, the transaction is rolled back and an error is reported to the upper layer.
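A minimal sketch of the version-number check follows. The schema and function names are assumptions for illustration, and the check is folded into a single conditional UPDATE, which is one common way to implement it and not necessarily how any particular "seckill" system does it.

```python
import sqlite3


def make_db(stock):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        "item_id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO inventory VALUES (1, ?, 0)", (stock,))
    conn.commit()
    return conn


def read_inventory(conn, item_id):
    """Read the current stock and version number without taking a lock."""
    return conn.execute(
        "SELECT stock, version FROM inventory WHERE item_id = ?", (item_id,)
    ).fetchone()


def try_deduct(conn, item_id, seen_version):
    """Deduct one unit only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE inventory SET stock = stock - 1, version = version + 1 "
        "WHERE item_id = ? AND version = ? AND stock > 0",
        (item_id, seen_version),
    )
    conn.commit()
    # rowcount == 0 means the version moved on: the caller must report failure.
    return cur.rowcount == 1
```

Two requests that both read version 0 illustrate the behavior: the first try_deduct succeeds and bumps the version to 1, and the second finds no row with version 0 and fails, even though stock remains.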
This solution solves the problem of "lock grabbing by concurrent requests" and can improve the concurrent processing capability of the DB.
However, if it is applied to the WeChat red envelope system, there will be the following three problems:
First, if opening a red envelope uses optimistic locking, then among the concurrent open requests that read the same version number, only one can succeed. The others roll back their transactions and return failure, reporting an error to the user. That user experience is completely unacceptable.
Second, with optimistic locking, some of the users who open the red envelope at the very first moment will fail outright, while users with "slow hands" may succeed because concurrency has dropped by then. This hurts the user experience.
Third, optimistic locking produces a large number of invalid update requests and transaction rollbacks, putting unnecessary extra pressure on the DB.
Based on the above reasons, the WeChat red envelope system cannot solve the problem of concurrent lock grabbing by using optimistic locking.
High concurrency solution of WeChat red envelope system
Based on the above analysis, WeChat red envelope system adopts the following solutions to solve the high concurrency problem according to the corresponding technical difficulties.
1. Vertical SET partitioning of the system: divide and conquer.
When a WeChat red packet user sends a red packet, the WeChat red packet system generates an ID as the unique identification of the red packet. Next, all operations of this red packet, such as sending red packets, grabbing red packets, opening red packets, and querying the details of red packets, are related to this ID.
Based on the red envelope ID, the red envelope system is partitioned vertically, from top to bottom, according to certain rules (such as taking the trailing digits of the ID modulo some number). After partitioning, the logic Servers and the DB along one vertical chain are collectively called a SET.
SETs are independent of and decoupled from one another. All requests for the same red envelope ID, including sending, grabbing, opening, and querying details, are routed to the same SET for processing, which makes each SET highly cohesive. In this way, the system splits the huge torrent of red envelope requests into many small streams that do not affect each other, dividing and conquering, as shown in the figure below.

This solution addresses the problem of a massive number of simultaneous transaction-level operations by breaking the massive load into many small ones.
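The routing rule can be sketched in a few lines. The number of SETs, the class name, and the plain modulo rule are assumptions for illustration; the article only says the split follows "certain rules" such as a modulo on the ID.

```python
NUM_SETS = 4  # hypothetical number of SETs


class RedPacketSet:
    """One vertical slice: its own logic servers and its own DB."""

    def __init__(self, index):
        self.index = index
        self.db = {}  # stand-in for this SET's private DB

    def handle(self, op, red_packet_id):
        # Record the operation in this SET's DB only.
        self.db.setdefault(red_packet_id, []).append(op)


SETS = [RedPacketSet(i) for i in range(NUM_SETS)]


def route(red_packet_id):
    """Pick the SET for a red envelope ID by taking the ID modulo NUM_SETS."""
    return SETS[red_packet_id % NUM_SETS]


def dispatch(op, red_packet_id):
    """Send one operation to the SET that owns the ID; return that SET's index."""
    target = route(red_packet_id)
    target.handle(op, red_packet_id)
    return target.index
```

Every operation on a given ID lands in the same SET, so no SET ever needs data held by another one.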
2. The logical server layer queues requests to solve the DB concurrency problem.
The red envelope system is a capital transaction system, so the transactional nature of its DB operations cannot be avoided, and "concurrent lock grabbing" will therefore occur. However, if the transactions that reach the DB (that is, the opening of red envelopes) arrive serially and not concurrently, there is no "concurrent lock grabbing" problem.
Following this idea, to make the open-red-envelope transactions enter the DB serially, it is enough to queue the requests at the Server layer in FIFO (first-in, first-out) order. The problem therefore comes down to designing the Server's FIFO queue.
The WeChat red envelope system has designed a distributed, lightweight and flexible FIFO queue scheme. The specific implementation is as follows:
First, route all requests with the same red envelope ID to the same Server.
The SET scheme introduced above already routes all requests for the same red envelope ID to the same SET. Within a SET, however, multiple Servers are connected to the same DB at the same time (for disaster tolerance and performance, multiple Servers are needed to back each other up and share the load).
In order to make all requests for the same red packet ID stick to the same server, in addition to the SET design, the WeChat red packet system adds a layer of distribution based on the red packet ID hash value, as shown in the figure below.
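A sketch of this second dispatch layer is shown below. The server names, the choice of CRC32 as the hash, and the modulo mapping are all assumptions made for the example; the article does not say which hash function is used.

```python
import zlib

# Hypothetical Servers inside one SET, all connected to the same DB.
SERVERS_IN_SET = ["server-a", "server-b", "server-c"]


def id_hash(red_packet_id: int) -> int:
    """Stable hash of the red envelope ID (CRC32 chosen only for illustration)."""
    return zlib.crc32(str(red_packet_id).encode("ascii"))


def server_for(red_packet_id: int, servers=SERVERS_IN_SET) -> str:
    """All requests for one ID map to one Server as long as the list is unchanged."""
    return servers[id_hash(red_packet_id) % len(servers)]
```

Note that a plain modulo mapping reshuffles IDs whenever the Server list changes; how the real system handles a failed Server is not covered by this sketch.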

Second, design a single-machine request queuing scheme.
Once the requests routed to a Server have been accepted by its receiving process, they are queued by red envelope ID and then enter the worker process (which executes the business logic) serially, achieving the queuing effect, as shown in the following figure.
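The single-machine queuing idea can be sketched with threads and in-process queues. Everything named here (QueuedServer, receive, drain, the in-memory remaining dict standing in for the DB) is hypothetical; the article describes separate receiving and worker processes, which this toy model collapses into one process for brevity.

```python
import queue
import threading


class QueuedServer:
    """Requests for one red envelope ID always go to the same FIFO queue."""

    def __init__(self, num_workers=2):
        self.queues = [queue.Queue() for _ in range(num_workers)]
        self.remaining = {}  # red_packet_id -> shares left (stand-in for the DB)
        self.results = []    # (red_packet_id, user_id, success)
        self._results_lock = threading.Lock()
        for q in self.queues:
            threading.Thread(target=self._work, args=(q,), daemon=True).start()

    def receive(self, red_packet_id, user_id):
        """Receiving side: enqueue by ID so one worker owns each ID."""
        self.queues[red_packet_id % len(self.queues)].put((red_packet_id, user_id))

    def _work(self, q):
        """Worker side: handle requests strictly one at a time, in arrival order."""
        while True:
            red_packet_id, user_id = q.get()
            left = self.remaining.get(red_packet_id, 0)
            ok = left > 0
            if ok:
                # No lock is needed here: only this worker ever touches this ID.
                self.remaining[red_packet_id] = left - 1
            with self._results_lock:
                self.results.append((red_packet_id, user_id, ok))
            q.task_done()

    def drain(self):
        """Block until every queued request has been processed."""
        for q in self.queues:
            q.join()
```

With a red envelope of 3 shares and 5 users arriving in order, the first 3 succeed and the last 2 fail, and the "DB" never sees two open operations for the same ID at once.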

Finally, add memcached to control concurrency.
To prevent an overloaded request queue in the Server from degrading the queuing, which would let all requests crowd into the DB, the system adds memcached, deployed on the same machine as the Server, to control the number of concurrent requests opening the same red envelope.
Specifically, memcached's CAS atomic increment operation is used to control the number of requests that enter the DB at the same time to execute the open-red-envelope transaction. Once the count exceeds a preset value, the request is rejected outright. This provides a degraded experience when DB load rises.
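The counting logic can be sketched as below. CounterStore is an in-process stand-in written for this example, not a memcached client; it only mimics the atomic increment and decrement that memcached offers. The key format and the limit of 2 are likewise assumptions.

```python
import threading


class CounterStore:
    """In-process stand-in for memcached's atomic incr/decr commands.

    Unlike real memcached, a missing key is treated as 0 here.
    """

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incr(self, key, delta=1):
        with self._lock:
            self._values[key] = self._values.get(key, 0) + delta
            return self._values[key]

    def decr(self, key, delta=1):
        with self._lock:
            self._values[key] = max(0, self._values.get(key, 0) - delta)
            return self._values[key]


MAX_IN_FLIGHT = 2  # hypothetical preset limit per red envelope


def try_enter(store, red_packet_id):
    """Count this request in; refuse service if the limit is exceeded."""
    key = f"open:{red_packet_id}"
    if store.incr(key) > MAX_IN_FLIGHT:
        store.decr(key)  # undo our own increment before rejecting
        return False
    return True


def leave(store, red_packet_id):
    """Call after the DB transaction finishes, whether it succeeded or failed."""
    store.decr(f"open:{red_packet_id}")
```

A caller would wrap the DB transaction between try_enter and leave, returning a "busy, try again" style response when try_enter returns False.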
Through the above three measures, the system effectively controls the "concurrent lock grabbing" situation of the DB.
3. Two-dimensional splitting of databases and tables to ensure stable system performance
Initially, the red envelope system splits its data into multiple databases and multiple tables by the hash value of the red envelope ID. As the volume of red envelope data grows, the amount of data in each table grows with it. DB performance depends on the amount of data in a single table: once a table reaches a certain size, DB performance drops sharply, which affects the stability of the system. This problem can be solved by separating hot and cold data, storing historical cold data apart from current hot data.
To handle hot/cold separation of WeChat red envelope data, the system adds a second dimension, tables that cycle by day, on top of the split by red envelope ID, forming a two-dimensional scheme of databases and tables.
Specifically, the naming rule looks like db_xx.t_y_dd, where xx/y are the last three digits of the hash value of the red envelope ID, and dd ranges from 01 to 31, covering the maximum of 31 days in a month.
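The naming rule can be sketched as a small function. The function name is hypothetical, and the way the three digits are divided (the first two becoming xx, the last becoming y) is an assumption; the article only states that xx/y come from the last three digits of the hash.

```python
from datetime import date


def shard_table(id_hash: int, day: date) -> str:
    """Build a name of the form db_xx.t_y_dd from an ID hash and a date."""
    last3 = f"{id_hash % 1000:03d}"  # last three digits, zero-padded
    xx, y = last3[:2], last3[2]      # assumed split: two digits, then one
    return f"db_{xx}.t_{y}_{day.day:02d}"
```

For example, a hash ending in 789 on the 28th of a month maps to db_78.t_9_28. Since dd cycles through 01 to 31, a table is reused a month later, so the cold data in it has to be moved out before then.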
Through this two-dimensional database table method, the problem of performance degradation caused by the expansion of the data volume of the DB single table is solved, and the stability of the system performance is guaranteed. At the same time, on the issue of separation of hot and cold, it makes data relocation simple and elegant.
To sum up, the WeChat red envelope system solves the high concurrency problem mainly through SET-based divide and conquer, request queuing, and two-dimensional splitting of databases and tables. Together these improve the concurrency performance of a single group of DBs by about 8 times, with good results.
In addition, to reduce the burden on the DB and keep useless data out of it, only valid red envelopes are stored in the DB when red envelopes are sent. For example, after a red envelope is paid for (and before users grab it), it is temporarily held in the cache.
Final summary
The WeChat red envelope system is a high-concurrency capital transaction system. Its biggest technical challenge is ensuring both concurrent performance and capital security, a challenge that the traditional "seckill" system design cannot fully solve. Building on an analysis of the industry's "seckill" system solutions, WeChat Red Envelope adopted SET partitioning, serialization of requests through queuing, and two-dimensional splitting of databases and tables, forming a distinctive solution for high concurrency and capital security. Its feasibility was fully proven in practice during ordinary holidays and the Spring Festivals of 2015 and 2016, with remarkable results. On the New Year's Eve of the Year of the Rooster in 2017, WeChat red envelopes sent and received peaked at 760,000 per second, and 14.2 billion were sent and received in total. The WeChat red envelope system performed stably and ran with zero failures on New Year's Eve.

Origin http://10.200.1.11:23101/article/api/json?id=326707629&siteId=291194637