Solutions to Technical Problems of Hot Data Writing (Inventory Deduction) in Big Promotion Scenarios

It has been a long time since I found the quiet time to write a technical article. For the past few years, and through 2017, most of my time has gone to work. Today I finally gave myself a shot of energy to write up a case from a while back: optimizing writes to hot data in our trading system during big promotions. Different companies have different solutions and implementations, but the underlying principles stay the same. The architecture of a large website should be simple and clear rather than dazzlingly complicated. Solving a problem in the most direct way is the most effective; otherwise things only get worse.

 

In most cases, product inventory is deducted directly in the relational database. Once a limited-time flash sale officially starts, users rush to buy the hot-selling products whose prices are far more attractive than usual, which inevitably generates a large number of concurrent update operations on the same row in the database. To keep each update atomic, the InnoDB engine takes a row lock on the record being updated, which turns the concurrent front-end requests into serial operations and keeps the data consistent.

 

1. Deduction of commodity inventory in RDBMS

Let's first look at how to avoid overselling when inventory is deducted directly in the database. In production we can avoid the problem with optimistic locking. In short, optimistic locking means adding a version field to the item table. Suppose the actual inventory of a hot-selling product is n. For performance reasons, adding FOR UPDATE to the inventory query is not recommended, so in a concurrent scenario multiple users will inevitably read the same stock and version. When the first user successfully deducts inventory, the version in the item table is incremented by 1. When the second user then tries to deduct, the update fails because the version no longer matches. To raise the success rate, the deduction can be retried a limited number of times: if the re-read stock is insufficient, the product is sold out; otherwise the deduction is attempted again, and the version is again incremented by 1 on success. Deducting inventory with optimistic locking in the database looks roughly like this:

 

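import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch: assumes an auto-commit JDBC connection and a class-level logger field.
public boolean deductStock(Connection conn, long itemId, int maxRetries) throws SQLException {
    String select = "SELECT stock, version FROM item WHERE item_id = ?";
    String update = "UPDATE item SET version = version + 1, stock = stock - 1 "
            + "WHERE item_id = ? AND version = ?";
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
        int stock, version;
        try (PreparedStatement query = conn.prepareStatement(select)) {
            query.setLong(1, itemId);
            try (ResultSet rs = query.executeQuery()) {
                if (!rs.next()) return false; // the specified item does not exist
                stock = rs.getInt("stock");
                version = rs.getInt("version");
            }
        }
        if (stock < 1) {
            logger.warn("The specified item is sold out");
            return false;
        }
        try (PreparedStatement stmt = conn.prepareStatement(update)) {
            stmt.setLong(1, itemId);
            stmt.setInt(2, version);
            if (stmt.executeUpdate() == 1) {
                logger.info("Successfully deducted inventory");
                return true;
            }
        }
        // No row was updated: the version changed under us, so retry.
    }
    return false; // retry budget exhausted while versions kept changing
}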

 

If the front end does nothing to limit the rate or smooth the peak, and a large number of concurrent update requests are allowed to deduct the inventory of the same hot-selling item directly in the database, threads end up competing for the same InnoDB row lock. Updates to a single row execute serially, so until one thread releases the lock, all the others block in a queue waiting to acquire it. The higher the concurrency, the more threads are waiting. This seriously reduces the TPS of the database and makes RT rise in step with the length of the queue, which may eventually cause the whole system to avalanche.

 

2. Deducting inventory in Redis
InnoDB's row lock is a double-edged sword whose advantages and disadvantages are equally obvious: it guarantees consistency at the cost of availability. How to keep highly concurrent updates to hot data from turning the database into a bottleneck is one of the core technical problems of flash-sale (seckill) scenarios. One option is to move the inventory deduction for hot-selling items out of the database. Redis generally offers much higher read/write throughput than a relational database, so implementing inventory deduction in Redis is a good alternative. With this approach, the inventory stored in the database can be regarded as the actual inventory, while the inventory stored in Redis is the real-time inventory.

 

When hot-selling inventory is deducted in Redis, some readers may ask how Redis ensures consistency: how do we avoid both overselling and underselling? The answer is to implement optimistic locking with the WATCH command that Redis provides. Much like the MySQL-based optimistic lock, once the target key has been marked with WATCH, the transaction fails at commit time (EXEC) if the value of that key has changed in the meantime, which is the equivalent of the version number having changed, as shown in Figure 1:

Figure 1 Using Redis optimistic locking to reduce commodity inventory 
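
As a rough illustration of the WATCH-based flow, here is a minimal Java sketch. It is an in-memory stand-in, not a Redis client: the class WatchedStock and its methods are hypothetical names that only imitate the "check at commit time" behaviour of WATCH / MULTI / EXEC, and the retry limit is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for a single Redis key (hypothetical, for illustration only). */
final class WatchedStock {
    private long stock;
    private long modCount; // bumped on every write; plays the role of the change WATCH detects

    WatchedStock(long initial) { this.stock = initial; }

    /** Imitates WATCH + GET: returns {stock, modCount} as seen right now. */
    synchronized long[] watchAndGet() { return new long[] {stock, modCount}; }

    /** Imitates MULTI / DECRBY / EXEC: applies only if nothing changed since the watch. */
    synchronized boolean execDecrBy(long watchedModCount, long qty) {
        if (modCount != watchedModCount) return false; // transaction aborted
        stock -= qty;
        modCount++;
        return true;
    }

    synchronized long get() { return stock; }
}

final class OptimisticDeduction {
    enum Result { OK, SOLD_OUT, CONFLICT }

    /** Tries to deduct qty, retrying up to maxRetries times after a collision. */
    static Result deduct(WatchedStock key, long qty, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            long[] snapshot = key.watchAndGet();
            if (snapshot[0] < qty) return Result.SOLD_OUT;
            if (key.execDecrBy(snapshot[1], qty)) return Result.OK;
        }
        return Result.CONFLICT; // every attempt collided with another buyer
    }

    public static void main(String[] args) throws InterruptedException {
        WatchedStock key = new WatchedStock(100);
        AtomicLong sold = new AtomicLong();
        Thread[] buyers = new Thread[8];
        for (int i = 0; i < buyers.length; i++) {
            buyers[i] = new Thread(() -> {
                for (int j = 0; j < 50; j++) {
                    if (deduct(key, 1, 3) == Result.OK) sold.incrementAndGet();
                }
            });
            buyers[i].start();
        }
        for (Thread t : buyers) t.join();
        // sold + remaining always equals the initial 100, so nothing is oversold.
        System.out.println("sold=" + sold.get() + " remaining=" + key.get());
    }
}
```

The CONFLICT result is the case that matters for the rest of this article: the stock is still there, but the buyer lost the race on a single key.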

 

Deducting the inventory of hot-selling items in Redis serves two main purposes:

1. It avoids having multiple threads in the RDBMS compete for an InnoDB row lock, which raises RT, lowers TPS, and eventually leads to an avalanche;

2. It uses the inherently efficient read/write capability of Redis to improve the overall throughput of the system.

 

3. Use the "split" technique to subtly improve the success rate of inventory deduction

Here I will share a business scenario from my own company. Because of the nature of our business, our on-the-hour limited-time sales are often hit products with large inventory (from tens of thousands to hundreds of thousands of units). As we all know, the peak of a limited-time sale is effectively a seckill, and here it comes with a large inventory as well. In an ordinary seckill the inventory is small, so if the upstream systems cooperate with the trading system on measures such as capacity expansion, rate limiting, isolation (business isolation, data isolation, and system isolation), static/dynamic separation, and local caching, the vast majority of traffic can be blocked upstream. User traffic then narrows layer by layer like a funnel, so it always stays within the capacity the system can handle.

 

Because of these extreme business characteristics, the business systems not only have to withstand hundreds of millions of requests; the trading system also has to find a way to raise the success rate of inventory deduction at order time. This is a real challenge for us, because in production a single mistake can have disastrous consequences. The point of architecture is to restructure the system in an orderly way, continuously reducing its "entropy" so that it keeps improving, but a mistaken architectural change can be irreversible, especially for a mature website with a large user base.

 

We all know how hard it is to actually grab the product you want once a seckill starts: within the same unit of time, other users are placing orders too, so for the same hit product the probability of a WATCH collision is greatly amplified and the success rate naturally drops. With a small inventory you can simply report that the product is sold out. With an inventory of more than 100,000 units, however, users can see the stock but cannot buy it, which is unfriendly, and the operations strategy also calls for selling this stock off quickly as a promotional draw.

 

Do not expect any single store to improve both throughput and success rate at once. This is a genuine single-point problem: to guarantee consistency, some success rate has to be sacrificed. How can this conflict be resolved? Our current practice is to split the key of a given SKU into N subKeys in Redis. When the inventory service deducts inventory, it routes each request to a different subKey using a round-robin routing strategy. This lowers the probability of a WATCH collision and greatly improves the success rate of order placement, as shown in Figure 2:

Figure 2 Split parentKey into n subKeys
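
To make the split in Figure 2 concrete, here is a minimal Java sketch of dividing a parentKey's stock into N near-equal shares and routing requests round-robin. The class names, the "parentKey:index" naming scheme, and the even-split rule are all assumptions for illustration; the article does not specify how the shares are sized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SubKeySplit {
    /** Splits total into n shares; the first (total % n) shares get one extra unit. */
    static long[] shares(long total, int n) {
        long[] result = new long[n];
        for (int i = 0; i < n; i++) {
            result[i] = total / n + (i < total % n ? 1 : 0);
        }
        return result;
    }

    /** Hypothetical naming scheme: parentKey + ":" + index. */
    static List<String> names(String parentKey, int n) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) keys.add(parentKey + ":" + i);
        return keys;
    }
}

final class RoundRobinRouter {
    private final List<String> subKeys;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinRouter(List<String> subKeys) { this.subKeys = new ArrayList<>(subKeys); }

    /** Returns the next subKey in rotation; floorMod keeps the index valid after int overflow. */
    String next() {
        return subKeys.get(Math.floorMod(counter.getAndIncrement(), subKeys.size()));
    }

    public static void main(String[] args) {
        RoundRobinRouter router = new RoundRobinRouter(SubKeySplit.names("stock:sku1001", 3));
        for (int i = 0; i < 4; i++) System.out.println(router.next());
        System.out.println(Arrays.toString(SubKeySplit.shares(100_000, 3)));
    }
}
```

With N subKeys, concurrent buyers are spread over N independent keys, so any one key sees roughly 1/N of the write traffic.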

 

With the idea of splitting clear, I will go through the details and some precautions. The split mechanism has two main parts: a routing component embedded in the inventory service, and a split management service. The job of the routing component is simple: it subscribes to the split rules in the configuration center and then routes deductions round-robin across the subKeys. The split management service is more complex. It performs the split of the parentKey, and it also handles the related inventory aggregation (summing the stock of the subKeys) and pull-down (redistributing stock to the subKeys) tasks.

 

Once a key is split, the split information has to be managed. For example, the operations back office may deduct a large amount of inventory from a parentKey, change the number of splits of a parentKey, or delete the split rule of a parentKey. Each of these operations involves the following two actions:
1. Aggregate the inventory of the subKeys and set the inventory of each subKey to 0;
2. Return the aggregated inventory to the target parentKey.
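
The two actions can be sketched in Java as follows. This is only an in-memory stand-in: the SplitStore class and its map play the role of Redis keys, and the atomic read-and-zero (getAndSet) is my assumption about how to avoid losing a concurrent deduction, not something the original service is known to do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for the Redis keys touched by the split management service. */
final class SplitStore {
    private final Map<String, AtomicLong> store = new ConcurrentHashMap<>();

    void set(String key, long value) {
        store.computeIfAbsent(key, k -> new AtomicLong()).set(value);
    }

    long get(String key) {
        AtomicLong value = store.get(key);
        return value == null ? 0 : value.get();
    }

    /** Action 1: read each subKey and set it to 0 in one atomic step; returns the sum. */
    long aggregate(List<String> subKeys) {
        long total = 0;
        for (String key : subKeys) {
            AtomicLong value = store.get(key);
            if (value != null) total += value.getAndSet(0);
        }
        return total;
    }

    /** Action 2: credit the aggregated amount back to the parentKey. */
    void returnToParent(String parentKey, long amount) {
        store.computeIfAbsent(parentKey, k -> new AtomicLong()).addAndGet(amount);
    }

    public static void main(String[] args) {
        SplitStore redis = new SplitStore();
        List<String> subKeys = Arrays.asList("sku:0", "sku:1", "sku:2");
        redis.set("sku:0", 30);
        redis.set("sku:1", 0);
        redis.set("sku:2", 12);
        long pooled = redis.aggregate(subKeys);   // subKeys are now all 0
        redis.returnToParent("sku", pooled);      // parentKey holds the pooled stock
        System.out.println("parent=" + redis.get("sku"));
    }
}
```

Between the two calls the pooled stock exists only in the caller's memory, which is exactly the window in which a failure loses sellable inventory.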

 

Because aggregation and return do not happen in the same transaction, a failure partway through is costly. For example, suppose aggregation succeeds and the subKey inventories are set to 0, so users can no longer place orders, but the step that returns the inventory to the parentKey fails. Stock then goes unsold, so we rely on the following two measures to sell as much of it as possible:
1. Increase the number of Redis retries in the business code;
2. If Redis still fails, raise an alarm and return the inventory by manual intervention.
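
These two safeguards amount to "retry, then alarm". A minimal Java sketch, assuming the return step is passed in as a callback that throws on failure; the names returnStock, returnOp, and alarm are hypothetical, and a real service would probably add a delay between attempts:

```java
import java.util.function.LongConsumer;

final class ReturnWithRetry {
    /**
     * Calls returnOp up to maxAttempts times. If every attempt throws,
     * hands the stranded amount to alarm (for manual follow-up) and returns false.
     */
    static boolean returnStock(long amount, int maxAttempts,
                               LongConsumer returnOp, LongConsumer alarm) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                returnOp.accept(amount);
                return true;
            } catch (RuntimeException e) {
                // Treated as transient here: fall through and try again.
            }
        }
        alarm.accept(amount);
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        boolean returned = returnStock(500, 3,
                amount -> {
                    // Simulated flaky store: the first two calls fail.
                    if (++calls[0] < 3) throw new IllegalStateException("store unavailable");
                },
                amount -> System.out.println("ALARM: return " + amount + " units manually"));
        System.out.println("returned=" + returned + " after " + calls[0] + " calls");
    }
}
```

The alarm callback receives the amount so that whoever intervenes knows exactly how much stock to put back on the parentKey.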

 

Why distinguish between inventory deduction by ordinary users and inventory deduction by the operations back office? They are two quite different things. A user's deduction is usually bounded by business rules (for example, a limit on how many units a user can buy at a time). The back office is different: sometimes the inventory has been set too high by human error, so a large amount has to be deducted at once. If the amount to deduct is greater than the effective inventory held by any single subKey, the deduction cannot be completed against one subKey, so we provide a separate deduction method for the back office. It first aggregates the inventory of the subKeys and sets the inventory of each subKey to 0, then returns the inventory that remains after the deduction to the parentKey, and finally waits for the inventory to be redistributed to the subKeys. Note that the hotter the product and the higher the user concurrency, the longer the time window for aggregation and redistribution becomes.
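
A minimal Java sketch of this back-office path, under stated assumptions: the AtomicLong values stand in for Redis keys, the class and method names are hypothetical, and the even redistribution rule is my own choice. In the real system redistribution is a separate pull-down task rather than an immediate call.

```java
import java.util.concurrent.atomic.AtomicLong;

final class BackOfficeDeduction {
    /**
     * Pools all subKey stock (zeroing each subKey), removes amount from the pool
     * if there is enough, and returns whatever remains to the parentKey.
     * Returns false, with all pooled stock returned, when the pool is too small.
     */
    static boolean deduct(AtomicLong parent, AtomicLong[] subs, long amount) {
        long pooled = 0;
        for (AtomicLong sub : subs) pooled += sub.getAndSet(0);
        boolean enough = pooled >= amount;
        parent.addAndGet(enough ? pooled - amount : pooled);
        return enough;
    }

    /** Pull-down: moves the parentKey stock back onto the subKeys in near-equal shares. */
    static void redistribute(AtomicLong parent, AtomicLong[] subs) {
        long total = parent.getAndSet(0);
        int n = subs.length;
        for (int i = 0; i < n; i++) {
            subs[i].addAndGet(total / n + (i < total % n ? 1 : 0));
        }
    }

    public static void main(String[] args) {
        AtomicLong parent = new AtomicLong();
        AtomicLong[] subs = { new AtomicLong(300), new AtomicLong(300), new AtomicLong(400) };
        // 800 is more than any single subKey holds, so it cannot be taken from one subKey.
        boolean ok = deduct(parent, subs, 800);
        redistribute(parent, subs);
        System.out.println("deducted=" + ok + " subKey0=" + subs[0].get());
    }
}
```

While the stock sits on the parentKey, ordinary user deductions routed to the subKeys find nothing to buy, which is why the length of this window matters.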

 

Occasionally the inventory of the subKeys becomes uneven. When the inventory held by one subKey has been used up and no returned inventory has been pulled down to refill it, every deduction routed to that subKey fails, and users get the unfriendly experience of seeing stock they cannot buy. The routing component can handle this: when the inventory of a subKey is exhausted, the router removes that subKey from its local list so that later requests are no longer routed to it.
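
One way to sketch this local removal in Java is a round-robin router over a copy-on-write list. The class name PruningRouter and the choice to return null when nothing is left are assumptions for illustration; how the router learns that a subKey has been refilled is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

final class PruningRouter {
    private final CopyOnWriteArrayList<String> active;
    private final AtomicInteger counter = new AtomicInteger();

    PruningRouter(List<String> subKeys) {
        this.active = new CopyOnWriteArrayList<>(subKeys);
    }

    /** Returns the next subKey that still has stock, or null when none are left. */
    String next() {
        // Work on a snapshot so a concurrent removal cannot invalidate the index.
        String[] snapshot = active.toArray(new String[0]);
        if (snapshot.length == 0) return null;
        return snapshot[Math.floorMod(counter.getAndIncrement(), snapshot.length)];
    }

    /** Called when a deduction finds this subKey's inventory at 0. */
    void markExhausted(String subKey) {
        active.remove(subKey);
    }

    public static void main(String[] args) {
        PruningRouter router = new PruningRouter(Arrays.asList("sku:0", "sku:1", "sku:2"));
        System.out.println(router.next());
        router.markExhausted("sku:1");   // sku:1 sold out: stop routing to it
        for (int i = 0; i < 4; i++) System.out.println(router.next());
    }
}
```

Because the removal is local to each inventory service instance, an exhausted subKey stops receiving traffic without any extra round trip to Redis.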

 

Finally, some advice. The more subKeys a parentKey is split into, the higher the success rate of inventory deduction, but more is not always better. Generally, splitting a parentKey into 10-20 subKeys is enough; in our case, the success rate of order-time deduction improved 10-20 times compared with before the split.


Origin http://43.154.161.224:23101/article/api/json?id=326350362&siteId=291194637