Distributed lock high-concurrency optimization practice (the segmented locking idea)!

Original: Distributed lock high-concurrency optimization practice for a scenario of thousands of orders per second! [Shishan's Architecture Notes]

Background introduction

A question from an e-commerce company: if you use a distributed lock to prevent inventory oversold when placing orders, but you are in a high-concurrency scenario with thousands of orders per second, how do you optimize the distributed lock to handle it?

In actual production, a distributed lock guarantees data correctness, but its natural concurrency capability is rather weak.

The occurrence of oversold inventory

Suppose the order system is deployed on two machines, and two different users each want to buy 10 iPhones at the same time, each sending a request to one of the order system instances.

Each order system instance queries the inventory, sees 12 units, considers that sufficient, and sends SQL to the database to place an order and deduct 10 units. Because both deductions go through, one takes the inventory from 12 to 2 and the other takes it from 2 to -8.
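
To make the race concrete, here is a minimal sketch (an illustrative simulation, not the actual order system) in which two threads play the role of the two order system instances and deduct from a shared, unprotected stock counter, reproducing the 12 -> 2 -> -8 sequence:

```java
// Minimal simulation of the oversold race described above: two "order system
// instances" both see 12 units in stock, both think 10 can be deducted, and
// both issue the deduction, driving the stock from 12 to 2 and then to -8.
public class OversoldDemo {

    private static int stock = 12;
    private static final Object DB = new Object(); // stands in for the database serializing the writes

    public static void main(String[] args) throws InterruptedException {
        Runnable orderInstance = () -> {
            int seen = stock;                 // both threads read 12 here (no lock around check + deduct)
            if (seen >= 10) {
                sleepBriefly();               // widen the race window between check and deduct
                synchronized (DB) {           // the writes themselves are serialized, like UPDATE statements
                    stock = stock - 10;       // 12 -> 2, then 2 -> -8
                }
            }
        };

        Thread a = new Thread(orderInstance);
        Thread b = new Thread(orderInstance);
        a.start();
        b.start();
        a.join();
        b.join();

        System.out.println("final stock = " + stock); // typically prints -8
    }

    private static void sleepBriefly() {
        try { Thread.sleep(50); } catch (InterruptedException ignored) {}
    }
}
```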

Use distributed locks to solve the problem of inventory oversold

The implementation principle of a distributed lock: for the same lock key, only one client can hold the lock at any given time; all other clients block and keep retrying until they can acquire it. Only the client that holds the lock is allowed to execute the business logic that follows.
Implementation logic of distributed locks
This avoids oversold.

Why does this avoid oversold?
Only one order system instance can successfully acquire the distributed lock at a time. Only that instance then checks the inventory while placing the order, determines whether it is sufficient, places the order and deducts the inventory, and finally releases the lock.

After the lock is released, the other order system instance can acquire it and check the inventory. It finds that only 2 units are left, which is not enough for a purchase of 10, so the order fails. The inventory is never deducted down to -8.
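
As an illustration of this lock -> check -> deduct -> release flow, here is a minimal sketch using the Redisson Redis client. The Redis address, the lock key naming, and the checkStock/deductStock helpers are assumptions made for the example, not code from the original article:

```java
import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class StockService {

    private final RedissonClient redisson;

    public StockService() {
        // Assumes a local single-node Redis; adjust the address for your environment.
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379");
        this.redisson = Redisson.create(config);
    }

    public boolean placeOrder(String productId, int quantity) {
        RLock lock = redisson.getLock(productId + "_stock"); // e.g. "iphone_stock"
        lock.lock();
        try {
            int stock = checkStock(productId);       // hypothetical DB query
            if (stock < quantity) {
                return false;                        // insufficient stock, the order fails
            }
            deductStock(productId, quantity);        // hypothetical DB update
            return true;
        } finally {
            lock.unlock();                           // always release so the next instance can proceed
        }
    }

    // Hypothetical persistence helpers, stubbed out for the sketch.
    private int checkStock(String productId) { return 12; }
    private void deductStock(String productId, int quantity) { /* UPDATE stock SET ... */ }
}
```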

Problems of distributed locks in high concurrency scenarios

Once the distributed lock is in place, every order request for the same product has to contend for that product's single inventory lock key.

For example, when placing orders for a product called iphone, every request must acquire the lock key "iphone_stock". As a result, orders for the same product are forced to be serialized and processed one at a time.

Assume that the work done between acquiring and releasing the lock is: check inventory -> create order -> deduct inventory. This sequence is actually quite fast; if the whole flow takes about 20 milliseconds, that is already pretty good.

Since 1 second is 1000 milliseconds, only 1000 / 20 = 50 requests for this product can be processed per second, one after another.
Serial processing

Defect: when many users place orders for the same product at the same time, the distributed lock serializes them, so a large number of orders for the same product cannot be processed concurrently.

This kind of scheme may be acceptable for an ordinary small e-commerce system with low concurrency and no flash-sale (spike) scenarios.

High concurrency optimization for distributed locks

Actually, the idea is simple to state. If you have read the source code and underlying principles of ConcurrentHashMap in Java, you should recognize its core idea: segmented locking!

Divide the data into many segments, each protected by its own separate lock, so that multiple threads modifying different segments can do so concurrently. There is no longer a single lock that forces threads to modify the data in ConcurrentHashMap one at a time.
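
To make the idea concrete, here is a small illustrative sketch of a segmented map in the spirit of the old JDK 7 ConcurrentHashMap (a simplified toy, not the JDK implementation): keys hash to one of 16 segments, and each segment has its own lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative segmented map: keys are hashed onto one of N segments,
// each segment has its own lock, so writes to different segments
// can proceed in parallel instead of contending on one global lock.
public class SegmentedMap<K, V> {
    private static final int SEGMENTS = 16;

    private final ReentrantLock[] locks = new ReentrantLock[SEGMENTS];
    private final Map<K, V>[] tables;

    @SuppressWarnings("unchecked")
    public SegmentedMap() {
        tables = (Map<K, V>[]) new Map[SEGMENTS];
        for (int i = 0; i < SEGMENTS; i++) {
            locks[i] = new ReentrantLock();
            tables[i] = new HashMap<>();
        }
    }

    private int segmentFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % SEGMENTS;
    }

    public void put(K key, V value) {
        int i = segmentFor(key);
        locks[i].lock();           // only this segment is locked, not the whole map
        try {
            tables[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    public V get(K key) {
        int i = segmentFor(key);
        locks[i].lock();
        try {
            return tables[i].get(key);
        } finally {
            locks[i].unlock();
        }
    }
}
```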

In addition, Java 8 added a LongAdder class, which is an optimization over the older AtomicLong. AtomicLong relies on CAS operations in the style of optimistic locking, which in high-concurrency scenarios causes a large number of threads to spin and retry for a long time.

LongAdder uses a similar segmented CAS approach internally: when a CAS on one cell fails, it automatically moves on to another cell (segment) and retries the CAS there.
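
For reference, here is a minimal sketch comparing AtomicLong and LongAdder under concurrent increments (both classes live in java.util.concurrent.atomic); the thread and iteration counts are arbitrary:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class CounterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicLong atomic = new AtomicLong();   // single value: all threads CAS on the same slot
        LongAdder adder = new LongAdder();      // spreads contention across internal cells

        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                atomic.incrementAndGet();
                adder.increment();
            }
        };

        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(work);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }

        System.out.println("AtomicLong: " + atomic.get());  // 800000
        System.out.println("LongAdder : " + adder.sum());   // 800000
    }
}
```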

The optimization idea for distributed locks is much the same. We have used this kind of solution in production before, in a different business scenario rather than the inventory oversold problem.

But the inventory oversold scenario is simple and easy to understand, so let's use it to illustrate.
Segmented locking

The idea of segmented locking: if you have 1000 iPhones in stock, you can split them into 20 inventory segments. If you like, you can create 20 inventory fields in the database table, each segment holding 50 units; for example, stock_01 corresponds to 50 units, stock_02 corresponds to 50 units, and so on. Likewise, you could also put 20 inventory keys in something like Redis.
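
For example, the 20 Redis inventory keys could be initialized roughly like this, splitting 1000 units into 20 segments of 50. This sketch uses the Jedis client, and the key names iphone_stock_01 ... iphone_stock_20 are just an assumed naming scheme:

```java
import redis.clients.jedis.Jedis;

public class StockSegmentInit {
    public static void main(String[] args) {
        // Assumes a local Redis instance; 1000 units split into 20 segments of 50 each.
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            for (int i = 1; i <= 20; i++) {
                String key = String.format("iphone_stock_%02d", i); // iphone_stock_01 .. iphone_stock_20
                jedis.set(key, "50");
            }
        }
    }
}
```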

Then, for 1000 requests per second, use a simple random algorithm: each request randomly picks one of the 20 inventory segments and locks it.

Each order request locks one inventory segment, and then in the business logic it operates only on that segment's inventory in the database or Redis: check inventory -> judge whether it is sufficient -> deduct inventory.

In effect, within every 20-millisecond window, 20 order requests can be processed concurrently instead of one, so in 1 second each segment can handle about 50 requests and 20 * 50 = 1000 iPhone order requests can be processed.

There is one pitfall you must watch out for: what if an order request acquires the lock on a segment and then finds that the inventory in that segment is insufficient? In that case you have to release the current lock immediately, move on to the next inventory segment, acquire its lock, and try again. You have to implement this process yourself.
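
Putting the pieces together, here is a rough sketch of that flow: pick a random starting segment, lock only that segment's key, check and deduct its stock, and if the segment is short, release the lock and move on to the next one. The Redisson lock, the Jedis calls, and the key naming carry over the assumptions from the earlier sketches:

```java
import java.util.concurrent.ThreadLocalRandom;

import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import redis.clients.jedis.Jedis;

public class SegmentedStockService {
    private static final int SEGMENTS = 20;

    private final RedissonClient redisson;
    private final Jedis jedis;

    public SegmentedStockService(RedissonClient redisson, Jedis jedis) {
        this.redisson = redisson;
        this.jedis = jedis;
    }

    /** Tries every segment starting from a random one; returns true if the deduction succeeded. */
    public boolean deduct(String productId, int quantity) {
        int start = ThreadLocalRandom.current().nextInt(SEGMENTS);
        for (int n = 0; n < SEGMENTS; n++) {
            int segment = (start + n) % SEGMENTS + 1;
            String stockKey = String.format("%s_stock_%02d", productId, segment);

            RLock lock = redisson.getLock(stockKey + "_lock"); // lock only this segment's key
            lock.lock();
            try {
                // Assumes the segment keys were initialized as in the earlier sketch.
                int stock = Integer.parseInt(jedis.get(stockKey));
                if (stock >= quantity) {
                    jedis.decrBy(stockKey, quantity);          // deduct within this segment
                    return true;
                }
                // Not enough stock in this segment: fall through, release the lock,
                // and try the next segment.
            } finally {
                lock.unlock();
            }
        }
        return false; // every segment was short on stock
    }
}
```

In a real system the deduction and the order creation would need to be transactional and the lock would carry a lease or timeout; the sketch leaves all of that out.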

Shortcomings of the distributed lock concurrency optimization scheme

The biggest shortcoming is that it is inconvenient and rather complicated to implement:

  • First, you have to store the data in segments: a single inventory field was fine before, but now it has to be split into 20 segmented inventory fields;
  • Second, every time you process the inventory, you have to write your own random algorithm to pick a segment to work on;
  • Finally, if a certain segment does not have enough stock, you have to automatically switch to the next segment and try again.

All of this has to be implemented with hand-written code, which takes some work and is quite troublesome.

Still, in some business scenarios we do use distributed locks, and when we have to optimize their concurrency, we go further and apply this segmented locking technique. The effect is very good: concurrent performance can instantly grow by dozens of times.

Subsequent improvements to the optimization scheme

Take the inventory oversold scenario in this article as an example: if you actually apply the scheme this way, you will make things very painful for yourself!

Again, the inventory oversold case here is only a demonstration scenario. When I get the chance, I will talk about other solutions to inventory oversold under a high-concurrency flash-sale (spike) system architecture.

Origin blog.csdn.net/eluanshi12/article/details/84616173