Implementation Schemes of Distributed Locks

At present, most large-scale websites and applications are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that no distributed system can satisfy Consistency, Availability, and Partition tolerance at the same time; at most two of the three can be satisfied simultaneously. Therefore, many systems must make a trade-off among the three at design time. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high system availability: the system often only needs to guarantee "eventual consistency", as long as the time to reach consistency is within a range acceptable to users.

In many scenarios, ensuring eventual consistency of data requires a range of technical solutions, such as distributed transactions and distributed locks. Sometimes we need to guarantee that a method can only be executed by one thread at a time. In a single-machine environment, Java provides many concurrency APIs for this, but those APIs are powerless in distributed scenarios; the pure Java API cannot provide distributed locking. As a result, a variety of solutions have emerged for implementing distributed locks.

For the implementation of distributed locks, the following schemes are commonly used:

  • Distributed lock based on database
  • Distributed lock based on cache (redis, memcached, tair)
  • Distributed lock based on Zookeeper

Before analyzing these implementation schemes, let's first think about what the distributed lock we need should look like. (A method lock is used as the example here; the same applies to resource locks.)

  • It can guarantee that, in a cluster of distributed application deployments, the same method can be executed by only one thread on one machine at any given time.
  • The lock should be reentrant (to avoid deadlock).
  • The lock should preferably be a blocking lock (decide based on business requirements).
  • Acquiring and releasing the lock should be highly available.
  • Acquiring and releasing the lock should perform well.

Distributed lock based on database

Based on database table

The easiest way to implement a distributed lock is probably to create a lock table directly, and then acquire and release locks by manipulating the records in that table.

When we want to lock a method or resource, we add a record to the table, and delete this record when we want to release the lock.

Create such a database table

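The original image is unavailable; a plausible reconstruction of the lock table follows (table, column, and index names are assumptions, not the author's exact schema):

```sql
-- Illustrative lock table; names and types are assumptions.
CREATE TABLE `method_lock` (
  `id` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` VARCHAR(64) NOT NULL DEFAULT '' COMMENT 'name of the locked method',
  `desc` VARCHAR(1024) NOT NULL COMMENT 'remark',
  `update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='methods currently locked';
```

The unique key on `method_name` is what makes a second concurrent insert for the same method fail, which is the whole basis of this scheme.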

When we want to lock a method, execute the following SQL:

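The original image is missing; a sketch of the acquire and release statements under the assumed schema (identifiers are illustrative):

```sql
-- Acquire: succeeds for exactly one caller, because the unique key on
-- method_name makes a second concurrent INSERT fail.
INSERT INTO method_lock (method_name, `desc`)
VALUES ('methodName', 'locked by host-1/thread-23');

-- Release: delete the record.
DELETE FROM method_lock WHERE method_name = 'methodName';
```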

The above simple implementation has the following problems:

1. This lock is strongly dependent on the availability of the database. The database is a single point. Once the database hangs, the business system will be unavailable.

2. This lock has no expiration time. Once the unlock operation fails, the lock record will always be in the database, and other threads can no longer obtain the lock.

3. This lock can only be non-blocking, because a failed insert reports an error immediately. A thread that fails to acquire the lock is not enqueued; to acquire the lock again, it must trigger the acquisition operation again.

4. This lock is non-reentrant: the same thread cannot acquire the lock again before releasing it, because the record already exists in the table.

Of course, we can also have other ways to solve the above problems.

Database a single point? Deploy two databases with bidirectional synchronization; once one goes down, quickly switch to the standby.
No expiration time? Run a scheduled task that cleans up timed-out records in the database at regular intervals.
Non-blocking? Retry in a while loop until the insert succeeds, then return success.
Non-reentrant? Add fields to the table recording the host and thread information of the machine that currently holds the lock. On the next acquisition, first query the database; if the current machine's host and thread information is found there, grant the lock to that caller directly.
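The reentrancy fix could look like the following sketch, assuming hypothetical `host_info` and `thread_info` columns added to the lock table:

```sql
-- Before inserting, check whether this machine/thread already holds the lock
-- (column names and values are illustrative).
SELECT id FROM method_lock
 WHERE method_name = 'methodName'
   AND host_info = 'host-1'
   AND thread_info = 'thread-23';
-- A matching row means the caller already owns the lock and may proceed;
-- otherwise, attempt the INSERT as before.
```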

Database exclusive lock

In addition to adding and deleting records in the table, a distributed lock can also be implemented with the help of the locks that come with the database itself.

We still use the database table we just created. A distributed lock can be implemented through a database exclusive lock. Based on MySQL's InnoDB engine, the locking operation can be implemented as follows:

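The original image is missing; in SQL terms, the locking step is a `SELECT ... FOR UPDATE` executed inside an open transaction (table and column names are illustrative):

```sql
-- Turn off autocommit so the row lock is held until we commit.
SET autocommit = 0;
BEGIN;
-- Acquire the exclusive row lock; blocks if another session holds it.
SELECT * FROM method_lock WHERE method_name = 'methodName' FOR UPDATE;
```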

When `for update` is appended to the query, the database adds an exclusive lock during the query. (One thing worth mentioning: when InnoDB takes locks, it uses row-level locks only if the retrieval goes through an index; otherwise it uses a table-level lock. Since we want a row-level lock here, we need an index on method_name, and the index must be unique; otherwise multiple overloaded methods could not be locked at the same time. If methods are overloaded, it is recommended to include the parameter types in the name as well.) Once a record holds an exclusive lock, other sessions cannot acquire an exclusive lock on that row.

We can consider the thread that obtains the exclusive lock to have obtained the distributed lock. Once the lock is obtained, the method's business logic can execute; after the method finishes, the lock is released as follows:

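The missing image presumably showed the unlock step; with this scheme, releasing the lock is just committing the transaction that holds the row lock:

```sql
-- Release the exclusive lock by ending the transaction.
COMMIT;
```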

Release the lock through the connection.commit() operation.

This method effectively solves the problems mentioned above of being unable to release the lock and of non-blocking behavior.

Blocking lock? The `for update` statement returns immediately on success and blocks on conflict until it succeeds.
Lock not released when the service goes down after locking? With this approach, when the service goes down, the database connection is closed and the database releases the lock itself.
However, it still cannot directly solve the problem of database single point and reentrancy.

There may be another problem here. Although we have a unique index on method_name and explicitly use `for update` for row-level locking, MySQL optimizes queries: even when an index field appears in the condition, whether the index is actually used is decided by MySQL's cost comparison of different execution plans. If MySQL decides a full table scan is more efficient, for example on very small tables, it will not use the index, in which case InnoDB takes a table lock instead of a row lock. That would be unfortunate.
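One hedged mitigation is to inspect the plan and hint the index explicitly (the index name is illustrative; behavior still depends on the MySQL version and table statistics):

```sql
-- Check which plan MySQL chooses:
EXPLAIN SELECT * FROM method_lock WHERE method_name = 'methodName' FOR UPDATE;

-- Hint the unique index explicitly:
SELECT * FROM method_lock FORCE INDEX (uidx_method_name)
 WHERE method_name = 'methodName' FOR UPDATE;
```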

Another problem: this scheme relies on holding an exclusive lock, and an exclusive lock that is not committed for a long time occupies a database connection. Once there are too many such connections, the database connection pool may be exhausted.

Summary

To summarize the database-based approaches: both depend on a table in the database. One determines whether a lock currently exists by the presence of a record in the table; the other implements the distributed lock through the database's exclusive lock.

The advantages of implementing distributed locks in databases

It is easy to understand directly with the help of the database.

Disadvantages of implementing distributed locks in databases

All kinds of problems arise, and solving them makes the whole scheme increasingly complicated.

Operating the database requires a certain amount of overhead, and performance issues need to be considered.

Using database row-level locks is not necessarily reliable, especially when our lock table is not large.

Distributed lock based on cache

Compared with the scheme of implementing distributed locks based on the database, the implementation based on the cache will perform better in terms of performance. And many caches can be deployed in clusters, which can solve single-point problems.

There are many mature caching products, including Redis, memcached, and Tair within our company.

Here we take Tair as an example to analyze the solution of using cache to realize distributed lock. There are many related articles on the Internet about Redis and memcached, and there are also some mature frameworks and algorithms that can be used directly.

The implementation of distributed locks based on Tair is similar to that of Redis. The main implementation method is to use the TairManager.put method.

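The original image is unavailable, and Tair's exact API is internal to the author's company, so as an illustrative stand-in here is a `ConcurrentHashMap.putIfAbsent` sketch of the same try-lock semantics (a put that succeeds only when the key is absent, like Tair's versioned put or Redis's SETNX); a real implementation would talk to the cache cluster instead of an in-process map:

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for a cache-based lock. putIfAbsent mirrors the
// "put succeeds only if the key does not already exist" semantics.
public class CacheLockSketch {
    private static final ConcurrentHashMap<String, String> CACHE = new ConcurrentHashMap<>();

    // Non-blocking and non-reentrant: returns true only for the first caller.
    public static boolean tryLock(String method, String owner) {
        return CACHE.putIfAbsent(method, owner) == null;
    }

    public static void unlock(String method) {
        CACHE.remove(method);
    }
}
```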

The above implementation also has several problems:

1. This lock has no expiration time. Once the unlock operation fails, the lock record stays in Tair forever, and other threads can no longer obtain the lock.

2. This lock can only be non-blocking, and returns directly regardless of success or failure.

3. This lock is non-reentrant. After a thread acquires the lock, it cannot acquire it again before releasing it, because the key already exists in Tair and further put operations will fail.

Of course, there are also ways to solve it.

No expiration time? Tair's put method supports passing in an expiration time, after which the data is deleted automatically.
Non-blocking? Retry in a while loop until the put succeeds.
Non-reentrant? After a thread acquires the lock, save the current host and thread information; before acquiring the lock again, check whether the caller is the current owner.
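The three fixes above can be sketched together, again using an in-process map as a stand-in for the cache (in Tair or Redis the server would expire the key itself; the caller-supplied clock here is just to make the expiry logic explicit):

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative try-lock with expiry and owner-based reentrancy.
public class ExpiringLockSketch {
    static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) { this.owner = owner; this.expiresAt = expiresAt; }
    }

    private static final ConcurrentHashMap<String, Entry> CACHE = new ConcurrentHashMap<>();

    public static boolean tryLock(String method, String owner, long nowMillis, long ttlMillis) {
        Entry e = CACHE.get(method);
        if (e != null && e.expiresAt <= nowMillis) {  // expired: auto-release
            CACHE.remove(method, e);
            e = null;
        }
        if (e != null) {
            return e.owner.equals(owner);             // reentrant for the owner only
        }
        return CACHE.putIfAbsent(method, new Entry(owner, nowMillis + ttlMillis)) == null;
    }
}
```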

However, how long should the expiration time be set? If it is too short, the lock is automatically released before the method finishes executing, and concurrency problems appear. If it is too long, other threads waiting for the lock may have to wait extra time. This problem also exists when using a database to implement distributed locks.

Summary

A cache can be used instead of a database to implement distributed locks, providing better performance. At the same time, many cache services are deployed as clusters, which avoids single points of failure. Many cache services also provide methods that can be used to implement distributed locks, such as Tair's put method and Redis's SETNX command, and they support automatic deletion of expired data, so a timeout can be set directly to control the release of the lock.

Advantages of using a cache to implement distributed locks

It has good performance and is more convenient to implement.

Disadvantages of using a cache to implement distributed locks

It is not very reliable to control the expiration time of the lock through the timeout period.

Distributed lock based on Zookeeper

Distributed locks can be implemented based on ZooKeeper's ephemeral sequential nodes.

The general idea: when a client wants to lock a method, it creates a unique ephemeral sequential node under the directory of the node designated for that method on ZooKeeper. Determining whether the lock has been acquired is simple: the client only needs to check whether its node has the smallest sequence number among the ordered nodes. To release the lock, it simply deletes its ephemeral node. This also avoids the deadlock where a lock cannot be released because a service went down.
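The "is my node the smallest?" check at the heart of this scheme can be sketched in plain Java (node names are assumed to end in the zero-padded sequence number that ZooKeeper appends, so lexicographic order matches numeric order):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the ordering check a ZooKeeper-based lock performs: the client
// that created the lexicographically smallest sequential node holds the lock.
public class ZkOrderingSketch {
    // children: node names like "lock-0000000001" under the lock's parent node.
    public static boolean holdsLock(List<String> children, String myNode) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        return !sorted.isEmpty() && sorted.get(0).equals(myNode);
    }
}
```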

Let's see if Zookeeper can solve the problems mentioned above.

The lock cannot be released? ZooKeeper effectively solves this: the client creates an ephemeral node in ZK when taking the lock, so if the client suddenly goes down after acquiring it (the session disconnects), the ephemeral node is deleted automatically, and other clients can then acquire the lock.

Non-blocking locks? A blocking lock can be achieved with ZooKeeper: clients create sequential nodes in ZK and register a watcher on them. When the nodes change, ZooKeeper notifies the client, which checks whether the node it created now has the smallest sequence number; if so, it holds the lock and executes the business logic.

Not reentrant? ZooKeeper also solves non-reentrancy effectively: when creating its node, the client writes its host and thread information directly into the node's data. The next time it wants to acquire the lock, it compares that data against the current smallest node; if the information matches its own, it acquires the lock directly, otherwise it creates an ephemeral sequential node and joins the queue.

Single point problem? ZooKeeper effectively solves this too: ZK is deployed as a cluster, and as long as more than half of the machines in the cluster are alive, it can provide service.

You can directly use Curator, a third-party ZooKeeper client library, which encapsulates a reentrant lock service.

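The missing image presumably showed Curator usage; a minimal sketch follows (the connection string, lock path, and timeout are illustrative, and running it requires a live ZooKeeper plus the `curator-recipes` dependency):

```java
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorLockExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Reentrant distributed lock backed by ephemeral sequential nodes.
        InterProcessMutex mutex = new InterProcessMutex(client, "/locks/methodName");
        if (mutex.acquire(10, TimeUnit.SECONDS)) {  // blocks for up to 10 seconds
            try {
                // business logic protected by the lock
            } finally {
                mutex.release();
            }
        }
        client.close();
    }
}
```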

The InterProcessMutex provided by Curator is such a distributed lock implementation: the acquire method acquires the lock, and the release method releases it.

The distributed lock implemented with ZK seems to fully meet all the expectations for a distributed lock set out at the beginning of this article. However, it is not perfect. The ZooKeeper-based lock has a drawback: its performance may not be as high as a cache-based one, because every lock acquisition and release requires dynamically creating and deleting ephemeral nodes. In ZK, node creation and deletion can only be performed by the Leader server, and the data must then be synchronized to all the Follower machines.

In fact, using ZooKeeper may also introduce concurrency problems, though they are not common. Consider this situation: due to network jitter, the client's session with the ZK cluster is broken; ZK then believes the client is dead and deletes its ephemeral node, at which point another client can obtain the distributed lock, and a concurrency problem can arise. This problem is uncommon because ZK has a retry mechanism: once the cluster cannot detect a client's heartbeat, it retries, and the Curator client supports multiple retry strategies; the ephemeral node is deleted only after several retries fail. (So choosing an appropriate retry strategy also matters: one must find a balance between lock granularity and concurrency.)

Summary

Advantages of using Zookeeper to implement distributed locks

It effectively solves the single-point problem, the non-reentrancy problem, the non-blocking problem, and the problem of locks that cannot be released, and it is relatively simple to implement.

Disadvantages of using Zookeeper to implement distributed locks

Performance is not as good as a cache-based distributed lock, and some understanding of ZK's principles is required.

Comparison of the three schemes

None of the above approaches is perfect. Just as with CAP, complexity, reliability, and performance cannot all be satisfied at once, so the right approach is to choose the one that best fits the application scenario.

From an ease of understanding perspective (low to high)
Database > Cache > Zookeeper

From an implementation complexity perspective (low to high)
Zookeeper >= cache > database

From a performance perspective (high to low)
cache > Zookeeper >= database

From a reliability perspective (high to low)
Zookeeper > Cache > Database

Reprinted from:
https://www.cnblogs.com/austinspark-jessylu/p/8043726.html
