Architect's Diary - Redis-based Distributed Lock Implementation

I discussed locks in an earlier article, "The lock mechanism of concurrent programming: synchronized and lock". In a single-process system, when multiple threads can change a variable at the same time, access to that variable or code block must be synchronized so that modifications execute serially and concurrent modification is eliminated. Synchronization is, in essence, achieved through locks. To ensure that only one thread at a time executes a given code block, a mark needs to be made somewhere, and this mark must be visible to every thread. When the mark does not exist, a thread can set it; subsequent threads find that the mark is already there and wait until the thread that set it finishes the synchronized block and cancels the mark, then try to set the mark themselves.
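
The "mark" idea can be sketched in a few lines of Java. This is only an illustration of the principle, not how the JVM implements synchronized: the class name MarkLock and its methods are hypothetical, and an AtomicBoolean plays the role of the mark that every thread can see.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration: an AtomicBoolean acts as the shared "mark".
class MarkLock {
    private final AtomicBoolean mark = new AtomicBoolean(false);

    // Try to set the mark; succeeds only if no other thread has set it.
    boolean tryAcquire() {
        return mark.compareAndSet(false, true);
    }

    // Keep trying until the mark can be set (busy-waiting, for illustration only).
    void acquire() {
        while (!tryAcquire()) {
            Thread.yield();
        }
    }

    // Cancel the mark so that a waiting thread can set it.
    void release() {
        mark.set(false);
    }
}
```

A thread would call acquire() before the critical section and release() in a finally block afterwards. The sketch does not check that the releasing thread is the one that set the mark.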

In a distributed environment, data consistency has always been an important topic, but the situation differs from that of a single process. The biggest difference between a distributed and a stand-alone system is that concurrency comes from multiple processes rather than multiple threads. Multiple threads share heap memory, so they can simply use memory as the place to store the mark. Processes may not even be on the same physical machine, so the mark must be stored somewhere that all processes can see.

A common scenario is the seckill (flash sale) scenario, in which the order service is deployed as multiple instances. For example, there are 4 units of a seckill product; the first user wants to buy 3 and the second user wants to buy 2. Ideally, the first user's purchase succeeds and the second user is told that the purchase failed, or vice versa. What may actually happen is that both requests read an inventory of 4. The first user buys 3, and before the inventory is updated, the second user places an order for 2, which also passes the stock check. The second update then writes an inventory of 2, even though 5 items have been sold from a stock of 4.
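
The interleaving described above can be replayed deterministically. The following is a minimal sketch with hypothetical names (OversellDemo and its methods); it models the two order-service instances as plain method calls rather than real processes, purely to show where the stale read causes the oversell.

```java
// Hypothetical model of the flash-sale race: two order-service instances
// read the same stock value before either one writes its update back.
class OversellDemo {
    // Returns the stock an instance would write back, or -1 if it rejects the order.
    static int stockAfterOrder(int stockSeen, int quantity) {
        return quantity <= stockSeen ? stockSeen - quantity : -1;
    }

    // Replays the unsafe interleaving and returns the total number of items sold.
    static int unsafeTotalSold(int initialStock, int firstQty, int secondQty) {
        int seenByFirst = initialStock;   // instance 1 reads the stock
        int seenBySecond = initialStock;  // instance 2 reads the same, soon stale, value
        int sold = 0;
        if (stockAfterOrder(seenByFirst, firstQty) >= 0) sold += firstQty;
        if (stockAfterOrder(seenBySecond, secondQty) >= 0) sold += secondQty;
        return sold;
    }

    // Replays the same orders one after another, as a lock would force them to run.
    static int serializedTotalSold(int initialStock, int firstQty, int secondQty) {
        int stock = initialStock;
        int sold = 0;
        int after = stockAfterOrder(stock, firstQty);
        if (after >= 0) { stock = after; sold += firstQty; }
        after = stockAfterOrder(stock, secondQty);
        if (after >= 0) { sold += secondQty; }
        return sold;
    }
}
```

With 4 items in stock and orders of 3 and 2, unsafeTotalSold(4, 3, 2) reports 5 items sold, while serializedTotalSold(4, 3, 2) sells 3 and rejects the second order.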

In the scenario above, the product inventory is a shared variable. Under high concurrency, access to the resource must be mutually exclusive. In a stand-alone environment, Java provides many APIs for concurrency, but these are powerless in distributed scenarios; that is, the plain Java API cannot provide a distributed lock. Because a distributed system involves multiple threads in multiple processes spread across different machines, synchronized and Lock lose their locking effect, and we need to implement distributed locks ourselves.

Common locking schemes are as follows:

  • Distributed lock based on database
  • Distributed lock based on a cache, such as Redis
  • Distributed lock based on Zookeeper

Below we briefly introduce the implementation of these types of locks.

Database-based

There are also two ways to implement database-based locks, one is based on database tables, and the other is based on database exclusive locks.

Based on inserting and deleting table rows

The simplest approach is to create a lock table whose main fields are the method name, a timestamp, and so on.

To lock a method, insert a corresponding record into the table. Note that the method name column has a unique constraint. If multiple requests are submitted to the database at the same time, the database ensures that only one insert succeeds; the thread whose insert succeeded can be considered to hold the lock for that method and may execute the method body.

After the execution is complete, you need to delete the record.
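
The insert-then-delete scheme can be sketched with plain JDBC. This is a sketch under assumptions, not a definitive implementation: the table name method_lock, its columns, and the class TableLock are hypothetical; the connection is assumed to be in auto-commit mode; and the driver is assumed to report a unique-key violation as SQLIntegrityConstraintViolationException, which other drivers may surface as a plain SQLException instead.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Sketch only. Assumed schema (hypothetical names):
//   CREATE TABLE method_lock (
//     method_name VARCHAR(64) NOT NULL UNIQUE,
//     created_at  BIGINT      NOT NULL);
class TableLock {
    // Insert a row for the method; the unique constraint lets only one caller succeed.
    static boolean tryLock(Connection conn, String methodName) throws SQLException {
        String sql = "INSERT INTO method_lock (method_name, created_at) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, methodName);
            ps.setLong(2, System.currentTimeMillis());
            return ps.executeUpdate() == 1;
        } catch (SQLIntegrityConstraintViolationException duplicate) {
            // Another caller already inserted the row, so the lock is taken.
            return false;
        }
    }

    // Delete the row to release the lock.
    static void unlock(Connection conn, String methodName) throws SQLException {
        String sql = "DELETE FROM method_lock WHERE method_name = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, methodName);
            ps.executeUpdate();
        }
    }
}
```

A caller would run the protected method only when tryLock returns true and call unlock in a finally block. The created_at column is what a cleanup task could use to find timed-out records.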

Of course, this is only a brief introduction. The scheme above can be improved in several ways:

  • Deploy the database as master and standby with two-way data synchronization, and switch quickly to the standby if the master goes down.
  • Run a scheduled task that periodically cleans up timed-out lock records in the database.
  • Retry the insert in a while loop and return only once it succeeds, which gives a blocking lock, although this is not recommended.
  • Record the host and thread information of the machine that currently holds the lock, and query the database first on the next acquisition. If the current host and thread information is found there, grant the lock directly, which makes the lock reentrant.

Database exclusive lock

We can also implement distributed locks through database exclusive locks. With MySQL's InnoDB engine, locking can be implemented as follows:

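// Assumes java.sql.* is imported, that connection and lockName are fields of the
// enclosing class, and that the lock table already contains a row for lockName.
public void lock() throws SQLException, InterruptedException {
    // The row lock lasts until the transaction ends, so auto-commit must be off.
    connection.setAutoCommit(false);
    // LOCK is a reserved word in MySQL, hence the backticks around the table name.
    String sql = "SELECT * FROM `lock` WHERE lock_name = ? FOR UPDATE";
    int count = 0;
    while (count < 4) {
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setString(1, lockName);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    // A row came back, so this transaction holds the lock.
                    return;
                }
            }
        } catch (SQLException e) {
            // For example a lock wait timeout: treat it as "not acquired" and retry.
        }
        // An empty result or an exception means the lock was not acquired.
        Thread.sleep(1000);
        count++;
    }
    throw new LockException();
}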

Adding FOR UPDATE to the query makes the database place an exclusive lock on the matching rows during the query. (InnoDB locks only the matching rows when the lookup goes through an index; without one it may lock far more, so lock_name should be indexed.) Once a record holds an exclusive lock, other threads cannot place an exclusive lock on that row. Those that do not acquire the lock block on the SELECT statement above, with two possible outcomes: the lock is acquired before the timeout, or it is not.

The thread that obtains the exclusive lock holds the distributed lock and can execute the method's business logic. After the method finishes, release the lock by calling connection.commit().

The main problems with this approach are low performance and SQL timeout exceptions.

Advantages and disadvantages of database-based locks

Both methods above depend on a database table. One determines whether the lock is held by whether a record exists in the table; the other implements the distributed lock through the database's exclusive lock.

  • The advantage is that it is simple and easy to understand directly with the help of the database.
  • The disadvantage is that operating the database requires a certain amount of overhead, and performance issues need to be considered.

Based on Zookeeper

Distributed locks can be implemented with Zookeeper's ephemeral sequential nodes. When a client locks a method, it creates a unique ephemeral sequential node under the Zookeeper directory node that corresponds to that method. Deciding whether the lock has been acquired is simple: the client only needs to check whether its node has the smallest sequence number among the sequential nodes. To release the lock, just delete the ephemeral node. This also avoids the deadlock in which a lock can never be released because the service went down, since Zookeeper removes ephemeral nodes when the client's session ends.

A third-party library, Curator, is available, and readers can look into its usage. The InterProcessMutex provided by Curator is a distributed lock implementation: the acquire method acquires the lock, and the release method releases it. It also handles lock release, blocking locks, and reentrant locks effectively. As for how a blocking lock is implemented: the client creates a sequential node in ZK and binds a listener to the node. When the node changes, Zookeeper notifies the client, and the client checks whether the node it created now has the smallest sequence number among all nodes. If so, it has acquired the lock and can execute its business logic.
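
The "smallest sequence number" decision can be sketched without any Zookeeper client. In this sketch the class SequenceLockCheck and the node-name prefix "lock-" are hypothetical; Zookeeper itself appends a zero-padded counter to the names of sequential nodes. Watching the node immediately ahead of one's own is one common way to bind the listener; the text above only says that a listener is bound, so treat that choice as an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the lock decision only; no Zookeeper client is involved.
// Node names such as "lock-0000000003" are assumed for illustration.
class SequenceLockCheck {
    // Extract the numeric sequence suffix that follows the last '-'.
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    // Returns null if myNode has the smallest sequence number (lock acquired),
    // otherwise the name of the node just ahead of it, which is the one to watch.
    static String nodeToWatch(List<String> children, String myNode) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(SequenceLockCheck::sequenceOf));
        int index = sorted.indexOf(myNode);
        if (index < 0) {
            throw new IllegalArgumentException(myNode + " is not among the children");
        }
        return index == 0 ? null : sorted.get(index - 1);
    }
}
```

A client would call nodeToWatch with the current children of the lock directory each time it is notified, and proceed once the result is null.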

Finally, distributed locks implemented with Zookeeper have a disadvantage: their performance may not match that of a cache service, because every lock acquisition and release must dynamically create and destroy an ephemeral node. In ZK, nodes can only be created and deleted through the Leader server, and the change must then be synchronized to the Follower machines. There is also a possible concurrency problem: with network jitter, the session between a client and the ZK cluster may be disconnected. The ZK cluster then considers the client dead and deletes its ephemeral node, at which point other clients can obtain the distributed lock while the original client may still believe it holds it.

Cache-based

Compared with database-based distributed locks, cache-based implementations perform better and are much faster to access. Many caches can also be deployed as clusters, which solves the single-point problem. Several caches can be used for locking, such as memcached and Redis. This article mainly explains the Redis-based implementation.

Redis-based distributed lock implementation

SETNX

Using redis's SETNX to implement distributed locks, multiple processes execute the following Redis commands:

SETNX lock.id <current Unix time + lock timeout + 1>

SETNX sets the value of the key to value if and only if the key does not exist. If the given key already exists, SETNX does nothing.

  • Returns 1: the process has acquired the lock, and SETNX has set the value of the key lock.id to the lock's expiration time, that is, the current time plus the lock's validity period.
  • Returns 0: another process has already acquired the lock, and this process cannot enter the critical section. The process can keep retrying SETNX in a loop to acquire the lock.
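
The two return values and the retry loop can be sketched as follows. This is a sketch under assumptions: SetnxDemo is a hypothetical in-memory stand-in for Redis (a ConcurrentHashMap, not a Redis client), and the expiration value is in milliseconds rather than Unix seconds.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for Redis, for illustration only (not a Redis client).
class SetnxDemo {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    // Mimics SETNX: returns 1 if the key was set, 0 if it already existed.
    int setnx(String key, String value) {
        return store.putIfAbsent(key, value) == null ? 1 : 0;
    }

    // Mimics DEL: returns the number of keys removed.
    int del(String key) {
        return store.remove(key) == null ? 0 : 1;
    }

    // Retry SETNX up to maxAttempts times, sleeping between attempts.
    boolean acquire(String key, long lockTimeoutMillis, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // The value is the time at which the lock should be considered expired.
            String expiresAt = String.valueOf(System.currentTimeMillis() + lockTimeoutMillis + 1);
            if (setnx(key, expiresAt) == 1) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

The loop here never inspects the stored expiration time, so a lock whose holder disappears is never taken over.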

Deadlock problem

SETNX-based distributed locks can deadlock. Compared with locks in stand-alone mode, in a distributed environment it is not enough for the lock to be visible to every process; network problems between a process and the lock service must also be considered. If a thread acquires the lock and then loses its connection to Redis, the lock is not released in time and the other threads competing for it hang, resulting in a deadlock.

When acquiring the lock with SETNX, we set the value of the key lock.id to the lock's expiration time. After a thread obtains the lock, the other threads keep checking whether the lock has timed out; if it has, a waiting thread gets the chance to obtain it. However, when the lock times out, we cannot simply use the DEL command to delete the key lock.id and release the lock.

Consider the following scenarios:

  1. A obtained the lock lock.id first, and then A's connection dropped. Both B and C are waiting to compete for the lock;
  2. B and C read the value of lock.id, compare it with the current time to determine whether the lock has timed out, and both find that it has;
  3. B executes DEL lock.id, then executes SETNX lock.id, which returns 1, so B obtains the lock;
  4. Because C has also just detected the timeout, it executes DEL lock.id, which deletes the key that B just set. It then executes SETNX lock.id, which returns 1, so C obtains the lock as well.

These steps clearly have a problem: B and C hold the lock at the same time. After detecting a lock timeout, a thread cannot simply DEL the key in order to acquire the lock.
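
The four steps can be replayed in a fixed order to make the double acquisition visible. The sketch below is an illustration under assumptions: DelRaceDemo is hypothetical, a HashMap stands in for Redis, and the clock values are made-up constants.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of the race; a HashMap stands in for Redis (illustration only).
class DelRaceDemo {
    private final Map<String, Long> store = new HashMap<>();

    // Mimics SETNX: true if the key was absent and has now been set.
    boolean setnx(String key, long value) { return store.putIfAbsent(key, value) == null; }
    Long get(String key) { return store.get(key); }
    void del(String key) { store.remove(key); }

    // Returns how many of B and C believe they hold the lock after the replay.
    static int replay() {
        DelRaceDemo redis = new DelRaceDemo();
        long now = 10_000L;         // hypothetical current time
        long lockTimeout = 1_000L;  // hypothetical lock duration

        // Step 1: A acquired the lock a while ago and then lost its connection.
        redis.setnx("lock.id", now - 5_000L);

        // Step 2: B and C both read the value and both see it as expired.
        boolean bSawExpired = redis.get("lock.id") < now;
        boolean cSawExpired = redis.get("lock.id") < now;

        int holders = 0;
        // Step 3: B deletes the key and sets it again.
        if (bSawExpired) {
            redis.del("lock.id");
            if (redis.setnx("lock.id", now + lockTimeout + 1)) holders++;
        }
        // Step 4: C acts on its stale check and deletes the key B just set.
        if (cSawExpired) {
            redis.del("lock.id");
            if (redis.setnx("lock.id", now + lockTimeout + 1)) holders++;
        }
        return holders;
    }
}
```

DelRaceDemo.replay() returns 2: both B and C end up believing they hold the lock.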

The problem lies in deleting the key, so how can the steps be improved?

First look at Redis's GETSET operation: GETSET key value sets the value of the given key to value and returns the key's old value. Using this command, we can improve the steps above.

  1. A obtained the lock lock.id first, and then A's connection dropped. Both B and C are waiting to compete for the lock;
  2. B and C read the value of lock.id, compare it with the current time to determine whether the lock has timed out, and both find that it has;
  3. B, having detected the timeout (the current time is greater than the value of the key lock.id), executes
    GETSET lock.id <current Unix timestamp + lock timeout + 1>
    to set a new timestamp, and judges whether it has acquired the lock by checking whether the old value of lock.id returned by GETSET is less than the current time;
  4. B finds that the value returned by GETSET is less than the current time, so the old lock had indeed expired and B obtains the lock. No DEL or SETNX is needed, because GETSET has already written B's timestamp;
  5. C executes GETSET and gets back a time greater than the current time (the value B just set), so C continues to wait.
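
The GETSET-based takeover can be replayed in the same fixed order. As before, this is a sketch under assumptions: GetSetDemo is hypothetical, a HashMap stands in for Redis, and the clock values are made-up constants.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of the GETSET-based takeover (illustration only).
class GetSetDemo {
    private final Map<String, Long> store = new HashMap<>();

    Long get(String key) { return store.get(key); }

    // Mimics GETSET: store the new value and return the previous one (or null).
    Long getSet(String key, long value) { return store.put(key, value); }

    // Returns how many of B and C believe they hold the lock after the replay.
    static int replay() {
        GetSetDemo redis = new GetSetDemo();
        long now = 10_000L;         // hypothetical current time
        long lockTimeout = 1_000L;  // hypothetical lock duration

        // A's lock value is an expiration time that has already passed.
        redis.getSet("lock.id", now - 5_000L);

        // B and C both read the value and both see it as expired.
        boolean bSawExpired = redis.get("lock.id") < now;
        boolean cSawExpired = redis.get("lock.id") < now;

        int holders = 0;
        if (bSawExpired) {
            Long old = redis.getSet("lock.id", now + lockTimeout + 1);
            // The old value is A's expired timestamp, so B takes the lock.
            if (old != null && old < now) holders++;
        }
        if (cSawExpired) {
            Long old = redis.getSet("lock.id", now + lockTimeout + 1);
            // The old value is B's future timestamp, so C must keep waiting.
            if (old != null && old < now) holders++;
        }
        return holders;
    }
}
```

GetSetDemo.replay() returns 1: only B acquires the lock. Note that C's GETSET still overwrites the timestamp B wrote, so the stored expiration time can shift slightly even though C does not get the lock.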

Before a thread releases the lock, that is, before it executes DEL lock.id, it must check whether the lock has timed out. If it has, another thread may already have acquired the lock, and executing DEL lock.id directly would release the lock that the other thread holds.

An implementation

Acquire lock

public boolean lock(long acquireTimeout, TimeUnit timeUnit) throws InterruptedException {
    long acquireTimeoutMillis = timeUnit.toMillis(acquireTimeout);
    long deadline = acquireTimeoutMillis + System.currentTimeMillis();
    // Use a J.U.C ReentrantLock so that threads in this JVM compete locally first.
    // The timeout has already been converted, so pass TimeUnit.MILLISECONDS.
    if (!threadLock.tryLock(acquireTimeoutMillis, TimeUnit.MILLISECONDS)) {
        return false;
    }
    try {
        // Retry until the lock is acquired or the deadline passes
        while (true) {
            boolean hasLock = tryLock();
            if (hasLock) {
                // Lock acquired
                return true;
            } else if (deadline < System.currentTimeMillis()) {
                break;
            }
            Thread.sleep(sleepTime);
        }
    } finally {
        if (threadLock.isHeldByCurrentThread()) {
            threadLock.unlock();
        }
    }

    return false;
}

public boolean tryLock() {
    long currentTime = System.currentTimeMillis();
    String expires = String.valueOf(timeout + currentTime);
    // Try to set the mutex
    if (redisHelper.setNx(mutex, expires) > 0) {
        // Lock acquired: record the lock status with its expiration time
        setLockStatus(expires);
        return true;
    } else {
        String currentLockTime = redisHelper.get(mutex);
        // Check whether the existing lock has timed out
        if (Objects.nonNull(currentLockTime) && Long.parseLong(currentLockTime) < currentTime) {
            // Atomically write the new expiration time and fetch the old one
            String oldLockTime = redisHelper.getSet(mutex, expires);
            // If the old value equals the value read above, no other client got in between
            if (Objects.nonNull(oldLockTime) && Objects.equals(oldLockTime, currentLockTime)) {
                // Lock acquired: record the lock status with its expiration time
                setLockStatus(expires);
                return true;
            }
        }

        return false;
    }
}

The lock method takes the acquisition timeout and its unit as parameters and calls tryLock in a loop. Within the timeout period, the thread spins, retrying the acquisition until it obtains the lock or the timeout elapses.

In the tryLock method, the main logic is as follows:

  • setnx(lockkey, current time + expiration timeout): if it returns 1, the lock is acquired; if it returns 0, the lock is not acquired and the following steps apply
  • get(lockkey) returns the value oldExpireTime, which is compared with the current system time. If it is less than the current system time, the lock is considered to have timed out and other requests may acquire it
  • Calculate newExpireTime = current time + expiration timeout, then call getset(lockkey, newExpireTime), which returns the previous value of lockkey, currentExpireTime
  • Check whether currentExpireTime and oldExpireTime are equal. If they are, the getset succeeded and the lock is acquired. If not, another request has acquired the lock in the meantime, and the current request can either return failure or keep retrying

Release lock

public boolean unlock() {
    // Only the thread holding the lock can unlock
    if (lockHolder == Thread.currentThread()) {
        // Determine whether the lock has timed out, and delete the mutex if there is no timeout
        if (lockExpiresTime > System.currentTimeMillis()) {
            redisHelper.del(mutex);
            logger.info("Delete mutex [{}]", mutex);
        }
        lockHolder = null;
        logger.info("Release [{}] lock succeeded", mutex);

        return true;
    } else {
        throw new IllegalMonitorStateException("The thread that has not acquired the lock cannot perform the unlocking operation");
    }
}

With lock acquisition implemented as above, the release function here is, in fact, not strictly needed. Interested readers can work out why from the code above. Leave a message if you have an idea!

Summary

This article mainly explained the implementation of distributed locks based on Redis. In a distributed environment, data consistency has always been an important topic, and synchronized and Lock lose their effect. Common schemes include database-based, cache-based, and Zookeeper-based distributed locks, and the characteristics of each were briefly introduced. The article then explored the implementation scheme for Redis locks and finally gave a Java implementation of a Redis distributed lock, which readers can verify for themselves.

Origin http://43.154.161.224:23101/article/api/json?id=325937888&siteId=291194637