Implementing distributed locks with Redis

Most developers have come across distributed locks, and they are a favorite interview topic at many large companies.

When we modify existing data, we read it first, then modify it and save it. Because read-modify-save is not an atomic operation, concurrent requests can overwrite each other and some updates are lost. On a single server, a local lock is usually enough to prevent this. Once the service is deployed as a cluster, however, a local lock has no effect across servers, and a distributed lock is needed to keep the data consistent.

Implementation

A Redis lock is built mainly on the Redis SETNX command.

Locking command: SETNX key value. If the key does not exist, Redis sets it and returns 1 (success); otherwise it returns 0 (failure). The key is the unique identifier of the lock and is usually named after the business operation it protects.
Unlock command: DEL key. Deleting the key releases the lock, so that other threads can acquire it with SETNX.
Lock timeout: EXPIRE key timeout. Setting a timeout on the key ensures that the lock is released automatically after a while even if it is never released explicitly, so the resource is not locked forever.

The locking/unlocking pseudocode is as follows:

if (setnx(key, 1) == 1){
    expire(key, 30)
    try {
        // TODO: business logic
    } finally {
        del(key)
    }
}

There are some problems with the above lock implementation:

1. SETNX and EXPIRE are not atomic

If SETNX succeeds but the client crashes, restarts, or loses its network connection before EXPIRE is executed, the lock is left without a timeout and is never released, which results in a deadlock.

A Lua script solves this problem, because Redis executes a script atomically. Since Redis 2.6.12 the same effect is available in a single command, SET key value NX EX seconds. Example script:

if (redis.call('setnx', KEYS[1], ARGV[1]) < 1)
then return 0;
end;
redis.call('expire', KEYS[1], tonumber(ARGV[2]));
return 1;

// Usage example
EVAL "if (redis.call('setnx',KEYS[1],ARGV[1]) < 1) then return 0; end; redis.call('expire',KEYS[1],tonumber(ARGV[2])); return 1;" 1 key value 100

2. Releasing a lock held by another thread

Suppose thread A acquires the lock with an expiration time of 30 seconds, but A runs for longer than 30 seconds, so the lock expires and is released automatically. Thread B then acquires the lock. When A finally finishes, it releases the lock with DEL, but B has not finished yet, so A has actually released the lock held by B.

The fix is to store an identifier of the current thread in the value and to check that value before deleting the key, so that only the holder can release the lock. A UUID can serve as the identifier, and a Lua script makes the check and the delete a single atomic operation.

// Lock
String uuid = UUID.randomUUID().toString().replaceAll("-","");
SET key uuid NX EX 30
// Unlock (Lua script, KEYS[1] = key, ARGV[1] = uuid)
if (redis.call('get', KEYS[1]) == ARGV[1])
    then return redis.call('del', KEYS[1])
else return 0
end
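
To make the owner check concrete, here is a minimal Java sketch that replays the scenario from this section against an in-memory stand-in for Redis. The class FakeRedisLockStore, its method names, and the injected clock are assumptions made for illustration; they are not part of Redis or of any client library. In real code tryLock would issue SET key token NX PX ttl, and unlock would run the Lua script above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// In-memory stand-in for a Redis server (hypothetical, for illustration only).
class FakeRedisLockStore {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();
    private final LongSupplier clock; // injected so that expiry can be replayed deterministically

    FakeRedisLockStore(LongSupplier clock) {
        this.clock = clock;
    }

    // Returns the entry for key, dropping it first if it has expired.
    private Entry live(String key) {
        Entry entry = data.get(key);
        if (entry != null && entry.expiresAtMillis <= clock.getAsLong()) {
            data.remove(key);
            return null;
        }
        return entry;
    }

    // Mimics: SET key token NX PX ttlMillis
    synchronized boolean tryLock(String key, String token, long ttlMillis) {
        if (live(key) != null) {
            return false;
        }
        data.put(key, new Entry(token, clock.getAsLong() + ttlMillis));
        return true;
    }

    // Mimics the compare-and-delete script: delete only if the stored value is our token.
    synchronized boolean unlock(String key, String token) {
        Entry entry = live(key);
        if (entry == null || !entry.value.equals(token)) {
            return false;
        }
        data.remove(key);
        return true;
    }
}
```

With this check in place, a thread A whose lock has already expired gets false back from unlock and leaves the lock of thread B untouched. Each caller is assumed to generate its own token, for example with UUID.randomUUID().toString().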

3. Lock expiry leads to concurrent execution

Suppose thread A acquires the lock with an expiration time of 30 seconds, but A runs for longer than 30 seconds, so the lock expires and is released automatically. Thread B then acquires the lock, and threads A and B run concurrently.

Letting A and B run concurrently is clearly unacceptable. There are generally two ways to deal with this:

Set the expiration time long enough that the business logic is sure to finish before the lock expires.
Start a daemon thread for the thread that holds the lock, which extends the expiration time of a lock that is about to expire but has not been released yet (automatic renewal).
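
The automatic renewal idea can be sketched in Java with a scheduled daemon thread. LockWatchdog and the renew callback are hypothetical names used only for this sketch. The callback is assumed to extend the expiration time in Redis only if the lock is still held by the caller, and to return false otherwise. Renewing at one third of the expiration time is also just an assumption of this sketch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Periodically renews a lock until it is closed or the renewal reports that the lock is lost.
class LockWatchdog implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    // renew: hypothetical callback that extends the lock's expiration time in Redis
    //        and returns false when the lock is no longer held by this client.
    LockWatchdog(BooleanSupplier renew, long ttlMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "lock-watchdog");
            thread.setDaemon(true); // must not keep the JVM alive
            return thread;
        });
        long period = Math.max(1L, ttlMillis / 3);
        scheduler.scheduleAtFixedRate(() -> {
            if (!renew.getAsBoolean()) {
                // An exception thrown by the task suppresses all later executions.
                throw new IllegalStateException("lock no longer held");
            }
        }, period, period, TimeUnit.MILLISECONDS);
    }

    // Stop renewing; call this right before releasing the lock.
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The watchdog would be created right after the lock is acquired and closed in the finally block before the lock is released. If the process dies, the renewals stop with it and the lock still expires on its own.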

4. No reentrancy

A lock is reentrant if a thread that already holds it can acquire it again. With a non-reentrant lock the second acquisition fails, because the lock is already held. A Redis lock can be made reentrant by counting acquisitions: add 1 on each lock, subtract 1 on each unlock, and release the lock when the count reaches 0.

One option is to record the reentry count locally, for example with a ThreadLocal in Java. Simple example code:

private static final ThreadLocal<Map<String, Integer>> LOCKERS = ThreadLocal.withInitial(HashMap::new);

// Lock
public boolean lock(String key) {
  Map<String, Integer> lockers = LOCKERS.get();
  Integer count = lockers.get(key);
  if (count != null) {
    // Already held by this thread: only increase the local count
    lockers.put(key, count + 1);
    return true;
  }
  if (SET key uuid NX EX 30) {   // pseudocode for the Redis call
    lockers.put(key, 1);
    return true;
  }
  return false;
}

// Unlock
public void unlock(String key) {
  Map<String, Integer> lockers = LOCKERS.get();
  Integer count = lockers.get(key);
  if (count == null) {
    return;                      // this thread does not hold the lock, nothing to release
  }
  if (count <= 1) {
    lockers.remove(key);
    DEL key                      // pseudocode for the Redis call
  } else {
    lockers.put(key, count - 1);
  }
}

Recording the reentry count locally is efficient, but the code becomes more complex once the expiration time and the consistency between local state and Redis are taken into account. Another option is to implement the lock with a Redis hash, which stores both the lock identifier and the reentry count. Redisson lock example:

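-- KEYS[1] = lock_key, ARGV[1] = expiration time in milliseconds, ARGV[2] = thread identifier
-- If lock_key does not exist
if (redis.call('exists', KEYS[1]) == 0)
then
    -- Lock by setting the count of the thread identifier to 1
    redis.call('hset', KEYS[1], ARGV[2], 1);
    -- Set the expiration time
    redis.call('pexpire', KEYS[1], ARGV[1]);
    return nil;
    end;
-- If lock_key exists and is held by the thread that is trying to lock
if (redis.call('hexists', KEYS[1], ARGV[2]) == 1)
    -- Increment the reentry count
    then redis.call('hincrby', KEYS[1], ARGV[2], 1);
    -- Reset the expiration time
    redis.call('pexpire', KEYS[1], ARGV[1]);
    return nil;
    end;
-- If locking fails, return the remaining time to live of the lock
return redis.call('pttl', KEYS[1]);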

5. Unable to wait for lock release

The commands above return immediately. A client that would rather wait for the lock to be released has no way to do so.

This can be solved by client-side polling: when the lock is not acquired, wait for a while and try again, until the lock is acquired or the wait times out. Polling consumes more server resources and hurts server efficiency under high concurrency.
Another way is to use the publish/subscribe feature of Redis: when acquiring the lock fails, subscribe to the lock release message, and when a holder releases the lock, publish that message.
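
The client polling approach can be sketched in Java as a retry loop with a deadline. PollingLock, tryLockWithin, and the tryAcquire callback are hypothetical names for this sketch. tryAcquire is assumed to wrap a single non-blocking lock attempt such as SET key uuid NX EX 30.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class PollingLock {
    // Retries tryAcquire every retryIntervalMillis until it succeeds or waitMillis has passed.
    // Returns true if the lock was acquired, false on timeout or interruption.
    static boolean tryLockWithin(BooleanSupplier tryAcquire, long waitMillis, long retryIntervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitMillis);
        while (true) {
            if (tryAcquire.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // waited long enough, give up
            }
            long sleepNanos = Math.min(remainingNanos, TimeUnit.MILLISECONDS.toNanos(retryIntervalMillis));
            try {
                TimeUnit.NANOSECONDS.sleep(sleepNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A shorter retry interval lowers the waiting time after a release but sends more requests to Redis, which is the cost described above.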

Redis also has other distributed lock algorithms, such as Redlock.

Because Redlock is controversial (it has problems in extreme cases), this article does not cover it for now.

Cluster

1. Master-slave failover

To keep Redis available, it is usually deployed in master-slave mode. In general, master-slave data can be synchronized either synchronously or asynchronously, and Redis replicates asynchronously: the master records write commands in a local memory buffer and then sends the buffered commands to the slave asynchronously. The slave executes the command stream to reach the same state as the master, and reports its replication progress back to the master.

In any cluster deployment that includes master-slave mode, a slave takes over when the master fails, and the client does not notice the switch. Suppose client A acquires the lock, but the command has not been replicated yet when the master goes down and the slave is promoted. The new master has no record of the lock, so when client B tries to lock, it succeeds as well.
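
The failover sequence can be replayed with a toy Java model. This is not Redis code: FailoverDemo is a hypothetical class in which the master and the slave are plain maps, and asynchronous replication is an explicit step that may not happen before the failure.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of one master and one slave with asynchronous replication.
class FailoverDemo {
    private Map<String, String> master = new HashMap<>();
    private Map<String, String> slave = new HashMap<>();

    // SETNX against whichever node is currently the master.
    boolean tryLock(String key, String owner) {
        return master.putIfAbsent(key, owner) == null;
    }

    // Asynchronous replication, modelled as a separate step.
    void replicate() {
        slave.clear();
        slave.putAll(master);
    }

    // The master fails and the slave is promoted with whatever data it has received so far.
    void failover() {
        master = slave;
        slave = new HashMap<>();
    }
}
```

If failover() runs before replicate(), a tryLock call by client B on the same key returns true even though client A has not released the lock.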

2. Cluster split brain

Split brain means that, because of network problems, the Redis master ends up in a different network partition from the slaves and the sentinel cluster. Since the sentinels can no longer see the master, they promote a slave to master, and there are now two different master nodes. The same can happen in a Redis Cluster deployment.

When different clients connect to different masters, two clients can hold the same lock at the same time.

Conclusion

Redis is known for its high performance, but using it to implement distributed locks still has several pitfalls. A Redis distributed lock should only be treated as a way to reduce concurrency conflicts. To rule them out completely, concurrency control in the database itself is still required.
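
One common database-side safeguard is an optimistic version check, where an update only succeeds if the row still has the version that was read earlier. The Java sketch below models a single row in memory. VersionedStock and its fields are hypothetical names, and the SQL in the comment only shows the general shape of such a statement under the assumption of a table with a version column.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory model of one database row protected by a version column.
class VersionedStock {
    static final class Row {
        final long version;
        final int quantity;

        Row(long version, int quantity) {
            this.version = version;
            this.quantity = quantity;
        }
    }

    private final AtomicReference<Row> row;

    VersionedStock(int initialQuantity) {
        row = new AtomicReference<>(new Row(0L, initialQuantity));
    }

    Row read() {
        return row.get();
    }

    // Same idea as:
    //   UPDATE stock SET quantity = ?, version = version + 1 WHERE id = ? AND version = ?
    // The write is rejected if another writer has changed the row since it was read.
    boolean update(Row readEarlier, int newQuantity) {
        Row current = row.get();
        if (current.version != readEarlier.version) {
            return false;
        }
        return row.compareAndSet(current, new Row(current.version + 1, newQuantity));
    }
}
```

If two clients both believe they hold the Redis lock and both read version 0, only the first update succeeds. The second one is rejected and has to re-read and retry, so no update is silently lost.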