1. Non-atomic operations (setnx + expire)
When it comes to implementing a distributed lock with Redis, many people immediately think of the setnx + expire commands: setnx grabs the lock, and expire then sets an expiration time on the lock once it has been grabbed.
The pseudocode is as follows:
if (jedis.setnx(lock_key, lock_value) == 1) { // acquire the lock
    jedis.expire(lock_key, timeout); // set the expiration time
    doBusiness(); // business logic
}
This code has a pitfall: setnx and expire are two separate commands, so the operation is not atomic. If the process crashes or is restarted for maintenance after setnx has succeeded but before expire is executed, the lock becomes "immortal" and other threads can never acquire it.
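The failure window can be reproduced without a Redis server. The sketch below is a hypothetical in-memory stand-in (`TinyStore`, its `clock` field, and `NonAtomicLockDemo` are illustrative names, not Jedis or Redis APIs) that mimics only the setnx, expire, and key-expiry behaviour discussed here, and assumes the client dies between the two calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the few Redis behaviours used here.
// It is NOT a Redis client; time is a manually advanced counter.
class TinyStore {
    long clock = 0;                                   // fake "now" in milliseconds
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> deadlines = new HashMap<>();

    private void evictIfExpired(String key) {
        Long deadline = deadlines.get(key);
        if (deadline != null && clock >= deadline) {
            values.remove(key);
            deadlines.remove(key);
        }
    }

    // Mimics SETNX: returns 1 if the key was set, 0 if it already existed.
    long setnx(String key, String value) {
        evictIfExpired(key);
        if (values.containsKey(key)) return 0;
        values.put(key, value);
        return 1;
    }

    // Mimics EXPIRE, with the timeout given in milliseconds for simplicity.
    void expire(String key, long ttlMillis) {
        if (values.containsKey(key)) deadlines.put(key, clock + ttlMillis);
    }
}

class NonAtomicLockDemo {
    // Returns true if a second client can get the lock one hour after the first
    // client took it. crashBeforeExpire simulates dying between setnx and expire.
    static boolean secondClientGetsLock(boolean crashBeforeExpire) {
        TinyStore store = new TinyStore();
        if (store.setnx("lock_key", "client-1") == 1 && !crashBeforeExpire) {
            store.expire("lock_key", 3_000);
        }
        store.clock += 3_600_000;                     // one hour later
        return store.setnx("lock_key", "client-2") == 1;
    }
}
```

With `crashBeforeExpire` set to true the key never receives a deadline, so the second client is still locked out an hour later; with false the lock expires and can be taken again.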
2. Overwritten by other client requests (setnx + value is the expiration time)
To solve the problem that the lock cannot be released when an exception occurs, some people suggest storing the expiration time in the value passed to setnx. If acquiring the lock fails, read the value and compare it with the current system time to check whether the lock has expired. The pseudocode is as follows:
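// Enclosing method signature assumed; the original snippet shows only the body
public boolean tryLock(String lock_key, long timeout) {
    long expireTime = System.currentTimeMillis() + timeout; // current system time + timeout
    String expireTimeStr = String.valueOf(expireTime);      // convert to String
    // If the lock does not exist yet, locking succeeds
    if (jedis.setnx(lock_key, expireTimeStr) == 1) {
        return true;
    }
    // The lock already exists: read its expiration time
    String oldExpireTimeStr = jedis.get(lock_key);
    // If the stored expiration time is earlier than the current system time, the lock has expired
    if (oldExpireTimeStr != null && Long.parseLong(oldExpireTimeStr) < System.currentTimeMillis()) {
        // The lock has expired: set the new expiration time and get back the previous one
        // (see the Redis documentation for the GETSET command)
        String oldValueStr = jedis.getSet(lock_key, expireTimeStr);
        if (oldValueStr != null && oldValueStr.equals(oldExpireTimeStr)) {
            // Under concurrency, only the thread whose getSet returns the value it read earlier gets the lock
            return true;
        }
    }
    // In all other cases, locking fails
    return false;
}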
This scheme also has a pitfall: if multiple clients request the lock at the moment it expires, all of them execute jedis.getSet(). In the end only one client acquires the lock, but the expiration time of that client's lock may be overwritten by the other clients.
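The overwrite can be replayed step by step. The sketch below is a hypothetical single-threaded replay (the class `GetSetRaceDemo`, the clients C1 and C2, and all timestamps are made up for illustration); a plain map stands in for Redis, and `getSet` mimics the GETSET command by storing the new value and returning the old one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical replay of the race; the map stands in for Redis.
class GetSetRaceDemo {
    private final Map<String, String> store = new HashMap<>();

    // Mimics GETSET: store the new value and return the previous one.
    String getSet(String key, String value) {
        return store.put(key, value);
    }

    // Replays: the lock expired at t=1000ms. Clients C1 and C2 both read the
    // stale value, then call getSet one after the other.
    // Returns {c1Won, c2Won, expiry finally stored in the lock}.
    String[] replay() {
        store.put("lock_key", "1000");                // expired lock left behind
        String seenByC1 = store.get("lock_key");
        String seenByC2 = store.get("lock_key");

        String c1Expiry = "7000";                     // C1: now=2000ms + 5000ms timeout
        String c2Expiry = "7050";                     // C2: now=2050ms + 5000ms timeout

        boolean c1Won = seenByC1.equals(getSet("lock_key", c1Expiry));
        boolean c2Won = seenByC2.equals(getSet("lock_key", c2Expiry));
        return new String[] {
            String.valueOf(c1Won), String.valueOf(c2Won), store.get("lock_key")
        };
    }
}
```

In this replay C1 wins and C2 loses, yet the value left in the key is C2's expiry, so the lock C1 holds now carries an expiration time that C1 never set.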
3. Forgot to set the expiration time
In a previous code review, I saw a distributed lock implemented like this (pseudocode):
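try {
    if (jedis.setnx(lockKey, lock_value) == 1) { // acquire the lock
        doBusiness(); // business logic
        return true; // lock acquired; return after the business logic finishes
    }
    return false; // failed to acquire the lock
} finally {
    unlock(lockKey); // release the lock
}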
What's wrong with this code? It forgets to set the expiration time. If the machine suddenly goes down while the program is running, execution never reaches the finally block, so the lock is not deleted before the shutdown. In that case there is no way to guarantee unlocking, so lockKey needs an expiration time. When using distributed locks, you must always set an expiration time.
4. Forgot to release the lock after processing the business logic
Many people use the extended parameters of the Redis set command to implement distributed locks.
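Extended parameters of the set command: SET key value [EX seconds] [PX milliseconds] [NX|XX]
- NX: set the key only if it does not exist, so only the first client request gets the lock, and other clients can acquire it only after it has been released.
- EX seconds: set the key's expiration time, in seconds.
- PX milliseconds: set the key's expiration time, in milliseconds.
- XX: set the value only if the key already exists.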
They then write pseudocode like this:
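// SET replies "OK" on success, so compare against "OK" rather than 1
if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // acquire the lock
    doBusiness(); // business logic
    return true; // lock acquired; return after the business logic finishes
}
return false; // failed to acquire the lock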
At first glance nothing seems wrong with this pseudocode, but on reflection it is not quite right: it forgets to release the lock. If every successful acquisition holds the lock until the timeout expires, other clients are blocked for the full timeout even when the work finished long ago. This is inefficient; the lock should be released as soon as the business logic has been processed.
For example:
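try {
    // SET replies "OK" on success, so compare against "OK" rather than 1
    if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // acquire the lock
        doBusiness(); // business logic
        return true; // lock acquired; return after the business logic finishes
    }
    return false; // failed to acquire the lock
} finally {
    unlock(lockKey); // release the lock
}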
5. B's lock is released by A
Let's look at this piece of pseudocode:
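try {
    // SET replies "OK" on success, so compare against "OK" rather than 1
    if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // acquire the lock
        doBusiness(); // business logic
        return true; // lock acquired; return after the business logic finishes
    }
    return false; // failed to acquire the lock
} finally {
    unlock(lockKey); // release the lock
}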
What pitfalls do you think this code has?
Consider this concurrency scenario: threads A and B both try to lock the Redis key lockKey, and thread A gets the lock first (suppose the lock times out after 3 seconds). If thread A's business logic is time-consuming and still has not finished after more than 3 seconds, Redis automatically releases the lock on lockKey. Just then thread B arrives, grabs the lock, and starts executing its own business logic. When thread A finally finishes its logic and releases the lock, it releases B's lock.
The correct approach is to store a unique tag for the requesting thread, for example requestId, as the value when locking with the extended set parameters, and to check when releasing the lock that the stored value is still this request's tag.
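try {
    // SET replies "OK" on success, so compare against "OK" rather than 1
    if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // acquire the lock
        doBusiness(); // business logic
        return true; // lock acquired; return after the business logic finishes
    }
    return false; // failed to acquire the lock
} finally {
    if (requestId.equals(jedis.get(lockKey))) { // check that the lock holds our own requestId
        unlock(lockKey); // release the lock
    }
}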
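The A/B scenario can be replayed deterministically. The sketch below uses a hypothetical in-memory stand-in for Redis (`ExpiringStore`, `setNxPx`, and `OwnerCheckDemo` are illustrative names, not Jedis APIs) with a manually advanced clock, and compares an unconditional unlock against one that first checks the stored requestId.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for Redis with a manually advanced clock.
class ExpiringStore {
    long clock = 0;                                   // fake "now" in milliseconds
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> deadlines = new HashMap<>();

    String get(String key) {
        Long deadline = deadlines.get(key);
        if (deadline != null && clock >= deadline) {  // key has expired
            values.remove(key);
            deadlines.remove(key);
        }
        return values.get(key);
    }

    // Mimics SET key value NX PX ttl: succeeds only if the key is absent.
    boolean setNxPx(String key, String value, long ttlMillis) {
        if (get(key) != null) return false;
        values.put(key, value);
        deadlines.put(key, clock + ttlMillis);
        return true;
    }

    void del(String key) {
        values.remove(key);
        deadlines.remove(key);
    }
}

class OwnerCheckDemo {
    // Returns the lock holder after A's late unlock; null means nobody holds it.
    static String holderAfterLateUnlock(boolean checkOwner) {
        ExpiringStore redis = new ExpiringStore();
        redis.setNxPx("lockKey", "request-A", 3_000); // A locks for 3 seconds
        redis.clock += 3_500;                         // A's work overruns the TTL
        redis.setNxPx("lockKey", "request-B", 3_000); // B acquires the expired lock
        // A finishes and unlocks, with or without checking the stored requestId.
        if (!checkOwner || "request-A".equals(redis.get("lockKey"))) {
            redis.del("lockKey");
        }
        return redis.get("lockKey");
    }
}
```

Without the check, A's unlock deletes B's lock and the key ends up empty; with the check, B is still the holder after A finishes.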
6. Releasing the lock is not atomic
The above piece of code still has pitfalls:
if (requestId.equals(jedis.get(lockKey))) { // check that the lock holds our own requestId
    unlock(lockKey); // release the lock
}
This is because checking whether the lock was added by the current thread and releasing the lock are not a single atomic operation. If the lock has already expired by the time unlock(lockKey) is called, it may no longer belong to the current client, and a lock added by someone else will be released.
The pitfall, then, is that the check and the delete are two separate operations: they are not atomic, so there is a consistency problem. Releasing the lock must be atomic. This can be done with a Redis + Lua script such as the following:
if redis.call('get',KEYS[1]) == ARGV[1] then
return redis.call('del',KEYS[1])
else
return 0
end;
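The guarantee the script provides is an atomic compare-and-delete. The sketch below is an in-process analogy only, not a way to call the script: `ConcurrentHashMap.remove(key, value)` removes an entry only if it currently maps to the given value, in one atomic step. The class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-process analogy only: ConcurrentHashMap.remove(key, value) is an atomic
// compare-and-delete, the same guarantee the Lua script gives on the Redis side.
class AtomicReleaseDemo {
    private final ConcurrentHashMap<String, String> locks = new ConcurrentHashMap<>();

    // Succeeds only if nobody currently holds lockKey.
    boolean tryLock(String lockKey, String requestId) {
        return locks.putIfAbsent(lockKey, requestId) == null;
    }

    // Mirrors the script: delete only if the stored value equals requestId.
    // Returns 1 when the lock was deleted and 0 otherwise, as the script does.
    long release(String lockKey, String requestId) {
        return locks.remove(lockKey, requestId) ? 1 : 0;
    }
}
```

Because the comparison and the removal happen in one step, there is no window in which the lock can change hands between the check and the delete.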
7. The lock expires and is released before the business logic completes
After locking, once the timeout expires, Redis automatically releases and clears the lock. The lock may therefore be released before the business logic has finished. What can be done?
Some people think it is enough to set a somewhat longer expiration time. A better idea is to start a timed daemon thread for the thread that acquired the lock, and have it check at regular intervals whether the lock still exists. If it does, the daemon extends the lock's expiration time so that the lock is not released early.
The open source framework Redisson solves this problem. Here is how it works under the hood:
As soon as a thread acquires the lock, Redisson starts a watch dog: a background thread that checks every 10 seconds and, if thread 1 still holds the lock, keeps extending the lifetime of the lock key. This is how Redisson solves the problem of the lock expiring and being released before the business logic completes.
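The renewal idea can be sketched in plain Java. This is not Redisson's implementation: `Lease` is a hypothetical in-memory object standing in for a Redis key with a TTL, and `WatchdogSketch` is an illustrative helper that renews it from a daemon thread while the owner still holds it.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical lease object standing in for a Redis key with a TTL.
class Lease {
    private String owner;
    private long deadlineMillis;

    synchronized boolean acquire(String requestId, long nowMillis, long ttlMillis) {
        if (owner != null && nowMillis < deadlineMillis) return false; // still held
        owner = requestId;
        deadlineMillis = nowMillis + ttlMillis;
        return true;
    }

    // Extends the deadline only if requestId still owns an unexpired lease.
    synchronized boolean renew(String requestId, long nowMillis, long ttlMillis) {
        if (!requestId.equals(owner) || nowMillis >= deadlineMillis) return false;
        deadlineMillis = nowMillis + ttlMillis;
        return true;
    }

    synchronized void release(String requestId) {
        if (requestId.equals(owner)) owner = null;
    }

    synchronized long deadline() {
        return deadlineMillis;
    }
}

class WatchdogSketch {
    // Starts a daemon thread that tries to renew the lease at a fixed interval.
    // The caller shuts the returned executor down when the business logic ends.
    static ScheduledExecutorService start(Lease lease, String requestId,
                                          long ttlMillis, long intervalMillis) {
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-watchdog");
            t.setDaemon(true);
            return t;
        });
        watchdog.scheduleAtFixedRate(
            () -> lease.renew(requestId, System.currentTimeMillis(), ttlMillis),
            intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return watchdog;
    }
}
```

A caller would start the watchdog right after acquiring the lease, run the business logic, and then call `shutdownNow()` on the executor and `release(requestId)` in a finally block. The renewal must stop when the work ends; otherwise the lock would never expire.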
8. Redis distributed lock becomes ineffective when used with @Transactional
Let's take a look at this pseudocode:
@Transactional
public void updateDB(int lockKey) {
    boolean lockFlag = redisLock.lock(lockKey);
    if (!lockFlag) {
        throw new RuntimeException("Please try again later");
    }
    doBusiness(); // business logic
    redisLock.unlock(lockKey);
}
Here a Redis distributed lock is used inside a transaction. When the method is invoked, the transaction starts first, and then the Redis distributed lock is acquired. When the code finishes, the Redis distributed lock is released first, and only then is the transaction committed and ended. The distributed lock has therefore already been released before the transaction commits, which makes the lock ineffective.
This is because Spring implements transactions with AOP: the transaction is opened before the updateDB method runs, and only then is the lock acquired. The transaction is committed after the locked code has finished. The locked code block therefore runs inside the transaction, so when it finishes, the lock has been released but the transaction has not yet been committed. Another thread that acquires the lock at this point and runs the locked code block reads inventory data that is not the latest.
The correct approach is to acquire the lock before calling the updateDB method, that is, before the transaction is opened, and to release it only after the transaction has committed. Thread safety is then guaranteed.
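The intended ordering can be made concrete with a small sketch. It involves no real lock, Spring transaction, or database: the hypothetical class `LockOutsideTransaction` only records the order of events, with `updateDbInTransaction` standing in for the @Transactional method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the event list only records ordering; no real lock,
// Spring transaction, or database is involved.
class LockOutsideTransaction {
    final List<String> events = new ArrayList<>();

    // Stands in for the @Transactional method: begin, run business logic, commit.
    private void updateDbInTransaction(Runnable business) {
        events.add("begin");
        business.run();
        events.add("commit");
    }

    // Caller-side wrapper: acquire the lock first, release it only after commit.
    void updateWithLock(Runnable business) {
        events.add("lock");
        try {
            updateDbInTransaction(business);
        } finally {
            events.add("unlock");
        }
    }
}
```

In a Spring application the wrapper would usually live in a separate bean from the @Transactional method, because calling a @Transactional method on `this` bypasses the proxy and no transaction is started.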
9. Lock reentrancy
The Redis distributed locks discussed above are not reentrant.
Non-reentrant means that once the current thread has acquired the lock in some method, trying to acquire the lock again inside that method blocks, and the lock cannot be acquired a second time. The same holder can take a given lock only once, not 2 times at the same time.
Non-reentrant distributed locks are sufficient for most business scenarios, but some scenarios do need reentrant distributed locks. When implementing a distributed lock, consider whether your current business scenario needs reentrancy.
With Redis, a reentrant lock can be implemented once these two problems are solved:
- How to record which thread currently holds the lock
- How to maintain the lock count (that is, how many times the holder has re-entered)
To implement a reentrant distributed lock, we can borrow the design of the JDK's ReentrantLock. In practice, you can simply use the Redisson framework, which supports reentrant locks.
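The two pieces of state listed above, the holder and the hold count, are enough for a minimal sketch. The class below is hypothetical and keeps everything in memory; with Redis, one possible layout would keep the same two facts under the lock key and update them atomically.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch: lockKey -> (owner id, hold count).
class ReentrantLockSketch {
    private static final class Hold {
        final String ownerId;
        int count = 1;
        Hold(String ownerId) { this.ownerId = ownerId; }
    }

    private final Map<String, Hold> holds = new HashMap<>();

    synchronized boolean tryLock(String lockKey, String ownerId) {
        Hold hold = holds.get(lockKey);
        if (hold == null) {                            // free: first acquisition
            holds.put(lockKey, new Hold(ownerId));
            return true;
        }
        if (!hold.ownerId.equals(ownerId)) return false; // held by someone else
        hold.count++;                                  // re-entry by the owner
        return true;
    }

    // Returns the remaining hold count (0 means fully released),
    // or -1 if ownerId does not hold the lock.
    synchronized int unlock(String lockKey, String ownerId) {
        Hold hold = holds.get(lockKey);
        if (hold == null || !hold.ownerId.equals(ownerId)) return -1;
        hold.count--;
        if (hold.count == 0) holds.remove(lockKey);
        return hold.count;
    }
}
```

The lock becomes available to other owners only when the count drops back to zero, so every successful tryLock needs a matching unlock.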
10. Pitfalls caused by Redis master-slave replication
When implementing Redis distributed locks, watch out for the pitfalls of Redis master-slave replication, because Redis is generally deployed as a cluster:
Suppose thread one acquires the lock on the Redis master node, but the lock key has not yet been synchronized to the slave node. If the master node fails at just this moment, a slave node is promoted to master. Thread two can then acquire the lock on the same key even though thread one already holds it, and the safety of the lock is lost.
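The failover scenario can be modelled in a few lines. The sketch below is a hypothetical model of asynchronous replication (the class `FailoverDemo` and its methods are illustrative, not a Redis client): the master is a map, the slave receives copies only when `replicate()` is called, and `failover()` promotes the slave with whatever it has received so far.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of asynchronous master-slave replication; not a Redis client.
class FailoverDemo {
    private Map<String, String> master = new HashMap<>();
    private final Map<String, String> slave = new HashMap<>();

    // Locks are always taken on the current master.
    boolean tryLock(String key, String requestId) {
        return master.putIfAbsent(key, requestId) == null;
    }

    // Replication is asynchronous: it happens some time after the write.
    void replicate() {
        slave.putAll(master);
    }

    // The master dies and the slave is promoted with whatever it has received.
    void failover() {
        master = new HashMap<>(slave);
    }

    // Returns true if both threads end up believing they hold the same lock.
    static boolean bothThreadsHoldLock(boolean replicatedBeforeCrash) {
        FailoverDemo redis = new FailoverDemo();
        boolean threadOne = redis.tryLock("lockKey", "thread-1");
        if (replicatedBeforeCrash) redis.replicate();
        redis.failover();
        boolean threadTwo = redis.tryLock("lockKey", "thread-2");
        return threadOne && threadTwo;
    }
}
```

When the crash happens before replication, the promoted node has no record of the lock and thread two acquires it as well; when replication completed first, thread two is refused.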
To solve this problem, Redis author antirez proposed an advanced distributed lock algorithm: Redlock. The core idea of Redlock is this:
Deploy multiple Redis masters so that they do not all go down at the same time. These master nodes are completely independent of each other, with no data synchronization between them. The same method used to acquire and release a lock on a single Redis instance must be used on each of the master instances.
Assume there are currently 5 Redis master nodes, with these Redis instances running on 5 servers.
The implementation steps of RedLock are as follows:
- Get the current time in milliseconds.
- Request the lock from the 5 master nodes in order. The client sets a network connection and response timeout, which should be shorter than the lock's expiration time. (Assuming the lock expires automatically after 10 seconds, the timeout is generally between 5 and 50 milliseconds; let's assume it is 50ms.) If a request times out, skip that master node and try the next master node as soon as possible.
- The client subtracts the time at which it started acquiring the lock (the time recorded in step 1) from the current time to obtain the time spent acquiring the lock. The lock is considered acquired if and only if more than half of the Redis master nodes (N/2+1, here 5/2+1=3 nodes) have granted the lock and the time spent is less than the lock's expiration time. (For example, 10s > 30ms+40ms+50ms+40ms+50ms.)
- If the lock is acquired, the real validity time of the lock key changes: the time spent acquiring the lock must be subtracted from it.
- If acquiring the lock fails (fewer than N/2+1 master instances granted the lock, or the acquisition time exceeded the validity time), the client must unlock on all master nodes, even those where locking did not succeed at all, so that none slips through the net.
The simplified steps are:
- Request the lock from the 5 master nodes in sequence.
- Use the configured timeout to decide whether to skip a master node.
- If 3 or more nodes are locked successfully and the time spent is less than the lock's validity period, the lock is considered acquired.
- If acquiring the lock fails, unlock on all nodes.
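The simplified steps can be sketched as follows. This is a minimal simulation under stated assumptions, not a Redlock implementation: `LockNode` is a hypothetical in-memory stand-in for one independent master, the elapsed time is a simulated sum of per-node latencies rather than a measured one, and the clock-drift allowance of the full algorithm is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one independent Redis master.
class LockNode {
    private final Map<String, String> data = new HashMap<>();
    boolean up = true;                   // false simulates a node that times out
    long latencyMillis = 0;              // simulated time spent on one request

    boolean tryLock(String key, String requestId) {
        return up && data.putIfAbsent(key, requestId) == null;
    }

    void unlock(String key, String requestId) {
        data.remove(key, requestId);     // delete only if requestId owns the key
    }
}

class RedlockSketch {
    // Returns the remaining validity in ms if the lock was acquired, or -1 on failure.
    static long tryLock(List<LockNode> nodes, String key, String requestId, long ttlMillis) {
        long elapsed = 0;                // simulated clock: sum of per-node latencies
        int acquired = 0;
        for (LockNode node : nodes) {    // ask every master in order
            elapsed += node.latencyMillis;
            if (node.tryLock(key, requestId)) acquired++;
        }
        int quorum = nodes.size() / 2 + 1;
        if (acquired >= quorum && elapsed < ttlMillis) {
            return ttlMillis - elapsed;  // validity = TTL minus time spent acquiring
        }
        for (LockNode node : nodes) {    // failed: unlock on every node
            node.unlock(key, requestId);
        }
        return -1;
    }
}
```

With five nodes whose simulated latencies are 30, 40, 50, 40, and 50 ms and a 10 second TTL, a successful acquisition leaves 10000 - 210 = 9790 ms of validity. If three of the five nodes are down, only two grants are collected, the quorum of 3 is missed, and the client unlocks everywhere.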