Slipped Conditions

Slipped conditions occur when, between the moment a thread checks a condition and the moment it acts on that condition, another thread changes it, so the first thread acts on a stale result. Here is a simple example:

public class Lock {
    private boolean isLocked = false;

    public void lock(){
      synchronized(this){
        while(isLocked){
          try{
            this.wait();
          } catch(InterruptedException e){
            //do nothing, keep waiting
          }
        }
      }

      synchronized(this){
        isLocked = true;
      }
    }

    public synchronized void unlock(){
      isLocked = false;
      this.notify();
    }
}

As we can see, the lock() method contains two synchronized blocks. The first block waits until isLocked is false and then exits. The second block sets isLocked to true, locking the Lock instance so that other threads cannot get past lock().

Imagine that isLocked is false and two threads call lock() at the same time. The first thread enters the first synchronized block, finds that isLocked is false, and leaves the block. If the second thread is allowed to run at this point, it too enters the first synchronized block and also finds that isLocked is false. Both threads have now read the condition as false, so both proceed to the second synchronized block, set isLocked to true, and return from lock() believing they hold the lock.

This is an example of slipped conditions: both threads check the condition and then leave the synchronized block, which allows other threads to check the condition before either of the two changes it. In other words, the condition can be checked or changed by another thread between the time a thread checks it and the time that same thread changes it.
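The interleaving described above normally depends on scheduling luck, which makes it hard to observe. The sketch below forces it. It is an illustration only: the BrokenLock class, the CyclicBarrier placed in the gap between the two synchronized blocks, and the countHolders() helper are assumptions added for this demonstration and are not part of the original Lock class.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class SlippedConditionDemo {

    // Same shape as the broken Lock above, starting unlocked, with a barrier
    // inserted between the check and the set to force the bad interleaving.
    static class BrokenLock {
        private boolean isLocked = false;
        private final CyclicBarrier gap = new CyclicBarrier(2);

        public void lock() throws Exception {
            synchronized (this) {
                while (isLocked) {
                    this.wait();
                }
            }
            // Hypothetical test hook: wait here until both threads have
            // passed the check and seen isLocked == false.
            gap.await();
            synchronized (this) {
                isLocked = true;
            }
        }
    }

    // Returns how many threads got through lock() at the same time.
    public static int countHolders() {
        BrokenLock lock = new BrokenLock();
        AtomicInteger holders = new AtomicInteger();
        Runnable task = () -> {
            try {
                lock.lock();
                holders.incrementAndGet();
            } catch (Exception e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return holders.get();
    }

    public static void main(String[] args) {
        // Prints 2: both threads believe they hold the lock.
        System.out.println("threads holding the lock: " + countHolders());
    }
}
```

A correct lock would let only one thread through. Here the barrier guarantees that neither thread sets isLocked before the other has finished checking it, so both get through every time.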

To avoid slipped conditions, checking and setting the condition must be atomic: while one thread is checking and setting the condition, no other thread may check it.

The fix for the example above is simple: move the isLocked = true line up into the first synchronized block, right after the while loop:
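As an aside, one way to see what an atomic check-and-set looks like is java.util.concurrent.atomic.AtomicBoolean, whose compareAndSet(expected, newValue) performs the test and the update as a single step. The sketch below is not the approach this article takes, and the CasTryLock class is a hypothetical name used only for illustration. It is also a non-blocking tryLock, not a replacement for the waiting lock() discussed here.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CasTryLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Succeeds only if the flag was false. The check and the set happen
    // as one atomic step, so two callers can never both succeed.
    public boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    public void unlock() {
        locked.set(false);
    }

    public static void main(String[] args) {
        CasTryLock lock = new CasTryLock();
        System.out.println(lock.tryLock()); // true: the flag was false
        System.out.println(lock.tryLock()); // false: already locked
        lock.unlock();
        System.out.println(lock.tryLock()); // true again after unlock
    }
}
```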

public class Lock {
    private boolean isLocked = false;

    public void lock(){
      synchronized(this){
        while(isLocked){
          try{
            this.wait();
          } catch(InterruptedException e){
            //do nothing, keep waiting
          }
        }
        isLocked = true;
      }
    }

    public synchronized void unlock(){
      isLocked = false;
      this.notify();
    }
}

Checking and setting the isLocked condition is now performed atomically in the same synchronized block.
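To check that a lock of this shape really gives mutual exclusion, a shared counter can be incremented from several threads. The sketch below is a minimal test harness under a few assumptions: the lock is renamed SimpleLock, it starts unlocked, it restores the interrupt flag after waiting (the original simply ignores interrupts), and the runCounter() helper is hypothetical.

```java
class SimpleLockDemo {

    // The corrected lock: the check and the set share one monitor.
    static class SimpleLock {
        private boolean isLocked = false;

        public synchronized void lock() {
            boolean interrupted = false;
            while (isLocked) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, remember the interrupt
                }
            }
            isLocked = true; // set in the same synchronized method as the check
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }

        public synchronized void unlock() {
            isLocked = false;
            notify();
        }
    }

    // Increments an unsynchronized counter from several threads under the lock.
    public static int runCounter(int threads, int incrementsPerThread) {
        SimpleLock lock = new SimpleLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // With mutual exclusion intact no increment is lost: prints 40000.
        System.out.println(runCounter(4, 10_000));
    }
}
```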

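A More Realistic Example

You may object that you would never write code like this, and that slipped conditions are a rather theoretical problem. The first example was kept simple only to make the idea clear.

The fair lock implemented in Starvation and Fairness may be a more realistic example. Look again at the naive implementation in Nested Monitor Lockout: if we try to remove its nested monitor lockout problem, it is easy to introduce slipped conditions. First, here is the example from Nested Monitor Lockout: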

//Fair Lock implementation with nested monitor lockout problem
import java.util.ArrayList;
import java.util.List;

public class FairLock {
  private boolean isLocked = false;
  private Thread lockingThread = null;
  private List waitingThreads =
            new ArrayList();

  public void lock() throws InterruptedException{
    QueueObject queueObject = new QueueObject();

    synchronized(this){
      waitingThreads.add(queueObject);

      while(isLocked || waitingThreads.get(0) != queueObject){

        synchronized(queueObject){
          try{
            queueObject.wait();
          }catch(InterruptedException e){
            waitingThreads.remove(queueObject);
            throw e;
          }
        }
      }
      waitingThreads.remove(queueObject);
      isLocked = true;
      lockingThread = Thread.currentThread();
    }
  }

  public synchronized void unlock(){
    if(this.lockingThread != Thread.currentThread()){
      throw new IllegalMonitorStateException(
        "Calling thread has not locked this lock");
    }
    isLocked      = false;
    lockingThread = null;
    if(waitingThreads.size() > 0){
      QueueObject queueObject = (QueueObject) waitingThreads.get(0);
      synchronized(queueObject){
        queueObject.notify();
      }
    }
  }
}
public class QueueObject {}

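We can see that the synchronized(queueObject) block and its queueObject.wait() call are nested inside the synchronized(this) block, which causes the nested monitor lockout problem. To avoid it, the synchronized(queueObject) block must be moved out of the synchronized(this) block. The result might look like this: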

//Fair Lock implementation with slipped conditions problem
public class FairLock {
  private boolean isLocked = false;
  private Thread lockingThread  = null;
  private List waitingThreads =
            new ArrayList();

  public void lock() throws InterruptedException{
    QueueObject queueObject = new QueueObject();

    synchronized(this){
      waitingThreads.add(queueObject);
    }

    boolean mustWait = true;
    while(mustWait){

      synchronized(this){
        mustWait = isLocked || waitingThreads.get(0) != queueObject;
      }

      synchronized(queueObject){
        if(mustWait){
          try{
            queueObject.wait();
          }catch(InterruptedException e){
            waitingThreads.remove(queueObject);
            throw e;
          }
        }
      }
    }

    synchronized(this){
      waitingThreads.remove(queueObject);
      isLocked = true;
      lockingThread = Thread.currentThread();
    }
  }
}

Note: Only the lock() method is shown, since it is the only method I have changed.

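Apart from the block that adds queueObject to the waiting list, the lock() method now contains three synchronized blocks.

The first, a synchronized(this) block, checks the internal state by setting mustWait = isLocked || waitingThreads.get(0) != queueObject.

The second, a synchronized(queueObject) block, checks whether the thread has to wait. By this time another thread may already have unlocked the lock, but let us ignore that for now. Assume the lock was unlocked, so the thread leaves the synchronized(queueObject) block right away.

The third, a synchronized(this) block, is only executed once mustWait is false. It sets isLocked back to true and leaves the lock() method.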

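Imagine what happens if two threads call lock() at the same time while the lock is unlocked. First, thread 1 checks isLocked and finds it false. Then thread 2 does the same. Neither of them waits, and both go on to set isLocked to true. This is a prime example of slipped conditions.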

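Removing the Slipped Conditions Problem

To remove the slipped conditions problem from the example above, the content of the last synchronized(this) block must be moved up into the synchronized(this) block that performs the check. The code needs a few small changes to accommodate this move. Here is how it looks: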

//Fair Lock implementation without nested monitor lockout problem,
//but with missed signals problem.
public class FairLock {
  private boolean isLocked = false;
  private Thread lockingThread  = null;
  private List waitingThreads =
            new ArrayList();

  public void lock() throws InterruptedException{
    QueueObject queueObject = new QueueObject();

    synchronized(this){
      waitingThreads.add(queueObject);
    }

    boolean mustWait = true;
    while(mustWait){
      synchronized(this){
        mustWait = isLocked || waitingThreads.get(0) != queueObject;
        if(!mustWait){
          waitingThreads.remove(queueObject);
          isLocked = true;
          lockingThread = Thread.currentThread();
          return;
        }
      }

      synchronized(queueObject){
        if(mustWait){
          try{
            queueObject.wait();
          }catch(InterruptedException e){
            waitingThreads.remove(queueObject);
            throw e;
          }
        }
      }
    }
  }
}

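Notice that the local variable mustWait is now tested and set within the same synchronized block. Notice also that although mustWait is checked outside the synchronized(this) block, in the while(mustWait) clause, its value is never changed outside synchronized(this). A thread that evaluates mustWait to false atomically sets the internal condition (isLocked) as well, so any other thread that checks the condition afterwards will find it true.

The return; statement in the synchronized(this) block is not necessary. It is only a small optimization. If the thread is not going to wait (mustWait is false), there is no reason to enter the synchronized(queueObject) block and execute the if(mustWait) clause.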

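The observant reader will notice that this fair lock implementation can still miss signals. Imagine that the FairLock instance is locked when a thread calls lock(). After the synchronized(this) block that performs the check, mustWait is true. Now imagine that the thread calling lock() is preempted, and the thread holding the lock calls unlock(). If you look at the unlock() implementation shown earlier, you will see that it calls queueObject.notify(). But since the thread in lock() has not yet called queueObject.wait(), the call to queueObject.notify() has no effect, and the signal is lost. When the thread in lock() then calls queueObject.wait(), it stays blocked until some other thread calls unlock(), which may never happen.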

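The missed signal problem is the reason the FairLock implementation in Starvation and Fairness turns the QueueObject class into a semaphore with two methods: doWait() and doNotify(). These methods store and react to the signal inside the QueueObject. That way the signal is not lost, even if doNotify() is called before doWait().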
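A minimal sketch of such a QueueObject follows. The method names doWait() and doNotify() come from the text above. The isNotified field and the QueueObjectDemo harness are assumptions made for this illustration, so the actual class in Starvation and Fairness may differ in detail.

```java
class QueueObjectDemo {

    // QueueObject as a simple semaphore: the signal is stored in a field,
    // so it survives even if it arrives before anyone is waiting.
    static class QueueObject {
        private boolean isNotified = false; // assumed field name

        public synchronized void doWait() throws InterruptedException {
            while (!isNotified) {
                this.wait();
            }
            isNotified = false; // consume the signal
        }

        public synchronized void doNotify() {
            isNotified = true;  // remember the signal
            this.notify();
        }
    }

    // Returns true if a doWait() issued after doNotify() still completes.
    public static boolean lateWaiterReturns() {
        QueueObject queueObject = new QueueObject();
        queueObject.doNotify(); // the signal arrives before the wait
        Thread waiter = new Thread(() -> {
            try {
                queueObject.doWait(); // returns at once: the signal was stored
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        try {
            waiter.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("late waiter returned: " + lateWaiterReturns());
    }
}
```

With a plain Object, the same sequence of notify() followed by wait() would leave the waiter blocked, because notify() does nothing when no thread is waiting.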