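JUC Source Code Study: AbstractQueuedSynchronizer

The source code discussed here is from Oracle JDK 11.0.5.

What is a CLH queue

Put simply, it is a doubly linked list whose nodes carry per-thread information. The node at the head of the queue belongs to the thread that is currently running (the one holding the lock), and the nodes behind it belong to threads waiting to run, as shown in the figure below:

CLH queue diagram

Node overview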

The wait queue is a variant of a “CLH” (Craig, Landin, and Hagersten) lock queue. CLH locks are normally used for spinlocks. We instead use them for blocking synchronizers, but use the same basic tactic of holding some of the control information about a thread in the predecessor of its node. A “status” field in each node keeps track of whether a thread should block. A node is signalled when its predecessor releases. Each node of the queue otherwise serves as a specific-notification-style monitor holding a single waiting thread. The status field does NOT control whether threads are granted locks etc though. A thread may try to acquire if it is first in the queue. But being first does not guarantee success; it only gives the right to contend. So the currently released contender thread may need to rewait.

To enqueue into a CLH lock, you atomically splice it in as new tail. To dequeue, you just set the head field.

        +------+  prev +-----+       +-----+
head    |      | <---- |     | <---- |     |  tail
        +------+       +-----+       +-----+

Insertion into a CLH queue requires only a single atomic operation on “tail”, so there is a simple atomic point of demarcation from unqueued to queued. Similarly, dequeuing involves only updating the “head”. However, it takes a bit more work for nodes to determine who their successors are, in part to deal with possible cancellation due to timeouts and interrupts.

The “prev” links (not used in original CLH locks), are mainly needed to handle cancellation. If a node is cancelled, its successor is (normally) relinked to a non-cancelled predecessor. For explanation of similar mechanics in the case of spin locks, see the papers by Scott and Scherer at http://www.cs.rochester.edu/u/scott/synchronization/

We also use “next” links to implement blocking mechanics. The thread id for each node is kept in its own node, so a predecessor signals the next node to wake up by traversing next link to determine which thread it is. Determination of successor must avoid races with newly queued nodes to set the “next” fields of their predecessors. This is solved when necessary by checking backwards from the atomically updated “tail” when a node’s successor appears to be null. (Or, said differently, the next-links are an optimization so that we don’t usually need a backward scan.)

Cancellation introduces some conservatism to the basic algorithms. Since we must poll for cancellation of other nodes, we can miss noticing whether a cancelled node is ahead or behind us. This is dealt with by always unparking successors upon cancellation, allowing them to stabilize on a new predecessor, unless we can identify an uncancelled predecessor who will carry this responsibility.

CLH queues need a dummy header node to get started. But we don’t create them on construction, because it would be wasted effort if there is never contention. Instead, the node is constructed and head and tail pointers are set upon first contention.

Threads waiting on Conditions use the same nodes, but use an additional link. Conditions only need to link nodes in simple (non-concurrent) linked queues because they are only accessed when exclusively held. Upon await, a node is inserted into a condition queue. Upon signal, the node is transferred to the main queue. A special value of status field is used to mark which queue a node is on.
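The quoted comment notes that CLH locks are normally used as spin locks and that enqueuing needs only one atomic operation on the tail. The following is a minimal sketch of such a textbook CLH spin lock, not the AQS code itself. The class name `ClhSpinLockSketch`, the `QNode` type and the `demo` helper are hypothetical names chosen for illustration, and `Thread.yield()` is used while spinning only so that the sketch stays well behaved on machines with few cores.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Textbook-style CLH spin lock sketch (illustrative only, not the AQS implementation). */
final class ClhSpinLockSketch {
    /** Queue node: each thread spins on the "locked" flag of its predecessor. */
    private static final class QNode {
        volatile boolean locked;
    }

    // The tail starts out as a dummy node that is already released.
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();
        node.locked = true;
        // Enqueue: a single atomic swap on tail splices this node in as the new tail.
        QNode pred = tail.getAndSet(node);
        myPred.set(pred);
        // Control information lives in the predecessor: wait until it releases.
        while (pred.locked) {
            Thread.yield();
        }
    }

    void unlock() {
        QNode node = myNode.get();
        node.locked = false;       // lets the successor (if any) proceed
        myNode.set(myPred.get());  // recycle the predecessor's node for the next lock()
    }

    /** Hypothetical demo helper: several threads increment a plain counter under the lock. */
    static int demo(int threads, int iterations) {
        ClhSpinLockSketch lock = new ClhSpinLockSketch();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

AQS differs from this sketch in the ways the comment describes: it parks waiting threads rather than spinning, and it adds prev and next links to cope with cancellation and wake-ups.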

The internal structure of Node is as follows:

[Figure: internal structure of Node]

Node states

Code and comments

/** waitStatus value to indicate thread has cancelled. */
static final int CANCELLED =  1;
/** waitStatus value to indicate successor's thread needs unparking. */
static final int SIGNAL    = -1;
/** waitStatus value to indicate thread is waiting on condition. */
static final int CONDITION = -2;
/**
* waitStatus value to indicate the next acquireShared should
* unconditionally propagate.
*/
static final int PROPAGATE = -3;

State	Description
SIGNAL	The successor of this node is (or will soon be) blocked (via park), so the current node must unpark its successor when it releases or cancels. To avoid races, acquire methods must first indicate they need a signal, then retry the atomic acquire, and then, on failure, block.
CANCELLED	This node is cancelled due to timeout or interrupt. Nodes never leave this state. In particular, a thread with cancelled node never again blocks.
CONDITION	This node is currently on a condition queue. It will not be used as a sync queue node until transferred, at which time the status will be set to 0. (Use of this value here has nothing to do with the other uses of the field, but simplifies mechanics.)
PROPAGATE	A releaseShared should be propagated to other nodes. This is set (for head node only) in doReleaseShared to ensure propagation continues, even if other operations have since intervened.
0	None of the above

The values are arranged numerically to simplify use. Non-negative values mean that a node doesn’t need to signal. So, most code doesn’t need to check for particular values, just for sign.

The field is initialized to 0 for normal sync nodes, and CONDITION for condition nodes. It is modified using CAS (or when possible, unconditional volatile writes).
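Because the values are ordered by sign, most checks reduce to a comparison with zero. The snippet below is a small sketch of that idea. The constants mirror the ones listed above, while the class name and the two helper methods are hypothetical and do not exist in the JDK.

```java
/** Sketch of sign-based waitStatus checks; helper names are illustrative only. */
final class WaitStatusSketch {
    static final int CANCELLED = 1;
    static final int SIGNAL = -1;
    static final int CONDITION = -2;
    static final int PROPAGATE = -3;

    /** Only CANCELLED is positive, so "ws > 0" means cancelled. */
    static boolean isCancelled(int waitStatus) {
        return waitStatus > 0;
    }

    /** Negative values are the ones that may require signalling work. */
    static boolean isNegativeStatus(int waitStatus) {
        return waitStatus < 0;
    }
}
```

The source excerpts later in this article use exactly this kind of test, for example `ws > 0` and `p.waitStatus <= 0`.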

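How each state is used becomes clearer later, when looked at together with the AQS framework as a whole.

`ReentrantLock` overview

We start exploring the AQS framework from ReentrantLock. Its fields, methods and inner classes are roughly structured as follows:

ReentrantLock structure diagram

ReentrantLock has an abstract inner class named Sync that extends AbstractQueuedSynchronizer. Sync has two subclasses that represent the two ways of acquiring the lock: fair and nonfair. The difference between them is explained later. As ① and ② in the diagram above indicate, FairSync and NonfairSync differ in how they acquire the lock, while the release method is the same for both. In other words, fairness shows up only in how the lock is acquired. **The default is a nonfair lock**, as the code shows: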

// ReentrantLock.java
/**
* Creates an instance of {@code ReentrantLock}.
* This is equivalent to using {@code ReentrantLock(false)}.
*/
public ReentrantLock() {
    sync = new NonfairSync();
}

/**
* Creates an instance of {@code ReentrantLock} with the
* given fairness policy.
*
* @param fair {@code true} if this lock should use a fair ordering policy
*/
public ReentrantLock(boolean fair) {
    sync = fair ? new FairSync() : new NonfairSync();
}
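The default can be checked through the public `isFair()` method of `ReentrantLock`. A minimal sketch follows; the class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: the no-argument constructor yields a nonfair lock. */
final class FairnessDefaultDemo {
    static boolean defaultIsFair() {
        return new ReentrantLock().isFair();
    }

    static boolean explicitFairIsFair() {
        return new ReentrantLock(true).isFair();
    }
}
```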

The analysis here uses the default implementation, NonfairSync. Locking with ReentrantLock normally means calling lock(), which is implemented as follows:

// ReentrantLock.java
public void lock() {
    sync.acquire(1);
}

Calling lock() has the following effects:

Acquires the lock if it is not held by another thread and returns immediately, setting the lock hold count to one.
If the current thread already holds the lock then the hold count is incremented by one and the method returns immediately.
If the lock is held by another thread then the current thread becomes disabled for thread scheduling purposes and lies dormant until the lock has been acquired, at which time the lock hold count is set to one.
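The first two behaviors can be observed through `getHoldCount()`, and the third through `tryLock()` from another thread, which fails instead of blocking. This is a small sketch using only public `ReentrantLock` methods; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch of the hold count and of contention from a second thread. */
final class HoldCountDemo {
    /** Returns the hold counts after the first lock(), the second lock(), and both unlock() calls. */
    static int[] holdCounts() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        int afterFirst = lock.getHoldCount();
        lock.lock(); // reentrant acquire by the same thread
        int afterSecond = lock.getHoldCount();
        lock.unlock();
        lock.unlock();
        return new int[] {afterFirst, afterSecond, lock.getHoldCount()};
    }

    /** Returns whether another thread's tryLock() succeeds while this thread holds the lock. */
    static boolean otherThreadCanTryLock() {
        ReentrantLock lock = new ReentrantLock();
        AtomicBoolean acquired = new AtomicBoolean();
        lock.lock();
        try {
            Thread other = new Thread(() -> {
                if (lock.tryLock()) {
                    acquired.set(true);
                    lock.unlock();
                }
            });
            other.start();
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
        return acquired.get();
    }
}
```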

sync.acquire(1) invokes the implementation in the parent class, AQS. This is the entry point into the AQS source code:

The AQS framework

// AbstractQueuedSynchronizer.java
/**
* Acquires in exclusive mode, ignoring interrupts.  Implemented
* by invoking at least once {@link #tryAcquire},
* returning on success.  Otherwise the thread is queued, possibly
* repeatedly blocking and unblocking, invoking {@link
* #tryAcquire} until success.  This method can be used
* to implement method {@link Lock#lock}.
*
* @param arg the acquire argument.  This value is conveyed to
*        {@link #tryAcquire} but is otherwise uninterpreted and
*        can represent anything you like.
*/
public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
        acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
        selfInterrupt();
}

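Trying to acquire the lock

acquire first calls tryAcquire(arg) to try to acquire the lock, which returns true on success. However, this is a method that subclasses have to implement: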

// AbstractQueuedSynchronizer.java

// Main exported methods
/**
* Attempts to acquire in exclusive mode. This method should query
* if the state of the object permits it to be acquired in the
* exclusive mode, and if so to acquire it.
*
* <p>This method is always invoked by the thread performing
* acquire.  If this method reports failure, the acquire method
* may queue the thread, if it is not already queued, until it is
* signalled by a release from some other thread. This can be used
* to implement method {@link Lock#tryLock()}.
*
* <p>The default
* implementation throws {@link UnsupportedOperationException}.
*
* @param arg the acquire argument. This value is always the one
*        passed to an acquire method, or is the value saved on entry
*        to a condition wait.  The value is otherwise uninterpreted
*        and can represent anything you like.
* @return {@code true} if successful. Upon success, this object has
*         been acquired.
* @throws IllegalMonitorStateException if acquiring would place this
*         synchronizer in an illegal state. This exception must be
*         thrown in a consistent fashion for synchronization to work
*         correctly.
* @throws UnsupportedOperationException if exclusive mode is not supported
*/
protected boolean tryAcquire(int arg) {
    throw new UnsupportedOperationException();
}

In other words, the concrete implementation is tryAcquire() in NonfairSync, which is a subclass of Sync, which in turn is a subclass of AQS.

// ReentrantLock.java
static final class NonfairSync extends Sync {
    private static final long serialVersionUID = 7316153563782823691L;
    protected final boolean tryAcquire(int acquires) {
        return nonfairTryAcquire(acquires);
    }
}
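// The method that does the actual work (defined in Sync)
final boolean nonfairTryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    // Read the current state (the volatile field "state")
    int c = getState();
    if (c == 0) { // no thread holds the lock
        // Several threads may reach this point at the same time.
        // The CAS sets state to acquires only if it is still 0,
        // so only one thread can enter the body of this if.
        if (compareAndSetState(0, acquires)) {
            // Record the current thread as the exclusive owner
            setExclusiveOwnerThread(current);
            return true;
        }
    }
    else if (current == getExclusiveOwnerThread()) { // the current thread already holds the lock
        int nextc = c + acquires; // reentrant acquire: add acquires to state
        if (nextc < 0) // overflow: the hold count exceeded Integer.MAX_VALUE
            throw new Error("Maximum lock count exceeded");
        setState(nextc);
        return true;
    }
    // The lock is held by another thread: report failure immediately
    return false;
}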
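The same template can be used to build a custom synchronizer: the subclass supplies only tryAcquire() and tryRelease(), and AQS handles queuing and blocking. The following is a minimal sketch of a non-reentrant mutex along those lines. It is not part of ReentrantLock; the names `SimpleMutexSketch` and `demo` are hypothetical.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

/** Sketch of a non-reentrant mutex built on AQS (state 0 = free, 1 = held). */
final class SimpleMutexSketch {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int arg) {
            // Only one thread can move state from 0 to 1.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            if (getExclusiveOwnerThread() != Thread.currentThread()) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);
            return true; // fully released, so AQS may wake a successor
        }
    }

    private final Sync sync = new Sync();

    void lock() {
        sync.acquire(1);
    }

    void unlock() {
        sync.release(1);
    }

    /** Hypothetical demo helper: several threads increment a plain counter under the mutex. */
    static int demo(int threads, int iterations) {
        SimpleMutexSketch mutex = new SimpleMutexSketch();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

Unlike nonfairTryAcquire(), this sketch has no branch for the owner thread, so locking twice from the same thread would block forever.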

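After a failed acquire, the thread's information is added to the CLH queue

When the acquire fails, tryAcquire returns false and acquire(int arg) goes on to execute acquireQueued(addWaiter(Node.EXCLUSIVE), arg), which adds the thread to the CLH queue. addWaiter(Node.EXCLUSIVE) runs first. It begins by creating a new Node. The argument passed here is Node.EXCLUSIVE, i.e. null, which is stored in the node's nextWaiter field (a mode marker, not the node's successor in the sync queue). That is: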

/** Establishes initial head or SHARED marker. */
Node() {}

/** Constructor used by addWaiter. */
Node(Node nextWaiter) {
    this.nextWaiter = nextWaiter;
    THREAD.set(this, Thread.currentThread());
}
// For comparison: the JDK 8 form of the constructor used by addWaiter
Node(Thread thread, Node mode) {     // Used by addWaiter
    this.nextWaiter = mode;
    this.thread = thread;
}

Next comes an infinite loop, the classic for loop plus CAS. Its purpose is to insert the current node at the tail of the CLH queue, i.e. the enqueue operation.

// AbstractQueuedSynchronizer.java
/**
* Creates and enqueues node for current thread and given mode.
*
* @param mode Node.EXCLUSIVE for exclusive, Node.SHARED for shared
* @return the new node
*/
private Node addWaiter(Node mode) {
    Node node = new Node(mode);

    for (;;) {
        Node oldTail = tail;
        if (oldTail != null) {
            // Question 1: why does setting the predecessor need a CAS?
            node.setPrevRelaxed(oldTail);
            if (compareAndSetTail(oldTail, node)) {
                oldTail.next = node;
                return node;
            }
        } else { // the CLH queue is empty
            // Initialize the queue
            // Question 2: why does initializing the queue need a CAS?
            initializeSyncQueue();
        }
    }
}

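What this code does is easy to follow, but on closer thought two questions remain: why does setting the predecessor need a CAS, and why does initializing the queue need a CAS?

Question 1: why does setting the predecessor need a CAS?

// Plain write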
final void setPrevRelaxed(Node p) {
    PREV.set(this, p);
}
// What PREV is
private static final VarHandle PREV;
PREV = l.findVarHandle(Node.class, "prev", Node.class);

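PREV is a VarHandle. It can perform various operations on the prev field of Node, such as plain reads and writes, volatile reads and writes, and CAS. VarHandle was introduced in JDK 9 as a replacement for Unsafe operations; its usage will be discussed in detail in a later post. PREV.set(this, p) here is not a CAS but a plain write. A volatile write would be PREV.setVolatile(), and a CAS would be PREV.compareAndSet(). So the question rests on a misunderstanding: there is no CAS at this point. A plain write is enough because the new node has not yet been published to other threads; it only becomes reachable through the subsequent compareAndSetTail().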
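The three access modes mentioned above can be compared side by side on an ordinary field. This is a minimal sketch; the class `VarHandleAccessDemo` and its `value` field are hypothetical, while `findVarHandle`, `set`, `setVolatile`, `compareAndSet` and `get` are the standard `java.lang.invoke` calls.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Sketch of plain write, volatile write and CAS through one VarHandle. */
final class VarHandleAccessDemo {
    private int value;

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(VarHandleAccessDemo.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String demo() {
        VarHandleAccessDemo d = new VarHandleAccessDemo();
        VALUE.set(d, 1);                                // plain write, as in setPrevRelaxed
        VALUE.setVolatile(d, 2);                        // volatile write
        boolean first = VALUE.compareAndSet(d, 2, 3);   // expected value matches
        boolean second = VALUE.compareAndSet(d, 2, 4);  // expected value is stale
        return first + "," + second + "," + (int) VALUE.get(d);
    }
}
```

Calling `VarHandleAccessDemo.demo()` returns `"true,false,3"`: the first CAS succeeds, the second fails because the field no longer holds 2, and the value stays at 3.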

Question 2: why does initializing the queue need a CAS?

/**
 * Initializes head and tail fields on first contention.
 */
private final void initializeSyncQueue() {
    Node h;
    if (HEAD.compareAndSet(this, null, (h = new Node())))
        tail = h;
}

/**
 * Head of the wait queue, lazily initialized.  Except for
 * initialization, it is modified only via method setHead.  Note:
 * If head exists, its waitStatus is guaranteed not to be
 * CANCELLED.
 */
private transient volatile Node head;

private static final VarHandle HEAD;
HEAD = l.findVarHandle(AbstractQueuedSynchronizer.class, "head", Node.class);

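Since this is initialization, the head field in AQS is expected to be null. If it is not null, another thread has already initialized it; the CAS fails, initializeSyncQueue() returns, and the for(;;) loop runs again to enqueue the Node in the newly created queue. This situation arises when two threads fail to acquire the lock at about the same time while the queue has not been initialized yet: both execute if (oldTail != null) in addWaiter(), both see null, and both go on to call initializeSyncQueue().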
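The effect of the CAS can be illustrated with a small race: however many threads attempt the initialization, exactly one of them installs the head. This is a sketch using `AtomicReference` in place of the HEAD VarHandle; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: racing lazy initialization where only one CAS from null can succeed. */
final class LazyInitRaceSketch {
    static int winners(int threads) {
        AtomicReference<Object> head = new AtomicReference<>(); // starts as null
        AtomicInteger wins = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line the threads up to make a race more likely
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Same shape as HEAD.compareAndSet(this, null, new Node())
                if (head.compareAndSet(null, new Object())) {
                    wins.incrementAndGet();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return wins.get();
    }
}
```

With a plain `if (head == null) head = new Node()` instead, two threads could both pass the check and one dummy node would silently overwrite the other.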

Parking the thread

final boolean acquireQueued(final Node node, int arg) {
    boolean interrupted = false;
    try {
        for (;;) {
            final Node p = node.predecessor();
            // The predecessor is head: try to acquire the lock
            if (p == head && tryAcquire(arg)) {
                setHead(node);
                p.next = null; // help GC
                return interrupted;
            }
            // Check whether the current thread may park
            if (shouldParkAfterFailedAcquire(p, node))
                // Park; after waking up, record the interrupt status in interrupted
                interrupted |= parkAndCheckInterrupt();
        }
    } catch (Throwable t) {
        cancelAcquire(node);
        if (interrupted)
            selfInterrupt();
        throw t;
    }
}

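shouldParkAfterFailedAcquire() decides whether the current thread may park. If the predecessor Node's status is already Node.SIGNAL, the thread may park; otherwise it may not, and control returns to the for(;;) loop in acquireQueued(). There are only two cases in which the thread may not park:

The predecessor has been cancelled

In this case, walk backwards until a node that has not been cancelled is found;

The predecessor Node's status is not Node.SIGNAL

In this case, first change the predecessor's status to Node.SIGNAL via CAS.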

/**
 * Checks and updates status for a node that failed to acquire.
 * Returns true if thread should block. This is the main signal
 * control in all acquire loops.  Requires that pred == node.prev.
 *
 * @param pred node's predecessor holding status
 * @param node the node
 * @return {@code true} if thread should block
 */
private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
    int ws = pred.waitStatus;
    if (ws == Node.SIGNAL)
        /*
         * This node has already set status asking a release
         * to signal it, so it can safely park.
         */
        return true;
    if (ws > 0) { // the predecessor has been cancelled
        /*
         * Predecessor was cancelled. Skip over predecessors and
         * indicate retry.
         */
        do {
            node.prev = pred = pred.prev;
        } while (pred.waitStatus > 0);
        pred.next = node;
    } else { // the predecessor's status is not Node.SIGNAL
        /*
         * waitStatus must be 0 or PROPAGATE.  Indicate that we
         * need a signal, but don't park yet.  Caller will need to
         * retry to make sure it cannot acquire before parking.
         */
        pred.compareAndSetWaitStatus(ws, Node.SIGNAL);
    }
    return false;
}
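The three branches can be traced on a toy, single-threaded list. The sketch below mirrors the control flow of the method above with a plain field write instead of the CAS; the `ParkDecisionSketch` class, its simplified `Node` and the `demo` helper are hypothetical and only serve to show why a caller typically loops more than once before parking.

```java
/** Single-threaded sketch of the park decision; not the real AQS Node. */
final class ParkDecisionSketch {
    static final int CANCELLED = 1;
    static final int SIGNAL = -1;

    static final class Node {
        int waitStatus;
        Node prev;
        Node next;

        Node(int waitStatus) {
            this.waitStatus = waitStatus;
        }
    }

    static void link(Node pred, Node node) {
        pred.next = node;
        node.prev = pred;
    }

    static boolean shouldPark(Node node) {
        Node pred = node.prev;
        int ws = pred.waitStatus;
        if (ws == SIGNAL) {
            return true;                  // predecessor already promised to signal us
        }
        if (ws > 0) {
            do {                          // skip over cancelled predecessors
                node.prev = pred = pred.prev;
            } while (pred.waitStatus > 0);
            pred.next = node;
        } else {
            pred.waitStatus = SIGNAL;     // the real code uses a CAS here
        }
        return false;                     // caller must retry the acquire first
    }

    static String demo() {
        Node head = new Node(0);
        Node a = new Node(CANCELLED);
        Node b = new Node(CANCELLED);
        Node node = new Node(0);
        link(head, a);
        link(a, b);
        link(b, node);
        boolean first = shouldPark(node);   // relinks node behind head
        boolean relinked = node.prev == head && head.next == node;
        boolean second = shouldPark(node);  // marks head as SIGNAL
        boolean third = shouldPark(node);   // now it is safe to park
        return first + "," + relinked + "," + second + "," + third;
    }
}
```

`ParkDecisionSketch.demo()` returns `"false,true,false,true"`: the first call only repairs the links, the second only sets SIGNAL, and the third finally allows parking.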

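parkAndCheckInterrupt() parks the thread. Once the thread is woken, it continues with return Thread.interrupted(); and then goes back to the for(;;) loop in acquireQueued(), where it again tries to acquire the lock, looks for a usable predecessor, parks, and so on.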

/**
 * Convenience method to park and then check if interrupted.
 *
 * @return {@code true} if interrupted
 */
private final boolean parkAndCheckInterrupt() {
    LockSupport.park(this);
    return Thread.interrupted();
}
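The park-then-check pattern can be tried outside AQS with `LockSupport` directly. In the sketch below a waiter thread parks in a loop (since `park()` may return spuriously) and accumulates `Thread.interrupted()` the same way `interrupted |= parkAndCheckInterrupt()` does. The class name, the `released` flag and the method name are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

/** Sketch: park until released and report whether an interrupt arrived meanwhile. */
final class ParkInterruptDemo {
    static boolean wokeWithInterrupt(boolean interruptIt) {
        AtomicBoolean released = new AtomicBoolean();
        AtomicBoolean sawInterrupt = new AtomicBoolean();
        CountDownLatch started = new CountDownLatch(1);
        Thread waiter = new Thread(() -> {
            started.countDown();
            boolean interrupted = false;
            do {
                LockSupport.park();                   // returns on unpark, interrupt, or spuriously
                interrupted |= Thread.interrupted();  // reads and clears the interrupt status
            } while (!released.get());
            sawInterrupt.set(interrupted);
        });
        waiter.start();
        try {
            started.await();
            if (interruptIt) {
                waiter.interrupt();
            }
            released.set(true);
            LockSupport.unpark(waiter);
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sawInterrupt.get();
    }
}
```

Because `Thread.interrupted()` clears the status, the caller has to remember the result, which is why acquireQueued() keeps it in a local variable and returns it.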

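acquireQueued() also ends with a catch clause: if an exception occurs, the node is cancelled and removed from the list, and the exception is then rethrown.

This concludes the basic locking flow of AQS and ReentrantLock.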
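One visible consequence of this flow is that lock() does not throw when the waiting thread is interrupted; the interrupt status is simply still set once the lock has been acquired. The sketch below demonstrates this with public `ReentrantLock` methods only; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: a thread interrupted while queued in lock() still acquires, with its status set. */
final class LockIgnoresInterruptDemo {
    static boolean interruptStatusAfterLock() {
        ReentrantLock lock = new ReentrantLock();
        AtomicBoolean statusAfterAcquire = new AtomicBoolean();
        Thread waiter = new Thread(() -> {
            lock.lock(); // blocks while the main thread holds the lock; never throws on interrupt
            try {
                statusAfterAcquire.set(Thread.currentThread().isInterrupted());
            } finally {
                lock.unlock();
            }
        });
        lock.lock();
        try {
            waiter.start();
            // Wait until the waiter has been enqueued in the sync queue.
            while (!lock.hasQueuedThread(waiter)) {
                Thread.yield();
            }
            waiter.interrupt();
        } finally {
            lock.unlock();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return statusAfterAcquire.get();
    }
}
```

Code that needs to abort on interruption would use `lockInterruptibly()` instead of `lock()`.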

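Releasing the lock

When ReentrantLock's unlock() runs: if the lock is held by the current thread, the hold count is decremented by 1; if the hold count is then 0, the lock is released; if the current thread does not hold the lock, an IllegalMonitorStateException is thrown.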

// ReentrantLock.java
/**
 * Attempts to release this lock.
 *
 * <p>If the current thread is the holder of this lock then the hold
 * count is decremented.  If the hold count is now zero then the lock
 * is released.  If the current thread is not the holder of this
 * lock then {@link IllegalMonitorStateException} is thrown.
 *
 * @throws IllegalMonitorStateException if the current thread does not
 *         hold this lock
 */
public void unlock() {
    sync.release(1);
}

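Releasing follows much the same logic as locking, and both use the template method pattern. The framework code for releasing is already written; how exactly to release is left to the subclass. The AQS framework code is as follows: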

// AbstractQueuedSynchronizer.java
/**
 * Releases in exclusive mode.  Implemented by unblocking one or
 * more threads if {@link #tryRelease} returns true.
 * This method can be used to implement method {@link Lock#unlock}.
 *
 * @param arg the release argument.  This value is conveyed to
 *        {@link #tryRelease} but is otherwise uninterpreted and
 *        can represent anything you like.
 * @return the value returned from {@link #tryRelease}
 */
public final boolean release(int arg) {
    if (tryRelease(arg)) {
        Node h = head;
        if (h != null && h.waitStatus != 0)
            unparkSuccessor(h);
        return true;
    }
    return false;
}

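Control then returns to ReentrantLock, where tryRelease() of the inner class Sync is called. Note that it returns true only when the hold count has dropped to 0, i.e. no thread holds the lock any longer; otherwise it returns false.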

// ReentrantLock.java -> Sync
protected final boolean tryRelease(int releases) {
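    // Subtract releases from the hold count
    int c = getState() - releases;
    // The current thread does not hold the lock: throw
    if (Thread.currentThread() != getExclusiveOwnerThread())
        throw new IllegalMonitorStateException();
    boolean free = false;
    // Return true only when the hold count reaches 0 (no thread holds the lock); otherwise false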
    if (c == 0) {
        free = true;
        setExclusiveOwnerThread(null);
    }
    setState(c);
    return free;
}
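These rules are easy to observe from the outside. In the sketch below the lock is taken twice, released twice, and then released once more without being held. Only public `ReentrantLock` methods are used; the class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Sketch of unlock() semantics: partial release, full release, release by a non-owner. */
final class UnlockSemanticsDemo {
    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();
        lock.unlock();                              // hold count 2 -> 1, tryRelease returns false
        boolean lockedAfterFirst = lock.isLocked();
        lock.unlock();                              // hold count 1 -> 0, tryRelease returns true
        boolean lockedAfterSecond = lock.isLocked();
        boolean threw = false;
        try {
            lock.unlock();                          // not the owner any more
        } catch (IllegalMonitorStateException e) {
            threw = true;
        }
        return lockedAfterFirst + "," + lockedAfterSecond + "," + threw;
    }
}
```

`UnlockSemanticsDemo.demo()` returns `"true,false,true"`.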

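Back in the AQS framework code, unparkSuccessor() can run only if tryRelease() returns true. This method finds the usable successor of head (a thread waiting to be woken) and calls unpark() on it, so that the thread wakes up and resumes the for(;;) loop in acquireQueued(), where it retries the acquire.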

// AbstractQueuedSynchronizer.java
private void unparkSuccessor(Node node) {

    int ws = node.waitStatus;
    if (ws < 0)
        node.compareAndSetWaitStatus(ws, 0);

    Node s = node.next;
    if (s == null || s.waitStatus > 0) {
        s = null;
        for (Node p = tail; p != node && p != null; p = p.prev)
            if (p.waitStatus <= 0)
                s = p;
    }
    if (s != null)
        LockSupport.unpark(s.thread);
}
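The backward scan deserves a closer look: it does not stop at the first match but keeps going, so that the non-cancelled node closest to head is the one that ends up selected. The sketch below reproduces just that selection logic on a toy, single-threaded list; the class, its simplified `Node` with a `name` field, and the helper methods are hypothetical.

```java
/** Single-threaded sketch of successor selection with a backward scan from tail. */
final class SuccessorScanSketch {
    static final class Node {
        final String name;
        final int waitStatus;
        Node prev;
        Node next;

        Node(String name, int waitStatus) {
            this.name = name;
            this.waitStatus = waitStatus;
        }
    }

    /** Appends node behind tail and returns it as the new tail. */
    static Node append(Node tail, Node node) {
        node.prev = tail;
        tail.next = node;
        return node;
    }

    static Node findSuccessor(Node node, Node tail) {
        Node s = node.next;
        if (s == null || s.waitStatus > 0) {
            s = null;
            // Walk from tail towards node; the last match is the one nearest to node.
            for (Node p = tail; p != node && p != null; p = p.prev) {
                if (p.waitStatus <= 0) {
                    s = p;
                }
            }
        }
        return s;
    }

    static String demo() {
        Node head = new Node("head", 0);
        Node a = append(head, new Node("a", 1)); // cancelled
        Node b = append(a, new Node("b", 0));
        Node c = append(b, new Node("c", 0));
        Node s = findSuccessor(head, c);
        return s == null ? "none" : s.name;
    }
}
```

`SuccessorScanSketch.demo()` returns `"b"`: node a is cancelled, so the scan from the tail passes c and settles on b, the first live node behind head.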

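Fair and nonfair locks

Now back to the difference between FairSync and NonfairSync. As noted earlier, they differ in how the lock is acquired. The code above is the NonfairSync approach; here is the fair lock's implementation:

// ReentrantLock.java -> FairSync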
protected final boolean tryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    int c = getState();
    if (c == 0) {
        if (!hasQueuedPredecessors() &&
            compareAndSetState(0, acquires)) {
            setExclusiveOwnerThread(current);
            return true;
        }
    }
    else if (current == getExclusiveOwnerThread()) {
        int nextc = c + acquires;
        if (nextc < 0)
            throw new Error("Maximum lock count exceeded");
        setState(nextc);
        return true;
    }
    return false;
}

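The core idea is roughly this: check whether other threads are waiting in the queue. If the frontmost waiting thread is not the current thread, return false immediately.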

// AbstractQueuedSynchronizer.java
public final boolean hasQueuedPredecessors() {
    Node h, s;
    if ((h = head) != null) {
        if ((s = h.next) == null || s.waitStatus > 0) {
            s = null; // traverse in case of concurrent cancellation
            for (Node p = tail; p != h && p != null; p = p.prev) {
                if (p.waitStatus <= 0)
                    s = p;
            }
        }
        if (s != null && s.thread != Thread.currentThread())
            return true;
    }
    return false;
}

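That is, FairSync's tryAcquire() returns false, so execution continues with acquireQueued() in AQS and the thread prepares to enter the wait queue. This is what gives the fair lock its first-come, first-served behavior.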

public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
        acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
        selfInterrupt();
}
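The first-come, first-served order can be observed with a fair lock by queuing threads one at a time and recording the order in which they get the lock. This is a sketch using public `ReentrantLock` methods; the class and method names are hypothetical. Note that in this particular setup no thread tries to barge in, so a nonfair lock would usually produce the same order; the fair lock differs in that it guarantees it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: threads queued on a fair lock acquire it in arrival order. */
final class FairOrderDemo {
    static List<Integer> acquisitionOrder(int threads) {
        ReentrantLock lock = new ReentrantLock(true); // fair
        List<Integer> order = new ArrayList<>();      // only touched while holding the lock
        Thread[] workers = new Thread[threads];
        lock.lock();
        try {
            for (int i = 0; i < threads; i++) {
                final int id = i;
                workers[i] = new Thread(() -> {
                    lock.lock();
                    try {
                        order.add(id);
                    } finally {
                        lock.unlock();
                    }
                });
                workers[i].start();
                // Make sure this worker is enqueued before starting the next one.
                while (!lock.hasQueuedThread(workers[i])) {
                    Thread.yield();
                }
            }
        } finally {
            lock.unlock();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }
}
```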

End

Reference: https://www.cnblogs.com/waterystone/p/4920797.html

一条肥鱼
