Data Structures in Java: Queues (Part 1)

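This is day 3 of my participation in the 2022 First Writing Challenge. For event details, see: 2022 First Writing Challenge

Life is short, so you might as well get a dog.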

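Queues

Basic Concepts

  A queue is an ordered linear list in which elements can only be inserted at one end (the rear) and removed from the other end (the front). Think of the line at a cafeteria: a newly arrived student can only join at the end of the line, and the next student is served only after the one at the front has finished. The figure below shows an example of a queue.

[Figure: an example of a queue (队列.jpeg)]

The basic operations of the queue abstract data type are: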

  • void enQueue(T data);
  • T deQueue();

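There are three common ways to implement a queue:

  • with a simple circular array
  • with a dynamic circular array
  • with a linked list

  The first two approaches work on the same principle. The only difference is that the first uses a fixed-length array, while the second uses an array that can grow dynamically.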
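
As a rough illustration of the first approach, here is a minimal sketch of a fixed-capacity circular-array queue. The class name SimpleCircularQueue and the choice to throw when the queue is full or empty are assumptions made for this example; only the enQueue and deQueue names come from the list of operations above.

```java
import java.util.NoSuchElementException;

// Illustrative sketch of a fixed-capacity circular-array queue (not JDK code).
final class SimpleCircularQueue<T> {
    private final Object[] items;
    private int head;  // index of the element at the front
    private int size;  // number of stored elements

    SimpleCircularQueue(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        items = new Object[capacity];
    }

    // Insert at the rear; the modulo makes the index wrap back to the start of the array.
    void enQueue(T data) {
        if (size == items.length) {
            throw new IllegalStateException("queue is full");
        }
        items[(head + size) % items.length] = data;
        size++;
    }

    // Remove from the front.
    @SuppressWarnings("unchecked")
    T deQueue() {
        if (size == 0) {
            throw new NoSuchElementException("queue is empty");
        }
        T data = (T) items[head];
        items[head] = null;                 // drop the reference so it can be collected
        head = (head + 1) % items.length;   // advance the front, wrapping if needed
        size--;
        return data;
    }

    int size() {
        return size;
    }
}
```

A dynamic version would allocate a larger array and copy the elements over instead of throwing when the array is full, which is what ArrayDeque does below.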

Common Queues in Java

1. ArrayDeque

  ArrayDeque is implemented with the dynamic circular array described above. Here is how the JDK source introduces it:

/**
 * Resizable-array implementation of the {@link Deque} interface.  Array
 * deques have no capacity restrictions; they grow as necessary to support
 * usage.  They are not thread-safe; in the absence of external
 * synchronization, they do not support concurrent access by multiple threads.
 * Null elements are prohibited.  This class is likely to be faster than
 * {@link Stack} when used as a stack, and faster than {@link LinkedList}
 * when used as a queue.
**/

  As the documentation says, this deque has no capacity restriction: it grows as needed. Let's look at the code in detail.

  First, the fields of the class:
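
Before reading the source, a short usage sketch may help. It uses only the public java.util.ArrayDeque API through the Queue interface; the class and method names ArrayDequeFifoDemo, drainInOrder and rejectsNull are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class ArrayDequeFifoDemo {
    // Offers the values in order, then polls until empty and returns the poll order.
    static List<Integer> drainInOrder(int... values) {
        Queue<Integer> queue = new ArrayDeque<>();
        for (int v : values) {
            queue.offer(v);           // insert at the rear
        }
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());    // remove from the front
        }
        return out;
    }

    // Returns true if the deque rejects a null element with a NullPointerException.
    static boolean rejectsNull() {
        try {
            new ArrayDeque<String>().offer(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Elements come out in the order they went in, and null is refused, matching the "Null elements are prohibited" sentence in the Javadoc.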

/**
 * The array in which the elements of the deque are stored.
 * The capacity of the deque is the length of this array, which is
 * always a power of two. The array is never allowed to become
 * full, except transiently within an addX method where it is
 * resized (see doubleCapacity) immediately upon becoming full,
 * thus avoiding head and tail wrapping around to equal each
 * other.  We also guarantee that all array cells not holding
 * deque elements are always null.
 */
transient Object[] elements; // non-private to simplify nested class access

/**
 * The index of the element at the head of the deque (which is the
 * element that would be removed by remove() or pop()); or an
 * arbitrary number equal to tail if the deque is empty.
 */
transient int head;

/**
 * The index at which the next element would be added to the tail
 * of the deque (via addLast(E), add(E), or push(E)).
 */
transient int tail;

/**
 * The minimum capacity that we'll use for a newly created deque.
 * Must be a power of 2.
 */
private static final int MIN_INITIAL_CAPACITY = 8;

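  Apart from the constant, there are only three fields:

  • elements: an Object array that stores the elements. Its length is always a power of two, so resizing doubles the current length.
  • head: the array index of the element at the front of the queue.
  • tail: the array index at which the next element would be added at the rear (one slot past the last element, not the last element itself).

  Now let's look at the enqueue and dequeue methods and see whether they match what you expected:

Enqueue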
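
The power-of-two length is what lets the indices wrap around with a bitwise AND instead of a modulo. A small sketch of that arithmetic follows; the class PowerOfTwoMask and its method names are hypothetical helpers, not part of the JDK.

```java
final class PowerOfTwoMask {
    // Next index in a ring whose length is a power of two.
    // length - 1 has all low bits set, so the AND keeps the value in [0, length).
    static int next(int index, int length) {
        return (index + 1) & (length - 1);
    }

    // Previous index in the same ring; -1 & (length - 1) wraps to length - 1.
    static int prev(int index, int length) {
        return (index - 1) & (length - 1);
    }
}
```

With a length of 8, next(7, 8) wraps to 0 and prev(0, 8) wraps to 7. The trick only works when the length is a power of two, which is why the array is always doubled.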

/**
 * Inserts the specified element at the end of this deque.
 *
 * <p>This method is equivalent to {@link #add}.
 *
 * @param e the element to add
 * @throws NullPointerException if the specified element is null
 */
public void addLast(E e) {
    if (e == null)
        throw new NullPointerException();
    elements[tail] = e;
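    // Advance tail with a bitwise AND so that it wraps around the power-of-two array;
    // if it then equals head, the array is full and its capacity is doubled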
    if ( (tail = (tail + 1) & (elements.length - 1)) == head)
        doubleCapacity();
}


/**
 * Doubles the capacity of this deque.  Call only when full, i.e.,
 * when head and tail have wrapped around to become equal.
 */
private void doubleCapacity() {
    assert head == tail;
    int p = head;
    int n = elements.length;
    int r = n - p; // number of elements to the right of p
    int newCapacity = n << 1;
    if (newCapacity < 0)
        throw new IllegalStateException("Sorry, deque too big");
    Object[] a = new Object[newCapacity];
    System.arraycopy(elements, p, a, 0, r);
    System.arraycopy(elements, 0, a, r, p);
    elements = a;
    head = 0;
    tail = n;
}

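
To see what the two arraycopy calls in doubleCapacity do, here is a standalone sketch of the same copy step on a plain array. The class DoubleCapacitySketch and the method grow are made up for illustration, and the overflow check from the original is left out.

```java
final class DoubleCapacitySketch {
    // Copies a full ring buffer whose head and tail both equal p into an array
    // of twice the length, so that the elements start at index 0 in queue order.
    // Assumption: the overflow check of the real method is omitted here.
    static Object[] grow(Object[] elements, int p) {
        int n = elements.length;
        int r = n - p;                              // elements from p to the end of the array
        Object[] a = new Object[n << 1];
        System.arraycopy(elements, p, a, 0, r);     // front part of the queue first
        System.arraycopy(elements, 0, a, r, p);     // then the part that had wrapped around
        return a;
    }
}
```

For a full array {"c", "d", "a", "b"} with p = 2, the queue order is a, b, c, d, and grow returns an array of length 8 that starts with a, b, c, d followed by nulls. The new head would be 0 and the new tail 4, the old length.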

Dequeue

public E pollLast() {
    int t = (tail - 1) & (elements.length - 1);
    @SuppressWarnings("unchecked")
    // unchecked cast: the type check is suppressed here
    E result = (E) elements[t];
    if (result == null)
        return null;
    elements[t] = null;
    tail = t;
    return result;
}
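
  The code above is not hard, so it is not explained line by line. Note that pollLast() removes from the tail, which is the deque-style removal. When ArrayDeque is used as a FIFO queue, poll() delegates to pollFirst(), which reads elements[head] and advances head with the same mask.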
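
The difference between removing at the two ends is easy to check with the public API. The class PollEndsDemo below is a made-up example that uses only java.util.ArrayDeque methods.

```java
import java.util.ArrayDeque;

final class PollEndsDemo {
    // Fills a deque with 1, 2, 3 and returns {result of poll(), result of pollLast()}.
    static int[] pollBothEnds() {
        ArrayDeque<Integer> deque = new ArrayDeque<>();
        deque.addLast(1);
        deque.addLast(2);
        deque.addLast(3);
        int first = deque.poll();       // FIFO removal from the head
        int last = deque.pollLast();    // removal from the tail
        return new int[] {first, last};
    }
}
```

poll() returns the oldest element (1), while pollLast() returns the newest one (3), so a FIFO queue should use poll() or pollFirst().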

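2. AQS (AbstractQueuedSynchronizer)

  The famous AQS, the queued synchronizer, will be familiar to many readers. The often-discussed ReentrantLock and ReentrantReadWriteLock are both built on AQS.

  First, let's see how the documentation in the source introduces AQS: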

* Provides a framework for implementing blocking locks and related
* synchronizers (semaphores, events, etc) that rely on
* first-in-first-out (FIFO) wait queues.  This class is designed to
* be a useful basis for most kinds of synchronizers that rely on a
* single atomic {@code int} value to represent state. Subclasses
* must define the protected methods that change this state, and which
* define what that state means in terms of this object being acquired
* or released.  Given these, the other methods in this class carry
* out all queuing and blocking mechanics. Subclasses can maintain
* other state fields, but only the atomically updated {@code int}
* value manipulated using methods {@link #getState}, {@link
* #setState} and {@link #compareAndSetState} is tracked with respect
* to synchronization.
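
  According to the documentation, AQS provides a framework for implementing blocking locks and related synchronizers on top of a FIFO wait queue (the CLH queue). You can think of it as the centerpiece of the java.util.concurrent package.

  Unlike ArrayDeque, the CLH queue in AQS is implemented as a linked list, so we need to look at how the list node is implemented.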
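
To make "subclasses must define the protected methods that change this state" concrete, here is a minimal sketch of a non-reentrant mutex built on AQS. The class name SimpleMutex and the convention that state 0 means unlocked and 1 means locked are choices made for this example; the AQS methods it calls (compareAndSetState, getState, setState, acquire, release, setExclusiveOwnerThread) are part of the public JDK API.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Illustrative non-reentrant mutex: state 0 = unlocked, 1 = locked.
final class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // Succeeds only if the state atomically changes from 0 to 1.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getState() == 0) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);   // plain volatile write is enough, only the owner releases
            return true;
        }

        boolean isLocked() {
            return getState() != 0;
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }            // queues and blocks if tryAcquire fails
    boolean tryLock() { return sync.tryAcquire(1); }
    void unlock() { sync.release(1); }          // wakes the next queued thread, if any
    boolean isLocked() { return sync.isLocked(); }
}
```

The subclass only decides what the int state means. All queuing and blocking is done by acquire and release in AQS.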

static final class Node {
    /** Marker to indicate a node is waiting in shared mode */
    static final Node SHARED = new Node();
    /** Marker to indicate a node is waiting in exclusive mode */
    static final Node EXCLUSIVE = null;

    /** waitStatus value to indicate thread has cancelled */
    static final int CANCELLED =  1;
    /** waitStatus value to indicate successor's thread needs unparking */
    static final int SIGNAL    = -1;
    /** waitStatus value to indicate thread is waiting on condition */
    static final int CONDITION = -2;
    /**
     * waitStatus value to indicate the next acquireShared should
     * unconditionally propagate
     */
    static final int PROPAGATE = -3;

    /**
     * Status field, taking on only the values:
     *   SIGNAL:     The successor of this node is (or will soon be)
     *               blocked (via park), so the current node must
     *               unpark its successor when it releases or
     *               cancels. To avoid races, acquire methods must
     *               first indicate they need a signal,
     *               then retry the atomic acquire, and then,
     *               on failure, block.
     *   CANCELLED:  This node is cancelled due to timeout or interrupt.
     *               Nodes never leave this state. In particular,
     *               a thread with cancelled node never again blocks.
     *   CONDITION:  This node is currently on a condition queue.
     *               It will not be used as a sync queue node
     *               until transferred, at which time the status
     *               will be set to 0. (Use of this value here has
     *               nothing to do with the other uses of the
     *               field, but simplifies mechanics.)
     *   PROPAGATE:  A releaseShared should be propagated to other
     *               nodes. This is set (for head node only) in
     *               doReleaseShared to ensure propagation
     *               continues, even if other operations have
     *               since intervened.
     *   0:          None of the above
     *
     * The values are arranged numerically to simplify use.
     * Non-negative values mean that a node doesn't need to
     * signal. So, most code doesn't need to check for particular
     * values, just for sign.
     *
     * The field is initialized to 0 for normal sync nodes, and
     * CONDITION for condition nodes.  It is modified using CAS
     * (or when possible, unconditional volatile writes).
     */
    volatile int waitStatus;

    /**
     * Link to predecessor node that current node/thread relies on
     * for checking waitStatus. Assigned during enqueuing, and nulled
     * out (for sake of GC) only upon dequeuing.  Also, upon
     * cancellation of a predecessor, we short-circuit while
     * finding a non-cancelled one, which will always exist
     * because the head node is never cancelled: A node becomes
     * head only as a result of successful acquire. A
     * cancelled thread never succeeds in acquiring, and a thread only
     * cancels itself, not any other node.
     */
    volatile Node prev;

    /**
     * Link to the successor node that the current node/thread
     * unparks upon release. Assigned during enqueuing, adjusted
     * when bypassing cancelled predecessors, and nulled out (for
     * sake of GC) when dequeued.  The enq operation does not
     * assign next field of a predecessor until after attachment,
     * so seeing a null next field does not necessarily mean that
     * node is at end of queue. However, if a next field appears
     * to be null, we can scan prev's from the tail to
     * double-check.  The next field of cancelled nodes is set to
     * point to the node itself instead of null, to make life
     * easier for isOnSyncQueue.
     */
    volatile Node next;

    /**
     * The thread that enqueued this node.  Initialized on
     * construction and nulled out after use.
     */
    volatile Thread thread;

    /**
     * Link to next node waiting on condition, or the special
     * value SHARED.  Because condition queues are accessed only
     * when holding in exclusive mode, we just need a simple
     * linked queue to hold nodes while they are waiting on
     * conditions. They are then transferred to the queue to
     * re-acquire. And because conditions can only be exclusive,
     * we save a field by using special value to indicate shared
     * mode.
     */
    Node nextWaiter;

    /**
     * Returns true if node is waiting in shared mode.
     */
    final boolean isShared() {
        return nextWaiter == SHARED;
    }

    /**
     * Returns previous node, or throws NullPointerException if null.
     * Use when predecessor cannot be null.  The null check could
     * be elided, but is present to help the VM.
     *
     * @return the predecessor of this node
     */
    final Node predecessor() throws NullPointerException {
        Node p = prev;
        if (p == null)
            throw new NullPointerException();
        else
            return p;
    }

    Node() {    // Used to establish initial head or SHARED marker
    }

    Node(Thread thread, Node mode) {     // Used by addWaiter
        this.nextWaiter = mode;
        this.thread = thread;
    }

    Node(Thread thread, int waitStatus) { // Used by Condition
        this.waitStatus = waitStatus;
        this.thread = thread;
    }
}
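
  This nested node class has 5 instance fields and 6 static constants (the two mode markers SHARED and EXCLUSIVE, plus four waitStatus values). The fields worth attention are:

  • waitStatus: the wait status of the node. It takes one of the values SIGNAL, CANCELLED, CONDITION, PROPAGATE, or 0.
  • prev: the predecessor node. A node relies on its predecessor's waitStatus. If predecessors have been cancelled, they are skipped until a non-cancelled one is found, which shortens the chain the node waits behind.
  • next: the successor node, which is the node that gets unparked on release. It is set to null on dequeue to help GC. Note that a null next does not necessarily mean the node is the tail, because enq assigns the predecessor's next only after the new node has been attached. In that case, scan backwards from the tail through prev to double-check.
  • thread: the thread that enqueued this node. Every thread waiting to acquire the resource has its own node.
  • nextWaiter: the next node waiting on a condition, or the special value SHARED. It therefore doubles as the mode marker that tells whether the node is waiting in SHARED or EXCLUSIVE mode (see isShared()).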
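
The way nextWaiter doubles as a mode marker can be shown with a stripped-down stand-in. MiniNode below is a hypothetical class written for this example; it keeps only the SHARED/EXCLUSIVE markers and the isShared() check from the Node source above.

```java
// Hypothetical, stripped-down stand-in for AQS's Node: only the mode marker is kept.
final class MiniNode {
    // A unique sentinel object marks shared mode; exclusive mode is simply null.
    static final MiniNode SHARED = new MiniNode(null);
    static final MiniNode EXCLUSIVE = null;

    final MiniNode nextWaiter;

    MiniNode(MiniNode mode) {
        this.nextWaiter = mode;
    }

    // Identity comparison against the sentinel, as in Node.isShared().
    boolean isShared() {
        return nextWaiter == SHARED;
    }
}
```

A node created with new MiniNode(MiniNode.SHARED) reports isShared() as true, while one created with MiniNode.EXCLUSIVE reports false, so no separate mode field is needed.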

  As the documentation in the source shows, the head of the CLH sync queue is actually a dummy node.

CLH queues need a dummy header node to get started

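  This head node is not created in the constructor. It is created lazily, when the queue is first built because a thread actually has to wait for the resource.

  Since we are not exploring how AQS implements synchronizers this time, we will only look at how the CLH queue enqueues and dequeues.

Enqueue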

/**
 * Creates and enqueues node for current thread and given mode.
 *
 * @param mode Node.EXCLUSIVE for exclusive, Node.SHARED for shared
 * @return the new node
 */
private Node addWaiter(Node mode) {
    Node node = new Node(Thread.currentThread(), mode);
    // Try the fast path of enq; backup to full enq on failure
    Node pred = tail;
    // fast path: try a single CAS on the tail
    if (pred != null) {
        node.prev = pred;
        if (compareAndSetTail(pred, node)) {
            pred.next = node;
            return node;
        }
    }
    // fast path failed: enq spins and retries until the insert succeeds
    enq(node);
    return node;
}

/**
 * Inserts node into queue, initializing if necessary. See picture above.
 * @param node the node to insert
 * @return node's predecessor
 */
private Node enq(final Node node) {
    for (;;) {
        Node t = tail;
        if (t == null) { // Must initialize
            // a dummy node is installed here as the head
            if (compareAndSetHead(new Node()))
                tail = head;
        } else {
            node.prev = t;
            if (compareAndSetTail(t, node)) {
                t.next = node;
                return t;
            }
        }
    }
}
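
  The code is fairly simple. The point to note is that, to stay safe under concurrency, it uses CAS operations (in the version of the source quoted here they are implemented with methods of the Unsafe class, which interested readers can look into), and the Node fields waitStatus, prev, next and thread are declared volatile.

Dequeue

  After the head node of the CLH queue releases the synchronization state, it wakes up its successor. When the successor acquires the state successfully, it sets itself as the head. The method is: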
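
The same enqueue pattern can be sketched with public atomics instead of Unsafe. CasQueueSketch below is an illustration, not AQS code: it uses AtomicReference for head and tail, its nodes carry a hypothetical name payload instead of a thread, and size() is a helper added only so the result can be checked.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CasQueueSketch {
    static final class Node {
        final String name;      // illustrative payload; null for the dummy head
        volatile Node prev;
        volatile Node next;

        Node(String name) {
            this.name = name;
        }
    }

    private final AtomicReference<Node> head = new AtomicReference<>();
    private final AtomicReference<Node> tail = new AtomicReference<>();

    // Same shape as enq: lazily install a dummy head, then CAS the tail in a loop.
    // Returns the predecessor of the inserted node.
    Node enqueue(Node node) {
        for (;;) {
            Node t = tail.get();
            if (t == null) {
                // Queue not initialized yet: exactly one thread wins this CAS.
                Node dummy = new Node(null);
                if (head.compareAndSet(null, dummy)) {
                    tail.set(dummy);
                }
            } else {
                node.prev = t;                       // prev is set before the CAS
                if (tail.compareAndSet(t, node)) {
                    t.next = node;                   // next is set only after the CAS
                    return t;
                }
            }
        }
    }

    // Counts real nodes by walking prev links from the tail back to the dummy head.
    int size() {
        int n = 0;
        for (Node p = tail.get(); p != null && p != head.get(); p = p.prev) {
            n++;
        }
        return n;
    }
}
```

Because prev is written before the tail CAS and next only after it, walking backwards from the tail always sees every attached node, while a forward walk through next may briefly miss the newest one.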

private void setHead(Node node) {
    head = node;
    node.thread = null;
    node.prev = null;
}
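
  No CAS is needed here, because setHead is only called by the thread that has just acquired successfully, so only one thread performs this operation at a time.

  Those are two commonly seen examples of implementing a queue with a circular array and with a linked list.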
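
The effect of setHead is that the node of the thread that just acquired becomes the new dummy head. Here is a single-threaded sketch of that step; HeadAdvanceSketch, its owner field (standing in for the waiting thread) and the append and advance methods are all made up for this illustration.

```java
// Single-threaded illustration only: no CAS, no real threads.
final class HeadAdvanceSketch {
    static final class Node {
        String owner;   // stands in for the waiting thread
        Node prev;
        Node next;

        Node(String owner) {
            this.owner = owner;
        }
    }

    Node head = new Node(null);   // dummy head
    Node tail = head;

    void append(String owner) {
        Node n = new Node(owner);
        n.prev = tail;
        tail.next = n;
        tail = n;
    }

    // The first waiter "acquires": its node becomes the new dummy head.
    // Returns the owner that acquired, or null if nobody is waiting.
    String advance() {
        Node first = head.next;
        if (first == null) {
            return null;
        }
        String owner = first.owner;
        Node oldHead = head;
        head = first;          // the same three assignments as setHead
        first.owner = null;
        first.prev = null;
        oldHead.next = null;   // unlink the old head so it can be collected
        return owner;
    }
}
```

After advance(), the old head is unreachable and the former first waiter holds no owner, so it plays the role of the dummy head for whoever is queued behind it.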

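Applications

Some common applications:

  • Sequential task scheduling
  • Multiprogramming
  • Asynchronous data transfer (pipes)
  • As an auxiliary data structure in algorithms

  Concrete implementations of each are not covered one by one here; interested readers can search for them.

  Finally, happy New Year to everyone, and may the pandemic end soon!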
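
As one illustration of the last item, breadth-first search uses a queue to visit vertices in order of distance from the start. The sketch below assumes a graph given as adjacency lists indexed by vertex number; BfsSketch and bfs are names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class BfsSketch {
    // Returns the vertices reachable from start in breadth-first visiting order.
    // adjacency.get(v) lists the neighbours of vertex v.
    static List<Integer> bfs(List<List<Integer>> adjacency, int start) {
        boolean[] visited = new boolean[adjacency.size()];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();
        visited[start] = true;
        queue.offer(start);
        while (!queue.isEmpty()) {
            int v = queue.poll();             // oldest discovered vertex first
            order.add(v);
            for (int w : adjacency.get(v)) {
                if (!visited[w]) {
                    visited[w] = true;        // mark on discovery to avoid duplicates
                    queue.offer(w);
                }
            }
        }
        return order;
    }
}
```

The FIFO order is what guarantees that all vertices at distance k are visited before any vertex at distance k + 1.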


Reposted from: juejin.im/post/7060133697671921695