Source Code Analysis: SynchronousQueue, a Unique Queue

Summary: SynchronousQueue is a unique queue that has no capacity of its own. When a caller puts an item into the queue, the call cannot return immediately; it must wait until another thread consumes the item. Only then can it return.

This article is shared from the HUAWEI CLOUD Community article "SynchronousQueue Source Code Analysis", author: JavaEdge.

1 Introduction

SynchronousQueue is a unique queue that has no capacity of its own. When a caller puts an item into the queue, the call cannot return immediately; it must wait until another thread consumes that item before it can return. SynchronousQueue is widely used in MQ implementations. In this article, let us see from the source code how SynchronousQueue implements this behavior.
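To make this concrete, here is a minimal demo (the class name HandOffDemo is our own) showing that put blocks until a matching take arrives:

import java.util.concurrent.SynchronousQueue;

public class HandOffDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> queue = new SynchronousQueue<>();

        Thread producer = new Thread(() -> {
            try {
                System.out.println("producer: putting...");
                queue.put("hello");                 // blocks until a take arrives
                System.out.println("producer: put returned");
            } catch (InterruptedException ignored) { }
        });
        producer.start();

        Thread.sleep(500);                          // let the producer block first
        System.out.println("consumer: took " + queue.take()); // releases the producer
        producer.join();
    }
}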

2 Overall Architecture

Unlike blocking queues such as ArrayBlockingQueue and LinkedBlockingDeque, which rely on AQS for concurrency control, SynchronousQueue uses CAS operations directly for safe data access, so its source code is full of CAS code.

The overall design of SynchronousQueue is fairly abstract. Internally it abstracts two algorithm implementations: a first-in, first-out queue and a last-in, first-out stack, implemented by two inner classes. The externally visible put and take methods are very simple: they delegate directly to the transfer methods of the two inner classes. The overall call relationship is shown in the following figure:

2.1 Class comments

The queue stores no data, so it has no size and cannot be iterated; an insert operation can only return after another thread has performed the corresponding remove operation on that data, and vice versa;

The queue is backed by two data structures, a last-in, first-out stack and a first-in, first-out queue; the stack is unfair and the queue is fair.

How is the second point achieved? How is the stack implemented? We will reveal this bit by bit.

2.2 Class Diagram

The overall class diagram of SynchronousQueue is similar to that of LinkedBlockingQueue. Both implement the BlockingQueue interface, but because SynchronousQueue stores no data, some methods have no real implementation, such as isEmpty, size, contains, remove, and the iteration methods; they simply return fixed default values, as shown in the following screenshot:
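Since nothing is ever stored, those methods degenerate to constants. A quick check (a minimal sketch; the class name is our own):

import java.util.concurrent.SynchronousQueue;

public class DefaultsDemo {
    public static void main(String[] args) {
        SynchronousQueue<Integer> q = new SynchronousQueue<>();
        System.out.println(q.size());               // always 0
        System.out.println(q.isEmpty());            // always true
        System.out.println(q.contains(1));          // always false
        System.out.println(q.peek());               // always null
        System.out.println(q.iterator().hasNext()); // always false
    }
}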

2.3 Structural details

The underlying structure of SynchronousQueue is completely different from that of other queues: it uses two special data structures, a queue and a stack. Let's look at them:

// The interface shared by the stack and the queue,
// responsible for performing put or take
abstract static class Transferer<E> {
    // If e is null, this is a take and the value offered by a producer is returned;
    // if e is non-null, this is a put and e is handed to a consumer.
    // timed == true means the call has a timeout of nanos.
    abstract E transfer(E e, boolean timed, long nanos);
}

// Stack: last-in, first-out, unfair
// Scherer-Scott algorithm
static final class TransferStack<E> extends Transferer<E> {
}

// Queue: first-in, first-out, fair
static final class TransferQueue<E> extends Transferer<E> {
}

private transient volatile Transferer<E> transferer;

// The no-arg constructor defaults to the unfair mode
public SynchronousQueue(boolean fair) {
    transferer = fair ? new TransferQueue<E>() : new TransferStack<E>();
}

From the source code we can see a few points:

Both the stack and the queue share a common interface called Transferer, which declares a single method, transfer. This method is special in that it serves double duty, carrying out both take and put;

At construction time we can choose whether to use the stack or the queue; if we don't choose, the default is the stack. As the class comment notes, the stack is more efficient than the queue.

Next, let's look at the specific implementation of stacks and queues.

3 The Unfair Stack

3.1 The structure of the stack

First, let's introduce the overall structure of the stack, as follows:

As the figure shows, there is a large stack pool whose opening is called the stack head. A put pushes data into the pool and a take removes data from it, and both operate on the stack head. The closer a node is to the head, the newer its data, so each take retrieves the newest data at the head. This is last-in, first-out, which is why the stack is unfair.

The SNode in the figure is how a stack element is represented in the source code. Let's look at its fields:

  • volatile SNode next
    The node below the current one on the stack, i.e. the element that was pushed before it
  • volatile SNode match
    The matching node, used to decide when a blocked node can be woken. For example, suppose a take runs first while the queue has no data: the take blocks, and its node is SNode1. When a put arrives, the put's node is assigned to SNode1's match field and the take is woken; on waking, the take finds that SNode1.match has a value, takes the put's data, and returns
  • volatile Thread waiter
    Blocking a stack node is implemented by parking its thread; waiter is that blocked thread
  • Object item
    The data not yet delivered, or the data not yet consumed
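Putting the four fields together with the mode flag, the SNode skeleton looks roughly like this (a simplified sketch of the JDK inner class; the CAS helpers are omitted):

static final class SNode {
    volatile SNode next;    // the node below this one on the stack
    volatile SNode match;   // the node that fulfilled this one; set by the matcher
    volatile Thread waiter; // the parked thread waiting on this node, if any
    Object item;            // the data for a put, or null for a take
    int mode;               // REQUEST (take), DATA (put), or with the FULFILLING bit set

    SNode(Object item) {
        this.item = item;
    }
}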

3.2 Push and pop

  • Push
    Methods such as put place data into the stack pool
  • Pop
    Methods such as take remove data from the stack pool

Both operations work on the stack head. Although one takes data off the head while the other puts data onto it, the underlying method is the same. The source code is as follows:

The logic of the transfer method is fairly intricate, because the take and put paths are interwoven in it.

@SuppressWarnings("unchecked")
E transfer(E e, boolean timed, long nanos) {
    SNode s = null; // constructed/reused as needed
 
    // e == null: the take method; e != null: the put method
    int mode = (e == null) ? REQUEST : DATA;
 
    // spin
    for (;;) {
        // Cases for the head node:
        // 1: null - the queue has no data yet
        // 2: non-null, take mode - the head's thread is waiting to take data
        // 3: non-null, put mode - the head's thread is waiting to put data
        SNode h = head;
 
        // The head is null, meaning the queue has no data yet;
        // or the head is non-null and has the same mode as this operation.
        // For example, if both are puts, push this put on top of the head
        // so that this put can execute first.
        if (h == null || h.mode == mode) {  // empty or same-mode
            // A timeout was set and this push or pop has already timed out:
            // abandon this operation and return null.
            // If the head has been cancelled, drop it and use the next node.
            if (timed && nanos <= 0) {      // can't wait
                // The head's operation was cancelled
                if (h != null && h.isCancelled())
                    // Drop the head; its successor becomes the new head
                    casHead(h, h.next);     // pop cancelled node
                // The head is null: return null directly
                else
                    return null;
            // Not timed out: make e the new head
            } else if (casHead(h, s = snode(s, e, h, mode))) {
                // e waits to be matched: either a take on an empty queue, or a put
                SNode m = awaitFulfill(s, timed, nanos);
                if (m == s) {               // wait was cancelled
                    clean(s);
                    return null;
                }
                // s used to be the head; a fulfilling node has since been
                // pushed on top of s, so help pop it off
                if ((h = head) != null && h.next == s)
                    casHead(h, s.next);     // help s's fulfiller
                return (E) ((mode == REQUEST) ? m.item : s.item);
            }
        // The head is waiting for another thread to put or take.
        // For example, the head is a blocked put and this operation
        // happens to be a take: we come here.
        } else if (!isFulfilling(h.mode)) { // try to fulfill
            // The head has been cancelled; make the next element the head
            if (h.isCancelled())            // already cancelled
                casHead(h, h.next);         // pop and retry
            // The third argument of snode is the old head h,
            // which is assigned to the new node s's next field
            else if (casHead(h, s=snode(s, e, h, FULFILLING|mode))) {
                for (;;) { // loop until matched or waiters disappear
                    // m is the old head, just linked below s by snode above
                    SNode m = s.next;       // m is s's match
                    if (m == null) {        // all waiters are gone
                        casHead(s, null);   // pop fulfill node
                        s = null;           // use new node next time
                        break;              // restart main loop
                    }
                    SNode mn = m.next;
                     // tryMatch is a very important method, with two jobs:
                     // 1: wake the blocked head m; 2: assign the current
                     // node s to m's match field, so that when m wakes up
                     // it can read this operation from m.match.
                     // s.item carries this operation's data.
                    if (m.tryMatch(s)) {
                        casHead(s, mn);     // pop both s and m
                        return (E) ((mode == REQUEST) ? m.item : s.item);
                    } else                  // lost match
                        s.casNext(m, mn);   // help unlink
                }
            }
        } else {                            // help a fulfiller
            SNode m = h.next;               // m is h's match
            if (m == null)                  // waiter is gone
                casHead(h, null);           // pop fulfilling node
            else {
                SNode mn = m.next;
                if (m.tryMatch(h))          // help match
                    casHead(h, mn);         // pop both h and m
                else                        // lost match
                    h.casNext(m, mn);       // help unlink
            }
        }
    }
}

To summarize the flow:

  1. Determine whether this is a put or a take.
  2. Check the stack head. If it is null, or its operation mode matches this operation, go to 3; otherwise go to 5.
  3. Check whether a timeout was set. If it was and it has already expired, return null; otherwise go to 4.
  4. If the stack head is null, make the current operation the new head; if the head is not null but its mode matches this operation, likewise make the current operation the new head. Then wait for another thread to fulfill it; if none does, block. For example, if the current operation is a take but the queue has no data, it blocks itself.
  5. The stack head is blocked and needs someone else to wake it. Check whether the current operation can wake it: if yes, go to 6; otherwise go to 4.
  6. Assign the current node to the head's match field, and wake the head node.
  7. When the head wakes up, it reads its match field to learn which node woke it, and returns that node's data.
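Step 6 above is performed by SNode.tryMatch. The core idea, as a simplified sketch meant to sit inside the SNode skeleton from section 3.1 (the real JDK code performs the CAS via Unsafe/VarHandle; the MATCH updater here is our own stand-in, and it needs java.util.concurrent.atomic.AtomicReferenceFieldUpdater and java.util.concurrent.locks.LockSupport):

// Simplified sketch of the tryMatch idea, not the JDK source
static final AtomicReferenceFieldUpdater<SNode, SNode> MATCH =
        AtomicReferenceFieldUpdater.newUpdater(SNode.class, SNode.class, "match");

boolean tryMatch(SNode s) {
    // Install s as this node's match exactly once
    if (match == null && MATCH.compareAndSet(this, null, s)) {
        Thread w = waiter;
        if (w != null) {            // a thread is parked on this node
            waiter = null;
            LockSupport.unpark(w);  // wake it; it will see match != null and return
        }
        return true;
    }
    return match == s;              // still a success if s already matched this node
}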

In the whole process there is a method that blocks a node, awaitFulfill. From its (translated) source comment:

When a node/thread is about to block, it sets its waiter field and then rechecks its state at least once more before actually parking. This covers the race with a fulfiller that notices waiter is non-null and therefore knows the thread should be woken.

When invoked by a node that appears to be at the top of the stack at the call site, the call to park is preceded by spins, to avoid blocking when producers and consumers are arriving very close together in time. This matters enough to bother with only on multiprocessors.

The order of the checks on return from the main loop reflects the priority: interrupt > normal return > timeout. (So, on timeout, one last check for a match is made before giving up.) The exception is calls from untimed SynchronousQueue.poll/offer, which do not check interrupts and do not wait at all, and so are handled inside the transfer method instead of calling awaitFulfill.

/**
 * Spins/blocks until node s is matched by a fulfilling operation.
 * @param s the waiting node
 * @param timed true if timed wait
 * @param nanos the timeout
 * @return the matched node, or s itself if cancelled
 */
SNode awaitFulfill(SNode s, boolean timed, long nanos) {
 
    // deadline: if a timeout was set, it is now + timeout; otherwise 0
    final long deadline = timed ? System.nanoTime() + nanos : 0L;
    Thread w = Thread.currentThread();
    // Number of spins: 32 with a timeout, 512 without.
    // For example, if this is a take and after the spins are used up no
    // other thread has put data, the thread blocks: for the remaining
    // time if a timeout was set, otherwise indefinitely.
    int spins = (shouldSpin(s) ?
                 (timed ? maxTimedSpins : maxUntimedSpins) : 0);
    for (;;) {
        // If the current thread has been interrupted, cancel the node
        if (w.isInterrupted())
            s.tryCancel();

        SNode m = s.match;
        if (m != null)
            return m;
        if (timed) {
            nanos = deadline - System.nanoTime();
            // Timed out: cancel the current thread's wait
            if (nanos <= 0L) {
                s.tryCancel();
                continue;
            }
        }
        // Decrement the spin count
        if (spins > 0)
            spins = shouldSpin(s) ? (spins-1) : 0;
        // Record the current thread as the waiter; blocking and waking
        // are done through this thread reference
        else if (s.waiter == null)
            s.waiter = w; // establish waiter so can park next iter
        else if (!timed)
            // Block via park (covered in the chapter on locks)
            LockSupport.park(this);
        else if (nanos > spinForTimeoutThreshold)
            LockSupport.parkNanos(this, nanos);
    }
}

Notice the blocking strategy: a thread does not block as soon as it arrives. Only when, after spinning a certain number of times, no other thread has shown up to fulfill it does it truly block.
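The same spin-then-park strategy, stripped to a skeleton. This is an illustrative sketch with made-up names (SpinThenPark, slot, fulfill), not the JDK code:

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

class SpinThenPark<V> {
    private final AtomicReference<V> slot = new AtomicReference<>();
    private volatile Thread waiter;

    V await(int maxSpins) {
        int spins = maxSpins;
        for (;;) {
            V v = slot.get();
            if (v != null)
                return v;                        // fulfilled, perhaps while spinning
            if (spins > 0)
                spins--;                         // cheap busy-wait first
            else if (waiter == null)
                waiter = Thread.currentThread(); // publish the waiter, recheck once more
            else
                LockSupport.park(this);          // only now truly block
        }
    }

    void fulfill(V v) {
        slot.set(v);                             // make the value visible
        Thread w = waiter;
        if (w != null)
            LockSupport.unpark(w);               // wake the parked thread
    }
}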

Queue implementation strategies are usually divided into a fair mode and an unfair mode. Next, let's look at the fair mode.

4 Fair Queue

4.1 Element composition

  • volatile QNode next
    The node after the current one in the queue
  • volatile Object item // CAS'ed to or from null
    The value of the current node; if the current node is blocked, the thread that wakes it writes its own value into item
  • volatile Thread waiter // to control park/unpark
    The thread blocked on the current node
  • final boolean isData
    true for a put, false for a take
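Put together, the QNode skeleton looks roughly like this (a simplified sketch of the JDK inner class; the CAS helpers such as casNext and casItem are omitted):

static final class QNode {
    volatile QNode next;    // the next node in the queue
    volatile Object item;   // CAS'ed to or from null as the hand-off happens
    volatile Thread waiter; // the parked thread waiting on this node, if any
    final boolean isData;   // true for a put node, false for a take node

    QNode(Object item, boolean isData) {
        this.item = item;
        this.isData = isData;
    }
}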

The fair queue is driven by the transfer method of the TransferQueue inner class. See the source code:

E transfer(E e, boolean timed, long nanos) {

    QNode s = null; // constructed/reused as needed
    // true: put, false: take
    boolean isData = (e != null);

    for (;;) {
        // Snapshot the head and tail; when the queue is empty, t == h
        QNode t = tail;
        QNode h = head;
        // Loop while tail and head are uninitialized.
        // Although this continue burns CPU, it should essentially never
        // happen: tail and head are assigned a sentinel node when the
        // TransferQueue is constructed.
        if (t == null || h == null)
            continue;
        // Head and tail are the same node: the queue is empty.
        // Or the tail node's mode matches this operation's mode.
        if (h == t || t.isData == isData) {
            QNode tn = t.next;
            // If t is no longer tail, tail has been modified by another
            // thread: otherwise t and tail must be equal, since t was
            // just assigned from tail above.
            if (t != tail)
                continue;
            // The node after the tail is not null, so t is not the real
            // tail; advance the tail to tn. This is a consistency check.
            if (tn != null) {
                advanceTail(t, tn);
                continue;
            }
            // Timed out: return null directly
            if (timed && nanos <= 0)        // can't wait
                return null;
            // Construct the node
            if (s == null)
                s = new QNode(e, isData);
            // If linking e at the tail fails, retry the loop
            if (!t.casNext(null, s))        // failed to link in
                continue;

            advanceTail(t, s);              // swing tail and wait
            // Block the current thread
            Object x = awaitFulfill(s, e, timed, nanos);
            if (x == s) {                   // wait was cancelled
                clean(t, s);
                return null;
            }

            if (!s.isOffList()) {           // not already unlinked
                advanceHead(t, s);          // unlink if head
                if (x != null)              // and forget fields
                    s.item = s;
                s.waiter = null;
            }
            return (x != null) ? (E)x : e;
        // The queue is not empty and this operation differs from the
        // queued nodes' mode, i.e. it is the complementary operation.
        // For example, if the queued nodes are takes blocked waiting for
        // data, this operation must be a put.
        } else {                            // complementary-mode
            // m is head.next, the first waiting node (on the first pass
            // it may also be the tail). This line is where the queue's
            // fairness shows: each fulfillment starts from the head,
            // in arrival order.
            QNode m = h.next;               // node to fulfill
            if (t != tail || m == null || h != head)
                continue;                   // inconsistent read

            Object x = m.item;
            if (isData == (x != null) ||    // m already fulfilled
                x == m ||                   // m cancelled
                // Write this operation's value into the blocked node m's
                // item field, so that m can read the value when released
                !m.casItem(x, e)) {         // lost CAS
                advanceHead(h, m);          // dequeue and retry
                continue;
            }
            // Advance the head past m
            advanceHead(h, m);              // successfully fulfilled
            // Wake the blocked head node's thread
            LockSupport.unpark(m.waiter);
            return (x != null) ? (E)x : e;
        }
    }
}

After a thread blocks, how does another thread hand its data to the blocked one?

Suppose thread 1 tries to take from the queue and blocks, becoming blocked node A; then thread 2 starts to put data B into the queue. The rough flow is:

  • Thread 1 takes from the queue, finds no data, and blocks as node A
  • When thread 2 puts its data, it looks for the first blocked node starting from the head of the queue. Suppose it finds node A: thread 2 writes the put data into node A's item field and wakes thread 1
  • When thread 1 wakes, it reads the data put by thread 2 from A.item, and its take returns successfully

In this process, fairness shows in the fact that every put enqueues at the tail, while data is never grabbed from some arbitrary position: each fulfillment matches the first blocked node at the head of the queue, so blocked threads are released in arrival order.
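Here is a small demo of that ordering (the class name is our own, and the result is timing-dependent: the sleeps only make the intended interleaving likely, not guaranteed):

import java.util.concurrent.SynchronousQueue;

public class FairOrderDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<Integer> q = new SynchronousQueue<>(true); // fair: FIFO queue
        for (int i = 1; i <= 3; i++) {
            final int v = i;
            new Thread(() -> {
                try {
                    q.put(v);               // blocks until matched by a take
                } catch (InterruptedException ignored) { }
            }).start();
            Thread.sleep(100);              // enqueue the producers in order 1, 2, 3
        }
        for (int i = 0; i < 3; i++)
            System.out.println(q.take());   // fair mode prints 1, 2, 3
    }
}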

4.2 Illustrating the fair model

In fair mode, the underlying implementation is the TransferQueue, which keeps head and tail pointers to the thread nodes currently waiting for a match.

When initialized, the status of the TransferQueue is as follows:

1. Thread put1 executes put(1). Since there is no paired consumer thread yet, put1 enqueues itself, spins for a while, and then sleeps and waits. The queue state is now as follows:

2. Next, thread put2 executes put(2). As before, put2 enqueues itself, spins for a while, and then sleeps and waits. The queue state is now as follows:

3. Now a thread take1 arrives and performs a take. Since the tail points to the put2 node, put2 and take1 form a complementary pair (one put, one take), so take1 does not need to enqueue. But pay attention: the thread that gets woken is not put2 but put1.

Why? Remember that we are discussing the fair strategy: whoever enqueued first is woken first, and in our example that is clearly put1. Some readers may wonder: take1 apparently pairs with put2, yet it is put1 that is woken for consumption. How is it guaranteed that take1 actually matches the head.next node? Take a piece of paper and draw it out, and you will see that this is exactly what happens.

The fair strategy can be summed up as: an arrival at the tail matches the node at the head.

After this step, put1 has been woken and take1's take() returns 1 (put1's data), completing a one-to-one hand-off between the threads. The internal state is now as follows:

4. Finally, another thread take2 performs a take. Only put2 is waiting now, so the two match: put2 is woken and take2's take returns 2 (put2's data). The queue is back to its initial state, as follows:

That is the fair-mode implementation model of SynchronousQueue. In short: the tail matches the head, first in, first out; this is where the fairness comes from.

5 The Unfair Model

5.1 Element composition

  • volatile SNode head
    The top of the stack
  • volatile SNode next
    The next node down the stack
  • volatile Object item // data; or null for REQUESTs
    The value of the current node; if the current node is blocked, the thread that wakes it writes its own value into item
  • volatile Thread waiter
    The thread blocked on the current node

5.2 Illustrating the unfair model

We use the same sequence of operations as in the fair mode, so we can compare the differences between the two strategies.

The unfair mode is implemented on top of TransferStack, a stack. The implementation uses a head pointer to point at the top of the stack. Let's look at its model:

1. Thread put1 executes put(1). Since there is no paired consumer thread yet, put1 is pushed onto the stack, spins for a while, and then sleeps and waits. The stack state is now as follows:

2. Then thread put2 executes put(2). As before, put2 is pushed onto the stack, spins for a while, and then sleeps and waits. The stack state is now as follows:

3. Now a thread take1 arrives and performs a take. It finds the put2 node at the top of the stack and the match succeeds, but the implementation first pushes a take1 node onto the stack, after which the take1 thread loops, executing the logic that matches it with put2. If there is no concurrency conflict, the stack top pointer ends up pointing directly at the put1 node.

4. Finally, another thread take2 performs a take; the logic is essentially the same as step 3. take2 is pushed onto the stack and then matched with put1 in a loop. Once all matches complete, the stack is empty again, back to the initial state, as shown below:

As this sequence shows, although the put1 thread was pushed first, it is matched last; that is the origin of the unfairness.
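Running essentially the same demo as in section 4.1, but against the default unfair constructor, shows the reversed order (again, our own class name, and timing-dependent):

import java.util.concurrent.SynchronousQueue;

public class UnfairOrderDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<Integer> q = new SynchronousQueue<>(); // default: unfair stack
        for (int i = 1; i <= 3; i++) {
            final int v = i;
            new Thread(() -> {
                try {
                    q.put(v);               // blocks; node is pushed onto the stack
                } catch (InterruptedException ignored) { }
            }).start();
            Thread.sleep(100);              // push the producers in order 1, 2, 3
        }
        for (int i = 0; i < 3; i++)
            System.out.println(q.take());   // unfair mode typically prints 3, 2, 1
    }
}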

6 Summary

The source code of SynchronousQueue is fairly complex. We recommend stepping through it in a debugger. We have prepared a debug class for you, SynchronousQueueDemo: download the source code and debug it yourself, and the learning should come easier.

  • Why can SynchronousQueue work without a container to store elements?
    Having no container means there is no array-like memory for holding multiple elements; there is, however, a single slot of memory used to exchange data
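A quick sketch of that property (the class name is our own): a non-blocking offer fails when no consumer is waiting, and nothing is ever buffered for a later poll.

import java.util.concurrent.SynchronousQueue;

public class NoContainerDemo {
    public static void main(String[] args) {
        SynchronousQueue<String> q = new SynchronousQueue<>();
        // No consumer is currently waiting, so a non-blocking offer
        // has nobody to hand the element to and fails immediately...
        System.out.println(q.offer("x"));   // false
        // ...and since nothing was buffered, a non-blocking poll finds nothing.
        System.out.println(q.poll());       // null
    }
}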

Due to its unique one-to-one thread pairing mechanism, SynchronousQueue is rarely used directly in ordinary development, but it is used in thread pool technology. Since it uses CAS directly instead of AQS internally, the code is hard to follow, but that should not stop us from understanding the underlying implementation model first. With the model in mind, reading the source code gives us a sense of direction and becomes much easier.
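The best-known use in thread pools is Executors.newCachedThreadPool(), which in the OpenJDK implementation is built on a SynchronousQueue: a submitted task is handed directly to an idle worker polling the queue, and if the hand-off fails (no idle worker), the pool spawns a new thread. A minimal equivalent:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedPoolDemo {
    public static void main(String[] args) {
        // Same construction as Executors.newCachedThreadPool()
        ExecutorService pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        pool.execute(() ->
                System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}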

 
