JUC Source Code Analysis, Collections (10): LinkedTransferQueue

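LinkedTransferQueue (LTQ) goes one step beyond BlockingQueue: a producer can block until the element it added has actually been consumed by some consumer, not merely placed in the queue. The new transfer method enforces this constraint. As the name suggests, the blocking happens while an element is being transferred from one thread to another, which gives an effective way of handing elements between threads (in a way that establishes a happens-before relationship in the Java memory model). According to Doug Lea, LinkedTransferQueue is functionally a superset of ConcurrentLinkedQueue, SynchronousQueue (in fair mode) and LinkedBlockingQueue. It is also the more attractive choice, because it not only combines the features of these classes but also provides a more efficient implementation.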
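The difference between "added to the queue" and "consumed" can be seen in a small sketch. The class name TransferDemo and the 100 ms delay are made up for illustration; only put, transfer, take, size and isEmpty come from the JDK API.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TimeUnit;

class TransferDemo {

    /** put() only enqueues: returns the queue size seen right after put() returns. */
    static int sizeAfterPut() {
        LinkedTransferQueue<String> queue = new LinkedTransferQueue<>();
        queue.put("a");          // never blocks, the queue is unbounded
        return queue.size();     // the element is still waiting for a consumer
    }

    /** transfer() hands over: true if the queue was already empty when transfer() returned. */
    static boolean emptyAfterTransfer() {
        LinkedTransferQueue<String> queue = new LinkedTransferQueue<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(100); // the consumer shows up late on purpose
                received[0] = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            queue.transfer("b");                  // blocks until the consumer has taken "b"
            boolean empty = queue.isEmpty();      // nothing is left behind in the queue
            consumer.join();                      // makes received[0] visible to this thread
            return empty && "b".equals(received[0]);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("size after put: " + sizeAfterPut());
        System.out.println("empty after transfer: " + emptyAfterTransfer());
    }
}
```

With put the producer moves on while the element is still queued; with transfer the producer resumes only after the hand-over has happened.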

1. LinkedTransferQueue Overview

A recommended introduction to LinkedTransferQueue: http://ifeve.com/java-transfer-queue/

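When I first saw LinkedTransferQueue, the existing SynchronousQueue class was the first thing that came to mind. SynchronousQueue has a capacity of 0 and is aimed specifically at the use case of handing elements from one thread to another.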

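Compared with SynchronousQueue, LinkedTransferQueue is more widely applicable and easier to use, because you can decide whether to use the BlockingQueue methods (translator's note: for example put) or to make sure a hand-over has completed (translator's note: the transfer method). When the queue already holds elements, calling transfer guarantees that every element ahead of the transferred one has been processed by the time the call returns.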
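A minimal sketch of both points, assuming a single consumer thread. TransferOrderingDemo and its method names are hypothetical; tryTransfer is the non-blocking variant defined by the TransferQueue interface.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedTransferQueue;

class TransferOrderingDemo {

    /** tryTransfer() with no waiting consumer: fails and leaves nothing in the queue. */
    static boolean tryTransferFailsWithoutEnqueuing() {
        LinkedTransferQueue<Integer> queue = new LinkedTransferQueue<>();
        boolean handedOver = queue.tryTransfer(99);
        return !handedOver && queue.isEmpty();
    }

    /** Elements queued with put() are consumed before the transferred one. */
    static List<Integer> consumeInOrder() {
        LinkedTransferQueue<Integer> queue = new LinkedTransferQueue<>();
        List<Integer> consumed = new CopyOnWriteArrayList<>();
        queue.put(1);                       // plain BlockingQueue-style enqueue
        queue.put(2);
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    consumed.add(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            queue.transfer(3);              // returns only after 1, 2 and 3 have been taken
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(tryTransferFailsWithoutEnqueuing());
        System.out.println(consumeInOrder());
    }
}
```

Because the queue is FIFO, the consumer cannot reach element 3 before it has taken 1 and 2, so the return of transfer(3) implies that everything ahead of it was consumed.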

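LinkedTransferQueue reportedly outperforms SynchronousQueue by a factor of 3 (unfair mode) and 14 (fair mode). Because classes such as ThreadPoolExecutor hand tasks over through a SynchronousQueue (for example in the pool created by Executors.newCachedThreadPool), replacing SynchronousQueue with LinkedTransferQueue is expected to give ThreadPoolExecutor a corresponding performance gain.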

You can follow this article to implement a LinkedTransferQueue of your own: http://ifeve.com/customizing-concurrency-classes-11-2/#more-7388

2. How LTQ Works

Internally, LTQ uses a rather unusual kind of queue, namely Dual Queues with Slack: http://ifeve.com/buglinkedtransferqueue-bug/#more-11117

I strongly recommend reading Doug Lea's Javadoc, which explains LTQ's data structure very clearly.

2.1 Dual Queues

/**
 * A FIFO dual queue may be implemented using a variation of the
 * Michael & Scott (M&S) lock-free queue algorithm
 * (http://www.cs.rochester.edu/u/scott/papers/1996_PODC_queues.pdf).
 * It maintains two pointer fields, "head", pointing to a
 * (matched) node that in turn points to the first actual
 * (unmatched) queue node (or null if empty); and "tail" that
 * points to the last node on the queue (or again null if
 * empty). For example, here is a possible queue with four data
 * elements:
 *
 *  head                tail
 *    |                   |
 *    v                   v
 *    M -> U -> U -> U -> U
 *    
 *  M(matched)  U(unmatched)
 */

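In short: a FIFO dual queue can be implemented with a variant of the Michael & Scott (M&S) lock-free queue algorithm. It maintains two pointer fields: head points to a matched node (M) whose successor is the first unmatched node (U), or is null if the queue is empty; tail points to the last node in the queue, or is null if the queue is empty.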

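"Dual" means that there are two opposing kinds of node (Node.isData == false or true). As I understand it, a node of either kind can be in one of three states: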

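UNMATCHED: the node has been constructed and has just entered the queue
MATCHED: the node has been set to the "fulfilled" state, i.e. the thread represented by the enqueued node has successfully received or handed over its data
CANCELED: the node has been set to the cancelled state, i.e. the thread represented by the enqueued node has given up waiting because of a timeout or an interrupt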
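The three states can be sketched with a single item field and CAS. DualNodeSketch is a hypothetical class written for this article, not the JDK's Node. The encoding is only loosely modelled on the idea that a data node starts with a non-null item, a request node starts with null, and a cancelled node points its item at itself.

```java
import java.util.concurrent.atomic.AtomicReference;

class DualNodeSketch {
    enum State { UNMATCHED, MATCHED, CANCELED }

    final boolean isData;                 // true: carries data, false: a waiting request
    final AtomicReference<Object> item;   // the state is derived from this single field

    private DualNodeSketch(Object initialItem, boolean isData) {
        this.isData = isData;
        this.item = new AtomicReference<>(initialItem);
    }

    static DualNodeSketch data(Object e) { return new DualNodeSketch(e, true); }
    static DualNodeSketch request()      { return new DualNodeSketch(null, false); }

    State state() {
        Object x = item.get();
        if (x == this) return State.CANCELED;                       // item points at the node itself
        return ((x == null) == isData) ? State.MATCHED : State.UNMATCHED;
    }

    /** A consumer matches a data node by swapping its item to null. Returns null if it lost the race. */
    Object tryTake() {
        Object x = item.get();
        if (isData && x != null && x != this && item.compareAndSet(x, null)) return x;
        return null;
    }

    /** A producer matches a request node by filling in its item. */
    boolean tryFulfill(Object e) {
        return !isData && e != null && item.compareAndSet(null, e);
    }

    /** The waiting thread gives up (timeout or interrupt). Only succeeds while unmatched. */
    boolean tryCancel() {
        Object x = item.get();
        boolean unmatched = (x != this) && ((x != null) == isData);
        return unmatched && item.compareAndSet(x, this);
    }

    public static void main(String[] args) {
        DualNodeSketch d = data("a");
        System.out.println(d.state() + " -> take " + d.tryTake() + " -> " + d.state());
        DualNodeSketch r = request();
        System.out.println(r.state() + " -> cancel " + r.tryCancel() + " -> " + r.state());
    }
}
```

Every transition goes through one compareAndSet, so a match and a cancellation racing on the same node cannot both succeed.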

2.2 Slack

/**
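 * Slack strikes a balance between updating head/tail and searching for nodes; values of 1 to 3 suit most scenarios.
 * In essence, it accepts extra reads of volatile variables in exchange for fewer writes to them.
 * Because a write to a volatile variable costs far more than a read, using slack improves efficiency.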
 * 
 * We introduce here an approach that lies between the extremes of
 * never versus always updating queue (head and tail) pointers.
 * This offers a tradeoff between sometimes requiring extra
 * traversal steps to locate the first and/or last unmatched
 * nodes, versus the reduced overhead and contention of fewer
 * updates to queue pointers. For example, a possible snapshot of
 * a queue is:
 *
 *  head           tail
 *    |              |
 *    v              v
 *    M -> M -> U -> U -> U -> U
 *
 * The best value for this "slack" (the targeted maximum distance
 * between the value of "head" and the first unmatched node, and
 * similarly for "tail") is an empirical matter. We have found
 * that using very small constants in the range of 1-3 work best
 * over a range of platforms. Larger values introduce increasing
 * costs of cache misses and risks of long traversal chains, while
 * smaller values increase CAS contention and overhead.
 */

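To save the cost of CAS operations, LTQ introduces the notion of "slack": after a node has been matched (removed), head/tail is not updated immediately. It is updated only once the distance between the head/tail node and the nearest unmatched node exceeds a slack threshold (2 in LTQ). This threshold is typically 1 to 3: a larger value lowers the cache hit rate and lengthens the chain that has to be traversed, while a smaller value increases CAS overhead. ConcurrentLinkedQueue applies the same idea in its "hops" design.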
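The trade-off can be sketched with a heavily simplified queue that holds only data nodes and assumes a single producer. SlackQueueSketch, SLACK_THRESHOLD and headUpdates are hypothetical names, and the exact comparison used here (advance head once the matched node is at least 2 hops away) is this sketch's own choice, not a copy of the JDK logic.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class SlackQueueSketch<E> {
    static final int SLACK_THRESHOLD = 2;

    static final class Node<E> {
        final AtomicReference<E> item;                           // null means "matched"
        final AtomicReference<Node<E>> next = new AtomicReference<>();
        Node(E e) { item = new AtomicReference<>(e); }
    }

    private final AtomicReference<Node<E>> head;
    private Node<E> last;                                        // single-producer simplification
    final AtomicInteger headUpdates = new AtomicInteger();       // counts successful head CASes

    SlackQueueSketch() {
        Node<E> dummy = new Node<>(null);                        // starts out already matched
        head = new AtomicReference<>(dummy);
        last = dummy;
    }

    /** Appends an element. Must be called from one producer thread only. */
    void offer(E e) {
        Node<E> n = new Node<>(e);
        last.next.set(n);
        last = n;
    }

    /** Takes the first unmatched element, or returns null if there is none. */
    E poll() {
        Node<E> h = head.get();
        int hops = 0;
        for (Node<E> p = h; p != null; p = p.next.get(), hops++) {
            E e = p.item.get();
            if (e != null && p.item.compareAndSet(e, null)) {    // match the node
                // Move head only when it has fallen far enough behind.
                if (hops >= SLACK_THRESHOLD && head.compareAndSet(h, p)) {
                    headUpdates.incrementAndGet();
                }
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SlackQueueSketch<Integer> q = new SlackQueueSketch<>();
        for (int i = 1; i <= 6; i++) q.offer(i);
        for (int i = 1; i <= 6; i++) q.poll();
        System.out.println("head CAS updates for 6 polls: " + q.headUpdates.get()); // prints 3
    }
}
```

In this run every second poll walks one extra node and in return skips the CAS on head, which is the exchange of reads for writes described above.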

2.3 Self-Linking Nodes

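Once a matched node has been removed from the head of the list, its next reference is made to point to itself. If the GC is slow to reclaim an old node, the chain of deleted nodes can grow very long; collection then becomes expensive, and all the more recently matched nodes reachable from that old node stay unreclaimed as well. To avoid this, when head is advanced by CAS, the next reference of the old, already matched head is pointed at the node itself (a "self-linked node"), which bounds the length of the chain of deleted nodes (a similar approach is used to clear values in other node fields that might otherwise retain garbage). If a traversal runs into a self-linked node, the current thread has fallen behind another thread that updated head, and it must re-read head and traverse from there.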
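A minimal sketch of the two halves of this mechanism: self-linking the old head when head advances, and restarting from head when a traversal meets a self-linked node. SelfLinkSketch and its members are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class SelfLinkSketch {
    static final class Node {
        final String item;
        volatile Node next;
        Node(String item, Node next) { this.item = item; this.next = next; }
    }

    final AtomicReference<Node> head = new AtomicReference<>();

    /** Moves head from h to newHead, then self-links h so it no longer keeps the chain reachable. */
    boolean advanceHead(Node h, Node newHead) {
        if (h != newHead && head.compareAndSet(h, newHead)) {
            h.next = h;                    // self-link: marks h as off the list
            return true;
        }
        return false;
    }

    /** Successor of p. A self-linked p means the caller is stale and must restart from head. */
    Node succ(Node p) {
        Node next = p.next;
        return (p == next) ? head.get() : next;
    }

    public static void main(String[] args) {
        SelfLinkSketch q = new SelfLinkSketch();
        Node c = new Node("c", null);
        Node b = new Node("b", c);
        Node a = new Node("a", b);
        q.head.set(a);

        Node stale = q.head.get();         // a slow thread still holds the old head
        q.advanceHead(a, b);               // another thread advances head and self-links a
        System.out.println(q.succ(stale).item); // the slow thread is redirected to the current head
    }
}
```

Only the old head is self-linked here, so a stale reader takes at most one step on the dead node before it is sent back to the live part of the list.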

References:

"JUC Source Code Analysis, Collections (6): LinkedTransferQueue": https://www.jianshu.com/p/42ceaed2afe6
