A lock-free concurrent priority thread pool with internal priority promotion

Introduction

An interesting business requirement came up in a technical discussion group. It can be described as follows:

Tasks inside a thread pool are ordered by priority, and the pool executes high-priority tasks first. As time passes, the priority of the low-priority tasks already inside the pool is gradually raised until they become high priority, so that a steady stream of newly submitted high-priority tasks cannot block them and starve them.

The JDK already provides the customizable thread pool ThreadPoolExecutor and the priority queue PriorityBlockingQueue. Combining the two, periodically adjusting the priority of the low-priority tasks in the queue, and then re-sorting so that those tasks move toward the head of the queue also avoids starvation to some extent.
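As a point of reference, the ordering behaviour of this baseline can be sketched in a few lines. PrioritizedTask, BaselineQueueDemo and their members are hypothetical names made up for this illustration; only PriorityBlockingQueue comes from the JDK. The sketch shows that a high-priority task added last is still served before an older low-priority one, which is exactly what leads to starvation under a constant inflow of high-priority work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical task type: a smaller number means a higher priority. */
class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    final String name;
    final int priority;

    PrioritizedTask(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }

    @Override
    public void run() {
        System.out.println("running " + name);
    }

    @Override
    public int compareTo(PrioritizedTask other) {
        return Integer.compare(priority, other.priority);
    }
}

class BaselineQueueDemo {
    /** Returns the task names in the order the queue hands them out. */
    static List<String> drainOrder() {
        PriorityBlockingQueue<PrioritizedTask> queue = new PriorityBlockingQueue<>();
        queue.add(new PrioritizedTask("low", 5));
        queue.add(new PrioritizedTask("high-1", 0));
        queue.add(new PrioritizedTask("high-2", 0)); // added last, still served before "low"
        List<String> order = new ArrayList<>();
        PrioritizedTask task;
        while ((task = queue.poll()) != null) {
            order.add(task.name);
        }
        return order;
    }
}
```

When such a queue is handed to a ThreadPoolExecutor, tasks would have to be passed in with execute, because submit wraps them in a FutureTask, which is not Comparable.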

The problem with this approach is that the re-sort is relatively expensive and the priority of every task has to be recalculated. We therefore arrived at the design below: store the tasks in a lock-free concurrent data structure that supports automatic priority promotion, so that low-priority tasks are guaranteed to run eventually without, in turn, starving the high-priority tasks.

Readers are welcome to join technical exchange group 186233599 for discussion, and to follow the public account "Wind & Fire said".

Derivation

How to implement priority promotion

Declare an array and treat it as a circular queue. Each slot of the array holds a list of tasks. A pointer refers to one slot of the array, and that slot is where tasks of the current highest priority are inserted. Starting from the pointer, priority decreases in the direction of increasing index. The pointer itself moves in the increasing direction according to some rule. Because the slot under the pointer is always the highest priority, moving the pointer means that the priority of every slot is promoted.

Priorities are therefore discrete integers in the range from 0 to the array length minus 1, with 0 being the highest priority.

The situation can be pictured as follows. In the figure the priority range is [0, 6]: the current pointer marks the highest-priority slot, the slot to its left is the lowest priority, and the slot to its right is the second highest. Each slot holds a queue, and all tasks in one queue share the same priority (later parts of the algorithm deal with cases where priorities get mixed).

Tasks are always read from the queue of the slot that the current pointer refers to. After some time the current pointer moves one slot to the right. At that moment the priority of every slot is promoted, except for the slot the pointer has just left, which becomes the lowest-priority slot.

Because the current pointer keeps moving, it eventually reaches a slot that used to be low priority. The tasks in that slot are then the highest priority and get read and executed. This prevents low-priority tasks from starving because high-priority tasks keep being added while the pool is running.
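The pointer arithmetic described so far can be sketched in a few lines of Java. This is a single-threaded illustration only; PromotionRing and its method names are hypothetical and are not taken from the author's code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Single-threaded sketch of a circular slot array with one "current" pointer. */
class PromotionRing {
    private final List<Queue<String>> slots = new ArrayList<>();
    private int current = 0; // index of the highest-priority slot

    PromotionRing(int priorityLevels) {
        for (int i = 0; i < priorityLevels; i++) {
            slots.add(new ArrayDeque<>());
        }
    }

    /** Priority 0 is the highest and maps to the slot under the pointer. */
    int slotOf(int priority) {
        return (current + priority) % slots.size();
    }

    void add(String task, int priority) {
        slots.get(slotOf(priority)).add(task);
    }

    /** Tasks are only ever taken from the slot under the pointer. */
    String poll() {
        return slots.get(current).poll();
    }

    /** Moving the pointer one step promotes every other slot by one level. */
    void promote() {
        current = (current + 1) % slots.size();
    }
}
```

With seven priority levels, a task added at priority 6 sits in slot 6; after six calls to promote the pointer reaches that slot and poll returns the task, even if other tasks were added at priority 0 in the meantime.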

Data structure design

Following the promotion idea above, there clearly has to be an array whose slots represent the different priorities. Each slot holds an MPMC (multi-producer, multi-consumer) queue to which tasks of that priority are added and from which they are read.

A current pointer refers to the slot with the highest priority.

The problem with a single pointer

With only one pointer, tasks are read from the slot it points to, since that slot has the highest priority, and the slot for a newly inserted task is computed from the same pointer. This model has a concurrency problem when priorities are promoted.

Suppose the pointer is updated from slot 1 to slot 2. Slot 1 may still hold some tasks, and their real priority is higher than that of the tasks in slot 2. If a lowest-priority task is inserted now, it will most likely go into slot 1, because slot 1 has just become the lowest-priority slot. The queue on slot 1 then mixes highest-priority and lowest-priority tasks, and they cannot be told apart.

To keep tasks of different priorities from mixing in one queue, we could, whenever the pointer moves, first move the tasks remaining in the old slot to the head of the queue of the new current slot. This effectively requires the queue to be a double-ended queue. But moving the tasks and moving the pointer cannot be done as one atomic step, so highest-priority and lowest-priority tasks can still end up mixed in the queue on slot 1.

In terms of the result, what we need when the pointer moves is a guarantee that the high-priority tasks remaining in slot 1 finish before the tasks in slot 2 ("formerly second highest, now highest priority") start. In other words, the tasks do not necessarily have to be moved; any means of ensuring that the original high-priority tasks in slot 1 run before the tasks in slot 2 will do.

For this reason we split the pointer into two: a task insertion pointer and a task read pointer.

Task insertion pointer and task read pointer

Because reads and writes are concurrent, both pointers are of type AtomicInteger. Their roles are:

  • Task insertion pointer: the slot it points to is the current highest-priority slot (the concept of rounds is introduced later, hence the emphasis on "current").
  • Task read pointer: tasks are read from the task queue structure in the slot this pointer refers to.

The advantage of separating the two pointers is that moving the insertion pointer is what actually promotes the priorities, while readers keep reading the queue in the slot the read pointer refers to and move the read pointer to the next slot only after the tasks of that priority have all been read. This keeps processing ordered and fair, ensures that within the same unit of time high-priority tasks are processed ahead of low-priority ones, and avoids the contamination between tasks of different priorities that moving a single pointer causes.

How to move the task insertion pointer

The insertion pointer can be moved according to one of two strategies:

  • Strategy one: by elapsed time; the pointer moves after a fixed interval.
  • Strategy two: by number of reads; the pointer moves after a fixed number of tasks have been read.

Strategy one requires a background thread that moves the insertion pointer at a fixed interval; strategy two requires a global AtomicInteger object that counts the reads.

With strategy one there may be a scenario in which a large number of tasks with the same priority are put into the pool, so that the queue on one slot becomes very long. If the tasks are processed relatively slowly, the insertion pointer may move several times in the meantime, and the queue on a slot ends up holding tasks of many different priorities. Since tasks are read step by step in priority order, producing so many distinct priorities has little practical value.

Strategy two is therefore the more suitable choice.

Because tasks are read by multiple threads, an implementation of strategy two needs to observe the following points:

  • Use AtomicInteger#incrementAndGet to accumulate the number of task reads. If the returned value is a multiple of the threshold, the insertion pointer may be moved.
  • Use AtomicInteger#incrementAndGet to move the insertion pointer.

A note on concurrent movement of the insertion pointer: because reader threads count reads with AtomicInteger#incrementAndGet, every increment succeeds and every multiple of the threshold is returned to exactly one thread. Each of those returns must result in one increment of the insertion pointer. Since that increment must not be lost, the insertion pointer is moved with AtomicInteger#incrementAndGet as well.
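A minimal sketch of strategy two, assuming a fixed threshold chosen by the caller. PromotionCounter and its members are hypothetical names used only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: move the insertion pointer once per `threshold` task reads. */
class PromotionCounter {
    private final int threshold;
    private final AtomicInteger reads = new AtomicInteger();
    private final AtomicInteger insertPointer = new AtomicInteger();

    PromotionCounter(int threshold) {
        this.threshold = threshold;
    }

    /** Called by a reader thread after each successful task read. */
    void onTaskRead() {
        // Each multiple of the threshold is returned to exactly one thread,
        // so each promotion is triggered exactly once.
        if (reads.incrementAndGet() % threshold == 0) {
            // An unconditional increment cannot fail, so no promotion is skipped.
            insertPointer.incrementAndGet();
        }
    }

    int insertPointer() {
        return insertPointer.get();
    }
}
```

With a threshold of 10, one thousand reads spread over any number of threads move the insertion pointer exactly one hundred times.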

Mixed priorities when the insertion pointer returns to the same slot

Assume that in the initial state both the read pointer and the insertion pointer refer to slot 1, and a large number of tasks are inserted into slot 1. As tasks are read, the insertion pointer moves to slot 2, and some tasks are inserted into that slot. As reading continues, the insertion pointer keeps moving; after moving by a full array length it refers to slot 2 again. Assume the read pointer is still at slot 1 and more tasks are inserted at this point. The task queue on slot 2 now really consists of two parts: the front part holds the tasks inserted in the previous round, and the back part holds the tasks that have just been inserted.

When the read pointer moves to slot 2, it should execute the front part and then go on to the tasks on slot 3, rather than executing everything in slot 2. In other words, the tasks on slot 3 actually have a higher priority than the back part of the queue on slot 2.

The problem thus becomes: when tasks are read through the read pointer, how can a reader tell that the tasks of the current round in the queue have all been processed and that the read pointer should move on?

Both the insertion pointer and the read pointer are monotonically increasing values, so each of them can be seen as expressing an "order". When a task is about to be added, the value of the insertion pointer plus the priority of the task can therefore be recorded as the task's insertion priority. When the read pointer reads a task, only a task whose insertion priority equals the value of the read pointer belongs to the round that the read pointer should currently process. If the insertion priority of the task read differs from the read pointer, the current queue has nothing more to read for this round and the read pointer should move.

Giving each task its own insertion priority avoids the confusion caused by tasks from different rounds being mixed in one queue.
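The stamping rule can be illustrated with a small sketch. RoundStamp and its members are hypothetical; the slot index is assumed to be the pointer value modulo the array length, which matches the circular-queue description above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: insertion priority = insertion pointer value + task priority. */
class RoundStamp {
    final AtomicInteger insertPointer = new AtomicInteger();
    final AtomicInteger readPointer = new AtomicInteger();
    private final int slotCount;

    RoundStamp(int slotCount) {
        this.slotCount = slotCount;
    }

    /** The absolute "order" stamped on a task at the moment it is added. */
    int insertPriority(int taskPriority) {
        return insertPointer.get() + taskPriority;
    }

    /** Pointer values and insertion priorities map onto the circular array. */
    int slotOf(int pointerOrPriority) {
        return Math.floorMod(pointerOrPriority, slotCount);
    }

    /** A task is due only when its stamp equals the read pointer. */
    boolean isCurrentRound(int insertPriority) {
        return insertPriority == readPointer.get();
    }
}
```

With seven slots, a highest-priority task added while the insertion pointer is 2 and another added when it has reached 9 both land in slot 2, but their stamps (2 and 9) differ, so a read pointer at 2 accepts only the first one.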

How to move the task read pointer

The previous section introduced the insertion priority to solve the problem that one queue may mix tasks from different rounds. That solution also yields the strategy for moving the read pointer: when the insertion priority of the task read differs from the value of the read pointer, the pointer needs to move.

This raises a new problem, however: concurrent movement of the read pointer. With concurrent reads, a reader may take out a task whose priority does not match the read pointer. The task then has to be put back into the queue, but putting it back may interleave with newly inserted tasks and scramble the data.

There are several possible solutions:

  • Strategy one: mark the read method with the synchronized keyword; if the task read does not match, put it back and move the pointer. This removes concurrent reads, but putting a task back can still interleave with newly added tasks and scramble the data.
  • Strategy two: use segments. Each segment is a queue, and the segments themselves are linked into a queue of segments. The priority within a segment is fixed, so the read pointer switches when a segment is exhausted.

Strategy one cannot solve the problem completely, so we adopt strategy two.

Adopting strategy two changes the data structure described above: an array element no longer stores a task queue but a queue of segments. Each segment holds a task queue internally, and all tasks in that queue have the same insertion priority as the segment. This means that the insertion priority is fixed when the segment is created. Different segments necessarily have different insertion priorities, so this structure naturally supports the concept of rounds.
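One possible shape for such a segment is sketched here. Segment, its fields and findOrAppend are hypothetical names, and ConcurrentLinkedQueue merely stands in for whatever MPMC queue is actually used. The sketch also assumes that segments are requested in ascending order of insertion priority; handling a late request for a smaller priority is left out.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: a segment holds tasks of exactly one insertion priority. */
class Segment<T> {
    final int insertPriority; // fixed when the segment is created
    final ConcurrentLinkedQueue<T> tasks = new ConcurrentLinkedQueue<>();
    final AtomicReference<Segment<T>> next = new AtomicReference<>();

    Segment(int insertPriority) {
        this.insertPriority = insertPriority;
    }

    /**
     * Walks the chain starting at this segment and returns the segment with
     * the wanted insertion priority, appending a new one at the tail if absent.
     */
    Segment<T> findOrAppend(int wanted) {
        Segment<T> seg = this;
        while (seg.insertPriority != wanted) {
            Segment<T> following = seg.next.get();
            if (following == null) {
                // Only one thread wins this CAS; losers pick up the winner's segment.
                seg.next.compareAndSet(null, new Segment<>(wanted));
                following = seg.next.get();
            }
            seg = following;
        }
        return seg;
    }
}
```

The array itself could then be an AtomicReferenceArray of Segment, with each element pointing at the first segment of its slot.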

Introducing segments changes the data structure and therefore also the task insertion and task read flows; the detailed implementation is covered below. The point to note here is that the moment to move the read pointer is now very clear: when the data in a segment is exhausted, all tasks with that particular insertion priority have been read.

Of course, because reads and writes are concurrent, a reader thread finding a segment empty does not mean that every task with that insertion priority has been read. The flows designed later handle this concurrent scenario.

Concurrent insertion and reading

Insertion and reading may happen concurrently on the same segment of the same slot. The segment's queue itself supports MPMC access, so this is not a problem.

What can go wrong is the following. An inserting thread reads the value of the insertion pointer and is about to insert its data, but thread scheduling takes the CPU away from it before the insertion completes. Meanwhile a reader thread sees no data in the slot and moves the read pointer to the next slot. After the read pointer has moved, the inserting thread completes its insertion. A task that should have been high priority thereby lands in what is now the lowest-priority slot. When the read pointer comes round to that slot again, the insertion priority of the task it reads conflicts with the value of the read pointer.

A common way to deal with this kind of concurrent anomaly is a double check. After moving the read pointer, the reader thread checks once more whether new tasks have appeared in the current segment and, if so, helps migrate them to the next slot. After putting a task in, the writer thread checks whether the read pointer has moved and, if so, helps migrate the task to the next slot.

However, whether the segment still has tasks (checked by the reader) and whether the read pointer has moved (checked by the writer) are both states that keep changing, so other problems remain. A double check usually introduces a terminal state to reduce the number of scenarios that can still change. Here we introduce a segment state with two values: in use and terminated. A segment is in the in-use state when it is initialized. When a reader thread finds that the tasks in the segment have been consumed, it should update the segment to the terminated state. Once a segment is terminated it is abandoned, and no more tasks should be added to it.

With the segment state, tasks fall into two kinds: those added to the segment before it was terminated and those added after. The former are read normally; the latter need to be migrated to another suitable segment to be processed.
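The two states and the writer-side double check can be sketched as follows. StatefulSegment and its methods are hypothetical, and taking the task back out with remove is just one way to realize "migrate to another segment"; it assumes the caller re-runs its insertion flow when offer returns false.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: a segment with an in-use / terminated state. */
class StatefulSegment<T> {
    static final int IN_USE = 0;
    static final int TERMINATED = 1;

    private final AtomicInteger state = new AtomicInteger(IN_USE);
    private final ConcurrentLinkedQueue<T> tasks = new ConcurrentLinkedQueue<>();

    /** Reader side: only one thread wins the IN_USE -> TERMINATED transition. */
    boolean terminate() {
        return state.compareAndSet(IN_USE, TERMINATED);
    }

    boolean isTerminated() {
        return state.get() == TERMINATED;
    }

    /**
     * Writer side. Returns true if the task stays in this segment, false if
     * the caller must place the task somewhere else.
     */
    boolean offer(T task) {
        if (isTerminated()) {
            return false;
        }
        tasks.offer(task);
        // Double check: the segment may have been terminated while we were adding.
        if (isTerminated() && tasks.remove(task)) {
            return false; // the task was taken back out; the caller re-inserts it
        }
        return true; // still in use, or a reader has already consumed the task
    }

    T poll() {
        return tasks.poll();
    }
}
```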

At this point the changes to the elements of the data structure and to their properties are complete.

To summarize: an array, used as a circular queue, represents the different priorities. Moving the task insertion pointer promotes the priority of the tasks inside. Reading through the task read pointer processes tasks strictly in priority order while keeping low-priority tasks from being starved by high-priority ones. Each array element is the segment with the smallest insertion priority on its slot. Segments that hash to the same slot form a queue ordered by insertion priority.

Code

The most complex parts of the code are task insertion and task reading. Their flows are designed below.

Task insertion

The derivation above analyzed the conflicts that insertion can run into under concurrent reads. Here we refine how they are handled. An inserting thread has to deal with the following situations:

  • The slot has no segment.
  • The segment on the slot has an insertion priority different from the one computed from the insertion pointer.
  • The segment on the slot has the same insertion priority as the one computed from the insertion pointer, but it is in the terminated state.
  • The segment on the slot has the same insertion priority as the one computed from the insertion pointer and is in use.

Only in the fourth case can the task be inserted into the current segment, and after the insertion the state of the segment must be checked again. These considerations determine the design of the insertion flow.

Note that this flow does not handle the case of a slot without a segment; that case is analyzed in the next section.

Task reading

With segments, deciding when to move the read pointer becomes more complex. A reader thread may encounter the following scenarios:

  • The slot the read pointer hashes to has no segment.
  • The slot the read pointer hashes to has a segment in use, and the segment has no tasks.
  • The slot the read pointer hashes to has a segment in use, and the segment has tasks.
  • The slot the read pointer hashes to has a terminated segment, and the segment has no tasks.
  • The slot the read pointer hashes to has a terminated segment, and the segment has tasks.

Only in the third case can a task be read and processed. With the concept of rounds, the read pointer always reads the first segment on the slot. If the slot has no segment, or the segment's insertion priority differs from the read pointer, or the segment holds no tasks, moving the read pointer can be considered. Note that the state of the segment is not a condition for moving the read pointer; the reason is analyzed below.

Before moving the read pointer, however, first check whether it has already reached (value of the insertion pointer + number of the lowest priority). If so, it is already at the boundary and must not move any further.
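The boundary check combines naturally with a CAS on the read pointer. The following sketch uses hypothetical names (ReadBoundary, tryAdvanceRead) and ignores integer overflow of the pointers.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: the read pointer never passes insertPointer + lowest priority. */
class ReadBoundary {
    final AtomicInteger insertPointer = new AtomicInteger();
    final AtomicInteger readPointer = new AtomicInteger();
    private final int lowestPriority; // array length minus 1

    ReadBoundary(int slotCount) {
        this.lowestPriority = slotCount - 1;
    }

    /**
     * Tries to move the read pointer from the value the caller observed to the
     * next one. Fails at the boundary or if another reader moved it first.
     */
    boolean tryAdvanceRead(int observed) {
        if (observed >= insertPointer.get() + lowestPriority) {
            return false; // already at the lowest-priority slot of the current window
        }
        return readPointer.compareAndSet(observed, observed + 1);
    }
}
```

With seven slots and the insertion pointer at 0, the read pointer can advance up to 6 and then has to wait until the insertion pointer moves.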

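Only reader threads may update the state of a segment. When a reader thread finds that a segment has no tasks left, it should first update the segment state with a CAS. The thread that wins the CAS checks once more whether new tasks have appeared in the segment; if so, it takes a task out and completes the read. Why not move the tasks to the next slot? Because the next slot may not have a segment yet, in which case the reader thread may race with writer threads to install a segment on that slot. If a writer wins, the tasks moved over by the reader end up in a segment of the wrong priority; if the reader wins, the segment it created must be the first segment, otherwise the tasks are again moved to the wrong place.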

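The best way to solve this problem is not to solve it: do not move the tasks, keep reading from the segment until it is exhausted, and only then try to move the read pointer. A writer thread that finds the segment state has changed to terminated takes its task back out and re-runs the complete insertion flow, so there is no concurrency problem.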

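To go over the flow for the no-task case once more: first change the state of the segment with a CAS. Whether that succeeds or fails, check again whether the queue has a task and, if it does, return the task that was read. If it does not, CAS the read pointer to its value plus 1. The thread that wins sets the next segment of the current segment on the slot and re-runs the read flow. Threads that lose repeatedly check the value of the read pointer and re-run the read flow once it has changed.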

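There is one concurrent conflict to consider here. When a reader thread tries to set the current segment's next segment as the value of the slot, the next segment may be null at that moment while a writer thread is trying to set the current segment's next segment. The next segment may then be lost. In particular, if the next segment has already been set and tasks have been put into it, losing the segment means losing data.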

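To avoid this, when the current segment's next segment is null, the next segment (the attribute value) must not be set on the slot. As a consequence, a reader that obtains a segment must first check the priority of the segment to confirm whether it belongs to the current round. If it does, the rest of the flow is executed. Otherwise the reader either moves the read pointer (when the segment has no next segment) or sets the segment's next segment on the slot and then moves.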

From this perspective, every element of the array can be filled with a segment at initialization, so that writer threads never have to handle an empty slot.
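The steps for an empty segment can be sketched as a single pass. This is a simplified illustration under several assumptions: EmptySegmentFlow, Seg and pollOnce are hypothetical names, the segment on the slot is assumed to belong to the current round already, the priority check and the boundary check of the read pointer are left out, and Thread.onSpinWait requires Java 9 or later.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicReferenceArray;

class EmptySegmentFlow<T> {
    /** Hypothetical segment: task queue, terminated flag, link to the next segment. */
    static final class Seg<T> {
        final ConcurrentLinkedQueue<T> tasks = new ConcurrentLinkedQueue<>();
        final AtomicBoolean terminated = new AtomicBoolean();
        final AtomicReference<Seg<T>> next = new AtomicReference<>();
    }

    final AtomicInteger readPointer = new AtomicInteger();
    final AtomicReferenceArray<Seg<T>> slots;

    EmptySegmentFlow(int slotCount) {
        slots = new AtomicReferenceArray<>(slotCount);
        for (int i = 0; i < slotCount; i++) {
            slots.set(i, new Seg<>()); // every slot starts with a segment
        }
    }

    /** One pass; a null result tells the caller to re-run the read flow. */
    T pollOnce() {
        int observed = readPointer.get();
        int slot = Math.floorMod(observed, slots.length());
        Seg<T> seg = slots.get(slot);
        T task = seg.tasks.poll();
        if (task != null) {
            return task;
        }
        // Terminate the segment; the outcome of the CAS does not matter here.
        seg.terminated.compareAndSet(false, true);
        task = seg.tasks.poll(); // re-check for tasks added in the meantime
        if (task != null) {
            return task;
        }
        if (readPointer.compareAndSet(observed, observed + 1)) {
            Seg<T> next = seg.next.get();
            if (next != null) {
                // Never publish a null next segment to the slot.
                slots.compareAndSet(slot, seg, next);
            }
        } else {
            while (readPointer.get() == observed) {
                Thread.onSpinWait(); // wait until the winner's move is visible
            }
        }
        return null;
    }
}
```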

Based on this, the situations a read has to handle become:

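  • The segment on the slot has a priority less than the read pointer and is in the terminated state.
  • The segment on the slot has a priority equal to the read pointer.
  • The segment on the slot has a priority greater than the read pointer.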

In the first case, if the segment has a next segment, CAS that segment onto the slot; if it does not, CAS the read pointer forward.

In the second case, proceed according to the flow analyzed above.

In the third case, CAS the read pointer forward.
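The three cases can be written down as a small decision function. ReadDecision, Action and decide are hypothetical names. The combination "priority less than the read pointer but segment still in use" is not listed in the text, so the sketch reports it as RETRY, which is purely an assumption.

```java
/** Sketch: choose the reader's next step from the state of the slot's segment. */
class ReadDecision {
    enum Action { INSTALL_NEXT_SEGMENT, ADVANCE_READ_POINTER, READ_SEGMENT, RETRY }

    static Action decide(int segmentPriority, boolean terminated,
                         boolean hasNext, int readPointer) {
        if (segmentPriority == readPointer) {
            return Action.READ_SEGMENT;         // case 2: segment of the current round
        }
        if (segmentPriority > readPointer) {
            return Action.ADVANCE_READ_POINTER; // case 3: nothing due on this slot yet
        }
        if (terminated) {                       // case 1: a finished earlier round
            return hasNext ? Action.INSTALL_NEXT_SEGMENT : Action.ADVANCE_READ_POINTER;
        }
        return Action.RETRY;                    // not covered by the text; assumption
    }
}
```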

Together, these cases make up the read flow.

Wrapping as a BlockingQueue

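The constructor of the JDK's ThreadPoolExecutor class takes a BlockingQueue as its work queue. The storage structure above clearly does not implement BlockingQueue by itself, so it needs to be wrapped.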

The storage structure never blocks on write, so the wrapper only has to provide a blocking wait for reads when no data is available.

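A simple way is to acquire a lock when a read fails and wait on a Condition object that represents "queue empty"; inserting a task then signals that condition.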
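A minimal sketch of that wrapping idea, assuming a non-blocking store underneath. BlockingWrapper is a hypothetical name, ConcurrentLinkedQueue merely stands in for the priority structure described in this article, and only put and take are shown rather than the full BlockingQueue interface.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: blocking reads on top of a non-blocking store. */
class BlockingWrapper<T> {
    // Stand-in for the lock-free priority structure.
    private final ConcurrentLinkedQueue<T> store = new ConcurrentLinkedQueue<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    /** Writes never block; they only wake up a waiting reader. */
    void put(T task) {
        store.offer(task);
        lock.lock();
        try {
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Reads take the lock-free fast path first and wait only when empty. */
    T take() throws InterruptedException {
        T task = store.poll();
        if (task != null) {
            return task;
        }
        lock.lock();
        try {
            // Re-check under the lock so that a signal sent in between is not missed.
            while ((task = store.poll()) == null) {
                notEmpty.await();
            }
            return task;
        } finally {
            lock.unlock();
        }
    }
}
```

Because the writer signals while holding the same lock under which the reader re-checks the store, a task added between the failed read and the wait is still seen.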

Results

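The test code does the following: it first adds a certain number of high-priority tasks, then adds 5 low-priority tasks, and finally uses a CountLatch to simulate high-priority tasks being added while the pool is running.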

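If tasks were ordered purely by priority, the low-priority tasks would be output only after all high-priority tasks had been output, which is clearly wrong. A correct implementation should output the first batch of high-priority tasks first, then the low-priority tasks, and finally the third batch of high-priority tasks. Running the code produces a result that matches this expectation.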

Code repository

Gitee: https://gitee.com/eric_ds/eric_article/blob/master/优先级自动晋升线程池/AutoPromotePriorityQueue.java

Origin www.cnblogs.com/jfire/p/12177919.html