Java Concurrent Programming: Principles of the Fork-Join Parallel Task Execution Framework

Fork-Join principle

Task type

CPU intensive (CPU bound)

CPU-intensive (also called compute-intensive) describes a system whose disk and memory performance is much better than its CPU performance. In this situation the CPU runs at close to 100% load: I/O (disk/memory) reads and writes finish quickly, while the CPU always has more work queued up, so CPU load stays very high. In a multiprogramming system, a program that spends most of its time doing calculations, logic decisions, and other CPU work is called CPU-bound. For example, a program that computes pi to one thousand decimal places spends most of its execution time evaluating trigonometric functions and square roots, so it is a CPU-bound program. CPU-bound programs usually show high CPU utilization, either because the task itself needs little or no I/O, or because the program is multithreaded and the I/O wait time is hidden.
The number of threads is generally set as:
number of threads = number of CPU cores + 1
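
As a rough illustration (the class and pool names below are just for the sketch), this rule is usually written in code with Runtime.availableProcessors():

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A minimal sketch of the "cores + 1" sizing rule for CPU-bound work.
public class CpuBoundPoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // The extra thread keeps the CPU busy if one thread briefly stalls (e.g. a page fault).
        ExecutorService cpuBoundPool = Executors.newFixedThreadPool(cores + 1);
        System.out.println("CPU-bound pool size: " + (cores + 1));
        cpuBoundPool.shutdown();
    }
}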

IO intensive (I/O bound)

IO-intensive describes a system whose CPU performance is much better than its disk and memory performance. Most of the time the CPU is waiting for I/O (disk/memory) reads and writes to complete, so CPU load stays low. An I/O-bound program typically reaches its performance limit while CPU utilization is still low, usually because the task itself performs a large amount of I/O and the processing pipeline cannot keep the processor fully busy.
The number of threads is generally set as:
number of threads = ((thread waiting time + thread CPU running time)/thread CPU running time) * CPU core number
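
For example, with hypothetical timings of 90 ms waiting and 10 ms of CPU time per task on 8 cores, the formula gives ((90 + 10) / 10) * 8 = 80 threads. A small sketch (all numbers are assumptions):

// Sketch of the I/O-bound sizing formula; the wait/compute times are assumed values.
public class IoBoundPoolSizing {
    public static void main(String[] args) {
        double waitTimeMs = 90;  // assumed average time a task spends blocked on I/O
        double cpuTimeMs = 10;   // assumed average time a task spends on the CPU
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = (int) (((waitTimeMs + cpuTimeMs) / cpuTimeMs) * cores);
        System.out.println("Suggested I/O-bound pool size: " + threads);
    }
}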

CPU intensive vs IO intensive

We can divide tasks into computationally intensive and IO-intensive.
Computationally intensive tasks are characterized by heavy calculation that consumes CPU resources, such as computing pi or decoding high-definition video; they depend entirely on the CPU's computing power. Such work can be split across multiple concurrent tasks, but the more tasks there are, the more time is spent switching between them and the lower the CPU's efficiency. To use the CPU most efficiently, the number of simultaneously running computationally intensive tasks should equal the number of CPU cores. Because these tasks mainly consume CPU, the runtime efficiency of the code matters a great deal: scripting languages such as Python run slowly and are a poor fit for computationally intensive work, which is best written in C.
The second type is IO-intensive. Tasks that involve network or disk I/O are IO-intensive tasks. Their characteristic is low CPU consumption: most of the task's time is spent waiting for I/O operations to complete (because I/O is far slower than the CPU and memory). For IO-intensive tasks, more concurrent tasks means higher CPU utilization, up to a limit. Most everyday tasks are IO-intensive, web applications for example. Since 99% of an IO-intensive task's time is spent on I/O and very little on the CPU, replacing a scripting language such as Python with much faster C barely improves performance. For IO-intensive tasks, the most suitable language is the one with the highest development efficiency (the least code): a scripting language is the first choice, and C is the worst.

Fork-Join framework

Definition and characteristics

The Fork-Join framework is a framework for executing tasks in parallel, introduced in Java 7. It splits a large task into several small tasks and then combines the results of the small tasks to obtain the result of the large task.
Fork means splitting a large task into subtasks that execute in parallel; Join means merging the results of those subtasks to produce the result of the large task. For example, to compute 1 + 2 + ... + 10000 we can split the work into 10 subtasks, each summing 1000 numbers, and finally combine the results of those 10 subtasks. As shown below:
[Figure: splitting 1 + 2 + ... + 10000 into subtasks and joining their results]
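
A minimal sketch of that summation example with RecursiveTask (the class name and the threshold of 1000 are illustrative, not from the original article):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch: sum 1..10000 by splitting into subtasks of at most 1000 numbers each.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1000; // illustrative split threshold
    private final int from, to;                // inclusive range to sum

    public SumTask(int from, int to) { this.from = from; this.to = to; }

    @Override
    protected Long compute() {
        if (to - from + 1 <= THRESHOLD) {      // small enough: compute directly
            long sum = 0;
            for (int i = from; i <= to; i++) sum += i;
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(from, mid);
        SumTask right = new SumTask(mid + 1, to);
        left.fork();                           // push the left half onto this worker's deque
        long rightResult = right.compute();    // work on the right half in the current thread
        return left.join() + rightResult;      // join the (possibly stolen) left result
    }

    public static void main(String[] args) {
        long result = new ForkJoinPool().invoke(new SumTask(1, 10000));
        System.out.println(result);            // 50005000
    }
}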

Fork-Join features:

  1. ForkJoinPool is not a replacement for ExecutorService but a complement to it. In some application scenarios its performance is better than ExecutorService.
  2. ForkJoinPool is mainly used to implement divide-and-conquer algorithms, especially divide-and-conquer functions that call themselves recursively, such as quicksort.
  3. ForkJoinPool is best suited to computationally intensive tasks. If a task involves I/O, synchronization between threads, sleep(), or anything else that blocks a thread for a long time, it is best to wrap the blocking part in a ManagedBlocker, as in the sketch after this list.
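
A rough sketch of the ManagedBlocker idea from point 3, modeled on the pattern in the JDK documentation (the queue wrapper class here is illustrative):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ForkJoinPool;

// Sketch: wrapping a blocking call in a ManagedBlocker so the pool can
// compensate with an extra worker thread while this one is blocked.
public class QueueTaker<T> implements ForkJoinPool.ManagedBlocker {
    private final BlockingQueue<T> queue;  // illustrative blocking source
    private volatile T item;

    public QueueTaker(BlockingQueue<T> queue) { this.queue = queue; }

    @Override
    public boolean block() throws InterruptedException {
        if (item == null)
            item = queue.take();           // the potentially long blocking call
        return true;                       // true = no further blocking is needed
    }

    @Override
    public boolean isReleasable() {
        return item != null || (item = queue.poll()) != null;
    }

    public T takeManaged() throws InterruptedException {
        ForkJoinPool.managedBlock(this);   // tells the pool this thread may block
        return item;
    }
}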

Work stealing algorithm

At its core, the Fork-Join framework is a lightweight scheduling mechanism that adopts work-stealing as its basic scheduling strategy.

The work-stealing algorithm means that a thread steals tasks from other threads' queues and executes them. Suppose we need to perform a relatively large task. We can split it into a number of independent subtasks and, to reduce contention between threads, place the subtasks in different queues, creating a separate thread for each queue to execute its tasks; threads and queues correspond one to one (for example, thread A processes the tasks in queue A). However, some threads will finish the tasks in their own queue while other threads' queues still have tasks waiting. Rather than sit idle, a thread that has finished its own work can help the others, so it goes to another thread's queue, steals a task, and executes it. At that point two threads access the same queue, so to reduce contention between the stealing thread and the queue's owner, a double-ended queue (deque) is usually used: the owner always takes tasks from the top of the deque, while the stealing thread always takes tasks from the base of the deque.

The running process of work stealing is shown in the figure below:

[Figure: the running process of work stealing between per-thread deques]

  • The advantage of the work-stealing algorithm is to make full use of threads for parallel computing and reduce competition between threads.

  • The disadvantage of the work-stealing algorithm is that contention still occurs in some cases, for example when only one task remains in a deque. It also consumes more system resources, since it creates multiple threads and multiple deques.
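
Work stealing happens inside the pool, but it can be observed from outside: ForkJoinPool exposes getStealCount(). A sketch (the class name and recursion depth are arbitrary):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Sketch: run a recursive task that forks many subtasks, then print how many
// times workers stole tasks from each other's deques.
public class StealCountDemo extends RecursiveAction {
    private final int depth;
    StealCountDemo(int depth) { this.depth = depth; }

    @Override
    protected void compute() {
        if (depth <= 0) return;
        // invokeAll forks one subtask and computes the other, giving idle workers something to steal.
        invokeAll(new StealCountDemo(depth - 1), new StealCountDemo(depth - 1));
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        pool.invoke(new StealCountDemo(18));
        System.out.println("steals: " + pool.getStealCount());
    }
}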

Working principle

  1. Each worker thread of a ForkJoinPool maintains a work queue (WorkQueue), which is a deque whose elements are tasks (ForkJoinTask).
  2. When a worker thread generates a new task while running (usually because fork() was called), the task is pushed onto the top of its work queue. The worker thread processes its own work queue in LIFO order, that is, it always pops the next task to execute from the top.
  3. While processing its own work queue, a worker thread also tries to steal a task (either from tasks just submitted to the pool or from another worker thread's work queue). A stolen task is taken from the base of the other thread's work queue, which means worker threads use FIFO order when stealing other workers' tasks.
  4. When join() is encountered and the task being joined has not yet completed, the worker processes other tasks while waiting for it to complete.
  5. When a worker has neither its own tasks nor anything to steal, it goes to sleep.

ForkJoinPool

ForkJoinPool is the execution pool that runs ForkJoinTask tasks. It is no longer the traditional single Worker + Queue combination; instead it maintains an array of work queues (WorkQueue[]), which greatly reduces collisions between externally submitted tasks and the tasks of worker threads.
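
For reference, obtaining a pool and handing it a task looks like this; the parallelism of 4 is arbitrary, and SumTask refers to the summation sketch earlier in this article:

import java.util.concurrent.ForkJoinPool;

// Sketch: the two usual ways to obtain a ForkJoinPool and submit work to it.
public class PoolCreation {
    public static void main(String[] args) {
        ForkJoinPool custom = new ForkJoinPool(4);        // dedicated pool with parallelism 4
        ForkJoinPool shared = ForkJoinPool.commonPool();  // JVM-wide shared pool
        // SumTask is the summation sketch shown earlier in this article.
        System.out.println(custom.invoke(new SumTask(1, 10000)));
        System.out.println(shared.invoke(new SumTask(1, 10000)));
        custom.shutdown();
    }
}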

WorkQueue
  • WorkQueue is a double-ended queue (deque) used to execute tasks in order. When a WorkQueue belongs to its own worker thread, that thread by default takes tasks from the top of the queue, i.e. in LIFO order.
  • Each ForkJoinWorkerThread has its own WorkQueue, but not every WorkQueue has a corresponding ForkJoinWorkerThread.
  • A WorkQueue without a ForkJoinWorkerThread holds submissions, i.e. tasks submitted from outside the pool; these queues sit at the even indexes of WorkQueue[].


ForkJoinWorkerThread

ForkJoinWorkerThread is the thread type used to execute tasks, and it distinguishes tasks submitted by worker threads from tasks submitted by other (non-worker) threads. When such a thread starts, it automatically registers a WorkQueue with the pool; WorkQueues that have an owning thread appear only at the odd indexes of WorkQueue[].
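
A small sketch showing the distinction: code running inside the pool executes on a ForkJoinWorkerThread, which is exactly what fork() (shown below) checks for.

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.RecursiveAction;

// Sketch: tasks running inside the pool execute on ForkJoinWorkerThread instances,
// which is how fork() decides between workQueue.push() and externalPush().
public class WorkerThreadCheck {
    public static void main(String[] args) {
        // The main thread is not a worker thread.
        System.out.println(Thread.currentThread() instanceof ForkJoinWorkerThread); // false
        ForkJoinPool.commonPool().invoke(new RecursiveAction() {
            @Override
            protected void compute() {
                // Inside the pool, the current thread is a ForkJoinWorkerThread.
                System.out.println(Thread.currentThread() instanceof ForkJoinWorkerThread); // true
            }
        });
    }
}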


ForkJoinTask

ForkJoinTask represents a task. It is lighter weight than a traditional task: it is not a subclass of Runnable, and it provides the fork and join methods for splitting tasks and aggregating their results. Its two common subclasses are RecursiveTask (which returns a result) and RecursiveAction (which does not).

fork method
  • fork() does only one thing: it pushes the task onto the current worker thread's work queue (or, when called from a non-worker thread, submits it externally to the common pool).
public final ForkJoinTask<V> fork() {
    Thread t;
    // Called from a worker thread: push onto that worker's own deque (LIFO end).
    if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
        ((ForkJoinWorkerThread)t).workQueue.push(this);
    else
        // Called from an external thread: push into the common pool's submission queue.
        ForkJoinPool.common.externalPush(this);
    return this;
}
join method

The work done by join() is much more complicated; this is what allows it to avoid simply blocking the worker thread. It proceeds roughly as follows:

  1. Check whether the thread calling join() is a ForkJoinWorkerThread. If it is not (for example, the main thread), block the current thread and wait for the task to complete. If it is, do not block.
  2. Check the task's completion status; if it has already completed, return the result directly.
  3. If the task has not completed but is in the current thread's own work queue, complete it directly.
  4. If the task has already been stolen by another worker thread, steal tasks from that thief's work queue (in FIFO order) and execute them, to help it finish the joined task as soon as possible.
  5. If the thief that stole the task has finished all of its own tasks and is itself waiting on a task it needs to join, find that thief's thief and help it complete its tasks.
  6. Apply step 5 recursively.
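
For step 1, a small sketch: join() called from an external (non-worker) thread such as main simply blocks until the task completes, whereas inside a worker it helps execute pending work instead. SumTask again refers to the earlier sketch:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

// Sketch: joining from outside the pool. The main thread is not a worker,
// so join() here just waits for completion (case 1 above).
public class ExternalJoin {
    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        ForkJoinTask<Long> task = pool.submit(new SumTask(1, 10000)); // SumTask: earlier sketch
        System.out.println(task.join()); // blocks the main thread until the result is ready
    }
}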


In addition to the work queues owned by individual worker threads, the ForkJoinPool itself also has work queues. Their purpose is to receive tasks submitted by external threads (threads that are not ForkJoinWorkerThreads); these are called submission queues.

There is no essential difference between submit() and fork(), except that submit() places the task in a submission queue (and triggers some synchronization). The submission queues, like the other work queues, are targets for worker threads to "steal" from; when a task in a submission queue is successfully stolen by a worker thread, the submitted task has really entered its execution phase.
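
A sketch of the three external submission entry points; each of them places the task in a submission queue, from which a worker thread then steals it (SumTask is the earlier sketch):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

// Sketch: three ways an external thread can push work into the submission queues.
public class ExternalSubmission {
    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        long a = pool.invoke(new SumTask(1, 10000));                // submit and wait for the result
        ForkJoinTask<Long> f = pool.submit(new SumTask(1, 10000));  // submit and get a handle
        long b = f.join();                                          // collect the result later
        pool.execute(new SumTask(1, 10000));                        // submit and ignore the result
        System.out.println(a + " " + b);
    }
}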

ForkJoin schematic

[Figure: overall Fork-Join schematic]

Origin: blog.csdn.net/e891377/article/details/108876370