How the Java thread pool ThreadPoolExecutor works

1. Why use thread pool

In practice, threads consume system resources, and poorly managed threads can easily destabilize a system. Most concurrency frameworks therefore manage threads through a thread pool. The main benefits are:

  1. Lower resource consumption. Reusing existing threads avoids the cost of repeatedly creating and destroying threads;
  2. Faster response. Because threads are reused, a task can start running without waiting for a new thread to be created;
  3. Better manageability. Threads are a scarce resource; creating them without limit not only consumes system resources but also reduces stability, so a pool is needed to manage them.

2. How the thread pool works

When a concurrent task is submitted to the thread pool, the thread pool allocates threads to execute the task as shown in the following figure:

 

[Figure: thread pool execution flow chart]

As can be seen from the figure, the thread pool executes the submitted tasks in the following stages:

  1. First check whether every thread in the core pool is busy executing a task. If not, create a new core thread for the newly submitted task; otherwise go to step 2;
  2. Check whether the blocking queue is full. If not, place the submitted task in the queue; otherwise go to step 3;
  3. Check whether every thread in the pool (up to the maximum) is busy. If not, create a new thread for the task; otherwise hand the task over to the saturation policy.
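The three steps above can be sketched as a small decision function. This is an illustrative model only (names like DispatchSketch and poolSize are made up, and the real pool tracks this state atomically), but it mirrors the dispatch order:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch of the dispatch decision, not the real ThreadPoolExecutor.
class DispatchSketch {
    final int corePoolSize = 1;
    final int maximumPoolSize = 2;
    int poolSize = 0;  // current number of worker threads (simplified)
    final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(2);

    /** Returns what the pool would do with a newly submitted task. */
    String dispatch(Runnable task) {
        if (poolSize < corePoolSize) {      // step 1: core pool not yet full
            poolSize++;
            return "new core thread";
        }
        if (queue.offer(task)) {            // step 2: queue has room
            return "queued";
        }
        if (poolSize < maximumPoolSize) {   // step 3: grow up to the maximum
            poolSize++;
            return "new non-core thread";
        }
        return "rejected";                  // saturation policy takes over
    }
}
```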


4. Thread pool creation

Thread pools are created through the ThreadPoolExecutor class, which has several overloaded constructors. The constructor with the most parameters shows everything that can be configured:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)

The parameters are described below:

  1. corePoolSize: the size of the core pool. When a task is submitted and the number of core threads has not yet reached corePoolSize, a new thread is created for it even if idle core threads exist; once corePoolSize is reached, no further core threads are created for incoming tasks. Calling prestartCoreThread() starts one core thread ahead of time, and prestartAllCoreThreads() starts all of them.
  2. maximumPoolSize: the maximum number of threads the pool may create. When the blocking queue is full and the current thread count is still below maximumPoolSize, a new (non-core) thread is created to run the task.
  3. keepAliveTime: how long an idle thread survives. When the thread count exceeds corePoolSize, threads that stay idle longer than keepAliveTime are destroyed, which keeps resource consumption down.
  4. unit: the time unit for keepAliveTime.
  5. workQueue: the blocking queue that holds waiting tasks. Common choices are ArrayBlockingQueue, LinkedBlockingQueue, SynchronousQueue, and PriorityBlockingQueue.
  6. threadFactory: the factory used to create threads. A custom factory can give each thread a meaningful name, which makes concurrency problems much easier to diagnose.
  7. handler: the saturation policy. When the queue is full and the maximum number of threads has been reached, the pool is saturated and this policy decides what happens. The built-in policies are:
    1. AbortPolicy: reject the task and throw RejectedExecutionException (the default);
    2. CallerRunsPolicy: run the task in the caller's own thread;
    3. DiscardPolicy: silently discard the task;
    4. DiscardOldestPolicy: discard the oldest task in the queue, then retry submitting the current task.
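As a sketch, here is how all seven parameters might be wired together. The class name PoolFactory and the chosen values are illustrative; the ThreadPoolExecutor API calls themselves are real:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolFactory {
    /** Builds a pool using every constructor parameter described above. */
    public static ThreadPoolExecutor newPool() {
        AtomicInteger seq = new AtomicInteger();
        // Custom factory: give each worker a meaningful name for diagnostics.
        ThreadFactory named = r -> new Thread(r, "worker-" + seq.incrementAndGet());
        return new ThreadPoolExecutor(
                2,                                     // corePoolSize
                4,                                     // maximumPoolSize
                60L, TimeUnit.SECONDS,                 // keepAliveTime + unit
                new ArrayBlockingQueue<>(16),          // bounded workQueue
                named,                                 // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // saturation policy
    }
}
```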


Thread pool execution logic

After a pool has been created through ThreadPoolExecutor, tasks are submitted with the execute method. Its source code (from the JDK) is:

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    // If the pool has fewer than corePoolSize threads, create a new thread for this task
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    // If the thread count is at or above corePoolSize (or thread creation failed),
    // try to place the task in the blocking queue workQueue
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    // If the task cannot be queued, try to create a new (non-core) thread; reject on failure
    else if (!addWorker(command, false))
        reject(command);
}

The execution logic of ThreadPoolExecutor's execute method is explained by the comments above. The following figure illustrates it:

 

[Figure: schematic diagram of the execute execution flow]

The execute method handles several situations:

  1. If fewer than corePoolSize threads are running, a new thread is created to run the task;
  2. If corePoolSize or more threads are running, the submitted task is offered to the blocking queue workQueue;
  3. If workQueue is full and the thread count is still below maximumPoolSize, a new thread is created to run the task;
  4. If the thread count has already reached maximumPoolSize, the saturation policy (RejectedExecutionHandler) handles the task.

Note the design idea: the pool buffers tasks through three tiers, the core pool (corePoolSize), the blocking queue (workQueue), and the maximum pool size (maximumPoolSize). This caching strategy appears in many other frameworks as well.
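The three-tier behavior can be observed directly with a deliberately tiny pool. This is a sketch: the numbers (core = 1, max = 2, queue capacity = 1) and the class name StagesDemo are chosen only to make each stage visible:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class StagesDemo {
    /** Runs the three stages against a tiny pool and reports what happened. */
    static String demo() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try { release.await(); }  // hold the worker busy until released
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        pool.execute(blocker);            // stage 1: core thread created
        pool.execute(blocker);            // stage 2: core busy -> task queued
        pool.execute(blocker);            // stage 3: queue full -> non-core thread
        boolean rejected = false;
        try { pool.execute(blocker); }    // pool saturated -> saturation policy
        catch (RejectedExecutionException e) { rejected = true; }
        String result = "poolSize=" + pool.getPoolSize()
                + " queued=" + pool.getQueue().size() + " rejected=" + rejected;
        release.countDown();
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```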

5. Closing the thread pool

The thread pool is closed with the shutdown or shutdownNow method. Both work by iterating over the pool's threads and interrupting them, but they differ:

  1. shutdownNow first sets the pool state to STOP, then attempts to interrupt all threads, including those currently executing tasks, and returns the list of tasks still waiting in the queue;
  2. shutdown merely sets the pool state to SHUTDOWN and interrupts only the threads that are not executing tasks, so running tasks (and tasks already in the queue) are allowed to finish.

In other words, shutdown lets in-progress work complete, while shutdownNow tries to interrupt it. After either call, isShutdown returns true; once all threads have terminated, the pool is fully closed and isTerminated returns true.
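A small sketch of the difference (the class name ShutdownDemo is illustrative): one task is running and one is queued when shutdownNow is called; the running task is interrupted and the queued one is returned unexecuted.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ShutdownDemo {
    /** Submits one running and one queued task, then calls shutdownNow. */
    static String demo() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        pool.execute(() -> {
            started.countDown();
            try { Thread.sleep(60_000); }       // simulate a long-running task
            catch (InterruptedException e) { }  // interrupted by shutdownNow
        });
        pool.execute(() -> { });                // sits in the queue, never runs
        started.await();                        // first task is now running
        List<Runnable> pending = pool.shutdownNow();  // STOP: interrupt + drain queue
        boolean done = pool.awaitTermination(5, TimeUnit.SECONDS);
        return "isShutdown=" + pool.isShutdown() + " pending=" + pending.size()
                + " isTerminated=" + (done && pool.isTerminated());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```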

6. How to configure thread pool parameters reasonably?

If you want to configure the thread pool reasonably, you must first analyze the task characteristics, which can be analyzed from the following perspectives:

  1. The nature of the task: CPU-intensive tasks, IO-intensive tasks and mixed tasks.
  2. The priority of the task: high, medium and low.
  3. Task execution time: long, medium and short.
  4. Task dependency: whether it depends on other system resources, such as database connections.

Tasks of different natures can be handled by separate pools of different sizes. CPU-bound tasks should use as few threads as possible, for example a pool of Ncpu + 1 threads. IO-bound tasks spend much of their time waiting on IO rather than using the CPU, so they can use more threads, for example 2 x Ncpu. A mixed task, if it can be split, should be decomposed into a CPU-bound part and an IO-bound part; as long as the two parts take roughly similar time, the decomposed version yields higher throughput than serial execution, while if their execution times differ greatly, decomposition is not worthwhile. The number of CPUs on the current machine can be obtained with the Runtime.getRuntime().availableProcessors() method.
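The rules of thumb above can be expressed as a trivial helper (the formulas are heuristics from the text, not hard rules, and the class name PoolSizing is illustrative):

```java
public class PoolSizing {
    /** Heuristic size for CPU-bound work: Ncpu + 1. */
    static int cpuBoundSize(int nCpu) { return nCpu + 1; }

    /** Heuristic size for IO-bound work: 2 * Ncpu. */
    static int ioBoundSize(int nCpu)  { return 2 * nCpu; }

    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();
        System.out.println("nCpu=" + nCpu
                + " cpuPool=" + cpuBoundSize(nCpu)
                + " ioPool=" + ioBoundSize(nCpu));
    }
}
```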

Tasks with different priorities can be processed using the priority queue PriorityBlockingQueue. It allows high-priority tasks to be executed first. It should be noted that if there are always high-priority tasks submitted to the queue, then the low-priority tasks may never be executed.
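A sketch of priority scheduling with PriorityBlockingQueue (note that tasks submitted through such a queue must be Comparable; PTask and the other names here are illustrative). A gate task occupies the single worker so that the later submissions pile up in the queue, where the higher-priority one moves to the head:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PriorityDemo {
    /** A runnable with an explicit priority; larger values run first. */
    static class PTask implements Runnable, Comparable<PTask> {
        final int priority;
        final Runnable body;
        PTask(int priority, Runnable body) { this.priority = priority; this.body = body; }
        public void run() { body.run(); }
        public int compareTo(PTask o) { return Integer.compare(o.priority, priority); }
    }

    static List<String> demo() throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new PriorityBlockingQueue<>());
        // The first task occupies the single worker while the others queue up.
        pool.execute(new PTask(0, () -> {
            try { gate.await(); } catch (InterruptedException e) { }
        }));
        pool.execute(new PTask(1, () -> log.add("low")));
        pool.execute(new PTask(10, () -> log.add("high")));
        gate.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return log;  // high-priority task runs before the low one
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```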

Tasks with different execution times can be handed over to thread pools of different sizes for processing, or priority queues can be used to allow tasks with short execution times to be executed first.

For tasks that depend on a database connection pool: after submitting SQL, the thread waits for the database to return results. The longer that wait, the longer the CPU sits idle, so the thread count should be set larger to keep the CPU better utilized.

Moreover, it is best to use a bounded blocking queue. With an unbounded queue, a backlog of tasks can consume ever more memory and may even crash the system.


Origin blog.csdn.net/Linuxhus/article/details/115030132