Baidu senior architect helps you analyze the Java thread pool ThreadPoolExecutor

The benefits of thread pools

  1. Reduce resource consumption: reusing already created threads reduces the overhead of creating and destroying threads.
  2. Improve response speed: when a task arrives, it can be executed immediately without waiting for a thread to be created.
  3. Improve thread manageability: threads are a scarce resource. Creating them without limit not only consumes system resources but also reduces system stability. A thread pool allows threads to be allocated, tuned, and monitored in a unified way.

Implementation principle

When a new task is submitted to the thread pool, the processing flow is as follows (figure: the main workflow of the Java thread pool):

1). The thread pool checks whether all threads in the core pool are executing tasks.
If not, it creates a new worker thread to execute the task. If all core threads are busy, it moves to the next step.
2). The thread pool checks whether the work queue is full.
If the work queue is not full, the newly submitted task is stored in the work queue. If the work queue is full, it moves to the next step.
3). The thread pool checks whether all of its threads are in the working state.
If not, it creates a new worker thread to execute the task. If the pool is full, the task is handed over to the saturation policy.

Schematic diagram of ThreadPoolExecutor executing the execute() method:

1). If fewer threads are running than corePoolSize, create a new thread to execute the task (note that this step requires acquiring the global lock).
2). If the number of running threads is equal to or greater than corePoolSize, add the task to the BlockingQueue.
3). If the task cannot be added to the BlockingQueue (the queue is full), create a new thread to process the task (this step also requires acquiring the global lock).
4). If creating a new thread would push the number of running threads above maximumPoolSize, the task is rejected and RejectedExecutionHandler.rejectedExecution() is called.

The overall design idea behind these steps in ThreadPoolExecutor is to avoid acquiring the global lock as much as possible when executing the execute() method (that lock would be a serious scalability bottleneck). After the ThreadPoolExecutor has warmed up (the number of currently running threads is greater than or equal to corePoolSize), almost every call to execute() takes step 2, and step 2 does not need to acquire the global lock.

Source code analysis: the flow above gives an intuitive picture of how the thread pool works; let us now see how it is implemented in the source code. The thread pool executes tasks in the following way:
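The code below is a lightly simplified paraphrase of the OpenJDK implementation of ThreadPoolExecutor.execute() (helper names such as ctl, workerCountOf, addWorker, and reject follow the JDK source); it is a sketch to illustrate the three-step flow described above, not a verbatim copy:

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    int c = ctl.get();
    // Step 1: fewer threads than corePoolSize -> try to add a new core worker
    // (addWorker acquires the pool's global lock internally).
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    // Step 2: core threads are all busy -> try to put the task into the work queue.
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (!isRunning(recheck) && remove(command))
            reject(command);            // the pool was shut down in the meantime
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);     // make sure at least one worker is alive
    }
    // Step 3: the queue is full -> try to add a non-core worker; if that fails
    // (maximumPoolSize reached), hand the task to the saturation policy.
    else if (!addWorker(command, false))
        reject(command);
}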
Worker threads: when the thread pool creates a thread, it wraps it in a Worker object. After the worker finishes executing its initial task, it keeps taking tasks from the work queue in a loop and executing them. We can see this from the worker's run method:
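The worker's run() method delegates to runWorker(); the loop below is a simplified paraphrase of that method, with the lock handling, state checks, and exception plumbing of the JDK source trimmed for readability:

final void runWorker(Worker w) {
    Runnable task = w.firstTask;   // the task the worker was created with, if any
    w.firstTask = null;
    try {
        // Keep pulling tasks from the work queue until getTask() returns null
        // (the pool was shut down or an idle worker timed out).
        while (task != null || (task = getTask()) != null) {
            try {
                beforeExecute(Thread.currentThread(), task);  // extension hook
                task.run();
                afterExecute(task, null);                     // extension hook
            } finally {
                task = null;
                w.completedTasks++;
            }
        }
    } finally {
        processWorkerExit(w, false);
    }
}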
new ThreadPoolExecutor(corePoolSize, maximumPoolSize, keepAliveTime, milliseconds, runnableTaskQueue, threadFactory, handler);
To create a thread pool, you need to pass in several parameters:
corePoolSize (the basic size of the thread pool): when a task is submitted to the thread pool, the pool creates a new thread to execute it even if there are idle basic threads that could run it; only once the number of tasks to be executed is greater than the basic size of the pool does it stop creating new threads. If the thread pool's prestartAllCoreThreads() method is called, the pool creates and starts all basic threads in advance.
runnableTaskQueue (task queue): A blocking queue used to hold tasks waiting to be executed. The following blocking queues can be selected.
ArrayBlockingQueue: It is a bounded blocking queue based on an array structure, which sorts elements according to the FIFO (first in, first out) principle.
LinkedBlockingQueue: A blocking queue based on a linked list structure, this queue sorts elements according to FIFO (first in, first out), and the throughput is usually higher than that of ArrayBlockingQueue. The static factory method Executors.newFixedThreadPool() uses this queue.
SynchronousQueue: a blocking queue that does not store elements. Each insert operation must wait until another thread performs a corresponding remove operation, otherwise the insert stays blocked. Its throughput is usually higher than LinkedBlockingQueue's; it is used by the static factory method Executors.newCachedThreadPool().
PriorityBlockingQueue: an unbounded blocking queue with priority support.
maximumPoolSize (the maximum size of the thread pool): the maximum number of threads the thread pool is allowed to create. If the queue is full and the number of threads already created is less than the maximum, the thread pool creates new threads to execute tasks. Note that this parameter has no effect if an unbounded task queue is used.
ThreadFactory: the factory used to create threads. Through the thread factory you can give each created thread a more meaningful name, which is very helpful when debugging and locating problems.

RejectedExecutionHandler (saturation policy): when both the queue and the thread pool are full, the pool is saturated and a policy must be applied to handle newly submitted tasks. The default policy is AbortPolicy, which throws an exception when a new task cannot be processed. JDK 1.5 provides the following four policies:
AbortPolicy: throw an exception directly.
CallerRunsPolicy: Only use the thread of the caller to run tasks.
DiscardOldestPolicy: discard the oldest task at the head of the queue, then try to execute the current task.
DiscardPolicy: Do not process, discard.
Of course, you can also implement the RejectedExecutionHandler interface to define a custom policy that fits your application scenario, for example logging or persisting tasks that cannot be handled, as in the sketch below.
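As a hedged sketch of such a custom policy, the handler below (a hypothetical LoggingRejectedHandler, not part of the JDK) simply logs the rejected task instead of throwing:

import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.logging.Logger;

// A minimal custom saturation policy: log the rejected task and drop it.
// A real policy might persist the task for later replay instead.
public class LoggingRejectedHandler implements RejectedExecutionHandler {
    private static final Logger LOG = Logger.getLogger("rejected-tasks");

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        LOG.warning("Task rejected, pool is saturated: " + r);
    }
}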
keepAliveTime (thread keep-alive time): how long a worker thread of the thread pool stays alive after becoming idle. If there are many tasks and each task executes for a relatively short time, this value can be increased to improve thread utilization.
TimeUnit (unit of thread activity retention time): optional units are days (DAYS), hours (HOURS), minutes (MINUTES), milliseconds (MILLISECONDS), microseconds (MICROSECONDS, thousandths of a millisecond) and nanoseconds (NANOSECONDS, thousandths of a microsecond).
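Putting the parameters together, a minimal construction could look like the sketch below; the concrete pool sizes, queue capacity, and policy are illustrative assumptions rather than recommendations from the text:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolFactory {
    // Illustrative parameter choices; tune them for your own workload.
    public static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                4,                                      // corePoolSize
                8,                                      // maximumPoolSize
                60L, TimeUnit.SECONDS,                  // keepAliveTime and its TimeUnit
                new ArrayBlockingQueue<Runnable>(1000), // bounded work queue
                Executors.defaultThreadFactory(),       // thread factory
                new ThreadPoolExecutor.AbortPolicy());  // saturation policy (the default)
    }
}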

Submitting tasks to the thread pool:
We can submit a task with the execute() method, but execute() has no return value, so there is no way to tell whether the task was executed successfully by the thread pool. As the following code shows, the task passed to execute() is an instance of Runnable.
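A minimal, self-contained sketch of submitting a Runnable with execute() (the fixed-size pool here is only for illustration):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecuteDemo {
    public static void main(String[] args) {
        ExecutorService threadPool = Executors.newFixedThreadPool(2); // illustrative pool
        threadPool.execute(new Runnable() {
            @Override
            public void run() {
                // Task logic; execute() returns void, so there is no handle
                // to check whether the task succeeded.
                System.out.println("task running on " + Thread.currentThread().getName());
            }
        });
        threadPool.shutdown();
    }
}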
We can also submit a task with the submit() method, which returns a Future. We can use this Future to determine whether the task executed successfully, and call the Future's get() method to obtain the return value; get() blocks until the task completes, while get(long timeout, TimeUnit unit) blocks for at most the given time and then returns, at which point the task may not have finished.
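A hedged sketch of submitting a Callable and reading its result through the returned Future (again, the pool itself is only illustrative):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitDemo {
    public static void main(String[] args) {
        ExecutorService threadPool = Executors.newFixedThreadPool(2); // illustrative pool
        Future<String> future = threadPool.submit(new Callable<String>() {
            @Override
            public String call() {
                return "result";
            }
        });
        try {
            // get() blocks until the task completes and returns its value;
            // get(timeout, unit) would wait at most the given time instead.
            System.out.println(future.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // restore the interrupt flag
        } catch (ExecutionException e) {
            e.getCause().printStackTrace();      // the task threw an exception
        } finally {
            threadPool.shutdown();
        }
    }
}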
Closing the thread pool
We can close the thread pool by calling its shutdown() or shutdownNow() method, but their implementation principles differ. Both work by traversing the worker threads in the pool and calling each thread's interrupt() method, so tasks that cannot respond to interruption may never terminate. shutdown() only sets the state of the thread pool to SHUTDOWN and then interrupts the threads that are not executing tasks. shutdownNow() first sets the state of the thread pool to STOP, then tries to stop all threads that are executing or suspending tasks, and returns the list of tasks that were waiting to be executed.
As soon as either of these two methods is called, the isShutdown() method returns true. Only when all tasks have terminated is the thread pool considered closed successfully, at which point isTerminated() returns true. Which method to call should be decided by the characteristics of the tasks submitted to the pool: usually shutdown() is used to close the pool, and shutdownNow() can be called if the tasks do not have to finish executing. A common pattern combining the two is sketched below.
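A hedged sketch of that pattern, under the assumption that running tasks should be given a bounded grace period before falling back to shutdownNow():

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public class ShutdownHelper {
    // Stop accepting new tasks, wait briefly for running tasks to finish,
    // then interrupt whatever is left.
    public static void shutdownGracefully(ExecutorService pool) {
        pool.shutdown();                      // SHUTDOWN: no new tasks accepted
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                List<Runnable> unstarted = pool.shutdownNow(); // STOP: interrupt workers
                System.out.println(unstarted.size() + " queued tasks were never started");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}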

Reasonable configuration of thread pools
To properly configure thread pools, you must first analyze the characteristics of tasks, which can be analyzed from the following perspectives:
The nature of tasks: CPU-intensive tasks, IO-intensive tasks, and mixed tasks.
The priority of the task: high, medium and low.
Task execution time: long, medium and short.
Dependency of the task: Whether it depends on other system resources, such as database connections.
Tasks of different natures can be handled by thread pools of different sizes. For CPU-intensive tasks, configure as few threads as possible, for example a pool of Ncpu+1 threads. IO-intensive tasks spend time waiting on IO and their threads are not always executing, so configure more threads, for example 2*Ncpu. For mixed tasks, split them into a CPU-intensive task and an IO-intensive task if possible: as long as the execution times of the two parts do not differ too much, the throughput after splitting will be higher than the throughput of serial execution; if the execution times differ too much, there is no need to split. The number of CPUs of the current machine can be obtained via the Runtime.getRuntime().availableProcessors() method, as in the following sketch.
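A small sketch of those sizing heuristics (Ncpu+1 and 2*Ncpu are the rough rules of thumb from the text, not exact prescriptions):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();

        // CPU-intensive work: keep the thread count close to the CPU count.
        ExecutorService cpuBoundPool = Executors.newFixedThreadPool(nCpu + 1);

        // IO-intensive work: threads spend much of their time waiting on IO,
        // so more threads than CPUs help keep the CPUs busy.
        ExecutorService ioBoundPool = Executors.newFixedThreadPool(2 * nCpu);

        System.out.println("CPU count: " + nCpu);
        cpuBoundPool.shutdown();
        ioBoundPool.shutdown();
    }
}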
Tasks with different priorities can be processed using the priority queue PriorityBlockingQueue. It allows tasks with higher priorities to be executed first. It should be noted that if tasks with higher priorities are always submitted to the queue, tasks with lower priorities may never be executed.
Tasks with different execution times can be handed over to thread pools of different sizes for processing, or a priority queue can be used to allow tasks with short execution times to be executed first.
Tasks that depend on a database connection pool: after submitting SQL, the thread has to wait for the database to return results. The longer the wait, the longer the CPU sits idle, so the number of threads should be set larger to make better use of the CPU.
It is recommended to use a bounded queue. A bounded queue improves the stability and early-warning capability of the system and can be set as large as needed, for example a few thousand entries. Once, the queue and thread pool of the background task thread pool used by our group were both full, and rejected-task exceptions were thrown continuously. The investigation found that the database had a problem and SQL execution had become very slow; since every task in the background task thread pool needed to query the database and insert data into it, all worker threads in the pool were blocked and tasks piled up in the pool. If we had used an unbounded queue at the time, the queue would have grown without bound, possibly exhausting memory and making the entire system unavailable, not just the background tasks. Of course, all tasks in our system are deployed on separate servers, and we use thread pools of different sizes to run different types of tasks, but when such a problem occurs, other tasks are affected as well.
Monitoring the thread pool
Monitoring through attributes provided by the thread pool. The thread pool exposes several attributes that can be used when monitoring it:
taskCount: the number of tasks the thread pool needs to execute.
completedTaskCount: The number of tasks that the thread pool has completed during the running process. Less than or equal to taskCount.
largestPoolSize: the largest number of threads the thread pool has ever created. From this value you can tell whether the thread pool has ever been full: if it equals the maximum size of the thread pool, the pool has been full at some point.
getPoolSize: the number of threads in the thread pool. If the thread pool is not destroyed, its threads are not destroyed automatically, so this size only increases and does not decrease.
getActiveCount: Get the number of active threads.
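These attributes are exposed through getter methods on ThreadPoolExecutor; a minimal snapshot sketch:

import java.util.concurrent.ThreadPoolExecutor;

public class PoolStats {
    // Print a one-line snapshot of the pool's monitoring attributes.
    public static void report(ThreadPoolExecutor pool) {
        System.out.printf(
                "tasks=%d completed=%d largestPoolSize=%d poolSize=%d active=%d queued=%d%n",
                pool.getTaskCount(),
                pool.getCompletedTaskCount(),
                pool.getLargestPoolSize(),
                pool.getPoolSize(),
                pool.getActiveCount(),
                pool.getQueue().size());
    }
}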
Monitoring by extending the thread pool: by subclassing the thread pool and overriding its beforeExecute, afterExecute, and terminated methods, we can run code before a task executes, after it executes, and before the thread pool shuts down, for example to monitor the average, maximum, and minimum execution time of tasks. These methods are empty in the thread pool. For example:
protected void beforeExecute(Thread t, Runnable r) { }
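Building on those hooks, the sketch below shows one common pattern (timing each task with a ThreadLocal start timestamp); the class name and the reporting done in terminated() are illustrative choices, not a prescribed implementation:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// A thread pool that measures how long each task runs by pairing
// beforeExecute/afterExecute around task.run().
public class TimingThreadPool extends ThreadPoolExecutor {
    private final ThreadLocal<Long> startTime = new ThreadLocal<Long>();
    private final AtomicLong totalTime = new AtomicLong();
    private final AtomicLong numTasks = new AtomicLong();

    public TimingThreadPool(int core, int max, long keepAlive, TimeUnit unit,
                            BlockingQueue<Runnable> queue) {
        super(core, max, keepAlive, unit, queue);
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startTime.set(System.nanoTime());   // remember when this task started
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            long elapsed = System.nanoTime() - startTime.get();
            numTasks.incrementAndGet();
            totalTime.addAndGet(elapsed);
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    protected void terminated() {
        try {
            long tasks = numTasks.get();
            if (tasks > 0) {
                System.out.printf("average task time: %d ns%n", totalTime.get() / tasks);
            }
        } finally {
            super.terminated();
        }
    }
}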
