Java thread pool ThreadPoolExecutor (1): the core methods and principles of the thread pool

1. Overview

Everyone knows how to create threads in Java, but is there anything wrong with creating them directly? Of course there is: creating and destroying a thread is expensive. Think of a bus depot: when you need a ride, a bus is dispatched; when the trip is over, the bus returns to the depot and waits for the next passenger instead of being sent to the scrapyard. The depot is the thread pool and each bus is a thread; the whole point is reuse. If the analogy does not quite click yet, don't worry, a more detailed introduction follows.

2. A first look at ThreadPoolExecutor

2.1 Official description

Open IntelliJ IDEA, jump to the ThreadPoolExecutor class, and you will find the following in its Javadoc:

 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *

With that official description in mind, the class should already feel a little more familiar.

2.2 Methods

2.2.1 Constructors

Below are the four constructors for creating a thread pool. Looking at the code, the first three simply delegate to the fourth.

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             Executors.defaultThreadFactory(), defaultHandler);
    }
    
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             threadFactory, defaultHandler);
    }

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              RejectedExecutionHandler handler) {
        this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
             Executors.defaultThreadFactory(), handler);
    }
    
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
        if (corePoolSize < 0 ||
            maximumPoolSize <= 0 ||
            maximumPoolSize < corePoolSize ||
            keepAliveTime < 0)
            throw new IllegalArgumentException();
        if (workQueue == null || threadFactory == null || handler == null)
            throw new NullPointerException();
        this.corePoolSize = corePoolSize;
        this.maximumPoolSize = maximumPoolSize;
        this.workQueue = workQueue;
        this.keepAliveTime = unit.toNanos(keepAliveTime);
        this.threadFactory = threadFactory;
        this.handler = handler;
    }

2.2.2 Parameters

  • corePoolSize: the size of the core pool. This parameter matters a lot for the implementation principle described later. By default, a newly created pool contains no threads at all; a thread is only created when a task arrives, unless the prestartCoreThread() or prestartAllCoreThreads() method is called, which, as the names suggest, pre-create one core thread or all corePoolSize core threads before any task is submitted. So by default the pool starts with 0 threads, new threads are created as tasks arrive, and once the number of threads reaches corePoolSize, further tasks are placed into the cache queue;
  • maximumPoolSize: the maximum number of threads in the thread pool. This is also a very important parameter: it indicates the largest number of threads the pool is ever allowed to create;
  • keepAliveTime: how long an idle thread survives before being terminated. By default keepAliveTime only applies while the number of threads in the pool exceeds corePoolSize: a thread whose idle time reaches keepAliveTime is terminated, until the pool is back down to corePoolSize threads. If allowCoreThreadTimeOut(true) has been called, however, keepAliveTime also applies when the thread count is at or below corePoolSize, so the pool can shrink all the way to 0 threads;
  • unit: the time unit of keepAliveTime. The TimeUnit class provides seven constants:
TimeUnit.DAYS;               // days
TimeUnit.HOURS;              // hours
TimeUnit.MINUTES;            // minutes
TimeUnit.SECONDS;            // seconds
TimeUnit.MILLISECONDS;       // milliseconds
TimeUnit.MICROSECONDS;       // microseconds
TimeUnit.NANOSECONDS;        // nanoseconds

 

  • workQueue: a blocking queue used to hold tasks waiting to be executed. This choice is also very important and significantly affects how the pool behaves at runtime. Generally speaking, the common options are:
ArrayBlockingQueue;
LinkedBlockingQueue;
SynchronousQueue;
PriorityBlockingQueue;

 ArrayBlockingQueue and PriorityBlockingQueue are used less often; LinkedBlockingQueue and SynchronousQueue are the usual choices. The pool's queuing strategy is determined by this BlockingQueue.

  • threadFactory: the thread factory used to create new threads;
  • handler: the policy applied when a task is rejected; there are four built-in values (a combined example using all seven parameters follows this list):
ThreadPoolExecutor.AbortPolicy: discard the task and throw a RejectedExecutionException.
ThreadPoolExecutor.DiscardPolicy: also discard the task, but without throwing an exception.
ThreadPoolExecutor.DiscardOldestPolicy: discard the task at the head of the queue, then retry submitting the current task (repeating this process if necessary).
ThreadPoolExecutor.CallerRunsPolicy: the task is run by the calling thread itself.
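
Putting the seven parameters together, here is a minimal sketch of constructing a pool by hand; the sizes, queue capacity, and the thread name prefix demo-pool- are arbitrary values chosen purely for illustration:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolConfigDemo {
    public static void main(String[] args) {
        // Hypothetical factory that gives worker threads readable names
        ThreadFactory namedFactory = new ThreadFactory() {
            private final AtomicInteger seq = new AtomicInteger(1);
            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "demo-pool-" + seq.getAndIncrement());
            }
        };

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5,                                     // corePoolSize
                10,                                    // maximumPoolSize
                60L, TimeUnit.SECONDS,                 // keepAliveTime + unit
                new ArrayBlockingQueue<Runnable>(100), // workQueue (bounded)
                namedFactory,                          // threadFactory
                new ThreadPoolExecutor.AbortPolicy()   // handler
        );

        pool.execute(() -> System.out.println(
                Thread.currentThread().getName() + " running a task"));
        pool.shutdown();
    }
}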

 2.2.3 Inheritance and implementation relationships

ThreadPoolExecutor extends AbstractExecutorService, so let's take a look at the AbstractExecutorService class first.

public abstract class AbstractExecutorService implements ExecutorService {
 
     
    protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value) { };
    protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) { };
    public Future<?> submit(Runnable task) {};
    public <T> Future<T> submit(Runnable task, T result) { };
    public <T> Future<T> submit(Callable<T> task) { };
    private <T> T doInvokeAny(Collection<? extends Callable<T>> tasks,
                            boolean timed, long nanos)
        throws InterruptedException, ExecutionException, TimeoutException {
    };
    public <T> T invokeAny(Collection<? extends Callable<T>> tasks)
        throws InterruptedException, ExecutionException {
    };
    public <T> T invokeAny(Collection<? extends Callable<T>> tasks,
                           long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException {
    };
    public <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
        throws InterruptedException {
    };
    public <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks,
                                         long timeout, TimeUnit unit)
        throws InterruptedException {
    };
}

AbstractExecutorService is an abstract class that implements the ExecutorService interface. Let's follow the chain and see what ExecutorService itself declares.

public interface ExecutorService extends Executor {
 
    void shutdown();
    boolean isShutdown();
    boolean isTerminated();
    boolean awaitTermination(long timeout, TimeUnit unit)
        throws InterruptedException;
    <T> Future<T> submit(Callable<T> task);
    <T> Future<T> submit(Runnable task, T result);
    Future<?> submit(Runnable task);
    <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
        throws InterruptedException;
    <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks,
                                  long timeout, TimeUnit unit)
        throws InterruptedException;
 
    <T> T invokeAny(Collection<? extends Callable<T>> tasks)
        throws InterruptedException, ExecutionException;
    <T> T invokeAny(Collection<? extends Callable<T>> tasks,
                    long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException;
}

 Blah blah, that is a lot of methods; feeling dizzy yet? Don't worry, notice that ExecutorService ultimately extends Executor:

public interface Executor {
    void execute(Runnable command);
}

At this point, everyone should understand the relationship between ThreadPoolExecutor, AbstractExecutorService, ExecutorService and Executor. 

What? Still not clear? It doesn't matter; the hierarchy is simply Executor -> ExecutorService -> AbstractExecutorService -> ThreadPoolExecutor.

 

Executor is the top-level interface. It declares a single method, execute(Runnable), whose return type is void and whose parameter is a Runnable; as the name suggests, it is used to execute the task passed in. The ExecutorService interface extends Executor and declares additional methods such as submit, invokeAll, invokeAny, and shutdown. The abstract class AbstractExecutorService implements ExecutorService and provides implementations for essentially all of the methods it declares. Finally, ThreadPoolExecutor extends AbstractExecutorService. Several methods in ThreadPoolExecutor are particularly important:

execute()

submit()

shutdown()

shutdownNow()

 

The execute() method is actually declared in Executor and implemented in ThreadPoolExecutor. It is the core method of ThreadPoolExecutor: through it, a task is handed to the thread pool to be executed by one of its threads.

The submit() method is declared in ExecutorService and implemented in AbstractExecutorService; ThreadPoolExecutor does not override it. It is also used to hand tasks to the pool, but unlike execute() it can return the result of the task. If you look at the implementation of submit(), you will find that it ultimately calls execute(), using a Future to capture the task's result (Future will be covered in the next article).

  shutdown() and shutdownNow() are used to close the thread pool.
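
As a quick sketch of how these methods differ in everyday use (the pool size and task bodies are arbitrary placeholders):

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitVsExecuteDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // execute(): fire-and-forget, no handle to the result
        pool.execute(() -> System.out.println("executed via execute()"));

        // submit(): wraps the task so the result (or exception) can be retrieved later
        Future<Integer> future = pool.submit(() -> 21 * 2);
        System.out.println("submit() result = " + future.get());

        // shutdown(): stop accepting new tasks, let queued tasks finish
        pool.shutdown();

        // shutdownNow() would instead interrupt running tasks and return the
        // ones still waiting in the queue (not needed here, shown for contrast):
        // List<Runnable> notRun = pool.shutdownNow();
    }
}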

  There are many other methods as well:

  For example, getQueue(), getPoolSize(), getActiveCount(), getCompletedTaskCount() and others expose the pool's internal state. Interested readers can consult the API documentation.

3. In-depth analysis of ThreadPoolExecutor

In the previous section, we introduced ThreadPoolExecutor from a macro perspective. Let's analyze the specific implementation principle of the thread pool in depth, and explain it from the following aspects:

 3.1 The state of the thread pool

 3.2 Execution of tasks

 3.3 Thread initialization in the thread pool

 3.4 Task cache queue and task queuing strategy

 3.5 Task rejection strategy

 3.6 Closing the thread pool

 3.7 Dynamic adjustment of thread pool capacity

3.1 The state of the thread pool

The following is the official comment from the source:

     *   RUNNING:  Accept new tasks and process queued tasks
     *   SHUTDOWN: Don't accept new tasks, but process queued tasks
     *   STOP:     Don't accept new tasks, don't process queued tasks,
     *             and interrupt in-progress tasks
     *   TIDYING:  All tasks have terminated, workerCount is zero,
     *             the thread transitioning to state TIDYING
     *             will run the terminated() hook method
     *   TERMINATED: terminated() has completed



When the thread pool is created, initially, the thread pool is in the RUNNING state;

  If the shutdown() method is called, the thread pool is in the SHUTDOWN state. At this time, the thread pool cannot accept new tasks, and it will wait for all tasks to be executed;

  If the shutdownNow() method is called, the thread pool is in the STOP state. At this time, the thread pool cannot accept new tasks and will try to terminate the executing tasks;

  When the pool is in the SHUTDOWN state with no remaining workers and an empty task cache queue, or in the STOP state with no remaining workers, it transitions to TIDYING, runs the terminated() hook method, and finally becomes TERMINATED.

 

These states correspond to concrete values. Note that the state is stored together with the worker count in a single AtomicInteger, which guarantees visibility across threads:


private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0)); // initial value: state RUNNING, worker count 0
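
The five states above, together with the worker count, are packed into this single AtomicInteger: the high 3 bits hold the run state and the low 29 bits hold the worker count. The relevant constants and helper methods, excerpted from the JDK 1.8 source, look like this:

    private static final int COUNT_BITS = Integer.SIZE - 3;
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // runState is stored in the high-order bits
    private static final int RUNNING    = -1 << COUNT_BITS;
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    private static final int STOP       =  1 << COUNT_BITS;
    private static final int TIDYING    =  2 << COUNT_BITS;
    private static final int TERMINATED =  3 << COUNT_BITS;

    // Packing and unpacking ctl
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    private static int ctlOf(int rs, int wc) { return rs | wc; }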

 

 3.2 Execution of tasks

private final BlockingQueue<Runnable> workQueue;              // task cache queue, holds tasks waiting to be executed
private final ReentrantLock mainLock = new ReentrantLock();   // main state lock of the pool; changes to pool state
                                                              // (pool size, runState, etc.) must hold this lock
private final HashSet<Worker> workers = new HashSet<Worker>();  // the set of worker threads

private volatile long  keepAliveTime;    // how long an idle thread is kept alive
private volatile boolean allowCoreThreadTimeOut;   // whether core threads are also subject to the keep-alive timeout
private volatile int   corePoolSize;     // core pool size (once the number of threads exceeds this, submitted tasks go into the cache queue)
private volatile int   maximumPoolSize;   // the maximum number of threads the pool tolerates

private volatile int   poolSize;       // the current number of threads in the pool

private volatile RejectedExecutionHandler handler; // task rejection policy

private volatile ThreadFactory threadFactory;   // thread factory, used to create threads

private int largestPoolSize;   // records the largest number of threads the pool has ever held

private long completedTaskCount;   // records the number of tasks that have finished executing

The role of each field is noted above. Let's focus on three of them: corePoolSize, maximumPoolSize, and largestPoolSize. corePoolSize is usually translated as the size of the core pool; the way I think of it, it is simply the normal size of the thread pool.

Here is an example. Suppose a factory has ten machines and each machine can only do one job at a time. When there is no work, the machines sit idle; when a job arrives, it is assigned to an idle machine. If all the machines are busy, extra jobs wait in line (the cache queue). If new jobs arrive much faster than the machines can finish them, the boss takes a remedial measure and buys more machines, say ten more (up to the maximum number of threads the pool can tolerate). If even the new machines cannot keep up, no further jobs are accepted (the rejection policy). Later, when work slows down, the boss sells off the newly bought machines, since keeping them idle is a waste of money (they sit outside the pool's corePoolSize), until only the original ten machines remain.

In this example corePoolSize is 10 and maximumPoolSize is 20 (10 + 10). In other words, corePoolSize is the regular size of the thread pool, while maximumPoolSize is, in my view, a remedy for the pool, that is, a fallback for when the task load suddenly spikes. For readability, the rest of this article will keep calling corePoolSize the core pool size.

In the ThreadPoolExecutor class, the core task-submission method is execute(). Tasks can also be submitted via submit(), but submit() ultimately calls execute(), so studying execute() is enough to understand the principle:

 public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();

        int c = ctl.get();
        // 1. fewer than corePoolSize threads: try to start a new core worker with this task
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        // 2. core is full: if the pool is still running, try to enqueue the task
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            // recheck: if the pool stopped in the meantime, roll back the enqueue and reject
            if (! isRunning(recheck) && remove(command))
                reject(command);
            // if there are no workers at all, start one so the queue gets drained
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        // 3. queue is full: try to start a non-core worker; if that fails, reject
        else if (!addWorker(command, false))
            reject(command);
    }

 After reading it without comments, do you feel as dizzy as after two catties of Erguotou? It doesn't matter; let's walk through the code step by step.

Step 1: check whether the submitted task is null; if it is, a NullPointerException is thrown.

Step 2: check whether the current number of worker threads is less than corePoolSize. If it is, the task is handed directly to a new core worker via addWorker(command, true).

Step 3: otherwise try to place the task into the waiting queue with workQueue.offer(command). If enqueuing fails (for example, a bounded queue has reached its capacity, or a SynchronousQueue is used), fall through to addWorker(command, false) and try to run the task on a new non-core thread.

Step 4: if the number of threads has already reached maximumPoolSize (or the pool is no longer running), addWorker() fails, the submission is refused, and the rejection policy is applied via reject(command); the small sketch below makes this ordering visible.
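
Here is a minimal, self-contained sketch (the sizes are arbitrary: core 2, maximum 4, queue capacity 2, and the sleeps merely keep tasks busy). Tasks 1-2 occupy the core threads, 3-4 wait in the queue, 5-6 trigger extra threads up to the maximum, and anything beyond that is rejected by the default AbortPolicy:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecuteFlowDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(2));

        for (int i = 1; i <= 9; i++) {
            try {
                pool.execute(() -> {
                    try { Thread.sleep(2000); } catch (InterruptedException ignored) {}
                });
                System.out.printf("task %d -> poolSize=%d, queueSize=%d%n",
                        i, pool.getPoolSize(), pool.getQueue().size());
            } catch (RejectedExecutionException e) {
                System.out.println("task " + i + " rejected");
            }
        }
        pool.shutdown();
    }
}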

The flow can be summed up roughly as: fewer than corePoolSize threads, start a new core worker; otherwise enqueue the task; if the queue is full, start a non-core worker up to maximumPoolSize; beyond that, reject.

 

3.3 Thread initialization in the thread pool

By default, the thread pool does not create any threads when it is constructed. What if you want threads created up front? The JDK provides the following methods.

Initialize a core thread

public boolean prestartCoreThread() {
        return workerCountOf(ctl.get()) < corePoolSize &&
            addWorker(null, true);
}

Initialize all core threads

public int prestartAllCoreThreads() {
        int n = 0;
        while (addWorker(null, true))
            ++n;
        return n;
}
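
For illustration, a minimal sketch with an arbitrary core size of 4; the pool starts empty and prestartAllCoreThreads() brings it up to corePoolSize before any task is submitted:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PrestartDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

        System.out.println("before prestart: poolSize=" + pool.getPoolSize());   // 0
        int started = pool.prestartAllCoreThreads();
        System.out.println("prestarted " + started
                + " core threads, poolSize=" + pool.getPoolSize());              // 4
        pool.shutdown();
    }
}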

Then you can see that these two methods call the same method addWorker, which is used to create, run, and clean up Workers.

private boolean addWorker(Runnable firstTask, boolean core) {
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            w = new Worker(firstTask);
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int rs = runStateOf(ctl.get());

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }

Do you feel dizzy again after seeing this? It doesn't matter, let's analyze the semantics in the code step by step.

Step 1: read the current state of the thread pool. If it is STOP, TIDYING, or TERMINATED, return false. If it is SHUTDOWN but firstTask is not null or the workQueue is empty, also return false (a SHUTDOWN pool may only add a worker in order to drain an existing, non-empty queue).

Step 2: in a spin loop, decide whether the worker being added counts against corePoolSize or maximumPoolSize (the core flag). If the current workerCount already reaches the corresponding bound (or CAPACITY), the pool is full, so return false. Otherwise try to increment workerCount with a CAS operation; if it succeeds, break out to step 3. If the CAS fails, re-read ctl: if the pool state has changed since entering the spin, retry the outer loop (continue retry); otherwise retry the inner loop.

Step 3: if that succeeds, create a new Worker object and acquire the pool's reentrant lock to re-check the pool state. The worker is only added when the pool is still RUNNING, or is SHUTDOWN with a null firstTask. If the worker's thread is already alive (i.e. it was started prematurely by the ThreadFactory), an IllegalThreadStateException is thrown; if anything goes wrong and the worker never starts, the addWorkerFailed method is executed.

addWorkerFailed method

private void addWorkerFailed(Worker w) {
        final ReentrantLock mainLock = this.mainLock; // obtain the reentrant lock
        mainLock.lock();                              // acquire the lock
        try {
            if (w != null)
                workers.remove(w);     // remove the failed worker from the worker set
            decrementWorkerCount();    // decrement workerCount by one
            tryTerminate();            // attempt to terminate the pool if appropriate
        } finally {
            mainLock.unlock();         // release the lock
        }
}

Step 4: if the state check passes, add the newly created worker to workers, update largestPoolSize if needed, and then start the worker's thread so it begins executing tasks.

3.4 Task cache queue and task queuing strategy

Earlier we mentioned that once the pool already has corePoolSize threads, new tasks are placed into the cache queue, so let's look at the task cache queues.

1. ArrayBlockingQueue: an array-based, first-in-first-out bounded queue whose size must be specified at creation. Once the pool has corePoolSize busy threads, further tasks are added to this queue up to its capacity; when the queue is full, additional tasks cause new threads to be created, until the thread count reaches maximumPoolSize, after which tasks are rejected.

2. LinkedBlockingQueue: a linked-list-based, first-in-first-out queue that is unbounded by default. Tasks beyond corePoolSize keep piling up in the queue until resources are exhausted, which also means maximumPoolSize never takes effect.

3. SynchronousQueue: a rather special queue that does not hold submitted tasks at all; each task is handed directly to a thread, and a new thread is created whenever no idle one is available. The sketch after this list shows how the JDK's own factory methods rely on these queue choices.
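
The factory methods in Executors show how much the queue choice matters; in JDK 1.8, newFixedThreadPool is built on an unbounded LinkedBlockingQueue while newCachedThreadPool is built on a SynchronousQueue. The sketch below reproduces those two configurations by hand:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueChoiceDemo {
    public static void main(String[] args) {
        // Roughly what Executors.newFixedThreadPool(4) builds:
        // a fixed number of threads and an unbounded queue, so maximumPoolSize never matters.
        ThreadPoolExecutor fixed = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());

        // Roughly what Executors.newCachedThreadPool() builds:
        // no core threads; the SynchronousQueue hands each task straight to a thread,
        // creating a new one whenever no idle thread is available.
        ThreadPoolExecutor cached = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());

        fixed.shutdown();
        cached.shutdown();
    }
}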

3.5 Task rejection strategy

The following is the official Javadoc description of each policy:

 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>

1. ThreadPoolExecutor.AbortPolicy: discard the task and directly throw an exception RejectedExecutionException.

2. ThreadPoolExecutor.CallerRunsPolicy: the task is run by the thread that called execute(), which provides a simple feedback mechanism that slows down the rate at which new tasks are submitted.

3. ThreadPoolExecutor.DiscardPolicy: Discard the current task, but no exception will be thrown.

4. ThreadPoolExecutor.DiscardOldestPolicy: if the executor has not been shut down, discard the task at the head of the work queue and retry the submission (which may fail again, repeating the process). A small sketch of CallerRunsPolicy in action follows.
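
As a sketch of swapping in a different policy, the deliberately tiny pool below (1 thread, 1 queue slot) forces CallerRunsPolicy to fire, so the overflow task ends up running on the submitting main thread:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 1; i <= 3; i++) {
            final int id = i;
            pool.execute(() -> {
                System.out.println("task " + id + " on " + Thread.currentThread().getName());
                try { Thread.sleep(500); } catch (InterruptedException ignored) {}
            });
        }
        // Expected: the overflow task prints the "main" thread name,
        // which naturally slows down the rate of submission.
        pool.shutdown();
    }
}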

3.6 Closing the thread pool

 A thread pool that has been started should of course also be shut down. The JDK provides two ways to do so.

1. shutdown(): an orderly shutdown. Previously submitted tasks, including those waiting in the cache queue, are still executed, but no new tasks are accepted (the relatively graceful option).

2. shutdownNow(): shuts the pool down immediately, attempting to interrupt the tasks that are executing, clearing the cache queue, and returning the tasks that never started. A minimal sketch of both follows.
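
A minimal sketch combining both, with an arbitrary 5-second grace period before falling back to the forceful shutdown:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 5; i++) {
            pool.execute(() -> {
                try { Thread.sleep(1000); } catch (InterruptedException ignored) {}
            });
        }

        pool.shutdown();                 // graceful: no new tasks, queued ones still run
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
            List<Runnable> notRun = pool.shutdownNow(); // forceful: interrupt workers, drain the queue
            System.out.println(notRun.size() + " queued tasks never started");
        }
    }
}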

3.7 Dynamic adjustment of thread pool capacity 

What if you want to resize the thread pool dynamically? The JDK provides the corresponding methods.

1. setCorePoolSize: as the name implies, sets a new corePoolSize. If the new value is smaller than the current one, the excess threads are terminated when they next become idle; if it is larger, new threads are started, when needed, to execute tasks waiting in the cache queue.

2. setMaximumPoolSize: sets a new maximumPoolSize for the pool. If the new value is smaller than the old one, excess threads are terminated when they next become idle. A short sketch of resizing follows.
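
A small sketch of resizing a live pool (the numbers are arbitrary):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ResizeDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

        // Grow the pool when load increases...
        pool.setCorePoolSize(8);
        pool.setMaximumPoolSize(16);

        // ...and shrink it again later; excess threads are terminated
        // when they next become idle.
        pool.setCorePoolSize(4);
        pool.setMaximumPoolSize(8);

        pool.shutdown();
    }
}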

All of the source code quoted above is taken from JDK 1.8.

Origin blog.csdn.net/m0_37506254/article/details/90574038