What you need to know about the thread pool

Foreword

Hello everyone, I'm Jack Xu. This is the second article in my concurrent programming series, and today's topic is the thread pool. It is a fairly long article, so please be patient and read it through.

Why use thread pool

1) Reduce the performance overhead of creating and destroying threads

2) Improve the response speed. When there is a new task to be executed, it can be executed immediately without waiting for the thread to be created.

3) Reasonably setting the thread pool size can avoid problems caused by the number of threads exceeding the hardware resource bottleneck

Alibaba's Java coding guidelines make the same point: threads in a project must be obtained from a thread pool rather than created explicitly. The reasons are the three points above.

Use of thread pool

First let's take a look at the UML class diagram

  • Executor: You can see that the top layer is the Executor interface. This interface is very simple, with only one execute method. The purpose of this interface is to decouple task submission and task execution.

  • ExecutorService: Also an interface. It extends Executor and defines more thread pool operations, such as submitting tasks that return a result and shutting the pool down.

  • AbstractExecutorService: Provides some default implementations of ExecutorService.

  • ThreadPoolExecutor: The thread pool implementation we actually use. It implements the complete working mechanism of a thread pool and is the focus of the analysis below.

  • ForkJoinPool: Like ThreadPoolExecutor, it extends AbstractExecutorService. It is suited to divide-and-conquer and recursive algorithms.

  • ScheduledExecutorService: This interface extends ExecutorService to define a method for delayed execution and periodic execution of tasks.

  • ScheduledThreadPoolExecutor: This class extends ThreadPoolExecutor and implements the ScheduledExecutorService interface, so it can run tasks after a delay or periodically.

It is important to understand the structure above. Executors is a utility class. Now let's look at the two ways to create a thread pool. The first is through the factory methods provided by Executors. The four commonly used ones are:

        Executor executor1 = Executors.newFixedThreadPool(10);
        Executor executor2 = Executors.newSingleThreadExecutor();
        Executor executor3 = Executors.newCachedThreadPool();
        Executor executor4 = Executors.newScheduledThreadPool(10);

The second is to call the constructor directly:

        ExecutorService executor5 = new ThreadPoolExecutor(1,
                1,
                0L,
                TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2), Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.AbortPolicy());

In fact, if you look at the source code behind the first approach, you will find:

    public static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                      60L, TimeUnit.SECONDS,
                                      new SynchronousQueue<Runnable>());
    }

Each factory method simply calls the ThreadPoolExecutor constructor with different arguments, so in essence there is only one way to create a thread pool: the constructor. I won't go into how each Executors factory method builds its pool. Instead, let's look at another Alibaba guideline.

The point should be clear by now: the factory methods hide too much. Developers who don't know what is behind them can easily misuse them, and that may lead to an OOM (OutOfMemoryError). Unless you are thoroughly familiar with the four pools they create, it is safer not to use them, which is why I did not describe them in detail. Next, we focus on the meaning of each parameter of the ThreadPoolExecutor constructor. There are several constructors; I chose the most complete one.
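
To make the OOM warning concrete, it can help to inspect what two of the factory methods actually configure. The following is a minimal sketch: the class name FactoryDefaults is made up for illustration, and the casts assume that newFixedThreadPool and newCachedThreadPool return a plain ThreadPoolExecutor, which is how OpenJDK implements them but is not promised by the API contract.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

class FactoryDefaults {
    /** Remaining capacity of the (empty) queue behind newFixedThreadPool. */
    static int fixedPoolQueueCapacity() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Assumption: the returned object is a ThreadPoolExecutor (true in OpenJDK).
            return ((ThreadPoolExecutor) pool).getQueue().remainingCapacity();
        } finally {
            pool.shutdown();
        }
    }

    /** Maximum pool size behind newCachedThreadPool. */
    static int cachedPoolMaxThreads() {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Same assumption about the concrete type as above.
            return ((ThreadPoolExecutor) pool).getMaximumPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed pool queue capacity: " + fixedPoolQueueCapacity());
        System.out.println("cached pool max threads:   " + cachedPoolMaxThreads());
    }
}
```

Both values come out as Integer.MAX_VALUE, so a burst of slow tasks can keep piling up queued tasks (fixed pool) or threads (cached pool) until memory runs out.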

public ThreadPoolExecutor(int corePoolSize,                  // number of core threads
                          int maximumPoolSize,               // maximum number of threads
                          long keepAliveTime,                // how long idle threads beyond the core count stay alive
                          TimeUnit unit,                     // time unit of keepAliveTime
                          BlockingQueue<Runnable> workQueue, // queue that holds tasks waiting to run
                          ThreadFactory threadFactory,       // factory used to create new threads
                          RejectedExecutionHandler handler)  // what to do when a task cannot be accepted

  • corePoolSize: the number of core threads in the pool, which is effectively the minimum number of threads kept alive. Unless allowCoreThreadTimeOut is enabled, core threads stay alive even when idle. An idle core thread is not destroyed; it blocks waiting for work, and when a new task is submitted it wakes up and runs the task.

  • maximumPoolSize: the maximum number of threads in the thread pool

  • keepAliveTime and unit: how long a thread beyond the core count may stay idle before it is terminated, and the time unit of that value

  • workQueue: a blocking queue that holds the tasks waiting to be executed by the pool. The following three types are commonly used:

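1) ArrayBlockingQueue: an array-backed FIFO queue; its capacity must be specified when it is created.
2) LinkedBlockingQueue: a linked-list-backed FIFO queue; if no capacity is specified, it defaults to Integer.MAX_VALUE.
3) SynchronousQueue: a special queue that does not store submitted tasks. Each task is handed directly to a thread, and a new thread is created for it if no idle thread is waiting.

  • threadFactory: We generally use the default factory, Executors.defaultThreadFactory(). Why use a factory? It standardizes how threads are created, instead of scattering new Thread calls whose results may differ from one place to another.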

  • handler: the saturation policy applied when the queue is full and the pool has reached maximumPoolSize.

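1. AbortPolicy: throws an exception directly; this is the default policy.
2. CallerRunsPolicy: runs the task on the thread that submitted it.
3. DiscardOldestPolicy: discards the task at the head of the blocking queue, then tries to execute the current task.
4. DiscardPolicy: silently discards the task.
You can also implement the RejectedExecutionHandler interface to define a custom saturation policy for your scenario, for example logging the rejected task or persisting it to storage.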
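
The threadFactory and handler parameters described above can be sketched as follows. This is only an illustration under stated assumptions: NamedThreadFactory, LoggingRejectionHandler, newPool, and firstWorkerName are hypothetical names, not part of the JDK or of the original article's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CustomPoolParts {
    /** Hypothetical factory that gives every thread a recognizable name. */
    static final class NamedThreadFactory implements ThreadFactory {
        private final String prefix;
        private final AtomicInteger seq = new AtomicInteger(1);

        NamedThreadFactory(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, prefix + "-" + seq.getAndIncrement());
        }
    }

    /** Hypothetical saturation policy: count and log the task instead of throwing. */
    static final class LoggingRejectionHandler implements RejectedExecutionHandler {
        final AtomicInteger rejected = new AtomicInteger();

        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            rejected.incrementAndGet();
            System.err.println("rejected " + r + ", queued = " + executor.getQueue().size());
        }
    }

    static ThreadPoolExecutor newPool(String prefix, RejectedExecutionHandler handler) {
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new NamedThreadFactory(prefix),
                handler);
    }

    /** Runs one task and reports the name of the worker thread that executed it. */
    static String firstWorkerName(String prefix) {
        ThreadPoolExecutor pool = newPool(prefix, new LoggingRejectionHandler());
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstWorkerName("demo-pool"));
    }
}
```

Named threads make thread dumps and logs much easier to read, and a counting handler gives you a number to alert on instead of an exception in the submitting thread. Whether dropping a task silently is acceptable depends on the application.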

Once the pool is created, it is simple to use. Tasks come in two kinds, without and with a return value, corresponding to the Runnable and Callable interfaces:

        // no return value
        executor5.execute(() -> System.out.println("jack xushuaige"));
        // with a return value
        String message = executor5.submit(() -> { return "jack xushuaige"; }).get();

Source code analysis

execute method

Let's start the analysis from the entry point, the execute method:

    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        int c = ctl.get();
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        else if (!addWorker(command, false))
            reject(command);
    }

The source contains a key comment that I did not paste here. Here is a translation of it:

It is processed in three steps:

1. If fewer than corePoolSize threads are running, try to start a new thread with the given command as its first task. The call to addWorker atomically checks runState and workerCount, and so prevents false alarms that would add a thread when it should not, by returning false;

2. Even if the task is queued successfully, we still need to double-check whether we should have added a thread (because existing threads may have died since the last check) or whether the pool was shut down after this method was entered. So we recheck the state and, if necessary, roll back the enqueuing if the pool has stopped, or start a new thread if there are none;

3. If the task cannot be queued, we try to add a new thread. If that fails, we know the pool is shut down or saturated, so the task is rejected.
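
The three steps above can be observed from the outside with a deliberately tiny pool. The sketch below assumes corePoolSize = 1, maximumPoolSize = 2, a queue of capacity 1, and the default AbortPolicy; the class name ExecuteOrderDemo and the blocking task are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteOrderDemo {
    /** Submits four blocking tasks and records where each one ended up. */
    static List<String> run() {
        CountDownLatch release = new CountDownLatch(1);
        // Every task parks on the latch, so no worker becomes free during the demo.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first task starts a core thread, the second waits in the queue, the third finds the queue full and starts a non-core thread, and the fourth is rejected because both the queue and the pool are full.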

If this is still unclear, don't worry. I have drawn the following flow chart.

Next, let's see what ctl is in the source code.

It is an AtomicInteger whose job is to hold both the number of worker threads and the state of the pool, packed together with bit operations. An int has 32 bits: the upper 3 bits store the run state and the lower 29 bits store the worker count.

Let's work through ctlOf(RUNNING, 0), where RUNNING = -1 << COUNT_BITS, that is, -1 shifted left by 29 bits. In binary, -1 is 32 ones (1111 1111 1111 1111 1111 1111 1111 1111); shifted left by 29 bits it becomes (1110 0000 0000 0000 0000 0000 0000 0000). ORing that with a worker count of 0 leaves the upper three bits as 111. The bit patterns of the other states are obtained in the same way. Bit operations like this are interesting, and the HashMap source uses them too. You can try them in everyday development where they fit, since they are fast. The thread pool has five states:

  • RUNNING: accepts new tasks and executes tasks in the queue

  • SHUTDOWN: does not accept new tasks, but still executes tasks in the queue

  • STOP: does not accept new tasks, does not execute tasks in the queue, and interrupts tasks in progress

  • TIDYING: all tasks have finished and the worker count is 0; a pool in this state is about to call the terminated() method

  • TERMINATED: the terminated() method has completed
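
The bit layout of ctl and the five states listed above can be reproduced in a standalone sketch. The constants and helper names mirror the private ones in the JDK 8 ThreadPoolExecutor source; the class CtlBits and the bits helper are made up here, since the real fields are not accessible from outside.

```java
class CtlBits {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // mask with the low 29 bits set

    // Run states live in the upper 3 bits.
    static final int RUNNING    = -1 << COUNT_BITS;      // 111
    static final int SHUTDOWN   =  0 << COUNT_BITS;      // 000
    static final int STOP       =  1 << COUNT_BITS;      // 001
    static final int TIDYING    =  2 << COUNT_BITS;      // 010
    static final int TERMINATED =  3 << COUNT_BITS;      // 011

    static int ctlOf(int rs, int wc)  { return rs | wc; }
    static int runStateOf(int c)      { return c & ~CAPACITY; }
    static int workerCountOf(int c)   { return c & CAPACITY; }
    static boolean isRunning(int c)   { return c < SHUTDOWN; }

    /** Formats a value as 32 binary digits, padded with leading zeros. */
    static String bits(int v) {
        return String.format("%32s", Integer.toBinaryString(v)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("RUNNING      = " + bits(RUNNING));
        System.out.println("ctl          = " + bits(c));
        System.out.println("worker count = " + workerCountOf(c));
        System.out.println("is running   = " + isRunning(c));
    }
}
```

Because RUNNING is the only negative state and the others increase in order, the source can compare states with plain integer comparisons such as rs >= SHUTDOWN.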

The transitions between these states are as follows:

addWorker method

The core of the execute flow is addWorker, so let's continue the analysis there. The source code looks intimidating, but it really does just two things. Let's take them one at a time.

Step 1: update the worker count. The code is as follows:

retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

retry is a label used together with the loops: continue retry jumps back to the label and runs the outer loop again, while break retry exits the whole loop body. The code first reads ctl and checks the run state, then checks the worker count against the limit for the kind of thread being created (core or non-core). It then tries to increment the worker count in ctl with a CAS, and on success it leaves the loop. On failure it re-reads ctl: if the run state differs from the one read earlier, it starts over from the top; if the state is unchanged, it retries the inner loop to update the worker count. The flow chart is as follows:
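
As a rough illustration of the labeled loop and CAS retry pattern described above, here is a small analogue. It is not the JDK code: SlotReserver is a hypothetical class that keeps its state (a closed flag) and its count in two separate fields, whereas ThreadPoolExecutor packs both into the single ctl value.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SlotReserver {
    private final AtomicInteger used = new AtomicInteger();
    private final int limit;
    private volatile boolean closed;

    SlotReserver(int limit) {
        this.limit = limit;
    }

    /** Tries to reserve one slot; fails if the reserver is closed or the limit is reached. */
    boolean tryReserve() {
        retry:
        for (;;) {
            if (closed) {
                return false;                 // state check, like rs >= SHUTDOWN
            }
            for (;;) {
                int n = used.get();
                if (n >= limit) {
                    return false;             // count check, like wc >= corePoolSize
                }
                if (used.compareAndSet(n, n + 1)) {
                    break retry;              // CAS succeeded: leave both loops
                }
                if (closed) {
                    continue retry;           // state changed: restart from the outer check
                }
                // Otherwise only the count changed under us; retry the inner loop.
            }
        }
        return true;
    }

    void close() {
        closed = true;
    }

    int used() {
        return used.get();
    }

    public static void main(String[] args) {
        SlotReserver reserver = new SlotReserver(2);
        System.out.println(reserver.tryReserve());
        System.out.println(reserver.tryReserve());
        System.out.println(reserver.tryReserve()); // limit reached
    }
}
```

The CAS guarantees that two threads never both claim the same slot, so the count cannot exceed the limit even without a lock.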

Step 2: add the worker to the workers set and start the thread it holds. The code is as follows:

boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
    w = new Worker(firstTask);
    final Thread t = w.thread;
    if (t != null) {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // Recheck while holding lock.
            // Back out on ThreadFactory failure or if
            // shut down before lock acquired.
            int rs = runStateOf(ctl.get());

            if (rs < SHUTDOWN ||
                (rs == SHUTDOWN && firstTask == null)) {
                if (t.isAlive()) // precheck that t is startable
                    throw new IllegalThreadStateException();
                workers.add(w);
                int s = workers.size();
                if (s > largestPoolSize)
                    largestPoolSize = s;
                workerAdded = true;
            }
        } finally {
            mainLock.unlock();
        }
        if (workerAdded) {
            t.start();
            workerStarted = true;
        }
    }
} finally {
    if (! workerStarted)
        addWorkerFailed(w);
}
return workerStarted;

As you can see, adding a worker requires acquiring the lock first, which keeps the operation safe under concurrency. If the worker is added successfully, start is called on the worker's thread. If the thread fails to start, addWorkerFailed is called to roll back. From this we can see that:

1. ThreadPoolExecutor does not create or start any threads when it is constructed; addWorker is called to create a thread only when execute is called.

2. In the addWorker method, a new worker is created and the thread held by it is started to perform the task.
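
The lazy creation of threads noted in the two points above is easy to confirm with getPoolSize. The following sketch uses a made-up class name, LazyStartDemo; prestartAllCoreThreads is a real ThreadPoolExecutor method that the article does not cover, shown here only as the way to opt out of lazy start.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LazyStartDemo {
    /** Returns the pool size after construction, after one execute, and after prestart. */
    static int[] poolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            int afterConstruction = pool.getPoolSize();   // no worker exists yet
            pool.execute(() -> { });                      // execute -> addWorker
            int afterFirstExecute = pool.getPoolSize();
            pool.prestartAllCoreThreads();                // start the remaining core threads
            int afterPrestart = pool.getPoolSize();
            return new int[] {afterConstruction, afterFirstExecute, afterPrestart};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizes();
        System.out.println("after construction:  " + sizes[0]);
        System.out.println("after first execute: " + sizes[1]);
        System.out.println("after prestart:      " + sizes[2]);
    }
}
```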

As mentioned above, once the number of threads has reached corePoolSize, the command is only added to workQueue. So how does a command in workQueue get executed? Let's look at the Worker source.

Worker class

Worker wraps a thread and is the unit of work in the executor. It extends AbstractQueuedSynchronizer and implements Runnable. You can simply think of a Worker as a thread with its own run method. Here is its constructor:

        Worker(Runnable firstTask) {
            setState(-1); // inhibit interrupts until runWorker
            this.firstTask = firstTask;
            this.thread = getThreadFactory().newThread(this);
        }

Let's take a look at these two important attributes

        /** Thread this worker is running in.  Null if factory fails. */
        final Thread thread;
        /** Initial task to run.  Possibly null. */
        Runnable firstTask;

firstTask holds the task passed in. thread is the thread that will process tasks; it is created through the ThreadFactory in the constructor rather than with new Thread directly, for the reason given earlier. Note that this is passed to newThread: because Worker implements Runnable, the t.start() call in addWorker ends up running the run method of the worker that t belongs to. The worker's run method is as follows:

public void run() {
    runWorker(this);
}

The runWorker source code is as follows:

    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            while (task != null || (task = getTask()) != null) {
                w.lock();
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }

Simple analysis

1. Take firstTask out of the worker and clear the field;

2. If there is no firstTask, call the getTask method to get the task from the workQueue;

3. Acquire the lock;

4. Execute beforeExecute. This is an empty method, which is implemented in subclasses if necessary;

5. Execute task.run;

6. Execute afterExecute. This is an empty method, which is implemented in subclasses if necessary;

7. Clear task, increment completedTasks, and release the lock;

8. When an exception is thrown or there are no more tasks to run, control reaches the outer finally block, which calls processWorkerExit to retire the current worker. After the worker is removed from workers, a new worker is created if the old one died from an exception or the worker count has dropped below corePoolSize, so that corePoolSize threads are maintained.

The line while (task != null || (task = getTask()) != null) keeps the worker fetching tasks from workQueue and executing them. getTask retrieves tasks from the BlockingQueue workQueue using either poll (with the keepAliveTime timeout) or take.
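
The worker loop described above can be reduced to a toy pool that shows only the essential idea: long-lived threads that keep pulling Runnables from a shared BlockingQueue. MiniPool is a hypothetical teaching sketch, not how ThreadPoolExecutor is implemented; it has a fixed thread count, no rejection policy, no interruption of running tasks, and it does not guard against execute racing with shutdown.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class MiniPool {
    private final BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>();
    private final Thread[] workers;
    private volatile boolean shutdown;

    MiniPool(int size) {
        workers = new Thread[size];
        for (int i = 0; i < size; i++) {
            workers[i] = new Thread(this::runWorker, "mini-worker-" + i);
            workers[i].start();
        }
    }

    void execute(Runnable task) {
        if (shutdown) {
            throw new IllegalStateException("pool is shut down");
        }
        workQueue.add(task);
    }

    /** Returns the next task, or null when the worker should exit. */
    private Runnable getTask() {
        for (;;) {
            if (shutdown && workQueue.isEmpty()) {
                return null;
            }
            try {
                // Timed poll so the shutdown flag is re-checked periodically.
                Runnable task = workQueue.poll(50, TimeUnit.MILLISECONDS);
                if (task != null) {
                    return task;
                }
            } catch (InterruptedException e) {
                // Ignored in this sketch; the loop re-checks the shutdown flag.
            }
        }
    }

    /** The analogue of runWorker: keep fetching and running tasks until getTask returns null. */
    private void runWorker() {
        Runnable task;
        while ((task = getTask()) != null) {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("task failed: " + e);
            }
        }
    }

    /** Stops accepting tasks, lets the workers drain the queue, and waits for them to exit. */
    void shutdownAndWait() {
        shutdown = true;
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        MiniPool pool = new MiniPool(2);
        for (int i = 0; i < 4; i++) {
            int id = i;
            pool.execute(() -> System.out.println(Thread.currentThread().getName() + " ran task " + id));
        }
        pool.shutdownAndWait();
    }
}
```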

That completes the analysis of how the executor creates and starts threads to run tasks. Other methods such as shutdown() and shutdownNow() are left for you to study on your own.

How to properly configure the size of the thread pool

The size of a thread pool should not be guessed, and bigger is not always better.

  • CPU-intensive tasks: these mainly perform computation and keep the CPU busy, so CPU utilization is high. The number of threads should be based on the number of CPU cores, and relatively few threads should be allocated, for example as many as there are CPUs.
  • IO-intensive tasks: these mainly perform IO, which takes a long time. Because the threads are not running all the time, the CPU is often idle, so the pool can be larger, for example twice the number of CPUs.

Of course, these are all empirical values, and the best way is to test and get the best configuration according to the actual situation.
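
The two rules of thumb above translate into very little code. In this sketch the class PoolSizing and its method names are made up, and the factor of 2 is simply the empirical value quoted in the text, not a measured optimum.

```java
class PoolSizing {
    /** CPU-bound work: roughly one thread per core. */
    static int cpuBound(int cpus) {
        return cpus;
    }

    /** IO-bound work: the empirical "twice the number of CPUs" starting point. */
    static int ioBound(int cpus) {
        return cpus * 2;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("cores:          " + cpus);
        System.out.println("CPU-bound pool: " + cpuBound(cpus));
        System.out.println("IO-bound pool:  " + ioBound(cpus));
    }
}
```

Note that availableProcessors reports what the JVM sees, which in a container may differ from the physical core count, so it is worth logging the value when tuning.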

Thread pool monitoring

If thread pools are used heavily in a project, there must be monitoring in place so that you know the current state of each pool and can locate problems quickly when they occur. We can monitor a pool by overriding its beforeExecute, afterExecute, and shutdown methods.

As their names and definitions suggest, these methods are meant to be overridden by subclasses, which can run custom logic before a task executes, after it executes, and when the pool shuts down.
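
A minimal monitoring subclass along these lines might look like the sketch below. MonitoredPool, its counters, and runDemo are hypothetical names chosen for illustration; a real system would more likely export the numbers to a metrics library than print them.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class MonitoredPool extends ThreadPoolExecutor {
    private final ThreadLocal<Long> startNanos = new ThreadLocal<>();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong finished = new AtomicLong();

    MonitoredPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startNanos.set(System.nanoTime());   // runs on the worker thread, before the task
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            totalNanos.addAndGet(System.nanoTime() - startNanos.get());
            finished.incrementAndGet();
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    public void shutdown() {
        System.out.printf("shutting down: finished=%d, active=%d, queued=%d%n",
                finished.get(), getActiveCount(), getQueue().size());
        super.shutdown();
    }

    long finishedTasks() {
        return finished.get();
    }

    long totalTaskNanos() {
        return totalNanos.get();
    }

    /** Runs the given number of trivial tasks and returns how many afterExecute observed. */
    static long runDemo(int tasks) {
        MonitoredPool pool = new MonitoredPool(2);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.finishedTasks();
    }

    public static void main(String[] args) {
        System.out.println("tasks finished: " + runDemo(10));
    }
}
```

The start time is kept in a ThreadLocal because beforeExecute and afterExecute for the same task run on the same worker thread, so no extra synchronization is needed to pair them.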

Summary

The thread pool is both simple and difficult. It is simple because it is easy to use, so you may think there is not much to say about it. It is difficult because you need to understand the underlying source code and how it schedules threads. Let me finish with two points. First, this article uses a lot of flow charts. When reading source code or developing complex business logic, calm down and draw a diagram first; otherwise you will get lost, or after an interruption you will have to start again from the beginning. Second, read the source code. A new graduate may be content to use something as long as it works, but if after five years of work you still only know how to use it and not how it is implemented, what advantage do you have over a new graduate, and why should your salary be higher? My own knowledge is limited, so if you have any questions, please share and discuss them.

Origin juejin.im/post/5e906cc2e51d4546e8576569