How to use thread pools gracefully

Thread pools are widely used in real projects and come up in almost every interview. This article reviews the essentials. Before looking at thread pools, let's first clarify what a process is and what a thread is.

process

  • Program: generally a file consisting of a set of CPU instructions, statically stored on a storage device such as a hard disk
  • Process: When a program is to be run by a computer, a runtime instance of the program is generated in memory, and we call this instance a process

After the user issues the command to run the program, a process will be generated, and the same program can generate multiple processes (one-to-many relationship) to allow multiple users to run the same program at the same time without conflict.

A process needs resources to do its work, such as CPU time, memory, files, and I/O devices. Processes are scheduled in turn, and each CPU core can run only one process at any given moment. An application, however, rarely consists of a single task executed in one line; it usually has many tasks. Giving every task its own process is a heavyweight operation, for two reasons:

  1. Creating a process takes up too many resources
  2. Communication between processes requires data to be passed around in different memory spaces, so inter-process communication will take more time and resources

thread

A thread is the smallest unit that the operating system can schedule. In most cases it lives inside a process and is the actual unit of execution within that process. A process can run multiple threads concurrently, each performing a different task. Threads in the same process share the process's resources, such as its virtual address space, file descriptors, and signal handlers, but each thread has its own call stack.

> A process can have many threads, each thread performing different tasks in parallel.

data in the thread

  1. Local data on the thread stack: for example, the local variables of an executing method. The Java thread model is stack-based, and each thread has its own stack space.
  2. Global data shared across the whole process: a running Java program is a process (we can run ps -ef | grep java to see how many Java processes are running). Data such as global (static) variables is isolated between processes but shared between the threads of one process.
  3. Thread-private data: in Java, we can use ThreadLocal to create variables that are private to each thread.

> Local data on the thread stack is valid only inside the method that declares it, whereas thread-private data can be shared by multiple methods running on the same thread.
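
To make the three kinds of data concrete, here is a minimal sketch. The class name ThreadDataDemo and its fields are hypothetical, chosen only for illustration; ThreadLocal.withInitial and AtomicInteger are standard JDK APIs.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadDataDemo {
    // Process-wide data: visible to every thread in this JVM.
    static final AtomicInteger sharedCounter = new AtomicInteger();

    // Thread-private data: each thread sees its own copy, starting at 0.
    static final ThreadLocal<Integer> perThreadCounter = ThreadLocal.withInitial(() -> 0);

    static void increment() {
        int local = 1; // stack-local: only valid inside this method call
        sharedCounter.addAndGet(local);
        perThreadCounter.set(perThreadCounter.get() + local);
    }

    // Calls increment() `calls` times on a fresh thread and returns
    // that thread's private counter.
    static int runOnNewThread(int calls) {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            for (int i = 0; i < calls; i++) {
                increment();
            }
            // Read in a different method from where it was written, same thread.
            result[0] = perThreadCounter.get();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("thread A private count = " + runOnNewThread(2));
        System.out.println("thread B private count = " + runOnNewThread(3));
        System.out.println("shared count = " + sharedCounter.get());
    }
}
```

Each thread only ever sees its own ThreadLocal value, while the static counter accumulates the increments of both threads.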

CPU-bound and IO-bound

Knowing whether a server's workload is CPU-intensive or IO-intensive helps us choose better thread pool parameters. The specific settings are analyzed later, in the thread pool sections; for now it is enough to know the two concepts.

  • IO-intensive: the CPU is idle most of the time, waiting for IO operations (such as disk IO) to complete
  • CPU-intensive (compute-intensive): the CPU is busy computing most of the time, while disk IO is mostly idle

Thread Pool

A thread pool is an application of pooling. Pooling is common: database connection pools, memory pools and the constant pool in Java, and so on. Why does pooling exist? At bottom, a program processes information by using system resources (CPU, memory, network, disk, and so on). For example, creating an object instance in the JVM consumes CPU and memory. If a program frequently creates a large number of short-lived objects, it must also destroy them frequently, and that code is very likely to become a performance bottleneck. Pooling helps in the following ways:

  • Reuse the same resources, reduce waste, and reduce the cost of new construction and destruction;
  • Reduce the cost of separate management and hand over to the "pool" uniformly;
  • Centralized management to reduce "fragmentation";
  • Improve system response speed, because there are existing resources in the pool, and there is no need to recreate them;

Pooling addresses exactly these problems. In short, a pool keeps objects after they have been used, so that the next request can take one from the pool and reuse it, avoiding frequent creation and destruction. In Java everything is an object, and a thread is an object too. A Java thread wraps an operating system thread, so creating one also consumes operating system resources; hence the thread pool. But how do we create one?

Four thread pools provided by Java

Java provides us four ways to create a thread pool.

  • Executors.newCachedThreadPool: creates a cached thread pool with no upper limit on the number of threads. If no idle thread is available when a task arrives, a new thread is created for it. A thread that stays idle for more than 60 seconds is destroyed. Simply put, it creates temporary threads without limit when busy and reclaims them when idle.

    public static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                      60L, TimeUnit.SECONDS,
                                      new SynchronousQueue<Runnable>());
    }
    
    
  • Executors.newFixedThreadPool: creates a fixed-size thread pool, which caps the number of threads running concurrently; excess tasks wait in the queue. Simply put, when all threads are busy, tasks are placed in a queue of unlimited length.

    public static ExecutorService newFixedThreadPool(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads,
                                      0L, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<Runnable>());
    }
    
    
  • Executors.newSingleThreadExecutor: creates a thread pool with exactly one thread. That single thread executes all tasks, which guarantees that they run in submission order.

    public static ExecutorService newSingleThreadExecutor() {
        return new FinalizableDelegatedExecutorService
            (new ThreadPoolExecutor(1, 1,
                                    0L, TimeUnit.MILLISECONDS,
                                    new LinkedBlockingQueue<Runnable>()));
    }
    
    
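  • Executors.newScheduledThreadPool: creates a thread pool with a fixed number of core threads that supports delayed and periodic task execution. It is backed by ScheduledThreadPoolExecutor, whose constructor is shown here: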

    public ScheduledThreadPoolExecutor(int corePoolSize) {
        super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
              new DelayedWorkQueue());
    }
    
    

How to create a thread pool

If we click through to the source code of these four factory methods, we can see that they all work the same way underneath: each one creates a ThreadPoolExecutor, and only the arguments differ. Let's look at the parameters of the ThreadPoolExecutor constructor.

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)

So what exactly do these parameters mean?

  • corePoolSize: the number of core threads in the thread pool
  • maximumPoolSize: the maximum number of threads allowed in the thread pool
  • keepAliveTime: when the number of threads exceeds corePoolSize, idle threads are destroyed. This parameter sets how long a thread may stay idle before it is destroyed.
  • unit: the time unit of keepAliveTime
  • workQueue: the work queue. Once the number of threads has reached corePoolSize, new tasks are put into this queue.
  • threadFactory: threads are created through the factory pattern. This parameter sets our custom thread factory.
  • handler: the rejection policy, executed when the queue is full and the number of threads has reached maximumPoolSize

Next, we combine these parameters to see what their processing logic is.

  1. While fewer than corePoolSize threads exist, each incoming task gets a newly created thread
  2. Once the number of threads in the pool has reached corePoolSize, the next task is put into the workQueue we set above
  3. If workQueue is full when another task arrives, a new temporary (non-core) thread is created. Threads above corePoolSize (or all threads, if allowCoreThreadTimeOut is set) are checked for activity and destroyed once they have been idle for keepAliveTime.
  4. If the queue is full and the number of threads has already reached maximumPoolSize, the rejection policy handler we set is executed
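
The four steps above can be observed with a deliberately tiny pool. The following is a sketch under assumed settings (corePoolSize 1, maximumPoolSize 2, a bounded queue of capacity 1); the class name SubmitFlowDemo is hypothetical. Every task blocks on a latch, so no thread becomes free while the tasks are being submitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SubmitFlowDemo {
    // Submits four blocking tasks and records what the pool did with each one.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker busy
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    log.add("task " + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // Default handler (AbortPolicy) throws when the pool is saturated.
                    log.add("task " + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With these settings the log reads: task 1 starts the core thread (threads=1 queued=0), task 2 is queued (threads=1 queued=1), task 3 finds the queue full and starts a second thread (threads=2 queued=1), and task 4 is rejected.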

Why is it recommended not to use the thread pool creation method provided by Java?

With these parameters understood, let's look at why the "Alibaba Java Manual" recommends against creating thread pools with these factory methods. After seeing how the four thread pools are implemented above, the reasons should be clear.

  • FixedThreadPool and SingleThreadExecutor: both use a LinkedBlockingQueue as the work queue. This linked-list queue has no length limit by default (its capacity is Integer.MAX_VALUE), so it is effectively unbounded; if a large number of requests pile up, they can cause an OOM.
  • CachedThreadPool and ScheduledThreadPool: both set the maximum number of threads to Integer.MAX_VALUE. With CachedThreadPool, a large number of requests can create a huge number of threads and cause an OOM. (ScheduledThreadPoolExecutor uses an unbounded delay queue, so in practice its risk comes from the queue growing rather than from the thread count.)

How to set parameters

Therefore, if we want to use thread pools in our projects, it is better to create them explicitly, based on the conditions of our own projects and machines. So how should these parameters be set? To size a thread pool properly, you need to understand the machine configuration, the resources the tasks need, and the characteristics of the tasks. For example, how many CPUs does the deployment machine have? How much memory? Are the tasks mainly IO-intensive or CPU-intensive? Do they need a scarce resource such as a database connection?

> If you have several classes of tasks that behave very differently, consider using multiple thread pools. Each pool can then be tuned for its own kind of task, and a failure or backlog in one kind of task will not drag down the others.

  • CPU-intensive tasks: the work is dominated by computation. With N CPUs, a pool size of N+1 usually gives optimal utilization. When a compute-intensive thread is occasionally suspended by a page fault or for some other reason, the one extra thread ensures that CPU cycles do not go unused.

  • IO-intensive tasks: the CPU spends most of its time waiting on blocking IO operations, so the pool can be configured larger. The appropriate size can be estimated from a few parameters:

    • N: the number of CPUs
    • U: target CPU usage, 0<=U<=1
    • W/C: ratio of wait time to computation time
    • Then the optimal pool size is N*U*(1+W/C)
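
As a worked example of the formula, the sketch below computes N*U*(1+W/C) for sample inputs. The class and method names are hypothetical, and rounding the result up is an assumption of this sketch rather than part of the formula.

```java
class PoolSizing {
    // Estimates a pool size as N * U * (1 + W/C), rounded up, and never below 1.
    static int poolSize(int cpus, double targetUtilization,
                        double waitTime, double computeTime) {
        if (cpus < 1 || targetUtilization < 0 || targetUtilization > 1
                || waitTime < 0 || computeTime <= 0) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        double size = cpus * targetUtilization * (1 + waitTime / computeTime);
        return Math.max(1, (int) Math.ceil(size));
    }

    public static void main(String[] args) {
        // Pure computation: no waiting, so the estimate equals the CPU count.
        System.out.println(poolSize(8, 1.0, 0, 1));    // 8
        // Tasks wait 90 ms for every 10 ms of computation, target 50% CPU usage.
        System.out.println(poolSize(8, 0.5, 90, 10));  // 40
        // In a real deployment N would typically come from the runtime.
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println(poolSize(n, 1.0, 90, 10));
    }
}
```

The wait and compute times have to be measured for the actual workload; the values above are made up for the example.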

> Page fault (also known as a hard fault, page miss, or page fault interrupt): an interrupt issued by the CPU's memory management unit when software tries to access a page that is mapped in the virtual address space but is not currently loaded in physical memory.

In practice, the thread pool size should be set according to your own workload. For example, when tasks need a pooled resource such as a database connection, the size of the thread pool and the size of the resource pool affect each other. If each task requires a database connection, the size of the connection pool limits the effective size of the thread pool. Similarly, when the tasks in the thread pool are the only consumers of the connection pool, the size of the thread pool limits the effective size of the connection pool.

Thread destruction in thread pool

The creation and destruction of threads is governed jointly by the number of core threads (corePoolSize), the maximum number of threads (maximumPoolSize), and the keep-alive time (keepAliveTime). Let's review how the thread pool creates and destroys threads:

  • Current number of threads < number of core threads: a new thread is created for each task
  • Current number of threads = number of core threads: new tasks are added to the queue
  • Current number of threads > number of core threads: this only happens once the queue is full, at which point new threads are created. These extra threads are checked for activity, and any thread that stays idle for keepAliveTime is reclaimed.

Some readers may now think of setting corePoolSize to 0 (CachedThreadPool above does exactly this), so that threads are created dynamically: none while the pool is idle, and new ones only when it is busy. The idea is sound, but if we set corePoolSize to 0 in a custom pool and the work queue is not a SynchronousQueue, there is a problem, because additional threads are created only when the queue is full. The following code uses an unbounded LinkedBlockingQueue; let's look at its output.

ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
for (int i = 0; i < 10; i++) {
    threadPoolExecutor.execute(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.printf("1");
        }
    });
}

Running it shows that a 1 is printed once per second. This defeats the purpose of using a thread pool, because all the tasks run on a single thread.

But if we replace the work queue with a SynchronousQueue, all the 1s are printed together.

> SynchronousQueue is not a real queue but a mechanism for handing information directly from one thread to another. Think of a producer handing a message to the SynchronousQueue: if a consumer thread is waiting to receive it, the message is delivered directly to that consumer; otherwise the producer blocks (or, for the non-blocking offer that the thread pool uses, the hand-off fails).
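
The difference between the two queues can also be measured instead of watched. The sketch below (the class name ZeroCoreDemo is hypothetical) submits tasks that all block on a latch and then reads getLargestPoolSize(), a standard ThreadPoolExecutor method that reports the largest number of threads that have ever been in the pool at the same time.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ZeroCoreDemo {
    // Submits `tasks` blocking tasks to a pool with corePoolSize 0 and
    // returns how many threads the pool created for them.
    static int largestPoolSize(BlockingQueue<Runnable> queue, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 1, TimeUnit.SECONDS, queue);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep every worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // unblock the tasks so the pool can drain
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Unbounded queue: the queue never fills, so only one thread is created.
        System.out.println(largestPoolSize(new LinkedBlockingQueue<>(), 10)); // 1
        // Direct hand-off: every offer fails while workers are busy, so each task gets a thread.
        System.out.println(largestPoolSize(new SynchronousQueue<>(), 10));    // 10
    }
}
```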

So when we set thread pool parameters, we should think through how threads are created and destroyed; otherwise our custom thread pool may be no better than the four thread pools that Java provides.

Rejection Policies in the Thread Pool

ThreadPoolExecutor provides four rejection policies. The four thread pool factory methods provided by Java all use the default rejection policy defined in ThreadPoolExecutor. So what other rejection policies are there besides the default?

private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();

The rejection policy is an interface, RejectedExecutionHandler, which means we can also define our own. Let's first look at the interface, and then at the four rejection policies Java provides.

public interface RejectedExecutionHandler {

    /**
     * Method that may be invoked by a {@link ThreadPoolExecutor} when
     * {@link ThreadPoolExecutor#execute execute} cannot accept a
     * task.  This may occur when no more threads or queue slots are
     * available because their bounds would be exceeded, or upon
     * shutdown of the Executor.
     *
     * <p>In the absence of other alternatives, the method may throw
     * an unchecked {@link RejectedExecutionException}, which will be
     * propagated to the caller of {@code execute}.
     *
     * @param r the runnable task requested to be executed
     * @param executor the executor attempting to execute this task
     * @throws RejectedExecutionException if there is no remedy
     */
    void rejectedExecution(Runnable r, ThreadPoolExecutor executor);
}

AbortPolicy

This is the default rejection policy, used by all four thread pool factory methods provided by Java. Let's look at its implementation.

public static class AbortPolicy implements RejectedExecutionHandler {
 
    public AbortPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}

So this rejection policy simply throws a RejectedExecutionException.

CallerRunsPolicy

This rejection policy hands the task back to the caller, which executes it directly.

public static class CallerRunsPolicy implements RejectedExecutionHandler {

    public CallerRunsPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            r.run();
        }
    }
}

Why does the caller execute it? Because the handler calls the task's run() method directly on the submitting thread, rather than starting a new thread with start().
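
To see this in action, the following sketch saturates a one-thread pool and records which thread runs the rejected task. The class name CallerRunsDemo and the pool settings are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {
    // Returns the name of the thread that ended up running the rejected task.
    static String rejectedTaskThread() {
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            // First task occupies the only worker thread.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Second task cannot be queued or given a thread, so the
            // policy runs it synchronously on the submitting thread.
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
            return ranOn.get();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller:          " + Thread.currentThread().getName());
        System.out.println("rejected ran on: " + rejectedTaskThread());
    }
}
```

Because the rejected task runs on the submitting thread, that thread cannot submit more work until the task finishes, which acts as a simple form of back-pressure.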

DiscardOldestPolicy

As the source code shows, this rejection policy discards the oldest task in the queue and then retries executing the new task.

public static class DiscardOldestPolicy implements RejectedExecutionHandler {

    public DiscardOldestPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}

DiscardPolicy

As the source code shows, this rejection policy does nothing with the current task. In simple terms, it silently discards the task without executing it.

public static class DiscardPolicy implements RejectedExecutionHandler {

    public DiscardPolicy() { }

    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}

These are the four rejection policies that the thread pool provides out of the box. We can also write a custom rejection policy so that the thread pool better fits our business needs. Tomcat, which customizes its own thread pool, implements its own rejection policy; that is explained later.
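
As an illustration of a custom policy, here is a minimal sketch of a handler that logs and counts rejected tasks instead of throwing. The class CountingRejectionHandler is hypothetical and is not Tomcat's implementation; it only shows how the RejectedExecutionHandler interface is implemented and passed to the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CountingRejectionHandler implements RejectedExecutionHandler {
    private final AtomicInteger rejected = new AtomicInteger();

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        // Record the rejection; a real handler might also alert or persist the task.
        rejected.incrementAndGet();
        System.err.println("Rejected " + r + " from " + executor);
    }

    int rejectedCount() {
        return rejected.get();
    }

    // Submits `tasks` blocking tasks to a pool that can hold two of them
    // (one running, one queued) and returns how many were rejected.
    static int demo(int tasks) {
        CountingRejectionHandler handler = new CountingRejectionHandler();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), handler);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return handler.rejectedCount();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + demo(5)); // 3
    }
}
```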

thread starvation deadlock

Thread pools introduce a new kind of deadlock: thread starvation deadlock. If a task running in a thread pool submits another task to the same Executor and waits for its result, a deadlock can easily occur. The second task sits in the work queue waiting for the first task to finish, but the first task cannot finish because it is waiting for the result of the second.

In code, it looks like this. The thread pool here is a SingleThreadExecutor, which has only one thread, to make the situation easy to reproduce. In a larger thread pool, the same thing happens if all threads are blocked waiting for tasks that are still in the work queue; this is called thread starvation deadlock. So try to avoid running interdependent tasks of different types in the same thread pool.

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AboutThread {
    ExecutorService executorService = Executors.newSingleThreadExecutor();

    public static void main(String[] args) {
        AboutThread aboutThread = new AboutThread();
        aboutThread.threadDeadLock();
    }

    public void threadDeadLock() {
        Future<String> taskOne = executorService.submit(new TaskOne());
        try {
            // Never returns: TaskOne waits for TaskTwo, which is stuck in the queue
            System.out.println(taskOne.get());
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }

    public class TaskOne implements Callable<String> {

        @Override
        public String call() throws Exception {
            // Submitted to the same single-thread pool that is running TaskOne
            Future<String> taskTwo = executorService.submit(new TaskTwo());
            return "TaskOne" + taskTwo.get();
        }
    }

    public class TaskTwo implements Callable<String> {

        @Override
        public String call() throws Exception {
            return "TaskTwo";
        }
    }
}
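
One way to follow the advice above is to give the dependent task its own pool. The sketch below is a variation on the example, not the only remedy; the class name SeparatePoolsDemo and the five-second timeout are assumptions. The outer task still waits for the inner one, but the inner task now has a thread of its own to run on.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SeparatePoolsDemo {
    static String run() {
        // Two single-thread pools mirror the original example; each task type gets its own.
        ExecutorService outerPool = Executors.newSingleThreadExecutor();
        ExecutorService innerPool = Executors.newSingleThreadExecutor();
        try {
            Future<String> taskOne = outerPool.submit(() -> {
                // The dependent task goes to a different pool, so it can start immediately.
                Future<String> taskTwo = innerPool.submit(() -> "TaskTwo");
                return "TaskOne" + taskTwo.get();
            });
            // A timeout guards against waiting forever if something still goes wrong.
            return taskOne.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            outerPool.shutdown();
            innerPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // TaskOneTaskTwo
    }
}
```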

Extend ThreadPoolExecutor

If we want to extend the thread pool, ThreadPoolExecutor reserves several extension points that let us customize it at a deeper level.

thread factory

If we want to give each thread in the pool a custom name, we can use a thread factory. Once we pass our custom factory to ThreadPoolExecutor, the pool creates every new thread through that factory. Let's look at the ThreadFactory interface; by implementing it, we can customize the properties of our own threads.

public interface ThreadFactory {

    /**
     * Constructs a new {@code Thread}.  Implementations may also initialize
     * priority, name, daemon status, {@code ThreadGroup}, etc.
     *
     * @param r a runnable to be executed by new thread instance
     * @return constructed thread, or {@code null} if the request to
     *         create a thread is rejected
     */
    Thread newThread(Runnable r);
}

Next, let's look at a thread factory class we wrote ourselves

import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class CustomerThreadFactory implements ThreadFactory {

    private final String name;
    // Counter that gives each thread a unique numeric suffix
    private final AtomicInteger threadNumber = new AtomicInteger(1);

    CustomerThreadFactory(String name) {
        this.name = name;
    }

    @Override
    public Thread newThread(Runnable r) {
        return new Thread(r, name + threadNumber.getAndIncrement());
    }
}


Just add this factory class when instantiating the thread pool

    public static void customerThread() {
        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 1, TimeUnit.SECONDS, new SynchronousQueue<>(),
                new CustomerThreadFactory("customerThread"));

        for (int i = 0; i < 10; i++) {
            threadPoolExecutor.execute(new Runnable() {
                @Override
                public void run() {
                    System.out.println(Thread.currentThread().getName());
                }
            });
        }
    }

Running this method shows that every thread now carries our custom name (the order varies from run to run)

customerThread1
customerThread10
customerThread9
customerThread8
customerThread7
customerThread6
customerThread5
customerThread4
customerThread3
customerThread2

Extend by subclassing ThreadPoolExecutor

Looking at the ThreadPoolExecutor source code, we find three protected methods:

protected void beforeExecute(Thread t, Runnable r) { }
protected void afterExecute(Runnable r, Throwable t) { }
protected void terminated() { }

> Members modified by protected are visible to this package and its subclasses

We can override these methods in a subclass to add our own extensions. The threads that execute tasks call beforeExecute and afterExecute, so these hooks can be used to add logging, timing, monitoring, or statistics gathering. afterExecute is called whether the task returns from run normally or throws an exception (in current JDK implementations the call sits in a finally block, so it runs even when the task throws an Error). If beforeExecute throws a RuntimeException, the task is not executed and afterExecute is not called.

terminated is called when the thread pool shuts down, that is, after all tasks have completed and all worker threads have exited. It can be used to release resources that the Executor allocated during its life cycle, to send notifications, to write logs, or to finalize statistics.
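
The three hooks can be combined into a small timing pool. The following is a sketch, assuming a fixed-size pool with a bounded queue of capacity 100; the class TimingThreadPool and its accessor names are hypothetical. beforeExecute and afterExecute run on the worker thread that executes the task, which is why a ThreadLocal can carry the start time from one hook to the other.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class TimingThreadPool extends ThreadPoolExecutor {
    private final ThreadLocal<Long> startNanos = new ThreadLocal<>();
    private final AtomicLong timedTasks = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private volatile boolean terminatedCalled;

    TimingThreadPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(100));
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startNanos.set(System.nanoTime()); // remembered per worker thread
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            totalNanos.addAndGet(System.nanoTime() - startNanos.get());
            timedTasks.incrementAndGet();
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    protected void terminated() {
        try {
            terminatedCalled = true;
            long n = timedTasks.get();
            System.out.printf("tasks=%d avg=%dns%n", n, n == 0 ? 0 : totalNanos.get() / n);
        } finally {
            super.terminated();
        }
    }

    long timedTaskCount() { return timedTasks.get(); }

    boolean wasTerminatedCalled() { return terminatedCalled; }

    // Runs `tasks` trivial tasks, shuts the pool down, and waits for termination.
    static TimingThreadPool demo(int tasks) {
        TimingThreadPool pool = new TimingThreadPool(2);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return pool;
    }

    public static void main(String[] args) {
        demo(5);
    }
}
```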
