Project technical analysis - thread pool + asynchronous processing

Background

The project involves analyzing system data and reading and writing the results. The data volume is large and serial processing is slow, so the work is split into batches that run as multiple tasks without interfering with each other.

Getting to know asynchronous calls

Some concepts

Once a synchronous method call starts, the caller must wait until the call returns before proceeding with subsequent actions.
An asynchronous method call is more like message passing. Once started, the call returns immediately and the caller can continue with subsequent operations. The asynchronous method usually does its real work in another thread, so the whole process does not hold up the caller.
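
The difference can be seen in a few lines of plain JDK code. This is a minimal sketch, not the project's code: slowTask is a hypothetical stand-in for any slow operation, and CompletableFuture is used here only because it needs no framework.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SyncVsAsyncDemo {

    // Hypothetical slow operation standing in for data analysis or a database write.
    static String slowTask(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name + " done on " + Thread.currentThread().getName();
    }

    // Synchronous: the caller is blocked until slowTask returns.
    static String callSync() {
        return slowTask("sync");
    }

    // Asynchronous: a future is returned at once; slowTask runs on a pool thread.
    static CompletableFuture<String> callAsync(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> slowTask("async"), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(callSync());
            CompletableFuture<String> future = callAsync(pool);
            System.out.println("caller continues while the task runs");
            System.out.println(future.join()); // wait only when the result is needed
        } finally {
            pool.shutdown();
        }
    }
}
```

The synchronous call reports the caller's own thread name, while the asynchronous one reports a pool thread.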

Why use async

Overall: to improve performance and fault tolerance.

The first reason is fault tolerance and robustness. In this project, an exception while writing data must not cause the data processing itself to fail. A familiar analogy: user registration is the main function and sending points is a secondary one, so even if sending points fails, the user should still be told that registration succeeded, and the missing points can be compensated for later.
The second reason is performance. For example, suppose registering a user takes 20 milliseconds and sending points takes 50 milliseconds. Done synchronously, the total is 70 milliseconds. Done asynchronously, the caller does not wait for the points to be sent, so it takes only 20 milliseconds.
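
Both reasons can be sketched with the registration example. Everything here is hypothetical: registerUser, sendPoints and the compensation queue are illustrative names, and the sleeps merely imitate the 20 ms and 50 ms figures.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RegisterWithPointsDemo {

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Hypothetical main function: about 20 ms, the caller waits for it.
    static String registerUser(String user) {
        pause(20);
        return "registered:" + user;
    }

    // Hypothetical secondary function: about 50 ms, and it may fail.
    static void sendPoints(String user, boolean fail) {
        pause(50);
        if (fail) {
            throw new IllegalStateException("points service unavailable");
        }
    }

    // Sends points on a pool thread. A failure is not propagated to the caller;
    // the user is queued so that the points can be compensated for later.
    static CompletableFuture<Void> sendPointsAsync(String user, boolean fail,
            ExecutorService pool, Queue<String> toCompensate) {
        return CompletableFuture.runAsync(() -> sendPoints(user, fail), pool)
                .exceptionally(ex -> {
                    toCompensate.add(user);
                    return null;
                });
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Queue<String> toCompensate = new ConcurrentLinkedQueue<>();
        try {
            String result = registerUser("alice");
            CompletableFuture<Void> points = sendPointsAsync("alice", true, pool, toCompensate);
            System.out.println(result); // reported without waiting for the points
            points.join(); // only so that the demo can inspect the queue before exiting
            System.out.println("to compensate later: " + toCompensate);
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller pays only for registerUser, and a failing sendPoints ends up in the compensation queue instead of breaking registration.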

Asynchronous implementation - based on thread pool

1. The @Async annotation: executing asynchronous tasks this way requires us to enable the asynchronous function manually, by adding @EnableAsync.
2. A manually configured ThreadPoolTaskExecutor.

Getting to know the thread pool (ThreadPool)

Types of thread pools (mainly four)

1. newCachedThreadPool: creates a thread pool that can grow without limit. It suits lightly loaded scenarios that run short-lived asynchronous tasks. (Because each task finishes quickly, threads are released quickly and there is no excessive CPU context switching.)
2. newFixedThreadPool: creates a fixed-size thread pool. Because it uses an unbounded blocking queue, surplus tasks wait in the queue and the number of threads never grows beyond the fixed size. It suits heavily loaded scenarios where the number of threads must be capped. (This keeps the thread count under control, so that too many threads do not make the system load even worse.)
3. newSingleThreadExecutor: creates a single-threaded thread pool, suitable when tasks must be executed in order.
4. newScheduledThreadPool: suitable for executing delayed or periodic tasks.
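
The four factory methods live in java.util.concurrent.Executors. The following sketch creates one pool of each kind and prints the properties described above; the pool sizes and task counts are arbitrary values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecutorTypesDemo {

    // Submits n numbered tasks to a single-threaded executor and returns the order they ran in.
    static List<Integer> runInOrder(int n) {
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService single = Executors.newSingleThreadExecutor();
        for (int i = 0; i < n; i++) {
            final int id = i;
            single.execute(() -> order.add(id));
        }
        single.shutdown();
        try {
            single.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor cached = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        ThreadPoolExecutor fixed = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1);

        // Cached pool: the thread limit is Integer.MAX_VALUE.
        System.out.println("cached max threads = " + cached.getMaximumPoolSize());
        // Fixed pool: the thread limit is fixed, but the queue is effectively unbounded.
        System.out.println("fixed max threads = " + fixed.getMaximumPoolSize()
                + ", queue capacity left = " + fixed.getQueue().remainingCapacity());
        // Single-thread pool: tasks run in submission order.
        System.out.println("single-thread order = " + runInOrder(5));
        // Scheduled pool: runs the task once after a 100 ms delay.
        scheduled.schedule(() -> System.out.println("delayed task ran"),
                100, TimeUnit.MILLISECONDS).get();

        cached.shutdown();
        fixed.shutdown();
        scheduled.shutdown();
    }
}
```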

Core parameter settings

As "JAVA concurrent programming" shows, the queue size and the maximum number of threads of the default thread pool are both Integer.MAX_VALUE. This obviously leaves hidden risks for the system, because under sustained load an unbounded queue or thread count can exhaust memory.

tasks: the number of tasks per second, assuming 500~1000
taskcost: the time spent on each task, assuming 0.1s
responsetime: the maximum response time allowed by the system, assuming 1s
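
The original gives these three quantities without a formula. One common way to turn them into numbers, offered here as an assumption rather than as the author's method, is Little's law: the number of tasks in progress equals the arrival rate times the time per task. The queue estimate below is likewise only a rule of thumb, and the method names are hypothetical.

```java
class PoolSizeEstimate {

    // Threads busy at once = tasks per second * seconds per task (Little's law), rounded up.
    static int threadsNeeded(int tasksPerSecond, int taskCostMillis) {
        return (tasksPerSecond * taskCostMillis + 999) / 1000;
    }

    // Queued tasks that `threads` workers can still finish within the response-time budget.
    static int queueCapacity(int threads, int taskCostMillis, int responseTimeMillis) {
        return threads * responseTimeMillis / taskCostMillis;
    }

    public static void main(String[] args) {
        int taskCostMillis = 100;      // taskcost = 0.1s
        int responseTimeMillis = 1000; // responsetime = 1s
        int low = threadsNeeded(500, taskCostMillis);   // 50
        int high = threadsNeeded(1000, taskCostMillis); // 100
        System.out.println("threads: " + low + " to " + high);
        System.out.println("queue capacity at " + low + " threads: "
                + queueCapacity(low, taskCostMillis, responseTimeMillis)); // 500
    }
}
```

With the assumed figures this suggests roughly 50 to 100 busy threads, which is a starting point to be checked against load tests rather than a final setting.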

For CPU-intensive tasks, try to use a smaller thread pool: the maximum number of threads = number of CPU cores + 1.
CPU-intensive tasks already keep CPU usage very high, so opening too many threads causes excessive context switching.
For IO-intensive tasks (this project), a slightly larger thread pool can be used: the maximum number of threads = 2 * number of CPU cores.
The CPU usage of IO-intensive tasks is not high, so while one thread waits for IO the CPU can run other threads and make full use of CPU time.
Meanwhile, the number of core threads = the maximum number of threads * 20%.
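
These rules of thumb can be written down directly. In this sketch the core count comes from Runtime.availableProcessors(); rounding the 20% figure down and never going below one core thread are assumptions of the example, since the text does not say how to round.

```java
class CoreBasedSizing {

    // CPU-intensive: maximum threads = number of CPU cores + 1.
    static int maxThreadsCpuBound(int cores) {
        return cores + 1;
    }

    // IO-intensive: maximum threads = 2 * number of CPU cores.
    static int maxThreadsIoBound(int cores) {
        return 2 * cores;
    }

    // Core threads = 20% of the maximum (rounded down, at least 1; both are assumptions).
    static int coreThreads(int maxThreads) {
        return Math.max(1, maxThreads * 20 / 100);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int ioMax = maxThreadsIoBound(cores);
        System.out.println("cores = " + cores);
        System.out.println("CPU-bound max threads = " + maxThreadsCpuBound(cores));
        System.out.println("IO-bound max threads = " + ioMax
                + ", core threads = " + coreThreads(ioMax));
    }
}
```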

Thread pool execution steps

1. When the pool size is smaller than corePoolSize, a new thread is created to process the request.
2. When the pool size equals corePoolSize, the request is put into the workQueue (bounded by queueCapacity), and idle threads in the pool fetch tasks from the workQueue and process them.
3. When the workQueue is full, a new thread is created in the pool to process the request. If the pool size has already reached maximumPoolSize, the RejectedExecutionHandler handles the rejection.
4. When the number of threads in the pool is greater than corePoolSize, a surplus thread that has had no request to process for keepAliveTime is destroyed.
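
Steps 1 to 3 can be observed with a deliberately tiny pool. This sketch assumes corePoolSize = 1, a queue of capacity 1 and maximumPoolSize = 2, and submits four tasks that block until released, so that each submission hits the next step. Step 4 (keepAliveTime) is not exercised here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolStepsDemo {

    // Submits four blocking tasks and records what the pool did with each one.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        CountDownLatch release = new CountDownLatch(1);
        // No handler is passed, so the default AbortPolicy throws on rejection.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int i = 1; i <= 4; i++) {
            try {
                pool.execute(blocker);
                log.add("task " + i + ": threads=" + pool.getPoolSize()
                        + ", queued=" + pool.getQueue().size());
            } catch (RejectedExecutionException e) {
                log.add("task " + i + ": rejected");
            }
        }
        release.countDown(); // let the blocked tasks finish
        pool.shutdown();
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // task 1: threads=1, queued=0   (step 1: below corePoolSize, new thread)
        // task 2: threads=1, queued=1   (step 2: goes into the queue)
        // task 3: threads=2, queued=1   (step 3: queue full, extra thread)
        // task 4: rejected              (step 3: maximumPoolSize reached)
    }
}
```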

Implementation

Application 1 - Annotation @Async method

For asynchronous method calls, Spring has provided the @Async annotation since Spring 3. We only need to put this annotation on a method to make calls to it asynchronous.
In addition, we need a configuration class that turns the asynchronous function on with the @EnableAsync annotation.

@Configuration
@EnableAsync
public class ThreadPoolConfig {

}

When the @Async annotation is used without an explicitly configured executor, the SimpleAsyncTaskExecutor is used by default, and it is not a real thread pool. The thread pool should therefore be declared manually, ultimately through the constructor of ThreadPoolExecutor.

Thread pool configuration

Reference article, author: Piaomiao Jam
SimpleAsyncTaskExecutor does not reuse threads: it creates a new thread on every call. If the system keeps creating threads, it will eventually occupy too much memory and fail with an OutOfMemoryError.
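
The point about thread reuse can be illustrated with plain JDK executors. The thread-per-task executor below is a simplified stand-in written for this example, not Spring's SimpleAsyncTaskExecutor itself; it only imitates the behaviour of starting a new thread for every call.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadReuseDemo {

    // Runs `tasks` trivial tasks on the executor and counts the distinct threads that ran them.
    static int distinctThreads(Executor executor, int tasks) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                seen.add(Thread.currentThread());
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Stand-in for a thread-per-task executor: a new thread for every call.
        Executor perTask = task -> new Thread(task).start();
        ExecutorService pooled = Executors.newFixedThreadPool(4);
        try {
            System.out.println("thread per task: " + distinctThreads(perTask, 100) + " threads");
            System.out.println("fixed pool of 4: " + distinctThreads(pooled, 100) + " threads");
        } finally {
            pooled.shutdown();
        }
    }
}
```

For 100 tasks the first executor creates 100 threads, while the pool never uses more than its 4.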

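// Core pool size
private int corePoolSize = ;

// Maximum number of threads that can be created
private int maxPoolSize = ;

// Maximum queue length
private int queueCapacity = ;

// Idle time, in seconds, that the pool allows a thread before reclaiming it
private int keepAliveSeconds = ;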
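
The four settings correspond to constructor arguments of java.util.concurrent.ThreadPoolExecutor, which Spring's ThreadPoolTaskExecutor wraps. The sketch below shows that mapping with the JDK class only. The numeric values are placeholders invented for the example (the original leaves them blank), and CallerRunsPolicy is an assumed choice of rejection handler.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolFromSettings {

    // Builds a bounded pool from the four settings.
    // CallerRunsPolicy (assumed here) makes the submitting thread run a task
    // itself when both the queue and the pool are full.
    static ThreadPoolExecutor build(int corePoolSize, int maxPoolSize,
            int queueCapacity, int keepAliveSeconds) {
        return new ThreadPoolExecutor(
                corePoolSize,
                maxPoolSize,
                keepAliveSeconds, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only, not the project's settings.
        ThreadPoolExecutor pool = build(4, 16, 200, 60);
        System.out.println("core=" + pool.getCorePoolSize()
                + ", max=" + pool.getMaximumPoolSize()
                + ", queue capacity=" + pool.getQueue().remainingCapacity()
                + ", keepAlive=" + pool.getKeepAliveTime(TimeUnit.SECONDS) + "s");
        pool.shutdown();
    }
}
```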

Business class

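import java.util.List;
import java.util.concurrent.Future;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.AsyncResult;
import org.springframework.stereotype.Component;

@Component
public class UserDataHandler {

	private static final Logger LOG = LoggerFactory.getLogger(UserDataHandler.class);

	/**
	 * @param userdataList one slice of the data set
	 * @param pageIndex index of the slice
	 * @return Future<String> carrying the processing result
	 * @since JDK 1.8
	 */
	@Async
	public Future<String> syncUserDataPro(List<UserData> userdataList, int pageIndex) {
		// The Future exists mainly to return processing information to the caller
		Future<String> result = new AsyncResult<String>("");
		// Loop over the records in this slice
		if (null != userdataList && userdataList.size() > 0) {
			for (UserData userdata : userdataList) {
				try {
					// Write the record to the database
					// Each slice gets its own call, so the slices are processed concurrently
				} catch (Exception e) {
					// Record the time, thread name and slice index of the failure, then go on
					LOG.error("syncUserDataPro failed, pageIndex={}", pageIndex, e);
					result = new AsyncResult<String>("fail,time=" + System.currentTimeMillis() + ",thread name=" + Thread.currentThread().getName() + ",pageIndex=" + pageIndex);
				}
			}
		}
		return result;
	}
}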

The following points should be noted here:
1. Processing batch data asynchronously requires dividing the data first, which is why the data is sliced in a loop.
2. With the @Async annotation, the method must either return void or return a Future type; otherwise the annotation does not work as intended. This case needs to return execution information, so Future is used.
3. How to handle database exceptions depends on whether the business requires transactions and rollback; my business scenario tolerates data loss.
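
Points 1 and 2 can be sketched without Spring. The partition helper below is a hand-written stand-in for a library slicing utility, and ExecutorService.submit plays the role of the @Async method returning a Future; the status strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchSliceDemo {

    // Splits items into consecutive slices of at most `size` elements.
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> slices = new ArrayList<>();
        for (int from = 0; from < items.size(); from += size) {
            int to = Math.min(from + size, items.size());
            slices.add(new ArrayList<>(items.subList(from, to)));
        }
        return slices;
    }

    // Submits one task per slice and collects each task's status string through its Future.
    static List<String> processAll(List<Integer> items, int sliceSize, ExecutorService pool) {
        List<List<Integer>> slices = partition(items, sliceSize);
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < slices.size(); i++) {
            final int pageIndex = i;
            final List<Integer> slice = slices.get(i);
            futures.add(pool.submit(() -> "ok,pageIndex=" + pageIndex + ",size=" + slice.size()));
        }
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            try {
                results.add(future.get()); // blocks until that slice is done
            } catch (Exception e) {
                results.add("fail:" + e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            items.add(i);
        }
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            processAll(items, 10, pool).forEach(System.out::println);
        } finally {
            pool.shutdown();
        }
    }
}
```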

Using CountDownLatch

Purpose: to ensure that all of the preceding threads have finished before moving on to the next step.
Making the interface asynchronous improves efficiency greatly. However, because the calls are asynchronous, the calling method would return its result before the data has been processed. We must therefore wait for all the threads started for syncUserDataPro to finish before execution continues in the calling thread.

long startTime = System.currentTimeMillis();
// Split the full result set into batches of 10 records
List<List<UserData>> lists = Lists.partition(res, 10);
// One count per batch: each asynchronous call counts down exactly once
CountDownLatch countDownLatch = new CountDownLatch(lists.size());
for (List<UserData> list : lists) {
      this.userDataService.asyncinsertUserDataList(list, countDownLatch);
}
countDownLatch.await(); // Blocks until every batch has finished; throws InterruptedException
long endTime = System.currentTimeMillis();
log.info("Done! Total time: {} ms", (endTime - startTime));

    @Override
    @Async("threadPoolTaskExecutor")
    public void asyncinsertUserDataList(List<UserData> userDataList, CountDownLatch countDownLatch)
    {
        try {
//            log.info("start executeAsync");
            for (UserData userData : userDataList) {
                userDataMapper.insertUserData(userData);
            }
//            log.info("end executeAsync");
        } catch (Exception e) {
            log.error("asyncinsertUserDataList failed", e);
        } finally {
            // countDown must run whether or not an exception occurred, otherwise await never returns
            countDownLatch.countDown();
        }
    }

Supplement - CountDownLatch theory

CountDownLatch lets one or more threads block at one point until a given number (the count) of tasks running in other threads have finished.

When creating a CountDownLatch object, you specify an initial count, which is the number of task completions to wait for. Whenever a thread completes its task, it calls the latch's countDown() method and the counter is decremented by one. When the counter reaches 0, the threads blocked in await() are woken up and resume their work.
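
The mechanism can be tried in isolation with the JDK alone. In this minimal sketch the worker count and pool size are arbitrary, and the "work" is just incrementing a counter.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {

    // Starts `workers` tasks, blocks until every one has called countDown(),
    // and returns how many tasks had finished when await() returned.
    static int runAndWait(int workers) {
        CountDownLatch latch = new CountDownLatch(workers);
        AtomicInteger finished = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    finished.incrementAndGet(); // the task's actual work
                } finally {
                    latch.countDown(); // always count down, even if the work throws
                }
            });
        }
        try {
            latch.await(); // returns once the count has reached 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished before await returned: " + runAndWait(5));
    }
}
```

Because countDown() sits in a finally block, a failing task still releases its count, which mirrors the comment in the service method above.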

For the underlying principles in more detail, see the material on AQS (AbstractQueuedSynchronizer).