Thread pool design for paged queries that combine multiple data sources | JD Cloud technical team

Background

When dealing with concurrent workloads, we usually estimate the size of the thread pool as follows: if the required QPS is 1000 and each task takes t seconds on average, the pool needs about t * 1000 threads.
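The estimate above is Little's Law applied to a thread pool (concurrency = arrival rate × time in the system). A minimal sketch of the arithmetic follows; the class and method names are hypothetical, and the 50 ms task time is only an illustrative value, not a figure from this article.

```java
final class ThreadEstimate {

    /** threads ≈ QPS × average task time; milliseconds in, rounded up. */
    static long requiredThreads(long qps, long avgTaskMillis) {
        return (qps * avgTaskMillis + 999) / 1000;
    }

    public static void main(String[] args) {
        // Illustrative value: 1000 QPS with an assumed 50 ms average task time.
        System.out.println(requiredThreads(1000, 50));
    }
}
```

The result is only a starting point: it assumes a stable average t, which is exactly what the scenario in this article lacks.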

In some cases, however, t is hard to estimate, and even an estimate still has to be verified and fine-tuned in the real runtime environment. The paged query with combined data items described in this article is one such case:

1. The data items come from different upstream interfaces whose response times vary, sometimes by a wide margin. Some interfaces support batch queries while others do not, and some need degradation and traffic-smoothing measures because of performance problems.

2. To improve the user experience, the query supports dynamic columns, so the set and number of data items required differ from request to request.

It is therefore unrealistic to estimate a reasonable t up front.

Solution

The solution is a dynamically adjustable policy that fine-tunes the thread pool based on monitoring feedback. The overall design consists of the assembly logic and a thread pool wrapper.

1. Assembly logic

Query the results, split them into shards (horizontal split), and assemble the shards in parallel. For each shard, obtain the list of items to assemble (the dynamic columns) and assemble each item in parallel (vertical split).
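The two splits can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: rows are modeled as maps, each "assembler" is a hypothetical function that fills one dynamic column for a whole shard (as a batch interface would), and all class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

final class AssemblySketch {

    /** Horizontal split: cut the page into shards of at most shardSize rows. */
    static <T> List<List<T>> partition(List<T> rows, int shardSize) {
        List<List<T>> shards = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += shardSize) {
            shards.add(rows.subList(i, Math.min(i + shardSize, rows.size())));
        }
        return shards;
    }

    /** Vertical split: one task per (shard, assembler) pair, all run in parallel. */
    static <T> void assemble(List<T> rows, int shardSize,
                             List<Consumer<List<T>>> assemblers, ExecutorService pool) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (List<T> shard : partition(rows, shardSize)) {
            for (Consumer<List<T>> assembler : assemblers) {
                futures.add(CompletableFuture.runAsync(() -> assembler.accept(shard), pool));
            }
        }
        // Wait until every column of every shard has been filled in.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    /** Builds five rows and fills two hypothetical dynamic columns. */
    static List<Map<String, Object>> demo() {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int id = 1; id <= 5; id++) {
            // Concurrent map: different assemblers write to the same row.
            Map<String, Object> row = new ConcurrentHashMap<>();
            row.put("id", id);
            rows.add(row);
        }
        List<Consumer<List<Map<String, Object>>>> assemblers = new ArrayList<>();
        assemblers.add(shard -> shard.forEach(r -> r.put("name", "item-" + r.get("id"))));
        assemblers.add(shard -> shard.forEach(r -> r.put("price", (Integer) r.get("id") * 10)));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            assemble(rows, 2, assemblers, pool);
        } finally {
            pool.shutdown();
        }
        return rows;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With 5 rows, a shard size of 2, and 2 columns, the sketch submits 3 × 2 = 6 tasks, which shows why the number of tasks per request varies with the dynamic columns.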



2. Thread pool encapsulation

The number of core threads, maximum number of threads, thread keep-alive time, queue size, retry wait time for task submission, and number of submission retries are all adjustable. The rejection policy is fixed: rejected tasks raise an exception (AbortPolicy).

Adjustment parameters:

| Field | Name | Description |
|---|---|---|
| corePoolSize | Number of core threads | See the thread pool definition |
| maximumPoolSize | Maximum number of threads | See the thread pool definition |
| keepAliveTime | Thread keep-alive time | See the thread pool definition |
| queueSize | Queue length | See the thread pool definition |
| resubmitSleepMillis | Retry wait time for task submission | Time to wait before retrying after a task is rejected |
| resubmitTimes | Number of submission retries | Maximum number of times to retry adding a task after it is rejected |

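// @Data is Lombok's annotation; it generates getters, setters, equals and hashCode.
@Data
private static class PoolPolicy {

    /** Number of core threads */
    private Integer corePoolSize;

    /** Maximum number of threads */
    private Integer maximumPoolSize;

    /** Thread keep-alive time, in seconds */
    private Integer keepAliveTime;

    /** Queue capacity */
    private Integer queueSize;

    /** Wait time before a submission retry, in milliseconds */
    private Long resubmitSleepMillis;

    /** Number of submission retries */
    private Integer resubmitTimes;
}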

 

Create a thread pool:

The thread pool is created with dynamic adjustment in mind, so that it can be fine-tuned based on stress test results. The old thread pool is held in a local variable while the new one is created; once the new thread pool has been created successfully, the old one is shut down, so that in-flight work is not affected by the replacement. The thread pool uses the abort rejection policy (AbortPolicy), so callers learn promptly that the system is busy and the system's resource usage stays bounded.

public void reloadThreadPool(PoolPolicy poolPolicy) {
    if (poolPolicy == null) {
        throw new IllegalArgumentException("The thread pool policy cannot be empty.");
    }
    // Fill in defaults for any parameter that was not supplied.
    if (poolPolicy.getCorePoolSize() == null) {
        poolPolicy.setCorePoolSize(0);
    }
    if (poolPolicy.getMaximumPoolSize() == null) {
        poolPolicy.setMaximumPoolSize(Runtime.getRuntime().availableProcessors() + 1);
    }
    if (poolPolicy.getKeepAliveTime() == null) {
        poolPolicy.setKeepAliveTime(60);
    }
    if (poolPolicy.getQueueSize() == null) {
        poolPolicy.setQueueSize(Runtime.getRuntime().availableProcessors() + 1);
    }
    if (poolPolicy.getResubmitSleepMillis() == null) {
        poolPolicy.setResubmitSleepMillis(200L);
    }
    if (poolPolicy.getResubmitTimes() == null) {
        poolPolicy.setResubmitTimes(5);
    }
    // - If the policy has not changed, keep the existing thread pool.
    //   (Relies on the equals method that @Data generates for PoolPolicy.)
    if (this.executorService != null && poolPolicy.equals(this.poolPolicy)) {
        return;
    }
    // Keep a reference to the old pool, then switch to the new one.
    ExecutorService original = this.executorService;
    this.executorService = new ThreadPoolExecutor(
            poolPolicy.getCorePoolSize(),
            poolPolicy.getMaximumPoolSize(),
            poolPolicy.getKeepAliveTime(), TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(poolPolicy.getQueueSize()),
            new ThreadFactoryBuilder().setNameFormat(threadNamePrefix + "-%d").setDaemon(true).build(),
            new ThreadPoolExecutor.AbortPolicy());
    this.poolPolicy = poolPolicy;
    if (original != null) {
        // shutdown() stops accepting new tasks but lets running and queued tasks finish;
        // shutdownNow() would interrupt running tasks and drop queued ones.
        original.shutdown();
    }
}
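The choice of shutdown call decides whether the swap is transparent to in-flight work: `shutdown()` lets running and queued tasks finish, while `shutdownNow()` interrupts running tasks and discards queued ones. The sketch below is a simplified, hypothetical stand-in for the wrapper (fixed-size pools, invented class name) that swaps the pool while a task is still running and then reads that task's result.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PoolSwapSketch {

    // volatile so that submitting threads see the swapped-in pool.
    private volatile ExecutorService executor = Executors.newFixedThreadPool(2);

    <T> Future<T> submit(Callable<T> task) {
        return executor.submit(task);
    }

    /** Replace the pool; the old pool drains its running and queued tasks. */
    synchronized void reload(int threads) {
        ExecutorService old = this.executor;
        this.executor = Executors.newFixedThreadPool(threads);
        old.shutdown();
    }

    void close() {
        executor.shutdown();
    }

    /** Submits a slow task, swaps the pool mid-flight, and returns the task's result. */
    static String demo() {
        PoolSwapSketch sketch = new PoolSwapSketch();
        try {
            Future<String> inFlight = sketch.submit(() -> {
                Thread.sleep(200);
                return "done";
            });
            sketch.reload(4);
            return inFlight.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            sketch.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A submission that races with the swap can still reach the old pool just after it was shut down and be rejected; a retry loop that re-reads the pool field on each attempt would pick up the new pool on the next try.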

Task submission:

The wrapper's thread pool uses AbortPolicy as its rejection policy, so a RejectedExecutionException is thrown once both the thread count and the blocking queue have reached their limits. To improve the submission success rate, a retry strategy provides a degree of delayed processing. Both can be adjusted and configured to suit the business characteristics of a given scenario.

public <T> Future<T> submit(Callable<T> task) {
    RejectedExecutionException exception = null;
    Future<T> future = null;
    // Try up to resubmitTimes times before giving up.
    for (int i = 0; i < this.poolPolicy.getResubmitTimes(); i++) {
        try {
            // - Submit the task
            future = this.executorService.submit(task);
            exception = null;
            break;
        } catch (RejectedExecutionException e) {
            // Pool and queue are full: remember the rejection and wait before the next attempt.
            exception = e;
            this.theadSleep(this.poolPolicy.getResubmitSleepMillis());
        }
    }
    if (exception != null) {
        // Every attempt was rejected: surface the last rejection to the caller.
        throw exception;
    }
    return future;
}
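To make the rejection condition concrete: with AbortPolicy and a bounded queue, a pool can hold at most maximumPoolSize + queueSize unfinished tasks, and any submission beyond that is rejected. The sketch below counts rejections for a pool whose tasks are all held on a latch; the class name and the numbers are illustrative assumptions, and core and maximum sizes are set equal to keep the behavior easy to follow.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class AbortPolicyCapacity {

    /** Submits blocked tasks and returns how many were rejected. */
    static int countRejections(int threads, int queueSize, int submissions) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
        int rejected = 0;
        try {
            for (int i = 0; i < submissions; i++) {
                try {
                    // Each task waits on the latch, so nothing completes during the loop.
                    pool.submit(() -> {
                        release.await();
                        return null;
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 3 queue slots hold 5 tasks; the other 2 of 7 are rejected.
        System.out.println(countRejections(2, 3, 7));
    }
}
```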

Monitoring:

1. Task submission monitoring

See "Monitoring point ①" in the code. A monitoring point is added to the submit method. The monitoring key includes the thread name prefix of the thread pool wrapper so that individual thread pool objects can be told apart.

"Monitoring point ①" tracks whether adding tasks succeeds, which guides fine-tuning of the thread pool object and its policy parameters.

public <T> Future<T> submit(Callable<T> task) {
    // - Monitoring point ①: register the monitoring key for this thread pool
    CallerInfo callerInfo = Profiler.registerInfo(UmpConstant.THREAD_POOL_WAP + threadNamePrefix,
                UmpConstant.APP_NAME,
                UmpConstant.UMP_DISABLE_HEART,
                UmpConstant.UMP_ENABLE_TP);
    RejectedExecutionException exception = null;
    Future<T> future = null;
    for (int i = 0; i < this.poolPolicy.getResubmitTimes(); i++) {
        try {
            // - Submit the task
            future = this.executorService.submit(task);
            exception = null;
            break;
        } catch (RejectedExecutionException e) {
            exception = e;
            this.theadSleep(this.poolPolicy.getResubmitSleepMillis());
        }
    }
    if (exception != null) {
        // - Monitoring point ①: report the failed submission
        Profiler.functionError(callerInfo);
        throw exception;
    }
    // - Monitoring point ①: end the monitoring record
    Profiler.registerInfoEnd(callerInfo);
    return future;
}

 

2. Parallel task count of the thread pool

See "Monitoring point ②" in the code; it appears twice, once after a task is added and once after a task completes.

"Monitoring point ②" counts in real time the number of tasks currently held by the thread pool (queued or running) and is used to evaluate how close the pool is to full load.

/** Number of tasks currently in flight (submitted but not yet finished) */
private final AtomicInteger parallelTaskCount = new AtomicInteger(0);

public <T> Future<T> submit(Callable<T> task) {
    RejectedExecutionException exception = null;
    Future<T> future = null;
    for (int i = 0; i < this.poolPolicy.getResubmitTimes(); i++) {
        try {
            // - Submit the task
            future = this.executorService.submit(() -> {
                try {
                    return task.call();
                } finally {
                    // - Monitoring point ②: decrement even when the task throws
                    log.info("{} - Parallel task count {}", this.threadNamePrefix, this.parallelTaskCount.decrementAndGet());
                }
            });
            // - Monitoring point ②: the task was accepted
            log.info("{} + Parallel task count {}", this.threadNamePrefix, this.parallelTaskCount.incrementAndGet());
            exception = null;
            break;
        } catch (RejectedExecutionException e) {
            exception = e;
            this.theadSleep(this.poolPolicy.getResubmitSleepMillis());
        }
    }
    if (exception != null) {
        throw exception;
    }
    return future;
}
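As a complement to a hand-maintained counter, `ThreadPoolExecutor` itself exposes gauges such as `getActiveCount()` and `getQueue().size()`, which the JDK documents as approximate. The sketch below is an alternative technique, not the author's method: it derives a load ratio from those gauges, with a hypothetical class name and a small demo pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolLoadGauge {

    /** (running + queued) / (maximumPoolSize + queue capacity); values near 1.0 mean rejections are imminent. */
    static double load(ThreadPoolExecutor pool) {
        int queued = pool.getQueue().size();
        int capacity = pool.getMaximumPoolSize() + queued + pool.getQueue().remainingCapacity();
        return (double) (pool.getActiveCount() + queued) / capacity;
    }

    /** Two threads busy, one task queued, queue capacity two: 3 of 4 slots in use. */
    static double demo() {
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        try {
            for (int i = 0; i < 3; i++) {
                pool.submit(() -> {
                    started.countDown();
                    release.await();
                    return null;
                });
            }
            started.await(); // both worker threads are now inside a task
            return load(pool);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```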

3. Adjustment

When to adjust the thread pool wrapper's policy

1) During stress testing before launch, based on traffic estimates;

2) After launch, fine-tune manually based on the monitoring data and the load level of tasks in the thread pool, or adjust automatically at specified times through a scheduled job;

3) Before a major promotion, adjust the relevant parameters according to the peaks of previous major promotions.

Experience in adjusting the thread pool wrapper's policy

1) When response-time requirements are relaxed, consider reducing the number of threads and the blocking queue size, and moderately increasing the retry wait time and the number of submission retries, to reduce resource usage.

2) When response-time requirements are strict, increase the number of threads, keep the blocking queue relatively small, and decrease the retry wait time and the number of submission retries, or even set them to 0 and 1 respectively (that is, turn off the retry logic).
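One way to reason about the retry parameters is the worst-case time a caller can spend blocked inside submit. In the submit loop shown earlier, every rejected attempt is followed by a sleep, so the upper bound is resubmitTimes × resubmitSleepMillis. A minimal sketch of that arithmetic, with a hypothetical class name:

```java
final class RetryBudget {

    /** Upper bound on time spent sleeping in submit when every attempt is rejected. */
    static long worstCaseSubmitDelayMillis(int resubmitTimes, long resubmitSleepMillis) {
        return (long) resubmitTimes * resubmitSleepMillis;
    }

    public static void main(String[] args) {
        // Defaults from reloadThreadPool: 5 attempts, 200 ms between them.
        System.out.println(worstCaseSubmitDelayMillis(5, 200));
        // Retry turned off: a single attempt with no wait.
        System.out.println(worstCaseSubmitDelayMillis(1, 0));
    }
}
```

With the default policy the bound is 1000 ms per submitted task, which is acceptable only when response-time requirements are relaxed; setting the values to 1 and 0 removes the delay entirely.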

Author: Jingdong Retail Wang Wenming

Source: JD Cloud Developer Community. Please credit the source when reprinting.

Origin my.oschina.net/u/4090830/blog/10120877