Project optimization (asynchronous)

1. Understanding asynchronous processing

1.1 Synchronous and asynchronous

  • Synchronous: one task must finish before the next can start; nothing else can run in the meantime.
  • Asynchronous: you can start another task without waiting for the first one to finish. When the first task completes, you receive a notification and can carry on with any follow-up processing.
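The difference can be sketched in a few lines of Java. This is only an illustration: `slowReport` is a made-up stand-in for any long-running operation, and the callback passed to `thenAccept` plays the role of the "notification".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class SyncVsAsyncDemo {

    // Hypothetical slow operation standing in for any long-running task.
    static String slowReport(String name) {
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "report for " + name;
    }

    public static void main(String[] args) {
        // Synchronous: the caller is blocked until slowReport returns.
        String syncResult = slowReport("sync");
        System.out.println(syncResult);

        // Asynchronous: the call returns immediately; the callback runs on completion.
        CompletableFuture<Void> done = CompletableFuture
                .supplyAsync(() -> slowReport("async"))
                .thenAccept(r -> System.out.println("notified: " + r));

        System.out.println("caller keeps working while the report is generated");
        done.join(); // only so the demo does not exit before the callback fires
    }
}
```

In a real service you would usually pass your own executor to `supplyAsync` rather than rely on the common pool.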

1.2 Standard asynchronous business process ⭐

  1. When the user starts a long-running operation (for example, by clicking submit), the interface should not make them wait. Instead, save the task to the database as a record.

  2. When a user submits a new task:

    1. Submission succeeds:

      • If the program has idle threads, the task runs immediately.

      • If all threads are busy, the task is placed in the waiting queue.

    2. Submission fails (for example, all threads are busy and the task queue is full):

      • Reject the task and never run it.

      • Or use the records saved in the database to find tasks that failed to submit, and re-submit them for execution when the program is idle.

  3. The program's threads take tasks from the task queue and execute them in order, updating the task's status each time a step completes.

  4. Users can query the execution status of their tasks, or receive a notification (email, in-system message, text message) when a task succeeds or fails, which improves the user experience.

  5. If a task is complex and consists of many stages, record its execution status (progress) in the database as each subtask completes.
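The flow above can be sketched as follows. This is a minimal illustration, not the project's actual code: a `ConcurrentHashMap` stands in for the database task table, and the class name and the status strings (`wait`, `running`, `succeed`, `failed`, `rejected`) are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class TaskFlowSketch {
    // Stand-in for the database task table: taskId -> status (hypothetical status names).
    final Map<Long, String> statusTable = new ConcurrentHashMap<>();
    final ThreadPoolExecutor pool;

    TaskFlowSketch(ThreadPoolExecutor pool) {
        this.pool = pool;
    }

    /** Records the task first, then hands it to the pool; returns false if the pool rejects it. */
    boolean submit(long taskId, Runnable work) {
        statusTable.put(taskId, "wait");
        try {
            pool.execute(() -> {
                statusTable.put(taskId, "running");
                try {
                    work.run();
                    statusTable.put(taskId, "succeed");
                } catch (RuntimeException e) {
                    statusTable.put(taskId, "failed");
                }
            });
            return true;
        } catch (RejectedExecutionException e) {
            // The record survives, so the task can be re-submitted when the pool is idle.
            statusTable.put(taskId, "rejected");
            return false;
        }
    }

    /** Runs three tasks through a pool with one thread and one queue slot. */
    static Map<Long, String> demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        TaskFlowSketch flow = new TaskFlowSketch(pool);
        CountDownLatch gate = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        flow.submit(1L, blocked); // runs immediately on the single thread
        flow.submit(2L, blocked); // waits in the queue
        flow.submit(3L, blocked); // queue full and no spare thread: rejected
        gate.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return flow.statusTable;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Saving the record before calling `execute` is the important design choice: even a rejected task leaves a trace that can be picked up later.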

2. Thread pool

  1. What is a thread pool: A thread pool is a concurrent programming technique used to optimize the performance and stability of multi-threaded applications. It creates a set of reusable threads (typically when the application starts) and assigns work to them, which avoids repeatedly creating and destroying threads and so improves the application's throughput, response time, and resource utilization.

  2. Thread pool advantages:

    1. Reduces the overhead of creating and destroying threads, improving performance and efficiency.

    2. Avoids the resource exhaustion and extra scheduling overhead caused by having too many threads.

    3. Lets you adjust the pool size to suit the needs of different applications.

    4. Improves the maintainability and reusability of the code, helps avoid thread-related errors, and makes the code more robust and reliable.
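Thread reuse is easy to observe. The sketch below (class and method names are made up for the example) runs ten short tasks on a pool of two threads and collects the names of the threads that actually ran them; only two distinct threads appear, because each thread is reused for several tasks.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadReuseDemo {

    /** Runs taskCount short tasks on a fixed-size pool and returns the names of the threads used. */
    static Set<String> threadsUsed(int poolSize, int taskCount) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(Math.max(1, taskCount))); // bounded, large enough for the demo
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> names = threadsUsed(2, 10);
        System.out.println("10 tasks ran on " + names.size() + " threads: " + names);
    }
}
```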

  3. The role of the thread pool: it helps you manage threads and coordinate task execution easily.

  4. Implementing a thread pool (you don't need to write one yourself):

    1. In Spring, you can use ThreadPoolTaskExecutor together with the @Async annotation. (Not recommended here: the details are encapsulated, which limits customization.)

    2. Use the JUC thread pool: ThreadPoolExecutor in java.util.concurrent (the JUC concurrency package) allows very flexible customization of the pool.

      1. Create a thread pool configuration class

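        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.ThreadPoolExecutor;
        import java.util.concurrent.TimeUnit;
        import org.springframework.context.annotation.Bean;
        import org.springframework.context.annotation.Configuration;

        /**
         * Thread pool configuration class.
         * The values can also be written in a yml file and injected automatically.
         */
        @Configuration
        public class ThreadPoolExecutorConfig {

            /**
             * The thread pool bean.
             * ThreadPoolExecutor has no no-argument constructor; the parameters
             * used here are explained in the next section.
             */
            @Bean
            public ThreadPoolExecutor threadPoolExecutor() {
                return new ThreadPoolExecutor(2, 4, 100, TimeUnit.SECONDS,
                        new ArrayBlockingQueue<>(100));
            }
        }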
        
      2. Parameter explanation

        Adjust the parameters to the actual scenario, test, and keep optimizing.

        In this BI system, the thread count has to match the AI service's capacity, because the AI service is the bottleneck. For example, if the AI service supports 4 concurrent threads and allows 20 tasks to be queued, size the pool accordingly (and adjust as conditions change).

        Resource isolation strategy: put tasks of different levels into different queues, for example one queue for VIP users and one for ordinary users.

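        public ThreadPoolExecutor(
                int corePoolSize,                  // core thread count: threads that are kept ready to work under normal load
                int maximumPoolSize,               // maximum thread count: upper limit on threads under peak load
                long keepAliveTime,                // how long an idle non-core thread survives before it is removed to free resources
                TimeUnit unit,                     // time unit of keepAliveTime
                BlockingQueue<Runnable> workQueue, // work queue holding tasks that wait for a thread; always give it a maximum length (never unbounded)
                ThreadFactory threadFactory,       // thread factory: controls how each thread is created
                RejectedExecutionHandler handler   // rejection policy: what to do when the queue is full and no thread can be added, e.g. throw an exception, discard silently, or a custom policy
        ) {
            // body omitted
        }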
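As a rough sketch of the resource isolation idea, the code below gives each user tier its own pool and bounded queue. The 2 + 2 thread split and the 10 + 10 queue split of the AI budget (4 threads, 20 queued tasks) are assumptions chosen for illustration, as are the class and method names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class IsolatedPools {
    // Hypothetical split of the AI budget (4 threads, 20 queued tasks) across two tiers.
    private static final ThreadPoolExecutor VIP_POOL = newPool(2, 10);
    private static final ThreadPoolExecutor ORDINARY_POOL = newPool(2, 10);

    /** Builds a pool with a fixed thread count and a bounded queue. */
    static ThreadPoolExecutor newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // throw when the pool is saturated
    }

    /** Picks the pool for a user tier, so one tier's backlog cannot starve the other. */
    static ThreadPoolExecutor poolFor(boolean vip) {
        return vip ? VIP_POOL : ORDINARY_POOL;
    }

    public static void main(String[] args) {
        poolFor(true).execute(() ->
                System.out.println("VIP task on " + Thread.currentThread().getName()));
        poolFor(false).execute(() ->
                System.out.println("ordinary task on " + Thread.currentThread().getName()));
        VIP_POOL.shutdown();
        ORDINARY_POOL.shutdown();
    }
}
```

With separate pools, a burst of ordinary requests can fill only the ordinary queue; VIP tasks still find free slots in their own.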
        
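        1. How the thread pool works

          The walkthrough below assumes corePoolSize = 2 (formal employees), maximumPoolSize = 4 (formal plus temporary workers), and a work queue with a maximum length of 2. Each state line shows the current number of core threads, the current total number of threads, and the number of tasks waiting in the queue.

          1. At the start there are no threads and no tasks:

          core threads = 0; total threads = 0; workQueue.size() = 0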

          2. The first task arrives. The number of formal employees (core threads) has not been reached, so a new employee (thread) is created to handle the task directly:

          core threads = 1; total threads = 1; workQueue.size() = 0

          3. The second task arrives. The formal headcount still has not been reached, so another employee handles this task directly (one person handles one task; one thread handles one task):

          core threads = 2; total threads = 2; workQueue.size() = 0

          4. The third and fourth tasks arrive. The formal headcount is now full (current thread count = corePoolSize = 2), so instead of adding employees, each new task is placed in the task queue (maximum length 2) to wait:

          core threads = 2; total threads = 2; workQueue.size() = 2

          5. The fifth task arrives. The formal headcount is full (current thread count = corePoolSize = 2) and the task queue is full (workQueue.size() = 2, its maximum length). Instead of discarding the task, the pool adds a new thread, which is allowed because the thread count is still below maximumPoolSize = 4. In other words, a temporary worker is hired, and that worker handles the fifth task:

          core threads = 2; total threads = 3; workQueue.size() = 2

          6. The sixth task arrives. The queue is still full, so a second temporary worker is hired and runs the sixth task directly; the third and fourth tasks stay in the queue until a thread becomes free:

          core threads = 2; total threads = 4; workQueue.size() = 2

          7. The seventh task arrives. The task queue is full and the temporary workers are at their limit too (current thread count = maximumPoolSize = 4, workQueue.size() = 2, its maximum length), so no new task can be accepted. The pool calls the RejectedExecutionHandler rejection policy, and task 7 is rejected:

          core threads = 2; total threads = 4; workQueue.size() = 2

          8. Finally, when the current thread count exceeds corePoolSize (the number of formal employees) and a thread has had no new task for keepAliveTime, that thread can be released.
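The walkthrough can be reproduced with a small program. This is a sketch under the same assumptions (corePoolSize = 2, maximumPoolSize = 4, queue capacity 2); the class name is made up, and every task blocks on a latch so that no thread becomes free while tasks are being submitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolWalkthrough {

    /** Submits 7 blocking tasks and records the pool state (or the rejection) after each one. */
    static List<String> run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 10L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        CountDownLatch gate = new CountDownLatch(1);
        List<String> log = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            try {
                pool.execute(() -> {
                    try {
                        gate.await(); // keep every worker busy until all tasks are submitted
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                log.add("task " + i + ": threads=" + pool.getPoolSize()
                        + ", queued=" + pool.getQueue().size());
            } catch (RejectedExecutionException e) {
                log.add("task " + i + ": rejected");
            }
        }
        gate.countDown(); // release the workers so the pool can drain and shut down
        pool.shutdown();
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Running it prints:

```
task 1: threads=1, queued=0
task 2: threads=2, queued=0
task 3: threads=2, queued=1
task 4: threads=2, queued=2
task 5: threads=3, queued=2
task 6: threads=4, queued=2
task 7: rejected
```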

        2. Determine the parameters of the thread pool

          1. corePoolSize (number of core threads, i.e. formal employees): under normal circumstances, 2 to 4.
          2. maximumPoolSize (maximum number of threads): the limit for extreme load; set it to <= 4 here.
          3. keepAliveTime (idle thread survival time): usually on the order of seconds or minutes.
          4. TimeUnit unit (unit of the idle thread survival time): seconds or minutes.
          5. workQueue (work queue): set according to the actual situation; here it can be set to 20 (2n+1).
          6. threadFactory (thread factory): controls how each thread is created and its properties (such as the thread name).
          7. RejectedExecutionHandler (rejection policy): throw an exception and mark the task's status in the database as "task queue full, rejected".
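One possible shape for such a rejection policy is sketched below. It is an illustration only: `ChartTask`, `MarkAndAbort`, and the in-memory status map (standing in for the database) are hypothetical names, not part of the project.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectionDemo {

    /** Hypothetical task type that carries the id of its database row. */
    static final class ChartTask implements Runnable {
        final long chartId;
        final Runnable body;

        ChartTask(long chartId, Runnable body) {
            this.chartId = chartId;
            this.body = body;
        }

        @Override
        public void run() {
            body.run();
        }
    }

    /** Records the rejection in the status table (a stand-in for the database), then throws. */
    static final class MarkAndAbort implements RejectedExecutionHandler {
        private final Map<Long, String> statusTable;

        MarkAndAbort(Map<Long, String> statusTable) {
            this.statusTable = statusTable;
        }

        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            if (r instanceof ChartTask) {
                statusTable.put(((ChartTask) r).chartId, "task queue full, rejected");
            }
            throw new RejectedExecutionException("thread pool and queue are full");
        }
    }

    /** Pushes three blocking tasks into a pool that can hold only two. */
    static Map<Long, String> demo() {
        Map<Long, String> statusTable = new ConcurrentHashMap<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1), new MarkAndAbort(statusTable));
        CountDownLatch gate = new CountDownLatch(1);
        for (long id = 1; id <= 3; id++) {
            statusTable.put(id, "wait");
            try {
                pool.execute(new ChartTask(id, () -> {
                    try {
                        gate.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            } catch (RejectedExecutionException e) {
                // Already recorded by the handler; nothing more to do in this sketch.
            }
        }
        gate.countDown();
        pool.shutdown();
        return statusTable;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The handler can recognize the task only when it is passed to `execute` directly. `submit` and `CompletableFuture.runAsync` wrap the task in another object before handing it to the pool, so with those the id would have to be carried some other way.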
        3. Sizing threads by workload type

          • IO-intensive: the task mainly consumes bandwidth, memory, or disk read/write resources. corePoolSize can be set larger; a common rule of thumb is about 2n (n being the number of CPU cores), but the real limit should be the capacity of the IO resource.
          • Compute-intensive: the task mainly consumes CPU, for example audio and video processing, image processing, or mathematical calculation. corePoolSize is generally set to the number of CPU cores + 1 (one spare thread).

          Is importing millions of records into a database an IO-intensive task or a compute-intensive task?

          Answer: It is IO-intensive. The import reads a large amount of data from an external source and writes it to the database. The computation involved is light, but the disk and network IO is heavy, so the performance bottleneck is usually IO rather than calculation.
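The two rules of thumb translate directly into code. The helper names below are made up for the example, and the results are starting points to be tuned by testing, not fixed answers.

```java
class PoolSizing {

    /** Rule of thumb for compute-intensive work: number of CPU cores + 1. */
    static int computeIntensive(int cpuCores) {
        return cpuCores + 1;
    }

    /** Rule of thumb for IO-intensive work: about twice the number of CPU cores. */
    static int ioIntensive(int cpuCores) {
        return 2 * cpuCores;
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU cores: " + n);
        System.out.println("compute-intensive corePoolSize: " + computeIntensive(n));
        System.out.println("IO-intensive corePoolSize: " + ioIntensive(n));
    }
}
```

On an 8-core machine this suggests 9 threads for compute-intensive work and 16 for IO-intensive work. When the bottleneck is an external service (as with the AI service here), the service's capacity should override both numbers.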

        4. Custom thread pool

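          import java.util.concurrent.ArrayBlockingQueue;
          import java.util.concurrent.ThreadFactory;
          import java.util.concurrent.ThreadPoolExecutor;
          import java.util.concurrent.TimeUnit;
          import java.util.concurrent.atomic.AtomicInteger;
          import org.jetbrains.annotations.NotNull; // assumed source of @NotNull
          import org.springframework.context.annotation.Bean;
          import org.springframework.context.annotation.Configuration;

          /**
           * Thread pool configuration class.
           * The values can also be written in a yml file and injected automatically.
           */
          @Configuration
          public class ThreadPoolExecutorConfig {

              @Bean
              public ThreadPoolExecutor threadPoolExecutor() {
                  ThreadFactory threadFactory = new ThreadFactory() {
                      // Thread counter; atomic because the pool may create threads concurrently
                      private final AtomicInteger count = new AtomicInteger(1);

                      @Override
                      public Thread newThread(@NotNull Runnable r) {
                          // The Runnable r must be passed to the new thread
                          Thread thread = new Thread(r);
                          thread.setName("thread-" + count.getAndIncrement());
                          return thread;
                      }
                  };
                  // 2 core threads, at most 4 threads, 100 s keep-alive, bounded queue of 100 tasks
                  return new ThreadPoolExecutor(2, 4, 100, TimeUnit.SECONDS,
                          new ArrayBlockingQueue<>(100), threadFactory);
              }
          }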
          
        5. Submit tasks to a custom thread pool

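          import java.util.HashMap;
          import java.util.Map;
          import java.util.concurrent.CompletableFuture;
          import java.util.concurrent.ThreadPoolExecutor;
          import javax.annotation.Resource; // jakarta.annotation.Resource on Spring Boot 3
          import cn.hutool.json.JSONUtil;
          import io.swagger.annotations.Api;
          import lombok.extern.slf4j.Slf4j;
          import org.springframework.context.annotation.Profile;
          import org.springframework.web.bind.annotation.CrossOrigin;
          import org.springframework.web.bind.annotation.GetMapping;
          import org.springframework.web.bind.annotation.RequestMapping;
          import org.springframework.web.bind.annotation.RestController;

          @RestController
          @RequestMapping("/queue")
          @Slf4j
          @Profile({ "dev", "local" })
          @Api(tags = "QueueController")
          @CrossOrigin(origins = "http://localhost:8000", allowCredentials = "true")
          public class QueueController {

              @Resource
              private ThreadPoolExecutor threadPoolExecutor;

              // Submit a task that occupies a pool thread for 60 seconds
              @GetMapping("/add")
              public void add(String name) {
                  CompletableFuture.runAsync(() -> {
                      log.info("Task running: " + name + ", executed by: " + Thread.currentThread().getName());
                      try {
                          Thread.sleep(60000);
                      } catch (InterruptedException e) {
                          Thread.currentThread().interrupt(); // restore the interrupt flag
                          throw new RuntimeException(e);
                      }
                  }, threadPoolExecutor);
              }

              // Report the current state of the pool as JSON
              @GetMapping("/get")
              public String get() {
                  Map<String, Object> map = new HashMap<>();
                  int size = threadPoolExecutor.getQueue().size();
                  map.put("Queue length", size);
                  long taskCount = threadPoolExecutor.getTaskCount();
                  map.put("Total tasks", taskCount);
                  long completedTaskCount = threadPoolExecutor.getCompletedTaskCount();
                  map.put("Completed tasks", completedTaskCount);
                  int activeCount = threadPoolExecutor.getActiveCount();
                  map.put("Active threads", activeCount);
                  return JSONUtil.toJsonStr(map);
              }
          }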
          

3. Use asynchronous optimization in practical projects

  1. Analysis of the system's problems:

    1. Users have to wait a long time.
    2. The business server may be processing many requests at once, straining system resources. In severe cases the server crashes or cannot handle new requests (many users making the same request at once degrades the experience for everyone).
    3. The third-party services being called (such as the AI service) have limited processing capacity, for example one request every 3 seconds. With many concurrent requests the AI service cannot keep up, and in severe cases it may refuse service to the backend system.
  2. Solution: go asynchronous

    • When to use asynchronous processing: the called service has limited processing capacity, or the interface takes a long time to return.
  3. Comparison before and after asynchronous optimization

    • [Figure: architecture diagram before optimization]

    • [Figure: architecture diagram after optimization]

  4. Implementing asynchronous processing (new Thread)

    1. What should the maximum capacity of the task queue be?
    2. How does the program take tasks from the queue and execute them, how is the queueing process implemented, and how do we cap the number of tasks that run at the same time?
      • Blocking queue
      • Thread pool
      • Add more manpower?
    3. Process summary:

      1. Add task status fields (such as queued, executing, completed, failed) and task execution information fields (used to record some information about task execution or failure) to the chart table.

        -- chart information table
        create table if not exists chart
        (
            id          bigint auto_increment comment 'id' primary key,
            goal        text                                   null comment 'analysis goal',
            chartName   varchar(256)                           null comment 'chart name',
            chartData   text                                   null comment 'chart data',
            chartType   varchar(256)                           null comment 'chart type',
            genChart    text                                   null comment 'generated chart information',
            genResult   text                                   null comment 'generated analysis conclusion',
            chartStatus varchar(128) default 'wait'            not null comment 'wait: waiting, running: generating, succeed: generated successfully, failed: generation failed',
            execMessage text                                   null comment 'execution information',
            userId      bigint                                 null comment 'id of the user who created the chart',
            createTime  datetime     default CURRENT_TIMESTAMP not null comment 'creation time',
            updateTime  datetime     default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP comment 'update time',
            isDelete    tinyint      default 0                 not null comment 'whether deleted'
        ) comment 'chart information table' collate = utf8mb4_unicode_ci;
        
      2. When the user clicks the submit button on the intelligent analysis page, the chart is saved to the database immediately (as a task).

      3. When the task runs, it first sets the chart's task status to "executing". If execution succeeds, the status is changed to "completed" and the results are saved; if execution fails, the status is changed to "failed" and the failure information is recorded.

      4. Users can view the information and status of all their charts in the chart management interface:

        • Generated
        • Generating
        • Failed to generate
      5. Users can edit the information of a chart that failed to generate and click to regenerate it.
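The status transitions in this summary can be sketched as below. It is a simplified illustration, not the project's service code: an in-memory map stands in for the chart table, `aiService` is a hypothetical function standing in for the AI call, and the status strings follow the chartStatus values in the table definition (wait, running, succeed, failed).

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class ChartTaskSketch {

    /** In-memory stand-in for one row of the chart table (only the fields used here). */
    static final class Chart {
        volatile String chartStatus = "wait";
        volatile String genResult;
        volatile String execMessage;
    }

    private final Map<Long, Chart> chartTable = new ConcurrentHashMap<>();
    private final ThreadPoolExecutor pool;
    private final Function<String, String> aiService; // hypothetical AI call: goal -> conclusion

    ChartTaskSketch(ThreadPoolExecutor pool, Function<String, String> aiService) {
        this.pool = pool;
        this.aiService = aiService;
    }

    /** Saves the chart as a task right away, then generates it in the background. */
    CompletableFuture<Void> submit(long chartId, String goal) {
        Chart chart = new Chart();
        chartTable.put(chartId, chart);
        return CompletableFuture.runAsync(() -> {
            chart.chartStatus = "running";
            try {
                chart.genResult = aiService.apply(goal);
                chart.chartStatus = "succeed";
            } catch (RuntimeException e) {
                chart.execMessage = e.getMessage(); // keep the failure reason for the user
                chart.chartStatus = "failed";
            }
        }, pool);
    }

    Chart get(long chartId) {
        return chartTable.get(chartId);
    }

    /** Runs one chart that succeeds and one that fails, and returns both final statuses. */
    static String[] demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 100, TimeUnit.SECONDS, new ArrayBlockingQueue<>(20));
        ChartTaskSketch service = new ChartTaskSketch(pool, goal -> {
            if (goal.isEmpty()) {
                throw new IllegalArgumentException("analysis goal is empty");
            }
            return "conclusion for: " + goal;
        });
        CompletableFuture<Void> ok = service.submit(1L, "sales trend");
        CompletableFuture<Void> bad = service.submit(2L, "");
        CompletableFuture.allOf(ok, bad).join();
        pool.shutdown();
        return new String[] {service.get(1L).chartStatus, service.get(2L).chartStatus};
    }

    public static void main(String[] args) {
        String[] statuses = demo();
        System.out.println("chart 1: " + statuses[0] + ", chart 2: " + statuses[1]);
    }
}
```

Because the failure reason is stored in `execMessage`, the chart management page can show the user why generation failed before they edit and regenerate the chart.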

Origin blog.csdn.net/weixin_52154534/article/details/134889894