Still burying your head in rote interview questions ("eight-legged essays")? Add something more substantial to your interview prep first: how do you improve throughput and timeliness in delayed-task scenarios?

1. Foreword

I have previously shared many interview write-ups from large tech companies, and I have received many private messages from readers in response.

But an interview is not only about rote questions. We also need to equip ourselves with in-depth, hard-core technology and solution design.

In fact, most developers who genuinely enjoy writing code like to push themselves. For example, if your implementation of a requirement merely works, that is probably P5 level. If it not only works but is also easy to use, that is P6. If, beyond being easy to use, you distill the common requirements into a shared component or service, that is P7. Every developer who has grown this way has tested ideas and put them into practice again and again while building their own wheels. Rote interview answers alone are definitely not enough to produce a senior technical expert.

2. Delayed task scenarios

What is a delayed task?

In real business scenarios there are status changes before an activity starts, T+1 reconciliation after order settlement, and the generation of interest charges on loan bills, all of which rely on delayed tasks. In practice, a scheduler such as Quartz or Schedule typically scans your database tables at regular intervals. When the conditions are met, it changes the data status or inserts new rows into the table.

A simple requirement like this is where delayed tasks start. If the early requirement is small and few people use it, a single machine may be all you build. However, as the business grows and the functionality becomes more complex, the design and implementation work that lands on the R&D team is often not so simple. For example, you need to ensure that scanning a large volume of data completes with the lowest possible delay. Otherwise, as with loan interest charges, the second day may arrive while the user still cannot see the interest information or cannot reconcile after repayment, and a customer complaint may follow.
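
The basic single-machine scan described above can be sketched roughly as follows. This is only an in-memory illustration: the class names, the field names, and the PENDING/DONE status values are all hypothetical, and the list stands in for a real database table.

```java
import java.util.ArrayList;
import java.util.List;

/** A minimal in-memory stand-in for one task table row (hypothetical fields). */
final class DelayedTaskRow {
    final long id;
    final long executeAtMillis;
    String status = "PENDING";

    DelayedTaskRow(long id, long executeAtMillis) {
        this.id = id;
        this.executeAtMillis = executeAtMillis;
    }
}

final class SimpleTableScanner {
    private final List<DelayedTaskRow> table = new ArrayList<>();

    void insert(DelayedTaskRow row) {
        table.add(row);
    }

    /** One scan pass: mark every pending row whose time has arrived as DONE and return how many rows changed. */
    int scanOnce(long nowMillis) {
        int changed = 0;
        for (DelayedTaskRow row : table) {
            if ("PENDING".equals(row.status) && row.executeAtMillis <= nowMillis) {
                row.status = "DONE";
                changed++;
            }
        }
        return changed;
    }
}
```

In a real service, scanOnce would be triggered periodically by a scheduler, and the worst-case delay of a task is roughly the scan interval plus the time one pass takes. That is why large tables make timeliness hard.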

So how should a scenario like this be designed?

3. Delayed task design

The usual task-center flow is as follows: a scheduled job scans the task table, places tasks that are about to reach their timeout into a processing queue (in memory or as MQ messages), the business system then processes each task, and the task status in the table is updated once processing completes.
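
As a rough sketch of this flow (scan, enqueue, process, update status), the following uses an in-memory list and queue in place of the database table and the MQ. The names CenterTask and TaskCenter, the look-ahead parameter, and the WAITING/QUEUED/DONE statuses are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical task record; the status values are illustrative only. */
final class CenterTask {
    final String id;
    final long timeoutAtMillis;
    String status = "WAITING"; // WAITING -> QUEUED -> DONE

    CenterTask(String id, long timeoutAtMillis) {
        this.id = id;
        this.timeoutAtMillis = timeoutAtMillis;
    }
}

final class TaskCenter {
    private final List<CenterTask> taskTable = new ArrayList<>();          // stands in for the task table
    private final Queue<CenterTask> processingQueue = new ArrayDeque<>();  // stands in for memory/MQ

    void submit(CenterTask task) {
        taskTable.add(task);
    }

    /** Step 1: move tasks that time out within lookAheadMillis into the processing queue. */
    int scan(long nowMillis, long lookAheadMillis) {
        int queued = 0;
        for (CenterTask task : taskTable) {
            if ("WAITING".equals(task.status) && task.timeoutAtMillis <= nowMillis + lookAheadMillis) {
                task.status = "QUEUED";
                processingQueue.add(task);
                queued++;
            }
        }
        return queued;
    }

    /** Steps 2 and 3: hand each queued task to the business handler, then update its status. */
    int drain(Consumer<CenterTask> businessHandler) {
        int processed = 0;
        CenterTask task;
        while ((task = processingQueue.poll()) != null) {
            businessHandler.accept(task);
            task.status = "DONE";
            processed++;
        }
        return processed;
    }
}
```

Passing the handler in as a Consumer keeps the scanning code free of business logic, which is the decoupling the problems listed next call for.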

Problems:

  1. With massive, large-scale task data spread across sharded databases and tables, the task table must still be scanned quickly.

  2. The task-scanning service is coupled with business logic, so it is neither general-purpose nor reusable.

  3. Some finer-grained task systems need low latency and cannot wait long.

1. Task table method

Set aside the minor status-change scenarios first. For example, a business's own table contains a status field that is changed by program logic and is also changed automatically by the task service once a specified expiry time is reached. A feature of this kind can generally be designed directly into the business's own tables.

Then there are larger, more frequently used scenarios. If every one of the N tables in every system had to carry and maintain such fields, the result would be highly redundant and hard to maintain. A scenario like this is well suited to a general-purpose delayed-task system: each business system submits the actions it wants executed later to the delay system, and the delay system calls back at the specified time. The callback can be delivered through an interface call or an MQ message. For example, you can design a task schedule table along these lines:

  1. The extracted task schedule table mainly records which tasks exist and when to trigger them. The actual handling of each action is left to the business project.

  2. Because the table centrally handles a large number of tasks from many businesses, it needs to be sharded across databases and tables to accommodate future growth in business volume.

  3. The house number design addresses how a table is scanned. If the data volume is large and you do not want a single scan job to cover a whole table, you can have multiple scan jobs work on the same table to increase scanning throughput. A house number is then needed to isolate the scanning range of each job and avoid scanning the same task data twice.
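
To make the house number idea concrete, here is a minimal sketch. It assumes the house number is derived from a hash of the task id, which is only one possible assignment rule; ScheduleRow and HouseNumberScanner are hypothetical names, and the list stands in for one physical table.

```java
import java.util.ArrayList;
import java.util.List;

/** One row of a hypothetical task schedule table. */
final class ScheduleRow {
    final String taskId;
    final long executeAtMillis;
    final int houseNumber; // identifies which scan job owns this row

    ScheduleRow(String taskId, long executeAtMillis, int houseCount) {
        this.taskId = taskId;
        this.executeAtMillis = executeAtMillis;
        // Assumption: the house number is computed from the task id hash at insert time.
        this.houseNumber = Math.floorMod(taskId.hashCode(), houseCount);
    }
}

final class HouseNumberScanner {
    /** Returns the due rows that belong to the given house number only. */
    static List<ScheduleRow> scan(List<ScheduleRow> table, int houseNumber, long nowMillis) {
        List<ScheduleRow> due = new ArrayList<>();
        for (ScheduleRow row : table) {
            if (row.houseNumber == houseNumber && row.executeAtMillis <= nowMillis) {
                due.add(row);
            }
        }
        return due;
    }
}
```

Each scan job is given its own house number, so several jobs can work on the same table in parallel without any two of them picking up the same row.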

2. Low latency method

The low-latency scheme builds on the task table method and adds time-based control. Tasks that will expire within the upcoming period are placed into a queue in the Redis cluster and popped from the queue when consumed, so that execution happens closer to the intended time and the delay caused by a long interval between database scans is avoided.

  • When a delayed task submitted by a business system is received, it is either stored in the task table or synchronized to the Redis cluster, depending on its execution time. Tasks with a late execution time can be stored in the task table first and added to the timeout task execution queue later by scanning.

  • The core of this design is the use of Redis queues. To make consumption reliable, it introduces two-stage consumption and registration with a ZK registry, which guarantees at-least-once consumption. This article focuses on the design of the Redis queues; other logic can be extended and refined according to business needs.
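
The routing decision in the first point can be sketched as below. This is an interpretation rather than the original implementation: it assumes a configurable "near window", sends a task to either the table or the Redis-side queue but not both, and uses plain maps in place of the database and the Redis cluster. SubmitRouter and its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class SubmitRouter {
    final Map<String, Long> taskTable = new HashMap<>();   // taskId -> executeAt; stands in for the task table
    final Map<String, Long> redisQueue = new HashMap<>();  // taskId -> executeAt; stands in for the Redis queue
    private final long nearWindowMillis;

    SubmitRouter(long nearWindowMillis) {
        this.nearWindowMillis = nearWindowMillis;
    }

    /** Tasks that are due soon go straight to the Redis-side queue; later ones wait in the table. */
    void submit(String taskId, long executeAtMillis, long nowMillis) {
        if (executeAtMillis - nowMillis <= nearWindowMillis) {
            redisQueue.put(taskId, executeAtMillis);
        } else {
            taskTable.put(taskId, executeAtMillis);
        }
    }

    /** Periodic scan: promote table tasks that have entered the near window and return how many moved. */
    int promote(long nowMillis) {
        int moved = 0;
        Iterator<Map.Entry<String, Long>> it = taskTable.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() - nowMillis <= nearWindowMillis) {
                redisQueue.put(entry.getKey(), entry.getValue());
                it.remove();
                moved++;
            }
        }
        return moved;
    }
}
```

With this split, the database scan only has to run often enough to stay ahead of the near window, while the precise timing is left to the queue.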

Redis consumption queue

  • Compute the slot index that the data belongs to from the message body: index = CRC32 & 7.

  • The StoreQueue stores task execution information in Sorted Set data structures, one per slot, keyed by SlotKey = #{topic}_#{index} and ordered by execution score. Timed messages use the timestamp as the score, and each consumption pops the messages whose score is less than the current timestamp.

  • To ensure that each message is consumed at least once, the consumer does not pop elements directly from the sorted set. Instead, the element is moved from the StoreQueue to the PrepareQueue and returned to the consumer. After successful consumption it is deleted from the PrepareQueue. If consumption fails, it is moved from the PrepareQueue back to the StoreQueue. This is two-stage consumption.

  • Reference document: 2021 Alibaba Technician's Baibao Black Book PDF document, low-latency timeout center implementation
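
The slot calculation and the two-stage consumption described above can be illustrated with an in-memory sketch. Plain maps stand in for the Redis Sorted Sets, the method names offer, poll, ack, nack, and pending are hypothetical, and the ZK registration is not shown. In real Redis the move from StoreQueue to PrepareQueue would also have to be atomic, which this sketch does not attempt.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.CRC32;

final class TwoStageQueue {
    // slotKey -> (message -> score); each inner map stands in for one Redis Sorted Set.
    private final Map<String, Map<String, Long>> storeQueue = new HashMap<>();
    private final Map<String, Map<String, Long>> prepareQueue = new HashMap<>();

    /** SlotKey = #{topic}_#{index}, with index = CRC32(message body) & 7, giving 8 slots. */
    static String slotKey(String topic, String message) {
        CRC32 crc = new CRC32();
        crc.update(message.getBytes(StandardCharsets.UTF_8));
        return topic + "_" + (crc.getValue() & 7);
    }

    void offer(String topic, String message, long executeAtMillis) {
        slot(storeQueue, slotKey(topic, message)).put(message, executeAtMillis);
    }

    /** Stage one: move messages whose score is less than now into the PrepareQueue and return them. */
    List<String> poll(String slotKey, long nowMillis) {
        Map<String, Long> store = slot(storeQueue, slotKey);
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> entry : store.entrySet()) {
            if (entry.getValue() < nowMillis) {
                due.add(entry.getKey());
            }
        }
        // Lowest score first, mirroring the ordering of a Sorted Set.
        due.sort(Comparator.comparingLong((String m) -> store.get(m)));
        Map<String, Long> prepare = slot(prepareQueue, slotKey);
        for (String message : due) {
            prepare.put(message, store.remove(message));
        }
        return due;
    }

    /** Stage two, success: the message is deleted from the PrepareQueue. */
    void ack(String slotKey, String message) {
        slot(prepareQueue, slotKey).remove(message);
    }

    /** Stage two, failure: the message goes back to the StoreQueue for another attempt. */
    void nack(String slotKey, String message) {
        Long score = slot(prepareQueue, slotKey).remove(message);
        if (score != null) {
            slot(storeQueue, slotKey).put(message, score);
        }
    }

    /** Number of messages handed out but not yet acknowledged. */
    int pending(String slotKey) {
        return slot(prepareQueue, slotKey).size();
    }

    private static Map<String, Long> slot(Map<String, Map<String, Long>> queue, String slotKey) {
        return queue.computeIfAbsent(slotKey, k -> new HashMap<>());
    }
}
```

Because a message stays in the PrepareQueue until it is acknowledged, a consumer that crashes mid-processing leaves a record that can be moved back and retried. The price is that a message may be delivered more than once, so handlers should be idempotent.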

Simple case

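import java.util.concurrent.TimeUnit;
import org.junit.Test; // assumes JUnit 4; use org.junit.jupiter.api.Test for JUnit 5
import org.redisson.api.RBlockingQueue;
import org.redisson.api.RDelayedQueue;

// redissonClient is assumed to be an injected org.redisson.api.RedissonClient field.
@Test
public void test_delay_queue() throws InterruptedException {
    // The delayed queue moves each element into the blocking queue once its delay has elapsed.
    RBlockingQueue<Object> blockingQueue = redissonClient.getBlockingQueue("TASK");
    RDelayedQueue<Object> delayedQueue = redissonClient.getDelayedQueue(blockingQueue);
    // Consumer thread: blocks on take() until a delayed element becomes available.
    new Thread(() -> {
        try {
            while (true) {
                Object take = blockingQueue.take();
                System.out.println(take);
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag and exit
        }
    }).start();
    // Producer: once per second, offer a message with a 100 ms delay ("测试" means "test").
    // The loop never ends, so the test has to be stopped manually.
    int i = 0;
    while (true) {
        delayedQueue.offerAsync("测试" + ++i, 100L, TimeUnit.MILLISECONDS);
        Thread.sleep(1000L);
    }
}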

Test Data

2022-02-13  WARN 204760 --- [      Finalizer] i.l.c.resource.DefaultClientResources    : io.lettuce.core.resource.DefaultClientResources was not shut down properly, shutdown() was not called before it's garbage-collected. Call shutdown() or shutdown(long,long,TimeUnit) 
测试1
测试2
测试3
测试4
测试5

Process finished with exit code -1
  • Source code: github.com/fuzhengwei/…
  • Description: Redisson's DelayedQueue is used as the message queue. After a message is written, it is popped for consumption once its delay has elapsed.

4. Summary

  • Scheduled tasks are used very frequently in real scenarios. For example, we often use xxl-job, and large companies have also developed their own distributed task scheduling components. These may have started as small, simple features, but after being abstracted, consolidated, and refined, they became core general-purpose middleware services.
  • Whichever design and implementation you choose for task scheduling, you need to consider how the feature will iterate and how maintainable it is. If it is only a very small scenario that few people use, running it on a single machine of your own is fine. Over-design and overuse can sometimes drag R&D resources into a quagmire.
  • In fact, the knowledge points of various technologies are like tools: knives, guns, sticks, axes, and hooks. Learning how to combine their characteristics and wield these weapons is how a programmer keeps growing. If you want more in-depth technical content like this, you can join the Lottery distributed lottery and flash-sale (seckill) system project to learn more valuable, battle-tested practical methods.

Author: Brother Xiao Fu
Link: https://juejin.cn/post/7064732138339303431
Source: Juejin (Rare Earth Nuggets)
The copyright belongs to the author. For commercial reprints, please contact the author for authorization, and for non-commercial reprints, please indicate the source.

Origin blog.csdn.net/wdjnb/article/details/124278808