Design and implementation of a distributed batch processing framework for big-promotion scenarios





Before this year's Double 11, we launched a new version of our batch processing framework, which fully supported merchant investment (campaign enrollment) for the promotion. Business applications access the framework through an SDK and implement task logic directly in their own code, making onboarding convenient; centralized scheduling and task distribution significantly improve processing efficiency.


Background


Batch processing capabilities are indispensable in B-side systems: they let users complete a series of actions in bulk and cut the cost of repetitive operations. In the big-promotion merchant investment system, we likewise need an online batch processing framework to support merchants' batch product enrollment, batch main-image tagging, one-click publishing, export of submitted products, and so on, so that merchants can upload data in bulk, manage their operation records, and view the detailed results of each batch operation.

The input data for these tasks comes from many sources, including Excel, databases, and OpenSearch, so the framework must be able to parse various kinds of input. The investment system also spans a large number of applications, so the framework must make access convenient for applications in every domain; ideally, an application only needs to add a dependency and implement the task logic. Finally, in scenarios with large data volumes, the framework needs fine-grained scheduling of different types of task instances while maintaining system throughput and stability.


Overall design


▐Architectural design  


A business container is an application that has integrated the framework; the task center is the central application of the batch processing framework. Instance scheduling and state transitions happen in the task center, which makes centralized management easy, while the execution logic of an instance lives in the business container, so the task center must call back into the business container during execution.



▐ Model design


To schedule task instances at the granularity of a single data item, a sub-task instance model is needed in addition to task registration information and task instances. The task registration object holds static information about a task type, such as its key and its execution rate limits. Each batch upload by a user generates one main task instance, and each individual data item corresponds to one sub-task instance.
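The three models above can be sketched as plain Java objects. All field and status names here are illustrative, not the framework's actual schema:

```java
// Sketch of the three core model objects of the framework.
public class BatchModels {

    // Task registration info: static configuration per task type.
    public static class TaskRegistration {
        public final String taskKey;     // unique task-type key
        public final String appName;     // owning business application
        public final int mainRateLimit;  // main-instance executions per second (cluster-wide)
        public final int subRateLimit;   // sub-instance executions per second (cluster-wide)

        public TaskRegistration(String taskKey, String appName,
                                int mainRateLimit, int subRateLimit) {
            this.taskKey = taskKey;
            this.appName = appName;
            this.mainRateLimit = mainRateLimit;
            this.subRateLimit = subRateLimit;
        }
    }

    // One main instance is created per batch upload.
    public static class MainInstance {
        public final long id;
        public final String taskKey;
        public String status = "PENDING_TRIGGER";

        public MainInstance(long id, String taskKey) {
            this.id = id;
            this.taskKey = taskKey;
        }
    }

    // One sub-instance corresponds to a single data item of the upload.
    public static class SubInstance {
        public final long id;
        public final long mainInstanceId;
        public final String payload;   // a single parsed data item
        public String status = "INIT";

        public SubInstance(long id, long mainInstanceId, String payload) {
            this.id = id;
            this.mainInstanceId = mainInstanceId;
            this.payload = payload;
        }
    }
}
```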


▐Main process  


The core process borrows the idea of MapReduce: split a large task, distribute the pieces to multiple machines for execution, and finally merge the results. A business container that integrates the framework must implement three pieces of logic for a task: main-instance splitting, sub-instance execution, and result merging. During main-instance splitting, user input data is parsed into sub-instances and stored in the DB; sub-instance execution is the processing logic for a single data item; result merging aggregates the sub-instance results for display (for example, generating an Excel file of task details).


After an instance is triggered by the scheduler, the task center calls the main-instance splitting method implemented in the business container, which must return the parsed data in batches. When the task center then executes the instance, it scans out the sub-instances and distributes them across its own cluster via RocketMQ messages that the task center both produces and consumes. On receiving a message, a task-center machine calls the business container to execute the sub-instance; once it has the result, it updates the sub-instance status and acknowledges the message as consumed. A ScheduleX job periodically scans the sub-instance statuses of executing main instances; when all sub-instances have finished, it calls back the result-merging logic in the business container and finally archives the task.


▐State machine  


Main task state machine

Subtask state machine


Key technical points


▐Scheduling execution  


  • Rate limiting component


The rate limiting component uses the RateLimiter implementation from the Guava package. When registering a task, you configure separate rate limits for main-instance and sub-instance execution, and limiting is applied per task type: during scheduling, the corresponding limiter is looked up by the task's key. Rate limiters are cached locally on each machine; when a cache entry expires, the task registration information is re-queried and a new limiter is created. Currently, limiting is only enforced per machine: the cluster-wide limit is divided by the number of task-center machines to obtain the per-machine limit.
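A sketch of this scheme, with a minimal token bucket standing in for Guava's RateLimiter and cache expiry omitted; class and method names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Per-machine, per-task-type rate limiting: the cluster-wide limit divided by
// the machine count gives the local permits-per-second.
public class TaskRateLimiters {

    // Minimal token-bucket stand-in for Guava's RateLimiter.
    static class SimpleRateLimiter {
        private final double permitsPerSecond;
        private double storedPermits;
        private long lastRefillNanos = System.nanoTime();

        SimpleRateLimiter(double permitsPerSecond) {
            this.permitsPerSecond = permitsPerSecond;
            this.storedPermits = permitsPerSecond; // allow an initial 1-second burst
        }

        synchronized boolean tryAcquire() {
            long now = System.nanoTime();
            // Refill proportionally to elapsed time, capped at one second's worth.
            storedPermits = Math.min(permitsPerSecond,
                storedPermits + (now - lastRefillNanos) / 1e9 * permitsPerSecond);
            lastRefillNanos = now;
            if (storedPermits >= 1.0) {
                storedPermits -= 1.0;
                return true;
            }
            return false;
        }
    }

    private final Map<String, SimpleRateLimiter> cache = new ConcurrentHashMap<>();
    private final int machineCount;

    public TaskRateLimiters(int machineCount) {
        this.machineCount = machineCount;
    }

    // In the real framework the cached limiter expires and the cluster limit is
    // re-read from task registration; expiry is omitted here for brevity.
    public boolean tryAcquire(String taskKey, int clusterLimit) {
        double localLimit = Math.max(1.0, (double) clusterLimit / machineCount);
        return cache.computeIfAbsent(taskKey, k -> new SimpleRateLimiter(localLimit))
                    .tryAcquire();
    }
}
```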



  • Main instance scheduling


After a task instance is created, it tries once to acquire a token. If it succeeds, the main instance is triggered immediately and proceeds through the subsequent flow; if not, the task stays in the pending-trigger state, waiting for the periodic ScheduleX job to pick it up and retry.
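The trigger-or-pend decision can be sketched as follows; the status names and Supplier-based token source are illustrative stand-ins for the real limiter:

```java
import java.util.function.Supplier;

// Scheduling decision after a main instance is created: acquire a token and
// trigger immediately, or stay in PENDING_TRIGGER for the periodic ScheduleX
// sweep to retry.
public class MainInstanceScheduler {

    public enum Status { PENDING_TRIGGER, EXECUTING }

    public static class MainInstance {
        public Status status = Status.PENDING_TRIGGER;
    }

    // tokenSource stands in for the per-task-type rate limiter's tryAcquire().
    public static Status trigger(MainInstance instance, Supplier<Boolean> tokenSource) {
        if (tokenSource.get()) {
            instance.status = Status.EXECUTING; // start splitting immediately
        }
        // Otherwise the instance stays PENDING_TRIGGER; the ScheduleX job scans
        // this status periodically and calls trigger() again.
        return instance.status;
    }
}
```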



  • Sub-instance scheduling


When a main instance executes, the task center first distributes all sub-instances across the cluster through RocketMQ, then synchronously calls the business container to execute each sub-instance and obtain its result. Sub-instance rate limiting is achieved by controlling the message consumption rate: token acquisition blocks during RocketMQ consumption in the task center. Because all task types share one topic, the message dispatch rate is also capped; otherwise a large instance would create a message backlog and block the execution of other task types.


If a machine restarts before a sub-instance status is updated, the MQ retry mechanism takes over: idempotent task types can simply be re-executed, while for task types that do not support idempotency, the sub-instance is marked failed directly when the message is redelivered.
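A sketch of this redelivery policy on the consumer side; the SubTaskExecutor interface and status strings are hypothetical, and reconsumeTimes plays the role of the MQ redelivery count (0 on first delivery):

```java
// Handling a redelivered sub-instance message: if the task type is idempotent,
// execute again; if not, mark the sub-instance failed on redelivery rather
// than risk a duplicate side effect.
public class RetryPolicy {

    public interface SubTaskExecutor {
        boolean idempotent();
        String execute(long subInstanceId); // returns "SUCCESS" or "FAILED"
    }

    public static String onMessage(SubTaskExecutor executor, long subInstanceId,
                                   int reconsumeTimes) {
        if (reconsumeTimes > 0 && !executor.idempotent()) {
            return "FAILED"; // non-idempotent task type: fail fast on redelivery
        }
        return executor.execute(subInstanceId);
    }
}
```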


▐Communication between center and client  


Executing a task's logic requires calling back into the business container, so the SDK must support a communication channel that the task center actively initiates. The implementation is:

  1. Client side: expose a Dubbo interface and register the service when the business application starts, using Dubbo groups to distinguish applications (groups must be unique, so the application name is used directly as the group).

  2. Task center side: register as a consumer of every business application's service. When a callback is needed, look up the consumer corresponding to the application registered for the task and invoke the business application through that consumer.
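The task-center-side routing can be sketched as a map from application name to consumer reference; BatchTaskService and its method are illustrative stand-ins for the real Dubbo interface, and the consumer is injected directly rather than created with group = appName as Dubbo would:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The task center keeps one consumer reference per business application and
// routes each callback by the application registered for the task.
public class CallbackRouter {

    public interface BatchTaskService {
        String invoke(String method, long instanceId);
    }

    private final Map<String, BatchTaskService> consumersByApp = new ConcurrentHashMap<>();

    // In the real framework this reference is a Dubbo consumer whose group is
    // the application name; here it is injected directly.
    public void register(String appName, BatchTaskService consumer) {
        consumersByApp.put(appName, consumer);
    }

    public String callback(String appName, String method, long instanceId) {
        BatchTaskService consumer = consumersByApp.get(appName);
        if (consumer == null) {
            throw new IllegalStateException("no consumer for app: " + appName);
        }
        return consumer.invoke(method, instanceId);
    }
}
```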


Main-instance splitting and result merging use asynchronous calls. Since a sub-instance is already the product of splitting, sub-instance execution currently supports only synchronous calls.


▐ Instance liveness detection


Because main-instance splitting and result merging take a long time to run, reading and writing hundreds of thousands of records in extreme scenarios, both are asynchronous calls to the client. In an asynchronous scenario, we must consider how to detect whether the business application is still alive and how to retry; otherwise, once a machine restarts, an executing task instance would be stuck in an intermediate state and produce a large amount of dirty data.


The liveness solution is heartbeat reporting on the client side and heartbeat checking by a scheduled task on the task-center side. After receiving a request, the client periodically reports a heartbeat for the task instance it is executing locally, i.e. updates the instance's heartbeat time in the DB, and stops reporting once execution completes. On the task-center side, a ScheduleX job scans the table for task instances whose heartbeat has timed out and re-issues the request to the client.
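A minimal in-memory sketch of this mechanism (the real implementation stores heartbeat times in the DB; timestamps are passed in explicitly here to keep the logic deterministic):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Client side reports a heartbeat per executing instance; the task-center
// scan finds instances whose heartbeat is older than the timeout and
// re-dispatches them.
public class HeartbeatMonitor {

    private final Map<Long, Long> lastBeatMillis = new ConcurrentHashMap<>();

    // Client side: called on a timer while the instance is executing.
    public void beat(long instanceId, long nowMillis) {
        lastBeatMillis.put(instanceId, nowMillis);
    }

    // Client side: stop reporting once the instance finishes.
    public void finish(long instanceId) {
        lastBeatMillis.remove(instanceId);
    }

    // Task-center side: the periodic ScheduleX scan.
    public List<Long> timedOut(long nowMillis, long timeoutMillis) {
        List<Long> stale = new ArrayList<>();
        for (Map.Entry<Long, Long> e : lastBeatMillis.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                stale.add(e.getKey()); // candidate for re-dispatch to the client
            }
        }
        return stale;
    }
}
```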



The above handles instances interrupted by a business application restart. A restart of the task center application can likewise interrupt operations on an instance (such as sub-instance distribution). This is solved the same way: heartbeat-based timeout detection finds the affected instances and re-executes the operation for their current state, preventing any instance from being stuck in an intermediate state forever.

▐ Client implementation


To integrate, a business container only needs to implement the SDK's MapReduceTask interface, providing the methods for main task instance splitting, subtask instance execution, and result merging; the rest of the logic is built into the SDK and requires no attention. When defining the class, you also declare a unique key for the task type, which is used on the client to match a task type to its implementation. A demo:
@BatchTask(key = "myDemo")
public class MyDemo implements MapReduceTask {

    /**
     * Main task instance splitting
     *
     * @param executeContext context
     * @return {@link TaskResult}
     */
    @Override
    public TaskResult processInstance(ExecuteContext executeContext) {
        while (true) {
            // Read input data in batches
            ...

            // Generate sub-instances
            List<SubInstance> subInstances = ...;

            // Commit the data
            executeContext.commit(subInstances);
        }
        return TaskResult.success();
    }

    /**
     * Subtask instance execution
     *
     * @param subExecuteContext context
     * @return {@link TaskResult}
     */
    @Override
    public TaskResult map(SubExecuteContext subExecuteContext) {
        // do something
        ...
        return TaskResult.success();
    }

    /**
     * Result merging
     *
     * @param executeContext context
     * @return {@link TaskResult}
     */
    @Override
    public TaskResult reduce(ExecuteContext executeContext) {
        List<SubInstance> subInstanceList;
        do {
            // Read sub-instances
            subInstanceList = executeContext.read(pageSize);
            if (CollectionUtils.isEmpty(subInstanceList)) {
                break;
            }
            // Build result details
            ...
        } while (subInstanceList.size() == pageSize);

        // Generate result data
        Map<String, Object> resultInfoMap = ...;
        return TaskResult.success(resultInfoMap);
    }
}

In the main task's splitting and result-merging logic, the client needs to write and read sub-instances respectively, so the task execution context ExecuteContext provides commit() and read() methods for the client to call. For the write path, the business container constructs the sub-instance objects and submits them.


If every sub-instance read or write went through a Dubbo call to the task center, the high-frequency traffic would increase network timeouts and other exceptions, so the SDK instead connects directly to the DB through a built-in DB interaction layer. To improve write efficiency, committed data is first held in a buffer inside the task execution context; once the buffer exceeds a threshold, the data is inserted into the DB in batches, and at the end the buffer is flushed and the remaining data inserted.
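The buffering behavior can be sketched as follows, with batchInsert standing in for the SDK's DB interaction layer; the threshold handling and final flush follow the description above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// commit() accumulates sub-instances in an in-memory buffer and writes a whole
// batch once the threshold is reached; flush() inserts whatever remains when
// main-instance splitting finishes.
public class BufferedCommitter<T> {

    private final int threshold;
    private final Consumer<List<T>> batchInsert; // stand-in for the DB layer
    private final List<T> buffer = new ArrayList<>();

    public BufferedCommitter(int threshold, Consumer<List<T>> batchInsert) {
        this.threshold = threshold;
        this.batchInsert = batchInsert;
    }

    public void commit(List<T> items) {
        buffer.addAll(items);
        while (buffer.size() >= threshold) {
            // Copy out one full batch, then remove it from the buffer.
            List<T> batch = new ArrayList<>(buffer.subList(0, threshold));
            buffer.subList(0, threshold).clear();
            batchInsert.accept(batch);
        }
    }

    // Called once at the end of main-instance splitting.
    public void flush() {
        if (!buffer.isEmpty()) {
            batchInsert.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```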


Results and outlook


Since the new version of the batch processing framework went live, key merchant operations such as batch enrollment, one-click enrollment, export of submitted products, and batch withdrawal have been migrated to it, and it fully and stably supported merchant investment for this Double 11 promotion, processing more than 1.3 million task instances. The system is stable overall, tasks can be configured flexibly, and the whole system is monitorable.


As for future optimization: because sub-instances are distributed through MQ, a large instance whose execution is rate-limited can still block messages of subsequent sub-instances from other, non-limited tasks. At present, messages are isolated by business identity and the send rate is only roughly capped. Going forward, we will try to improve whole-system throughput in extreme scenarios by matching the per-machine send rate to the cluster consumption rate and by consuming different task types in separate groups.


Team introduction


The Taotian core technology team continuously builds capabilities such as site-wide price comparison, user growth, trading, merchant investment, product selection, page building, and ad delivery, supporting big promotions (Double 11, 618, etc.) and daily sales. We have a simple, efficient, and pure engineering culture, where we achieve and grow together through shared mission and responsibility.
We are hiring in Hangzhou: Java development engineers, front-end engineers, test development engineers, and data analysts. Contact: [email protected]




This article is shared from the WeChat official account 大淘宝技术 (AlibabaMTT).


Origin my.oschina.net/u/4662964/blog/10320782