Distributed Task Scheduling Platform: XXL-JOB

1. Why do we need a task scheduling platform?

1.1. Shortcomings of traditional scheduled-task solutions

In Java, traditional scheduled-task solutions such as Timer and Quartz all have problems to some degree:

  1. No clustering support, no execution statistics, no management console, no failure alerting, no monitoring, and so on.
  2. In today's distributed architectures, some scenarios require distributed task scheduling: when the tasks of multiple instances of the same service are mutually exclusive, they must be scheduled centrally; task scheduling must support high availability, monitoring, and failure alerting; and the task scheduling results of every service node must be managed and tracked centrally, with task attribute information recorded and persisted.

Clearly, traditional scheduled tasks are no longer a good fit for distributed architectures, so a distributed task scheduling platform is needed. The mainstream options today are elasticjob and xxl-job.
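To make the limitation concrete, here is a minimal java.util.Timer sketch (illustrative only, not from XXL-JOB): the schedule lives entirely inside one JVM, with no persistence, no clustering, and no record of what ran.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerDemo {
    public static void main(String[] args) throws InterruptedException {
        Timer timer = new Timer();
        // The schedule exists only in this JVM: if the process dies it is gone,
        // and nothing records whether the task ran, succeeded, or failed.
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                System.out.println("task executed");
            }
        }, 0); // fire once, immediately
        Thread.sleep(200); // give the timer thread time to run
        timer.cancel();    // no management console; cancellation is manual
    }
}
```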

1.2. Disadvantages of Quartz

As the leader in open-source job scheduling, Quartz is the default choice. In a clustered environment, Quartz manages tasks through its API, which avoids some of the problems above but introduces the following:

Problem 1: Operating tasks through API calls is not user-friendly;
Problem 2: The business QuartzJobBean must be persisted into the underlying database tables, which is quite intrusive to the system;
Problem 3: The scheduling logic and the QuartzJobBean are coupled in the same project. As the number of scheduled tasks grows and the task logic becomes heavier, the performance of the scheduling system becomes severely limited by the business;
Problem 4: Quartz acquires DB locks preemptively, and the node that wins the lock runs the task, which can lead to very uneven load across nodes. XXL-JOB instead distributes work cooperatively through its executors, making full use of the cluster and keeping the load of each node balanced.

XXL-JOB makes up for these shortcomings of Quartz.

2. Comparison between elasticjob and xxl-job

elasticjob was designed for high-concurrency, complex businesses: even with large business volume and many servers, it schedules tasks well and makes the most of server resources. Its use of ZooKeeper gives it high availability, consistency, and scalability.

elasticjob is decentralized: a primary server is elected through ZooKeeper's election mechanism, and if the primary goes down a new one is elected. elasticjob therefore has good scalability and availability, but it is somewhat complicated to use and operate.

xxl-job takes the opposite approach: a central scheduling platform dispatches tasks to multiple executors, and the scheduling center uses DB locks to guarantee consistent distributed scheduling across the cluster. Adding executors therefore increases pressure on the database, but the database is only responsible for task scheduling and dispatch, so unless there are very many executors and tasks it will not become a bottleneck; in practice, most companies run relatively few tasks and executors.

Relatively speaking, the xxl-job central scheduling platform is lightweight and works out of the box: it is easy to operate, quick to adopt, integrates very well with Spring Boot, and has its monitoring interface built into the scheduling center with a simple UI. Maintenance cost is low, and features such as email alerts on failure are included. This is why many companies choose xxl-job as their scheduling platform.

3. xxl-job

3.1. Introduction

1. Overview

XXL-JOB is a distributed task scheduling platform. Its core design goals are rapid development, easy learning, light weight, and easy extension. Its source code is open, it is used in the online product lines of many companies, and it works out of the box.

2. Official website and community

1. Official website
2. Community

3. Features

  1. Simple: tasks can be managed through CRUD operations on web pages; easy to operate, up and running in one minute;
  2. Dynamic: task status can be modified dynamically, tasks started/stopped, and running tasks terminated, all taking effect immediately;
  3. Scheduling center HA (centralized): scheduling adopts a centralized design; the scheduling center implements its own scheduling component and supports cluster deployment, ensuring HA of the scheduling center;
  4. Executor HA (distributed): tasks execute in a distributed fashion; the task "executor" supports cluster deployment, ensuring HA of task execution;
  5. Registry: executors register themselves automatically and periodically, and the scheduling center automatically discovers the registered executors and triggers execution; manual entry of executor addresses is also supported;
  6. Elastic scaling: once a new executor machine goes online or offline, tasks are reassigned at the next schedule;
  7. Trigger strategies: rich task trigger strategies are provided, including Cron, fixed-rate, fixed-delay, API (event), manual, and parent-child task triggers;
  8. Schedule-misfire strategy: compensation strategies for schedules the scheduling center missed, including ignore, fire once immediately, etc.;
  9. Blocking strategy: how to handle scheduling when triggers arrive faster than the executor can process; strategies include single-machine serial (default), discard later schedules, and override earlier schedules;
  10. Task timeout control: custom task timeouts are supported; a task that runs past its timeout is actively interrupted;
  11. Task failure retry: custom retry counts are supported; when a task fails, it is actively retried according to the preset count; sharded tasks support retry at shard granularity;
  12. Task failure alarm: email failure alerts are provided by default, and an extension interface is reserved so that alert channels such as SMS and DingTalk can easily be added;
  13. Routing strategies: rich routing strategies are provided for executor clusters, including first, last, round-robin, random, consistent hash, least frequently used, least recently used, failover, busy-over, etc.;
  14. Sharded broadcast task: with an executor cluster, if the routing strategy is "sharded broadcast", one schedule broadcasts to all executors in the cluster, triggering each to run the task once; sharded tasks can be developed against the shard parameters;
  15. Dynamic sharding: sharded broadcast tasks are sharded per executor, so dynamically scaling the executor cluster dynamically increases the shard count for coordinated processing; this significantly improves throughput and speed for large-volume jobs.
  16. Failover: When the task routing policy selects "Failover", if a machine in the executor cluster fails, it will automatically failover and switch to a normal executor to send scheduling requests.
  17. Task progress monitoring: support real-time monitoring of task progress;
  18. Rolling real-time log: supports online viewing of scheduling results, and supports real-time viewing of the complete execution log output by the executor in Rolling mode;
  19. GLUE: Provides a Web IDE that supports online development of task logic codes, dynamic publishing, real-time compilation and validation, and omits the process of deployment and online. Backtracking of 30 versions of historical versions is supported.
  20. Script tasks: support developing and running script tasks in GLUE mode, including Shell, Python, NodeJS, PHP, PowerShell and other types of scripts;
  21. Command line task: Natively provides a common command line task Handler (Bean task, "CommandJobHandler"); the business side only needs to provide the command line;
  22. Task dependency: Support configuration of subtask dependencies. When the execution of the parent task is completed and executed successfully, it will actively trigger the execution of a subtask. Multiple subtasks are separated by commas;
  23. Consistency: "Scheduling Center" ensures the consistency of cluster distributed scheduling through DB locks, and one task scheduling will only trigger one execution;
  24. Custom task parameters: support online configuration scheduling task input parameters, effective immediately;
  25. Scheduling thread pool: multi-threaded scheduling system triggers scheduling operation to ensure accurate execution of scheduling without being blocked;
  26. Data encryption: communication between the scheduling center and the executors is encrypted, improving the security of scheduling information;
  27. Email alarm: Support email alarm when the task fails, and support the configuration of multiple email addresses to send alarm emails;
  28. Push to Maven Central: the latest stable version is pushed to Maven Central for easy access;
  29. Operation report: supports real-time viewing of operation data, such as the number of tasks, scheduling times, number of executors, etc.; and scheduling reports, such as scheduling date distribution diagram, scheduling success distribution diagram, etc.;
  30. Fully asynchronous: The task scheduling process is fully asynchronously designed and implemented, such as asynchronous scheduling, asynchronous operation, asynchronous callback, etc., which can effectively cut traffic peaks for intensive scheduling, and theoretically support the operation of tasks of any duration;
  31. Cross-language: The dispatch center and the executor provide language-independent RESTful API services, and any third-party language can connect to the dispatch center or implement the executor. In addition, it also provides other cross-language solutions such as "multitasking mode" and "httpJobHandler";
  32. Internationalization: The dispatch center supports internationalization settings, providing two optional languages, Chinese and English, and the default is Chinese;
  33. Containerization: provide the official docker image, and update and push dockerhub in real time to further realize the out-of-the-box use of the product;
  34. Thread pool isolation: The scheduling thread pool is isolated and split, and slow tasks are automatically downgraded to the "Slow" thread pool to avoid running out of scheduling threads and improve system stability;
  35. User management: support online management system users, there are two roles of administrator and ordinary user;
  36. Permission control: Permission control is performed in the executor dimension, administrators have full permissions, and ordinary users need to be assigned executor permissions before allowing related operations;

4. Development

In mid-2015, Xu Xueli created the XXL-JOB repository on GitHub and made the first commit, followed by system architecture design, UI selection, interaction design...

On 2021-12-06, XXL-JOB entered the "2021 OSC China Open Source Project Selection", competed among the 10,000+ open source projects registered at the time, and was ultimately selected as a "Most Popular Project".

So far, XXL-JOB has been adopted by the online product lines of many companies, covering scenarios such as e-commerce, O2O, and big-data operations. As of the latest statistics, companies using XXL-JOB include, but are not limited to:
Dianping [Meituan Dianping], Michelin (China) [Michelin], China Ping An Technology Co., Ltd. [Ping An], Uxin Used Car [Uxin], Migu Interactive Entertainment Co., Ltd. [China Mobile], Inspur Software Group, 360 Finance [360], Lenovo Group [Lenovo], JD.com [JD], Evergrande Real Estate [Evergrande], Hisense Group [Hisense], UFIDA Financial Information Technology Co., Ltd. [Yonyou], Beijing Sohu-Fox Friends [Sohu], iFLYTEK [iFLYTEK], Hello Travel [Hello], Huolala [Huolala APP]... 600+ large Internet companies.

5. Download

Source code repositories:

https://github.com/xuxueli/xxl-job
http://gitee.com/xuxueli0323/xxl-job

Maven Central coordinates:

<!-- http://repo1.maven.org/maven2/com/xuxueli/xxl-job-core/ -->
<dependency>
    <groupId>com.xuxueli</groupId>
    <artifactId>xxl-job-core</artifactId>
    <version>${latest stable version}</version>
</dependency>

Or, if using Gradle:

implementation('com.xuxueli:xxl-job-core:2.3.0')

6. Environment

Maven3+
Jdk1.8+
Mysql5.7+

3.2. Quick start

1. Initialize the "scheduling database"

Please download the project source code and unzip it, get the "Scheduling Database Initialization SQL Script" and execute it.

The location of "Scheduling Database Initialization SQL Script" is:

/xxl-job/doc/db/tables_xxl_job.sql

The scheduling center supports cluster deployment; in a cluster, every node must connect to the same MySQL instance.

If MySQL is deployed master-slave, the scheduling center cluster nodes must be forced to use the master library;

2. Compile the source code

Unzip the source code, import the source code into the IDE according to the maven format, and compile it with maven. The source code structure is as follows:

xxl-job-admin: the scheduling center
xxl-job-core: shared dependencies
xxl-job-executor-samples: executor sample projects (pick a suitable version; they can be used directly, or used as a reference to turn an existing project into an executor)
    xxl-job-executor-sample-springboot: Spring Boot version, managing the executor through Spring Boot; recommended;
    xxl-job-executor-sample-frameless: frameless version;

3. Configure and deploy "dispatch center"

Scheduling center project: xxl-job-admin
Purpose: centrally manages the scheduled tasks on the platform, triggers task execution, and provides the task management console.

Step 1: Scheduling center configuration
Scheduling center configuration file path:

/xxl-job/xxl-job-admin/src/main/resources/application.properties

Scheduling center configuration description:

### Scheduling center JDBC connection: keep this consistent with the scheduling database created in section 3.2.1
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/xxl_job?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&serverTimezone=Asia/Shanghai
spring.datasource.username=root
spring.datasource.password=root_pwd
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
### Alert email account
spring.mail.host=smtp.qq.com
spring.mail.port=25
[email protected]
spring.mail.password=xxx
spring.mail.properties.mail.smtp.auth=true
spring.mail.properties.mail.smtp.starttls.enable=true
spring.mail.properties.mail.smtp.starttls.required=true
spring.mail.properties.mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory
### Scheduling center communication TOKEN [optional]: enabled when non-empty;
xxl.job.accessToken=
### Scheduling center i18n [required]: defaults to "zh_CN" (Simplified Chinese); options are "zh_CN" (Simplified Chinese), "zh_TC" (Traditional Chinese) and "en" (English);
xxl.job.i18n=zh_CN
### Maximum sizes of the scheduling thread pools [required]
xxl.job.triggerpool.fast.max=200
xxl.job.triggerpool.slow.max=100
### Days to keep scheduling-center log-table data [required]: expired logs are cleaned up automatically; takes effect when >= 7, otherwise (e.g. -1) auto-cleanup is disabled;
xxl.job.logretentiondays=30

Step 2: Deploy the project:
If the above configuration has been performed correctly, the project can be compiled, packaged and deployed.
Dispatch center access address: http://localhost:8080/xxl-job-admin (this address will be used by the executor as the callback address)
The default login account is "admin/123456", and the running interface after login is shown in the figure below.
So far the "dispatch center" project has been deployed successfully.

Step 3: Dispatch center cluster (optional):
The dispatch center supports cluster deployment to improve disaster recovery and availability of the dispatch system.

When deploying a dispatch center cluster, there are several requirements and suggestions:

  1. The DB configuration remains consistent;
  2. The clocks of the cluster machines must be kept in sync (can be ignored when the cluster runs on a single machine);
  3. Recommendation: use nginx to load-balance the scheduling center cluster and assign it a domain name. Scheduling center access, executor callback configuration, and API calls should all go through this domain.
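As a sketch of that suggestion (the IPs and domain below are hypothetical placeholders, not part of the XXL-JOB docs), an nginx configuration along these lines would do:

```nginx
# Load-balance a two-node scheduling center cluster behind one domain name.
upstream xxl_job_admin {
    server 192.168.0.11:8080;   # scheduling center node 1 (placeholder)
    server 192.168.0.12:8080;   # scheduling center node 2 (placeholder)
}

server {
    listen 80;
    server_name xxl-job-admin.example.com;   # placeholder domain

    location / {
        # Executor callbacks and API calls also go through this domain.
        proxy_pass http://xxl_job_admin;
    }
}
```

Executors would then point xxl.job.admin.addresses at http://xxl-job-admin.example.com/xxl-job-admin instead of an individual node.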

Alternatively, the scheduling center can be built from the official Docker image.
Download the image:

// Docker: https://hub.docker.com/r/xuxueli/xxl-job-admin/     (specifying a version tag is recommended)
docker pull xuxueli/xxl-job-admin

Create and run the container:

docker run -p 8080:8080 -v /tmp:/data/applogs --name xxl-job-admin -d xuxueli/xxl-job-admin:{version}
/**
* To customize MySQL and other settings, pass "-e PARAMS" with the format PARAMS="--key=value --key2=value2";
*   available keys: /xxl-job/xxl-job-admin/src/main/resources/application.properties
* To customize JVM memory and other settings, pass "-e JAVA_OPTS", e.g. JAVA_OPTS="-Xmx512m"
*/
docker run -e PARAMS="--spring.datasource.url=jdbc:mysql://127.0.0.1:3306/xxl_job?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&serverTimezone=Asia/Shanghai" -p 8080:8080 -v /tmp:/data/applogs --name xxl-job-admin -d xuxueli/xxl-job-admin:{version}

4. Configure and deploy the "executor project"

1. The "executor" project: xxl-job-executor-sample-springboot (several executor versions are provided; the Spring Boot version is used here as an example; it can be used directly, or used as a reference to turn an existing project into an executor)
2. Purpose: receives schedules from the "scheduling center" and executes them; the executor can be deployed on its own, or integrated into an existing business project.

Step 1: Maven dependency
Confirm that the Maven dependency on "xxl-job-core" is declared in the pom file;
Step 2: Executor configuration
Executor configuration file path:

/xxl-job/xxl-job-executor-samples/xxl-job-executor-sample-springboot/src/main/resources/application.properties

Executor configuration description:

### Scheduling center deployment root address [optional]: if the scheduling center is clustered with multiple addresses, separate them with commas. The executor uses this address for "executor heartbeat registration" and "task result callback"; leave empty to disable auto-registration;
xxl.job.admin.addresses=http://127.0.0.1:8080/xxl-job-admin
### Executor communication TOKEN [optional]: enabled when non-empty;
xxl.job.accessToken=
### Executor AppName [optional]: grouping key for executor heartbeat registration; leave empty to disable auto-registration
xxl.job.executor.appname=xxl-job-executor-sample
### Executor registration address [optional]: used as the registration address when set; when empty, the embedded service "IP:PORT" is used instead. This gives more flexibility for containerized executors with dynamic IPs and dynamically mapped ports.
xxl.job.executor.address=
### Executor IP [optional]: empty means auto-detect; can be set manually on multi-NIC machines. The IP is not bound to a host interface and is used only for communication: for "executor registration" and for the scheduling center to request and trigger tasks;
xxl.job.executor.ip=
### Executor port [optional]: <= 0 means auto-assign; defaults to 9999. When deploying multiple executors on one machine, configure different ports;
xxl.job.executor.port=9999
### Executor log file path [optional]: requires read/write permission on the path; empty means the default path is used;
xxl.job.executor.logpath=/data/applogs/xxl-job/jobhandler
### Days to keep executor log files [optional]: expired logs are cleaned up automatically; takes effect when >= 3, otherwise (e.g. -1) auto-cleanup is disabled;
xxl.job.executor.logretentiondays=30

Step 3: Executor component configuration
Executor component configuration file path:

/xxl-job/xxl-job-executor-samples/xxl-job-executor-sample-springboot/src/main/java/com/xxl/job/executor/core/config/XxlJobConfig.java

Executor component configuration description:

@Bean
public XxlJobSpringExecutor xxlJobExecutor() {
    logger.info(">>>>>>>>>>> xxl-job config init.");
    XxlJobSpringExecutor xxlJobSpringExecutor = new XxlJobSpringExecutor();
    // Each value below is injected from application.properties,
    // e.g. @Value("${xxl.job.admin.addresses}") private String adminAddresses;
    xxlJobSpringExecutor.setAdminAddresses(adminAddresses);
    xxlJobSpringExecutor.setAppname(appname);
    xxlJobSpringExecutor.setIp(ip);
    xxlJobSpringExecutor.setPort(port);
    xxlJobSpringExecutor.setAccessToken(accessToken);
    xxlJobSpringExecutor.setLogPath(logPath);
    xxlJobSpringExecutor.setLogRetentionDays(logRetentionDays);
    return xxlJobSpringExecutor;
}

Step 4: Deploy the executor project:
If the above configurations have been performed correctly, the executor project can be compiled, packaged and deployed. The system provides a variety of executor Sample projects, just select one of them, and the respective deployment methods are as follows.

xxl-job-executor-sample-springboot: compile and package the project into a Spring Boot executable JAR and start it from the command line;
xxl-job-executor-sample-frameless: compile and package the project into a JAR and start it from the command line;

So far the "executor" project has been deployed.

Step 5: Executor cluster (optional):

The executor supports cluster deployment, which improves the availability of the scheduling system and its task processing capacity.

When deploying executor clusters, there are several requirements and suggestions:

  1. The scheduling center addresses (xxl.job.admin.addresses) must be consistent; executors use this configuration for automatic registration and result callbacks.
  2. The AppName (xxl.job.executor.appname) within one executor cluster must be consistent; the scheduling center uses it to dynamically discover the online executor list of each cluster.

5. Develop the first task "Hello World"

This example takes creating a new "GLUE mode (Java)" task as an example. For more detailed configuration of tasks, please refer to "Chapter 3: Detailed Explanation of Tasks".
(The execution code of "GLUE mode (Java)" is hosted to the dispatch center for online maintenance, which is simpler and lighter than the "Bean mode task" that needs to be developed and deployed in the executor project.)

Prerequisite: Please confirm that the "dispatch center" and "executor" projects have been successfully deployed and started;
Step 1: Create a new task:
Log in to the scheduling center and click the "New Task" button as shown in the figure below to create a sample task. Then, following the task parameter configuration in the screenshot below, click Save.
Step 2: "GLUE mode (Java)" task development:
Please click the "GLUE" button on the right side of the task to enter the "GLUE editor development interface", as shown in the figure below. A task in "GLUE mode (Java)" comes with sample code initialized by default, which simply prints Hello World.
(A "GLUE mode (Java)" task is actually a piece of Java code inheriting from IJobHandler. It runs inside the executor project and can use @Resource/@Autowired to inject the executor's other services; see Chapter 3 for details.)
Step 3: Trigger execution:
Please click the "Execute" button on the right side of the task to manually trigger a task execution (usually, the task scheduling is triggered by configuring a Cron expression).
Step 4: Check the log:
Please click the "Log" button on the right side of the task to go to the task log interface to view the task log.

In the task log interface, you can view the historical scheduling records of the task, as well as the task scheduling information, execution parameters and execution information of each scheduling. Click the "Execution Log" button on the right side of the running task to enter the log console to view the real-time execution log.
In the log console, you can view the log information output by the task running on the executor side in real time in Rolling mode, and monitor the progress of the task in real time;

3.3. Task details

Configuration property details:

Basic configuration:
    - Executor: the executor the task is bound to. When the task is triggered, registered executors are discovered automatically, enabling automatic task discovery; it also provides a convenient way to group tasks. Every task must be bound to one executor, configurable under "Executor Management";
    - Task description: descriptive information about the task, for easier management;
    - Owner: the person responsible for the task;
    - Alarm email: the email address(es) notified when task scheduling fails; multiple addresses are supported, separated by commas;
Trigger configuration:
    - Schedule type:
        None: never triggers scheduling on its own;
        CRON: triggers task scheduling according to a CRON expression;
        Fixed rate: triggers task scheduling at a fixed rate, i.e. periodically at a fixed interval;
        Fixed delay: triggers task scheduling with a fixed delay, counted from the end of the previous schedule; the next schedule fires once the delay elapses;
    - CRON: the Cron expression that triggers execution;
    - Fixed rate: the fixed interval, in seconds;
    - Fixed delay: the fixed delay, in seconds;
Task configuration:
    - Run mode:
        BEAN mode: the task is maintained as a JobHandler on the executor side; the "JobHandler" property is used to match the task in the executor;
        GLUE mode (Java): the task is maintained as source code in the scheduling center; it is actually a Java class inheriting from IJobHandler, maintained as "groovy" source. It runs in the executor project and can use @Resource/@Autowire to inject the executor's other services;
        GLUE mode (Shell): the task is maintained as source code in the scheduling center; it is actually a "shell" script;
        GLUE mode (Python): the task is maintained as source code in the scheduling center; it is actually a "python" script;
        GLUE mode (PHP): the task is maintained as source code in the scheduling center; it is actually a "php" script;
        GLUE mode (NodeJS): the task is maintained as source code in the scheduling center; it is actually a "nodejs" script;
        GLUE mode (PowerShell): the task is maintained as source code in the scheduling center; it is actually a "PowerShell" script;
    - JobHandler: effective when the run mode is "BEAN mode"; corresponds to the custom value of the "@JobHandler" annotation on the JobHandler class developed in the executor;
    - Execution parameters: the parameters the task needs to run;
Advanced configuration:
    - Routing strategy: rich routing strategies are provided when the executor is deployed as a cluster, including:
        FIRST: always select the first machine;
        LAST: always select the last machine;
        ROUND: select machines in turn (round-robin);
        RANDOM: randomly select an online machine;
        CONSISTENT_HASH: each task selects a fixed machine via a hash algorithm, with all tasks hashed evenly across machines;
        LEAST_FREQUENTLY_USED: the least frequently used machine is selected first;
        LEAST_RECENTLY_USED: the least recently used machine is selected first;
        FAILOVER: heartbeat-check the machines in order; the first machine that passes the check is selected as the target executor and the schedule is sent to it;
        BUSYOVER: idle-check the machines in order; the first machine found idle is selected as the target executor and the schedule is sent to it;
        SHARDING_BROADCAST: broadcast to all machines in the cluster, triggering each to run the task once, with shard parameters passed automatically; sharded tasks can be developed against these parameters;
    - Child task: every task has a unique task ID (available in the task list); when this task finishes and succeeds, the task matching the child-task ID is actively triggered once.
    - Schedule-misfire strategy:
        - Ignore: after a misfire, ignore it and recompute the next trigger time from now;
        - Fire once immediately: after a misfire, execute once immediately and recompute the next trigger time from now;
    - Blocking strategy: how to handle scheduling when triggers arrive faster than the executor can process:
        Single-machine serial (default): schedule requests enter a FIFO queue on the executor and run serially;
        Discard later schedules: if a schedule request arrives while the executor is already running this task, the request is discarded and marked failed;
        Override earlier schedules: if a schedule request arrives while the executor is already running this task, the running schedule is terminated and the queue cleared, then the new request runs;
    - Task timeout: custom task timeouts are supported; a task that runs past its timeout is actively interrupted;
    - Failure retry count: custom retry counts are supported; when the task fails, it is actively retried according to the preset count;
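The "fixed rate" vs. "fixed delay" distinction above mirrors the two scheduling styles in java.util.concurrent; a minimal sketch of the difference:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RateVsDelay {
    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);

        // Fixed rate: the interval is measured between successive trigger times,
        // no matter how long the task itself takes.
        pool.scheduleAtFixedRate(
                () -> System.out.println("fixed-rate tick"),
                0, 500, TimeUnit.MILLISECONDS);

        // Fixed delay: the interval is measured from the END of one run to the
        // START of the next, matching XXL-JOB's "fixed delay" schedule type.
        pool.scheduleWithFixedDelay(
                () -> System.out.println("fixed-delay tick"),
                0, 500, TimeUnit.MILLISECONDS);

        Thread.sleep(1200);
        pool.shutdownNow();
    }
}
```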

1. BEAN mode (class form)

Bean mode tasks support class-based development, and each task corresponds to a Java class.

  • Advantages: it does not constrain the project environment and has good compatibility. Even frameless projects, such as those started directly from a main method, are supported; see the sample project "xxl-job-executor-sample-frameless";
  • Disadvantages: each task occupies one Java class, which wastes classes; automatic scanning of tasks and injection into the executor container is not supported, so manual registration is required.

Step 1: In the executor project, develop the Job class:

1. Develop a JobHandler class inheriting from "com.xxl.job.core.handler.IJobHandler" and implement its task method.
2. Register it manually into the executor container as follows:
XxlJobExecutor.registJobHandler("demoJobHandler", new DemoJobHandler());
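A minimal sketch of such a class. IJobHandler normally comes from xxl-job-core; a local stand-in with a simplified execute() signature is declared here so the sketch compiles on its own (the real signature varies across XXL-JOB versions):

```java
// Local stand-in for com.xxl.job.core.handler.IJobHandler, so this sketch is
// self-contained; in a real executor project, extend the library class instead.
abstract class IJobHandler {
    public abstract void execute() throws Exception;
}

class DemoJobHandler extends IJobHandler {
    @Override
    public void execute() throws Exception {
        // A real handler would log through XxlJobHelper.log(...).
        System.out.println("XXL-JOB, Hello World.");
    }
}

public class RegisterDemo {
    public static void main(String[] args) throws Exception {
        // In the executor project, registration would be:
        // XxlJobExecutor.registJobHandler("demoJobHandler", new DemoJobHandler());
        new DemoJobHandler().execute();
    }
}
```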

Step 2: Scheduling center, create a new scheduling task
The subsequent steps are the same as in "2. BEAN mode (method form)" below; refer to that section.

2. BEAN mode (method form)

The Bean mode task supports the method-based development method, and each task corresponds to a method.

  • Advantages: Only one method needs to be developed for each task, and the "@XxlJob" annotation can be added, which is more convenient and faster. Supports automatic scanning of tasks and injection into the executor container.
  • Disadvantages: Spring container environment is required;

For method-based tasks, the bottom layer will generate a JobHandler proxy. Like the class-based method, tasks will also exist in the executor task container in the form of JobHandler.

Step 1: In the executor project, develop the Job method:

1. Task development: develop the Job method in a Spring Bean instance;
2. Annotation configuration: annotate the Job method with "@XxlJob(value = "custom jobhandler name", init = "JobHandler init method", destroy = "JobHandler destroy method")"; the annotation's value corresponds to the JobHandler property of the task created in the scheduling center.
3. Execution logs: print execution logs through "XxlJobHelper.log";
4. Task result: the default result is the "success" state and does not need to be set explicitly; to set a different result, e.g. mark the task failed, use "XxlJobHelper.handleFail/handleSuccess" to set it yourself.
// See "com.xxl.job.executor.service.jobhandler.SampleXxlJob" in the sample executor:
@XxlJob("demoJobHandler")
public void demoJobHandler() throws Exception {
    XxlJobHelper.log("XXL-JOB, Hello World.");
}

Step 2: Scheduling center, create a new scheduling task

Refer to "Configuration property details" above to configure the new task: select "BEAN mode" as the run mode and set the JobHandler property to the value defined in the task's "@XxlJob" annotation.

For reference and quick use, the sample executor natively provides several Bean-mode task handlers that can be configured and used directly:

demoJobHandler: a simple example task that simulates time-consuming logic internally, so users can try features such as the rolling log online;
shardingJobHandler: a sharding example task that processes shard parameters internally; useful for getting familiar with sharded tasks;
httpJobHandler: a general HTTP task handler; the business side only needs to provide the HTTP link and related information, with no language or platform restrictions. Sample input parameters:

  url: http://www.xxx.com
  method: get or post
  data: post data

commandJobHandler: a general command-line task handler; the business side only needs to provide the command line, e.g. the "pwd" command;
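To show what developing against the shard parameters means in practice: in a real handler the shard index and total come from XxlJobHelper.getShardIndex()/getShardTotal() (an assumed 2.x API); the partitioning itself is typically just a modulo filter, sketched here self-contained:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardDemo {
    // Each executor in the cluster keeps only the ids whose modulo matches
    // its own shard index, so the full id set is split across the cluster.
    static List<Integer> idsForShard(List<Integer> allIds, int shardIndex, int shardTotal) {
        List<Integer> mine = new ArrayList<>();
        for (int id : allIds) {
            if (id % shardTotal == shardIndex) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6);
        // With a 2-node cluster: shard 0 handles the even ids, shard 1 the odd ones.
        System.out.println(idsForShard(ids, 0, 2)); // [2, 4, 6]
        System.out.println(idsForShard(ids, 1, 2)); // [1, 3, 5]
    }
}
```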

3. GLUE mode (Java)

The task is maintained in the dispatch center in the form of source code, supports online update through the Web IDE, and compiles and takes effect in real time, so there is no need to specify a JobHandler. The development process is as follows:
Step 1: In the scheduling center, create a new task:
Refer to "Configuration property details" above to configure the task, selecting "GLUE mode (Java)" as the run mode.
Step 2: Develop the task code:
Select the specified task and click the "GLUE" button on its right to enter the Web IDE for the GLUE task, where the task code can be developed (you can also develop the code in your own IDE and copy-paste it into the editor).

Version rollback (up to 30 historical versions): in the GLUE Web IDE, open the "Version Backtrack" drop-down in the upper right to list the GLUE's update history; selecting a version displays its code, and saving rolls the GLUE code back to that historical version;

4. GLUE mode (Shell)

Step 1: In the scheduling center, create a new task:
Refer to "Configuration property details" above to configure the task, selecting "GLUE mode (Shell)" as the run mode.
Step 2: Develop the task code:
Select the specified task and click the "GLUE" button on its right to enter the Web IDE for the GLUE task, where the task code can be developed (you can also develop the code in your own IDE and copy-paste it into the editor).

A task in this mode is actually a "shell" script;
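A minimal script sketch follows; the positional-parameter mapping ($1 = task parameter, $2/$3 = shard index/total) follows the bundled shell demo's convention and should be treated as an assumption to verify against your version:

```shell
#!/bin/bash
# GLUE shell task sketch: the scheduling center treats a non-zero exit code
# as a failed execution, so a script should finish with status 0 on success.
glue_task() {
    echo "xxl-job: hello shell"
    echo "task param: $1"    # task parameter (assumed positional convention)
    echo "shard index: $2"   # shard index   (assumed positional convention)
    echo "shard total: $3"   # shard total   (assumed positional convention)
}

glue_task "demo-param" 0 1
```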

3.4. Operation Guide

1. Configure the executor

Click to enter the "Executor Management" interface, as shown in the figure below:

1. The list of online "scheduling centers" is shown to the right of "Scheduling Center OnLine:"; after a task finishes, the execution result is called back to a scheduling center in failover mode, avoiding a single point of failure in callbacks;
2. The "executor list" shows the online executors; the cluster machines of an executor can be viewed via "OnLine machines".

Click the "+ Add Executor" button; in the pop-up box shown below, an executor configuration can be added:
Executor property description:

AppName: the unique identifier of each executor cluster; executors register themselves periodically under their AppName. Registered executors can be discovered automatically via this property and used for task scheduling;
Name: the display name of the executor; since AppName is restricted to letters and digits and is not very readable, the name improves readability;
Order: the sort order of the executor; wherever the system needs an executor list, e.g. when creating a task, the list is read in this order;
Registration type: how the scheduling center obtains executor addresses;
    Automatic registration: executors register themselves, and the scheduling center discovers executor machine addresses dynamically through the underlying registry;
    Manual entry: executor addresses are entered manually, comma-separated, for the scheduling center to use;
Machine addresses: effective when "Registration type" is "Manual entry"; executor addresses can be maintained manually;

2. Create a new task

Enter the task management interface and click the "Add Task" button; configure the task properties in the pop-up "Add Task" interface and save. For property details, refer to the chapter "3.3. Task details".

3. Edit task

Enter the task management interface and select the specified task. Click the "Edit" button on the right side of the task, update the task properties in the pop-up "Edit Task" interface, and save. The task's configured properties can be modified here.

4. Edit GLUE code

This operation is only for GLUE tasks.

Select the specified task and click the "GLUE" button on the right side of the task to open the Web IDE of the GLUE task, which supports developing the task code. Refer to the chapter "GLUE Mode (Java)" above.

5. Start/stop tasks

Tasks can be "started" and "stopped".
It should be noted that the start/stop here is only for the subsequent scheduling trigger behavior of the task, and will not affect the scheduled tasks that have already been triggered. If you need to terminate the scheduled tasks that have been triggered, please refer to "4.9 Terminate Running Tasks"

6. Manually trigger a schedule

Click the "Execute" button to manually trigger a task scheduling without affecting the original scheduling rules.

7. View the scheduling log

Click the "Log" button to view the task's historical scheduling logs. On the scheduling log page you can view the scheduling result and execution result of every run, and click the "Execution Log" button to view the complete executor-side log.

Scheduling time: the time when the scheduling center triggered this run and sent the execution signal to the executor;
Scheduling result: the result of this trigger in the scheduling center; 200 means success, 500 or anything else means failure;
Scheduling remarks: log information from the scheduling center about this trigger;
Executor address: the address of the machine on which the task ran;
Run mode: the run mode of the task at trigger time; see the chapter "3. Detailed Task Explanation";
Task parameters: the input parameters of this task execution;
Execution time: the time of the executor's callback after this task execution finished;
Execution result: the result of this task execution on the executor; 200 means success, 500 or anything else means failure;
Execution remarks: log information from the executor about this execution;
Operations:
    "Execution Log" button: view the detailed log of this task execution; see "4.8 Check the execution log";
    "Terminate Task" button: terminate the execution thread of this task on the corresponding executor; blocked runs that have not yet executed are terminated as well;

8. Check the execution log

Click the "Execution Log" button on the right of a scheduling log entry to jump to the execution log page, where the complete log printed by the business code can be viewed, as shown in the figure below;

9. Terminate running tasks

This applies only to tasks that are currently executing.
On the task log page, click the "Terminate Task" button on the right: a termination request is sent to the executor running this task, the task is terminated, and the task's entire execution queue is cleared at the same time.
Task termination is implemented by "interrupting" the execution thread, which raises an "InterruptedException". Therefore, if the JobHandler catches and swallows this exception internally, the task termination feature will not work.

If task termination appears to have no effect, make sure "InterruptedException" is handled specially inside the JobHandler (re-thrown upward). The correct logic is as follows:

try {
    // do something
} catch (Exception e) {
    if (e instanceof InterruptedException) {
        throw e;
    }
    logger.warn("{}", e);
}

Likewise, when child threads are started inside the JobHandler, the child threads must not catch and swallow "InterruptedException" either; they should actively rethrow it upward.

When a task is terminated, the "destroy()" method of the corresponding JobHandler is executed, which can be used for resource-cleanup logic.
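The rethrow rule can be isolated into a tiny helper. This is a self-contained illustration of the pattern in plain Java (it is not part of the XXL-JOB API):

```java
// Illustrative helper: rethrow InterruptedException so that the platform's
// "terminate task" interrupt is not swallowed by a broad catch block;
// any other exception is left for the caller to log and digest.
public class InterruptAware {

    public static void rethrowIfInterrupted(Exception e) throws InterruptedException {
        if (e instanceof InterruptedException) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw (InterruptedException) e;
        }
    }
}
```

Calling this at the top of a JobHandler's catch block preserves the termination behavior described above while still allowing ordinary exceptions to be logged.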

10. Delete the execution log

On the task log page, select the executor and task, then click the "Delete" button on the right: a "Log Cleanup" pop-up appears that offers several log-cleanup strategies; choose one and click "OK" to perform the cleanup;

11. Delete task

Click the Delete button to delete the corresponding task.

12. User management

Enter the "User Management" interface to view and manage user information;

Currently users are divided into two roles:

  • Administrator: has full permissions, including managing user accounts online and assigning permissions to users; permissions are assigned at the granularity of executors;
  • Ordinary user: only has operation permissions for the executors assigned to them and for the related tasks;


3.5. Overall design

3.5.1. Introduction to Source Code Directory

- /doc: documentation
- /db: table-creation scripts for the "scheduling database"
- /xxl-job-admin: the scheduling center (project source code)
- /xxl-job-core: common Jar dependency
- /xxl-job-executor-samples: executor sample projects (you can develop on top of these, or adapt an existing project into an executor project)

3.5.2. "Scheduling database" configuration

The XXL-JOB scheduling module is based on self-developed scheduling components and supports cluster deployment. The scheduling database table is described as follows:

- xxl_job_lock: task scheduling lock table;
- xxl_job_group: executor info table, maintaining task executor information;
- xxl_job_info: scheduling info table, storing the extended information of XXL-JOB scheduling tasks, such as task group, task name, machine address, executor, input parameters, alarm email, etc.;
- xxl_job_log: scheduling log table, storing the history of XXL-JOB task scheduling, such as scheduling result, execution result, input parameters, scheduling machine, executor, etc.;
- xxl_job_log_report: scheduling log report table, storing report data of XXL-JOB scheduling logs, used by the report page of the scheduling center;
- xxl_job_logglue: task GLUE log table, storing the GLUE update history to support the GLUE version-backtracking feature;
- xxl_job_registry: executor registry table, maintaining the machine addresses of online executors and scheduling centers;
- xxl_job_user: system user table;

3.5.3. Architecture design

3.5.3.1. Design thinking

Scheduling behavior is abstracted into a common "scheduling center" platform: the platform itself carries no business logic and is only responsible for initiating scheduling requests.

Tasks are abstracted into scattered JobHandlers, which are managed by the "executor"; the executor is responsible for receiving scheduling requests and executing the business logic of the corresponding JobHandler.

In this way, "scheduling" and "task" are decoupled from each other, improving the overall stability and scalability of the system;

3.5.3.2. System composition

Scheduling module (scheduling center):
Responsible for managing scheduling information and sending scheduling requests according to the scheduling configuration; it runs no business code. Decoupling the scheduling system from the tasks improves availability and stability, and the performance of the scheduling system is no longer limited by the task modules.
Supports visual, simple, dynamic management of scheduling information, including task creation, update, deletion, GLUE development, and task alarms, all of which take effect in real time; also supports monitoring scheduling results and execution logs, and supports executor failover.

Execution module (executor):
Responsible for receiving scheduling requests and executing task logic. The task module focuses purely on task execution, making development and maintenance simpler and more efficient.
Receives execution requests, termination requests, and log requests from the "scheduling center".

3.5.3.3. Architecture diagram


3.5.4. Analysis of scheduling module

3.5.4.1. Shortcomings of Quartz

As the leader in open-source job scheduling, Quartz is a natural first choice. In a cluster environment, Quartz manages tasks through its API, which avoids some of the problems above but introduces others:

  • Problem 1: operating tasks by calling the API is not user-friendly;
  • Problem 2: the business QuartzJobBean must be persisted to the underlying data tables, which is quite invasive to the system;
  • Problem 3: the scheduling logic and the QuartzJobBeans are coupled in the same project; as the number of scheduled tasks grows and the task logic becomes heavier, the performance of the scheduling system becomes severely limited by the business;
  • Problem 4: Quartz acquires DB locks in a "preemptive" way, and the node that wins the lock runs the task, which can lead to very unbalanced node load. XXL-JOB instead distributes tasks "cooperatively" through executors, fully exploiting the cluster so that the load of each node is balanced.

XXL-JOB addresses all of the above shortcomings of Quartz.

3.5.4.2. Self-developed scheduling module

XXL-JOB ultimately chose a self-developed scheduling component (early versions were based on Quartz): on the one hand to simplify the system and reduce redundant dependencies, and on the other to guarantee the controllability and stability of the system;

The "scheduling module" and the "task module" in XXL-JOB are completely decoupled. When the scheduling module schedules a task, it parses the task's parameters and initiates a remote call to the corresponding executor service. The model is similar to an RPC call: the scheduling center provides the invoking-proxy side, while the executor provides the remote-service side.

3.5.4.3. Scheduling center HA (cluster)

The cluster solution is database-based, using MySQL. When tasks are scheduled in a clustered, distributed, concurrent environment, each node writes the tasks to the database; at execution time the trigger is taken out of the database, and among triggers with the same name and fire time only one node executes the task.

3.5.4.4. Scheduling thread pool

Scheduling is implemented in a thread pool to avoid task scheduling delays caused by single-thread blocking.

3.5.4.5. Parallel scheduling

The XXL-JOB scheduling module adopts a parallel mechanism by default. In the case of multi-thread scheduling, the probability of the scheduling module being blocked is very low, which greatly improves the load capacity of the scheduling system.

Different tasks in XXL-JOB are scheduled and executed in parallel.
A single task runs in parallel across multiple executors, but serially on any single executor. Task termination is also supported.

3.5.4.6. Misfire processing strategy

Processing strategy when a task schedule misses its trigger time:

Possible causes:

  • the service was restarted;
  • the scheduling threads were blocked or exhausted;
  • the previous schedule kept blocking, so the next schedule was missed;

Processing strategy:

  • Overdue by more than 5 seconds: skip this run; the next trigger time is computed from the current time;
  • Overdue by 5 seconds or less: trigger once immediately; the next trigger time is computed from the current time;
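The 5-second rule above can be sketched as a small predicate. This is a simplification of the real trigger logic in xxl-job-admin, with the threshold taken from the description above:

```java
// Simplified sketch of the misfire rule: a trigger overdue by more than
// 5 seconds is skipped; one overdue by at most 5 seconds fires immediately.
// Either way, the next trigger time is computed from the current time.
public class MisfirePolicy {

    static final long THRESHOLD_MS = 5_000;

    /** Returns true if a missed trigger should still fire once immediately. */
    public static boolean shouldFireNow(long plannedTriggerMs, long nowMs) {
        long overdueMs = nowMs - plannedTriggerMs;
        return overdueMs >= 0 && overdueMs <= THRESHOLD_MS;
    }
}
```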

3.5.4.7. Log callback service

When the "scheduling center" of the scheduling module is deployed as a web service, on the one hand, it assumes the function of the scheduling center, and on the other hand, it also provides API services for the executors.

The callback API provided by the scheduling center is located at:

xxl-job-admin#com.xxl.job.admin.controller.JobApiController.callback

After receiving a task execution request, the executor executes the task and calls back the execution result to notify the "scheduling center" when execution finishes;

3.5.4.8. Task HA (Failover)

If the executors are deployed in a cluster, the dispatch center will perceive all online executors, such as "127.0.0.1:9997, 127.0.0.1:9998, 127.0.0.1:9999".

When the task's "routing policy" is set to "failover (FAILOVER)", each time the scheduling center initiates a schedule it sends heartbeat-detection requests to the executors in order; the first executor detected as alive is selected and receives the scheduling request.

After the scheduling is successful, you can view the "Scheduling Remarks" on the log monitoring interface, as follows:
"Scheduling Remarks" shows the trace of this schedule: the executor's "registration method" and "address list", and the task's "routing policy". Under the "FAILOVER" strategy, the scheduling center first heartbeats the first address; the heartbeat fails, so the address is skipped automatically; the second heartbeat also fails...

Until the third address "127.0.0.1:9999" is successfully detected by the heartbeat, it is selected as the "target executor"; then a scheduling request is sent to the "target executor", and the scheduling process ends, waiting for the executor to call back the execution result.

3.5.4.9. Scheduling log

Every time the scheduling center performs task scheduling, it will record a task log. The task log mainly includes the following three parts:

  1. Task information: including attributes such as "executor address", "JobHandler" and "execution parameters", which can be viewed by clicking the task ID button. According to these parameters, the specific machine and task code for task execution can be accurately located;
  2. Scheduling information: including "scheduling time", "scheduling result" and "scheduling log", etc., according to these parameters, you can know the specific situation when the "scheduling center" initiates a scheduling request.
  3. Execution information: including "execution time", "execution result" and "execution log", etc., according to these parameters, you can understand the specific situation of task execution at the "executor" end;

Scheduling log, for a single scheduling, the property description is as follows:

  • Executor address: the address of the machine on which the task executed;
  • JobHandler: for Bean-mode tasks, the name of the JobHandler that executed the task;
  • Task parameters: the input parameters of the task execution;
  • Scheduling time: the time when the scheduling center initiated this schedule;
  • Scheduling result: the result of initiating the schedule, SUCCESS or FAIL;
  • Scheduling remarks: remarks from the scheduling center about the schedule, such as address heartbeat-detection logs;
  • Execution time: the time of the executor's callback after the task execution finished;
  • Execution result: the result of the task execution on the executor, SUCCESS or FAIL;
  • Execution remarks: remarks from the executor about the execution, such as exception logs;
  • Execution log: the complete log printed by the business code during task execution; see "4.8 Check the execution log";

3.5.4.10. Task dependency

Principle: each task in XXL-JOB has a task ID, and every task supports setting a "subtask ID" attribute; task dependencies can therefore be matched through task IDs.

When a parent task finishes executing successfully, its "subtask IDs" are matched against existing tasks, and every matched subtask is actively triggered to execute.

On the task log page, click the "View" button in the task's "Execution Remarks" to see the log of subtask matching and triggering; if no such information is present, no subtask execution was triggered.

3.5.4.11. Fully asynchronous & lightweight

Fully asynchronous design: the business logic in XXL-JOB executes on remote executors, and the trigger flow is designed to be fully asynchronous. Compared with executing business logic directly in the scheduling center, this greatly reduces the time occupied by scheduling threads;

Asynchronous scheduling: each task trigger sends only one scheduling request; the request is first pushed to an "asynchronous scheduling queue" and then pushed asynchronously to the remote executor;

Asynchronous execution: the executor stores the request in an "asynchronous execution queue" and immediately responds to the scheduling center; the task itself runs asynchronously;

Lightweight design: the per-job logic in the XXL-JOB scheduling center is very "light"; on top of full asynchrony, the average processing time of a single job is basically within "10 ms" (essentially the network overhead of one request). Limited threads can therefore support a large number of concurrently running jobs;

Thanks to the optimizations above, the scheduling center under the default configuration can, in theory, support 5,000 tasks running concurrently and stably on a single machine;

In real scenarios, this upper limit fluctuates with the network ping delay between the scheduling center and the executors, DB read/write latency, and the intensity of task scheduling.

To support more tasks, you can increase the number of scheduling threads, reduce the ping delay between the scheduling center and the executors, and upgrade the machine configuration.

3.5.4.12. Balanced scheduling

When the scheduling center is deployed as a cluster, it automatically distributes tasks evenly: each time, the trigger component fetches a number of tasks related to the thread-pool size (the scheduling thread-pool size is customizable), avoiding a large number of tasks piling up on a single cluster node;

3.5.5. Analysis of task "operation mode"

3.5.5.1. "Bean mode" tasks

Development steps: refer to "Chapter 3";
Principle: each Bean-mode task is a Spring Bean instance maintained in the Spring container of the "executor" project. The task class is annotated with "@JobHandler(value="name")", which the executor uses to identify tasks in the Spring container. The task class inherits the unified interface "IJobHandler", and the task logic is developed in its execute method: when the executor receives a scheduling request from the scheduling center, it calls the execute method of the matching "IJobHandler" to run the task logic.
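A Bean-mode task then looks roughly like this. The IJobHandler and ReturnT types below are minimal stand-ins so the sketch is self-contained; in a real project they come from xxl-job-core, and the handler class would additionally carry the @JobHandler(value = "...") annotation described above:

```java
// Minimal stand-ins for the xxl-job-core types (illustration only).
abstract class IJobHandler {
    public abstract ReturnT<String> execute(String param) throws Exception;
}

class ReturnT<T> {
    public static final int SUCCESS_CODE = 200;
    public final int code;
    public final String msg;
    public ReturnT(int code, String msg) { this.code = code; this.msg = msg; }
}

// The executor looks this handler up by its @JobHandler value and calls
// execute() on every scheduling request.
class DemoJobHandler extends IJobHandler {
    @Override
    public ReturnT<String> execute(String param) throws Exception {
        // ... business logic here ...
        return new ReturnT<>(ReturnT.SUCCESS_CODE, null);
    }
}
```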

3.5.5.2. "GLUE mode (Java)" tasks

Development steps: refer to "Chapter 3";
Principle: the code of each "GLUE mode (Java)" task is in fact the source of a class implementing "IJobHandler". When the executor receives a scheduling request from the scheduling center, it loads this code via the Groovy class loader, instantiates it as a Java object, and injects the Spring services declared in the code (please ensure the services and classes referenced in the GLUE code exist in the executor project); then it calls the object's execute method to run the task logic.

3.5.5.3. GLUE mode (Shell) / (Python) / (PHP) / (NodeJS) / (PowerShell)

Development steps: refer to "Chapter 3";
Principle: the source of a script task is hosted in the scheduling center, while the script logic runs on the executor. When a script task is triggered, the executor loads the script source, generates a script file on the executor machine, and invokes the script from Java code; the script's output is written to the task log file in real time, so the scheduling center can monitor script execution in real time;

Currently supported script types are as follows:

- Shell scripts: supported when the run mode is "GLUE mode (Shell)";
- Python scripts: supported when the run mode is "GLUE mode (Python)";
- PHP scripts: supported when the run mode is "GLUE mode (PHP)";
- NodeJS scripts: supported when the run mode is "GLUE mode (NodeJS)";
- PowerShell scripts: supported when the run mode is "GLUE mode (PowerShell)";

A script task's execution result is judged by its exit code; the status codes are described in the chapter "Description of task execution results";

3.5.5.4. Executor

The executor is actually an embedded server with a default port of 9999 (configuration item: xxl.job.executor.port).

When the project starts, the executor will identify the "Bean mode task" in the Spring container through "@JobHandler", and manage it with the value attribute of the annotation as the key.

When the "executor" receives a scheduling request from the "scheduling center": if the task type is "Bean mode", it matches the Bean-mode task in the Spring container and calls its execute method to run the task logic; if the task type is "GLUE mode", it loads the GLUE code, instantiates a Java object, injects the Spring services it depends on (note: the Spring services injected in GLUE code must exist in the Spring container of the executor project), and then calls the execute method to run the task logic.

3.5.5.5. Task log

XXL-JOB generates a separate log file for each scheduling request; execution logs must be printed through "XxlJobHelper.log", and the "scheduling center" loads the corresponding log file when the execution log is viewed.

(Historical versions did this by rewriting LOG4J's Appender, which imposed dependency restrictions; that approach has been abandoned in the new version.)

The log file location can be customized in the "executor" configuration file. The default directory format is: /data/applogs/xxl-job/jobhandler/[formatted date]/[primary-key ID of the scheduling-log record].log.

When child threads are started in a JobHandler, the child threads print their logs into the execution log of the parent JobHandler thread, making log tracing easier.

3.5.6. Analysis of communication module

3.5.6.1. A complete task-scheduling communication flow

- 1. The "scheduling center" sends an HTTP scheduling request to the "executor": the service receiving the request in the executor is an embedded server, default port 9999;
- 2. The "executor" executes the task logic;
- 3. The "executor" calls back the scheduling result to the "scheduling center" over HTTP: the callback is received by a set of API services the scheduling center exposes to executors;

3.5.6.2. Communication data encryption

When the scheduling center sends a scheduling request to an executor, the request parameters and response data are encapsulated in two objects, RequestModel and ResponseModel. Before communication, the bottom layer serializes both objects and performs data-protocol and timestamp checks, thereby achieving the data "encryption" function;

3.5.7. Task registration, automatic task discovery

Since v1.5, tasks no longer have a "task execution machine" property; instead, the address of the remote executor is obtained dynamically through task registration and automatic discovery.

AppName: the unique identifier of each executor machine cluster; task registration uses the "executor" as the smallest granularity, and each task perceives the corresponding machine list through its bound executor;
Registry table: see "xxl_job_registry"; an "executor" periodically maintains a registration record, i.e. the binding between machine address and AppName, so the "scheduling center" can dynamically perceive the online machine list of each AppName;
Executor registration: the registration beat period defaults to 30s; executors register once per beat, the scheduling center performs dynamic task discovery once per beat, and registration information expires after three beats;
Executor de-registration: when an executor is destroyed, it actively reports to the scheduling center and removes its machine information, improving the timeliness of heartbeat registration;

To keep the system "lightweight" and reduce learning and deployment costs, Zookeeper is not used as the registry; task registration and discovery are done through the DB instead;

3.5.8. Task execution result

Since v1.6.2, the task execution result is determined by the return value "ReturnT" of "IJobHandler":
when "ReturnT.code == ReturnT.SUCCESS_CODE" the task succeeded; otherwise it failed, and an error message can be called back to the scheduling center via "ReturnT.msg". The execution result can thus be controlled conveniently inside the task logic;

3.5.9. Shard broadcast & dynamic sharding

When executors are deployed as a cluster and the task's routing strategy is set to "shard broadcast", one schedule broadcast-triggers all executors in the corresponding cluster to run the task once, and the system automatically passes sharding parameters; sharded tasks can be developed based on these parameters;

"Shard broadcast" shards by executor and supports dynamic scaling of the executor cluster, dynamically increasing the number of shards that cooperate on the business; it can significantly improve task processing capacity and speed for large volumes of business data.

The development flow of a "shard broadcast" task is the same as that of an ordinary task; the difference is that the sharding parameters can be obtained and used for sharded business processing.

Obtaining sharding parameters in Java tasks (BEAN mode, GLUE mode (Java)):

// See the sample task "ShardingJobHandler" in the sample executor project
int shardIndex = XxlJobHelper.getShardIndex();
int shardTotal = XxlJobHelper.getShardTotal();

Obtaining sharding parameters in script tasks (GLUE mode (Shell), GLUE mode (Python), GLUE mode (NodeJS)):

# A script task receives exactly three arguments, in order: task parameter, shard index, shard total.
# Taking a Shell-mode task as an example, the sharding parameters are obtained as follows:
echo "shard index = $2"
echo "shard total = $3"

Sharding parameter description:

index: the current shard index (starting from 0), i.e. the rank of the current executor in the executor cluster list;
total: the total number of shards, i.e. the total number of machines in the executor cluster;

This feature applies to scenarios such as:

1. Sharded task scenario: a cluster of 10 executors handles 100,000 rows of data; each machine only needs to handle 10,000 rows, cutting the elapsed time by a factor of 10;

2. Broadcast task scenario: broadcast to all executor machines to run a shell script, update the cache on every cluster node, and so on.
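For the sharded-data scenario, a common pattern (sketched here under the assumption that each record has a numeric ID) is to let each executor claim only the records whose ID maps to its own shard; shardIndex and shardTotal would come from XxlJobHelper in a real task:

```java
// Sketch: split data processing across the executor cluster. Each record
// belongs to exactly one shard, so every executor processes a disjoint subset.
public class ShardFilter {

    public static boolean belongsToShard(long recordId, int shardIndex, int shardTotal) {
        return recordId % shardTotal == shardIndex;
    }
}
```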

3.5.10. Access token (AccessToken)

To improve system security, the scheduling center and the executors verify each other; communication is allowed only when the AccessToken of both sides matches;

The scheduling center and the executor set the AccessToken through the configuration item "xxl.job.accessToken".

Only two configurations allow normal communication:

  • Configuration 1: neither the scheduling center nor the executor sets an AccessToken, and security verification is turned off;
  • Configuration 2: the scheduling center and the executor set the same AccessToken;
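For example, both sides' property files would then carry the same line (the token value below is a placeholder; the property key is the one named above):

```properties
# application.properties of the scheduling center (xxl-job-admin)
xxl.job.accessToken=my_shared_token

# application.properties of the executor project
xxl.job.accessToken=my_shared_token
```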

3.5.11. Failover & retry on failure

A complete task flow consists of two phases: "scheduling (scheduling center)" + "execution (executor)":

"Failover" happens in the scheduling phase. When executors are deployed as a cluster and one executor fails, this strategy automatically fails over to a healthy executor machine and completes the scheduling request.

"Failure retry" happens in both the "scheduling" and "execution" phases. The number of retries on task failure is configurable; when a task fails, it is actively retried according to the preset retry count;

3.5.12. Executor grayscale release

The scheduling center is decoupled from the business: it needs to be deployed only once and requires little maintenance afterwards. Business jobs, however, are hosted and run in the executor, and the executor must be restarted when jobs are released or changed, especially Bean-mode tasks.

An executor restart may interrupt running tasks. Thanks to its self-built executor and self-built registry, XXL-JOB supports grayscale releases that avoid task interruption caused by restarts.

Proceed as follows:

  1. Switch the executor to manual registration; take half of the machines (group A) offline, leaving the other half (group B) running online;
  2. Wait for the tasks on the group A machines to finish, then build and release group A; replace the executor registration addresses with group A;
  3. Wait for the tasks on the group B machines to finish, then build and release group B; replace the executor registration addresses with group A + group B;

The release is now complete.

3.5.13. Description of task execution results

The system judges a task's execution result according to the handler return value (Bean and GLUE(Java) tasks) or the script exit code (script tasks), as described in the sections above.

3.5.14. Task timeout control

A task timeout can be configured; when a task runs past the timeout, it is actively interrupted;

Note that timeout interruption works like the task termination mechanism (see "Terminate running tasks" above): the task is interrupted via "interrupt", so the business code must let "InterruptedException" propagate upward, otherwise this feature does not work.

3.5.15. Cross-language

XXL-JOB is a cross-language task scheduling platform, which is mainly reflected in the following aspects:

  1. RESTful API: the scheduling center and the executor both provide language-independent RESTful API services, so any third-party language can connect to the scheduling center or implement an executor (refer to the chapter "Scheduling Center/Executor RESTful API");
  2. Multi-task modes: more than a dozen task modes are provided, such as Java, Python, and PHP (refer to the chapter "Analysis of task 'operation mode'"); in principle a task mode can be added for any language;
  3. An HTTP-based task handler is provided (Bean task, JobHandler="httpJobHandler"); the business side only needs to provide the HTTP link and related information, with no restriction on language or platform (refer to the chapter "Native Built-in Bean Mode Task");

3.5.16. Task failure alarm

Email failure alarms are provided by default; SMS, DingTalk, and other channels can be added. To add an alarm channel, simply add an implementation of the "com.xxl.job.admin.core.alarm.JobAlarm" interface; refer to the default "EmailJobAlarm", which provides email alarms.

3.5.17. Building the scheduling center's Docker image

The scheduling center can be quickly built and started with the following commands:

mvn clean package
docker build -t xuxueli/xxl-job-admin ./xxl-job-admin
docker run --name xxl-job-admin -p 8080:8080 -d xuxueli/xxl-job-admin

3.5.18. Avoid repeated task execution

Intensive scheduling or long-running tasks may cause blocking, and in a cluster the scheduling component may, with small probability, trigger repeatedly. To avoid repeated execution, combine a "single-machine routing strategy" (e.g. first, consistent hash) with a "blocking strategy" (e.g. serial execution on a single machine, discard subsequent scheduling).

3.5.19. Command line tasks

A generic command-line task handler is provided natively (Bean task: "CommandJobHandler"); the business side only needs to supply the command line.
For example, with the task parameter "pwd", the handler executes the command and outputs its result;
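The core of such a command handler can be sketched in plain Java. This is an illustration, not the actual CommandJobHandler source, and it assumes a POSIX "sh" is available on the executor machine:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Illustrative sketch of a command-line task handler: run the command passed
// as the task parameter and treat exit code 0 as success.
public class CommandRunner {

    public static int run(String command) throws Exception {
        Process process = new ProcessBuilder("sh", "-c", command)
                .redirectErrorStream(true)
                .start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // a real handler would use XxlJobHelper.log(...)
            }
        }
        return process.waitFor(); // 0 = success, non-zero = failure
    }
}
```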

3.5.20. Automatic log cleaning

The XXL-JOB log mainly includes the following two parts, both of which support automatic log cleaning, as follows:

  • Log table data in the dispatch center: You can use the configuration item "xxl.job.logretentiondays" to set the number of days to save the log table data, and the expired logs will be automatically cleared; for details, please refer to the above configuration instructions;
  • Executor log file data: You can use the configuration item "xxl.job.executor.logretentiondays" to set the number of days to save log file data, and expired logs will be automatically cleared; for details, please refer to the above configuration instructions;
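Assuming the default property files, the two retention settings look like this (30 is an example value):

```properties
# xxl-job-admin (scheduling center): days to keep scheduling-log table data
xxl.job.logretentiondays=30

# executor: days to keep execution log files
xxl.job.executor.logretentiondays=30
```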

3.5.21. Handling lost scheduling results

An executor may lose the task result callback due to network jitter or abnormal conditions such as a crash. Since the scheduling center relies on the executor's callback to learn the result, the scheduling log would then remain in the "running" state forever.

To solve this, the scheduling center provides a built-in component whose logic is: if a scheduling record has stayed in the "running" state for more than 10 minutes and the heartbeat registration of the corresponding executor has failed (offline), the schedule is actively marked as failed;

3.6. RESTful API of the scheduling center / executor

XXL-JOB is a cross-platform, cross-language task scheduling specification and protocol.

Java applications can directly use the official scheduling center and executor; refer to the "Quick Start" section above.

For non-Java applications, multi-language support can be easily realized with the help of XXL-JOB's standard RESTful API.

  • Scheduling center RESTful API:

Description: the API the scheduling center provides to executors; it is not limited to the official executors — third parties can implement their own executor against this API;
API list: executor registration, task result callback, etc.;

  • Executor RESTful API:

Description: the API the executor provides to the scheduling center; the official executor implements it by default, while third-party executors need to implement it themselves in order to connect to the scheduling center;
API list: task trigger, task termination, task log query, etc.;

These RESTful APIs mainly serve custom executors written in non-Java languages, enabling cross-language support. In addition, if there is a need to operate the scheduling center programmatically, the "scheduling center RESTful API" can be extended for that purpose.

3.6.1. Scheduling center RESTful API

API service location: com.xxl.job.core.biz.AdminBiz ( com.xxl.job.admin.controller.JobApiController )
API service request reference code: com.xxl.job.adminbiz.AdminBizTest

a. Task callback

Description: used by the executor to report the task result after task execution
------
URL format: {scheduling center root address}/api/callback
Header:
    XXL-JOB-ACCESS-TOKEN : {access token}
Request data format (JSON, placed in the RequestBody):
    [{
        "logId": 1,             // ID of this scheduling log
        "logDateTim": 0,        // time of this scheduling log
        "handleCode": 200,      // 200 = task executed successfully, 500 = failed
        "handleMsg": null       // handling message
    }]
Response data format:
    {
      "code": 200,      // 200 = success; any other value = failure
      "msg": null       // error message
    }
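A third-party executor could produce the callback request above along these lines (a sketch: the scheduling center address and access token are placeholders, and the JSON body is assembled by hand to keep the example dependency-free — a real executor would POST it with any HTTP client):

```java
// Sketch of building the /api/callback request shown above. The scheduling
// center address and access token are placeholders; a real executor would POST
// the body with an HTTP client and set the XXL-JOB-ACCESS-TOKEN header.
public class CallbackRequest {

    // Assembles the JSON array body for a single task result.
    public static String buildBody(long logId, long logDateTim, int handleCode, String handleMsg) {
        return String.format(
            "[{\"logId\":%d,\"logDateTim\":%d,\"handleCode\":%d,\"handleMsg\":%s}]",
            logId, logDateTim, handleCode,
            handleMsg == null ? "null" : "\"" + handleMsg + "\"");
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/xxl-job-admin/api/callback"; // placeholder root address
        String token = "default_token";                                  // placeholder access token
        String body = buildBody(1L, System.currentTimeMillis(), 200, null);
        System.out.println("POST " + url);
        System.out.println("XXL-JOB-ACCESS-TOKEN: " + token);
        System.out.println(body);
    }
}
```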

b. Executor registration

Description: used when an executor registers; the scheduling center senses successfully registered executors in real time and dispatches tasks to them
------
URL format: {scheduling center root address}/api/registry
Header:
    XXL-JOB-ACCESS-TOKEN : {access token}
Request data format (JSON, placed in the RequestBody):
    {
        "registryGroup": "EXECUTOR",                    // fixed value
        "registryKey": "xxl-job-executor-example",      // executor AppName
        "registryValue": "http://127.0.0.1:9999/"       // executor address (root address of its embedded service)
    }
Response data format:
    {
      "code": 200,      // 200 = success; any other value = failure
      "msg": null       // error message
    }
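For a non-Java executor, the registration body is small enough to build by hand (a sketch with the example appname and address from above). Note that registrations serve as heartbeats, so an executor is expected to re-send this request periodically to stay "online":

```java
// Sketch of the /api/registry body a third-party executor would send.
// Registrations double as heartbeats, so the executor should repeat this
// call periodically; otherwise the scheduling center treats it as offline.
public class RegistryRequest {

    public static String buildBody(String appName, String executorAddress) {
        return String.format(
            "{\"registryGroup\":\"EXECUTOR\",\"registryKey\":\"%s\",\"registryValue\":\"%s\"}",
            appName, executorAddress);
    }

    public static void main(String[] args) {
        System.out.println(buildBody("xxl-job-executor-example", "http://127.0.0.1:9999/"));
    }
}
```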

c. Executor registration removal

Description: used to remove an executor's registration; a removed executor no longer participates in task scheduling and execution
------
URL format: {scheduling center root address}/api/registryRemove
Header:
    XXL-JOB-ACCESS-TOKEN : {access token}
Request data format (JSON, placed in the RequestBody):
    {
        "registryGroup": "EXECUTOR",                    // fixed value
        "registryKey": "xxl-job-executor-example",      // executor AppName
        "registryValue": "http://127.0.0.1:9999/"       // executor address (root address of its embedded service)
    }
Response data format:
    {
      "code": 200,      // 200 = success; any other value = failure
      "msg": null       // error message
    }

3.6.2. Executor RESTful API

API service location: com.xxl.job.core.biz.ExecutorBiz
API service request reference code: com.xxl.job.executorbiz.ExecutorBizTest

a. Heartbeat detection

Description: used by the scheduling center to check whether an executor is online
------
URL format: {executor embedded-service root address}/beat
Header:
    XXL-JOB-ACCESS-TOKEN : {access token}
Request data: none
Response data format:
    {
      "code": 200,      // 200 = success; any other value = failure
      "msg": null       // error message
    }

b. Busy detection

Description: used by the scheduling center to check whether the specified task is busy (running) on the specified executor
------
URL format: {executor embedded-service root address}/idleBeat
Header:
    XXL-JOB-ACCESS-TOKEN : {access token}
Request data format (JSON, placed in the RequestBody):
    {
        "jobId": 1      // task ID
    }
Response data format:
    {
      "code": 200,      // 200 = success; any other value = failure
      "msg": null       // error message
    }

c. Trigger tasks

Origin blog.csdn.net/YYBDESHIJIE/article/details/132142590