Distributed Task Scheduling Framework XXL-JOB


Most distributed scheduling platforms reach the same goal by different routes.

About XXL-JOB

1. Concept

XXL-JOB is a lightweight distributed task scheduling platform. It is designed to be quick to develop against, easy to learn, lightweight, easy to deploy, and easy to extend, and its code is still actively maintained. XXL-JOB itself is only a task scheduling console and does not undertake business logic. It is commonly used to run scheduled tasks. Individual features are not covered here; the focus is on the overall process and the underlying principles.

2. Structure diagram


Early versions of xxl-job were built on Quartz, but changed the way tasks are dispatched: scheduling is implemented through executor registration and RPC calls.
Before version 2.1.0, the core scheduling module was based on the Quartz framework. Starting with version 2.1.0, a self-developed scheduling component was introduced, the Quartz dependency was removed, and time wheel scheduling is used instead.
(The underlying RPC also changed: 2.0.1 uses RPC over a Jetty server, and 2.0.2 uses RPC over a Netty server.)

3. The process of task scheduling

1: On the XXL-JOB admin platform, create an executor (the address where the Job actually runs)
2: On the XXL-JOB admin platform, create a new task and select the corresponding executor
3: In the Job server code, use JobHandler to mark the class that contains the Job's execution method
4: When the task is due, the XXL-JOB admin scheduling platform first looks up the executor configured for the task, then calls the server at that executor address to run the task

The platform itself does not undertake business logic; the "dispatch center" is only responsible for initiating dispatch requests.
Tasks are abstracted into separate JobHandlers, which are managed by the "executor". The "executor" is responsible for receiving scheduling requests and executing the business logic in the corresponding JobHandler.
As a result, "scheduling" and "task" are decoupled from each other, which improves the overall stability and scalability of the system.
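
The split between dispatching and executing can be illustrated with a minimal sketch. This is not XXL-JOB's actual API: the class name `ExecutorSketch`, the method names, and the use of `Function<String, String>` as a handler type are assumptions made for illustration, and the RPC layer is replaced by a plain method call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: an "executor" that owns named handlers and runs
// the one that a dispatch request asks for. Not the real XXL-JOB classes.
class ExecutorSketch {
    // Handler name -> business logic (takes the job parameter, returns a result message).
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    void register(String name, Function<String, String> handler) {
        handlers.put(name, handler);
    }

    // Stands in for the entry point the dispatch center would reach over RPC.
    String onDispatch(String handlerName, String param) {
        Function<String, String> handler = handlers.get(handlerName);
        if (handler == null) {
            return "FAIL: no handler named " + handlerName;
        }
        return handler.apply(param);
    }

    public static void main(String[] args) {
        ExecutorSketch executor = new ExecutorSketch();
        executor.register("reportJob", param -> "report generated for " + param);
        System.out.println(executor.onDispatch("reportJob", "2024-01"));
        System.out.println(executor.onDispatch("missingJob", ""));
    }
}
```

The dispatch side only needs a handler name and a parameter, so business code can change without touching the scheduler.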

How is the scheduled triggering of tasks implemented? With a time wheel

The xxl_job_info table is the db table that records scheduled tasks. It has a trigger_next_time (Long) field that stores the time of the task's next trigger. Whenever the task's schedule is modified, and after each trigger, the next trigger timestamp is recalculated from the cron expression:

Date nextValidTime = new CronExpression(jobInfo.getJobCron()).getNextValidTimeAfter(new Date());

and the trigger_next_time field is updated with the result.

Timing execution task logic:

  • Timed task scheduleThread: continuously reads from the db the tasks due within the next 5 seconds, triggers them immediately or puts them in the time wheel to wait for their trigger, and updates trigger_next_time
  • Get the current time now
  • Poll the db for tasks whose trigger_next_time is earlier than now + 5 seconds

3.1 For tasks whose trigger time has passed by more than 5 seconds

  (1) Skip directly without execution;
  (2) Reset trigger_next_time

3.2 For tasks whose trigger time has passed by no more than 5 seconds

  (1) Open a thread to execute the trigger logic;
  (2) If the next trigger time of the task is within 5 seconds, put it in the time wheel (Map<Integer, List>: second (0-59) => task id list);
  (3) Reset trigger_next_time

3.3 For tasks whose trigger time has not been reached yet

  (1) Put it directly into the time wheel;
  (2) Reset trigger_next_time

  • Timed task ringThread: the time wheel triggers tasks at their exact second

4.1 Time wheel data structure: Map<Integer, List>, where the key is the second (0-59) and the value is the list of task ids

  • Get the current time in seconds
  • Take the task id lists for the current second and the preceding second out of the time wheel (checking one tick back guards against skipping a tick when processing takes too long), then trigger the tasks one by one
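
The two threads described above can be sketched as follows. This is a simplified illustration, not the project's code: the class name `TimeWheelSketch` and its methods are hypothetical, the db polling and the recalculation of trigger_next_time are left out, and "triggering" just appends the task id to a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, single-threaded sketch of the scheduleThread / ringThread logic.
class TimeWheelSketch {
    static final long PRE_READ_MS = 5000; // the 5-second window from the text

    // Second of the minute (0-59) -> ids of the tasks that must fire in that second.
    final Map<Integer, List<Integer>> ring = new HashMap<>();
    // Stands in for the real trigger call.
    final List<Integer> triggered = new ArrayList<>();

    // scheduleThread step for one task; returns the branch that was taken.
    String schedule(int jobId, long triggerTime, long now) {
        if (now > triggerTime + PRE_READ_MS) {
            return "skipped";          // 3.1: overdue by more than 5 s
        } else if (now >= triggerTime) {
            triggered.add(jobId);      // 3.2: overdue by at most 5 s, trigger now
            return "triggered";
        } else {
            push(jobId, triggerTime);  // 3.3: not due yet, wait in the wheel
            return "queued";
        }
    }

    void push(int jobId, long triggerTime) {
        int second = (int) ((triggerTime / 1000) % 60);
        ring.computeIfAbsent(second, k -> new ArrayList<>()).add(jobId);
    }

    // ringThread step: drain the current tick and the one before it.
    List<Integer> tick(long now) {
        int nowSecond = (int) ((now / 1000) % 60);
        List<Integer> due = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            List<Integer> ids = ring.remove((nowSecond + 60 - i) % 60);
            if (ids != null) {
                due.addAll(ids);
            }
        }
        triggered.addAll(due);
        return due;
    }

    public static void main(String[] args) {
        TimeWheelSketch wheel = new TimeWheelSketch();
        long now = 120_000L; // arbitrary timestamp in ms, falls on second 0
        System.out.println(wheel.schedule(1, now - 6_000, now)); // skipped
        System.out.println(wheel.schedule(2, now - 1_000, now)); // triggered
        System.out.println(wheel.schedule(3, now + 3_000, now)); // queued
        System.out.println(wheel.tick(now + 3_000));             // [3]
    }
}
```

Because the wheel only holds tasks due within the next few seconds, 60 slots are enough, and looking one slot back covers a tick that was missed while the previous batch was still being triggered.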

How to prevent multiple servers in the cluster from scheduling tasks at the same time?

When the xxl-job application itself is deployed as a cluster (for high availability, HA), how does it keep multiple servers from scheduling the same tasks at the same time?
A distributed lock is implemented with a MySQL pessimistic lock (a for update statement):

  • setAutoCommit(false) turns off implicit autocommit and starts a transaction
  • select lock for update (an explicit exclusive lock; other transactions are blocked and cannot acquire their own for update)
  • Read db task information -> push tasks into the in-memory time wheel -> update db task information
  • commit commits the transaction and at the same time releases the exclusive for update lock (pessimistic lock)
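
The four steps above might look like this in plain JDBC. This is a sketch under assumptions: the lock table and row named in `LOCK_SQL`, the class name `ScheduleLockSketch`, and the `scheduleOnce` callback are all illustrative and are not taken from the text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch of a db-backed pessimistic lock around one scheduling pass.
class ScheduleLockSketch {
    // Assumed lock table and row; any dedicated single-row table would do.
    static final String LOCK_SQL =
            "select * from xxl_job_lock where lock_name = 'schedule_lock' for update";

    static void runUnderScheduleLock(Connection conn, Runnable scheduleOnce) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);              // start an explicit transaction
        try (PreparedStatement ps = conn.prepareStatement(LOCK_SQL)) {
            ps.execute();                       // blocks until this node holds the row lock
            scheduleOnce.run();                 // read tasks, fill the time wheel, update the db
            conn.commit();                      // commit also releases the row lock
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                    // rollback releases the lock as well
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Running it requires a JDBC connection to a database that contains the assumed lock row. While one node is inside `scheduleOnce`, every other node calling the same method waits at `ps.execute()`.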

How is the task executor registry implemented?

The db table xxl_job_group records the executor information:

The executor AppName, the executor name (title), and the executor address list address_list (multiple addresses separated by commas)

How is routing across task executors implemented?

When executors are deployed as a cluster, a rich set of routing strategies is available, including:

First, last, round robin, random, consistent hash, least frequently used, least recently used, failover, busyover, etc.

  • First, last, round robin, random: simply read from address_list
  • Consistent hash: a TreeMap implements the consistent hash algorithm
  • Least frequently used, least recently used: HashMap, LinkedHashMap
  • Failover: while traversing address_list, check the heartbeat of each address in turn (the request returns the status); only an address with a normal heartbeat is returned for use
  • Busyover: while traversing address_list, check whether each address is busy in turn (the request returns the status); only an address whose status is idle is returned for use
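
A few of these strategies can be sketched in a handful of lines. The code below is an illustration under assumptions, not the project's implementation: the class name `RouteSketch`, the virtual node count, the MD5-based hash, and the `isAlive` predicate that stands in for the heartbeat request are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch of several routing strategies over an address list.
class RouteSketch {
    private static final int VIRTUAL_NODES = 100; // assumed replicas per address
    private final AtomicInteger counter = new AtomicInteger();

    static String first(List<String> addresses) {
        return addresses.get(0);
    }

    static String last(List<String> addresses) {
        return addresses.get(addresses.size() - 1);
    }

    String roundRobin(List<String> addresses) {
        return addresses.get(Math.floorMod(counter.getAndIncrement(), addresses.size()));
    }

    // 32-bit hash taken from the first four bytes of an MD5 digest.
    static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Consistent hash: place virtual nodes on a sorted ring, then walk clockwise from the job's hash.
    static String consistentHash(int jobId, List<String> addresses) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String address : addresses) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                ring.put(hash(address + "#" + i), address);
            }
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(String.valueOf(jobId)));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Failover: return the first address whose heartbeat check passes, or null if none does.
    static String failover(List<String> addresses, Predicate<String> isAlive) {
        for (String address : addresses) {
            if (isAlive.test(address)) {
                return address;
            }
        }
        return null;
    }
}
```

With consistent hashing the same job id keeps landing on the same executor as long as the address list is unchanged, which is useful when a job benefits from local state. Busyover would follow the same shape as `failover`, with an idle check in place of the heartbeat.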

How are task sharding and parallel execution implemented?

  • Fetch the list of executor machines for the task, assign each one an index/total pair, and send the index/total to that task executor
  • The task executor can then implement sharded tasks based on the index/total parameters
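
A minimal sketch of both sides of sharding, assuming the work is a list of numeric record ids split by modulo. The class name `ShardingSketch`, the method names, and the modulo rule are illustrative choices; the text only specifies that each executor receives index/total.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: dispatching index/total and using it on the executor side.
class ShardingSketch {
    // Dispatch side: one "index/total" parameter per executor address.
    static List<String> shardParams(List<String> addresses) {
        List<String> params = new ArrayList<>();
        for (int i = 0; i < addresses.size(); i++) {
            params.add(i + "/" + addresses.size());
        }
        return params;
    }

    // Executor side: keep only the records that belong to this shard.
    static List<Integer> myShare(List<Integer> recordIds, int index, int total) {
        List<Integer> mine = new ArrayList<>();
        for (int id : recordIds) {
            if (Math.floorMod(id, total) == index) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        System.out.println(shardParams(List.of("host-a", "host-b", "host-c"))); // [0/3, 1/3, 2/3]
        System.out.println(myShare(List.of(1, 2, 3, 4, 5, 6), 0, 3));           // [3, 6]
    }
}
```

Every record id maps to exactly one index, so the executors process disjoint slices in parallel and together cover the whole data set.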


Learning 1: https://www.kuaiyong.icu/xxl_job%e5%ae%9a%e6%97%b6%e4%bb%bb%e5%8a%a1%e7%ae%a1%e7%90%86 %e5%b9%b3%e5%8f%b0/

Learning 2: https://blog.csdn.net/weixin_40816738/article/details/123720235


Origin blog.csdn.net/qq_41810415/article/details/132611477