Distributed job scheduling framework - ElasticJob

1. Introduction

ElasticJob is a distributed scheduling solution for Internet-scale workloads and massive numbers of tasks. It consists of two independent sub-projects, ElasticJob-Lite and ElasticJob-Cloud. Through elastic scheduling, resource management and control, and job governance, it provides a distributed scheduling solution suited to Internet scenarios, and its open architecture supports a diverse job ecosystem. Both products share a unified job API, so developers write a job once and deploy it at will.

1.1. Development history

ElasticJob is a Java distributed scheduled-task framework built on ZooKeeper and Quartz, developed and open-sourced by Dangdang architects Zhang Liang, Cao Hao, and Jiang Shujian. It addresses Quartz's lack of distributed support. Its main features include elastic scaling, centralized job management and monitoring through ZooKeeper, and failover, capabilities that schedulers such as Quartz do not offer out of the box.

After alpha, beta, and RC1 versions, ElasticJob 3.0.0 has been officially released. It is the first official release since the project was restarted on May 28, 2020 and became a sub-project of Apache ShardingSphere.

1.2. Open source license

ElasticJob became a sub-project of Apache ShardingSphere on May 28, 2020 and is released under the Apache License 2.0.

2. Function list

2.1. Elastic scheduling

  • Supports task sharding and high availability in distributed scenarios
  • Scales task throughput and execution efficiency horizontally
  • Task processing capacity scales elastically with the resources allocated

2.2. Resource allocation

  • Assign the right resource to the task at the right time and make it effective
  • The same task is aggregated to the same executor for unified processing
  • Dynamically allocate additional resources to newly assigned tasks

2.4. Job management

  • Failover
  • Rerun of missed jobs
  • Self-diagnosis and repair

2.5. Job dependency (TODO)

  • Inter-job dependencies based on directed acyclic graph (DAG)
  • Dependency between job shards based on directed acyclic graph (DAG)

2.6. Open job ecosystem

  • Extensible unified interface for job types
  • Rich job type library, such as data flow, script, HTTP, file, big data, etc.
  • Easy to integrate with business jobs, with seamless support for Spring dependency injection

2.7. Visual control terminal

  • Job control terminal
  • Job execution history tracking
  • Registry center management

3. Architecture design and principle

3.1. Architecture design

ElasticJob-Lite is positioned as a lightweight, decentralized solution that provides coordination services for distributed tasks in the form of a JAR.

3.2. Function principle

3.2.1. Scheduling model

In-process scheduling: ElasticJob-Lite is a thread-level scheduling framework that runs in-process. Jobs can therefore be combined transparently with the business application and used together with Java frameworks such as Spring and Dubbo. Spring-injected beans, such as data source connection pools and Dubbo remote services, can be used freely inside jobs, which simplifies business development.

3.2.2. Elastic scheduling

Elastic scheduling is the most important function of ElasticJob and the origin of the product name. It lets tasks scale horizontally through sharding.

Distributed execution requires splitting a task into multiple independent sharding items, each server in the cluster then executing one or more of them. For example, if a job is divided into 4 shards and executed by two servers, each server is assigned 2 shards and carries 50% of the job's load.
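The 4-shards-on-2-servers example can be sketched in plain Java as follows. This is a minimal illustration rather than ElasticJob's own sharding strategy: the class name `ShardingSketch`, the method `allocate`, and the rule of giving leftover shards to the first servers are all assumptions made for the example, and IPs are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of the ElasticJob API.
final class ShardingSketch {

    /** Splits shard items 0..totalShards-1 into contiguous blocks, one per server. */
    static Map<String, List<Integer>> allocate(List<String> serverIps, int totalShards) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        // Sort (and de-duplicate) the servers so every node derives the same assignment.
        List<String> sorted = new ArrayList<>(new TreeSet<>(serverIps));
        if (sorted.isEmpty()) {
            return result;
        }
        int base = totalShards / sorted.size();
        int remainder = totalShards % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // Assumption: the first `remainder` servers each take one extra shard.
            int count = base + (i < remainder ? 1 : 0);
            List<Integer> items = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                items.add(next++);
            }
            result.put(sorted.get(i), items);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two servers, four shards: each server receives two shards.
        System.out.println(allocate(List.of("10.0.0.2", "10.0.0.1"), 4));
    }
}
```

With two servers each one ends up with two of the four shard items, matching the 50% split described above. Adding a third server would shrink each server's share, which is the horizontal scaling that sharding enables.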
ElasticJob-Lite implementation principle:
ElasticJob-Lite has no central job scheduling node. Instead, each program that embeds the job framework triggers scheduling itself when the scheduled time arrives. The registry center is used only for job registration and for storing monitoring information. The master job node only handles functions such as sharding and cleanup.

Elastic distributed implementation:

  • When the first server comes online, it triggers the master server election. If the master server goes offline, the election is triggered again. The election is blocking: other tasks proceed only after the master server has been elected.
  • When a job server comes online, it automatically registers its server information with the registry center; when it goes offline, its server status is updated automatically.
  • The re-sharding flag is set when the master node is elected, when a server goes online or offline, and when the total number of shards changes.
  • When a scheduled task is triggered and re-sharding is required, the master server performs the sharding. Sharding is blocking: the task executes only after sharding finishes. If the master server goes offline during sharding, a new master is elected first and sharding then runs.
  • As the previous point implies, to keep running jobs stable, only the re-sharding flag is set while a job is running; no re-sharding takes place. Re-sharding happens only when the next task is triggered.
  • Each sharding run sorts servers by IP so that sharding results do not fluctuate widely.
  • Failover is implemented: a server that has finished its own work actively grabs unassigned shards, and when a server goes offline, an available server is actively sought to take over its tasks.
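
The interplay of election, the re-sharding flag, and trigger-time sharding described in the list above can be sketched with a small in-memory model. It is a single-threaded illustration under stated assumptions, not ElasticJob's implementation: there is no ZooKeeper, the class `ReshardingSketch` and all of its methods are hypothetical, re-election simply picks the lowest IP string, and shards are dealt out round-robin for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the registry center; not the ElasticJob API.
final class ReshardingSketch {
    private final TreeSet<String> onlineServers = new TreeSet<>(); // kept sorted by IP string
    private final Map<String, List<Integer>> assignment = new TreeMap<>();
    private final int totalShards;
    private String leader;          // null while no master server is elected
    private boolean reshardNeeded;  // the re-sharding flag

    ReshardingSketch(int totalShards) {
        this.totalShards = totalShards;
    }

    /** A server coming online registers itself and marks sharding as stale. */
    void serverOnline(String ip) {
        onlineServers.add(ip);
        reshardNeeded = true;
        electLeaderIfAbsent();
    }

    /** A server going offline marks sharding as stale; losing the leader forces a re-election. */
    void serverOffline(String ip) {
        onlineServers.remove(ip);
        reshardNeeded = true;
        if (ip.equals(leader)) {
            leader = null;
            electLeaderIfAbsent();
        }
    }

    // Assumption: the lowest IP wins; a real election would go through the registry center.
    private void electLeaderIfAbsent() {
        if (leader == null && !onlineServers.isEmpty()) {
            leader = onlineServers.first();
            reshardNeeded = true;
        }
    }

    /** Called when the job fires: re-shard first if the flag is set, then return a snapshot. */
    Map<String, List<Integer>> trigger() {
        if (reshardNeeded) {
            assignment.clear();
            List<String> servers = new ArrayList<>(onlineServers);
            for (String server : servers) {
                assignment.put(server, new ArrayList<>());
            }
            for (int item = 0; item < totalShards && !servers.isEmpty(); item++) {
                assignment.get(servers.get(item % servers.size())).add(item);
            }
            reshardNeeded = false;
        }
        return new TreeMap<>(assignment);
    }

    /** Current assignment without triggering the job; unchanged between triggers. */
    Map<String, List<Integer>> currentAssignment() {
        return new TreeMap<>(assignment);
    }

    String leader() {
        return leader;
    }

    boolean reshardNeeded() {
        return reshardNeeded;
    }

    public static void main(String[] args) {
        ReshardingSketch registry = new ReshardingSketch(4);
        registry.serverOnline("10.0.0.2");
        registry.serverOnline("10.0.0.1");
        System.out.println(registry.trigger());           // sharding happens at trigger time
        registry.serverOffline("10.0.0.2");               // only the flag changes here
        System.out.println(registry.currentAssignment()); // still the old assignment
        System.out.println(registry.trigger());           // next trigger re-shards
    }
}
```

The point of the sketch is the timing: going offline changes only the flag and, if needed, the leader, while the shard assignment itself is recomputed no earlier than the next trigger.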

4. Integration scheme

ElasticJob-Lite can be used in three ways: native Java, Spring Boot Starter, and Spring custom namespace.
With the ElasticJob Spring Boot Starter, users do not need to create instances such as CoordinatorRegistryCenter and JobBootstrap manually. They only implement the core job logic and add a small amount of configuration, and can then use the lightweight, decentralized ElasticJob to solve distributed scheduling problems.

Origin blog.csdn.net/YYBDESHIJIE/article/details/132231934