Using the distributed scheduled task framework Elastic-Job

I. Introduction

    Elastic-Job is an excellent distributed job scheduling framework.

    Elastic-Job is a distributed scheduling solution consisting of two independent sub-projects, Elastic-Job-Lite and Elastic-Job-Cloud.

    Elastic-Job-Lite is positioned as a lightweight, decentralized solution that provides coordination services for distributed tasks in the form of a jar package.

    Elastic-Job-Cloud uses the Mesos + Docker solution to provide additional services such as resource management, application distribution, and process isolation.

1. Elastic-Job-Lite

  • Distributed scheduling coordination

  • Elastic expansion and contraction

  • Failover

  • Missed execution job retrigger

  • Job shard consistency ensures that only one execution instance of the same shard exists in a distributed environment

  • Self-diagnose and fix problems caused by distributed instability

  • Support parallel scheduling

  • Support for job lifecycle operations

  • Rich job types

  • Spring integration and namespace support

  • Operation and maintenance platform

2. Elastic-Job-Cloud

  • Application automatic distribution

  • Fenzo-based elastic resource allocation

  • Distributed scheduling coordination

  • Elastic expansion and contraction

  • Failover

  • Missed execution job retrigger

  • Job shard consistency ensures that only one execution instance of the same shard exists in a distributed environment

  • Support parallel scheduling

  • Support for job lifecycle operations

  • Rich job types

  • Spring integration

  • Operation and maintenance platform

  • Docker-based process isolation (TBD)

II. Outline

    1. The core idea of Elastic-Job

    2. Basic use of Elastic-Job

III. The core idea of Elastic-Job

    For distributed computing, sharding is the most basic idea, and Elastic-Job follows it as well: each job instance processes only part of the data, and all instances together cover the full data set. The SimpleJob example given on the official website is as follows:

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class MyElasticJob implements SimpleJob {
    
    @Override
    public void execute(ShardingContext context) {
        switch (context.getShardingItem()) {
            case 0: 
                // do something by sharding item 0
                break;
            case 1: 
                // do something by sharding item 1
                break;
            case 2: 
                // do something by sharding item 2
                break;
            // case n: ...
        }
    }
}

    A switch statement dispatches on the sharding item, routing each shard into the business logic for that item. Of course, there are also unsuitable scenarios: anything resembling MapReduce that requires a shuffle is a poor fit. For example, if you want to group and aggregate results globally by some field, sharding may be unreasonable, because each shard can only process 1/N of the data and there is no way to shuffle and then aggregate. Whether sharding applies must be judged against the specific business.
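
    When every shard runs the same logic over its own slice of the data, the switch can be dropped entirely; a common pattern is to filter rows by modulo over the total shard count. A minimal sketch, assuming a hypothetical t_order table and DAO:

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class ModuloShardingJob implements SimpleJob {

    @Override
    public void execute(ShardingContext context) {
        // Every instance runs the same code; the WHERE clause restricts it to
        // the 1/N of rows whose id maps to this instance's shard item.
        String sql = "SELECT * FROM t_order WHERE MOD(id, "
                + context.getShardingTotalCount() + ") = " + context.getShardingItem();
        // orderDao.process(sql); // hypothetical DAO call
    }
}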

    So what information can ShardingContext provide? Its source code is as follows:

public final class ShardingContext {
    
    /**
     * Job name.
     */
    private final String jobName;
    
    /**
     * Job task ID.
     */
    private final String taskId;
    
    /**
     * Total count of sharding items.
     */
    private final int shardingTotalCount;
    
    /**
     * Custom job parameter.
     * The same job can be configured multiple times, using different
     * parameters to act as different scheduling instances.
     */
    private final String jobParameter;
    
    /**
     * Sharding item assigned to this job instance.
     */
    private final int shardingItem;
    
    /**
     * Sharding parameter assigned to this job instance.
     */
    private final String shardingParameter;
    
    public ShardingContext(final ShardingContexts shardingContexts, final int shardingItem) {
        jobName = shardingContexts.getJobName();
        taskId = shardingContexts.getTaskId();
        shardingTotalCount = shardingContexts.getShardingTotalCount();
        jobParameter = shardingContexts.getJobParameter();
        this.shardingItem = shardingItem;
        shardingParameter = shardingContexts.getShardingItemParameters().get(shardingItem);
    }
}

    In the above code, jobParameter and shardingItem are the most useful fields. shardingItem drives the switch dispatch, while shardingParameter can serve directly as a business query condition, or be spliced from strings into very complex parameters for a specific business.
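
    As a sketch of using shardingParameter as a query condition: suppose the job is configured with shardingItemParameters("0=Beijing,1=Shanghai,2=Guangzhou"); every shard then runs the same query with its own filter (the table, column, and DAO are made up for illustration):

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class CityShardingJob implements SimpleJob {

    @Override
    public void execute(ShardingContext context) {
        // Shard 0 receives "Beijing", shard 1 "Shanghai", shard 2 "Guangzhou".
        String city = context.getShardingParameter();
        String sql = "SELECT * FROM t_user WHERE city = '" + city + "'";
        // A more complex parameter can itself be a spliced string,
        // e.g. "0=Beijing|2020", split here into several conditions.
        // userDao.process(sql); // hypothetical DAO call
    }
}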

IV. Basic use of Elastic-Job

    1. Job configuration items

import com.dangdang.ddframe.job.config.JobCoreConfiguration;
import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
import com.dangdang.ddframe.job.lite.api.JobScheduler;
import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;
import com.dangdang.ddframe.job.reg.base.CoordinatorRegistryCenter;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperConfiguration;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter;

public class ElasticJobConfig {
	private static CoordinatorRegistryCenter createRegistryCenter() {

		ZookeeperConfiguration zookeeperConfiguration = new ZookeeperConfiguration("127.0.0.1:2181", "elastic-job");
		CoordinatorRegistryCenter regCenter = new ZookeeperRegistryCenter(zookeeperConfiguration);
		regCenter.init();
		return regCenter;
	}

	private static LiteJobConfiguration createJobConfiguration() {

		// "jobdemo" is the job name, "0/5 * * * * ?" the cron (fire every 5 seconds), 3 the total shard count
		JobCoreConfiguration simpleCoreConfig = JobCoreConfiguration.newBuilder("jobdemo", "0/5 * * * * ?", 3)
				.shardingItemParameters("0=A,1=A,2=B").failover(true).misfire(true).build();
		SimpleJobConfiguration simpleJobConfig = new SimpleJobConfiguration(simpleCoreConfig,
				MyElasticJob.class.getCanonicalName());
		LiteJobConfiguration simpleJobRootConfig = LiteJobConfiguration.newBuilder(simpleJobConfig).overwrite(true)
				.build();
		return simpleJobRootConfig;
	}

	public static void main(String[] args) {
		new JobScheduler(createRegistryCenter(), createJobConfiguration()).init();
	}
}

    A few notes:

    The registry center configuration sets the ZooKeeper cluster address. I use a single local node here, so there is only one address. You can also configure the task name, the namespace (which essentially becomes a directory inside ZooKeeper), the session timeout, the maximum retry count, and so on.
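
    For reference, a sketch of setting those extra registry options; the setter names follow the 2.x com.dangdang API and should be verified against the version you actually use:

import com.dangdang.ddframe.job.reg.base.CoordinatorRegistryCenter;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperConfiguration;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter;

public class TunedRegistryCenterFactory {

	public static CoordinatorRegistryCenter create() {
		// "elastic-job" becomes the root directory (namespace) in ZooKeeper.
		ZookeeperConfiguration zkConfig = new ZookeeperConfiguration("127.0.0.1:2181", "elastic-job");
		zkConfig.setSessionTimeoutMilliseconds(60000);    // ZooKeeper session timeout
		zkConfig.setConnectionTimeoutMilliseconds(15000); // connection timeout
		zkConfig.setMaxRetries(3);                        // maximum reconnect retries
		CoordinatorRegistryCenter regCenter = new ZookeeperRegistryCenter(zkConfig);
		regCenter.init();
		return regCenter;
	}
}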

    In LiteJobConfiguration simpleJobRootConfig = LiteJobConfiguration.newBuilder(simpleJobConfig).overwrite(true).build(), the overwrite parameter is very important. If it is set to true, modified job configuration overwrites the data stored in ZooKeeper; otherwise, changes to the configuration never take effect.
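
    Besides overwrite, the lite root configuration exposes a few other commonly used switches. A sketch, again assuming the 2.x builder API (double-check the method names against your version):

import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;

public class LiteConfigFactory {

	public static LiteJobConfiguration create(SimpleJobConfiguration simpleJobConfig) {
		return LiteJobConfiguration.newBuilder(simpleJobConfig)
				.overwrite(true)        // push local config changes to ZooKeeper
				.monitorExecution(true) // record shard execution state in ZooKeeper
				.jobShardingStrategyClass(
						"com.dangdang.ddframe.job.lite.api.strategy.impl.AverageAllocationJobShardingStrategy")
				.build();
	}
}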

    2. Implementation of SimpleJob

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class MyElasticJob implements SimpleJob {

	@Override
	public void execute(ShardingContext shardingContext) {
		switch (shardingContext.getShardingItem()) {
		case 0: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		case 1: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		case 2: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		default: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		}
	}
}

    With the above configuration the job fires every 5 seconds. Running the main method of ElasticJobConfig produces output like the following:

    [screenshot of console output: each fired shard prints its shard item, its parameter, and the executing worker thread]

    The output shows that the shards are actually executed on a thread pool, and that each shard's item and parameter can be obtained from the shardingContext, which makes implementing the business logic very convenient.

    Finally, if multiple JVMs are started, the shards are distributed across the nodes. If a node goes down, the next time the job triggers, its shards are handed to healthy machines, which gives node-level fault tolerance. Note, however, that if a shard fails while executing, its execution is not re-triggered here.

 

 

    
