Using the distributed scheduled task framework Elastic-Job

I. Introduction

    Elastic-Job is an excellent distributed job scheduling framework.

    Elastic-Job is a distributed scheduling solution consisting of two independent sub-projects, Elastic-Job-Lite and Elastic-Job-Cloud.

    Elastic-Job-Lite is positioned as a lightweight, decentralized solution that provides coordination services for distributed tasks in the form of a jar package.

    Elastic-Job-Cloud uses the Mesos + Docker solution to provide additional services such as resource management, application distribution, and process isolation.

1. Elastic-Job-Lite

  • Distributed scheduling coordination

  • Elastic expansion and contraction

  • Failover

  • Missed execution job retrigger

  • Job shard consistency ensures that only one execution instance of the same shard exists in a distributed environment

  • Self-diagnose and fix problems caused by distributed instability

  • Support parallel scheduling

  • Support for job lifecycle operations

  • Rich job types

  • Spring integration and namespace support

  • Operation and maintenance platform

2. Elastic-Job-Cloud

  • Application automatic distribution

  • Fenzo-based elastic resource allocation

  • Distributed scheduling coordination

  • Elastic expansion and contraction

  • Failover

  • Missed execution job retrigger

  • Job shard consistency ensures that only one execution instance of the same shard exists in a distributed environment

  • Support parallel scheduling

  • Support for job lifecycle operations

  • Rich job types

  • Spring integration

  • Operation and maintenance platform

  • Docker-based process isolation (TBD)

II. Outline

    1. The core idea of Elastic-Job

    2. Basic use of Elastic-Job

III. The core idea of Elastic-Job

    For distributed computing, sharding is the most basic idea, and Elastic-Job follows it as well: each job instance processes only part of the data, and all instances together cover the full data set. The SimpleJob example given on the official website is as follows:

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class MyElasticJob implements SimpleJob {
    
    @Override
    public void execute(ShardingContext context) {
        switch (context.getShardingItem()) {
            case 0: 
                // do something by sharding item 0
                break;
            case 1: 
                // do something by sharding item 1
                break;
            case 2: 
                // do something by sharding item 2
                break;
            // case n: ...
        }
    }
}

    A switch statement dispatches on the sharding item, routing each shard into the business logic for that item. Of course, there are also unsuitable scenarios: anything resembling MapReduce that requires a shuffle is a poor fit. For example, if you want to group and aggregate results globally by some field, sharding may be unreasonable, because each shard can only process 1/N of the data and there is no way to shuffle and then aggregate. Whether sharding applies must be judged against the specific business.
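
    When every shard runs the same logic over its own slice of the data, the switch can be dropped entirely; a common pattern is to filter rows by modulo over the total shard count. A minimal sketch, assuming a hypothetical t_order table and DAO:

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class ModuloShardingJob implements SimpleJob {

    @Override
    public void execute(ShardingContext context) {
        // Every instance runs the same code; the WHERE clause restricts it to
        // the 1/N of rows whose id maps to this instance's shard item.
        String sql = "SELECT * FROM t_order WHERE MOD(id, "
                + context.getShardingTotalCount() + ") = " + context.getShardingItem();
        // orderDao.process(sql); // hypothetical DAO call
    }
}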

    So what information can ShardingContext provide? Its source code is as follows:

public final class ShardingContext {
    
    /**
     * Job name.
     */
    private final String jobName;
    
    /**
     * Job task ID.
     */
    private final String taskId;
    
    /**
     * Total count of sharding items.
     */
    private final int shardingTotalCount;
    
    /**
     * Custom job parameter.
     * The same job can be configured multiple times, using different
     * parameters to act as different scheduling instances.
     */
    private final String jobParameter;
    
    /**
     * Sharding item assigned to this job instance.
     */
    private final int shardingItem;
    
    /**
     * Sharding parameter assigned to this job instance.
     */
    private final String shardingParameter;
    
    public ShardingContext(final ShardingContexts shardingContexts, final int shardingItem) {
        jobName = shardingContexts.getJobName();
        taskId = shardingContexts.getTaskId();
        shardingTotalCount = shardingContexts.getShardingTotalCount();
        jobParameter = shardingContexts.getJobParameter();
        this.shardingItem = shardingItem;
        shardingParameter = shardingContexts.getShardingItemParameters().get(shardingItem);
    }
}

    In the above code, jobParameter and shardingItem are the most useful fields. shardingItem drives the switch dispatch, while shardingParameter can serve directly as a business query condition, or be spliced from strings into very complex parameters for a specific business.
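
    As a sketch of using shardingParameter as a query condition: suppose the job is configured with shardingItemParameters("0=Beijing,1=Shanghai,2=Guangzhou"); every shard then runs the same query with its own filter (the table, column, and DAO are made up for illustration):

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class CityShardingJob implements SimpleJob {

    @Override
    public void execute(ShardingContext context) {
        // Shard 0 receives "Beijing", shard 1 "Shanghai", shard 2 "Guangzhou".
        String city = context.getShardingParameter();
        String sql = "SELECT * FROM t_user WHERE city = '" + city + "'";
        // A more complex parameter can itself be a spliced string,
        // e.g. "0=Beijing|2020", split here into several conditions.
        // userDao.process(sql); // hypothetical DAO call
    }
}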

IV. Basic use of Elastic-Job

    1. Job configuration items

import com.dangdang.ddframe.job.config.JobCoreConfiguration;
import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
import com.dangdang.ddframe.job.lite.api.JobScheduler;
import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;
import com.dangdang.ddframe.job.reg.base.CoordinatorRegistryCenter;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperConfiguration;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter;

public class ElasticJobConfig {
	private static CoordinatorRegistryCenter createRegistryCenter() {

		ZookeeperConfiguration zookeeperConfiguration = new ZookeeperConfiguration("127.0.0.1:2181", "elastic-job");
		CoordinatorRegistryCenter regCenter = new ZookeeperRegistryCenter(zookeeperConfiguration);
		regCenter.init();
		return regCenter;
	}

	private static LiteJobConfiguration createJobConfiguration() {

		// "jobdemo" is the job name, "0/5 * * * * ?" the cron (fire every 5 seconds), 3 the total shard count
		JobCoreConfiguration simpleCoreConfig = JobCoreConfiguration.newBuilder("jobdemo", "0/5 * * * * ?", 3)
				.shardingItemParameters("0=A,1=A,2=B").failover(true).misfire(true).build();
		SimpleJobConfiguration simpleJobConfig = new SimpleJobConfiguration(simpleCoreConfig,
				MyElasticJob.class.getCanonicalName());
		LiteJobConfiguration simpleJobRootConfig = LiteJobConfiguration.newBuilder(simpleJobConfig).overwrite(true)
				.build();
		return simpleJobRootConfig;
	}

	public static void main(String[] args) {
		new JobScheduler(createRegistryCenter(), createJobConfiguration()).init();
	}
}

    A few notes:

    The registry center configuration sets the ZooKeeper cluster address. I use a single local node here, so there is only one address. You can also configure the task name, the namespace (which essentially becomes a directory inside ZooKeeper), the session timeout, the maximum retry count, and so on.
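
    For reference, a sketch of setting those extra registry options; the setter names follow the 2.x com.dangdang API and should be verified against the version you actually use:

import com.dangdang.ddframe.job.reg.base.CoordinatorRegistryCenter;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperConfiguration;
import com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter;

public class TunedRegistryCenterFactory {

	public static CoordinatorRegistryCenter create() {
		// "elastic-job" becomes the root directory (namespace) in ZooKeeper.
		ZookeeperConfiguration zkConfig = new ZookeeperConfiguration("127.0.0.1:2181", "elastic-job");
		zkConfig.setSessionTimeoutMilliseconds(60000);    // ZooKeeper session timeout
		zkConfig.setConnectionTimeoutMilliseconds(15000); // connection timeout
		zkConfig.setMaxRetries(3);                        // maximum reconnect retries
		CoordinatorRegistryCenter regCenter = new ZookeeperRegistryCenter(zkConfig);
		regCenter.init();
		return regCenter;
	}
}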

    In LiteJobConfiguration simpleJobRootConfig = LiteJobConfiguration.newBuilder(simpleJobConfig).overwrite(true).build(), the overwrite parameter is very important. If it is set to true, modified job configuration overwrites the data stored in ZooKeeper; otherwise, changes to the configuration never take effect.
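
    Besides overwrite, the lite root configuration exposes a few other commonly used switches. A sketch, again assuming the 2.x builder API (double-check the method names against your version):

import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;

public class LiteConfigFactory {

	public static LiteJobConfiguration create(SimpleJobConfiguration simpleJobConfig) {
		return LiteJobConfiguration.newBuilder(simpleJobConfig)
				.overwrite(true)        // push local config changes to ZooKeeper
				.monitorExecution(true) // record shard execution state in ZooKeeper
				.jobShardingStrategyClass(
						"com.dangdang.ddframe.job.lite.api.strategy.impl.AverageAllocationJobShardingStrategy")
				.build();
	}
}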

    2. Implementation of SimpleJob

import com.dangdang.ddframe.job.api.ShardingContext;
import com.dangdang.ddframe.job.api.simple.SimpleJob;

public class MyElasticJob implements SimpleJob {

	@Override
	public void execute(ShardingContext shardingContext) {
		switch (shardingContext.getShardingItem()) {
		case 0: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		case 1: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		case 2: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		default: {
			System.out.println("current shard: " + shardingContext.getShardingItem() + " ===== parameter: "
					+ shardingContext.getShardingParameter() + " ===== " + Thread.currentThread());
			break;
		}
		}
	}
}

    With the above configuration the job fires every 5 seconds. Running the main method of ElasticJobConfig produces output like the following:

    [screenshot of console output: each fired shard prints its shard item, its parameter, and the executing worker thread]

    The output shows that the shards are actually executed on a thread pool, and that each shard's item and parameter can be obtained from the shardingContext, which makes implementing the business logic very convenient.

    Finally, if multiple JVMs are started, the shards are distributed across the nodes. If a node goes down, the next time the job triggers, its shards are handed to healthy machines, which gives node-level fault tolerance. Note, however, that if a shard fails while executing, its execution is not re-triggered here.

 

 

    
