How the house price network uses the distributed job framework Elastic-Job

What is Elastic-Job?

Elastic-Job is a distributed scheduling solution consisting of two independent sub-projects, Elastic-Job-Lite and Elastic-Job-Cloud.

Elastic-Job-Lite is positioned as a lightweight, decentralized solution that provides coordination services for distributed tasks in the form of jar packages; Elastic-Job-Cloud adopts a self-developed Mesos Framework solution and additionally provides resource management, application distribution, and process isolation.

Official website address: elasticjob.io/

Github:github.com/elasticjob/…

Why use Elastic-Job?

At present, our company runs scheduled tasks with a Linux crontab-based executor.

The following problems exist:

  • Tasks cannot be managed centrally
  • No horizontal scaling
  • No visual interface for operations
  • Single point of failure

Besides Linux crontab, the Java world also has Quartz, but Quartz lacks distributed parallel scheduling.

The problems are also obvious:

  • When the project is a single application, a Quartz-based scheduled task runs happily.
  • When the project is scaled out to 3 nodes, the same task fires on all 3 nodes at the same time and the data gets messed up.
  • To keep the data correct, a distributed lock then has to be introduced into the scheduling, which adds complexity.
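
The distributed-lock workaround in the last bullet can be sketched in a single JVM. This is only an illustration under assumed names, not production code: the AtomicBoolean stands in for a real per-tick distributed lock (for example one backed by ZooKeeper), and the thread pool stands in for the three nodes.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;

public class LockGuardedJobSketch {

    // Simulates one cron tick firing on three "nodes" at once. compareAndSet
    // stands in for acquiring a per-tick distributed lock: only the node that
    // wins the lock actually runs the job body.
    static int simulate() {
        AtomicBoolean tickLock = new AtomicBoolean(false);
        AtomicInteger executions = new AtomicInteger();
        ExecutorService nodes = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 3; i++) {
            nodes.submit(() -> {
                if (tickLock.compareAndSet(false, true)) { // first node wins the tick
                    executions.incrementAndGet();          // the job body runs once
                }
            });
        }
        nodes.shutdown();
        try {
            nodes.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ignored) {
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println("executions = " + simulate());
    }
}
```

However the lock is implemented, this extra moving part is exactly the complexity the bullet above complains about.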

How to deal with it?

1. Self-developed framework

In this case, you would need to develop a scheduling framework that meets the company's business needs. The cost is high, so this is not recommended.

I have thought about writing one myself; I have the idea but have not started yet. The core problem of any scheduling framework is data distribution. Elastic-Job does this very well: it lets you define sharding rules, dispatches the data of each shard according to those rules, and you control which data each node processes.

If you go the self-developed route and do not want to write the data-distribution logic yourself, I think the easiest way is to use a message queue.

ZooKeeper handles scheduling and stores the task data, and a common interface split into two parts is defined, as follows:

public interface Job {
    void read();                 // runs on exactly one node: reads and distributes the data
    void process(Object data);   // runs on consumer nodes: handles one distributed item
}

The user implements this interface: read fetches the data to be processed, and process handles the distributed data.

As for distribution, a task can be marked with an annotation to use a dedicated queue, or a shared one can be used. Multiple consumers can then consume at the same time; even if one of them dies, the task as a whole is unaffected, and no failover logic is needed.

What must be controlled is the read method: it has to be executed by only one node, otherwise data will be distributed repeatedly.

The above is just a rough idea. Of course, a web page for managing tasks and triggering them manually could also be added.
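
The read/process split described above can be sketched in a single JVM. This is a hedged illustration under my own naming: a BlockingQueue stands in for the real message queue and threads stand in for consumer nodes.

```java
import java.util.*;
import java.util.concurrent.*;

public class QueueJobSketch {

    // The two-part interface from the idea above.
    interface Job {
        List<String> read();        // must run on exactly one node
        void process(String item);  // may run on any consumer node
    }

    // One round: the "leader" reads and publishes, consumer "nodes" drain the queue.
    static List<String> runOnce() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();  // stand-in for a real MQ
        List<String> processed = new CopyOnWriteArrayList<>();
        Job job = new Job() {
            public List<String> read() { return List.of("a", "b", "c"); }
            public void process(String item) { processed.add(item); }
        };
        job.read().forEach(queue::add);  // the single reader distributes the data
        ExecutorService consumers = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            consumers.submit(() -> {
                String item;
                while ((item = queue.poll()) != null) job.process(item);
            });
        }
        consumers.shutdown();
        try {
            consumers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ignored) {
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

In a real deployment, read() would be guarded so that only one node runs it, as noted above, while every node runs the consuming loop.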

2. Choose an open source solution

TBSchedule: an early open source distributed task scheduling system from Alibaba. The code is somewhat dated and uses a Timer instead of a thread pool for task scheduling, and Timer is known to be flawed in handling exceptions. Moreover, TBSchedule's job types are relatively limited: it only supports a fetch/process mode of operation, and documentation is seriously lacking.

Spring Batch: a lightweight, fully Spring-style batch framework that can be applied to enterprise-level large-scale data processing systems. Built on POJOs and the well-known Spring framework, it makes enterprise-level services easier for developers to access and leverage. Spring Batch provides a large number of reusable data-processing functions, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management.

Elastic-Job: a domestic open source product with Chinese documentation, a quick learning curve, easy usage, complete functionality and an active community, led by Dangdang architect Zhang Liang, who has invested a lot of time in open source.

Why choose Elastic-Job?

  • Distributed scheduling coordination
  • Elastic expansion and contraction
  • failover
  • Missed execution job retrigger
  • Job shard consistency ensures that only one execution instance of the same shard exists in a distributed environment
  • Self-diagnose and fix problems caused by distributed instability
  • Support parallel scheduling
  • Support for job lifecycle operations
  • Rich job types
  • Spring integration and namespace provisioning
  • Operation and maintenance platform

Introduction to job types

Simple: a simple job, the most commonly used type, with no extra encapsulation. It requires implementing the SimpleJob interface, which provides a single method to override; that method is executed periodically. It is similar to Quartz's native interface, but adds features such as elastic scaling and sharding.

public class MyElasticJob implements SimpleJob {
    
    @Override
    public void execute(ShardingContext context) {
        switch (context.getShardingItem()) {
            case 0: 
                // do something by sharding item 0
                break;
            case 1: 
                // do something by sharding item 1
                break;
            case 2: 
                // do something by sharding item 2
                break;
            // case n: ...
        }
    }
}

DataFlow: The Dataflow type is used to process data flow and needs to implement the DataflowJob interface. This interface provides two methods for overriding, which are used to fetch (fetchData) and process (processData) data respectively.

public class MyElasticJob implements DataflowJob<Foo> {
    
    @Override
    public List<Foo> fetchData(ShardingContext context) {
        List<Foo> data = null;
        switch (context.getShardingItem()) {
            case 0: 
                // data = get data from database by sharding item 0
                break;
            case 1: 
                // data = get data from database by sharding item 1
                break;
            case 2: 
                // data = get data from database by sharding item 2
                break;
            // case n: ...
        }
        return data;
    }
    
    @Override
    public void processData(ShardingContext shardingContext, List<Foo> data) {
        // process data
        // ...
    }
}

Script: a script-type job, supporting all script types such as shell, python, and perl. Just configure scriptCommandLine via the console or code; no coding is required. The script path can contain parameters; after those parameters, the framework automatically appends the job runtime information as the last argument.

In fact, I suggested adding another task type: a pipeline task. For this I opened an issue:

github.com/elasticjob/…

Under certain business requirements, task B needs to run after task A finishes, and so on: pipeline-style tasks with dependencies.

At present these tasks can be merged together and chained through code calls to achieve the same effect.

But I hope such a feature can be added, for example a configuration item job-after="com.xxx.job.XXXJob": after this task finishes, another task BB is triggered automatically. BB only needs its job information configured, with the cron removed, because BB is triggered by other tasks.

Of course, these tasks must be under the same ZooKeeper namespace; it would be even better if cross-namespace triggering were supported.

This would enable pipeline-style task operation, and each task could use a different sharding key.
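
The proposed job-after chaining can be modeled with a toy sketch. This is plain Java, not Elastic-Job API, and all names are illustrative: a map records each job's successor, and the chain is walked after each completion.

```java
import java.util.*;

public class JobChainSketch {

    // A toy model of the proposed job-after configuration: jobAfter maps a
    // job name to the job that should be triggered when it finishes. The
    // downstream job has no cron of its own; it runs only when triggered.
    static List<String> runChain(Map<String, Runnable> jobs,
                                 Map<String, String> jobAfter,
                                 String first) {
        List<String> order = new ArrayList<>();
        String current = first;
        while (current != null) {
            jobs.get(current).run();          // execute the current job
            order.add(current);
            current = jobAfter.get(current);  // then trigger its successor, if any
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Runnable> jobs = new HashMap<>();
        jobs.put("A", () -> {});  // job bodies omitted
        jobs.put("B", () -> {});
        System.out.println(runChain(jobs, Map.of("A", "B"), "A"));
    }
}
```

In a real implementation each job could of course keep its own sharding key, as suggested above.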

Getting started

1. I will not explain how to set up and configure the framework; the official documentation is certainly better than anything I could write. Open source frameworks generally come with demos that you can download, import into your IDE, and run.

demo site: github.com/elasticjob/…

2. Some experience from our usage

  • It is recommended to split tasks by product: one product corresponds to one task project. When the team is relatively large, one team may be responsible for one product, so its tasks are not mixed with other teams'.
  • Write the task description clearly: what the task is used for. The job configuration has a description field; fill it in properly.
/**
 * User-dimension statistics job<br>aggregates each user's property, replacement and loan information
 * @author yinjihuan
 */
public class UserStatJob implements SimpleJob {

	private Logger logger = LoggerFactory.getLogger(UserStatJob.class);
	
	@Autowired
	private EnterpriseProductUserService enterpriseProductUserService;
	
	@Autowired
	private UserStatService userStatService;
	
	@Autowired
	private HouseInfoService houseInfoService;
	
	@Autowired
	private HouseSubstitutionService houseSubstitutionService;
	
	@Autowired
	private LoanApplyService loanApplyService;
	
	@Override
	public void execute(ShardingContext shardingContext) {
		logger.info("开始执行UserStatJob");
		long total = enterpriseProductUserService.queryCount();
		int pages = PageBean.calcPages(total, 1000);
		for (int i = 1; i <= pages; i++) {
			List<EnterpriseProductUser> users = enterpriseProductUserService.queryByPage(i, 1000);
			for (EnterpriseProductUser user : users) {
				try {
					processStat(user);
				} catch (Exception e) {
					logger.error("用户维度统计任务异常", e);
					DingDingMessageUtil.sendTextMessage("用户维度统计任务异常:" + e.getMessage());
				}
			}
		}
		logger.info("UserStatJob执行结束");
	}
	
	private void processStat(EnterpriseProductUser user) {
		UserStat stat = userStatService.getByUid(user.getEid(), user.getUid());
		Long eid = user.getEid();
		String uid = user.getUid();
		if (stat == null) {
			stat = new UserStat();
			stat.setEid(eid);
			stat.setUid(uid);
			stat.setUserAddTime(user.getAddTime());
			stat.setCity(user.getCity());
			stat.setRegion(user.getRegion());
		}
		stat.setHouseCount(houseInfoService.queryCountByEidAndUid(eid, uid));
		stat.setHousePrice(houseInfoService.querySumMoneyByEidAndUid(eid, uid));
		stat.setSubstitutionCount(houseSubstitutionService.queryCount(eid, uid));
		stat.setSubstitutionMaxPrice(houseSubstitutionService.queryMaxBudget(eid, uid));
		stat.setLoanEvalCount(loanApplyService.queryUserCountByType(eid, uid, 2));
		stat.setLoanEvalMaxPrice(loanApplyService.queryMaxEvalMoney(eid, uid));
		stat.setLoanCount(loanApplyService.queryUserCountByType(eid, uid, 1));
		stat.setModifyDate(new Date());
		userStatService.save(stat);
	}

}
 <!-- User statistics job, runs daily at 01:10 -->
 <job:simple id="userStatJob" class="com.fangjia.job.fsh.job.UserStatJob" registry-center-ref="regCenter"
    	 sharding-total-count="1" cron="0 10 1 * * ?" sharding-item-parameters=""
    	 failover="true" description="【房生活】用户维度统计任务,统计出用户的房产,置换,贷款等信息 UserStatJob"
    	 overwrite="true" event-trace-rdb-data-source="elasticJobLog" job-exception-handler="com.fangjia.job.fsh.handler.CustomJobExceptionHandler">
    	 
    	  <job:listener class="com.fangjia.job.fsh.listener.MessageElasticJobListener"></job:listener>
    	  
 </job:simple>
  • Configure a unified listener for each task to send notifications when a task starts and finishes; these can be SMS, email or anything else. I use a DingTalk robot to send messages to DingTalk.
/**
 * Job listener: sends DingTalk messages before and after execution
 * @author yinjihuan
 */
public class MessageElasticJobListener implements ElasticJobListener {
    
    @Override
    public void beforeJobExecuted(ShardingContexts shardingContexts) {
    	String date = DateUtils.date2Str(new Date());
    	String msg = date + " 【FSH-" + shardingContexts.getJobName() + "】任务开始执行====" + JsonUtils.toJson(shardingContexts);
    	DingDingMessageUtil.sendTextMessage(msg);
    }
    
    @Override
    public void afterJobExecuted(ShardingContexts shardingContexts) {
    	String date = DateUtils.date2Str(new Date());
    	String msg = date + " 【FSH-" + shardingContexts.getJobName() + "】任务执行结束====" + JsonUtils.toJson(shardingContexts);
    	DingDingMessageUtil.sendTextMessage(msg);
    }

}
  • An annotation can be defined on each task class to identify who developed it, so that the corresponding DingTalk message is sent to that person. Personally I recommend creating a group with everyone in it: a message sent to a single developer is useless unless he is very motivated, whereas a message posted in the group means the leader will say: so-and-so, your task failed, go find out why. Here I post everything to the group and have not defined such annotations.

  • Exceptions inside a task can be handled within the task: besides logging, wrap them uniformly and send a DingTalk notification, so you know in real time whether the task has gone wrong. See my code above.

  • There are also uncaught exceptions. To have these notified to the group as well, you can define a custom exception-handler class and configure it via job-exception-handler="com.fangjia.job.fsh.handler.CustomJobExceptionHandler".

/**
 * Custom exception handler: sends a DingTalk notification when a job fails
 * @author yinjihuan
 */
public class CustomJobExceptionHandler implements JobExceptionHandler {
	
	private Logger logger = LoggerFactory.getLogger(CustomJobExceptionHandler.class);
	
	@Override
	public void handleException(String jobName, Throwable cause) {
		logger.error(String.format("Job '%s' exception occur in job processing", jobName), cause);
		DingDingMessageUtil.sendTextMessage("【"+jobName+"】任务异常。" + cause.getMessage());
	}

}
  • You can judge whether a job node is down by monitoring whether the job_name/instances/job_instance_id node exists. It is an ephemeral node: if the job server goes offline, the node is deleted. Other monitoring tools can of course be used as well.

  • Write tasks with horizontal scalability in mind as much as possible. The example I posted above actually does not: it is just a simple job, because I did not use shardingParameter to process the corresponding shard's data. If the task is short and the amount of data small, writing it like mine is fine. If you can foresee a large amount of data and long run times, it is best to configure sharding rules and write the processing logic per shard; then scaling out later only requires changing the configuration and adding a node.
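
As a hedged sketch of that advice: with, say, sharding-total-count="3" and sharding-item-parameters="0=beijing,1=shanghai,2=shenzhen", each node can use getShardingParameter() to select only its own slice of data. The ShardingContext below is a minimal stub standing in for Elastic-Job's real class, and all data and names are illustrative.

```java
import java.util.*;

public class ShardedStatJobSketch {

    // Minimal stand-in for Elastic-Job's ShardingContext; only the two
    // accessors used here are modeled.
    static class ShardingContext {
        private final int shardingItem;
        private final String shardingParameter;
        ShardingContext(int shardingItem, String shardingParameter) {
            this.shardingItem = shardingItem;
            this.shardingParameter = shardingParameter;
        }
        int getShardingItem() { return shardingItem; }
        String getShardingParameter() { return shardingParameter; }
    }

    // Each node only processes the users of its own shard (here: its city),
    // so adding a shard later means changing configuration, not code.
    static List<String> execute(ShardingContext ctx, Map<String, List<String>> usersByCity) {
        return usersByCity.getOrDefault(ctx.getShardingParameter(), List.of());
    }

    public static void main(String[] args) {
        Map<String, List<String>> users = Map.of(
            "beijing", List.of("u1", "u2"),
            "shanghai", List.of("u3"));
        System.out.println(execute(new ShardingContext(0, "beijing"), users));
    }
}
```

With this shape, scaling from 1 node to 3 means editing sharding-total-count and sharding-item-parameters in the XML, with no change to the job class.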

For more technology sharing, follow the WeChat public account: Yuantiandi

