Digging into the details of XXL-JOB

Foreword

National Day is approaching, my project has wrapped up, and while writing up recent summaries I was handed an XXL-JOB research task. Happy to be slacking off during official working hours, I will skip the preamble and get straight to the point.

I assume everyone already knows XXL-JOB, so this article does not walk through much of the source code. Instead it focuses on a few things I thought about while reading the source. My understanding is not necessarily correct, and experts are welcome to point out mistakes.

XXL-JOB Profile

  • XXL-JOB is a lightweight distributed task scheduling platform. Its core design goals are rapid development, ease of learning, light weight, and easy extensibility. It is open source, has been adopted in the production lines of many companies, and works out of the box.
  • XXL-JOB is divided into the scheduling center, the executor, and the data center. The scheduling center is responsible for task management and scheduling, executor management, and log management; the executor is responsible for running tasks and calling back with the execution results.

Task scheduling: implementing a "time-wheel-like" scheduler

Time wheel

The time wheel comes from HashedWheelTimer in Netty. It is a ring structure that can be compared to a clock face: the face has many buckets, each bucket can hold multiple tasks, and a List stores all the tasks due at that moment. A pointer rotates one slot at a time as time passes and executes all due tasks in the corresponding bucket. A modulo operation decides which bucket a task goes into. The principle is similar to HashMap: newTask corresponds to put, and a List resolves hash collisions.

In the figure's example, suppose one bucket represents 1 second, so one full rotation of the pointer represents 8 s, and suppose the pointer currently points at 0. A task to be scheduled 3 s from now should obviously go into slot (0 + 3 = 3), and it runs once the pointer has advanced 3 times. A task to be run 10 s from now should run after the pointer completes one full round plus 2 more slots, so it goes into slot 2 and round (1) is saved with the task. When checking due tasks, only those whose round is 0 are executed; the round of every other task in the bucket is decremented by 1.
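The bucket and round bookkeeping described above can be sketched as follows. This is a minimal single-threaded illustration, not Netty's HashedWheelTimer: the class name SimpleTimeWheel, its methods, and the rule that a delay must be at least 1 second are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Minimal single-threaded hashed time wheel; every name here is illustrative. */
class SimpleTimeWheel {
    /** A task name plus the number of full rotations it still has to wait. */
    static final class Entry {
        final String name;
        int rounds;
        Entry(String name, int rounds) { this.name = name; this.rounds = rounds; }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();
    private int pointer = 0; // bucket the pointer currently points at

    SimpleTimeWheel(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) buckets.add(new ArrayList<>());
    }

    /** Schedules a task delaySeconds (>= 1) from now and returns the bucket it was put in. */
    int schedule(String name, int delaySeconds) {
        if (delaySeconds < 1) throw new IllegalArgumentException("delay must be >= 1");
        int size = buckets.size();
        int index = (pointer + delaySeconds) % size; // modulo picks the bucket
        int rounds = (delaySeconds - 1) / size;      // full rotations to skip first
        buckets.get(index).add(new Entry(name, rounds));
        return index;
    }

    /** Advances the pointer by one bucket and returns the tasks that became due. */
    List<String> tick() {
        pointer = (pointer + 1) % buckets.size();
        List<String> due = new ArrayList<>();
        Iterator<Entry> it = buckets.get(pointer).iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.rounds == 0) {
                due.add(e.name); // round is 0: run it now
                it.remove();
            } else {
                e.rounds--;      // not yet: wait one more rotation
            }
        }
        return due;
    }
}
```

With 8 buckets, a task scheduled 3 s ahead lands in bucket 3 with round 0, and a task scheduled 10 s ahead lands in bucket 2 with round 1, matching the example in the text.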

Of course, there is also an optimized "hierarchical time wheel" implementation; see https://cnkirito.moe/timer/.

The "time wheel" in XXL-JOB

  • XXL-JOB's scheduling moved from Quartz to a self-developed scheduler that closely resembles a time wheel. It can be understood as 60 buckets of 1 second each, but without the round concept.

  • See the figure for details.

  • Two threads are responsible for task scheduling in XXL-JOB, ringThread and scheduleThread. Their roles are as follows.

1. scheduleThread: reads task information, pre-reading the tasks that are about to trigger in the next 5 s, and puts them into the time wheel.
2. ringThread: takes the tasks out of the current bucket and the previous bucket and executes them.

  • Now let's look at the source code to see why it is "time-wheel-like". The key code is commented, so read the comments carefully.
// Ring structure
private volatile static Map<Integer, List<Integer>> ringData = new ConcurrentHashMap<>();

// Next trigger time of the task (in seconds) % 60
int ringSecond = (int)((jobInfo.getTriggerNextTime()/1000)%60);

// Put the task into the time wheel
private void pushTimeRing(int ringSecond, int jobId){
        // push async ring
        List<Integer> ringItemData = ringData.get(ringSecond);
        if (ringItemData == null) {
            ringItemData = new ArrayList<Integer>();
            ringData.put(ringSecond, ringItemData);
        }
        ringItemData.add(jobId);
    }

// Take the tasks of two ticks at once
List<Integer> ringItemData = new ArrayList<>();
int nowSecond = Calendar.getInstance().get(Calendar.SECOND);  
// Look back one extra tick, in case processing took too long and a tick was skipped
for (int i = 0; i < 2; i++) {
	List<Integer> tmpData = ringData.remove( (nowSecond+60-i)%60 );
	if (tmpData != null) {
		ringItemData.addAll(tmpData);
	}
}
// Run
for (int jobId: ringItemData) {
	JobTriggerPoolHelper.trigger(jobId, TriggerTypeEnum.CRON, -1, null, null);
}
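To make the two halves above easier to follow in isolation, here is a minimal sketch of a 60-slot ring keyed by second-of-minute. It is not XXL-JOB's code: the class SecondRing and its method names are hypothetical, and the current second is passed in as a parameter instead of being read from a Calendar so that the behaviour is deterministic. Because scheduleThread only preloads tasks due within the next 5 s, no entry is ever more than one rotation away, which is presumably why no round counter is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a 60-slot ring keyed by second-of-minute; names are illustrative. */
class SecondRing {
    private final Map<Integer, List<Integer>> ring = new ConcurrentHashMap<>();

    /** Slot for a trigger time given in epoch milliseconds. */
    static int slotOf(long triggerTimeMillis) {
        return (int) ((triggerTimeMillis / 1000) % 60);
    }

    /** Producer side: put a job id into the slot of its trigger second. */
    void push(long triggerTimeMillis, int jobId) {
        // compute() runs atomically per key, so concurrent pushes do not lose entries
        ring.compute(slotOf(triggerTimeMillis), (slot, jobs) -> {
            if (jobs == null) jobs = new ArrayList<>();
            jobs.add(jobId);
            return jobs;
        });
    }

    /** Consumer side: drain the slot for nowSecond and the one before it. */
    List<Integer> drain(int nowSecond) {
        List<Integer> due = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            // +60 keeps the index non-negative when nowSecond is 0
            List<Integer> jobs = ring.remove((nowSecond + 60 - i) % 60);
            if (jobs != null) due.addAll(jobs);
        }
        return due;
    }
}
```

Draining two slots per tick means a job is still picked up if the previous tick was missed, and remove() guarantees each job id is returned only once.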

Consistent hashing in the consistent-hash routing strategy

  • As we all know, when XXL-JOB executes a task, the routing strategy decides which executor it runs on. One of the strategies is consistent hash (source in ExecutorRouteConsistentHash.java), which naturally brings the consistent hashing algorithm to mind.
  • Consistent hashing addresses load balancing in distributed systems. Hashing makes a fixed portion of requests land on the same server, so each server handles a fixed portion of the requests (and maintains the information for those requests), which achieves load balancing.
  • The ordinary remainder hash (hash(e.g. user id) % number of servers) scales poorly: when a server is added or taken offline, a large part of the mapping between user ids and servers becomes invalid. Consistent hashing improves on this by using a hash ring.
  • In practice, when there are few server nodes, consistent hashing suffers from skew: keys are spread unevenly across the nodes. One solution is to add more machines, but machines cost money, so virtual nodes are added instead.
  • For the detailed principles, see https://www.jianshu.com/p/e968c081f563.
  • The figure shows a hash ring with virtual nodes, where ip1-1 is a virtual node of ip1, ip2-1 is a virtual node of ip2, and ip3-1 is a virtual node of ip3.

As can be seen, the key to consistent hashing is the hash function: it must spread both the virtual nodes and the hashed keys uniformly, and uniformity can be understood as fewer hash collisions. For background on hash collisions, see how HashMap and the Redis dictionary handle them.
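The ring with virtual nodes can be sketched with a sorted map. This is a minimal illustration under stated assumptions, not the code in ExecutorRouteConsistentHash.java: the class name ConsistentHashRing, the "#i" naming of virtual nodes, and the choice of taking 32 bits from an MD5 digest are all made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of a consistent-hash ring with virtual nodes; names are illustrative. */
class ConsistentHashRing {
    // position on the ring -> real server that owns the virtual node at that position
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> servers, int virtualNodesPerServer) {
        if (servers.isEmpty() || virtualNodesPerServer < 1) {
            throw new IllegalArgumentException("need at least one server and one virtual node");
        }
        for (String server : servers) {
            for (int i = 0; i < virtualNodesPerServer; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** Unsigned 32-bit value built from the first four bytes of the MD5 digest. */
    static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24)
                    | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8)
                    | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Walks clockwise from the key's position to the first virtual node. */
    String route(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue(); // wrap around past the end
    }
}
```

When a server is removed, only the keys that were routed to that server move; every other key still finds the same virtual node on the ring, which is the property the remainder hash lacks.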

  • The hash function used by XXL-JOB's consistent-hash routing is as follows.
// Convert jobId to md5
// hashCode() is not used directly, in order to widen the hash value range and reduce collisions
byte[] digest = md5.digest();

// 32-bit hashCode
long hashCode = ((long) (digest[3] & 0xFF) << 24)
	| ((long) (digest[2] & 0xFF) << 16)
	| ((long) (digest[1] & 0xFF) << 8)
	| (digest[0] & 0xFF);

long truncateHashCode = hashCode & 0xffffffffL;
  • Seeing the hash function above reminds me of the hash function in HashMap.
f(key) = hash(key) & (table.length - 1) 
// Why >>> 16: both the high and the low bits of hashCode() then influence f(key), so the distribution is more even and hash collisions are less likely.
hash(key) = (h = key.hashCode()) ^ (h >>> 16)
  • Similarly, once the jobId is MD5-encoded, both its high and its low bits influence the hash result, which lowers the probability of hash collisions.
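The effect of the >>> 16 step is easiest to see with hash codes that differ only in their high bits. The sketch below is an illustration only: the class HashSpreadDemo and its method names are made up, and the table length of 16 is an arbitrary power of two.

```java
/** Shows why folding the high bits into the low bits reduces collisions. */
class HashSpreadDemo {
    /** Same mixing step as the formula above: XOR the high 16 bits into the low 16. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a power-of-two table length. */
    static int index(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16;
        for (int i = 0; i < 4; i++) {
            int h = i << 16; // hash codes that differ only in their high bits
            System.out.println(index(h, tableLength) + " vs " + index(spread(h), tableLength));
        }
    }
}
```

Without the mixing step all four hash codes fall into bucket 0; with it they fall into buckets 0, 1, 2, and 3.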

Implementing sharded tasks: maintaining a thread context

  • XXL-JOB's sharded tasks implement distributed execution of a task, and this was actually the focus of my research. In day-to-day development many scheduled tasks run on a single machine; for tasks that will later process large volumes of data, it is best to have a distributed solution.

  • As for the routing strategy of sharded tasks, the author introduces the concept of sharding broadcast in the source code. I was a little confused at first, and it gradually became clear as I read the source.

  • Anyone who has read the source has probably run into the same surprise: is this routing strategy not implemented at all? See below.

public enum ExecutorRouteStrategyEnum {

    FIRST(I18nUtil.getString("jobconf_route_first"), new ExecutorRouteFirst()),
    LAST(I18nUtil.getString("jobconf_route_last"), new ExecutorRouteLast()),
    ROUND(I18nUtil.getString("jobconf_route_round"), new ExecutorRouteRound()),
    RANDOM(I18nUtil.getString("jobconf_route_random"), new ExecutorRouteRandom()),
    CONSISTENT_HASH(I18nUtil.getString("jobconf_route_consistenthash"), new ExecutorRouteConsistentHash()),
    LEAST_FREQUENTLY_USED(I18nUtil.getString("jobconf_route_lfu"), new ExecutorRouteLFU()),
    LEAST_RECENTLY_USED(I18nUtil.getString("jobconf_route_lru"), new ExecutorRouteLRU()),
    FAILOVER(I18nUtil.getString("jobconf_route_failover"), new ExecutorRouteFailover()),
    BUSYOVER(I18nUtil.getString("jobconf_route_busyover"), new ExecutorRouteBusyover()),
    // Where is the promised implementation??? It is actually null
    SHARDING_BROADCAST(I18nUtil.getString("jobconf_route_shard"), null);
  • Tracing further eventually gives the answer, so let me go through it step by step. First, which execution parameters are passed for a sharded task? See this section of the XxlJobTrigger.trigger function.
...
// Sharding routing goes through this branch
if (ExecutorRouteStrategyEnum.SHARDING_BROADCAST == ExecutorRouteStrategyEnum.match(jobInfo.getExecutorRouteStrategy(), null)
                && group.getRegistryList() != null && !group.getRegistryList().isEmpty()
                && shardingParam == null) {
            for (int i = 0; i < group.getRegistryList().size(); i++) {
	            // Last two arguments: i is the index of the current machine in the executor cluster, group.getRegistryList().size() is the total number of executors
                processTrigger(group, jobInfo, finalFailRetryCount, triggerType, i, group.getRegistryList().size());
            }
        } 
...
  • These parameters are then passed to the executor through the self-developed RPC. In JobThread.run, which is responsible for running the task on the executor, we see the following code.
// The sharding-broadcast parameters are set into ShardingUtil
ShardingUtil.setShardingVo(new ShardingUtil.ShardingVO(triggerParam.getBroadcastIndex(), triggerParam.getBroadcastTotal()));
...
// Pass the execution parameters to the jobHandler for execution
handler.execute(triggerParamTmp.getExecutorParams());
  • Looking at ShardingUtil reveals the trick. Here is the code.
public class ShardingUtil {
	// Thread context
    private static InheritableThreadLocal<ShardingVO> contextHolder = new InheritableThreadLocal<ShardingVO>();
	// Sharding parameter object
    public static class ShardingVO {

        private int index;  // sharding index
        private int total;  // sharding total
		// get/set omitted here
    }
	// Put the parameter object into the context
    public static void setShardingVo(ShardingVO shardingVo){
        contextHolder.set(shardingVo);
    }
	// Take the parameter object out of the context
    public static ShardingVO getShardingVo(){
        return contextHolder.get();
    }

}
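The idea of handing extra parameters to a handler through the thread instead of through its signature can be sketched independently of XXL-JOB. In the illustration below, ShardContextDemo, CONTEXT, and runOnWorker are hypothetical names, and a plain ThreadLocal holding an int pair stands in for the InheritableThreadLocal<ShardingVO> above.

```java
/** Sketch: pass shard info via a thread context instead of method parameters. */
class ShardContextDemo {
    /** Hypothetical context holder: element 0 is the shard index, element 1 the total. */
    static final ThreadLocal<int[]> CONTEXT = new ThreadLocal<>();

    /** Handler with a fixed single-String signature; shard info comes from the thread. */
    static String execute(String param) {
        int[] shard = CONTEXT.get();
        return param + ":" + shard[0] + "/" + shard[1];
    }

    /** Sets the context on a worker thread, then runs the handler on that thread. */
    static String runOnWorker(int index, int total, String param) {
        String[] result = new String[1];
        Thread worker = new Thread(() -> {
            CONTEXT.set(new int[] {index, total});
            try {
                result[0] = execute(param);
            } finally {
                CONTEXT.remove(); // do not leak the value if the thread is reused
            }
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }
}
```

Each worker thread sees only the value that was set on it, so several shards can run concurrently in one process without interfering, and the handler interface does not have to change.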
  • Clearly, ShardingJobHandler, which handles sharded tasks, takes the sharding parameters out of the thread context. Here is the code.
@JobHandler(value="shardingJobHandler")
@Service
public class ShardingJobHandler extends IJobHandler {

	@Override
	public ReturnT<String> execute(String param) throws Exception {

		// Sharding parameters
		ShardingUtil.ShardingVO shardingVO = ShardingUtil.getShardingVo();
		XxlJobLogger.log("Sharding parameters: current shard index = {}, total shards = {}", shardingVO.getIndex(), shardingVO.getTotal());

		// Business logic
		for (int i = 0; i < shardingVO.getTotal(); i++) {
			if (i == shardingVO.getIndex()) {
				XxlJobLogger.log("Shard {}: hit, start processing", i);
			} else {
				XxlJobLogger.log("Shard {}: ignored", i);
			}
		}

		return SUCCESS;
	}

}
  • It follows that distributed execution is built on the sharding parameters index and total. Put simply, they give each executor an identity; by splitting the task's data or logic according to that identity, the task can run in a distributed way.
  • Digression: why are the sharding parameters injected from outside rather than passed directly to execute?

1. Possibly because only sharded tasks use these two parameters.
2. IJobHandler's execute accepts only a single String parameter.
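One common way to split data by index and total is to let each executor keep only the records whose id leaves a remainder equal to its own index. The sketch below assumes numeric record ids; ShardSelector and idsForShard are hypothetical names and not part of XXL-JOB.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: pick the records one shard is responsible for, by id modulo total. */
class ShardSelector {
    /** Returns the ids that the executor with the given shard index should process. */
    static List<Long> idsForShard(List<Long> allIds, int index, int total) {
        List<Long> mine = new ArrayList<>();
        for (long id : allIds) {
            // floorMod keeps the remainder non-negative even for negative ids
            if (Math.floorMod(id, (long) total) == index) {
                mine.add(id);
            }
        }
        return mine;
    }
}
```

With ids 1 to 10 and total = 3, shard 0 gets 3, 6, 9, shard 1 gets 1, 4, 7, 10, and shard 2 gets 2, 5, 8, so every record is handled by exactly one executor. In a real job the same condition would normally be pushed into the query that loads the data rather than applied in memory.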

Thoughts after reading the source code

  • 1. After reading through the source code, XXL-JOB does live up to its design goals of rapid development, ease of learning, light weight, and easy extensibility.
  • 2. As for the self-developed RPC, I found no specific reason for it; when actually adopting XXL-JOB, the company's own RPC framework should be considered.
  • 3. The shortcomings of Quartz scheduling that the author lists are something I still need to understand in more depth.
  • 4. The framework's many safeguards for downtime, failures, timeouts, and other abnormal conditions are worth learning from.
  • 5. The implementation of rolling logs and the log system still needs further study.

Origin juejin.im/post/5d8d85696fb9a04dda7087c6