Source code analysis: how xxl-job guarantees stability

This article builds on my earlier article on xxl-job source code analysis: how a task runs.

1. How does the dispatch center fail over (Failover)?

Failover happens at the scheduling stage: if one executor instance is down, the dispatch center automatically switches to a healthy executor and completes the scheduling request there.

First, when creating a new scheduled task, select the failover routing strategy:

Scheduling goes through the processTrigger method of XxlJobTrigger, shown below:

We can see that several routing strategies are available; failover corresponds to the FAILOVER entry (the red box in the screenshot).
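For readers without the screenshot, the available strategies correspond roughly to the following enum entries (a simplified sketch; the exact entries and their order depend on the xxl-job version):

```java
// Simplified sketch of the routing strategy options (ExecutorRouteStrategyEnum in xxl-job).
// Each entry is bound to a router implementation in the real source; comments are mine.
public enum ExecutorRouteStrategyEnum {
    FIRST,                  // always use the first registered executor address
    LAST,                   // always use the last registered executor address
    ROUND,                  // round-robin over the address list
    RANDOM,                 // pick a random address
    CONSISTENT_HASH,        // hash the job onto a ring of addresses
    LEAST_FREQUENTLY_USED,  // prefer the least frequently used address
    LEAST_RECENTLY_USED,    // prefer the least recently used address
    FAILOVER,               // heartbeat each address in order, use the first healthy one
    BUSYOVER,               // idle-check each address, use the first idle one
    SHARDING_BROADCAST      // broadcast to all addresses with sharding parameters
}
```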

Now follow processTrigger further down:

We can see that it resolves an executor address according to the routing strategy; for failover this ends up in the route method of ExecutorRouteFailover:
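The route method looks roughly like the following (a trimmed sketch of the xxl-job source; logging is simplified and helper names such as getExecutorBiz may differ slightly between versions):

```java
// Trimmed sketch of ExecutorRouteFailover.route: heartbeat each registered
// address in order and return the first one that answers successfully.
public ReturnT<String> route(TriggerParam triggerParam, List<String> addressList) {
    StringBuilder beatResultLog = new StringBuilder();
    for (String address : addressList) {
        ReturnT<String> beatResult;
        try {
            // RPC "beat" request to the executor, handled by ExecutorBizImpl.beat() on the executor side
            ExecutorBiz executorBiz = XxlJobScheduler.getExecutorBiz(address);
            beatResult = executorBiz.beat();
        } catch (Exception e) {
            beatResult = new ReturnT<>(ReturnT.FAIL_CODE, e.getMessage());
        }
        beatResultLog.append("beat ").append(address).append(" : ").append(beatResult.getCode()).append("; ");

        // the first address whose heartbeat succeeds is used for the real scheduling request
        if (beatResult.getCode() == ReturnT.SUCCESS_CODE) {
            return new ReturnT<>(ReturnT.SUCCESS_CODE, address);
        }
    }
    // every heartbeat failed: report failure so processTrigger records it in XxlJobLog
    return new ReturnT<>(ReturnT.FAIL_CODE, beatResultLog.toString());
}
```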

In the code above, each address in addressList is sent a heartbeat; addressList is the list of executor addresses that automatically registered themselves with the dispatch center when the executors started.

Now suppose there are two executors, ip1 and ip2. The loop above will run heartbeat detection against both of them. From the source we can see that heartbeat detection is really an RPC request sent by the dispatch center to the executor, and it is handled by the beat method of ExecutorBizImpl on the executor side:

If an address answers the heartbeat with ReturnT.SUCCESS, that address is returned and the normal scheduling request is sent to it. If heartbeat detection fails for every registered address, we continue down processTrigger:

When heartbeat detection fails for all registered addresses, the resolved address is empty, so a failure result is assigned to triggerResult and the corresponding XxlJobLog record is updated with that result.

 

2. How does the dispatch center handle failed task executions?

When the dispatch center starts, it launches a daemon thread (JobFailMonitorHelper) that watches the scheduling logs for failures.

2.1 Failure retry

Let's look at how failed executions are retried:

We can see that it first loads the IDs of all failed scheduling logs and then iterates over them; for any log whose remaining fail-retry count is greater than 0, the job is triggered again.

The retry count is set when the task is created:

Note the log.getExecutorFailRetryCount() - 1 parameter passed when re-triggering. If the retried scheduling fails again, it keeps being retried, with the count decremented each time, until the remaining count reaches 0; this completes the failure-retry feature.
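Put together, the fail-monitor loop works roughly like this (a trimmed sketch of JobFailMonitorHelper; the DAO calls and alarm bookkeeping are simplified):

```java
// Trimmed sketch of the JobFailMonitorHelper daemon loop on the dispatch center.
while (!toStop) {
    // 1. load the ids of scheduling logs whose trigger or handle result is FAIL
    List<Long> failLogIds = xxlJobLogDao.findFailJobLogIds(1000);
    for (long failLogId : failLogIds) {
        XxlJobLog log = xxlJobLogDao.load(failLogId);

        // 2. failure retry: the remaining retry count was stored on the log when the job was triggered
        if (log.getExecutorFailRetryCount() > 0) {
            // re-trigger the job, passing the remaining count minus one
            JobTriggerPoolHelper.trigger(log.getJobId(), TriggerTypeEnum.RETRY,
                    log.getExecutorFailRetryCount() - 1,
                    log.getExecutorShardingParam(), log.getExecutorParam());
        }

        // 3. failure alarm: send the alert email (section 2.2) and record the alarm result on the log
        XxlJobInfo info = xxlJobInfoDao.loadById(log.getJobId());
        boolean alarmSent = failAlarm(info, log);
        log.setAlarmStatus(alarmSent ? 2 : 3);
        xxlJobLogDao.updateAlarmStatus(log);
    }
    TimeUnit.SECONDS.sleep(10);   // poll interval between scans
}
```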

2.2 Email alerts

Read on for the email alert feature:

We can see that as long as an email sender and recipients are configured, the alert email can be sent, and multiple recipients are supported. The sender is configured as follows:

The recipients are configured when the task is created:

After the alert email has been sent, the alert result is written back to the scheduling log, completing the email alert feature.
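For illustration, sending the alert with Spring's JavaMailSender looks roughly like this (a hedged sketch, not the exact xxl-job code; the real implementation renders an HTML body from the failed XxlJobLog, but the sending mechanics are the same Spring mail API):

```java
// Hedged sketch: send the failure details of one job to the comma-separated
// recipients configured on the task; the sender comes from the admin properties.
private boolean sendFailAlarmMail(JavaMailSender mailSender, String from,
                                  String alarmEmails, String jobDesc, String failMsg) {
    try {
        MimeMessage mimeMessage = mailSender.createMimeMessage();
        MimeMessageHelper helper = new MimeMessageHelper(mimeMessage, true);
        helper.setFrom(from);
        helper.setTo(alarmEmails.split(","));          // multiple recipients are supported
        helper.setSubject("xxl-job alarm: job execute failed");
        helper.setText("job: " + jobDesc + "\nfail msg: " + failMsg, false);
        mailSender.send(mimeMessage);
        return true;                                    // the alarm result is written back to the log
    } catch (Exception e) {
        return false;
    }
}
```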

 

3. How are task timeouts and exceptions on the executor handled?

3.1 Executor task timeout handling

As we saw above, a scheduling request is eventually placed into the triggerQueue of the corresponding JobThread for processing:

Let's continue and see how JobThread handles it:

The job thread takes the trigger out of triggerQueue; if the timeout configured when the task was created is greater than 0:

the task is run asynchronously in a FutureTask and the result is fetched with the configured timeout; if the result does not arrive in time, a TimeoutException is thrown,

a timeout failure result is assigned to executeResult, and in the finally block the futureThread that is still running the task is interrupted.
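In outline, this part of JobThread.run looks like the following (a trimmed sketch of the source; handler is the task's IJobHandler and executeResult is the result later reported back to the dispatch center):

```java
// Trimmed sketch of the execution block inside JobThread.run.
try {
    if (triggerParam.getExecutorTimeout() > 0) {
        // run the handler in a separate thread so its execution time can be bounded
        FutureTask<ReturnT<String>> futureTask = new FutureTask<>(
                () -> handler.execute(triggerParam.getExecutorParams()));
        Thread futureThread = new Thread(futureTask);
        futureThread.start();
        try {
            // wait at most executorTimeout seconds for the handler to finish
            executeResult = futureTask.get(triggerParam.getExecutorTimeout(), TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // result not available in time: mark the execution as a timeout failure
            executeResult = new ReturnT<>(IJobHandler.FAIL_TIMEOUT.getCode(), "job execute timeout");
        } finally {
            // interrupt the thread that is still running the handler
            futureThread.interrupt();
        }
    } else {
        // no timeout configured: run the handler directly on the job thread
        executeResult = handler.execute(triggerParam.getExecutorParams());
    }
} catch (Throwable e) {
    // section 3.2: any exception thrown during execution becomes a failure result
    executeResult = new ReturnT<>(ReturnT.FAIL_CODE, e.getMessage());
}
```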

3.2 Exceptions thrown during task execution

The task itself runs via handler.execute; if it throws an exception, the exception is caught and a failure result is assigned to executeResult.

3.3 Result callback processing

Continuing on, after the task finishes, in the finally block:

the execution result and related parameters are pushed into the callBackQueue; a callback thread that consumes this queue is started when the executor starts up:

Read on to see how the callback thread handles it:

Here the callback tasks are taken out of callBackQueue in batches for processing; reading on:
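The consuming side looks roughly like this (a trimmed sketch of TriggerCallbackThread; the queue elements are the HandleCallbackParam objects pushed in the finally block above, and error handling for failed callbacks is omitted):

```java
// Trimmed sketch of the callback thread started on the executor (TriggerCallbackThread).
while (!toStop) {
    // block until at least one result is available, then drain the rest as one batch
    HandleCallbackParam callback = callBackQueue.take();
    List<HandleCallbackParam> callbackParamList = new ArrayList<>();
    callBackQueue.drainTo(callbackParamList);
    callbackParamList.add(callback);

    // RPC from the executor to the dispatch center; handled by AdminBizImpl.callback on the admin side
    ReturnT<String> callbackResult = adminBiz.callback(callbackParamList);
}
```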

Processing the batch ends up calling adminBiz.callback; let's look at how adminBiz is initialized:

Looking at the getObject method above (ps: if this is unfamiliar, see my earlier article on how xxl-job runs tasks), this is again an RPC request, except that this time it is sent from the executor to the dispatch center, so adminBiz.callback ends up executing the callback method of AdminBizImpl inside the dispatch center:

Read on for the detailed callback implementation:

We can see that if the task executed successfully and a child task ID is configured, the corresponding child tasks are triggered (ps: the child task ID is configured when the task is created). If the task execution failed (i.e. it timed out or threw an exception), read on:

If the execution result code indicates failure, the failure information is written to the xxl_job_log table; the fail-monitor thread described above then picks these records up and performs the corresponding retry or alert. At this point, processing the result of one scheduled task is complete.
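Put together, handling one callback on the dispatch center side is roughly the following (a trimmed sketch of AdminBizImpl; child-job triggering and message formatting are simplified):

```java
// Trimmed sketch of handling one HandleCallbackParam inside the dispatch center.
private ReturnT<String> callback(HandleCallbackParam handleCallbackParam) {
    XxlJobLog log = xxlJobLogDao.load(handleCallbackParam.getLogId());

    // execution succeeded: trigger the configured child tasks, if any
    if (handleCallbackParam.getExecuteResult().getCode() == ReturnT.SUCCESS_CODE) {
        XxlJobInfo jobInfo = xxlJobInfoDao.loadById(log.getJobId());
        if (jobInfo != null && jobInfo.getChildJobId() != null && !jobInfo.getChildJobId().trim().isEmpty()) {
            for (String childJobId : jobInfo.getChildJobId().split(",")) {
                JobTriggerPoolHelper.trigger(Integer.parseInt(childJobId), TriggerTypeEnum.PARENT, -1, null, null);
            }
        }
    }

    // success or failure: write the handle result back to xxl_job_log; a failing handle
    // code is what JobFailMonitorHelper later picks up to retry or alert on
    log.setHandleTime(new Date());
    log.setHandleCode(handleCallbackParam.getExecuteResult().getCode());
    log.setHandleMsg(handleCallbackParam.getExecuteResult().getMsg());
    xxlJobLogDao.updateHandleInfo(log);
    return ReturnT.SUCCESS;
}
```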

 

Summary:

Overall, xxl-job writes abnormal execution results into xxl_job_log, and the JobFailMonitorHelper daemon thread started by the dispatch center then handles these exceptions by retrying the failed jobs and sending alert emails.

 


Origin: blog.csdn.net/Royal_lr/article/details/100113760