Interpretation of XXL-JOB core source code and analysis of time wheel principle

Hello! Today I want to walk through the core implementation of XXL-JOB. If you already use XXL-JOB, you have probably wondered how it works under the hood; if you have never used it, this article is a good introduction.

The structure diagram of XXL-JOB (version 2.0) is as follows:

How does it work? From the user's perspective, the executor must first register with the server. That raises several questions: how does the executor register, how often does it register, and what communication protocol does it use?

Once registration is complete, the server knows which executors exist and can trigger task scheduling. So how does the server record each task's trigger time and schedule it precisely? XXL-JOB uses the Quartz scheduling framework; in this article I replace it with a time wheel scheme.

Finally, how does the executor execute the task when it receives the scheduling request?

With these questions in mind, let's start exploring XXL-JOB. First, the project modules. The layout is simple, with only two:

  • xxl-job-core: the executor depends on this module;

  • xxl-job-admin: corresponds to the scheduling center in the architecture diagram;

This article is fairly dense, so read it alongside the source code. The source code version is 2.0.2.

1. Job service automatic registration

The first core technical point is service registration.

Service registration starts with the XxlJobSpringExecutor class in the xxl-job-core module. It is a Spring bean, defined as follows:

 @Bean(initMethod = "start", destroyMethod = "destroy") 
 public XxlJobSpringExecutor xxlJobExecutor() { 
     XxlJobSpringExecutor xxlJobSpringExecutor = new XxlJobSpringExecutor(); 
     xxlJobSpringExecutor.setAdminAddresses(adminAddresses); 
     // some other registration information 
     return xxlJobSpringExecutor; 
 }

Tracing the code leads to the following call chain:

 xxl-job-core module
 spring bean: XxlJobSpringExecutor # start()
 -> XxlJobExecutor # start() -> initRpcProvider()

 xxl-rpc-core.jar
 -> XxlRpcProviderFactory # start()
 -> ServiceRegistry # start()
 -> ExecutorServiceRegistry # start()
 -> ExecutorRegistryThread # start()

ExecutorRegistryThread is where service registration is actually implemented. The core code of its start() method is as follows:

 public void start(String appName, String address) {
     registryThread = new Thread(new Runnable() {
         @Override
         public void run() {
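             // registry: heartbeat loop, runs until toStop is set
             while (!toStop) {
                 try {
                     // do registry
                     adminBiz.registry(registryParam);
                     TimeUnit.SECONDS.sleep(JobConstants.HEARTBEAT_INTERVAL); // 30s
                 } catch (InterruptedException e) {
                     // sleep() throws a checked exception; restore the flag and leave the loop
                     Thread.currentThread().interrupt();
                     break;
                 }
             }
             // registry remove: deregister on shutdown
             adminBiz.registryRemove(registryParam);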
         }
     });
     registryThread.setDaemon(true);
     registryThread.start();
 }

As you can see, the executor registers every 30 seconds. Let's keep going.

2. Implementation of automatic registration communication technology

From the core code of ExecutorRegistryThread # start() above, you can see that registration is performed by the call adminBiz.registry(registryParam). The call chain is summarized as follows:

 xxl-job-core module

 AdminBiz # registry()
 -> AdminBizClient # registry()
 -> XxlJobRemotingUtil # postBody()
 -> POST api/registry (jdk HttpURLConnection)

In the end, registration is a plain POST request over HTTP, and the registration data format is as follows:

 {
   "registryGroup": "EXECUTOR",
   "registryKey": "example-job-executor",
   "registryValue": "10.0.0.10:9999"
 }
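
As a rough illustration of that last hop, the sketch below builds the same JSON body and posts it with the JDK's HttpURLConnection. The class RegistrySketch and its method names are hypothetical and are not XXL-JOB code; the real utility may handle more than is shown here (for example, parsing the response body).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of XXL-JOB.
final class RegistrySketch {

    // Builds the registration body shown above.
    // Assumption: the values contain no characters that need JSON escaping.
    static String toJson(String group, String key, String value) {
        return "{\"registryGroup\":\"" + group + "\","
                + "\"registryKey\":\"" + key + "\","
                + "\"registryValue\":\"" + value + "\"}";
    }

    // POSTs the body to the given URL and returns the HTTP status code.
    static int post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            conn.setRequestProperty("Content-Type", "application/json;charset=UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call would look like RegistrySketch.post(adminUrl + "/api/registry", json), where adminUrl stands for whatever address the scheduling center is deployed at.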

With that, let's return to the questions from the beginning of the article:

How does the executor register with the server? How often does it register? What communication protocol does it use?

The answers are now clear: the executor sends an HTTP POST to api/registry every 30 seconds.

3. Implementation of task scheduling

Let's look at the second core technical point, task scheduling.

XXL-JOB itself uses the Quartz scheduling framework. Here I want to introduce a time wheel implementation instead. The core source code is as follows:

 @Component
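 public class JobScheduleHandler {

     private Thread scheduler;
     private Thread ringConsumer;
     // ring slot (second of the minute, 0-59) -> ids of the jobs due in that slot
     private final Map<Integer, List<Integer>> ring = new ConcurrentHashMap<>();
     // The fields below are used later but were elided from this excerpt; their types are assumed
     private final Lock lock = new ReentrantLock();
     private volatile boolean schedulerStop = false;
     private volatile boolean ringConsumerStop = false;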
     
     @PostConstruct
     public void start() {
         scheduler = new Thread(new JobScheduler(), "job-scheduler");
         scheduler.setDaemon(true);
         scheduler.start();

         ringConsumer = new Thread(new RingConsumer(), "job-ring-handler");
         ringConsumer.setDaemon(true);
         ringConsumer.start();
     }
     
     class JobScheduler implements Runnable {
         @Override
         public void run() {
             sleep(5000 - System.currentTimeMillis() % 1000);
             while (!schedulerStop) {
                 // acquire the lock before try, so finally only unlocks a lock that is held
                 lock.lock();
                 try {
                     // pre read to ring
                 } catch (Exception e) {
                     log.error("JobScheduler error", e);
                 } finally {
                     lock.unlock();
                 }
                 sleep(1000);
             }
         }
     }
     
     class RingConsumer implements Runnable {
         @Override
         public void run() {
             sleep(1000 - System.currentTimeMillis() % 1000);
             while (!ringConsumerStop) {
                 try {
                     int nowSecond = Calendar.getInstance().get(Calendar.SECOND);
                     List<Integer> jobIds = ring.remove(nowSecond % 60);
                     // trigger task scheduling
                 } catch (Exception e) {
                     log.error("ring consumer error", e);
                 }
                 sleep(1000 - System.currentTimeMillis() % 1000);
             }
         }
     }
 }
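
One detail in the listing is easy to miss: both loops sleep for 1000 - System.currentTimeMillis() % 1000 milliseconds rather than a fixed second, so each wake-up lands on a whole-second boundary and the loop does not drift. The sketch below isolates that arithmetic. TickAlignSketch and its method names are hypothetical, and slotOf assumes the local zone offset is a whole number of minutes, so that the second of the minute can be derived from epoch time directly.

```java
// Hypothetical helper isolating the alignment arithmetic used by both loops.
final class TickAlignSketch {

    // Milliseconds from nowMillis until the next whole-second boundary (1 to 1000).
    static long millisToNextSecond(long nowMillis) {
        return 1000 - nowMillis % 1000;
    }

    // Slot of a 60-slot ring that the given epoch time falls into.
    // Assumption: the zone offset is a whole number of minutes.
    static int slotOf(long epochMillis) {
        return (int) (epochMillis / 1000 % 60);
    }
}
```

For example, if the loop finishes its work 234 ms into a second, it sleeps 766 ms and wakes at the start of the next second, however long the work itself took.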

JobScheduleHandler runs two threads: job-scheduler is the read-ahead thread and job-ring-handler is the time wheel thread. So how does the time wheel schedule tasks precisely?

The realization principle of the time wheel

Common clocks fall into two types according to how the second hand moves: ticking and sweeping.

Take the ticking clock as an example. The clock face can be treated as an array: each position where the second hand stops, from second 1 to second 60, is an array index, with second 60 mapping to index 0. Assume there are currently 3 tasks to execute:

 jobid: 101 Start execution at 0 seconds, 2s/time 
 jobid: 102 Start execution at 0 seconds, 3s/time 
 jobid: 103 Start execution at 3 seconds, 4s/time

The array model corresponding to the time of 0 seconds is shown in the figure below:

I split second 0 into three stages:

  • Before execution: read which tasks are waiting to be executed at that moment, and get their task ids;

  • Executing: look up each task's run strategy by task id, and execute the task;

  • After execution: update the next execution time of the task;

Then the time pointer advances one tick, to second 1. At this moment, the tasks in the time wheel are unchanged.

At second 2, the read-ahead thread adds jobid 103 to the time wheel, and the tasks stored under that array index are executed:

Likewise, at second 3 the array index of each executed task is updated again.
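
The walkthrough above can be condensed into a small single-threaded sketch. It is a minimal illustration, not the code from the listing earlier: the class TimeWheelSketch and its methods are hypothetical, there is no read-ahead thread or locking, and repeat intervals are assumed to be between 1 and 59 seconds so that a job always fits within one turn of the wheel.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative single-threaded time wheel; all names here are hypothetical.
final class TimeWheelSketch {
    private static final int SLOTS = 60;

    // slot index (second of the minute) -> ids of the jobs due in that slot
    private final Map<Integer, List<Integer>> ring = new HashMap<>();
    // job id -> repeat interval in seconds (assumed to be between 1 and 59)
    private final Map<Integer, Integer> intervals = new HashMap<>();

    // Registers a job's interval and places it in the slot of its first run.
    void schedule(int jobId, int firstSecond, int intervalSeconds) {
        intervals.put(jobId, intervalSeconds);
        put(firstSecond % SLOTS, jobId);
    }

    // Consumes one slot: returns the jobs due at this second and re-queues each one.
    List<Integer> tick(int second) {
        // Before execution: take the job ids stored in the current slot.
        List<Integer> due = ring.remove(second % SLOTS);
        if (due == null) {
            return Collections.emptyList();
        }
        for (int jobId : due) {
            // Executing: a real scheduler would dispatch the job here.
            // After execution: compute the next slot and put the job back.
            put((second + intervals.get(jobId)) % SLOTS, jobId);
        }
        return due;
    }

    private void put(int slot, int jobId) {
        ring.computeIfAbsent(slot, k -> new ArrayList<>()).add(jobId);
    }
}
```

With jobs 101 (every 2 s) and 102 (every 3 s) scheduled at second 0, and job 103 (every 4 s) added for second 3 before that slot is consumed, tick(0) returns [101, 102], tick(1) returns an empty list, tick(2) returns [101], and tick(3) returns [102, 103].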

So does this second-granularity time wheel introduce any error?

The accuracy of task scheduling depends on the tick size of the time wheel. As an example, split the one-second tick at second 0 into 1000 ms.

Suppose all tasks in that slot have been dispatched by the 500 ms mark, and the read-ahead thread loads a new task at 501 ms. The slot has already been consumed, so the task has to wait for the next round of scheduling, around the 500 ms mark of second 1. The error is one tick, that is, 1 s. If the tick is 0.5 seconds instead, the error shrinks to 500 ms.

So the smaller the tick, the smaller the error. The right choice depends on the business requirements, though: reducing the error costs more CPU.
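
The relationship between tick size and error can be checked with a little arithmetic. The sketch below uses a deliberately simplified model that is my own assumption, not the behaviour of the code above: the wheel thread wakes exactly on tick boundaries and dispatches every task that became due since the previous boundary. TickErrorSketch and its methods are hypothetical names.

```java
// Simplified model, assumed for illustration: the wheel thread wakes exactly on
// tick boundaries and dispatches every task that became due since the last boundary.
final class TickErrorSketch {

    // First tick boundary at or after the moment the task becomes due.
    static long dispatchTime(long dueMillis, long tickMillis) {
        return ((dueMillis + tickMillis - 1) / tickMillis) * tickMillis;
    }

    // How long the task waits past its due time.
    static long delay(long dueMillis, long tickMillis) {
        return dispatchTime(dueMillis, tickMillis) - dueMillis;
    }
}
```

In this model a task that becomes due 1 ms after a boundary waits 999 ms with a 1 s tick and 499 ms with a 500 ms tick, so the worst case is always just under one tick.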

Now that the principle of task scheduling is clear, how do the scheduler and the executor communicate?

4. Realization of task scheduling communication technology

In the xxl-job-admin module, the call chain is as follows:

 xxl-job-admin module
 JobTriggerPoolHelper # trigger()
 -> ThreadPoolExecutor # execute() (split into fast and slow thread pools)
 -> XxlJobTrigger # trigger() -> processTrigger() -> runExecutor()
 -> XxlJobDynamicScheduler # getExecutorBiz()
 -> ExecutorBiz # run() (dynamic proxy implementation; the run method called here is passed on as a parameter) [1]
 -> XxlRpcReferenceBean.new InvocationHandler() # invoke()

 xxl-rpc-core.jar
 -> NettyHttpClient # asyncSend()
 (POST request; the XxlRpcRequest parameter sets methodName to the method called at [1], i.e. "run")

In the end, the communication goes over HTTP. The core communication code is as follows:

 public void send(XxlRpcRequest xxlRpcRequest) throws Exception {
     byte[] requestBytes = serializer.serialize(xxlRpcRequest);
     DefaultFullHttpRequest request = new DefaultFullHttpRequest(HttpVersion.HTTP_1_1, HttpMethod.POST, new URI(address).getRawPath(), Unpooled.wrappedBuffer(requestBytes));
     request.headers().set(HttpHeaderNames.HOST, host);
     request.headers().set(HttpHeaderNames.CONNECTION, HttpHeaderValues.KEEP_ALIVE);
     request.headers().set(HttpHeaderNames.CONTENT_LENGTH, request.content().readableBytes());
     this.channel.writeAndFlush(request).sync();
 }
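
The dynamic proxy step marked [1] in the call chain can be sketched with the JDK's java.lang.reflect.Proxy: calling a method on the proxy does not run any local implementation, it only records which method was called so that the name can be sent to the remote side. In this sketch RemoteExecutor stands in for ExecutorBiz and RpcCall for XxlRpcRequest; both are hypothetical, and the request is captured in a list instead of being sent over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: RemoteExecutor plays the role of ExecutorBiz,
// RpcCall plays the role of XxlRpcRequest. Neither is XXL-JOB code.
interface RemoteExecutor {
    String run(String jobParam);
}

final class RpcCall {
    final String className;
    final String methodName;
    final Object[] args;

    RpcCall(String className, String methodName, Object[] args) {
        this.className = className;
        this.methodName = methodName;
        this.args = args;
    }
}

final class RpcProxySketch {
    // Calls are captured here instead of being sent over the network.
    static final List<RpcCall> sent = new ArrayList<>();

    // Returns a proxy that turns every method call into an RpcCall.
    static <T> T proxyFor(Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            // The invoked method's name travels as data, as at [1] in the call chain.
            sent.add(new RpcCall(iface.getName(), method.getName(), args));
            // Fixed reply for this demo; it only fits methods that return String.
            // A real client would wait for the remote response instead.
            return "accepted";
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }
}
```

After RpcProxySketch.proxyFor(RemoteExecutor.class).run("jobId=101"), the captured RpcCall carries methodName "run" and the argument "jobId=101", which is the information the client needs to serialize into the request body.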

Once the scheduler has sent the execution request to the executor, the rest is the executor's job.

5. Implementation of the executor receiving task interface

On the executor side, the call chain is as follows:

 xxl-job-core module
 spring bean: XxlJobSpringExecutor # start()
 -> XxlJobExecutor # start() -> initRpcProvider()

 xxl-rpc-core.jar
 -> XxlRpcProviderFactory # start()
 -> Server # start()
 -> NettyHttpServer # start()

 netty interface implementation
 NettyHttpServerHandler # channelRead0() -> process() (thread pool execution)
 -> XxlRpcProviderFactory # invokeService()
 (reflection call according to the methodName in the request parameter XxlRpcRequest)
 -> ExecutorBizImpl # run()
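
The reflection step in this chain can be sketched as follows: the server keeps a map from service name to implementation object, looks up the method by the name carried in the request, and invokes it. ReflectiveDispatcherSketch and EchoExecutorSketch are hypothetical names for illustration and do not reproduce the xxl-rpc implementation.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical service implementation used only to demonstrate the dispatch.
final class EchoExecutorSketch {
    public String run(String param) {
        return "ran " + param;
    }
}

// Hypothetical server-side dispatcher; names are illustrative, not xxl-rpc's.
final class ReflectiveDispatcherSketch {
    // service name -> service implementation
    private final Map<String, Object> services = new HashMap<>();

    void register(String serviceName, Object impl) {
        services.put(serviceName, impl);
    }

    // Finds the service, resolves the method by name and parameter types, and calls it.
    Object invoke(String serviceName, String methodName, Class<?>[] paramTypes, Object[] args) {
        Object service = services.get(serviceName);
        if (service == null) {
            throw new IllegalArgumentException("no such service: " + serviceName);
        }
        try {
            Method method = service.getClass().getMethod(methodName, paramTypes);
            return method.invoke(service, args);
        } catch (ReflectiveOperationException e) {
            // Covers a missing method, an access failure, or an exception thrown by the target.
            throw new IllegalStateException("invocation failed: " + methodName, e);
        }
    }
}
```

Registering an EchoExecutorSketch under some service name and invoking "run" with a single String argument returns the value produced by its run method, which mirrors how a methodName of "run" ends up in ExecutorBizImpl # run().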

We can also view the interface implementation through HTTP requests:

 GET http://localhost:17711/services

The result is as follows:

 <ui>
     <li>com.xxl.job.core.biz.ExecutorBiz: com.xxl.job.core.biz.impl.ExecutorBizImpl@d579177</li>
 </ui>

In summary, the executor receives tasks through the following endpoint:

 POST http://localhost:17711

Note that calling this endpoint from Postman will not work as-is, because the request body is a serialized XxlRpcRequest rather than an ordinary HTTP payload.

Next comes the logic that runs once the executor has received a task. The call chain is as follows:

 xxl-job-core module
 spring bean: XxlJobSpringExecutor # start()
 -> XxlJobExecutor # start() -> initRpcProvider()
 -> new ExecutorBizImpl()
 -> JobThread # pushTriggerQueue()

 spring bean: XxlJobExecutor # registJobThread() starts the JobThread
 -> JobThread # run()
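
The hand-off in this chain is a classic producer and consumer arrangement: the request-handling side only pushes the trigger into a queue, and a dedicated job thread pulls triggers off the queue and runs them. The sketch below shows that pattern under simplifying assumptions. JobThreadSketch is a hypothetical class, not XXL-JOB's JobThread; it records the triggers it handles instead of running a job handler, and the 100 ms poll timeout is an arbitrary choice.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical per-job worker thread; a sketch of the queue hand-off only.
final class JobThreadSketch extends Thread {
    private final BlockingQueue<String> triggerQueue = new LinkedBlockingQueue<>();
    private final List<String> handled = new CopyOnWriteArrayList<>();
    private volatile boolean toStop = false;

    // Called by the request-handling side: enqueue and return immediately.
    void pushTriggerQueue(String triggerParam) {
        triggerQueue.add(triggerParam);
    }

    @Override
    public void run() {
        // Keep going until a stop is requested and the queue has been drained.
        while (!toStop || !triggerQueue.isEmpty()) {
            try {
                // Wait briefly so the stop flag is re-checked even when idle.
                String param = triggerQueue.poll(100, TimeUnit.MILLISECONDS);
                if (param != null) {
                    handled.add(param); // a real worker would run the job handler here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Requests a stop and waits up to one second for the thread to exit.
    void shutdown() {
        toStop = true;
        try {
            join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Triggers processed so far, in processing order.
    List<String> handled() {
        return handled;
    }
}
```

Because the loop drains the queue before exiting, triggers pushed before shutdown() are still processed in order, and the caller of pushTriggerQueue never blocks on job execution.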

So far, we have sorted out the core process.

Summary

With this walkthrough, you should now have a plan in mind for building a distributed task scheduling system from scratch. The time wheel scheme described in this article is also the scheme our company used to rework XXL-JOB, and it was later applied to the delayed-message implementation of our message middleware.

Feedback and discussion are welcome. Official account: [Yang Yang technotes]


Origin blog.csdn.net/yang237061644/article/details/128193922