【XXL-JOB】 XXL-JOB task sharding

Preface

xxl-job is a distributed task scheduling platform. Besides ordinary scheduled tasks, it supports sharded tasks: a large task is split into smaller pieces that run on several executors at once, which improves throughput and reliability. Sharding is done through shard broadcast, which triggers the task on every executor node in the cluster and gives each node its own sharding parameters, so the nodes work through the task in parallel.

Detailed tutorial on xxl-job shard broadcast task

Create the task

In the xxl-job admin console, create a task and fill in its basic information, such as the executor, the task description, the schedule, and the JobHandler name. Shard broadcast is not a separate task type: it is selected as the executor routing strategy of the task.

Write task code

Task code can be written in Java, Python, Shell and other languages. The code must provide a handler method that contains the task logic; in Java this is a method annotated with @XxlJob. In a shard broadcast task, the handler runs once on every executor node in the cluster on each trigger, so the code must use the sharding parameters to make sure that different nodes do not process the same data.

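Sharding parameters

The sharding parameters tell each executor which slice of the work it is responsible for. They consist of the total number of shards and the index of the current shard, which starts at 0. They are not configured manually on the executor: the scheduling center assigns them when it triggers the task, and the handler reads them through the xxl-job API (XxlJobHelper.getShardTotal() and XxlJobHelper.getShardIndex()).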
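To make the two parameters concrete, here is a minimal sketch of what each executor receives. It does not use the xxl-job library: ShardAssignmentSketch and assign are hypothetical names, and the executor addresses are made up. It assumes that, under shard broadcast, the shard total is the number of executors registered for the task and the shard index is an executor's position in that list.

```java
// Hypothetical stand-in for the scheduler side of a shard broadcast trigger.
// Not part of xxl-job; it only illustrates the shape of the parameters.
class ShardAssignmentSketch {

    // Returns one {shardIndex, shardTotal} pair per executor, in list order.
    static int[][] assign(String[] executorAddresses) {
        int total = executorAddresses.length;
        int[][] params = new int[total][];
        for (int index = 0; index < total; index++) {
            params[index] = new int[] {index, total};
        }
        return params;
    }

    public static void main(String[] args) {
        // Made-up addresses for three executor nodes.
        String[] executors = {"10.0.0.1:9999", "10.0.0.2:9999", "10.0.0.3:9999"};
        int[][] params = assign(executors);
        for (int i = 0; i < executors.length; i++) {
            System.out.println(executors[i] + " -> shardIndex=" + params[i][0]
                    + ", shardTotal=" + params[i][1]);
        }
    }
}
```

With three executors, the nodes see 0/3, 1/3 and 2/3. If a fourth node joins, the next trigger hands out 0/4 to 3/4, which is why a handler should read the shard total at run time instead of hard-coding it.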

Run the task

Start the xxl-job executor program on each executor node and wait for the task to be scheduled. When the task is triggered, the executor calls the handler method automatically and makes the sharding parameters for that node available to it. Inside the handler, the task logic uses the sharding parameters to decide which part of the work to process.

View task execution results

In the xxl-job admin console, you can view the execution status and the execution log of the task. If an execution fails, check the log to locate the problem.

Example 1

Code example of xxl-job shard broadcast task:

import com.xxl.job.core.context.XxlJobHelper;
import com.xxl.job.core.handler.annotation.XxlJob;

@XxlJob("broadcastJob")
public void broadcastJob() {
    // Sharding parameters are assigned by the scheduling center on each trigger
    int shardCount = XxlJobHelper.getShardTotal(); // total number of shards
    int shardIndex = XxlJobHelper.getShardIndex(); // index of the current shard

    // Task logic: each shard handles only the items that map to its index
    for (int i = 0; i < 100; i++) {
        if (i % shardCount == shardIndex) {
            // Work that belongs to the current shard
            System.out.println("Shard " + shardIndex + " is running: " + i);
        }
    }
}

In the above example, the @XxlJob annotation marks the broadcastJob method as a task handler named broadcastJob, and the task logic lives in that method. The handler first reads the sharding parameters through the xxl-job tool class XxlJobHelper: getShardTotal returns the total number of shards and getShardIndex returns the index of the current shard. It then loops over the numbers 0 to 99 and prints only those whose remainder modulo the shard total equals the current shard index, so each number is handled by exactly one executor. xxl-job passes in the sharding parameters automatically when the task is triggered; nothing needs to be set by hand.
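The modulo rule can be checked without a running cluster. The following sketch simulates every shard in one process and counts how many shards claim each item; ModuloShardingSketch, owns and claimCounts are hypothetical names used only for this illustration, and the shard total of 3 is an arbitrary choice.

```java
// Simulates the selection rule "item % shardTotal == shardIndex" for all shards.
class ModuloShardingSketch {

    // True when the shard with the given index is responsible for the item.
    // Items are assumed to be non-negative.
    static boolean owns(int item, int shardIndex, int shardTotal) {
        return item % shardTotal == shardIndex;
    }

    // For each item in [0, itemCount), counts how many shards claim it.
    static int[] claimCounts(int itemCount, int shardTotal) {
        int[] claims = new int[itemCount];
        for (int shardIndex = 0; shardIndex < shardTotal; shardIndex++) {
            for (int item = 0; item < itemCount; item++) {
                if (owns(item, shardIndex, shardTotal)) {
                    claims[item]++;
                }
            }
        }
        return claims;
    }

    public static void main(String[] args) {
        int[] claims = claimCounts(100, 3);
        boolean exactlyOnce = true;
        for (int c : claims) {
            exactlyOnce &= (c == 1);
        }
        System.out.println("every item claimed exactly once: " + exactlyOnce);
    }
}
```

Each item is claimed by exactly one shard, so no work is duplicated and none is skipped, whatever the shard total happens to be at trigger time.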

Example 2

Shard broadcast over 16 databases, each with 32 tables:

import com.xxl.job.core.context.XxlJobHelper;
import com.xxl.job.core.handler.annotation.XxlJob;

@XxlJob("broadcastJob")
public void broadcastJob() {
    // Sharding parameters are assigned by the scheduling center on each trigger
    int shardCount = XxlJobHelper.getShardTotal(); // total number of shards
    int shardIndex = XxlJobHelper.getShardIndex(); // index of the current shard

    // Database list
    String[] databases = {"db1", "db2", "db3", "db4", "db5", "db6", "db7", "db8", "db9", "db10", "db11", "db12", "db13", "db14", "db15", "db16"};

    // Table list (the same 32 table names exist in every database)
    String[] tables = {"table1", "table2", "table3", "table4", "table5", "table6", "table7", "table8", "table9", "table10", "table11", "table12", "table13", "table14", "table15", "table16", "table17", "table18", "table19", "table20", "table21", "table22", "table23", "table24", "table25", "table26", "table27", "table28", "table29", "table30", "table31", "table32"};

    // Process every database
    for (String database : databases) {
        // Process every table
        for (String table : tables) {
            // floorMod keeps the result in [0, shardCount) even when hashCode() is negative
            if (Math.floorMod(table.hashCode(), shardCount) == shardIndex) {
                // This table belongs to the current shard
                System.out.println("Shard " + shardIndex + " is processing database " + database + ", table " + table);

                // Run the actual task logic, e.g. read data from the database and process it
                // ...
            }
        }
    }
}

In this example, the @XxlJob annotation again marks the broadcastJob method as the task handler. The handler first reads the total number of shards and the index of the current shard, then walks through every table of every database and processes only the tables that belong to the current shard. Here the task logic just prints which table is being processed; a real task would read data from the database and process it at that point. The hashCode method converts the table name into an integer, and Math.floorMod maps that integer to a shard index between 0 and the shard total minus 1. A table belongs to the current shard when the result equals the current shard index, so every table is handled by exactly one shard, and the same shard handles that table in all 16 databases. Hashing spreads the tables roughly, though not perfectly, evenly across the shards, which reduces the risk that a naming pattern in the table names overloads a single shard.
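How evenly a hash spreads the work depends on the keys and on the shard total, so it is worth measuring. The sketch below counts how many tables land on each shard; HashShardingSketch, shardFor and distribution are hypothetical names, the shard total of 4 is arbitrary, and the key is the database name joined with the table name, a variation assumed here so that the tables of one database can be spread over several shards.

```java
// Counts how hash-based shard assignment spreads database/table pairs.
class HashShardingSketch {

    // Maps a key to a shard in [0, shardTotal). Math.floorMod keeps the
    // result non-negative even when hashCode() is negative, unlike %.
    static int shardFor(String key, int shardTotal) {
        return Math.floorMod(key.hashCode(), shardTotal);
    }

    // Returns, per shard, the number of database/table pairs assigned to it.
    static int[] distribution(int databaseCount, int tableCount, int shardTotal) {
        int[] perShard = new int[shardTotal];
        for (int d = 1; d <= databaseCount; d++) {
            for (int t = 1; t <= tableCount; t++) {
                perShard[shardFor("db" + d + ".table" + t, shardTotal)]++;
            }
        }
        return perShard;
    }

    public static void main(String[] args) {
        int[] perShard = distribution(16, 32, 4);
        for (int i = 0; i < perShard.length; i++) {
            System.out.println("shard " + i + ": " + perShard[i] + " tables");
        }
    }
}
```

The counts always add up to 16 × 32 = 512 because every pair maps to exactly one shard. If the printed counts turn out to be lopsided for a particular naming scheme, assigning by position in the list (index modulo shard total) is a simple alternative.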

Summary

Shard broadcast is an executor routing strategy of xxl-job that suits tasks which can be executed in parallel. In a production environment, it is typically used in the following scenarios:

  1. Data processing tasks: when cleaning, analyzing, or converting a large amount of data, the data can be split into shards that are processed on different executors, which improves execution efficiency and reliability.
  2. Distributed computing tasks: when running machine learning or deep learning computations on large-scale data, the computation can be split into smaller pieces that run on different executors to speed it up.
  3. Concurrent request tasks: when a large number of requests must be sent to multiple services, the requests can be divided among the executors to increase the overall concurrency.

In short, shard broadcast fits any task that can be split into smaller, independent pieces: it improves execution efficiency and reliability while reducing the load on any single node.

Origin blog.csdn.net/u011397981/article/details/132745762