Distributed Task Scheduling: PowerJob Advanced Features

1. Container

1. Introduction

PowerJob's container technology lets developers build Java processors independently of the Worker project. Simply put, it organizes a set of Java files (the many lightweight processors developers write) as a Maven project, giving them the development efficiency and maintainability of a regular project.

The container is a JVM-level container, not an OS-level container (Docker).

2. Application examples

●For example, suppose a database cleanup task suddenly appears that has nothing to do with the main business, so writing it into the main project would be inelegant. You can create a separate container for the data operation, develop the processor inside it, and have PowerJob's container deployment load and execute it on the Worker cluster.

●For example, for routine chores such as log cleanup and machine status reporting, many Java programmers are not comfortable writing shell scripts. With agent + container technology, you can use Java to perform the various operations that would otherwise require scripts.

(I feel the examples above aren't great... this is one of those things that's easier to grasp than to explain. Give it a try~ it's super easy to use~)

2. OpenAPI

OpenAPI allows developers to complete manual operations through interfaces, making the system more flexible as a whole. Developers can easily extend the original functions of PowerJob based on the API, for example, fully customize their own task scheduling strategy.

In other words, through OpenAPI, the access party can implement the entire task management and scheduling module of PowerJob.

1. Dependency

For the latest dependency version, please refer to the Maven Central repository: recommended address & alternate address.

<dependency>
  <groupId>tech.powerjob</groupId>
  <artifactId>powerjob-client</artifactId>
  <version>${latest.powerjob.version}</version>
</dependency>

2. Simple example

Stop a task instance through OpenAPI.

// Initialize the client; the server address, app name, and password are required
PowerJobClient client = new PowerJobClient("127.0.0.1:7700", "oms-test", "password");
// Call the relevant API
client.stopInstance(1586855173043L);

3. Workflow

1. What is a workflow?

A workflow describes the dependencies between tasks. For example, given four tasks A, B, C, and D, I want tasks B and C to run after task A finishes, and task D to run last. This dependency relationship can be described by a directed acyclic graph (DAG), as shown in the figure below.

[Figure: DAG in which A fans out to B and C, which both converge on D]
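The A → (B, C) → D example above can be sketched in plain Java. This is only an illustration of DAG ordering (Kahn's algorithm) with the example's task names; it is not PowerJob's internal scheduling model.

```java
import java.util.*;

public class DagOrderDemo {
    // Edges of the example DAG: A -> B, A -> C, B -> D, C -> D
    static final Map<String, List<String>> EDGES = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("D"),
            "C", List.of("D"),
            "D", List.of());

    // Kahn's algorithm: returns a valid execution order of the workflow nodes.
    static List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new HashMap<>();
        EDGES.keySet().forEach(n -> inDegree.put(n, 0));
        EDGES.values().forEach(list -> list.forEach(n -> inDegree.merge(n, 1, Integer::sum)));

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((n, d) -> { if (d == 0) ready.add(n); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String next : EDGES.get(node)) {
                // A node becomes ready once all of its upstream tasks finished
                if (inDegree.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(topologicalOrder()); // A runs first, D runs last
    }
}
```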

4. Processor

For some common tasks, PowerJob officially provides out-of-the-box Processors. You only need to add the following dependency to use all of the official processors!

Please obtain the latest version from the Maven Central repository yourself: click here

<dependency>
  <groupId>tech.powerjob</groupId>
  <artifactId>powerjob-official-processors</artifactId>
  <version>${latest.version}</version>
</dependency>

Please read the documentation carefully for the detailed usage of each official processor. If in doubt, it is recommended to read the source code directly!

Since many parameters are passed as JSON and involve escaping, it is strongly recommended to build the configuration in Java code (JSONObject#put) and then call the toJSONString method to generate the parameters.

1. Shell Processor

Fully qualified class name: tech.powerjob.official.processors.impl.script.ShellProcessor

Task parameters: fill in the shell script to be executed (paste the file content directly) or a script download link (http://xxx)

2. Python processor

Fully qualified class name: tech.powerjob.official.processors.impl.script.PythonProcessor

Note: the Python processor is executed with the machine's python command, so the script must be compatible with the local Python environment!

Task parameters: fill in the Python script to be executed (paste the file content directly) or a script download link (http://xxx)

3. HTTP Processor

Fully qualified class name: tech.powerjob.official.processors.impl.HttpProcessor

Task parameters (JSON):

  • method [required field]: GET / POST / DELETE / PUT
  • url [required field]: the request address
  • timeout [optional field]: timeout in seconds
  • mediaType [optional field]: the content type of the data sent with non-GET requests, e.g. application/json
  • body [optional field]: the request body for non-GET requests; the backend receives it as a String, so if it is JSON, pay attention to escaping
  • headers [optional field]: request headers; the backend receives them as Map<String, String>
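The documentation above recommends generating this JSON from Java code; fastjson's JSONObject would do the job, but to keep this sketch dependency-free it uses a tiny hand-rolled serializer. The esc and toJson helpers are our own, and the URL and body values are placeholders. It shows the pitfall the body field warns about: the body is itself a JSON string, so it must be escaped when embedded in the outer parameter JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpTaskParams {

    /** Minimal JSON string escaping (backslashes and quotes only). */
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serialize a flat String-to-String map as a JSON object. */
    static String toJson(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append('"').append(esc(e.getKey())).append("\":\"")
              .append(esc(e.getValue())).append('"');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("method", "POST");
        params.put("url", "http://localhost:8080/demo"); // placeholder URL
        params.put("mediaType", "application/json");
        // The body is a JSON string in its own right; esc() doubles its quotes
        // so the outer task-parameter JSON stays well-formed.
        params.put("body", "{\"name\":\"powerjob\"}");
        System.out.println(toJson(params));
    }
}
```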

4. File cleaning processor

Note: file deletion is a high-risk operation, please use this processor with caution. By default this processor is disabled; to enable it, pass the JVM parameter -Dpowerjob.official-processor.file-cleanup.enable=true

Fully qualified class name: tech.powerjob.official.processors.impl.FileCleanupProcessor

Task parameters (JSONArray): the overall parameter is an array, and each element is a JSON object describing one resource to clean up. The parameters of each node are as follows:

  • dirPath: the directory to search for files to delete (all matching files under this directory are found recursively)
  • filePattern: a Java regular expression matching the names of the files to delete
  • retentionTime: the retention period in hours; a file is deleted only if (current time - the file's last modified time) > retentionTime. This keeps recent rolling logs; 0 means ignore this rule

Since regular expressions inside JSON need escaping, it is strongly recommended to build the configuration in Java code (JSONObject#put, JSONArray#add) and then call the toJSONString method to generate the parameters.
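To see why the escaping matters, here is a dependency-free sketch of building one cleanup node. The dirPath, filePattern, and retentionTime values are made up, and esc is our own helper; fastjson's JSONObject/JSONArray (which the text recommends) would handle the same escaping automatically. Note the regex backslash in .*\.log must be doubled again inside the JSON string.

```java
public class CleanupParamDemo {

    /** Minimal JSON string escaping (backslashes and quotes). */
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Hypothetical rule: delete rolled-over .log files older than 72 hours
        // under /tmp/app-logs (both values are made-up examples).
        String dirPath = "/tmp/app-logs";
        String filePattern = ".*\\.log"; // Java regex; its backslash is doubled again in JSON
        int retentionTime = 72;

        String node = "{\"dirPath\":\"" + esc(dirPath)
                + "\",\"filePattern\":\"" + esc(filePattern)
                + "\",\"retentionTime\":" + retentionTime + "}";
        String taskParams = "[" + node + "]"; // the overall parameter is a JSON array
        System.out.println(taskParams);
    }
}
```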

5. SQL Processor

Currently there are two built-in SQL processors, both of which support custom SQL validation and parsing logic. The main difference between them is how they obtain the data source connection.

Task parameters (JSON)

  • dataSourceName: the name of the data source; only valid for SpringDatasourceSqlProcessor; optional, falls back to the data source registered as default
  • sql: the SQL statement to execute; required
  • timeout: SQL timeout in seconds; optional, default 60
  • jdbcUrl: the JDBC connection URL; only valid for DynamicDatasourceSqlProcessor; required
  • showResult: boolean, whether to show the SQL execution result in the instance log; optional, default false

For production environments it is recommended to use the AbstractSqlProcessor#registerSqlValidator method to register at least one SQL validator that intercepts illegal SQL, such as dangerous operations like truncate and drop, or to restrict the permissions of the database account. If you need custom SQL parsing logic, such as macro variable substitution or parameter substitution, you can achieve it by specifying an AbstractSqlProcessor.SqlParser.
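As a sketch of what such a validator might look like: this models the check as a Predicate<String> that rejects statements containing dangerous keywords. The keyword list is illustrative, not exhaustive, and the registration call in the comment is an assumption; check the actual registerSqlValidator signature in the powerjob-official-processors source before relying on it.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class SqlValidatorDemo {
    // Word-boundary, case-insensitive match on dangerous keywords.
    // Illustrative list only -- extend it for your environment.
    static final Pattern DANGEROUS =
            Pattern.compile("\\b(truncate|drop|grant)\\b", Pattern.CASE_INSENSITIVE);

    // A statement is "safe" when it contains none of the dangerous keywords.
    static final Predicate<String> SAFE_SQL = sql -> !DANGEROUS.matcher(sql).find();

    public static void main(String[] args) {
        System.out.println(SAFE_SQL.test("SELECT * FROM t"));  // passes the check
        System.out.println(SAFE_SQL.test("DROP TABLE t"));     // rejected
        // Hypothetical registration (verify the real signature first):
        // processor.registerSqlValidator("forbidDangerousSql", SAFE_SQL);
    }
}
```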

5.1 SpringDatasourceSqlProcessor

Fully qualified class name: tech.powerjob.official.processors.impl.sql.SpringDatasourceSqlProcessor

At least one data source must be injected during initialization, so this processor must be initialized manually, registered in the Spring IoC container in advance, and loaded as a Spring bean.

Multiple data sources can be registered via the SpringDatasourceSqlProcessor#registerDataSource method.

Suggestion: keep the database connection pool used by the SQL processor separate from the pools used by other business modules. Do not share a connection pool!

5.2 DynamicDatasourceSqlProcessor

By default this processor is disabled; to enable it, pass the JVM parameter -Dpowerjob.official-processor.dynamic-datasource.enable=true

Fully qualified class name: tech.powerjob.official.processors.impl.sql.DynamicDatasourceSqlProcessor

It supports dynamically specifying the data source connection through parameters, and executes SQL in the specified database.

6. Workflow context injection processor

Fully qualified class name: tech.powerjob.official.processors.impl.context.InjectWorkflowContextProcessor (since v1.2.0)

This processor loads data from the task parameters, tries to parse it into a Map, and if successful, injects it into the workflow context.

Note that the parameter must be a JSON string in HashMap<String, Object> form, otherwise parsing will fail.

Note: this processor is mainly used in workflow scenarios that need a fixed context injected; executing it as a standalone task is meaningless.
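A minimal sketch of producing a valid task parameter for this processor: a flat Map<String, Object> serialized to a JSON object string. The toJson helper and the context keys (bizDate, retryLimit) are our own illustrations; fastjson's JSONObject#toJSONString would produce the same shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WorkflowContextParam {

    /** Serialize a flat Map<String, Object> as a JSON object: strings are
     *  quoted and escaped, other values use their toString form. */
    static String toJson(Map<String, Object> ctx) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Object> e : ctx.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof String) {
                sb.append('"')
                  .append(((String) v).replace("\\", "\\\\").replace("\"", "\\\""))
                  .append('"');
            } else {
                sb.append(v); // numbers and booleans need no quoting
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // Hypothetical fixed context for downstream workflow nodes.
        Map<String, Object> ctx = new LinkedHashMap<>();
        ctx.put("bizDate", "2023-07-01");
        ctx.put("retryLimit", 3);
        System.out.println(toJson(ctx));
    }
}
```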


Origin: blog.csdn.net/zhanggqianglovec/article/details/131501272