[Workflow Task Scheduling System--Azkaban]

Azkaban is a batch workflow task scheduler developed by LinkedIn. It runs a set of jobs and processes in a specific order within a workflow. Azkaban uses job configuration files to establish dependencies between tasks, and it provides an easy-to-use web user interface for maintaining and tracking workflows.

 

Two other workflow task scheduling systems are in common use. The best known is probably Apache Oozie, but configuring a workflow in Oozie means writing a large amount of XML, and its code is relatively complex, which makes secondary development difficult. The other widely used scheduling system is Airflow, but it is developed in Python rather than Java.

 

 

Azkaban was chosen for the following reasons:

It provides a clear, easy-to-use web UI.

It provides job configuration files for quickly establishing dependencies between tasks.

It provides a modular, pluggable plug-in mechanism, with native support for command, Java, Hive, Pig, and Hadoop jobs.

It is developed in Java, and its clear code structure makes secondary development easy.
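To make the job configuration files more concrete, here is a minimal Python sketch that generates them. It assumes Azkaban's classic .job format, in which each job is a small key=value properties file with a type, a command, and an optional comma-separated dependencies entry, and in which the files are uploaded to the web UI as a zip archive. The job names, the echo commands, and the helper functions render_job and build_flow_zip are hypothetical and exist only for illustration.

```python
import io
import zipfile

# Hypothetical three-step flow: each key is a job name, and each value holds
# the shell command to run and the jobs that must finish first.
JOBS = {
    "extract": {"command": "echo extract", "dependencies": []},
    "transform": {"command": "echo transform", "dependencies": ["extract"]},
    "load": {"command": "echo load", "dependencies": ["transform"]},
}


def render_job(spec):
    """Render one job as the text of a .job file (key=value lines)."""
    lines = ["type=command", "command=" + spec["command"]]
    # The dependencies line is written only when the job has upstream jobs.
    if spec["dependencies"]:
        lines.append("dependencies=" + ",".join(spec["dependencies"]))
    return "\n".join(lines) + "\n"


def build_flow_zip(jobs):
    """Pack one <name>.job file per job into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, spec in jobs.items():
            zf.writestr(name + ".job", render_job(spec))
    return buf.getvalue()
```

Writing the bytes returned by build_flow_zip(JOBS) to a file gives an archive with one .job file per task. The dependencies lines alone describe the order extract, transform, load, so no separate workflow definition is needed in this sketch.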

 

 

Azkaban has two deployment modes: solo server mode and cluster server mode.

Solo server mode: webServer and executorServer run in the same process, named AzkabanSingleServer. You can use the built-in H2 database or configure a MySQL database. This mode is suitable for small-scale use.

Cluster server mode: webServer and executorServer run in separate processes, and a MySQL database is used. This mode is suitable for large-scale applications.

 

 

New features in Azkaban2:

1. Web UI

2. Simple workflow upload

3. Easier setup of job dependencies

4. Workflow scheduling

5. Permission settings

6. Killing and restarting workflows

7. Modularization and plug-ins

8. Logging and auditing of workflows and jobs
