One, Compiling Azkaban from source
1.1 Download and unzip
Azkaban no longer provides a ready-made installation package after version 3.0, so you need to download the source code and compile it yourself.
Download the source for the version you need. The Azkaban source code is hosted on GitHub at https://github.com/azkaban/azkaban . You can obtain it with git clone, or use wget to download the tar.gz file of the corresponding release directly. Here I use the second method:
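# Download (-O names the saved file so that it matches the tar command below)
wget -O azkaban-3.70.0.tar.gz https://github.com/azkaban/azkaban/archive/3.70.0.tar.gz
# Extract
tar -zxvf azkaban-3.70.0.tar.gz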
1.2 Preparing the build environment
1. JDK
Building Azkaban requires JDK 1.8+. For JDK installation, refer to the guide in this repository.
2. Gradle
Building Azkaban 3.70.0 depends on gradle-4.6-all.zip. Gradle is an open source build automation tool similar to Maven, but because it uses the Groovy language for project configuration it is more flexible than Maven. It is currently widely used for Android development and for building Spring projects.
Note that different Azkaban versions depend on different Gradle versions. You can check the required version in the /gradle/wrapper/gradle-wrapper.properties file of the extracted source.
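A quick way to read the required version is to print the file name at the end of the distributionUrl line. The sketch below is self-contained: it writes a sample properties file into a temporary directory (the sample content is an assumption, modelled on a typical wrapper file). In practice you would point PROPS at gradle/wrapper/gradle-wrapper.properties under the extracted Azkaban source instead.

```shell
# Sample wrapper file; its content is an assumption for illustration.
workdir=$(mktemp -d)
PROPS="$workdir/gradle-wrapper.properties"
cat > "$PROPS" <<'EOF'
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-4.6-all.zip
EOF

# Keep only the file name after the last slash of the distributionUrl value.
gradle_zip=$(grep '^distributionUrl=' "$PROPS" | sed 's#.*/##')
echo "$gradle_zip"
```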
At compile time the build automatically downloads Gradle from the distributionUrl address in that file, but the download is very slow. To avoid holding up the build, it is recommended to download the file manually into the /gradle/wrapper/ directory:
# wget https://services.gradle.org/distributions/gradle-4.6-all.zip
Then modify the distributionUrl property in the gradle-wrapper.properties configuration file so that the local Gradle distribution is used.
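As a sketch of that change, the snippet below rewrites distributionUrl to the bare file name. This relies on the assumption that the Gradle wrapper resolves a value without a URL scheme relative to the directory that holds gradle-wrapper.properties, which is where the zip was downloaded. The temporary directory and sample file are illustrative only; against a real checkout, run the sed and mv lines on gradle/wrapper/gradle-wrapper.properties.

```shell
# Sample wrapper file in a temp directory (illustrative content).
workdir=$(mktemp -d)
PROPS="$workdir/gradle-wrapper.properties"
printf '%s\n' \
  'distributionBase=GRADLE_USER_HOME' \
  'distributionUrl=https\://services.gradle.org/distributions/gradle-4.6-all.zip' \
  > "$PROPS"

# Replace the whole distributionUrl line; write to a temp file, then move it into place.
sed 's#^distributionUrl=.*#distributionUrl=gradle-4.6-all.zip#' "$PROPS" > "$PROPS.tmp"
mv "$PROPS.tmp" "$PROPS"

# Show the resulting line.
grep '^distributionUrl=' "$PROPS"
```

Writing to a temporary file and moving it avoids depending on sed's in-place flag, which differs between GNU and BSD sed.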
3. Git
The Azkaban build uses Git to download some of the JAR packages, so Git must be installed in advance:
# yum install git
1.3 Project compilation
Run the build command in the root directory of the source tree. When the build succeeds, BUILD SUCCESSFUL is printed:
# ./gradlew build installDist -x test
Pay attention to the following points during the build:
- The build needs to download a large number of JAR packages, and the download speed depends on your network. It is usually not fast; on a poor connection, half an hour to an hour is normal.
- If a JAR cannot be downloaded because of a network problem, the build may be forcibly terminated. In that case simply rerun the build command. Gradle caches the JARs it has already downloaded locally, so they are not downloaded again.
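If the network is flaky, rerunning the command by hand can be automated with a small retry helper. This is only a sketch: retry is a hypothetical function defined here, not part of Gradle or Azkaban, and the attempt count is arbitrary.

```shell
# Run a command up to max_attempts times, stopping at the first success.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Intended use, from the root directory of the Azkaban source:
#   retry 3 ./gradlew build installDist -x test
```

Because Gradle keeps the JARs it has already fetched, each new attempt only has to download what is still missing.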
Two, Azkaban deployment modes
After version 3.0, we provide two modes: the stand alone “solo-server” mode and distributed multiple-executor mode. The following describes the differences between the two modes.
According to the official documentation, Azkaban 3.x and later provide two operating modes:
- solo server mode (single-service mode): metadata is stored in the built-in H2 database by default (this can be changed to MySQL). In this mode webServer (the management server) and executorServer (the execution server) run in the same process, named AzkabanSingleServer. This mode is suitable for scheduling small-scale workflows.
- multiple-executor mode (distributed multi-service mode): metadata is stored in MySQL, which should be set up in master-slave mode for backup and fault tolerance. In this mode webServer and executorServer run in separate processes and do not affect each other, which makes it suitable for production environments.
The following describes deployment in Solo Server mode.
Three, Solo Server mode deployment
2.1 Unzip
After compilation, the Solo Server installation package is located in the /azkaban-solo-server/build/distributions directory. Find it there and extract it:
# Extract
tar -zxvf azkaban-solo-server-3.70.0.tar.gz
2.2 Modify the time zone
This step is optional. However, Azkaban uses America/Los_Angeles as its default time zone, so if you have scheduled (timed) tasks you need to change it accordingly. Here I changed it to the commonly used Asia/Shanghai.
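A minimal sketch of the change, assuming the time zone is controlled by the default.timezone.id property in conf/azkaban.properties (please verify the property name against your own file before relying on it). The temporary directory and sample file below are illustrative only; on a real installation, run the sed and mv lines against conf/azkaban.properties.

```shell
# Sample config in a temp directory; the content is an assumption for illustration.
workdir=$(mktemp -d)
CONF="$workdir/azkaban.properties"
printf '%s\n' \
  'azkaban.name=Test' \
  'default.timezone.id=America/Los_Angeles' \
  > "$CONF"

# Rewrite the time zone line, leaving every other property untouched.
sed 's#^default\.timezone\.id=.*#default.timezone.id=Asia/Shanghai#' "$CONF" > "$CONF.tmp"
mv "$CONF.tmp" "$CONF"

# Show the resulting line.
grep '^default\.timezone\.id=' "$CONF"
```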
2.3 Start
Run the start command. Note that it must be executed from the root directory of the installation. You cannot enter the bin directory and run it from there, otherwise a Cannot find 'database.properties' exception is thrown.
# bin/start-solo.sh
2.4 Verification
Method one: use the jps command to check whether the AzkabanSingleServer process exists.
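The same check can be scripted. In the sketch below, solo_server_running is a hypothetical helper that reads a process listing on stdin, and the sample listing with its PIDs is made up for the demonstration.

```shell
# Succeeds if the process listing on stdin contains AzkabanSingleServer.
solo_server_running() {
  grep -q 'AzkabanSingleServer'
}

# Intended use on the Azkaban host (jps ships with the JDK):
#   jps | solo_server_running && echo "solo server is up"

# Demonstration with a made-up listing.
printf '%s\n' '2117 Jps' '1984 AzkabanSingleServer' | solo_server_running \
  && echo "solo server is up"
```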
Method two: visit port 8081 to view the Web UI. The default username and password are both azkaban. If you need to modify or add users, edit the conf/azkaban-users.xml file.
More articles in the big data series can be found in the GitHub open source project: Big Data Getting Started