azkaban:
========================
    Workflow scheduler

    crontab: scheduled tasks built into Linux
    azkaban: lightweight workflow scheduler            LinkedIn
    oozie:   complex, heavyweight task scheduler       Apache

    Local log ===> MR program (data cleaning) ===> load hive ===> hql ===> hdfs
              job                             job             job      job

    ETL: Extract, Transform, Load


Using a scheduling tool:
=======================
    1. Write the job locally
        oozie:   xml
        azkaban: text, k=v

    2. Package the job and submit it to the scheduling executor through the web page

    3. The scheduling executor runs the scheduled task


azkaban installation:
========================
    Phase 1: Initialize azkaban
        1. Create the /soft/azkaban folder

        2. Extract azkaban-web-server-3.46.0.tar.gz and azkaban-exec-server-3.46.0.tar.gz to /soft/azkaban
            tar -xzvf azkaban-web-server-3.46.0.tar.gz -C /soft/azkaban
            tar -xzvf azkaban-exec-server-3.46.0.tar.gz -C /soft/azkaban

        3. Put create-all-sql-3.46.0.sql under /soft/azkaban

        4. Log in to mysql and create a database
            mysql -uroot -p
            mysql> create database azkaban;

        5. Run the sql script with the source command (mysql command line)
            mysql> use azkaban;
            mysql> source /soft/azkaban/create-all-sql-3.46.0.sql

        azkaban initialization is now complete

    Phase 2: Create the SSL configuration
        1. Generate the SSL key
            keytool -keystore keystore -alias jetty -genkey -keyalg RSA

            Enter keystore password:                        //hadoop
            Re-enter new password:                          //hadoop
            What is your first and last name?
              [Unknown]:                                    //press Enter
            What is the name of your organizational unit?
              [Unknown]:                                    //press Enter
            What is the name of your organization?
              [Unknown]:                                    //press Enter
            What is the name of your City or Locality?
              [Unknown]:                                    //press Enter
            What is the name of your State or Province?
              [Unknown]:                                    //press Enter
            What is the two-letter country code for this unit?
              [Unknown]:                                    //CN
            Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=CN correct?
              [no]:                                         //yes
            Enter key password for <jetty>
                    (RETURN if same as keystore password):  //press Enter

        2.
           Copy the keystore file generated in the current directory to /soft/azkaban/azkaban-web-server-3.46.0
            cp keystore /soft/azkaban/azkaban-web-server-3.46.0

        The SSL configuration is now complete

    Phase 3: Configuration files
        1. Modify the azkaban-web configuration file /soft/azkaban/azkaban-web-server-3.46.0/conf/azkaban.properties
            # Default time zone, change to Asia/Shanghai
            default.timezone.id=Asia/Shanghai

            # Modify the database information
            database.type=mysql
            mysql.port=3306
            mysql.host=s101
            mysql.database=azkaban
            mysql.user=root
            mysql.password=root
            mysql.numconnections=100

        2. Modify the azkaban-exec configuration file /soft/azkaban/azkaban-exec-server-3.46.0/conf/azkaban.properties
            # Default time zone, change to Asia/Shanghai
            default.timezone.id=Asia/Shanghai

            # Modify the database information
            database.type=mysql
            mysql.port=3306
            mysql.host=s101
            mysql.database=azkaban
            mysql.user=root
            mysql.password=root
            mysql.numconnections=100

        3. Modify /soft/azkaban/azkaban-web-server-3.46.0/conf/azkaban-users.xml under azkaban-web
            <azkaban-users>
              <user groups="azkaban" password="azkaban" roles="admin" username="azkaban"/>
              <user password="metrics" roles="metrics" username="metrics"/>
              <!-- added -->
              <user username="admin" password="admin" roles="admin,metrics"/>

              <role name="admin" permissions="ADMIN"/>
              <role name="metrics" permissions="METRICS"/>
            </azkaban-users>


Start azkaban:
=========================
    1. Start azkaban's web service
        /soft/azkaban/azkaban-web-server-3.46.0>$ bin/start-web.sh

    2. Start azkaban's exec service
        /soft/azkaban/azkaban-exec-server-3.46.0>$ bin/start-exec.sh

    3.
       Wrap the commands above in scripts
        /usr/local/bin/azweb.sh : add execute permission
            #!/bin/bash
            # Usage: azweb.sh start|stop
            if [ $# -ne 1 ] ; then echo "param must be 1" ; exit 1 ; fi
            cmd=$1
            cd /soft/azkaban/azkaban-web-server-3.46.0 || exit 1
            case $cmd in
                start ) bin/start-web.sh ;;
                stop )  bin/shutdown-web.sh ;;
                * )     echo "illegal argument" ; exit 1 ;;
            esac
            echo "================ $cmd azweb ================"

        /usr/local/bin/azexec.sh : add execute permission
            #!/bin/bash
            # Usage: azexec.sh start|stop
            if [ $# -ne 1 ] ; then echo "param must be 1" ; exit 1 ; fi
            cmd=$1
            cd /soft/azkaban/azkaban-exec-server-3.46.0 || exit 1
            case $cmd in
                start ) bin/start-exec.sh ;;
                stop )  bin/shutdown-exec.sh ;;
                * )     echo "illegal argument" ; exit 1 ;;
            esac
            echo "================ $cmd azexec ================"

    4. Start web and exec
        azweb.sh start
        azexec.sh start

    5. Open the azkaban web interface
        s101:8081

    6. Enter the username and password
        admin
        admin

    Problem 1: The azkaban exec process does not start
        Reason: the conf/global.properties file does not exist
        Solution: /soft/azkaban/azkaban-exec-server-3.46.0>$ touch conf/global.properties

    Problem 2: The azkaban web process stops automatically
        Reason: conf/azkaban.properties is misconfigured
        Solution: remove any trailing spaces or special characters after the values in the configuration file

    Problem 3: The azkaban web process cannot be started
        Reason: the database is not initialized
        Solution: log in to mysql and enter:
            mysql> use azkaban;
            mysql> source /soft/azkaban/create-all-sql-3.46.0.sql

    Problem 4: When azkaban executes a job, it prompts: Request memory, try again
        Reason: at execution time azkaban checks whether the host memory exceeds 3G; if it does not, the prompt above appears
        Solution: add the following configuration to
            /soft/azkaban/azkaban-exec-server-3.46.0/plugins/jobtypes/commonprivate.properties
            memCheck.enabled=false


Job creation and use of azkaban:
=============================
    1. Create a single job: command1.job
        # command1.job
        type=command
        command=echo helloworld

    2.
       Package the job
        Compress command1.job into a zip archive

    3. Create a project (web)
        Click create project on the web interface and enter
        name:        the name of the job
        description: information about the job (must not be blank, must not contain Chinese characters)

    4. Upload the job
        Click upload on the project page, select the job's zip package and upload it

    5. Execute the job
        Click execute flow to run the job

    hive
        -e    hive -e "create table xxx(id int, name string)"
        -f    hive -f create.hql
              create.hql ===> create table xxx(id int, name string)
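
The job-creation steps and the hive -f note above can be combined into a small two-job flow. The following is only a sketch, not part of the original setup: the file name hive.job, the archive name demo.zip and the temporary working directory are assumptions made for illustration. It relies on the `dependencies` key of azkaban job files so that the hive step runs after command1, and it assumes the `zip` tool is installed (if not, the files are simply left in place for manual packaging).

```shell
#!/bin/bash
# Sketch: generate command1.job, a hypothetical hive.job and create.hql,
# then package them into one zip archive for upload.
set -e

workdir=$(mktemp -d)        # assumption: any empty directory works
cd "$workdir"

# The single job from step 1
cat > command1.job <<'EOF'
# command1.job
type=command
command=echo helloworld
EOF

# Hypothetical second job: runs the HQL script once command1 has finished
cat > hive.job <<'EOF'
# hive.job
type=command
dependencies=command1
command=hive -f create.hql
EOF

# The script referenced by hive -f
cat > create.hql <<'EOF'
create table xxx(id int, name string);
EOF

# Step 2: package everything into a zip archive
if command -v zip >/dev/null 2>&1; then
    zip -q demo.zip command1.job hive.job create.hql
    echo "created $workdir/demo.zip"
else
    echo "zip not found; package the files in $workdir manually"
fi
```

Uploading demo.zip as in step 4 and clicking execute flow should run command1 first and then hive.job, provided the hive command is available on the host that runs the exec server.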