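I ran into countless pitfalls while installing Oozie. The official site only offers a source package, so you have to build it yourself, and the build only succeeds when the versions of several components match; otherwise you get all kinds of compile errors, as anyone who has built it knows. If you want to use Oozie, the fastest route is to install a CDH release and then the matching Oozie version. Even though that means redeploying the Hadoop cluster, it is still much faster than compiling Oozie on its own. Links to the installation packages involved are here. First, deploy CDH.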
I. Install the Hadoop CDH environment
1. Upload the installation package
put c:/hadoop-2.5.0-cdh5.3.6.tar.gz
2. Extract
tar -xzvf hadoop-2.5.0-cdh5.3.6.tar.gz -C /home/hadoop/cdh/
3. Edit the configuration files
cd /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop
3.1 hadoop-env.sh
Add the JDK path
export JAVA_HOME=/usr/local/jdk1.8.0_73
3.2 mapred-env.sh
Add the JDK path
export JAVA_HOME=/usr/local/jdk1.8.0_73
3.3 yarn-env.sh
Add the JDK path
export JAVA_HOME=/usr/local/jdk1.8.0_73
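The same export line goes into all three scripts, so the edit can be scripted. The following is a minimal sketch rather than part of the original steps: `set_java_home` is a hypothetical helper name, and it assumes that appending the line at the end of each file is acceptable (a later export overrides an earlier one).

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the JAVA_HOME export to each env script,
# skipping files that already contain exactly that line (safe to re-run).
set_java_home() {
  local conf_dir="$1" java_home="$2" f
  for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    grep -qxF "export JAVA_HOME=${java_home}" "${conf_dir}/${f}" 2>/dev/null \
      || echo "export JAVA_HOME=${java_home}" >> "${conf_dir}/${f}"
  done
}

# Example call with the paths used in this article:
# set_java_home /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop /usr/local/jdk1.8.0_73
```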
3.4 core-site.xml
<configuration>
<!-- HDFS entry point. The NameNode of my earlier cluster used port 9000, so this one uses 8020 to keep the two apart -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:8020/</value>
</property>
<!-- Hadoop data directory. This must be a new directory, not the one used by the earlier cluster -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp</value>
</property>
<!-- Hosts from which the hadoop user (the account that runs the Oozie server) may impersonate other users -->
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<!-- Groups whose users may be impersonated by Oozie -->
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
</configuration>
3.5 hdfs-site.xml
<configuration>
<!-- Replication factor -->
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<!-- HTTP address of the SecondaryNameNode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop02:50090</value>
</property>
</configuration>
3.6 mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- Address and port of the MapReduce JobHistory server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop01:10020</value>
</property>
<!-- Web UI address of the MapReduce JobHistory server -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop01:19888</value>
</property>
</configuration>
3.7 yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop03</value>
</property>
<!-- Auxiliary service required to run MapReduce jobs -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation on the YARN cluster -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Maximum retention time for aggregated logs, in seconds -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<!-- URL of the JobHistory server's log pages -->
<property>
<name>yarn.log.server.url</name>
<value>http://hadoop01:19888/jobhistory/logs/</value>
</property>
</configuration>
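The retention value 604800 is simply seven days expressed in seconds. If you want a different retention period, a small sketch like the one below does the arithmetic; `days_to_seconds` is a hypothetical helper name, not something shipped with Hadoop.

```shell
#!/usr/bin/env bash
# Convert a retention period in days to the seconds value expected by
# yarn.log-aggregation.retain-seconds.
days_to_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 7   # prints 604800
```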
3.8 slaves
hadoop01
hadoop02
hadoop03
4. Copy the configured installation to the other nodes
scp -r /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6 hadoop01:/home/hadoop/cdh/
scp -r /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6 hadoop02:/home/hadoop/cdh/
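With more nodes, repeating the scp line by hand gets tedious. The two commands above can be sketched as a loop, under a few assumptions: `distribute_hadoop` and the `DRY_RUN` switch are hypothetical names introduced here, and passwordless SSH between the nodes is assumed to be set up already.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a directory to the same parent path on each host.
# With DRY_RUN=1 it only prints the scp commands it would run.
distribute_hadoop() {
  local src="$1" dest_dir="$2" host
  shift 2
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp -r ${src} ${host}:${dest_dir}"
    else
      scp -r "${src}" "${host}:${dest_dir}" || return 1
    fi
  done
}

# Example call with the hosts and paths used in this article:
# distribute_hadoop /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6 /home/hadoop/cdh/ hadoop01 hadoop02
```

Running it once with `DRY_RUN=1` first is a cheap way to confirm the target hosts and paths before anything is copied.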
5. Format the cluster and start it
Always use the full path here; otherwise the commands will conflict with the earlier cluster
/home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs namenode -format
Seeing "successful" in the output means the format succeeded!
Start HDFS
[hadoop@hadoop01 hadoop]$ /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/sbin/start-dfs.sh
Start the JobHistory server
[hadoop@hadoop01 hadoop]$ /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/sbin/mr-jobhistory-daemon.sh start historyserver
Start YARN
[hadoop@hadoop03 hadoop]$ /home/hadoop/cdh/hadoop-2.5.0-cdh5.3.6/sbin/start-yarn.sh
After startup, check that the expected processes are running on each node.
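Judging from the configuration above (an inference, not the author's own process listing), hadoop01 should run NameNode, DataNode, NodeManager and JobHistoryServer; hadoop02 should run SecondaryNameNode, DataNode and NodeManager; hadoop03 should run ResourceManager, DataNode and NodeManager. A minimal sketch for checking this against `jps` output follows; `missing_daemons` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read jps-style output ("<pid> <MainClass>") on stdin
# and print every expected daemon name that does not appear in it.
missing_daemons() {
  local jps_out d
  jps_out=$(cat)
  for d in "$@"; do
    # Compare the second column exactly, so NameNode does not match SecondaryNameNode
    echo "$jps_out" | awk -v d="$d" '$2 == d {found=1} END {exit !found}' || echo "$d"
  done
}

# Example calls, one per node:
# jps | missing_daemons NameNode DataNode NodeManager JobHistoryServer   # on hadoop01
# jps | missing_daemons SecondaryNameNode DataNode NodeManager           # on hadoop02
# jps | missing_daemons ResourceManager DataNode NodeManager             # on hadoop03
```

Empty output means every daemon in the list was found.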
II. Deploy Oozie
1. Upload the installation packages
put c:/oozie-4.0.0-cdh5.3.6.tar.gz
put c:/ext-2.2.zip
2. Extract
The zip file does not need to be extracted
tar -xzvf oozie-4.0.0-cdh5.3.6.tar.gz -C /home/hadoop/cdh/
Then go into the Oozie directory and extract hadooplibs
cd /home/hadoop/cdh/oozie-4.0.0-cdh5.3.6
Extract it directly into the parent directory
tar -zxvf oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz -C ../
3. Gather the jar files
Under oozie-4.0.0-cdh5.3.6, create a directory named libext (this name cannot be changed)
mkdir libext
Copy the hadooplibs extracted just now
cp -ra hadooplibs/hadooplib-2.5.0-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* libext/
Copy the JDBC driver jar as well; I took it straight from Hive's lib directory
cp $HIVE_HOME/lib/mysql-connector-java-5.1.40-bin.jar ./libext/
Finally, copy the JavaScript framework package (ExtJS) used by the Oozie web UI
cp -a /home/hadoop/cdh/ext-2.2.zip libext/
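Before moving on, it can help to confirm that libext really contains the three kinds of files copied above. The check below is a rough sketch: `check_libext` is a hypothetical helper name, and the `hadoop-*.jar` pattern is an assumption about how the jars in hadooplibs are named.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the expected files are missing from libext.
# Returns 0 when everything is present, 1 otherwise.
check_libext() {
  local dir="$1" status=0
  [ -f "${dir}/ext-2.2.zip" ] \
    || { echo "missing: ext-2.2.zip"; status=1; }
  ls "${dir}"/mysql-connector-java-*.jar >/dev/null 2>&1 \
    || { echo "missing: MySQL JDBC driver jar"; status=1; }
  # Assumption: the Hadoop client jars from hadooplibs are named hadoop-*.jar
  ls "${dir}"/hadoop-*.jar >/dev/null 2>&1 \
    || { echo "missing: Hadoop client jars"; status=1; }
  return "$status"
}

# Example call from inside oozie-4.0.0-cdh5.3.6:
# check_libext libext
```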
4. Edit the Oozie configuration file
[hadoop@hadoop03 conf]$ vi oozie-site.xml
Change four properties in oozie-site.xml
4.1 JDBC driver class
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
<description>
JDBC driver class.
</description>
</property>
4.2 JDBC URL
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://hadoop03:3306/oozie</value>
<description>
JDBC URL.
</description>
</property>
4.3 Database user name
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>root</value>
<description>
DB user name.
</description>
</property>
4.4 Database password
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>123456</value>
<description>
DB user password.
IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
if empty Configuration assumes it is NULL.
</description>
</property>
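A typo in any of these four values only shows up later, when the database tools fail to connect, so it may be worth reading the values back out of the file. The following is a minimal sketch: `get_prop` is a hypothetical helper name, and it assumes each `<name>` appears before its `<value>`, with every value on a single line as in the snippets above. It is a plain text match, not a real XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style XML configuration file.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { hit = 1 }
    hit && match($0, /<value>.*<\/value>/) {
      # Strip the 7-character opening tag and the 8-character closing tag
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }' "$1"
}

# Example call from the Oozie conf directory:
# get_prop oozie-site.xml oozie.service.JPAService.jdbc.url
```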
5. Create the database in MySQL
mysql -uroot -p123456
create database oozie;
6. Initialize Oozie
Upload the yarn sharelib archive (oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz) from the Oozie directory to HDFS
[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh sharelib create -fs hdfs://hadoop01:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
Create the oozie.sql file and run it against the database
[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/ooziedb.sh create -sqlfile oozie.sql -run
Package the project to generate the WAR file
[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh prepare-war
7. Start
[hadoop@hadoop03 oozie-4.0.0-cdh5.3.6]$ bin/oozied.sh start
The Oozie process should now be visible
Open: http://hadoop03:11000/oozie/
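Besides opening the web UI, the server can be checked from the command line: `oozie admin -oozie <url> -status` prints `System mode: NORMAL` when the server is up and accepting work. The wrapper below is a small sketch around that command; `oozie_is_normal` and the `OOZIE_BIN` variable are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the Oozie server at the given URL
# reports "System mode: NORMAL". OOZIE_BIN defaults to bin/oozie, so run it
# from inside the Oozie installation directory or set OOZIE_BIN yourself.
oozie_is_normal() {
  "${OOZIE_BIN:-bin/oozie}" admin -oozie "$1" -status 2>/dev/null \
    | grep -q 'System mode: NORMAL'
}

# Example call with the URL used in this article:
# oozie_is_normal http://hadoop03:11000/oozie && echo "Oozie is up"
```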