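1. Add Spark2 to the Oozie shared library
1. Check the current HDFS directory of the Oozie share-lib (the -sharelibupdate output reports it as sharelibDirNew)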
oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate
[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
host = http://lefincluster-rt1:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
status = Successful
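The directory name (lib_20180605143536 here) differs per cluster, so the later steps are easier to repeat if it is read from the command output instead of being typed by hand. A minimal sketch, assuming the output format shown above; the function names sharelib_dir_new and strip_hdfs_authority are hypothetical helpers, not part of the oozie CLI:

```shell
# Print the sharelibDirNew value from `oozie admin ... -sharelibupdate` output read on stdin.
sharelib_dir_new() {
    sed -n 's/^[[:space:]]*sharelibDirNew[[:space:]]*=[[:space:]]*//p' | head -n 1
}

# Drop the hdfs://<nameservice> prefix so the result is a plain HDFS path.
strip_hdfs_authority() {
    sed 's|^hdfs://[^/]*||'
}

# On the cluster (needs the oozie CLI; URL taken from the step above):
# oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate \
#     | sharelib_dir_new | strip_hdfs_authority
```

The result can be stored in a shell variable and reused in the mkdir, put and chmod commands of the following steps.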
2. Create a spark2 directory under the Oozie sharelib directory /user/oozie/share/lib/lib_20180605143536
sudo -u oozie hdfs dfs -mkdir /user/oozie/share/lib/lib_20180605143536/spark2
3. Add the Spark2 jars and oozie-sharelib-spark*.jar to the spark2 directory
[root@lefincluster-rt1 jars]# pwd
/opt/cloudera/parcels/SPARK2/lib/spark2/jars
sudo -u oozie hdfs dfs -put *.jar /user/oozie/share/lib/lib_20180605143536/spark2
[root@lefincluster-rt1 spark]# pwd
/opt/cloudera/parcels/CDH/lib/oozie/oozie-sharelib-yarn/lib/spark
sudo -u oozie hdfs dfs -put oozie-sharelib-spark*.jar /user/oozie/share/lib/lib_20180605143536/spark2
4. Change the directory permissions
sudo -u hdfs hdfs dfs -chmod -R 775 /user/oozie/share/lib/lib_20180605143536/spark2
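Steps 2 to 4 can be gathered into one function so they can be previewed before touching HDFS. This is only a sketch: run, add_spark2_sharelib, DRY_RUN, SPARK2_JARS and OOZIE_SPARK_LIB are hypothetical names introduced here, and the default paths are the parcel locations shown above.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 2-4 for one sharelib directory, e.g. /user/oozie/share/lib/lib_20180605143536.
add_spark2_sharelib() {
    sharelib_dir=$1
    spark2_jars=${SPARK2_JARS:-/opt/cloudera/parcels/SPARK2/lib/spark2/jars}
    oozie_spark_lib=${OOZIE_SPARK_LIB:-/opt/cloudera/parcels/CDH/lib/oozie/oozie-sharelib-yarn/lib/spark}
    target="$sharelib_dir/spark2"

    # Step 2: create the spark2 directory as the oozie user.
    run sudo -u oozie hdfs dfs -mkdir "$target" || return 1
    # Step 3: upload the Spark2 jars and the Oozie Spark sharelib jar.
    run sudo -u oozie hdfs dfs -put "$spark2_jars"/*.jar "$target" || return 1
    run sudo -u oozie hdfs dfs -put "$oozie_spark_lib"/oozie-sharelib-spark*.jar "$target" || return 1
    # Step 4: open up the permissions recursively.
    run sudo -u hdfs hdfs dfs -chmod -R 775 "$target"
}

# Preview only:
# DRY_RUN=1; add_spark2_sharelib /user/oozie/share/lib/lib_20180605143536
```

With DRY_RUN unset, the same call executes the four commands and stops at the first one that fails.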
5. Update the Oozie share-lib
[root@lefincluster-rt1 spark]# oozie admin -oozie http://lefincluster-rt1:11000/oozie -sharelibupdate
[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
host = http://lefincluster-rt1:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20180605143536
status = Successful
6. Confirm that spark2 has been added to the shared library
[root@lefincluster-rt1 spark]# oozie admin -oozie http://lefincluster-rt1:11000/oozie -shareliblist
[Available ShareLib]
hive
spark2
distcp
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig
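The check in step 6 can also be scripted rather than read by eye. A small sketch, assuming the one-name-per-line listing shown above; has_sharelib is a hypothetical helper name:

```shell
# Succeed if the given sharelib name appears as its own line in
# `oozie admin ... -shareliblist` output read on stdin.
has_sharelib() {
    name=$1
    # Trim surrounding whitespace, then require an exact whole-line, fixed-string match
    # so that "spark" does not match "spark2".
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -Fxq -- "$name"
}

# On the cluster (needs the oozie CLI):
# oozie admin -oozie http://lefincluster-rt1:11000/oozie -shareliblist | has_sharelib spark2 \
#     && echo "spark2 sharelib is available"
```

The whole-line match matters here because both spark and spark2 are listed.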
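2. Create an Oozie workflow for Spark2
1. Log in to Hue and create an Oozie workflow
2. Open the Workspace
Click lib
From the command line, upload the example jar bundled with Spark2 to the /user/hue/oozie/workspaces/hue-oozie-1528256085.53/lib directory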
[root@lefincluster-rt1 jars]# pwd
/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars
sudo -u hdfs hdfs dfs -put spark-examples_2.11-2.1.0.cloudera2.jar /user/hue/oozie/workspaces/hue-oozie-1528256085.53/lib
3. Add a Spark2 task
Configure the task to use Spark2; otherwise Spark1 is used by default
Finish the configuration and click Save
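For the Spark2 setting in step 3, Oozie's usual mechanism for pointing a Spark action at a different sharelib is the property oozie.action.sharelib.for.spark. It is assumed here that this is the property set in Hue, with the value spark2 matching the directory created earlier. When a workflow is submitted from the command line instead, the same key can go into job.properties. A sketch; set_spark_sharelib and the file name are hypothetical:

```shell
# Set oozie.action.sharelib.for.spark in a properties file (default value: spark2).
# Assumption: the file is the job.properties used to submit the workflow.
set_spark_sharelib() {
    props_file=$1
    sharelib=${2:-spark2}
    key=oozie.action.sharelib.for.spark
    touch "$props_file"
    if grep -q "^${key}=" "$props_file"; then
        # Key already present: rewrite its value via a temp file (portable, no sed -i).
        sed "s|^${key}=.*|${key}=${sharelib}|" "$props_file" > "$props_file.tmp" \
            && mv "$props_file.tmp" "$props_file"
    else
        # Key absent: append it.
        echo "${key}=${sharelib}" >> "$props_file"
    fi
}

# Example:
# set_spark_sharelib job.properties
```

Calling it twice leaves a single entry, so it is safe to rerun.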
4. After saving, click Run to test whether the workflow works
The run succeeds