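Hadoop Ecosystem: Upgrading the Default Spark Version on CDH 5.15.1
Author: Yin Zhengjie (尹正杰)
Copyright notice: this is an original work. Reproduction is not permitted, and violations will be pursued legally.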
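In my CDH 5.11 cluster, the Spark installed by default is version 1.6. The developers complained to me that the previous big-data platform (hosted on UCloud, a cloud service) also ran Spark 1.6, that many Java APIs were unavailable, and that many advanced features could not be used on 1.6. We therefore had to upgrade Spark, and they asked for version 2.3.0 or later. After reading the relevant documentation and some helpful community posts, I put together these notes on deploying Spark 2.3.0. The official guide is also available: https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
If you have already deployed Kafka through CDH, upgrading Spark should be easy, because the two follow essentially the same procedure. If you are on the free edition of CDH, however, I do not recommend integrating Kafka through CDH, because some rather odd pitfalls are waiting for you there.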
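I. Download the Spark 2.3 CSD jar
As with integrating Kafka into CDH, installing a new Spark version requires downloading the matching CSD jar. Download location: http://archive.cloudera.com/spark2/csd/
1>. Choose the CSD version
2>. Install the download tool (wget)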
[root@node101 ~]# yum -y install wget
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
10gen                                        | 2.5 kB  00:00:00
base                                         | 3.6 kB  00:00:00
centosplus                                   | 3.4 kB  00:00:00
epel                                         | 3.2 kB  00:00:00
extras                                       | 3.4 kB  00:00:00
mysql-connectors-community                   | 2.5 kB  00:00:00
mysql-tools-community                        | 2.5 kB  00:00:00
mysql56-community                            | 2.5 kB  00:00:00
updates                                      | 3.4 kB  00:00:00
(1/3): epel/x86_64/updateinfo                | 933 kB  00:00:00
(2/3): epel/x86_64/primary                   | 3.6 MB  00:00:01
(3/3): updates/7/x86_64/primary_db           | 6.0 MB  00:00:01
epel                                                12756/12756
Resolving Dependencies
--> Running transaction check
---> Package wget.x86_64 0:1.14-15.el7_4.1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch           Version                  Repository         Size
================================================================================
Installing:
 wget           x86_64         1.14-15.el7_4.1          base              547 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 547 k
Installed size: 2.0 M
Downloading packages:
wget-1.14-15.el7_4.1.x86_64.rpm              | 547 kB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : wget-1.14-15.el7_4.1.x86_64                                  1/1
  Verifying  : wget-1.14-15.el7_4.1.x86_64                                  1/1

Installed:
  wget.x86_64 0:1.14-15.el7_4.1

Complete!
[root@node101 ~]#
3>. Download the CSD jar
[root@node101 ~]# mkdir /opt/cloudera/csd && cd /opt/cloudera/csd
[root@node101 csd]#
[root@node101 csd]# wget http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
--2018-10-31 00:17:57--  http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.3.0.cloudera4.jar
Connecting to 10.9.137.250:3888... connected.
Proxy request sent, awaiting response... 200 OK
Length: 19037 (19K) [application/java-archive]
Saving to: ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’

100%[======================================>] 19,037      --.-K/s   in 0.002s

2018-10-31 00:17:57 (10.4 MB/s) - ‘SPARK2_ON_YARN-2.3.0.cloudera4.jar’ saved [19037/19037]

[root@node101 csd]#
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]#
4>. Change ownership so that the jar belongs to the cloudera-scm user
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 root root 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]#
[root@node101 csd]#
[root@node101 csd]# id cloudera-scm
uid=997(cloudera-scm) gid=995(cloudera-scm) groups=995(cloudera-scm)
[root@node101 csd]#
[root@node101 csd]#
[root@node101 csd]# chown cloudera-scm:cloudera-scm SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]#
[root@node101 csd]# ll
total 20
-rw-r--r--. 1 cloudera-scm cloudera-scm 19037 Oct  5 05:45 SPARK2_ON_YARN-2.3.0.cloudera4.jar
[root@node101 csd]#
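The directory, copy, and ownership steps above can be wrapped in one small repeatable helper. The following is only a sketch under stated assumptions: `stage_csd` is a hypothetical function name (it is not part of Cloudera Manager), the default directory and owner are simply the ones used in the transcript, and the 644 mode matches what the transcript shows. It uses `mkdir -p` so that a rerun does not fail when /opt/cloudera/csd already exists.

```shell
# stage_csd JAR [CSD_DIR] [OWNER]
# Hypothetical helper (an assumption, not a Cloudera tool): copies a CSD jar
# into the CSD directory, sets mode 644, and hands it to OWNER when possible.
stage_csd() (
    jar="$1"                              # path to the downloaded CSD jar
    csd_dir="${2:-/opt/cloudera/csd}"     # directory used in the transcript
    owner="${3:-cloudera-scm}"            # account used in the transcript
    target="$csd_dir/$(basename "$jar")"

    mkdir -p "$csd_dir" || exit 1         # succeeds even if the directory exists
    cp "$jar" "$target" || exit 1
    chmod 644 "$target" || exit 1

    # Change ownership only when running as root and the account exists.
    if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
        chown "$owner:$owner" "$target" || exit 1
    fi

    printf '%s\n' "$target"               # report where the jar ended up
)
```

A call such as `stage_csd ~/SPARK2_ON_YARN-2.3.0.cloudera4.jar` would then stand in for the manual mkdir and chown steps, and `ll /opt/cloudera/csd` can still be used to confirm the result.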
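II. Download the Spark 2.3 parcel
As with integrating Kafka into CDH, installing a new Spark version also requires downloading the matching parcel. Download location: http://archive.cloudera.com/spark2/parcels/
1>. Choose the Spark version. It must match the CSD version above and, of course, your operating system version.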
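The version-matching rule can be checked mechanically before downloading anything large. The sketch below is illustrative only: `csd_version` and `parcel_matches` are hypothetical helper names, and the check assumes the parcel file name has the shape `SPARK2-<csd version>-<build string>-<os suffix>.parcel`, for example an `el7` suffix on CentOS 7. Confirm the actual file names in the parcel directory listing before relying on it.

```shell
# csd_version JAR
# Hypothetical helper: prints the version part of a CSD jar name,
# e.g. SPARK2_ON_YARN-2.3.0.cloudera4.jar -> 2.3.0.cloudera4
csd_version() {
    name=$(basename "$1" .jar)
    printf '%s\n' "${name#SPARK2_ON_YARN-}"
}

# parcel_matches CSD_JAR PARCEL_FILE OS_SUFFIX
# Returns 0 when the parcel name carries the same version as the CSD jar
# and ends with the given OS suffix (assumed naming scheme, see lead-in).
parcel_matches() {
    ver=$(csd_version "$1")
    case "$(basename "$2")" in
        SPARK2-"$ver"-*-"$3".parcel) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `parcel_matches SPARK2_ON_YARN-2.3.0.cloudera4.jar "$parcel" el7 && echo ok` prints `ok` only when `$parcel` names a 2.3.0.cloudera4 parcel built for el7.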
2>.