Compiling Hadoop from Source with Compression Support

Building Hadoop-2.6.0-cdh5.7.0:
Software environment:

Component | Version | Baidu Netdisk link
Hadoop | hadoop-2.6.0-cdh5.7.0-src.tar.gz | Link: https://pan.baidu.com/s/1jZKj0Nr0Za8308b7ZnGARQ Extraction code: e12q
JDK | jdk-7u80-linux-x64.tar.gz | Link: https://pan.baidu.com/s/17Iv3opVv6w4WaS5RzrSVHQ Extraction code: pgem
Maven | apache-maven-3.3.9-bin.tar.gz | Link: https://pan.baidu.com/s/1HMKRjgMXdliKLJaAtuJilQ Extraction code: gy6v
protobuf | protobuf-2.5.0.tar.gz | Link: https://pan.baidu.com/s/1wt4-7IesGBqQzV0IyDda3A Extraction code: tjjz

Note: the build must use JDK 1.7; compiling with JDK 1.8 will fail.

Compiling Hadoop
1. Install the required dependency libraries

[root@hadoop-01 ~]# yum install -y svn ncurses-devel
[root@hadoop-01 ~]# yum install -y gcc gcc-c++ make cmake
[root@hadoop-01 ~]# yum install -y openssl openssl-devel zlib-devel libtool
[root@hadoop-01 ~]# yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop autoconf automake

2. Download and upload the software

[hadoop@hadoop-01 ~]$ mkdir app soft source lib data maven_repo shell mysql
[hadoop@hadoop-01 ~]$ cd soft/
[hadoop@hadoop-01 soft]$ rz

[hadoop@hadoop-01 soft]$ ll
total 202192
-rw-r--r-- 1 hadoop hadoop   8491533 Apr 19 13:20 apache-maven-3.3.9-bin.tar.gz
-rw-r--r-- 1 hadoop hadoop  42610549 Apr 19 16:41 hadoop-2.6.0-cdh5.7.0-src.tar.gz
-rw-r--r-- 1 hadoop hadoop 153530841 Apr 19 15:55 jdk-7u80-linux-x64.tar.gz
-rw-r--r-- 1 hadoop hadoop   2401901 Apr 19 15:56 protobuf-2.5.0.tar.gz

3. Install the JDK
Extract the archive. The installation directory must be /usr/java; after extracting, remember to change the owner to root.

[hadoop@hadoop-01 soft]$ exit
[root@hadoop-01 ~]# mkdir /usr/java
[root@hadoop-01 ~]# tar -zxvf /home/hadoop/soft/jdk-7u80-linux-x64.tar.gz -C /usr/java
[root@hadoop-01 ~]# cd /usr/java/
[root@hadoop-01 java]# chown -R  root:root jdk1.7.0_80

Add the environment variables

[root@hadoop-01 jdk1.7.0_80]# vim /etc/profile 
# append the following two environment variable lines
export JAVA_HOME=/usr/java/jdk1.7.0_80
export PATH=$JAVA_HOME/bin:$PATH
[root@hadoop-01 jdk1.7.0_80]# source /etc/profile
# verify that Java was installed successfully
[root@hadoop-01 jdk1.7.0_80]# java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

4. Install Maven

[root@hadoop-01 ~]# su - hadoop
[hadoop@hadoop-01 ~]$ tar -zxvf ~/soft/apache-maven-3.3.9-bin.tar.gz -C ~/app/

Add the environment variables:

# edit the hadoop user's environment variables
[hadoop@hadoop-01 ~]$ vim ~/.bash_profile
# add or modify the following; MAVEN_OPTS sets the memory Maven runs with, preventing build failures caused by too small a heap
export MAVEN_HOME=/home/hadoop/app/apache-maven-3.3.9
export MAVEN_OPTS="-Xms1024m -Xmx1024m"
export PATH=$MAVEN_HOME/bin:$PATH
[hadoop@hadoop-01 ~]$ source ~/.bash_profile
[hadoop@hadoop-01 ~]$ which mvn
~/app/apache-maven-3.3.9/bin/mvn

Configure Maven

[hadoop@hadoop-01 ~]$ vim ~/app/apache-maven-3.3.9/conf/settings.xml
# set the location of Maven's local repository
<localRepository>/home/hadoop/maven_repo/repo</localRepository>
# add the Aliyun mirror of Maven Central; it must be placed between <mirrors> and </mirrors>
<mirror>
     <id>nexus-aliyun</id>
     <mirrorOf>central</mirrorOf>
     <name>Nexus aliyun</name>
     <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
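For orientation, here is a minimal sketch of where those two settings sit inside settings.xml (element placement per Maven's settings file format; the paths are the ones used in this guide):

```xml
<settings>
  <!-- local repository used throughout this guide -->
  <localRepository>/home/hadoop/maven_repo/repo</localRepository>
  <mirrors>
    <!-- Aliyun mirror of Maven Central -->
    <mirror>
      <id>nexus-aliyun</id>
      <mirrorOf>central</mirrorOf>
      <name>Nexus aliyun</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```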

(Optional) Preload jars into the local repository; on a slow network, the first mvn build can take an extremely long time to download dependencies, or even fail.

# link for the jar bundle
Link: https://pan.baidu.com/s/17Q6QJw838oMExjo8T5w9YA Extraction code: z69x
# after downloading, upload it with rz and extract; mind the directory hierarchy
[hadoop@hadoop-01 maven_repo]$ rz
[hadoop@hadoop-01 maven_repo]$ tar -zxvf repo.tar.gz 

5. Install protobuf
Extract

[hadoop@hadoop-01 ~]$ tar -zxvf ~/soft/protobuf-2.5.0.tar.gz -C ~/app/

Build the software

[hadoop@hadoop-01 protobuf-2.5.0]$ cd ~/app/protobuf-2.5.0/
#  --prefix sets the path where the compiled files will be installed
[hadoop@hadoop-01 protobuf-2.5.0]$ ./configure  --prefix=/home/hadoop/app/protobuf-2.5.0
# compile and install
[hadoop@hadoop-01 protobuf-2.5.0]$ make
[hadoop@hadoop-01 protobuf-2.5.0]$ make install

Add the environment variables

[hadoop@hadoop-01 protobuf-2.5.0]$ vim ~/.bash_profile
# append the following two lines; note the bin directory does not exist until after the build and install
export PROTOBUF_HOME=/home/hadoop/app/protobuf-2.5.0
export PATH=$PROTOBUF_HOME/bin:$PATH
[hadoop@hadoop-01 protobuf-2.5.0]$ source ~/.bash_profile 
# verify it took effect; libprotoc 2.5.0 in the output means it works
[hadoop@hadoop-01 protobuf-2.5.0]$ protoc --version
libprotoc 2.5.0

6. Compile Hadoop
Extract

[hadoop@hadoop-01 protobuf-2.5.0]$ tar -zxvf ~/soft/hadoop-2.6.0-cdh5.7.0-src.tar.gz -C ~/source/

Compile Hadoop with native compression support: mvn clean package -Pdist,native -DskipTests -Dtar

# enter the Hadoop source directory
[hadoop@hadoop-01 hadoop-2.6.0-cdh5.7.0]$ cd ~/source/hadoop-2.6.0-cdh5.7.0/
# run the build; the first run downloads many dependency jars, so the time depends on network speed. Be patient: per the summary below, my run took just over 15 minutes.
[hadoop@hadoop-01 hadoop-2.6.0-cdh5.7.0]$ mvn clean package -Pdist,native -DskipTests -Dtar
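As an aside, Hadoop's BUILDING.txt documents extra switches for the native profile; for example, the build can be forced to fail unless snappy is found, or pointed at a non-standard snappy install. A sketch (the /usr/lib64 path is an example, not taken from this guide):

```shell
# require snappy, so the resulting native libraries definitely include it
mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.snappy
# point the build at a custom snappy location and bundle the library into the tarball
mvn clean package -Pdist,native -DskipTests -Dtar \
    -Drequire.snappy -Dsnappy.lib=/usr/lib64 -Dbundle.snappy
```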

If the build fails, the key error message is as follows (skip this if there is no error):

[FATAL] Non-resolvable parent POM for org.apache.hadoop:hadoop-main:2.6.0-cdh5.7.0: Could not transfer artifact com.cloudera.cdh:cdh-root:pom:5.7.0 from/to cdh.repo (https://repository.cloudera.com/artifactory/cloudera-repos): Remote host closed connection
# Analysis: Maven cannot download https://repository.cloudera.com/artifactory/cloudera-repos/com/cloudera/cdh/cdh-root/5.7.0/cdh-root-5.7.0.pom, even though the repository host is reachable from the VM by ping, which is puzzling.
# Solution: go to the matching directory in the local repository, fetch the file there with wget, and rerun the build; or perform the optional step from section 4 and put the needed jars directly into the local repository.
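The manual fix can be sketched as a few shell commands (assuming the local repository at /home/hadoop/maven_repo/repo configured earlier; the directory layout mirrors the Maven coordinates com.cloudera.cdh:cdh-root:5.7.0):

```shell
# local repository configured in settings.xml earlier in this guide
REPO="$HOME/maven_repo/repo"
# Maven stores com.cloudera.cdh:cdh-root:pom:5.7.0 under this path
POM_DIR="$REPO/com/cloudera/cdh/cdh-root/5.7.0"
POM_URL="https://repository.cloudera.com/artifactory/cloudera-repos/com/cloudera/cdh/cdh-root/5.7.0/cdh-root-5.7.0.pom"

mkdir -p "$POM_DIR"
cd "$POM_DIR"
# fetch the pom Maven could not transfer; rerun mvn afterwards
wget "$POM_URL" || echo "download failed, check connectivity to repository.cloudera.com"
```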

Inspect the compiled package: hadoop-2.6.0-cdh5.7.0.tar.gz

# a BUILD SUCCESS line indicates the build succeeded
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  7.687 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  6.518 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [  6.997 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [  5.443 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [  5.799 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  1.712 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  5.434 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  9.530 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.054 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 38.677 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15:15 min
[INFO] Finished at: 2019-04-20T09:16:21+08:00
[INFO] Final Memory: 240M/978M
[INFO] ------------------------------------------------------------------------
[hadoop@hadoop-01 hadoop-2.6.0-cdh5.7.0]$ ll /home/hadoop/source/hadoop-2.6.0-cdh5.7.0/hadoop-dist/target/
total 563944
drwxrwxr-x 2 hadoop hadoop      4096 Apr 20 09:15 antrun
drwxrwxr-x 3 hadoop hadoop      4096 Apr 20 09:15 classes
-rw-rw-r-- 1 hadoop hadoop      1998 Apr 20 09:15 dist-layout-stitching.sh
-rw-rw-r-- 1 hadoop hadoop       690 Apr 20 09:15 dist-tar-stitching.sh
drwxrwxr-x 9 hadoop hadoop      4096 Apr 20 09:15 hadoop-2.6.0-cdh5.7.0
-rw-rw-r-- 1 hadoop hadoop 191839625 Apr 20 09:15 hadoop-2.6.0-cdh5.7.0.tar.gz
-rw-rw-r-- 1 hadoop hadoop      7315 Apr 20 09:15 hadoop-dist-2.6.0-cdh5.7.0.jar
-rw-rw-r-- 1 hadoop hadoop 385571341 Apr 20 09:16 hadoop-dist-2.6.0-cdh5.7.0-javadoc.jar
-rw-rw-r-- 1 hadoop hadoop      4855 Apr 20 09:15 hadoop-dist-2.6.0-cdh5.7.0-sources.jar
-rw-rw-r-- 1 hadoop hadoop      4855 Apr 20 09:15 hadoop-dist-2.6.0-cdh5.7.0-test-sources.jar
drwxrwxr-x 2 hadoop hadoop      4096 Apr 20 09:15 javadoc-bundle-options
drwxrwxr-x 2 hadoop hadoop      4096 Apr 20 09:15 maven-archiver
drwxrwxr-x 3 hadoop hadoop      4096 Apr 20 09:15 maven-shared-archive-resources
drwxrwxr-x 3 hadoop hadoop      4096 Apr 20 09:15 test-classes
drwxrwxr-x 2 hadoop hadoop      4096 Apr 20 09:15 test-dir
[hadoop@hadoop-01 hadoop-2.6.0-cdh5.7.0]$ 

Verify that the build supports compression
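The original post showed a screenshot of this check. One common way to run it (assuming the freshly built tarball has been extracted and its bin directory is on PATH) is hadoop checknative, which reports whether each native codec was compiled in:

```shell
# lists hadoop, zlib, snappy, lz4, bzip2, openssl and whether each is available;
# with -a the command exits non-zero if any native library is missing
hadoop checknative -a
```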


Reposted from blog.csdn.net/weixin_43212365/article/details/89412879