Compiling the Hadoop 2.7.3 source code into a package that supports Snappy, and installing it on a Hadoop cluster
Preface
Prepare resources
Note: compile as the root user to avoid folder-permission problems.
apache-maven-3.6.3-bin.tar.gz
Download: https://maven.apache.org/download.cgi
hadoop-2.7.3-src.tar.gz
Download: https://archive.apache.org/dist/hadoop/common/
protobuf-2.5.0.tar.gz
Download: https://github.com/protocolbuffers/protobuf/releases?after=v3.0.0-alpha-3.1
snappy-1.1.3.tar.gz
Download: https://github.com/google/snappy/releases
Hadoop related resources, these are also
1. Maven installation
Upload the installation package to the Linux machine and extract it:
tar -zxvf apache-maven-3.6.3-bin.tar.gz   # extract into the current directory
To avoid slow dependency downloads later, point Maven at a mirror. Go to the conf directory:
cd apache-maven-3.6.3/conf/
Edit the settings.xml file in this directory and add a mirror inside <mirrors>:
<mirrors>
<!-- mirror
| Specifies a repository mirror site to use instead of a given repository. The repository that
| this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
| for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
|
<mirror>
<id>mirrorId</id>
<mirrorOf>repositoryId</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://my.repository.com/repo/path</url>
</mirror>
-->
<!-- Switch to the Aliyun mirror -->
<mirror>
<id>alimaven</id>
<name>aliyun maven</name>
<url>http://maven.aliyun.com/nexus/content/groups/public/</url>
<mirrorOf>central</mirrorOf>
</mirror>
</mirrors>
Maven's default local repository is
/root/.m2/repository/
To use a different location, add a localRepository element to settings.xml:
<localRepository>/usr/Software/Maven/repository</localRepository> <!-- choose your own path -->
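Both localRepository and mirrors are direct children of the top-level settings element. As a minimal sketch of how the two fit together, the function below writes a stripped-down settings.xml. The function name write_maven_settings and its arguments are hypothetical; the mirror values are the ones shown above.

```shell
# Hypothetical helper: write a minimal Maven settings.xml.
#   $1 = output file
#   $2 = local repository directory
write_maven_settings() {
  cat > "$1" <<EOF
<settings>
  <localRepository>$2</localRepository>
  <mirrors>
    <mirror>
      <id>alimaven</id>
      <name>aliyun maven</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF
}
```

For example, write_maven_settings /tmp/settings.xml /usr/Software/Maven/repository produces a file that can be tried out with mvn -s /tmp/settings.xml -version before touching the real conf/settings.xml.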
Add the environment variables:
vi /etc/profile
export MAVEN_HOME=/usr/Software/Maven/apache-maven-3.6.3
export PATH=$PATH:$MAVEN_HOME/bin
Save and exit, then reload the profile:
source /etc/profile
Verify that the environment variables took effect:
mvn -version
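If mvn is not found, the usual cause is that $MAVEN_HOME/bin never made it into PATH. A small sketch for checking this directly; the function name path_contains is hypothetical.

```shell
# Return 0 if directory $1 is an entry in the colon-separated list $2
# (defaults to the current $PATH), otherwise return 1.
path_contains() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: path_contains "$MAVEN_HOME/bin" && echo "Maven bin directory is on PATH"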
2. Prepare the compilation environment
yum install svn
yum install autoconf automake libtool cmake
yum install ncurses-devel
yum install openssl-devel
yum install "gcc*"   # quoted so the shell passes the pattern to yum unexpanded
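After the packages are installed, it can help to confirm that the build tools are actually reachable before starting a long build. The following is a sketch; the function name missing_tools is hypothetical, and the list of tools to check is up to you.

```shell
# Print the name of every command in the argument list that cannot be
# found on PATH, one per line. Prints nothing if all are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Usage: missing_tools svn autoconf automake libtool cmake gcc make
The header packages (ncurses-devel, openssl-devel) do not install a command, so they would need a package query such as rpm -q ncurses-devel openssl-devel instead.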
Install Snappy:
upload the installation package to the Linux machine and extract it
tar -zxvf snappy-1.1.3.tar.gz
cd snappy-1.1.3/
Run the following commands in order.
They create the bin, include, and lib directories under /usr/local/.
./configure
make
make install
If libsnappy files appear under /usr/local/lib, the installation succeeded.
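The check can be scripted. This is a sketch only: the function name has_libsnappy is hypothetical, and the default directory assumes the /usr/local prefix used above.

```shell
# Return 0 if directory $1 (default: /usr/local/lib) contains at least
# one file whose name starts with "libsnappy.", otherwise return 1.
has_libsnappy() {
  for f in "${1:-/usr/local/lib}"/libsnappy.*; do
    # If the glob matched nothing, $f is the literal pattern and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}
```

Usage: has_libsnappy && echo "Snappy is installed" || echo "libsnappy not found"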
Install protobuf:
upload the installation package to the Linux machine and extract it
tar -zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0/
Run the following commands in order:
./configure
make
make install
Check that protobuf was installed successfully by running:
protoc --version
3. Compile the Hadoop 2.7.3 source code with Snappy support
Similarly, upload hadoop-2.7.3-src.tar.gz
and extract it:
tar -zxvf hadoop-2.7.3-src.tar.gz
cd hadoop-2.7.3-src/
mvn clean package -DskipTests -Pdist,native -Dtar -Dsnappy.lib=/usr/local/lib -Dbundle.snappy
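The build takes a long time (mine took more than two hours), so be patient. A prebuilt installation package with Snappy support is provided at the end of this post.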
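Because the build runs for so long, it may be worth capturing its output in a file so that a failure can be inspected afterwards. A minimal sketch; the function name run_logged and the log file name are hypothetical.

```shell
# Run a command with stdout and stderr redirected to the log file $1.
# The function's exit status is the exit status of the command.
run_logged() {
  log=$1
  shift
  "$@" > "$log" 2>&1
}
```

Usage: run_logged build.log mvn clean package -DskipTests -Pdist,native -Dtar -Dsnappy.lib=/usr/local/lib -Dbundle.snappy
Progress can then be followed from another terminal with tail -f build.log.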
After the build succeeds, hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3.tar.gz is the generated installation package with Snappy support.
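Before installing the package, you can check that the Snappy library was actually bundled into it. This sketch assumes that -Dbundle.snappy places the libsnappy files under lib/native/ inside the tarball; the function name tarball_has_snappy is hypothetical.

```shell
# Return 0 if the gzip-compressed tarball $1 lists at least one
# libsnappy file under a lib/native/ directory, otherwise non-zero.
tarball_has_snappy() {
  tar -tzf "$1" | grep -q 'lib/native/libsnappy'
}
```

Usage: tarball_has_snappy hadoop-dist/target/hadoop-2.7.3.tar.gz && echo "Snappy is bundled"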
A problem you may encounter
org.apache.maven.plugin.MojoExecutionException:
protoc version is 'libprotoc x.x.x', expected version is '2.5.0' -> [Help 1
This error means that building Hadoop requires protobuf 2.5.0, and the protoc version currently installed does not match it.
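To catch this before starting the build, the output of protoc --version can be compared against the expected version. A minimal sketch, assuming the output has the form "libprotoc 2.5.0" as in the error message; the function name protoc_version_ok is hypothetical.

```shell
# Return 0 if the version string $1 (e.g. "libprotoc 2.5.0") reports
# exactly version $2 (default: 2.5.0), otherwise return 1.
protoc_version_ok() {
  [ "${1#libprotoc }" = "${2:-2.5.0}" ]
}
```

Usage: protoc_version_ok "$(protoc --version)" || echo "protoc is not 2.5.0"
If the version is wrong even though 2.5.0 was installed, command -v protoc shows which protoc binary is being picked up from PATH.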
4. Install a Hadoop cluster that supports Snappy
Extract the hadoop-2.7.3.tar.gz generated above:
tar -zxvf hadoop-2.7.3.tar.gz
If you already have a Hadoop cluster:
cd hadoop-2.7.3/lib/native/
Copy all files in this directory into lib/native/ under the existing Hadoop installation directory.
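The copy step can be sketched as follows. The function name copy_native_libs is hypothetical, and $HADOOP_HOME in the usage line is an assumption standing in for wherever the existing Hadoop installation lives.

```shell
# Copy everything in source directory $1 into destination directory $2.
# cp -a preserves symlinks and permissions, which matters because shared
# libraries are often installed as symlinks to versioned files.
copy_native_libs() {
  mkdir -p "$2" && cp -a "$1"/. "$2"/
}
```

Usage, from inside hadoop-2.7.3/lib/native/: copy_native_libs . "$HADOOP_HOME/lib/native"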
Start the Hadoop cluster,
then run
hadoop checknative
If the snappy entry shows true, the setup succeeded.
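The same check can be done in a script. This sketch assumes checknative prints one line per library in the form "snappy:  true /path/to/libsnappy.so.1"; the function name snappy_enabled is hypothetical.

```shell
# Read `hadoop checknative`-style output on stdin. Return 0 if there is
# a line whose first field is "snappy:" and whose second field is "true".
snappy_enabled() {
  awk '$1 == "snappy:" && $2 == "true" { found = 1 } END { exit !found }'
}
```

Usage: hadoop checknative | snappy_enabled && echo "Snappy is available"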
If you have not built a Hadoop cluster yet, see:
Hadoop HA cluster construction
5. Hadoop installation package supporting Snappy
Link: https://pan.baidu.com/s/101n4HOClGdJu6oNTb8Kpbg
Extraction code: ta8l