Contents
Environment
Installation and compilation
Configuring Hadoop to support LZO
Installing the lzop command
lzop command reference
Environment
- CentOS 6.5 (64-bit)
- JDK 1.8
- Hadoop-2.6.0-cdh5.7.0
- LZO 2.10 package (lzo-2.10.tar.gz)
- hadoop-lzo
- Maven 3.6.0
Installation and compilation
1. Install the dependencies and download the LZO package
[hadoop@192 sbin]$ sudo yum -y install lzo-devel zlib-devel gcc autoconf automake libtool
[hadoop@192 software]$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
2. Extract and compile the LZO package
[hadoop@192 software]$ tar -zxvf lzo-2.10.tar.gz -C ~/source
[hadoop@192 source]$ cd lzo-2.10/
[hadoop@192 lzo-2.10]$ ./configure --enable-shared --prefix /usr/local/lzo-2.10
[hadoop@192 lzo-2.10]$ make && sudo make install
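Before moving on, it can help to confirm that `make install` really placed the header and the shared library under the prefix, since the hadoop-lzo build needs both. The following is a minimal sketch; the function name `check_lzo_prefix` is hypothetical, and it assumes the default autoconf layout (`include/lzo/lzo1x.h` and `lib/liblzo2.so*` below the prefix).

```shell
# Check that an LZO install prefix contains the header and the shared
# library that hadoop-lzo compiles against. Returns 0 when both exist.
# Assumption: default autoconf layout (include/ and lib/ under the prefix).
check_lzo_prefix() {
    prefix=$1
    if [ ! -f "$prefix/include/lzo/lzo1x.h" ]; then
        echo "missing header: $prefix/include/lzo/lzo1x.h" >&2
        return 1
    fi
    # The shared library carries a version suffix, so match it with a glob.
    for lib in "$prefix"/lib/liblzo2.so*; do
        if [ -e "$lib" ]; then
            echo "LZO looks installed under $prefix"
            return 0
        fi
    done
    echo "missing shared library: $prefix/lib/liblzo2.so*" >&2
    return 1
}
```

With the prefix used above, the call would be `check_lzo_prefix /usr/local/lzo-2.10`.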
3. Download and extract hadoop-lzo
[hadoop@192 software]$ wget https://github.com/twitter/hadoop-lzo/archive/master.zip
# Extract
[hadoop@192 software]$ unzip master -d ~/source
4. Edit pom.xml in the hadoop-lzo-master directory and add the Cloudera repository
[hadoop@192 hadoop-lzo-master]$ pwd
/home/hadoop/source/hadoop-lzo-master
[hadoop@192 hadoop-lzo-master]$ ll
total 76
-rw-rw-r--. 1 hadoop hadoop 35147 Oct 13 2017 COPYING
-rw-rw-r--. 1 hadoop hadoop 19753 Oct 13 2017 pom.xml
-rw-rw-r--. 1 hadoop hadoop 10170 Oct 13 2017 README.md
drwxrwxr-x. 2 hadoop hadoop 4096 Oct 13 2017 scripts
drwxrwxr-x. 4 hadoop hadoop 4096 Oct 13 2017 src
[hadoop@192 hadoop-lzo-master]$ vi pom.xml
<repository>
<id>cloudera-repo</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
</repository>
5. Change the Hadoop version
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<hadoop.current.version>2.6.0-cdh5.7.0</hadoop.current.version>
<hadoop.old.version>1.0.4</hadoop.old.version>
</properties>
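Since a missed pom edit only shows up after a long Maven run, a quick text check beforehand may save time. This is a rough sketch, not part of hadoop-lzo; the function name `check_pom` is hypothetical, and it only greps for the strings shown above rather than parsing the XML.

```shell
# Check that pom.xml mentions the Cloudera repository and the expected
# Hadoop version. Plain text matching only; this does not parse the XML.
check_pom() {
    pom=$1
    expected_version=$2
    if ! grep -qF "repository.cloudera.com" "$pom"; then
        echo "Cloudera repository not found in $pom" >&2
        return 1
    fi
    if ! grep -qF "<hadoop.current.version>$expected_version</hadoop.current.version>" "$pom"; then
        echo "hadoop.current.version is not $expected_version in $pom" >&2
        return 1
    fi
    echo "pom.xml looks consistent"
}
```

For the setup in this article the call would be `check_pom pom.xml 2.6.0-cdh5.7.0`.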
6. Compile hadoop-lzo; this takes a while, so be patient
[hadoop@192 hadoop-lzo-master]$ C_INCLUDE_PATH=/usr/local/lzo-2.10/include \
> LIBRARY_PATH=/usr/local/lzo-2.10/lib \
> mvn clean package
# Copy the jar into the Hadoop directory
[hadoop@192 target]$ cp hadoop-lzo-0.4.21-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/
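The jar name above includes a version that changes with the hadoop-lzo checkout, so a small helper that locates the jar by pattern can make the copy step less brittle. A sketch under assumptions: `find_hadoop_lzo_jar` is a hypothetical name, and the `-sources` and `-javadoc` exclusions only matter if the build also emits those jars.

```shell
# Print the path of the main hadoop-lzo jar inside a Maven target directory.
# Skips -sources and -javadoc jars in case the build produces them as well.
find_hadoop_lzo_jar() {
    target_dir=$1
    for jar in "$target_dir"/hadoop-lzo-*.jar; do
        case $jar in
            *-sources.jar|*-javadoc.jar) continue ;;
        esac
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no hadoop-lzo jar found in $target_dir" >&2
    return 1
}
```

From the hadoop-lzo-master directory, the copy could then be written as `cp "$(find_hadoop_lzo_jar target)" "$HADOOP_HOME/share/hadoop/common/"`.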
Configuring Hadoop to support LZO
1. Edit hadoop-env.sh under $HADOOP_HOME/etc/hadoop and add the environment variable (point it at the lib directory of your LZO installation)
export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib
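If this setup is scripted or repeated on several nodes, appending the export line blindly would duplicate it on every run. The helper below is one possible way to add a line only when it is missing; `append_once` is a hypothetical name and not part of Hadoop.

```shell
# Append a line to a file only if the exact line is not already present.
append_once() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the line as a fixed string.
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}
```

For example: `append_once "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" 'export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib'`.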
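2. Edit core-site.xml
Mind the formatting here: the value below is broken across lines only for readability. In the real file, write the whole codec list on a single line with no line breaks; otherwise you get an error such as: Exception in thread "main" java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.BZipCodec not found.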
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.BZip2Codec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
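Because the codec list has to end up on one line, it may be convenient to keep a readable multi-line copy and collapse it mechanically. The sketch below does that with standard tools; `join_codecs` is a hypothetical helper name, and it simply strips all whitespace and a trailing comma from its standard input.

```shell
# Collapse a multi-line, comma-separated codec list into a single line.
join_codecs() {
    # Delete spaces, tabs, carriage returns and newlines from stdin.
    joined=$(tr -d ' \t\r\n')
    # Drop one trailing comma, if present, and print the result.
    printf '%s\n' "${joined%,}"
}

# Example: the list from the property above, one codec per line.
join_codecs <<'EOF'
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.BZip2Codec
EOF
```

The printed line can be pasted between `<value>` and `</value>` as is.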
3. Edit mapred-site.xml
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
<name>mapred.child.env</name>
<value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>
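After editing the site files, it can be useful to read a property back and see what value actually landed there. The following is a rough sketch rather than a real XML parser: `get_property` is a hypothetical helper, and it assumes each `<name>` and its `<value>` sit on separate lines with the value on a single line, as in the fragments above.

```shell
# Print the value of a named property from a Hadoop *-site.xml file.
# Assumption: <name> and <value> are on separate lines, value on one line.
get_property() {
    file=$1
    name=$2
    awk -v want="<name>$name</name>" '
        # Remember that the wanted <name> line has been seen.
        index($0, want) { found = 1; next }
        # The next line carrying <value> belongs to that property.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}
```

For example, `get_property "$HADOOP_HOME/etc/hadoop/mapred-site.xml" mapred.map.output.compression.codec` should print the codec class configured above.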
Installing the lzop command
[hadoop@192 data]$ sudo yum install lzop
[sudo] password for hadoop:
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
* base: mirrors.cn99.com
* extras: mirrors.aliyun.com
* updates: mirrors.163.com
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package lzop.x86_64 0:1.02-0.9.rc1.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=========================================================================================================================================================
Package Arch Version Repository Size
=========================================================================================================================================================
Installing:
lzop x86_64 1.02-0.9.rc1.el6 base 50 k
Transaction Summary
=========================================================================================================================================================
Install 1 Package(s)
Total download size: 50 k
Installed size: 93 k
Is this ok [y/N]: y
Downloading Packages:
lzop-1.02-0.9.rc1.el6.x86_64.rpm | 50 kB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : lzop-1.02-0.9.rc1.el6.x86_64 1/1
Verifying : lzop-1.02-0.9.rc1.el6.x86_64 1/1
Installed:
lzop.x86_64 0:1.02-0.9.rc1.el6
Complete!
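lzop command reference
Command | Description |
---|---|
lzop -v test | Create the compressed file test.lzo with verbose output; the test file is kept |
lzop -Uv test | Create the compressed file test.lzo with verbose output; the test file is deleted |
lzop -t test.lzo | Test the integrity of test.lzo |
lzop --info test.lzo | Show the file header of each file in test.lzo |
lzop -l test.lzo | List compression information for each file in test.lzo |
lzop --ls test.lzo | List the contents of test.lzo, similar to ls -l |
cat test \| lzop > t.lzo | Compress standard input and write the result to standard output |
lzop -dv test.lzo | Decompress test.lzo into test with verbose output; test.lzo is kept |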
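The commands in the table can be combined into a quick round-trip check: compress a file, test the archive, decompress it, and compare the result with the original. This is a minimal sketch; `lzop_roundtrip` is a hypothetical function name, and it relies on lzop's `-t`, `-d` and `-c` (write to standard output) options.

```shell
# Compress a file with lzop, verify the archive, decompress it again and
# compare with the original. Returns 0 on a clean round trip, 2 if lzop
# is not installed, and a non-zero status on any failure.
lzop_roundtrip() {
    src=$1
    if ! command -v lzop >/dev/null 2>&1; then
        echo "lzop is not installed" >&2
        return 2
    fi
    workdir=$(mktemp -d) || return 1
    # Read from stdin and write to stdout so the source file is untouched.
    lzop < "$src" > "$workdir/copy.lzo" &&
        lzop -t "$workdir/copy.lzo" &&
        lzop -dc "$workdir/copy.lzo" | cmp -s - "$src"
    status=$?
    rm -rf "$workdir"
    return $status
}
```

Calling `lzop_roundtrip test && echo "round trip OK"` on any local file gives a simple sanity check that the installed lzop works.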
References:
https://github.com/twitter/hadoop-lzo
http://linux.51yip.com/search/lzop