Installing LZO for Hadoop

Contents

Environment
Installation and compilation
Configuring Hadoop to support LZO
Installing the lzop command
lzop command reference

Environment


Installation and compilation

1. Install the dependencies and download the LZO package
[hadoop@192 sbin]$ sudo yum -y install lzo-devel zlib-devel gcc autoconf automake libtool
[hadoop@192 software]$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
2. Extract and compile the LZO package
[hadoop@192 software]$ tar -zxvf lzo-2.10.tar.gz -C ~/source
[hadoop@192 source]$ cd lzo-2.10/
[hadoop@192 lzo-2.10]$ ./configure --enable-shared --prefix /usr/local/lzo-2.10
[hadoop@192 lzo-2.10]$ make && sudo make install
3. Download and extract hadoop-lzo
[hadoop@192 software]$ wget https://github.com/twitter/hadoop-lzo/archive/master.zip
# Extract
[hadoop@192 software]$ unzip master.zip -d ~/source
4. Edit pom.xml under hadoop-lzo-master and add the Cloudera repository (inside the <repositories> element)
[hadoop@192 hadoop-lzo-master]$ pwd
/home/hadoop/source/hadoop-lzo-master
[hadoop@192 hadoop-lzo-master]$ ll
total 76
-rw-rw-r--. 1 hadoop hadoop 35147 Oct 13  2017 COPYING
-rw-rw-r--. 1 hadoop hadoop 19753 Oct 13  2017 pom.xml
-rw-rw-r--. 1 hadoop hadoop 10170 Oct 13  2017 README.md
drwxrwxr-x. 2 hadoop hadoop  4096 Oct 13  2017 scripts
drwxrwxr-x. 4 hadoop hadoop  4096 Oct 13  2017 src
[hadoop@192 hadoop-lzo-master]$ vi pom.xml
<repository>
   <id>cloudera-repo</id>
   <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
</repository>
5. Set the Hadoop version in the same pom.xml (hadoop.current.version)
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <hadoop.current.version>2.6.0-cdh5.7.0</hadoop.current.version>
    <hadoop.old.version>1.0.4</hadoop.old.version>
</properties>
6. Build hadoop-lzo. This takes a while, so be patient.
[hadoop@192 hadoop-lzo-master]$ C_INCLUDE_PATH=/usr/local/lzo-2.10/include \
>   LIBRARY_PATH=/usr/local/lzo-2.10/lib \
>   mvn clean package

# Copy the jar into the Hadoop directory
[hadoop@192 target]$ cp hadoop-lzo-0.4.21-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/
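
The jar name above depends on which hadoop-lzo version was built. As a minimal sketch, assuming the Maven build wrote its output to the target directory, a small helper can pick the main jar and skip any -sources or -javadoc jars that may sit next to it. The name find_hadoop_lzo_jar is hypothetical and not part of hadoop-lzo.

```shell
# Sketch: print the path of the main hadoop-lzo jar found in a directory.
# find_hadoop_lzo_jar is a hypothetical helper name, not part of hadoop-lzo.
find_hadoop_lzo_jar() {
    target_dir="$1"
    for jar in "$target_dir"/hadoop-lzo-*.jar; do
        # If nothing matched, the pattern stays literal and is not a file
        [ -f "$jar" ] || continue
        case "$jar" in
            # Skip auxiliary jars that Maven may also produce
            *-sources.jar|*-javadoc.jar) continue ;;
        esac
        printf '%s\n' "$jar"
        return 0
    done
    return 1
}

# Usage (assumes the source tree location used in this post):
#   jar=$(find_hadoop_lzo_jar ~/source/hadoop-lzo-master/target) &&
#       cp "$jar" "$HADOOP_HOME/share/hadoop/common/"
```

The function returns a non-zero status when no jar is found, so the copy in the usage line only runs after a successful build.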

Configuring Hadoop to support LZO

1. Edit hadoop-env.sh under $HADOOP_HOME/etc/hadoop and add the environment variable (the lib directory under the LZO install prefix used above)
export LD_LIBRARY_PATH=/usr/local/lzo-2.10/lib
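2. Edit core-site.xml

Mind the formatting here: the value below is split across lines only for readability. In the actual file, write the entire value on a single line with no line breaks or spaces. Otherwise you will get an error such as: Exception in thread "main" java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.BZipCodec not found.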

<property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
           org.apache.hadoop.io.compress.DefaultCodec,
           com.hadoop.compression.lzo.LzoCodec,
           com.hadoop.compression.lzo.LzopCodec,
           org.apache.hadoop.io.compress.BZip2Codec
        </value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
3. Edit mapred-site.xml
<property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
</property>
<property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/lzo-2.10/lib</value>
</property>
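
The single-line requirement for io.compression.codecs in step 2 can be met mechanically. The following is a minimal sketch, not something from the Hadoop tooling: normalize_codecs is a hypothetical helper that strips all whitespace from a codec list kept in the readable multi-line form, so the result can be pasted into the value element.

```shell
# Sketch: collapse a multi-line codec list into the single-line form.
# normalize_codecs is a hypothetical helper name, not a Hadoop command.
normalize_codecs() {
    # Remove spaces, tabs, carriage returns and newlines; commas are kept
    printf '%s' "$1" | tr -d ' \t\r\n'
}

codecs=$(normalize_codecs "org.apache.hadoop.io.compress.GzipCodec,
           org.apache.hadoop.io.compress.DefaultCodec,
           com.hadoop.compression.lzo.LzoCodec,
           com.hadoop.compression.lzo.LzopCodec,
           org.apache.hadoop.io.compress.BZip2Codec
")

# Print the line to paste into core-site.xml
printf '<value>%s</value>\n' "$codecs"
```

Keeping the readable list in a script and generating the one-line value avoids hand-editing a long comma-separated string.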

Installing the lzop command

Once the steps above are done, install the lzop command-line tool.

[hadoop@192 data]$ sudo yum install lzop
[sudo] password for hadoop: 
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: mirrors.cn99.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.163.com
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package lzop.x86_64 0:1.02-0.9.rc1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=========================================================================================================================================================
 Package                          Arch                               Version                                      Repository                        Size
=========================================================================================================================================================
Installing:
 lzop                             x86_64                             1.02-0.9.rc1.el6                             base                              50 k

Transaction Summary
=========================================================================================================================================================
Install       1 Package(s)

Total download size: 50 k
Installed size: 93 k
Is this ok [y/N]: y
Downloading Packages:
lzop-1.02-0.9.rc1.el6.x86_64.rpm                                                                                                  |  50 kB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : lzop-1.02-0.9.rc1.el6.x86_64                                                                                                          1/1 
  Verifying  : lzop-1.02-0.9.rc1.el6.x86_64                                                                                                          1/1 

Installed:
  lzop.x86_64 0:1.02-0.9.rc1.el6                                                                                                                         

Complete!


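lzop command reference

Command                    Description
lzop -v test               Create test.lzo with verbose output; the test file is kept
lzop -Uv test              Create test.lzo with verbose output; the test file is deleted
lzop -t test.lzo           Test the integrity of test.lzo
lzop --info test.lzo       List the file header of each file in test.lzo
lzop -l test.lzo           List the compression information of each file in test.lzo
lzop --ls test.lzo         List the contents of test.lzo, similar to ls -l
cat test | lzop > t.lzo    Compress standard input and write the result to standard output
lzop -dv test.lzo          Decompress test.lzo into test with verbose output; test.lzo is kept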
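
To see several of these options working together, here is a minimal sketch of a round-trip check, assuming lzop is installed as shown earlier. lzop_roundtrip is a hypothetical helper name. It compresses a file into a temporary directory, tests the archive with -t, then decompresses to standard output with -dc and compares the result with the original.

```shell
# Sketch: compress a file with lzop, test the archive, then decompress it
# and compare with the original. lzop_roundtrip is a hypothetical helper name.
lzop_roundtrip() {
    src="$1"
    work=$(mktemp -d) || return 1
    # -o writes the archive to the given path; the source file is left untouched
    lzop -o "$work/copy.lzo" "$src" &&
        lzop -t "$work/copy.lzo" &&
        lzop -dc "$work/copy.lzo" | cmp -s - "$src"
    status=$?
    rm -rf "$work"
    return "$status"
}

# Usage (assumes lzop is installed and a file named test exists):
#   lzop_roundtrip test && echo "round trip OK"
```

Because the archive is written to a scratch directory, the check leaves no test.lzo behind next to the source file.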

References:
https://github.com/twitter/hadoop-lzo
http://linux.51yip.com/search/lzop

Reposted from blog.csdn.net/aubekpan/article/details/86983337