Enabling LZO compression on hadoop-0.20.203 (installed successfully)

# Prepare the installation packages and scp them to every node
pwd
/work/lzo
#scp ./* node-host:/work/lzo
ls -l
total 3240
-rw-r--r--  1 root root 2176215 07-13 16:12 hadoop-gpl-packaging-0.2.8-1.x86_64.rpm
drwxr-xr-x 13 root root    4096 07-13 16:23 lzo-2.06
-rw-r--r--  1 root root  141887 07-13 16:12 lzo-2.06-1.el5.rf.x86_64.rpm
-rw-r--r--  1 root root  583045 07-13 16:12 lzo-2.06.tar.gz
-rw-r--r--  1 root root   32402 07-13 16:12 lzo-devel-2.06-1.el5.rf.x86_64.rpm
-rw-r--r--  1 root root  370775 07-13 16:12 lzop-1.03.tar.gz
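
Copying the package directory to each node, as the commented-out scp line above suggests, can be scripted. The following is only a minimal sketch: the helper name `push_packages`, the `DRY_RUN` switch, and the node names are hypothetical, and it assumes passwordless SSH to every node and that the parent directory already exists there.

```shell
#!/bin/sh
# Hypothetical helper: copy a package directory to the same parent path on each node.
# $1 = local directory (e.g. /work/lzo); remaining arguments = node host names.
# With DRY_RUN=1 (the default here) it only prints the scp commands.
push_packages() {
    src_dir=$1
    shift
    dest_parent=$(dirname "$src_dir")
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -r $src_dir $node:$dest_parent"
        else
            scp -r "$src_dir" "$node:$dest_parent"
        fi
    done
}

# Example: print the commands for two hypothetical nodes
push_packages /work/lzo node1 node2
```

Setting `DRY_RUN=0` before the call performs the copies instead of printing them.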

# Install lzo on every node
tar -xzvf lzo-2.06.tar.gz
cd lzo-2.06
./configure --enable-shared
make
make install
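# On a 64-bit system, copy the libraries to /usr/lib64; on a 32-bit system, copy them to /usr/lib.
# Alternatively, add /usr/local/lib/ to /etc/ld.so.conf and run /sbin/ldconfig,
# or create an lzo.conf file under /etc/ld.so.conf.d/ containing the lzo library path, then run /sbin/ldconfig -v to apply it.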
cp /usr/local/lib/liblzo2.so* /usr/lib64/
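
The ld.so.conf.d alternative mentioned in the comments above could look roughly like the sketch below. The function name `register_lib_dir` and its optional second argument are hypothetical; writing under /etc and running ldconfig both require root.

```shell
#!/bin/sh
# Hypothetical helper: write a library directory into a dynamic-linker config file.
# $1 = directory holding liblzo2.so*
# $2 = config file to write (defaults to the lzo.conf path suggested in the text)
register_lib_dir() {
    lib_dir=$1
    conf_file=${2:-/etc/ld.so.conf.d/lzo.conf}
    printf '%s\n' "$lib_dir" > "$conf_file"
}

# Usage as root, instead of copying the .so files into /usr/lib64:
#   register_lib_dir /usr/local/lib && /sbin/ldconfig
#   /sbin/ldconfig -p | grep liblzo2    # confirm the linker cache now lists liblzo2
```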

cd ..
rpm -ivh lzo-2.06-1.el5.rf.x86_64.rpm
rpm -ivh lzo-devel-2.06-1.el5.rf.x86_64.rpm

# Install lzop
tar -xzvf lzop-1.03.tar.gz
cd lzop-1.03
./configure
make
make install
which lzop
#/usr/local/bin/lzop

cd ..
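# Install hadoop-gpl-packaging; rpm -e removes any earlier version (it reports an error if none is installed, which can be ignored)
rpm -e hadoop-gpl-packaging
rpm -Uvh hadoop-gpl-packaging-0.2.8-1.x86_64.rpm

# Copy the hadoop-lzo jar and the native libraries into the Hadoop installation
cp /opt/hadoopgpl/lib/hadoop-lzo.jar $HADOOP_HOME/lib
cp /opt/hadoopgpl/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/Linux-amd64-64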


# Edit the Hadoop configuration files and scp them to every node
vi $HADOOP_HOME/conf/core-site.xml
<property>     
<name>io.compression.codecs</name>     
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>     
<name>io.compression.codec.lzo.class</name>     
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

vi $HADOOP_HOME/conf/mapred-site.xml
<property>
<name>mapred.compress.map.output</name>     
<value>true</value>   
</property>   
<property>     
<name>mapred.map.output.compression.codec</name>      
<value>com.hadoop.compression.lzo.LzoCodec</value>   
</property>
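
Because the edited files have to reach every node, a quick check that a given core-site.xml really lists the LZO codecs can catch a node that was missed. This is a minimal sketch with a hypothetical helper name, `has_lzo_codecs`; it only greps for the two class names from the configuration above and does not parse the XML.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both LZO codec class names appear in the file.
# $1 = path to a core-site.xml
has_lzo_codecs() {
    conf_file=$1
    grep -q 'com\.hadoop\.compression\.lzo\.LzoCodec' "$conf_file" &&
        grep -q 'com\.hadoop\.compression\.lzo\.LzopCodec' "$conf_file"
}

# Usage on each node:
#   has_lzo_codecs "$HADOOP_HOME/conf/core-site.xml" && echo "LZO codecs configured"
```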

Running a MapReduce job then failed with:

lzo.LzoCompressor: java.lang.UnsatisfiedLinkError: Cannot load liblzo2.so.2 (liblzo2.so.2: cannot open shared object file: No such file or directory)!

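A quick Google search pointed to the cause: the test system, surprisingly, had a 32-bit Java installed, and a 32-bit JVM cannot load the 64-bit liblzo2.so.2.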

file $JAVA_HOME/bin/java
/usr/share/jdk1.6.0_30/bin/java: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), for GNU/Linux 2.2.5, not stripped

Fix: reinstall Java with a 64-bit JDK.
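
This kind of mismatch can be spotted before running a job by comparing the word size that `file` reports for the JVM and for the library. The sketch below uses a hypothetical helper, `elf_bits`, which classifies one line of `file` output passed as its argument; the library path in the usage comment assumes the /usr/lib64 copy made earlier.

```shell
#!/bin/sh
# Hypothetical helper: classify a line of `file` output as 32, 64 or unknown.
# $1 = the output of `file` for one binary or shared library
elf_bits() {
    case "$1" in
        *"ELF 64-bit"*) echo 64 ;;
        *"ELF 32-bit"*) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Usage (-L makes file follow symlinks such as liblzo2.so.2):
#   java_bits=$(elf_bits "$(file -L "$JAVA_HOME/bin/java")")
#   lzo_bits=$(elf_bits "$(file -L /usr/lib64/liblzo2.so.2)")
#   [ "$java_bits" = "$lzo_bits" ] || echo "JVM is ${java_bits}-bit but liblzo2 is ${lzo_bits}-bit"
```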

The cleaned-up hadoop lzo installation script is as follows:

#!/bin/sh

echo 'installing lzo >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>'
tar -xzvf lzo-2.06.tar.gz  
cd lzo-2.06  
./configure --enable-shared  
make  
make install
cp /usr/local/lib/liblzo2.so* /usr/lib64/

cd ..  
rpm -ivh lzo-2.06-1.el5.rf.x86_64.rpm  
rpm -ivh lzo-devel-2.06-1.el5.rf.x86_64.rpm

echo 'installing lzop >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>'
tar -xzvf lzop-1.03.tar.gz  
cd lzop-1.03  
./configure  
make  
make install

cd ..  
echo 'installing hadoop-gpl-packaging >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>'
rpm -e hadoop-gpl-packaging  
rpm -Uvh hadoop-gpl-packaging-0.2.8-1.x86_64.rpm

cp /opt/hadoopgpl/lib/hadoop-lzo.jar $HADOOP_HOME/lib  
cp /opt/hadoopgpl/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/Linux-amd64-64

 

Enabling lzo compression in hbase likewise requires copying the corresponding libraries:

cd $HBASE_HOME/lib
mkdir -p native/Linux-amd64-64
cp /opt/hadoopgpl/lib/hadoop-lzo.jar $HBASE_HOME/lib/
cp /opt/hadoopgpl/native/Linux-amd64-64/* $HBASE_HOME/lib/native/Linux-amd64-64/

hbase lzo test:

create 'lzotest', {NAME=>'cf', COMPRESSION=>'lzo'}
put 'lzotest', 'row-1', 'cf:col-1', 'val-1'
put 'lzotest', 'row-2', 'cf:col-2', 'val-2'
put 'lzotest', 'row-3', 'cf', 'val-3'
put 'lzotest', 'row-4', 'cf:col-1', 'val-4'

scan 'lzotest'
ROW                                         COLUMN+CELL                                                                                                                 
 row-1                                      column=cf:col-1, timestamp=1342424266301, value=val-1                                                                       
 row-2                                      column=cf:col-2, timestamp=1342424275314, value=val-2                                                                       
 row-3                                      column=cf:, timestamp=1342424286206, value=val-3                                                                            
 row-4                                      column=cf:col-1, timestamp=1342424315516, value=val-4                                                                       
4 row(s) in 0.0750 seconds

Running the hbase compression test program:

[root@master ~]# /work/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.util.CompressionTest 
Usage: CompressionTest <path> none|gz|lzo

For example:
  hbase class org.apache.hadoop.hbase.util.CompressionTest file:///tmp/testfile gz

[root@master ~]# /work/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/lzotest/hadoop-root-jobtracker-master.log.2012-06-19 lzo
12/07/16 15:23:58 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
12/07/16 15:23:58 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library
12/07/16 15:23:58 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.io.IOException: java.lang.AbstractMethodError: com.hadoop.compression.lzo.LzoCompressor.reinit(Lorg/apache/hadoop/conf/Configuration;)V
        at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:89)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:890)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:819)
        at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:112)
        at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:133)
Caused by: java.lang.AbstractMethodError: com.hadoop.compression.lzo.LzoCompressor.reinit(Lorg/apache/hadoop/conf/Configuration;)V
        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
        at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
        at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:199)
        at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:84)
        ... 4 more

This failed. Looking in hbase/lib turned up a previously copied hadoop-gpl-compression-0.1.0.jar; after deleting it, the test succeeded:

[root@master ~]# /work/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/lzotest/hadoop-root-jobtracker-master.log.2012-06-19 lzo
12/07/16 15:42:25 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
12/07/16 15:42:25 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6cbf4e232d7972c94107600567333a372ea08c0a]
12/07/16 15:42:25 INFO compress.CodecPool: Got brand-new compressor
SUCCESS
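
The AbstractMethodError above came from an older hadoop-gpl-compression jar sitting on the classpath alongside hadoop-lzo.jar. A minimal sketch for spotting such leftovers follows; the helper name `find_stale_lzo_jars` is hypothetical, and the file pattern is based only on the jar name seen in this post.

```shell
#!/bin/sh
# Hypothetical helper: list old hadoop-gpl-compression jars in a lib directory.
# $1 = directory to inspect (e.g. $HBASE_HOME/lib)
find_stale_lzo_jars() {
    lib_dir=$1
    for jar in "$lib_dir"/hadoop-gpl-compression-*.jar; do
        # When nothing matches, the unexpanded pattern is skipped by the -e test
        if [ -e "$jar" ]; then
            echo "$jar"
        fi
    done
}

# Usage: review the list, then delete the files by hand
#   find_stale_lzo_jars "$HBASE_HOME/lib"
```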

 

References:

http://blog.csdn.net/liuzhoulong/article/details/7179766

http://share.blog.51cto.com/278008/549393

http://wiki.apache.org/hadoop/UsingLzoCompression

http://running.iteye.com/blog/969800

http://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/wiki/FAQ?redir=1

http://www.tech126.com/hadoop-lzo/

http://code.google.com/p/hadoop-gpl-packing/

http://inprice.iteye.com/blog/1450893

http://jiajiezhuwudl.i.sohu.com/blog/view/223874167.htm


Reposted from zhb-mccoy.iteye.com/blog/1591891