Reading LZO-compressed files in Java

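I made a silly mistake today: the data had been compressed with the LZO codec, but I tried to read it with the LZOP codec, and it cost me a painful while to figure out why it failed.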

A brief introduction to LzopCodec and LzoCodec:

1. Files compressed with LzoCodec end in .lzo_deflate; the class to load is com.hadoop.compression.lzo.LzoCodec
2. Files compressed with LzopCodec end in .lzo; the class to load is com.hadoop.compression.lzo.LzopCodec
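To make the mapping above concrete, here is a minimal sketch that picks the codec class name from the file suffix. The class name LzoCodecChooser and the method codecClassFor are hypothetical helpers invented for this illustration; only the two suffixes and the two codec class names come from the list above.

```java
// Hypothetical helper (not part of hadoop-lzo): maps a file name to the
// codec class name from the list above, based only on the file suffix.
final class LzoCodecChooser {
    static final String LZO_CODEC = "com.hadoop.compression.lzo.LzoCodec";
    static final String LZOP_CODEC = "com.hadoop.compression.lzo.LzopCodec";

    /** Returns the codec class name for the file, or null if the suffix is not recognized. */
    static String codecClassFor(String fileName) {
        if (fileName.endsWith(".lzo_deflate")) {
            return LZO_CODEC;   // written by LzoCodec
        }
        if (fileName.endsWith(".lzo")) {
            return LZOP_CODEC;  // written by LzopCodec
        }
        return null;            // not an LZO file as far as the suffix tells us
    }

    public static void main(String[] args) {
        System.out.println(codecClassFor("part-00000.lzo_deflate"));
        System.out.println(codecClassFor("part-00000.lzo"));
    }
}
```

The returned name can be passed to Class.forName, so the codec is chosen per file instead of being hard-coded.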

Reading files compressed with LzoCodec

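import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.LinkedList;
import java.util.List;

import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.util.ReflectionUtils;

// The class name is only illustrative; the original post shows the members without a class.
public class LzoFileReader {
    private static Configuration conf = new Configuration(true);
    private static CompressionCodec codec;

    static {
        String path = "/usr/local/webserver/hadoop/etc/hadoop/";
        conf.addResource(new Path(path + "core-site.xml"));
        conf.addResource(new Path(path + "hdfs-site.xml"));
        try {
            // Load the class that decompresses LzoCodec output; LzopCodec is the counterpart for .lzo files
            Class<?> codecClass = Class.forName("com.hadoop.compression.lzo.LzoCodec");
            codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
        } catch (ClassNotFoundException e) {
            // Class.forName throws a checked exception, which a static block must handle
            throw new ExceptionInInitializerError(e);
        }
    }

    public List<String> readFile(String dir) {
        List<String> list = new LinkedList<String>();
        try {
            Path path = new Path(dir);
            // FileSystem.get returns a shared, cached instance, so it is not closed here
            FileSystem hdfs = FileSystem.get(URI.create(dir), conf);
            // List the files under the HDFS directory
            FileStatus[] fileStatus = hdfs.listStatus(path);
            // Iterate over the files and read them one by one
            for (int i = 0; i < fileStatus.length; i++) {
                // Wrap the raw stream in a decompressing stream; both are closed after each file
                try (InputStream raw = hdfs.open(fileStatus[i].getPath());
                     InputStream input = codec.createInputStream(raw)) {
                    list.addAll(IOUtils.readLines(input, "UTF-8"));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return list;
    }
}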
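One way to catch the LzoCodec/LzopCodec mix-up early is to look at the first bytes of a file before choosing a codec. The sketch below assumes the standard 9-byte magic number that the lzop tool writes at the start of its files; the class LzopMagic and its methods are hypothetical names for this illustration. A file that lacks the magic is not proven to be LzoCodec output, it only rules out the lzop container format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper: checks whether data starts with the lzop file magic.
final class LzopMagic {
    // Magic bytes at the start of an lzop-format file: 0x89 'L' 'Z' 'O' 0x00 \r \n 0x1a \n
    private static final byte[] MAGIC = {
        (byte) 0x89, 0x4c, 0x5a, 0x4f, 0x00, 0x0d, 0x0a, 0x1a, 0x0a
    };

    /** True if the given bytes begin with the lzop magic number. */
    static boolean isLzopHeader(byte[] head) {
        return head.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(head, MAGIC.length), MAGIC);
    }

    /** Reads up to 9 bytes from the stream (consuming them) and checks them against the magic. */
    static boolean startsWithLzopMagic(InputStream in) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int read = 0;
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                break; // stream ended before 9 bytes were available
            }
            read += n;
        }
        return read == head.length && isLzopHeader(head);
    }
}
```

Because startsWithLzopMagic consumes the bytes it inspects, the file has to be reopened (or the stream reset) before it is handed to a codec.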

Reposted from adofu.iteye.com/blog/2264390