ZipFile vs. ZipInputStream in java.util.zip

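Below is an excerpt of a comment found online (see the linked page for the original); its author in turn translated it from a discussion on a foreign website: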

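The ZipInputStream class reads a ZIP file as a sequential stream of entries (put simply, it walks through the compressed files one after another), whereas the ZipFile class reads entry contents through a built-in random-access mechanism, so the entries do not have to be read in order.
Another basic difference between ZipInputStream and ZipFile is caching. When a file is read through ZipInputStream and FileInputStream, the ZIP entries are not cached. If the file is opened with ZipFile(fileName), however, an internal cache is used, so when ZipFile(fileName) is called repeatedly the file is opened only once and the cached value is used on the second open. On UNIX systems, note that all ZIP files opened with ZipFile are memory-mapped, so ZipFile performs better than ZipInputStream. However, if the contents of the same ZIP file change or are reloaded frequently while the program runs, ZipInputStream becomes the preferred choice.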
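
The two access styles described above can be sketched as follows. This is only a minimal illustration, not code from the quoted comment: the class name ZipAccessDemo, its method names, and the sample entries a.txt/b.txt/c.txt are hypothetical, and only standard java.util.zip calls are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipAccessDemo {

    /** Random access: ZipFile looks the entry up by name, no scan needed. */
    public static String readWithZipFile(Path zip, String entryName) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zf.getInputStream(entry)) {
                return new String(readAll(in), StandardCharsets.UTF_8);
            }
        }
    }

    /** Sequential access: ZipInputStream visits entries in stored order. */
    public static String readWithZipInputStream(Path zip, String entryName) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // read() returns -1 at the end of the current entry.
                    return new String(readAll(zin), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    /** Drains a stream into a byte array (kept Java 7/8 compatible). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /** Builds a small temporary ZIP with three hypothetical text entries. */
    public static Path createSampleZip() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (String name : new String[] {"a.txt", "b.txt", "c.txt"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(("content of " + name).getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = createSampleZip();
        try {
            System.out.println(readWithZipFile(zip, "c.txt"));
            System.out.println(readWithZipInputStream(zip, "c.txt"));
        } finally {
            Files.delete(zip);
        }
    }
}
```

Both methods return the same text; the difference is that the ZipInputStream version has to step over a.txt and b.txt before it reaches c.txt, while the ZipFile version goes straight to the requested entry.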

I ran a simple performance test with JMH; the code is below, kept here for reference:

package com;
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

import java.io.File;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;

/**
 * @author John Kenrinus Lee
 * @version 2016-05-07
 */
@Fork(1)
@Warmup(iterations = 4, time = 4)
@Measurement(iterations = 4, time = 4)
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class UnzipTest {
    @Benchmark
    @Threads(1)
    public void zipinputstream() {
        File file = new File("/Users/temp/temp/weizhi21");
        ZipUtils.deleteExistsWithNoPermissionCheck(file);
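        // Do not wrap mkdirs() in assert: assertions are disabled by default,
        // in which case the directory would never be created.
        if (!file.mkdirs()) {
            throw new IllegalStateException("Cannot create " + file);
        }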
        try {
            ZipUtils.unzipFileByZipInputStream(new File("/Users/temp/temp/weizhi2/temp.zip"),
                    "/Users/temp/temp/weizhi21", true, null, null);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Benchmark
    @Threads(1)
    public void zipfile() {
        File file = new File("/Users/temp/temp/weizhi11");
        ZipUtils.deleteExistsWithNoPermissionCheck(file);
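        // Do not wrap mkdirs() in assert: assertions are disabled by default,
        // in which case the directory would never be created.
        if (!file.mkdirs()) {
            throw new IllegalStateException("Cannot create " + file);
        }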
        try {
            ZipUtils.unzipFile(new File("/Users/temp/temp/weizhi1/temp.zip"),
                    "/Users/temp/temp/weizhi11", true, null, null);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

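ZipUtils can be downloaded from my GitHub. Only some of the Java classes there are in a usable state; the rest are experimental and the code is rather messy, so I strongly recommend ignoring them.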

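Conclusion: [ZipFile outperforms ZipInputStream]. Across the two runs below, ZipFile averages about 69-70 ms per unzip versus about 79-82 ms for ZipInputStream, although the error intervals overlap.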

    // Benchmark                     Mode  Samples         Score          Error  Units
    // UnzipTest.zipfile             avgt        4  70085414.616 ± 19146631.125  ns/op
    // UnzipTest.zipinputstream      avgt        4  81746111.179 ± 23409444.648  ns/op
    // UnzipTest.zipinputstream      avgt        4  78950748.131 ± 48377116.992  ns/op
    // UnzipTest.zipfile             avgt        4  68650953.926 ± 24305554.065  ns/op

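Notes:
The tar.gz/tgz format [tar+gzip] offers good compatibility and a high compression ratio, but as far as I can see neither the Java JDK nor the Android SDK ships a tar reader (java.util.zip covers only GZIP), so an extra library such as apache-ant.jar has to be added. It is also somewhat heavy for devices with phone-level computing power, and it is less universal than zip, so I am not digging into it further for now.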
