A note on reading from a compressed InputStream

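I once came across a snippet online that tries to optimize reading from an InputStream by choosing how many bytes to read at a time based on the size of the stream. The code is as follows: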

DataOutputStream outputStream = null;
try {
    outputStream = new DataOutputStream(new FileOutputStream(file));
    int len = inputStream.available();
    // Is the reported length at most 1 MB?
    if (len <= 1024 * 1024) {
        byte[] bytes = new byte[len];
        int rLen = inputStream.read(bytes);
        if (rLen > 0)
            outputStream.write(bytes);
    } else {
        int byteCount;
        // Read in 1 MB chunks
        byte[] bytes = new byte[1024 * 1024];
        while ((byteCount = inputStream.read(bytes)) != -1) {
            outputStream.write(bytes, 0, byteCount);
        }
    }
} finally {
    inputStream.close();
    if (outputStream != null)
        outputStream.close();
}

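This code looks fine, and I had used it in a real project without any trouble. Then, while parsing a zip stream, it failed: the content was not fully written to the file, even though the file's size in bytes matched. At first I suspected the way I was parsing the zip file, and only later narrowed it down to the step that writes the file. I began to wonder about

inputStream.available()

and whether its return value was the problem, but stepping through in the debugger showed nothing wrong with it. In the end I removed the size check and read with a plain while loop, and that worked.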
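
A minimal sketch of such a plain loop is shown below. The class name PlainCopy, the method name copy and the 8 KB buffer are illustrative assumptions, not code from the original project. The point is only that the buffer size does not depend on available() and that each write uses the count returned by read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration; the names are not from the original project.
class PlainCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];   // fixed size, independent of available()
        long total = 0;
        int n;
        // read() may return fewer bytes than buffer.length; only -1 means end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);      // write only the bytes actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, stream".getBytes("UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied + " bytes copied");
    }
}
```

The caller stays responsible for closing both streams, as in the finally block of the snippet above.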

After that I kept wondering what had actually caused it. Here is the InputStream source:

public int read(byte b[], int off, int len) throws IOException {
    if (b == null) {
        throw new NullPointerException();
    } else if (off < 0 || len < 0 || len > b.length - off) {
        throw new IndexOutOfBoundsException();
    } else if (len == 0) {
        return 0;
    }

    int c = read();
    if (c == -1) {
        return -1;
    }
    b[off] = (byte)c;

    int i = 1;
    try {
        for (; i < len ; i++) {
            c = read();
            if (c == -1) {
                break;
            }
            b[off + i] = (byte)c;
        }
    } catch (IOException ee) {
    }
    return i;
}

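The method reads up to len bytes by calling read() in a for loop, and the loop ends when read() returns -1. read() itself is abstract, so a concrete subclass has to implement it. Looking further, the zip parsing uses

java.util.zip.ZipFile

and the InputStream implementation it hands back is

java.util.zip.ZipFile$ZipFileInflaterInputStream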
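
One way to check which implementation class is in play is to ask the stream for its runtime class. The sketch below assumes a throwaway zip built on the spot. StreamClassProbe, sampleZip and the entry name hello.txt are hypothetical names used only for this illustration, and the class printed may differ between JDK versions or for entries stored without compression.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical probe for illustration only.
class StreamClassProbe {
    /** Returns the runtime class name of the stream ZipFile hands back for an entry. */
    static String streamClassName(Path zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile());
             InputStream in = zip.getInputStream(zip.getEntry(entryName))) {
            return in.getClass().getName();
        }
    }

    /** Writes a throwaway zip with a single entry (ZipOutputStream deflates by default). */
    static Path sampleZip() throws IOException {
        Path zipPath = Files.createTempFile("probe", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write("hello".getBytes("UTF-8"));
            zos.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        Path zipPath = sampleZip();
        System.out.println(streamClassName(zipPath, "hello.txt"));
        Files.delete(zipPath);
    }
}
```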

The doc comment on the corresponding read implementation says:

/**
 * Reads uncompressed data into an array of bytes. If <code>len</code> is not
 * zero, the method will block until some input can be decompressed; otherwise,
 * no bytes are read and <code>0</code> is returned.
 * @param b the buffer into which the data is read
 * @param off the start offset in the destination array <code>b</code>
 * @param len the maximum number of bytes read
 * @return the actual number of bytes read, or -1 if the end of the
 *         compressed input is reached or a preset dictionary is needed
 * @exception  NullPointerException If <code>b</code> is <code>null</code>.
 * @exception  IndexOutOfBoundsException If <code>off</code> is negative,
 * <code>len</code> is negative, or <code>len</code> is greater than
 * <code>b.length - off</code>
 * @exception ZipException if a ZIP format error has occurred
 * @exception IOException if an I/O error has occurred
 */

Its description of the return value reads:

the actual number of bytes read, or -1 if the end of the compressed input is reached or a preset dictionary is needed
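
To see "the actual number of bytes read" in practice, a small experiment can compare what the first read() call returns with what all calls return together. This is a sketch under stated assumptions: ShortReadDemo, sampleZip and the entry name data.bin are hypothetical, the entry holds pseudo-random bytes so that it barely compresses, and the size of the first read depends on the JDK's internal buffering, so no particular number is promised.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical experiment for illustration only.
class ShortReadDemo {
    /** Returns {bytes returned by the first read(), bytes returned by all reads together}. */
    static long[] firstReadAndTotal(Path zipPath, String entryName, int bufferSize)
            throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile());
             InputStream in = zip.getInputStream(zip.getEntry(entryName))) {
            byte[] buffer = new byte[bufferSize];
            int first = in.read(buffer);           // a single call, as in the small-file branch
            long total = Math.max(first, 0);
            int n;
            while ((n = in.read(buffer)) != -1) {  // keep reading until the end of the entry
                total += n;
            }
            return new long[] { first, total };
        }
    }

    /** Writes a throwaway zip whose single entry holds size pseudo-random bytes. */
    static Path sampleZip(String entryName, int size) throws IOException {
        byte[] data = new byte[size];
        new Random(42).nextBytes(data);            // random data barely compresses
        Path zipPath = Files.createTempFile("short-read", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(data);
            zos.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        int size = 512 * 1024;
        Path zipPath = sampleZip("data.bin", size);
        long[] r = firstReadAndTotal(zipPath, "data.bin", size);
        System.out.println("buffer=" + size + " firstRead=" + r[0] + " total=" + r[1]);
        Files.delete(zipPath);
    }
}
```

If firstRead comes out smaller than the buffer, a single read followed by outputStream.write(bytes) would still write the whole array. That would match the symptom above: a file of the expected size whose tail was never filled in.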

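At this point I finally had the answer. For a compressed stream, read returns only the actual number of bytes it produced in that call, which can be fewer than the buffer length, and it returns -1 only once the end of the compressed input is reached. The first version calls read a single time in the small-file branch and then writes the whole array, so it cannot read a compressed stream correctly: the file gets the expected size, but the part that was never read stays empty. So the code has to keep reading until it sees -1: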

DataOutputStream outputStream = null;
try {
    outputStream = new DataOutputStream(new FileOutputStream(file));
    int len = inputStream.available();
    int byteCount;
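    // Is the reported length at most 1 MB?
    if (len <= 1024 * 1024) {
        // available() may return 0, and read() on a zero-length buffer
        // returns 0 forever, so keep the buffer at least 1 byte long
        byte[] bytes = new byte[Math.max(len, 1)];
        while ((byteCount = inputStream.read(bytes)) != -1) {
            outputStream.write(bytes, 0, byteCount);
        }
    } else {
        // Read in 1 MB chunks
        byte[] bytes = new byte[1024 * 1024];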
        while ((byteCount = inputStream.read(bytes)) != -1) {
            outputStream.write(bytes, 0, byteCount);
        }
    }
} finally {
    inputStream.close();
    if (outputStream != null)
        outputStream.close();
}
