Java Byte Streams for Beginners (Reading Files)

Introduction

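Writing a file is mostly single-threaded and sequential, so FileOutputStream is enough. Reads, however, are usually not sequential. To read from any position you like, you typically need RandomAccessFile.

Before reading, you need to know where the data starts (offset) and how long it is (length). Then you can use RandomAccessFile's seek method to move to the start position and read the data out.

The basic flow looks like this, and it is simple: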

try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {

    // The data is 10 bytes long
    int length = 10;

    // The data starts at offset=5
    int offset = 5;

    // Allocate an array of length bytes
    byte[] result = new byte[length];

    // Move the RAF to the start of the data, ready to read length bytes
    raf.seek(offset);

    // Read up to length bytes
    int count = raf.read(result);
    System.out.println("Read " + count + " bytes");

    // Read once more
    count = raf.read(result);
    System.out.println("Read " + count + " bytes");

} catch (Exception e) {
    e.printStackTrace();
}

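The code above looks complete, but it is missing one important step: verification. When you hand someone a task, you check whether they finished it and how well they did. The same applies here.

Here is how RandomAccessFile's read(byte b[]) method is defined (unimportant doc removed): it reads up to b.length bytes into b and returns the number of bytes actually read; when the end of the file has been reached and no data is left, it returns -1.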

/**
 * Reads up to {@code b.length} bytes of data from this file
 * into an array of bytes. This method blocks until at least one byte
 * of input is available.
 * <p>
 * Although {@code RandomAccessFile} is not a subclass of
 * {@code InputStream}, this method behaves in exactly the
 * same way as the {@link InputStream#read(byte[])} method of
 * {@code InputStream}.
 *
 * @param      b   the buffer into which the data is read.
 * @return     the total number of bytes read into the buffer, or
 *             {@code -1} if there is no more data because the end of
 *             this file has been reached.
 * @exception  IOException If the first byte cannot be read for any reason
 * other than end of file, or if the random access file has been closed, or if
 * some other I/O error occurs.
 * @exception  NullPointerException If {@code b} is {@code null}.
 */
public int read(byte b[]) throws IOException {
    return readBytes(b, 0, b.length);
}

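Every detail in a program deserves attention. So why does this method have a return value?

I told the method how much to read, and it reports the number back to me. Isn't that pointless? No. How much data can actually be read is not known in advance; that is, the method does not guarantee that a single call will read b.length bytes. One reason: the file simply does not have that many bytes left to read.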
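The end-of-file case is easy to reproduce. The sketch below is only an illustration: the class name ShortReadDemo, the helper readOnce, and the 8-byte temporary file are all made up for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class ShortReadDemo {

    /** Seeks to offset, issues ONE read into a buffer of the given length, and returns read's result. */
    public static int readOnce(Path path, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            raf.seek(offset);
            return raf.read(new byte[length]);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("short-read", ".bin");
        try {
            // The file holds only 8 bytes
            Files.write(tmp, new byte[8]);

            // Ask for 10 bytes at offset 5: only bytes 5..7 exist, so this prints 3
            System.out.println(readOnce(tmp, 5, 10));

            // Ask again at the end of the file: nothing left, so this prints -1
            System.out.println(readOnce(tmp, 8, 10));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The caller asked for 10 bytes both times; only the return value reveals that it got 3 and then nothing.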

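I'm not sure whether there are other reasons. In a 4 GB file, I did 40,000 single-threaded random reads of up to 1 GB each, and every return value matched the requested length. Still, with "up to" in the doc, the method makes no promise, which suggests that other causes can also keep it from reading everything requested. This is a guess; let's call it Teacher Qiao's conjecture for now.

Why bring up this "up to"? Because the doc for InputStream's read(byte b[]) goes even further: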

Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. This method blocks until input data is available, end of file is detected, or an exception is thrown.

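The first sentence says "Reads some number of bytes from the input stream". What kind of doc is that, a joke? But that is exactly how the method behaves. It really cannot guarantee that you get all the data you want.

One thing is certain, though: the return value is "the total number of bytes read into the buffer". You can check whether you read the complete data.

And in a real system, you must check whether you read the complete data. Otherwise your system may crash, and you won't know where the problem is.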
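A minimal sketch of such a check might look like the following. The class CheckedRead and the helper readChecked are hypothetical names invented here; the helper issues one read and refuses to hand back a partially filled buffer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class CheckedRead {

    /**
     * Hypothetical helper: seeks to offset, reads once, and throws
     * if fewer than length bytes came back (including -1 at end of file).
     */
    public static byte[] readChecked(RandomAccessFile raf, long offset, int length) throws IOException {
        byte[] result = new byte[length];
        raf.seek(offset);
        int count = raf.read(result);
        if (count != length) {
            throw new IOException("short read at offset " + offset
                    + ": wanted " + length + " bytes, got " + count);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("checked-read", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw")) {
            raf.write(new byte[8]); // the file holds only 8 bytes

            System.out.println(readChecked(raf, 0, 8).length); // all 8 bytes are there

            try {
                readChecked(raf, 5, 10); // only 3 bytes exist from offset 5
            } catch (IOException e) {
                System.out.println(e.getMessage());
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The point is only that the short read becomes a visible failure at the place it happens, instead of a half-filled array that breaks something later.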

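So is there a remedy? Yes. RandomAccessFile provides another method, readFully(byte b[]): it does not return until b.length bytes have been read, unless it hits the end of the file or an exception. This one is much more dependable.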

/**
 * Reads {@code b.length} bytes from this file into the byte
 * array, starting at the current file pointer. This method reads
 * repeatedly from the file until the requested number of bytes are
 * read. This method blocks until the requested number of bytes are
 * read, the end of the stream is detected, or an exception is thrown.
 *
 * @param      b   the buffer into which the data is read.
 * @exception  EOFException  if this file reaches the end before reading
 *               all the bytes.
 * @exception  IOException   if an I/O error occurs.
 */
public final void readFully(byte b[]) throws IOException {
    readFully(b, 0, b.length);
}

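Its implementation simply wraps the original read method in a loop: if one read doesn't finish the job, read again! The loop ends once len bytes have been read; if read returns a negative value (end of file) before that, it throws EOFException.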

public final void readFully(byte b[], int off, int len) throws IOException {
    int n = 0;
    do {
        int count = this.read(b, off + n, len - n);
        if (count < 0)
            throw new EOFException();
        n += count;
    } while (n < len);
}

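This implementation supports Teacher Qiao's conjecture. If the plain read method were guaranteed to return all the requested data except at end of file, no loop would be needed: a single read, followed by throwing an exception when count is less than len, would be enough. So there must be other situations in which plain read returns less data than requested.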
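For streams in general, a short read well before the end of the data is easy to stage. The sketch below does not show what RandomAccessFile does on a real disk; it uses a made-up TrickleInputStream (a hypothetical class for this example) that hands out at most maxChunk bytes per call, which is allowed by the InputStream contract, and compares a single read with DataInputStream's readFully.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TrickleDemo {

    /** Hypothetical stream that returns at most maxChunk bytes per read call. */
    public static class TrickleInputStream extends InputStream {
        private final InputStream in;
        private final int maxChunk;

        public TrickleInputStream(byte[] data, int maxChunk) {
            this.in = new ByteArrayInputStream(data);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Never deliver more than maxChunk bytes in one call
            return in.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** One plain read: returns however many bytes that single call delivered. */
    public static int singleRead(byte[] data, int maxChunk, int want) throws IOException {
        try (InputStream in = new TrickleInputStream(data, maxChunk)) {
            return in.read(new byte[want]);
        }
    }

    /** readFully loops internally until want bytes have arrived. */
    public static byte[] fullRead(byte[] data, int maxChunk, int want) throws IOException {
        try (DataInputStream in = new DataInputStream(new TrickleInputStream(data, maxChunk))) {
            byte[] buf = new byte[want];
            in.readFully(buf);
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20];                           // plenty of data: 20 bytes
        System.out.println(singleRead(data, 3, 10));          // 3, although 20 bytes are available
        System.out.println(fullRead(data, 3, 10).length);     // 10, after several internal reads
    }
}
```

Here the single read comes back short even though the data is nowhere near exhausted, which is the kind of situation the loop in readFully is written for.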

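With readFully(), a short read shows up as just one exception: EOFException (End of File Exception, thrown when the end of the file is reached before the requested length has been read).

So the complete read flow is RandomAccessFile's readFully plus a check: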

try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
    int length = 10;
    int offset = 5;
    byte[] result = new byte[length];
    raf.seek(offset);

    // Must read length bytes; throws if it cannot read them all
    raf.readFully(result);
} catch (EOFException e) {
    System.out.println("Reached end of file before reading 10 bytes; EOFException thrown");
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
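To try this flow end to end, here is a self-contained sketch. The class ReadFullyDemo, the helper readAt, and the 20-byte temporary file are assumptions made for the example, not part of the original snippet.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ReadFullyDemo {

    /** Hypothetical helper: reads exactly length bytes starting at offset, or throws EOFException. */
    public static byte[] readAt(Path path, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            byte[] result = new byte[length];
            raf.seek(offset);
            raf.readFully(result);
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-fully", ".bin");
        try {
            // Write bytes 0, 1, ..., 19 so each value equals its position in the file
            byte[] data = new byte[20];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);

            // Enough data: prints [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
            System.out.println(Arrays.toString(readAt(tmp, 5, 10)));

            // Only 5 bytes remain from offset 15, so readFully throws
            try {
                readAt(tmp, 15, 10);
            } catch (EOFException e) {
                System.out.println("EOFException: fewer than 10 bytes left at offset 15");
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```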

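When reading file data with RandomAccessFile, prefer readFully(byte b[]). Whichever read method you use, check that you actually got the data you wanted, and handle the exceptions. That completes the Java byte stream primer series, which covers reading and writing files and moving data between memory and disk.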

Feel free to follow my WeChat official account: 数据库漫游指南
