[NIO] A detailed introduction to the commonly used file-operation classes in Java's NIO packages


Foreword: The original article is very detailed. This post reworks and organizes its content according to my own understanding, including points I verified in the JDK source code. It is intended only as a study note.

References:

[Java NIO] An article to understand NIO - puyangsky - Blog Park (cnblogs.com)

1. Understand NIO

The original Java IO is BIO (blocking IO). NIO was introduced in JDK 1.4; the name stands for New IO, and it is often read as non-blocking IO because it also supports non-blocking operation. A passage from the book "Java NIO" explains why NIO emerged:

The operating system does not match Java's stream-based IO model. The operating system wants to move large chunks of data (buffers), often with the assistance of hardware direct memory access (DMA), while the native IO classes of the JVM like to operate on small pieces of data (stream-oriented: a single byte, a few lines of text, and so on). As a result, the operating system delivers whole buffers of data, and the stream classes of java.io spend a lot of time breaking them into small pieces, often copying each piece through several layers of objects. The operating system likes to move data by the truckload, while the java.io classes like to process data one small piece at a time. With NIO, you can easily back a truckload of data up to a ByteBuffer object that can be used directly. Java's RandomAccessFile class comes closer to the way the operating system works.

Therefore, Java's native IO model is slow because it does not match the way the operating system works. The main reason NIO is faster than BIO is that it works with buffers.

In short, IO is stream-oriented and NIO is buffer-oriented. Stream-oriented means that one or more bytes are read from the stream at a time until everything has been read, which corresponds to operating on small pieces of data, and what has been read is not cached. NIO instead reads a large block of data into a buffer first and then processes the buffer inside the process, which is more efficient.
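
The difference between the two styles can be sketched by counting read calls. This is a minimal illustration, not a benchmark: the class name StreamVsBufferDemo, the temporary file, and the 1024-byte buffer size are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class StreamVsBufferDemo {
    /** Stream style: one byte per read() call. Returns {bytesRead, readCalls}. */
    static long[] readWithStream(Path file) throws IOException {
        long bytes = 0, calls = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (true) {
                calls++;
                if (in.read() < 0) break;   // -1 signals end of stream
                bytes++;
            }
        }
        return new long[] {bytes, calls};
    }

    /** Buffer style: each read() call moves a block of bytes into a ByteBuffer. */
    static long[] readWithBuffer(Path file) throws IOException {
        long bytes = 0, calls = 0;
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (true) {
                calls++;
                int n = channel.read(buffer);
                if (n < 0) break;           // -1 signals end of file
                bytes += n;
                buffer.clear();             // reuse the buffer for the next block
            }
        }
        return new long[] {bytes, calls};
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio-demo", ".bin");
        try {
            Files.write(file, new byte[4096]);
            long[] s = readWithStream(file);
            long[] b = readWithBuffer(file);
            System.out.println("stream: " + s[0] + " bytes in " + s[1] + " calls");
            System.out.println("buffer: " + b[0] + " bytes in " + b[1] + " calls");
        } finally {
            Files.delete(file);
        }
    }
}
```

Both methods read the same bytes; the buffer-oriented version simply needs far fewer calls to do so.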

2. Understand the buffer

The operating system moves data between the disk and the process buffer as follows:

  • The process calls read(), asking the operating system kernel to fill its buffer
  • The kernel instructs the disk controller to fetch the data blocks in which the target file is stored
  • The disk returns the data blocks to kernel space, and the kernel splits them up to extract the requested content
  • The kernel communicates with the process through the channel established between them, and the process can read and write large blocks of data via the kernel's buffer

A loose analogy: the kernel buffer plays the role of a CPU cache, and the disk plays the role of the slower storage behind it.


Because user code cannot operate the hardware directly, it must ask the operating system kernel to operate the disk through a system call. In addition, block storage devices such as disks work with fixed-size data blocks, while users request data of arbitrary sizes. The role of kernel space here is to split and reassemble the data.

3. Basic components of NIO

The three most important components are Buffer, Channel, and Selector.

The NIO classes live in the java.nio package and its subpackages, roughly as follows:

  • java.nio package: Buffer-related classes
  • java.nio.channels package: Channel and Selector related classes
  • java.nio.charset package: classes for handling character sets
  • java.nio.file package: Path, Paths, and Files (since Java 7)

3.1 Buffer

Buffer is an abstract class. Its subclasses are ByteBuffer, CharBuffer, ShortBuffer, IntBuffer, LongBuffer, FloatBuffer, and DoubleBuffer. As the names say, each specific Buffer is a buffer of one specific element type.

Among them, ByteBuffer, the buffer of bytes, is used the most. Look up the others in the documentation when you need them; for now, mastering ByteBuffer is enough.

These subclasses are themselves abstract classes.

3.1.1 Create a buffer

A buffer object has a set of attributes that can be read and manipulated once the buffer has been created:

  • capacity: the maximum size of the buffer, fixed when it is created
  • limit: the upper bound of the usable region, that is, the index of the first element that must not be read or written
  • position: the index of the next element to read or write, updated by get() and put()
  • mark: a remembered position; mark() sets mark = position, and reset() sets position = mark
  • Invariant: 0 <= mark <= position <= limit <= capacity (the mark is -1 while it is undefined)

These attributes are all defined in the abstract class Buffer.
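
As a small sketch of how these attributes move, the following prints them before and after a few put() calls. The class name BufferAttributesDemo and the 8-byte capacity are assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;

class BufferAttributesDemo {
    /** Formats the three public attributes of a buffer. */
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        System.out.println(state(buffer));      // position=0 limit=8 capacity=8

        // Each relative put() stores one byte and advances position by one
        buffer.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println(state(buffer));      // position=3 limit=8 capacity=8
        System.out.println(buffer.remaining()); // 5, that is limit - position
    }
}
```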


In addition, the source code shows that the creation methods of ByteBuffer (allocate and wrap) are static factory methods, and that the concrete ByteBuffer implementation classes are hidden from the user, which makes the class convenient to call. So the ByteBuffer object we get back is actually an instance of an implementation class such as HeapByteBuffer, and the abstract methods of ByteBuffer are implemented there.

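Method 1: the static allocate method, which creates a buffer by direct allocation:

// The most common way: create a brand-new buffer
// [allocate takes the capacity, measured in the element type of the class, here bytes]
int capacity = 1024;
ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);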

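In the source code, allocate rejects a negative capacity with an IllegalArgumentException and otherwise returns a new HeapByteBuffer whose limit equals its capacity.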

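Method 2: the static wrap method, which wraps an existing byte array in a buffer:

// Create a buffer backed by a byte array; the array itself is the buffer's storage
// Changes made through the buffer are visible in the array, and vice versa
// [wrap means to use the given byte array as the container of the buffer]
byte[] array = new byte[1024];
ByteBuffer byteBuffer = ByteBuffer.wrap(array);
// The wrapped range can also be restricted with wrap(array, offset, length)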

In the source code, wrap(array) simply calls wrap(array, 0, array.length), which returns a new HeapByteBuffer backed by the given array.
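
The shared storage and the offset variant of wrap can be checked with a short sketch. The class name WrapDemo and the helper windowState are hypothetical names introduced for this example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class WrapDemo {
    /** Returns {position, limit, capacity} of a buffer wrapping part of an array. */
    static int[] windowState(byte[] array, int offset, int length) {
        // position = offset, limit = offset + length, capacity = array.length
        ByteBuffer window = ByteBuffer.wrap(array, offset, length);
        return new int[] {window.position(), window.limit(), window.capacity()};
    }

    public static void main(String[] args) {
        byte[] array = new byte[8];
        ByteBuffer buffer = ByteBuffer.wrap(array);

        buffer.put(0, (byte) 42);            // write through the buffer...
        System.out.println(array[0]);        // ...and the array sees it: 42

        array[1] = 7;                        // write to the array...
        System.out.println(buffer.get(1));   // ...and the buffer sees it: 7

        System.out.println(Arrays.toString(windowState(array, 2, 4))); // [2, 6, 8]
    }
}
```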

3.1.2 Buffer utility methods

3.1.2.1 flip

What is flip for? A ByteBuffer is filled with bytes, either written by the process or read from the kernel. Once a batch has been filled in, the next step is usually to hand the contents of the buffer to a channel. If we try to read the buffer directly at this point, we get nothing useful, because the position pointer now sits just past the data that was written. To read the buffer from the start, or to let a channel drain it, the pointer must return to the head; that is, the buffer must be flipped.

The method is final in the abstract class Buffer, so every buffer instance can use it.

int capacity = 1024;
ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);
// ... put data into the buffer ...
byteBuffer.flip();
// ... read the data back out of the buffer ...

What flip does in the source code: it treats the current buffer as full. Full here does not mean that capacity has been reached; it means we have decided that this batch of writes is finished.

limit: set to the current position, which is the end of the data written so far. In other words, the buffer is considered full up to this point.

position: set back to 0, the head of the buffer

mark: reset to -1, that is, discarded
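
The effect of flip is easiest to see by comparing remaining() before and after the call. The sketch below assumes the text fits into a 64-byte buffer; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class FlipDemo {
    /** Returns remaining() before and after flip for a 64-byte buffer holding the text. */
    static int[] remainingBeforeAndAfterFlip(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put(text.getBytes(StandardCharsets.UTF_8));
        int before = buffer.remaining();   // 64 - position: the empty tail, not the data
        buffer.flip();                     // limit = old position, position = 0
        int after = buffer.remaining();    // exactly the number of bytes written
        return new int[] {before, after};
    }

    /** Writes the text into a buffer, flips it, and reads the same bytes back. */
    static String roundTrip(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put(text.getBytes(StandardCharsets.UTF_8));
        buffer.flip();
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int[] r = remainingBeforeAndAfterFlip("hello");
        System.out.println(r[0] + " -> " + r[1]);   // 59 -> 5
        System.out.println(roundTrip("hello"));     // hello
    }
}
```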

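3.1.2.2 rewind

Like flip, rewind returns the position to 0. The difference lies in when to use it: flip is used after writing, to mark the data written so far as the readable content; rewind leaves the limit alone and is used to go through the same content again, for example to re-read a buffer that has already been flipped.

int capacity = 1024;
ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);
// ... put data, flip, and read it once ...
byteBuffer.rewind();
// ... read the same data again ...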

What rewind does in the source code: position is set to 0 and mark is reset to -1.

Unlike flip, the limit is not touched, so the extent of the readable data stays the same.
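
A minimal sketch of rewind, reading the same content twice. The class name RewindDemo is hypothetical, and wrap is used here only as a quick way to get a buffer whose limit already marks the end of the data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class RewindDemo {
    /** Reads the whole buffer, rewinds, and reads it again. */
    static String readTwice(String text) {
        // wrap() yields position = 0 and limit = data length, like a flipped buffer
        ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));

        byte[] first = new byte[buffer.remaining()];
        buffer.get(first);                 // position is now equal to limit

        buffer.rewind();                   // position = 0, limit unchanged

        byte[] second = new byte[buffer.remaining()];
        buffer.get(second);
        return new String(first, StandardCharsets.UTF_8) + "|"
                + new String(second, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(readTwice("abc"));   // abc|abc
    }
}
```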

3.1.2.3 clear

Clears the buffer so that it can be filled again.

int capacity = 1024;
ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);
byteBuffer.clear();

What clear does in the source code: it does not actually delete anything; the old content is simply overwritten later.

position is set back to 0, limit is set to capacity, and mark is reset to -1. The previous content stays in place until the next write overwrites it.
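
That clear only resets the indices can be verified directly. In this sketch the class name ClearDemo and the 4-byte capacity are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ClearDemo {
    /** Returns {position, limit, first byte} after put, flip, and clear. */
    static int[] stateAfterClear() {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.put((byte) 9).put((byte) 8);
        buffer.flip();     // limit shrinks to 2
        buffer.clear();    // position = 0, limit = capacity; stored bytes are untouched
        return new int[] {buffer.position(), buffer.limit(), buffer.get(0)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(stateAfterClear()));   // [0, 4, 9]
    }
}
```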

3.1.2.4 remaining()

remaining = limit - position

It tells how many elements are left between position and limit. It is generally used after flip, when position = 0, to tell how much is left to read.

In the source code, remaining() simply returns the difference limit - position.

3.1.2.5 mark() and reset()

// mark = position
byteBuffer.mark();

// position = mark
byteBuffer.reset();
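
A short sketch of mark() and reset() on a four-byte buffer; the class name MarkResetDemo and the byte values are made up for the example. Note that calling reset() before any mark() throws an InvalidMarkException.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class MarkResetDemo {
    /** Returns {position before reset, position after reset, next byte read}. */
    static int[] demo() {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        buffer.get();                      // reads 10, position = 1
        buffer.mark();                     // mark = 1
        buffer.get();                      // reads 20, position = 2
        buffer.get();                      // reads 30, position = 3
        int before = buffer.position();
        buffer.reset();                    // position jumps back to the mark
        int after = buffer.position();
        int next = buffer.get();           // reads 20 again
        return new int[] {before, after, next};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));   // [3, 1, 20]
    }
}
```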

3.1.2.6 limit() Limit the range of buffer usage

int newLimit = 512; // example value; must satisfy 0 <= newLimit <= capacity
// limit = newLimit;
// if (position > newLimit) position = newLimit;  the position is pulled back if it exceeds the new limit
byteBuffer.limit(newLimit);
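
Both rules in the comments above can be observed in a small sketch. The class name LimitDemo and the concrete numbers are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class LimitDemo {
    /** Returns {position, limit, remaining} after shrinking the limit below the position. */
    static int[] shrinkBelowPosition() {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.position(10);
        buffer.limit(4);                   // position 10 > 4, so position becomes 4
        return new int[] {buffer.position(), buffer.limit(), buffer.remaining()};
    }

    /** A limit above capacity is rejected with IllegalArgumentException. */
    static boolean rejectsLimitAboveCapacity() {
        try {
            ByteBuffer.allocate(16).limit(17);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shrinkBelowPosition()));   // [4, 4, 0]
        System.out.println(rejectsLimitAboveCapacity());              // true
    }
}
```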

3.1.2.7 Other index helper methods

nextGetIndex()

nextPutIndex()

checkIndex(int i)

And so on. These are internal helpers of Buffer and are rarely needed directly, so they are not covered here.

Buffer summary:

Most commonly used:

  • Create a ByteBuffer with allocate to receive data that is read
  • Create a ByteBuffer with wrap to hold data that is to be written
  • flip(): call it before processing the buffer each time, so that processing starts from the beginning
  • clear(): call it after the buffer has been processed each time, to make room for the next batch
  • Use the other methods as needed

3.2 Channel

The Buffer holds the data inside the process, but Java code cannot directly issue the kernel's read() and write() system calls to fill or drain it. The JVM wraps these system calls in a layer of abstraction, the Channel. A Channel gives access to the IO services of the operating system with minimal overhead, which both decouples the code from the system calls and keeps the cost low.

Channel classification:

  • File IO
    • FileChannel
  • Stream (socket) IO
    • SocketChannel
    • ServerSocketChannel
    • DatagramChannel

[Unlike java.net.Socket, which is blocking, the NIO socket channels can be non-blocking. They are used to write servers, for example for chat software. These notes are about file operations, so only FileChannel is introduced here.]

3.2.1 FileChannel

3.2.1.1 Get FileChannel

A FileChannel is not created with a constructor but obtained from a factory method: either the getChannel() method of RandomAccessFile, FileInputStream, or FileOutputStream, or, since Java 7, the static FileChannel.open() method.
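
The four ways of obtaining a channel can be put side by side in a sketch. The class name OpenChannelDemo, the helper sizes, and the temporary file are assumptions for this example.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class OpenChannelDemo {
    /** Writes 5 bytes through one channel, then reports the file size seen by three others. */
    static long[] sizes(Path path) throws IOException {
        long[] seen = new long[3];
        try (FileOutputStream out = new FileOutputStream(path.toFile());
             FileChannel ch = out.getChannel()) {            // write-only channel
            ByteBuffer data = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
            while (data.hasRemaining()) {
                ch.write(data);
            }
        }
        try (FileInputStream in = new FileInputStream(path.toFile());
             FileChannel ch = in.getChannel()) {             // read-only channel
            seen[0] = ch.size();
        }
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {            // read-write channel
            seen[1] = ch.size();
        }
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            seen[2] = ch.size();                             // opened without a stream
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("channel-demo", ".bin");
        try {
            System.out.println(Arrays.toString(sizes(path)));   // [5, 5, 5]
        } finally {
            Files.delete(path);
        }
    }
}
```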

[RandomAccessFile supports random access to files: the program can jump directly to any position in the file to read and write data, while the other two can only read or write sequentially from beginning to end]

Therefore, RandomAccessFile is a good choice for reading and writing files. Its most important scenarios are multi-threaded downloads and resumable transfers.

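String mode = "rw";
RandomAccessFile file = new RandomAccessFile("file path", mode);
/* mode can be one of the following:
    "r": open for reading only. Calling any write method of the resulting object throws an IOException.
    "rw": open for reading and writing.
    "rws": open for reading and writing. Compared with "rw", "rws" also requires every update to the file's content or metadata to be written synchronously to the underlying storage device.
    "rwd": open for reading and writing. Compared with "rw", "rwd" also requires every update to the file's content to be written synchronously to the underlying storage device.
*/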
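
Random access means that a read can start anywhere in the file. The following sketch jumps to an offset with seek(); the class name RandomAccessDemo, the helper readAt, and the sample content are assumptions for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class RandomAccessDemo {
    /** Reads length bytes starting at offset, without touching the bytes before it. */
    static String readAt(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);              // jump straight to the requested position
            byte[] buf = new byte[length];
            raf.readFully(buf);            // throws EOFException if the file is too short
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raf-demo", ".txt");
        try {
            Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
            System.out.println(readAt(file, 4, 3));   // 456
        } finally {
            Files.delete(file);
        }
    }
}
```

A resumable download works on the same principle: the offset of the last byte received is remembered, and the next write starts from there.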

3.2.1.2 Using FileChannel

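FileChannel is a duplex channel that can both read and write. Whether a particular channel actually allows reading or writing depends on how it was opened.

// From the FileChannel source:
// Reads data from the channel into the destination buffer; dst is short for destination
public abstract int read(ByteBuffer dst) throws IOException;
// Writes the content of the source buffer to the channel
public abstract int write(ByteBuffer src) throws IOException;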

The read process in practice:

  • Create a FileChannel from the target file
  • Create a buffer with allocate
  • Enter the loop that processes the buffer. channel.read(buffer) fills the buffer with file content through the channel and returns the number of bytes read [or -1 once the end of the file has been reached].
  • After each read, the position pointer sits at the end of the data just read. Flip the buffer so that limit marks the end of the data and position returns to 0, then read from the beginning
  • After the flip, buffer.remaining() = limit - 0, the number of bytes just read. This equals capacity only when the buffer was filled completely; the last read is usually shorter
  • Process the content of the current buffer into a variable
  • clear the buffer to make room for the next read
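
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public static void readFile(String path) throws IOException {
    // try-with-resources closes the channel and the file even if reading fails
    try (RandomAccessFile file = new RandomAccessFile(path, "r");
         FileChannel fc = file.getChannel()) {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        // Collect the raw bytes first; decoding chunk by chunk could split
        // a multi-byte UTF-8 character at a chunk boundary
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Each iteration reads one batch; read() returns -1 once the file has no more content
        while (fc.read(buffer) >= 0) {
            // Switch to reading: limit = end of the data just read, position = 0
            buffer.flip();

            // remaining = limit - position, and position is 0 after the flip,
            // so this is exactly the number of bytes read in this batch.
            // The last batch is usually shorter than the capacity.
            byte[] bytes = new byte[buffer.remaining()];

            // Copy the content of the buffer into the bytes array
            buffer.get(bytes);
            // Append this batch to the overall result
            out.write(bytes, 0, bytes.length);

            // Clear the buffer to make room for the next read
            buffer.clear();
        }
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}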

The write process in practice:

  • Create a FileChannel from the target file
  • Create a buffer with allocate
  • Enter the loop that processes the buffer, using a pointer to walk through the byte array of the content to be written
  • Copy the next chunk of the byte array into the buffer and advance the pointer
  • Flip the buffer so that position = 0 and the channel can drain it
  • Write the buffer through the channel
  • clear the buffer to make room for the next chunk
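
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public static void writeFile(String path, String string) throws IOException {
    // try-with-resources closes the channel and the file even if writing fails
    try (RandomAccessFile file = new RandomAccessFile(path, "rw");
         FileChannel fc = file.getChannel()) {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        // Encode once, with an explicit charset, instead of calling getBytes() per byte
        byte[] data = string.getBytes(StandardCharsets.UTF_8);
        // Pointer that tracks the write progress
        int current = 0;
        while (current < data.length) {
            // Put up to 1024 bytes into the ByteBuffer; the last chunk may be shorter
            int chunk = Math.min(buffer.remaining(), data.length - current);
            buffer.put(data, current, chunk);
            // Advance the pointer by the size of the chunk
            current += chunk;

            // Flip the buffer so that the channel drains it from the start
            buffer.flip();
            // write() may write fewer bytes than remain, so loop until the buffer is empty
            while (buffer.hasRemaining()) {
                fc.write(buffer);
            }
            // Clear the buffer to make room for the next chunk
            buffer.clear();
        }
    }
}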

Summary:

Although FileChannel is simple to use, note that FileChannel is blocking and cannot be switched to non-blocking mode, which runs against the non-blocking idea behind NIO. Even so, it is sufficient for most scenarios that involve small files or only a few files.

To use the non-blocking mode of NIO, you need socket channels together with a Selector.

As I understand it, everything from the socket channels onward is about using NIO to write servers for socket communication, where the original java.net.Socket is blocking. That is not what we need for file operations at the moment, and the code is considerably more complex, so it is not shown in detail below.

3.3 Selector

A Selector is the Java form of an I/O multiplexing mechanism. Before learning Selector, it helps to learn the concept of I/O multiplexing. [Multiplexing is non-blocking and synchronous]

Multiplexing:

The select, poll, and epoll models of Linux are the classic examples of multiplexing. One monitor watches multiple resources, and as soon as a resource is ready for reading or writing, it is handed to the request that needs it, so there is no need to dedicate a thread to each request. select and poll find ready resources by scanning all monitored resources, while epoll monitors efficiently by being notified of the resources that became ready. If a resource is not ready, the thread handling the request does other things first and does not have to block.

IO multiplexing means that multiple file descriptors can be monitored through one mechanism. Once a descriptor is ready for an IO operation, the application is notified so that it can perform the corresponding read or write.

Multiplexing can therefore also be understood as one thread monitoring multiple network connections. The thread checks the connections repeatedly, and when one is ready, it serves that connection. For a request whose connection is not ready yet, no IO is performed; the thread does other things first [so it does not block: being unable to do IO does not suspend it], and once the connection is ready, the application is notified and the IO is carried out.

Therefore, a Selector can only be used with non-blocking channels, such as the socket channels.
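
The registration rule can be tried without writing a server. The sketch below swaps the socket channel for a java.nio.channels.Pipe, whose source end is also a selectable channel, so that no network is needed; this substitution, the class name SelectorSketch, and the one-second timeout are assumptions made for the example. A FileChannel could not be used here, because it is not a SelectableChannel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Arrays;

class SelectorSketch {
    /** Returns {ready channels before any data exists, ready channels after a write}. */
    static int[] readyCounts() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            // register() throws IllegalBlockingModeException for a blocking channel
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            int before = selector.selectNow();             // returns at once; nothing to read yet
            sink.write(ByteBuffer.wrap(new byte[] {1}));   // make the source readable
            int after = selector.select(1000);             // waits up to 1 s for a ready channel
            return new int[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(readyCounts()));
    }
}
```

The thread only touches the channel once the selector reports it as ready, which is the multiplexing idea described above.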

Selector is only needed when writing a server in Java for non-blocking communication, so we do not need it for the time being.

4. Paths and Files

The java.nio.file package also contains the Files and Paths classes, which we often use when manipulating files.

4.1 Paths

The static get() method of Paths returns a Path, which represents the path of a resource and is passed to the other NIO classes.

Path path = Paths.get("xxx/xxx/xx.jpg");
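
A Path can also be built from several segments and then taken apart again. In this sketch the class name PathsDemo and the path "photos/2022/cat.jpg" are made up for illustration; no file needs to exist, because a Path is only a description of a location.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathsDemo {
    /** Builds a path from segments and returns its last element as a string. */
    static String fileNameOf(String first, String... more) {
        return Paths.get(first, more).getFileName().toString();
    }

    public static void main(String[] args) {
        // Segments are joined with the platform's separator
        Path path = Paths.get("photos", "2022", "cat.jpg");
        System.out.println(path.getFileName());                            // cat.jpg
        System.out.println(path.getParent().getFileName());                // 2022
        System.out.println(path.getNameCount());                           // 3
        System.out.println(path.resolveSibling("dog.jpg").getFileName());  // dog.jpg
    }
}
```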


4.2 Files

Each usage below was originally accompanied by the source code of the corresponding method.

4.2.0 Get file size

long size = Files.size(Paths.get("/xxxx/xxx.jpg")); // the result is in bytes (B)

// A helper that converts the size to a readable unit
String fileSize;
double len = (double) size;
if (len < 1024) {
    fileSize = String.format("%.2f", len) + "B";
} else if (len < 1048576) {
    fileSize = String.format("%.2f", len / 1024) + "KB";
} else if (len < 1073741824) {
    fileSize = String.format("%.2f", len / 1048576) + "MB";
} else {
    fileSize = String.format("%.2f", len / 1073741824) + "GB";
}

4.2.1 Create an input stream for a file resource

InputStream inputStream = Files.newInputStream(Paths.get("/xxxx/xxx.jpg"));

4.2.2 Create an output stream for a file resource

OutputStream outputStream = Files.newOutputStream(Paths.get("/xxxx/xxx.jpg"));

4.2.3 Create a stream to a folder

DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get("/xxxx"));

4.2.4 Create a file

Files.createFile(Paths.get("/xxxx/xxx.jpg"));

4.2.5 Create a folder

Files.createDirectory(Paths.get("/xxxx"));

4.2.6 Delete files/folders

Files.delete(Paths.get("/xxxx/xxx.jpg")); // a folder can only be deleted when it is empty

4.2.7 Copy files

Files.copy(Paths.get("/xxxx/xxx1.jpg"), Paths.get("/xxxx/xxx2.jpg"));

4.2.8 Moving files

Files.move(Paths.get("/xxxx/xxx1.jpg"), Paths.get("/xxxx/xxx2.jpg"));

4.2.9 Determine whether it is a folder

Files.isDirectory(Paths.get("/xxxx/xxx"));
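
The calls from 4.2.3 to 4.2.9 can be tried together inside a temporary directory. This is a sketch under assumptions: the class name FilesTour and the file names a.txt, b.txt, c.txt, and sub are invented for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FilesTour {
    /** Creates, copies, moves, lists, and deletes entries; returns the sorted listing. */
    static List<String> run() throws IOException {
        Path dir = Files.createTempDirectory("files-tour");
        Path sub = Files.createDirectory(dir.resolve("sub"));
        Path a = Files.createFile(dir.resolve("a.txt"));
        Path b = Files.copy(a, dir.resolve("b.txt"));     // copy returns the target path
        Path c = Files.move(b, sub.resolve("c.txt"));     // b.txt now lives in sub/

        List<String> names = new ArrayList<>();
        // The directory stream holds an open handle, so close it with try-with-resources
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName() + (Files.isDirectory(entry) ? "/" : ""));
            }
        }
        Collections.sort(names);   // the iteration order of a directory stream is unspecified

        // delete() refuses a non-empty directory, so remove the children first
        Files.delete(c);
        Files.delete(a);
        Files.delete(sub);
        Files.delete(dir);
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run());   // [a.txt, sub/]
    }
}
```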

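4.2.10 Other commonly used methods

// Tests whether two paths locate the same file
public static boolean isSameFile(Path path, Path path2)
// Tests whether the file is hidden
public static boolean isHidden(Path path)
// Gets the last modified time of the file
public static FileTime getLastModifiedTime(Path path, LinkOption... options)
// Sets the last modified time of the file
public static Path setLastModifiedTime(Path path, FileTime time)
// Tests whether the file exists
public static boolean exists(Path path, LinkOption... options)
// Tests whether the file is readable
public static boolean isReadable(Path path)
// Tests whether the file is writable
public static boolean isWritable(Path path)
// Tests whether the file is executable
public static boolean isExecutable(Path path)
// Opens a buffered reader or writer for the file
public static BufferedReader newBufferedReader(Path path)
public static BufferedWriter newBufferedWriter(Path path, OpenOption... options)
// Reads all bytes of a file. In JDK 8 the actual work is done by the private helper
// read(InputStream source, int initialSize), where initialSize is the initial size of the byte buffer
public static byte[] readAllBytes(Path path)
// Reads all lines and returns them as a list, which can then be processed
public static List<String> readAllLines(Path path)
// Writes a byte array to a file
public static Path write(Path path, byte[] bytes, OpenOption... options)
// Lists the entries (files and directories) directly under a directory
public static Stream<Path> list(Path dir)
// Returns the lines of a file as a stream; count() on the stream gives the number of lines
public static Stream<String> lines(Path path)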
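
Three of the methods above, write, readAllLines, and lines, can be combined in a short sketch. The class name FilesTextDemo and the sample text are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class FilesTextDemo {
    /** Writes the text to the file as UTF-8 and reads it back line by line. */
    static List<String> writeThenRead(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
        return Files.readAllLines(file);   // decodes as UTF-8 by default
    }

    /** Counts lines lazily; the stream keeps the file open, so it must be closed. */
    static long countLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("files-text", ".txt");
        try {
            System.out.println(writeThenRead(file, "a\nb\nc\n"));   // [a, b, c]
            System.out.println(countLines(file));                   // 3
        } finally {
            Files.delete(file);
        }
    }
}
```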

5. Writing an NIO utility class

To sum up in one sentence: compared with native IO, NIO is still blocking when it is used only for file operations [only FileChannel] rather than socket communication. However, because the IO goes through a Buffer and a Channel, reading and writing files can be more efficient [as a rough illustration rather than a measurement: an operation that takes 5000 ms with native IO may need only 500 ms with NIO].

Finally, here is a utility class for reading and writing text with NIO. It can be copied and used directly.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class NioUtil {

    /**
     * Reads a text file with NIO
     * @param url absolute path
     * @throws IOException
     */
    public static String read(String url) throws IOException {
        // try-with-resources closes the channel and the file even if reading fails
        try (RandomAccessFile access = new RandomAccessFile(url, "r");
             FileChannel channel = access.getChannel()) {
            ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
            // Collects the raw bytes; decoding each chunk separately could split
            // a multi-byte UTF-8 character at a chunk boundary
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (channel.read(byteBuffer) >= 0) {
                // Switch to reading: limit = end of the data just read, position = 0
                byteBuffer.flip();

                // remaining = limit - position = number of bytes read in this batch;
                // the last batch is usually shorter than the capacity
                byte[] bytes = new byte[byteBuffer.remaining()];

                // Copy the content of the buffer into the bytes array
                byteBuffer.get(bytes);
                // Append this batch to the overall result
                out.write(bytes, 0, bytes.length);

                // Clear the buffer to make room for the next read
                byteBuffer.clear();
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /**
     * Writes a file with NIO, overwriting by default
     * @param text the text to write
     * @param url absolute path
     * @throws IOException
     */
    public static void write(String url, String text) throws IOException {
        write(url, text, "w");
    }

    /**
     * Writes a file with NIO, optionally appending
     * @param text the text to write
     * @param url absolute path
     * @param mode "a" appends; any other value overwrites
     * @throws IOException
     */
    public static void write(String url, String text, String mode) throws IOException {
        // RandomAccessFile has no write-only "w" mode, so "rw" is used in both cases
        try (RandomAccessFile access = new RandomAccessFile(url, "rw");
             FileChannel fc = access.getChannel()) {
            if ("a".equals(mode)) {
                // Move the file position to the end in order to append
                fc.position(fc.size());
            } else {
                // Drop the old content so that a shorter text leaves no stale tail
                fc.truncate(0);
            }
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            // Encode once, with an explicit charset
            byte[] data = text.getBytes(StandardCharsets.UTF_8);
            // Pointer that tracks the write progress
            int cur = 0;
            while (cur < data.length) {
                // Put up to 1024 bytes into the ByteBuffer; the last chunk may be shorter
                int chunk = Math.min(buffer.remaining(), data.length - cur);
                buffer.put(data, cur, chunk);
                cur += chunk;
                // Flip the buffer so that the channel drains it from the start
                buffer.flip();
                // write() may write fewer bytes than remain, so loop until the buffer is empty
                while (buffer.hasRemaining()) {
                    fc.write(buffer);
                }
                // Clear the buffer to make room for the next chunk
                buffer.clear();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String url = "C:\\xxx.text";
        write(url, "123", "a");
        System.out.println(read(url));
    }
}

Origin blog.csdn.net/NineWaited/article/details/126589314