Alibaba second-round interviewer: Can you tell me how to achieve zero copy?

We see the term "zero copy" everywhere, but what exactly is it?

Let's sort it out.

"Copy" here refers to I/O operations in a computer, that is, reading and writing data. A computer is a complex system of software and hardware: the software side mainly means the operating system, drivers, and applications, while the hardware side includes the CPU, memory, hard disks, and much more.

On such a complex machine, even ordinary read and write operations are cumbersome and involved.

Traditional I/O read and write process

To understand zero copy, you first have to know how computers generally read and write data. I will call this the traditional I/O model.

Reads and writes are initiated by applications running on the computer, such as browsers, office software, and audio/video software.

The data generally comes from a hard disk, an external storage device, or a network socket (that is, data arriving from the network through the network port and network card).

The full process is inherently complicated, which is why university courses such as Operating Systems and Principles of Computer Organization are devoted specifically to computer software and hardware.

Simplified version of the read operation process

There is no way to cover every detail here, so let's simplify the read and write process, ignore most of the internals, and focus on the overall flow.

[Figure: simplified read operation flow]

The figure above shows an application performing a read operation.

  1. The application first initiates a read operation and is ready to read data;

  2. The kernel reads data from the hard disk or external storage into the kernel buffer;

  3. The kernel copies data from the kernel buffer to the user buffer;

  4. The application program reads the data in the user buffer for processing;
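The four steps above can be sketched in Java. This is a minimal illustration, not the article's own code; the temp file and 8 KB buffer size are arbitrary choices for the demo. Every call to `fis.read()` triggers the whole round trip: switch to kernel mode, copy from the kernel buffer into our user buffer, switch back.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the traditional read path: each fis.read() asks the kernel to
// fill our user-space buffer (steps 1-4 above).
public class TraditionalRead {

    static byte[] readAll(String path) throws IOException {
        try (FileInputStream fis = new FileInputStream(path)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] userBuffer = new byte[8192]; // the user buffer
            int n;
            // each iteration: switch to kernel mode, kernel copies data from
            // its kernel buffer into userBuffer, then switch back to user mode
            while ((n = fis.read(userBuffer)) != -1) {
                out.write(userBuffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-demo", ".txt");
        Files.write(tmp, "hello zero copy".getBytes());
        System.out.println(new String(readAll(tmp.toString())));
    }
}
```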

Detailed read and write operation process

The following is a more detailed I/O read and write process. This diagram is very useful, and I will use this diagram to clarify some basic but very important concepts of I/O operations.

[Figure: detailed I/O read and write flow]

Take a look at the figure first. The red and pink part at the top is the read operation, and the blue part at the bottom is the write operation.

If the diagram looks confusing at first glance, don't worry; once you've gone through the concepts below, it will be clear.

application

The various application programs installed on top of the operating system.

system kernel

The system kernel manages the computer's core resources. It controls hardware such as the CPU and bus, and provides functions such as process management, file management, memory management, device drivers, and system calls.

external storage

External storage refers to storage media outside main memory, such as hard disks and USB flash drives.

kernel mode

  • Kernel mode is the mode in which the operating system kernel runs: when the kernel executes privileged instructions, it is in kernel mode.

  • In kernel mode, the operating system kernel has the highest authority and can access all hardware resources and sensitive data of the computer, execute privileged instructions, and control the overall operation of the system.

  • Kernel mode gives the operating system the ability to manage and control the computer's hardware; it handles core tasks such as system calls, interrupts, and hardware exceptions.

user mode

The "user" here should be understood from the kernel's point of view: the various applications on the system issue instructions that call on kernel resources, so to the kernel, the application is its user.

  • User mode is the mode in which application programs run: when an application executes ordinary instructions, it is in user mode.

  • In user mode, applications can only access their own memory space and limited hardware resources, and cannot directly access sensitive data of the operating system or control the hardware devices of the computer.

  • The user mode provides a safe operating environment to ensure that applications are isolated from each other and prevent malicious programs from affecting the system.

mode switch

For security, the computer distinguishes kernel mode from user mode. An application cannot call kernel resources directly: the system first switches to kernel mode and lets the kernel do the work. When the kernel is done, control returns to the application, the system switches back to user mode, and the application processes the data in user mode.

In the process above, each read and each write therefore involves two mode switches.

[Figure: mode switches during read and write]

kernel buffer

The kernel buffer is a region of memory reserved for direct use by the kernel. It can be understood as an intermediary for data exchange between applications and external storage.

If an application wants to read external data, the data must come through the kernel buffer; likewise, anything the application writes to external storage passes through the kernel buffer.

user buffer

The user buffer is the memory space that the application can read and write directly. Because the application cannot access kernel memory directly, any data it wants to process must first land in the user buffer.

disk buffer

A disk buffer is a temporary staging area in memory for data being read from or written to disk. It is a mechanism for optimizing disk I/O: by exploiting the fast access speed of memory, it reduces frequent accesses to the slow disk and improves read/write performance and efficiency.

PageCache

  • PageCache is the Linux kernel's file-system cache. It uses free memory to cache data blocks read from the file system, speeding up file read and write operations.

  • When an application or process reads a file, the data is first read from the file system into PageCache. If the same data is read again later, it can be obtained directly from PageCache, avoiding revisiting the file system.

  • Similarly, when an application or process writes data to a file, the data will be temporarily stored in the PageCache, and then the Linux kernel will asynchronously write the data to the disk, thereby improving the efficiency of the write operation.

Walking through the read and write process again

With these concepts in place, look back at the flow chart and see whether it is much clearer.

read operation
  1. First, the application program initiates a read request to the kernel. At this time, a mode switch is performed, switching from user mode to kernel mode;

  2. The kernel initiates a read operation to external storage or network socket;

  3. The device writes the data into the disk buffer;

  4. The system kernel copies the data from the disk buffer to the kernel buffer, and also copies all (or part) of it into the PageCache;

  5. The kernel copies the data to the user buffer for processing by the application. At this time, another mode switch is performed, switching from the kernel mode back to the user mode;

write operation
  1. The application initiates a write request to the kernel, and at this time a mode switch is performed, switching from user mode to kernel mode;

  2. The kernel copies the data to be written from the user buffer to the PageCache, and at the same time copies the data to the kernel buffer;

  3. The kernel then writes the data to a disk buffer and thus to disk, or directly to a network socket.
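The write path above can be sketched in Java as well; this is an illustrative example, not the article's code, and the temp file stands in for a real target. `write()` only copies the data from the user buffer into the kernel (step 2); the kernel flushes to disk asynchronously, and `getFD().sync()` (roughly fsync) forces that flush to happen now.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the traditional write path: write() hands data to the kernel
// (PageCache); sync() forces the asynchronous flush of step 3 to complete.
public class TraditionalWrite {

    static void writeAll(String path, byte[] data) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(path)) {
            fos.write(data);    // user buffer -> kernel buffer / PageCache
            fos.getFD().sync(); // force the kernel to flush to disk now
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("write-demo", ".txt");
        writeAll(tmp.toString(), "flushed to disk".getBytes());
        System.out.println(new String(Files.readAllBytes(tmp)));
    }
}
```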

where is the bottleneck

Traditional I/O has a bottleneck, and that bottleneck is the reason zero-copy technology emerged. What is it? Performance: traditional I/O is simply too slow, and in high-concurrency scenarios it is often where the system gets stuck.

So where is the time wasted?

data copy

In traditional I/O, the transfer of data usually involves multiple data copies. Data needs to be copied from the application's user buffer to the kernel buffer, and then from the kernel buffer to the device or network buffer. These data copy processes result in multiple memory accesses and data duplication, consuming a lot of CPU time and memory bandwidth.

Switching between user mode and kernel mode

Because the data passes through the kernel buffer, execution switches back and forth between user mode and kernel mode, and each switch incurs a context switch, which adds considerable complexity and time overhead to data processing.

Each individual operation costs very little time, but under high concurrency these costs add up to a significant overhead. To improve performance and reduce overhead, we have to attack the two problems above.

This is where zero-copy technology comes in.

what is zero copy

The problem is data copying and mode switching.

Since we are still doing I/O, data copying cannot be eliminated entirely; all we can do is reduce the number of copies and keep the data as close to the application (the user buffer) as possible.

The distinction between user mode and kernel mode exists for reasons far more important than I/O efficiency, so that design cannot be changed; the number of switches can only be minimized.

The ideal of zero copy is to operate on data without copying it at all. In practice it does not necessarily mean there are no copy operations, but that the number of copies is reduced to a minimum.

To achieve zero copy, you should start from the following three aspects:

  1. Minimize data copy operations in various storage areas, such as from disk buffers to kernel buffers, etc.;

  2. Minimize the number of switching between user mode and kernel mode and context switching;

  3. Use optimization techniques such as caching data that will be needed ahead of time; the kernel's PageCache exists for exactly this purpose;

Implement a zero-copy solution

Direct Memory Access (DMA)

DMA is a hardware feature that allows peripherals (such as network adapters and disk controllers) to access system memory directly without CPU intervention. During a transfer, the DMA controller moves data straight between memory and the peripheral, so the CPU does not have to perform the copy itself.

[Figure: DMA]

As the figure above shows, the kernel hands most of the data-transfer work to the DMA controller, and the CPU resources freed up can be used for other tasks.

sendfile

Some operating systems (such as Linux) provide special system calls, such as sendfile, to achieve zero copy when transferring files over the network. With sendfile, an application can send file data from the file system straight to a network socket or a target file without copying it through a user buffer.

Without sendfile, copying file A to file B looks like this:

  1. The data of file A is first copied into the kernel buffer, and then from the kernel buffer into the user buffer;

  2. The kernel then copies the data from the user buffer back into the kernel buffer, from which it can finally be written to file B;

With sendfile, the round trip through the user buffer is eliminated, saving a great deal of overhead.

Shared memory

Using shared memory technology, applications and the kernel can share the same memory area, avoiding data copying between user mode and kernel mode. Applications can write data directly to shared memory, and the kernel can read data directly from shared memory for transfer, or vice versa.

[Figure: shared memory]

Data sharing is achieved by sharing one memory region, much like an object reference in a program: underneath, it is just a pointer, an address.

Memory-mapped Files

A memory-mapped file maps a disk file directly into the application's address space, so the application can read and write the file's data directly in memory. Modifications to the mapped region are reflected directly in the underlying file.

When file data needs to be transferred, the kernel can directly read the data from the memory-mapped area for transfer, avoiding additional copying of data between the user state and the kernel state.

Although this may look no different from shared memory, the two are implemented in completely different ways: one shares an address range, while the other maps the contents of a file.

How Java implements zero copy

The Java standard I/O library has no zero-copy implementation; standard I/O corresponds to the traditional model described above. Only NIO, introduced later, provides a new set of I/O classes, such as ByteBuffer and Channel, that can achieve zero copy to some extent.

ByteBuffer: Byte data can be directly manipulated, avoiding the duplication of data between user mode and kernel mode.

Channel: Supports directly transferring data from a file channel or a network channel to another channel, realizing zero-copy transfer of files and networks.

With the help of these two objects, combined with the API in NIO, we can achieve zero copy in Java.
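As a rough sketch of the two classes in use (this is an illustration, not the article's benchmark code; the temp files exist only for the demo), the following copy loop uses a direct ByteBuffer, which is allocated outside the JVM heap so the kernel can fill and drain it without an extra on-heap copy:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Copy a file through a direct (off-heap) ByteBuffer using FileChannel.
public class DirectBufferCopy {

    static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocateDirect(8192); // off-heap buffer
            while (in.read(buf) != -1) {
                buf.flip();   // switch the buffer from filling to draining
                out.write(buf);
                buf.clear();  // ready for the next read
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".bin");
        Path dst = Files.createTempFile("dst", ".bin");
        Files.write(src, "direct buffer demo".getBytes());
        copy(src, dst);
        System.out.println(new String(Files.readAllBytes(dst)));
    }
}
```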

First, let's write a version using traditional I/O to compare against NIO later. The program is very simple: it copies a PDF of about 100 MB from one directory to another.

public static void ioCopy() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);
    try (FileInputStream fis = new FileInputStream(sourceFile);
         FileOutputStream fos = new FileOutputStream(targetFile)) {
      byte[] buffer = new byte[1024];
      int bytesRead;
      while ((bytesRead = fis.read(buffer)) != -1) {
        fos.write(buffer, 0, bytesRead);
      }
    }
    System.out.println("Transferred " + formatFileSize(sourceFile.length()) + " bytes to target file");
  } catch (IOException e) {
    e.printStackTrace();
  }
}

The following is the result of running this copy program: 109.92 MB, taking 1.29 seconds.

Time taken to transfer 109.92 Mbytes to destination file: 1.290 seconds
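A side note: the snippets in this article call a formatFileSize() helper that is never shown, and SOURCE_FILE_PATH / TARGET_FILE_PATH are constants defined elsewhere. A plausible, purely hypothetical implementation of the helper, matching output like "109.92 M":

```java
import java.util.Locale;

// Hypothetical helper: the article calls formatFileSize() but never shows it.
// This version formats a byte count in mebibytes with two decimal places.
public class SizeFormat {

    static String formatFileSize(long bytes) {
        return String.format(Locale.ROOT, "%.2f M", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        System.out.println(formatFileSize(1024L * 1024L)); // prints "1.00 M"
    }
}
```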

FileChannel.transferTo() and transferFrom()

FileChannel is a channel for reading, writing, mapping, and manipulating files, and it is thread-safe in concurrent use. Calling getChannel() on a FileInputStream, FileOutputStream, or RandomAccessFile creates and opens a file channel. FileChannel defines two abstract methods, transferFrom() and transferTo(), which transfer data by connecting two channels directly.

Both methods prefer sendfile: as long as the operating system supports it (such as Linux or macOS), sendfile is used; on systems that do not (such as Windows), the transfer falls back to a memory-mapped file.

transferTo()

The following is a transferTo example, again copying the ~100 MB PDF; my system is macOS.

public static void nioTransferTo() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);
    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long transferredBytes = sourceChannel.transferTo(0, sourceChannel.size(), targetChannel);

      System.out.println("Transferred " + formatFileSize(transferredBytes) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

It took only 0.536 seconds, more than twice as fast.

Time taken to transfer 109.92 Mbytes to destination file: 0.536 seconds

transferFrom()

The following is a transferFrom example, again copying the ~100 MB PDF; my system is macOS.

public static void nioTransferFrom() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);

    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long transferredBytes = targetChannel.transferFrom(sourceChannel, 0, sourceChannel.size());
      System.out.println("Transferred " + formatFileSize(transferredBytes) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

execution time:

Time taken to transfer 109.92 Mbytes to destination file: 0.603 seconds

Memory-Mapped Files

Java's NIO also supports memory-mapped files (Memory-mapped Files), via FileChannel.map().

The following is a FileChannel.map() example, again copying the ~100 MB PDF; my system is macOS.

public static void nioMap() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);

    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long fileSize = sourceChannel.size();
      MappedByteBuffer buffer = sourceChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSize);
      targetChannel.write(buffer);
      System.out.println("Transferred " + formatFileSize(fileSize) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

execution time:

Time taken to transfer 109.92 Mbytes to destination file: 0.663 seconds


Source: blog.csdn.net/Javatutouhouduan/article/details/131895829