Zero copy is so simple!

We run into the term "zero copy" all over the place, but what exactly is it?

Let's sort it out.

"Copy" here refers to I/O operations in the computer, that is, reading and writing data. A computer is a complex system of software and hardware: the software side mainly means the operating system, drivers, and applications, while the hardware side includes the CPU, memory, hard disk, and so on.

Performing read and write operations on such a complex system is inevitably cumbersome.

The reading and writing process of traditional I/O

To understand zero copy, you must first know how a computer generally reads and writes data. I will call this traditional I/O.

The initiator of data reading and writing is the application program in the computer, such as our commonly used browsers, office software, audio and video software, etc.

The source of the data is generally a hard disk, an external storage device, or a network socket (that is, data from the network, arriving through the network port and network card).

The process is inherently complicated, which is why university courses cover the software and hardware sides in depth in "Operating Systems" and "Principles of Computer Organization".

Simplified version of the read operation process

We cannot explain every detail here, so let's simplify the read/write process, ignore most of the details, and focus on the flow.

[Figure: simplified flow of a read operation]

The picture above shows the process of a read operation by the application.

  1. The application first initiates a read operation and is ready to read data;

  2. The kernel reads data from the hard disk or external storage into the kernel buffer;

  3. The kernel copies data from the kernel buffer to the user buffer;

  4. The application reads the data in the user buffer for processing;

Detailed read and write operation process

The following is a more detailed I/O read and write process. This diagram is very useful. I will use this diagram to clarify some basic but very important concepts of I/O operations.

[Figure: detailed I/O read and write flow]

Take a look at this picture first. The red and pink parts above are read operations, and the blue parts below are write operations.

If it seems a little confusing at first glance, it doesn’t matter. Just take a look at the following concepts and it will become clear.

application

It is the various applications installed on the operating system.

system kernel

The system kernel is the core of the operating system. It manages the computer's hardware resources, such as the CPU and bus, and provides a series of functions such as process management, file management, memory management, device drivers, and system calls.

external storage

External storage refers to external storage media such as hard drives and USB drives.

kernel mode

  • Kernel mode is the mode in which the operating system kernel runs; when the kernel executes privileged instructions, the CPU is in kernel mode.

  • In kernel mode, the kernel has the highest privilege level: it can access all of the computer's hardware resources and sensitive data, execute privileged instructions, and control the operation of the whole system.

  • Kernel mode gives the operating system the ability to manage and control the hardware; it handles core tasks such as system calls, interrupts, and hardware exceptions.

user mode

The "user" here can be understood as the application: from the kernel's point of view, the various applications on the system issue instructions that call on kernel resources, so the application is the kernel's user.

  • User mode is the mode in which the application runs. When the application executes ordinary instructions, it is in user mode.

  • In user mode, applications can only access their own memory space and limited hardware resources, and cannot directly access sensitive data of the operating system or control the computer's hardware devices.

  • User mode provides a secure operating environment to ensure that applications are isolated from each other and prevent malicious programs from affecting the system.

Mode switching

For security reasons, the computer distinguishes between kernel mode and user mode. An application cannot use kernel resources directly; it must switch into kernel mode and let the kernel do the work. When the kernel has finished, control returns to the application, the system switches back to user mode, and the application can then process the data in user mode.

A single read or a single write therefore involves two mode switches.

[Figure: switching between user mode and kernel mode]

kernel buffer

The kernel buffer is a region of memory reserved for the kernel's own use. It can be understood as the intermediary for data exchanged between applications and external storage.

If an application wants to read external data, the data must pass through here; likewise, anything it writes to external storage must go through the kernel buffer.

user buffer

The user buffer is a memory region that the application can read and write directly. Because the application cannot touch kernel memory directly, any data it wants to process must first pass through the user buffer.

disk buffer

A disk buffer is a temporary staging area used to hold data being read from or written to the disk. It is a mechanism for optimizing disk I/O: by taking advantage of the fast access speed of memory, it reduces frequent accesses to the slow disk and improves read/write performance.

PageCache

  • PageCache is the mechanism the Linux kernel uses to cache file data. It uses free memory to hold data blocks read from the file system, speeding up file read and write operations.

  • When an application or process reads a file, the data is first read from the file system into the PageCache. If the same data is read again later, it can be obtained directly from PageCache, avoiding the need to access the file system again.

  • Similarly, when an application or process writes data to a file, the data is first temporarily stored in PageCache, and then the Linux kernel writes the data to disk asynchronously, thereby improving the efficiency of the write operation.
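The effect of PageCache can be observed from Java. The sketch below reads the same file twice: the second read is normally served from PageCache rather than from disk. This is a hedged illustration, not part of the original article's code; the file size is arbitrary, and actual timings depend heavily on the machine and on whether the data was already cached (here the write itself will usually have warmed the cache).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PageCacheDemo {
    public static void main(String[] args) throws IOException {
        // Create a throwaway ~32 MB file so the example is self-contained.
        Path file = Files.createTempFile("pagecache-demo", ".bin");
        Files.write(file, new byte[32 * 1024 * 1024]);

        // First read: may hit disk (or PageCache, since we just wrote the file).
        long t1 = time(file);
        // Second read: almost always served from PageCache.
        long t2 = time(file);

        System.out.println("first read:  " + t1 + " us");
        System.out.println("second read: " + t2 + " us");
        Files.delete(file);
    }

    static long time(Path file) throws IOException {
        long start = System.nanoTime();
        byte[] data = Files.readAllBytes(file); // copies kernel buffer -> user buffer
        long micros = (System.nanoTime() - start) / 1_000;
        if (data.length == 0) throw new IllegalStateException("empty file");
        return micros;
    }
}
```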

Let's talk about the data read and write operation process

After understanding these concepts above, look back at the flow chart and it will become much clearer.

Read operation
  1. First, the application initiates a read request to the kernel. At this time, a mode switch is performed, switching from user mode to kernel mode;

  2. The kernel initiates a read operation to external storage or a network socket;

  3. The device writes the data into the disk buffer;

  4. The system kernel copies the data from the disk buffer into the kernel buffer, and also keeps a copy (or part of it) in PageCache;

  5. The kernel copies the data to the user buffer for processing by the application. At this time, another mode switch is performed, switching from the kernel mode back to the user mode;

write operation
  1. The application initiates a write request to the kernel, and at this time a mode switch is performed, switching from user mode to kernel mode;

  2. The kernel copies the data to be written from the user buffer to PageCache, and at the same time copies the data to the kernel buffer;

  3. The kernel then writes the data to a disk buffer, thereby writing to disk, or directly to the network socket.

Where is the bottleneck?

But traditional I/O has a bottleneck, and that bottleneck is the reason zero-copy technology emerged. What is it? A performance problem: it is simply too slow. In high-concurrency scenarios especially, I/O often becomes the choke point.

So where is the time wasted?

data copy

In traditional I/O, data transmission usually involves multiple data copies. Data needs to be copied from the application's user buffer to a kernel buffer, and then from the kernel buffer to a device or network buffer. These data copy processes result in multiple memory accesses and data copies, consuming a lot of CPU time and memory bandwidth.

Switching between user mode and kernel mode

Because the data passes through the kernel buffer, execution switches back and forth between user mode and kernel mode, and each switch involves a context switch, which greatly increases the complexity and time cost of data processing.

Although each individual operation costs very little time, under high concurrency these small costs add up to real overhead. To improve performance and reduce that overhead, we must attack the two problems above.

At this time, zero-copy technology came out to solve the problem.

what is zero copy

The problem is data copying and mode switching.

But since this is I/O, some copying is unavoidable; all we can do is reduce the number of copies by keeping the data as close to the application (the user buffer) as possible.

The distinction between user mode and kernel mode exists for reasons more important than I/O efficiency, so that design cannot simply be changed; we can only minimize the number of switches.

The ideal of zero copy is to operate on data without copying it at all. In practice this does not mean there are literally no copy operations, but that the number of copies is kept to a minimum.

To achieve zero copy, you should start from the following three aspects:

  1. Minimize data copy operations in various storage areas, such as from disk buffer to kernel buffer, etc.;

  2. Minimize the number of switching between user mode and kernel mode and context switching;

  3. Use some optimization methods, such as caching the data that needs to be operated first, and the PageCache in the kernel is for this purpose;

Ways to implement zero copy

Direct Memory Access (DMA)

DMA is a hardware feature that allows peripherals (network adapters, disk controllers, and so on) to access system memory directly, without CPU involvement. During a transfer, the DMA controller moves data between memory and the peripheral on its own, so the CPU no longer has to copy every byte itself.

[Figure: read flow with a DMA controller]

As shown in the figure above, the kernel hands over most of the data reading operations to the DMA controller, and the vacated resources can be used to process other tasks.

sendfile

Some operating systems (such as Linux) provide special system calls, such as sendfile, to achieve zero copy when transferring files over the network. With sendfile, file data can be transferred from the file system to a network socket or target file entirely inside the kernel, without ever being copied into the user buffer.

Without sendfile, copying file A into file B works like this:

  1. File A's data is first copied into the kernel buffer, then from the kernel buffer into the user buffer;

  2. The kernel then copies the data from the user buffer back into a kernel buffer, and only then can it be written to file B;

With sendfile, the copies into and out of the user buffer disappear, saving a lot of overhead.

Shared memory

Using shared memory technology, applications and the kernel can share the same memory area, avoiding data copying between user mode and kernel mode. Applications can write data directly to shared memory, and the kernel can then read data directly from shared memory for transfer, or vice versa.

[Figure: application and kernel sharing one memory region]

By sharing one memory region, the two sides share the data itself, much like a reference in a program: what is passed around is really just a pointer, an address.

Memory-mapped Files

Memory-mapped files directly map disk files to the application's address space, allowing the application to read and write file data directly in memory. In this way, modifications to the mapped content are directly reflected in the actual file.

When file data needs to be transferred, the kernel can directly read the data from the memory map area for transfer, avoiding additional copies of data between user mode and kernel mode.

Although this may look no different from shared memory, the two are implemented in completely different ways: one shares an address range, the other maps file contents.
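Java has no portable API for raw shared memory, but the two ideas can be seen together: mapping the same file region twice gives two MappedByteBuffer views that behave like a shared memory region, where a write through one view is immediately visible through the other. A minimal sketch (not from the original article; the region size is an arbitrary choice):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedMappingDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("shared-demo", ".bin");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Two independent mappings of the same 4 KB region of the file.
            // The READ_WRITE mapping grows the file to 4 KB as a side effect.
            MappedByteBuffer writer = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer reader = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4096);

            writer.putInt(0, 42);                 // store through one view
            System.out.println(reader.getInt(0)); // read through the other: prints 42
        }
        Files.delete(file);
    }
}
```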

How Java implements zero copy

Java's standard IO library has no zero-copy support; standard IO corresponds to the traditional mode described above. Only with NIO did Java introduce a new set of I/O classes, such as ByteBuffer and Channel, which can achieve zero copy to some extent.

ByteBuffer: operates on byte data directly; in its direct (off-heap) form, the kernel can fill it without an extra copy through the JVM heap.

Channel: supports transferring data directly from a file or network channel to another channel, enabling zero-copy file and network transfers.
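The difference between the two ByteBuffer flavors can be seen in a few lines. This is a hedged sketch (the buffer size is arbitrary): a heap buffer is backed by a byte[] on the JVM heap, while a direct buffer lives in native memory that channels can hand to the kernel without an intermediate on-heap copy.

```java
import java.nio.ByteBuffer;

public class BufferDemo {
    public static void main(String[] args) {
        // Backed by a byte[] on the JVM heap; I/O may copy it to native memory first.
        ByteBuffer heap = ByteBuffer.allocate(4096);
        // Off-heap (native) memory; channels can pass this to the kernel directly.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096);

        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
    }
}
```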

With the help of these two objects, combined with the API in NIO, we can achieve zero copy in Java.

First, let's write a version using traditional IO to compare against NIO later. The program is very simple: copy a PDF of about 100 MB from one directory to another.

public static void ioCopy() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);
    try (FileInputStream fis = new FileInputStream(sourceFile);
         FileOutputStream fos = new FileOutputStream(targetFile)) {
      byte[] buffer = new byte[1024];
      int bytesRead;
      while ((bytesRead = fis.read(buffer)) != -1) {
        fos.write(buffer, 0, bytesRead);
      }
    }
    System.out.println("Transferred " + formatFileSize(sourceFile.length()) + " bytes to target file");
  } catch (IOException e) {
    e.printStackTrace();
  }
}

Here is the result of this copy program: 109.92 MB in 1.29 seconds.

Time taken to transfer 109.92 M bytes to target file: 1.290 seconds
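Part of that cost is the sheer number of read/write system calls: a 1 KB buffer means on the order of 100,000 kernel round trips (and mode switches) for a 110 MB file. Even within traditional IO, a larger user buffer cuts that count sharply. A self-contained sketch, not from the original article: it copies a generated temp file instead of the PDF, and the 64 KB buffer size is an illustrative choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedCopyDemo {
    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-src", ".bin");
        Path target = Files.createTempFile("copy-dst", ".bin");
        Files.write(source, new byte[8 * 1024 * 1024]); // 8 MB of zeros

        long copied = copy(source, target, 64 * 1024);  // 64 KB user buffer
        System.out.println("copied " + copied + " bytes"); // prints: copied 8388608 bytes

        Files.delete(source);
        Files.delete(target);
    }

    static long copy(Path source, Path target, int bufferSize) throws IOException {
        long total = 0;
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[bufferSize]; // fewer, larger read/write syscalls
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```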

FileChannel.transferTo() and transferFrom()

FileChannel is a channel for file reading, writing, mapping and operation. It is thread-safe in a concurrent environment. A file channel can be created and opened based on the getChannel() method of FileInputStream, FileOutputStream or RandomAccessFile. FileChannel defines two abstract methods, transferFrom() and transferTo(), which implement data transfer by establishing connections between channels.

For these two methods, sendfile is preferred: when the operating system supports it (such as Linux or macOS), sendfile is used; when it does not (such as Windows), the transfer falls back to a memory-mapped file.

transferTo()

The following is an example of transferTo, still copying a PDF of about 100 MB. My system is macOS.

public static void nioTransferTo() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);
    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long transferredBytes = sourceChannel.transferTo(0, sourceChannel.size(), targetChannel);

      System.out.println("Transferred " + formatFileSize(transferredBytes) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

It took only 0.536 seconds, more than twice as fast.

Time taken to transfer 109.92 M bytes to the target file: 0.536 seconds

transferFrom()

The following is an example of transferFrom, still copying the PDF of about 100 MB. My system is macOS.

public static void nioTransferFrom() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);

    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long transferredBytes = targetChannel.transferFrom(sourceChannel, 0, sourceChannel.size());
      System.out.println("Transferred " + formatFileSize(transferredBytes) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

execution time:

Time taken to transfer 109.92 M bytes to target file: 0.603 seconds

Memory-Mapped Files

Java's NIO also supports memory-mapped files, through FileChannel.map().

The following is a FileChannel.map() example, still copying the PDF of about 100 MB. My system is macOS.

public static void nioMap() {
  try {
    File sourceFile = new File(SOURCE_FILE_PATH);
    File targetFile = new File(TARGET_FILE_PATH);

    try (FileChannel sourceChannel = new RandomAccessFile(sourceFile, "r").getChannel();
         FileChannel targetChannel = new RandomAccessFile(targetFile, "rw").getChannel()) {
      long fileSize = sourceChannel.size();
      MappedByteBuffer buffer = sourceChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSize);
      targetChannel.write(buffer);
      System.out.println("Transferred " + formatFileSize(fileSize) + " bytes to target file");
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

execution time:

Time taken to transfer 109.92 M bytes to target file: 0.663 seconds



Origin blog.csdn.net/2301_78586758/article/details/132008398