Zero copy study notes

I recently decided to study Netty systematically, but I did not know much about zero copy, a technique that is also used in middleware such as Kafka, RabbitMQ, and RocketMQ. I therefore collected and read material online and summarized it in this article, both as a basis for further study and as a reference for others. The article inevitably contains omissions; corrections from readers are very welcome. Thank you!

1. Introduction to zero copy

"Zero copy" means that the CPU does not have to spend its own cycles copying data from one memory area to another. It usually refers to sending a file over the network without copying the file's contents into user space (User Space): the data is passed to the network directly within kernel space (Kernel Space).

Benefits of zero copy
1) Reduces or completely avoids unnecessary CPU copies, freeing the CPU for other tasks
2) Reduces memory bandwidth usage
3) Usually also reduces context switches between user space and operating-system kernel space

Zero-copy implementation
There is no real standard for how zero copy is implemented; it depends entirely on the operating system. If the operating system supports it, it is available; if not, it is not. It does not depend on Java itself.

2. Zero-copy development

Kernel space: the space used by the Linux kernel itself; it mainly provides process scheduling, memory allocation, access to hardware resources, and similar functions

User space: the space provided to each application process. User space has no permission to access kernel-space resources directly. When an application needs them, it goes through a system call: execution switches from user space to kernel space, performs the operation, and then switches back from kernel space to user space

2.1 Traditional I/O

For example, a Java program on Linux reads a file from disk and sends it to a remote service:
1) Issuing the read system call causes a context switch from user space to kernel space; DMA then reads the file data from disk into a kernel-space buffer
2) The data in the kernel-space buffer is copied into the user-space process memory, and the read system call returns. The return causes a context switch from kernel space to user space
3) The write system call causes another context switch from user space to kernel space and copies the data from user-space process memory into the kernel-space socket buffer (also a kernel buffer, but one that belongs to the socket). The write system call then returns, triggering another context switch
4) Transferring the data from the socket buffer to the network card is an independent, asynchronous process; in other words, the return of the write system call does not guarantee that the data has reached the network card

In total there are four context switches between user space and kernel space and four data copies: two CPU copies and two DMA copies.

Disk -> kernel buffer -> user buffer -> socket buffer (kernel) -> network card
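The traditional path can be sketched in Java as a plain read/write loop. This is a minimal illustration, not taken from the original material: the class name TraditionalCopy and the method copy are hypothetical, and a ByteArrayOutputStream stands in for the socket stream so the example runs on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class TraditionalCopy {
    // Copies a file to an output stream through a buffer on the JVM heap (user space).
    public static long copy(Path source, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int n;
            // Each read() moves data from the kernel buffer into the user-space buffer.
            while ((n = in.read(buffer)) != -1) {
                // With a socket stream, each write() moves it back into the kernel socket buffer.
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("traditional", ".txt");
        Files.write(tmp, "hello zero copy".getBytes());
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for a socket stream
        System.out.println(copy(tmp, sink) + " bytes copied");
        Files.delete(tmp);
    }
}
```

Every pass through the loop crosses the user/kernel boundary twice, which is the overhead the following sections try to remove.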

2.2 mmap+write

mmap is a memory-mapping method. It maps a file or other object into the address space of a process, creating a one-to-one correspondence between the file's location on disk and a range of virtual addresses in the process. A program can then treat the contents of a disk file like an array in memory. If the file consists of records that can be described by structures, the file can be updated by accessing an array of those structures.
Once the mapping is established, the process can read and write this memory region through pointers, and the system automatically writes dirty pages back to the file on disk. File operations are thus completed without calling read, write, or other system calls. Changes the kernel makes to the region are also directly visible in user space, which makes it possible to share files between processes.
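As a rough Java illustration of editing a file through a mapping, the sketch below changes the first byte of a file with ordinary buffer reads and writes instead of explicit write calls. The class name MmapEdit and the method upperCaseFirstByte are hypothetical, and the sketch assumes a non-empty file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapEdit {
    // Upper-cases the first byte of a non-empty file through a read-write memory mapping.
    public static void upperCaseFirstByte(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, ch.size());
            byte first = map.get(0);                                  // read straight from the mapping
            map.put(0, (byte) Character.toUpperCase((char) first));  // write straight into the mapping
            map.force();                                              // ask the OS to write the change to the file now
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello".getBytes(StandardCharsets.US_ASCII));
        upperCaseFirstByte(tmp);
        System.out.println(new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII));
    }
}
```

The call to force() is optional for correctness here; it only requests that the change be written back immediately rather than whenever the system decides to.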

1) The mmap system call is issued, causing a context switch from user space to kernel space. The DMA engine then copies the data from the disk file into a kernel-space buffer
2) The mmap system call returns, causing a context switch from kernel space to user space
3) No copy from kernel space to user space is needed here, because user space shares this buffer with kernel space
4) The write system call is issued, causing a context switch from user space to kernel space. The data is copied from the kernel-space buffer to the kernel-space socket buffer; the write system call then returns, causing a context switch from kernel space to user space
5) Asynchronously, the DMA engine copies the data from the socket buffer to the network card

The zero-copy I/O implemented by mmap performs 4 context switches between user space and kernel space, and 3 data copies; the 3 data copies include 2 DMA copies and 1 CPU copy

2.3 sendfile

The sendfile function transfers data between two file descriptors (completely operated in the kernel), thus avoiding the data copy between the kernel buffer and the user buffer, which is very efficient and is called zero copy.

How sendfile works
1. The sendfile() system call copies the data from disk into a kernel buffer through DMA, and the kernel then copies it directly into another kernel buffer associated with the socket. No switch between user mode and kernel mode happens during this step; the buffer-to-buffer copy is done entirely inside the kernel.
2. DMA then copies the data from the kernel buffer to the protocol stack. Again no switch is needed and nothing is copied between user mode and kernel mode, because the data stays in the kernel.

1) The sendfile system call is issued, causing a context switch from user space to kernel space. The DMA engine copies the contents of the disk file into a kernel-space buffer, and the data is then copied from the kernel-space buffer to the socket buffer
2) The sendfile system call returns, causing a context switch from kernel space to user space. DMA asynchronously transfers the data from the kernel-space socket buffer to the network card

The zero-copy I/O implemented by sendfile uses 2 context switches between user space and kernel space, and 3 copies of data. Among them, 3 data copies include 2 DMA copies and 1 CPU copy

2.4 Zero copy implemented by sendfile with DMA gather copy

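SG-DMA

1. Scatter-gather DMA is the counterpart of block DMA.
During a DMA transfer, the source and destination physical addresses must be contiguous. On some architectures, such as IA, contiguous memory addresses are not necessarily physically contiguous, so the transfer has to be split into several parts. If an interrupt is raised after each physically contiguous block has been transferred, and the host then starts the transfer of the next physically contiguous block, the method is block DMA. Scatter-gather DMA is different: it uses a linked list to describe the physically non-contiguous memory and passes the address of the list head to the DMA master. After transferring one physically contiguous block, the DMA master does not raise an interrupt; it follows the list to transfer the next physically contiguous block and raises a single interrupt only when the whole transfer is complete. Scatter-gather DMA is clearly more efficient than block DMA.
2. The differences in how it works show up mainly in the following aspects
SG-DMA has three operating modes: Memory-to-Stream (memory interface to streaming interface), Stream-to-Memory (streaming interface to memory interface), and Memory-to-Memory. In Memory-to-Memory mode it is no different from ordinary DMA and has no stream-processing advantage. In addition, SG-DMA adds a Descriptor Processor that supports batch operation and further reduces the load on the Nios processor; only the descriptor command words need to be written into the corresponding descriptor memory.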
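The difference between the two DMA modes can be illustrated with a toy model in Java. This is only a simulation under simplifying assumptions, not how a driver or DMA engine is programmed: a byte array plays the role of memory, and the names DmaModel, Descriptor, gather, blockDmaInterrupts, and scatterGatherInterrupts are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

public class DmaModel {
    // One physically contiguous region of the toy "memory": start address and length.
    static final class Descriptor {
        final int address;
        final int length;
        Descriptor(int address, int length) {
            this.address = address;
            this.length = length;
        }
    }

    // Walks the descriptor list in order and returns the bytes it describes.
    static byte[] gather(byte[] memory, List<Descriptor> chain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Descriptor d : chain) {
            out.write(memory, d.address, d.length);
        }
        return out.toByteArray();
    }

    // Block DMA: the host is interrupted after every contiguous block.
    static int blockDmaInterrupts(List<Descriptor> chain) {
        return chain.size();
    }

    // Scatter-gather DMA: one interrupt when the whole list has been transferred.
    static int scatterGatherInterrupts(List<Descriptor> chain) {
        return chain.isEmpty() ? 0 : 1;
    }

    public static void main(String[] args) {
        byte[] memory = "AAAA....BBBB..CC".getBytes();
        List<Descriptor> chain = Arrays.asList(
                new Descriptor(0, 4), new Descriptor(8, 4), new Descriptor(14, 2));
        System.out.println(new String(gather(memory, chain)));
        System.out.println("block DMA interrupts: " + blockDmaInterrupts(chain));
        System.out.println("scatter-gather interrupts: " + scatterGatherInterrupts(chain));
    }
}
```

In this model both modes move the same bytes; the only difference is how often the host has to be involved.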

Starting with Linux 2.4, the operating system provides scatter-gather SG-DMA, which reads data from the kernel-space buffer directly to the network card, without copying the data from the kernel-space buffer to the socket buffer
1) The sendfile system call is issued, causing a context switch from user space to kernel space. The DMA engine copies the contents of the disk file into a kernel-space buffer
2) The data is not copied to the socket buffer here; instead, the corresponding descriptor information is copied to the socket buffer. The descriptor contains two kinds of information: A) the memory address of the kernel buffer, B) the offset of the kernel buffer
3) The sendfile system call returns, causing a context switch from kernel space to user space. DMA copies the data from the kernel buffer directly to the network card, using the address and offset provided by the descriptor in the socket buffer

The I/O implemented by sendfile with DMA gather copy uses 2 context switches between user space and kernel space and 2 data copies, neither of which is a CPU copy. This achieves the ideal zero-copy I/O transfer: no CPU copy at all, and the minimum number of context switches.
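The counts quoted in section 2 for the four approaches can be collected in one place. The small Java sketch below only restates those numbers; the names IoPathSummary and IoPath are hypothetical and exist purely for illustration.

```java
public class IoPathSummary {
    // Counts as stated in the text for sending one file from disk to the network.
    enum IoPath {
        READ_WRITE("read + write", 4, 2, 2),
        MMAP_WRITE("mmap + write", 4, 1, 2),
        SENDFILE("sendfile", 2, 1, 2),
        SENDFILE_SG_DMA("sendfile + SG-DMA", 2, 0, 2);

        final String label;
        final int contextSwitches;
        final int cpuCopies;
        final int dmaCopies;

        IoPath(String label, int contextSwitches, int cpuCopies, int dmaCopies) {
            this.label = label;
            this.contextSwitches = contextSwitches;
            this.cpuCopies = cpuCopies;
            this.dmaCopies = dmaCopies;
        }

        int totalCopies() {
            return cpuCopies + dmaCopies;
        }
    }

    public static void main(String[] args) {
        System.out.printf("%-20s %8s %10s %10s%n", "path", "switches", "CPU copies", "DMA copies");
        for (IoPath p : IoPath.values()) {
            System.out.printf("%-20s %8d %10d %10d%n", p.label, p.contextSwitches, p.cpuCopies, p.dmaCopies);
        }
    }
}
```

Read top to bottom, each approach removes either context switches or a CPU copy compared with the one before it, while the two DMA copies remain in every case.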

3. java zero copy

3.1 sendfile mode

NIO implements zero copy through the transferTo() method, which transfers data from a FileChannel to a writable byte channel (such as a SocketChannel). Internally it relies on the native method transferTo0(), which depends on support from the underlying operating system. On UNIX and Linux systems, calling this method results in a sendfile() system call.
Typical usage scenarios:
1) Files are large, reads and writes are slow, and speed matters
2) Memory is insufficient to load large amounts of data
3) Bandwidth is insufficient, that is, other programs or threads are performing heavy I/O, leaving little bandwidth

If the underlying operating system supports it, FileChannel's transferTo and transferFrom use zero-copy techniques to transfer the data. The usage is as follows:

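import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class TransferToDemo { // class name is arbitrary
    public static void main(String[] args) {
        // try-with-resources closes both channels even if the transfer fails
        try (FileChannel readChannel = FileChannel.open(Paths.get("./cscw.txt"), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"), StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            long len = readChannel.size();
            long position = readChannel.position();
            // Equivalent alternative, called on the target channel instead:
            // writeChannel.transferFrom(readChannel, position, len);
            // Transfer the data; transferTo may move fewer bytes than requested, so loop until done
            while (position < len) {
                position += readChannel.transferTo(position, len - position, writeChannel);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}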
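Since transferTo() is usually discussed in the context of sending a file over the network, a socket variant may be useful as well. The following is a minimal sketch under stated assumptions: the names FileSender, send, receiveAll, and roundTrip are hypothetical, and a loopback connection inside the same process stands in for a remote service.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class FileSender {
    // Sends the whole file to the socket; transferTo may move fewer bytes than requested, so loop.
    public static long send(Path file, SocketChannel socket) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = in.size();
            long sent = 0;
            while (sent < size) {
                sent += in.transferTo(sent, size - sent, socket);
            }
            return sent;
        }
    }

    // Demo helper: accepts one connection and collects everything the peer sends until it closes.
    static byte[] receiveAll(ServerSocketChannel server) throws IOException {
        try (SocketChannel peer = server.accept()) {
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            ByteBuffer buf = ByteBuffer.allocate(8192);
            while (peer.read(buf) != -1) {
                buf.flip();
                received.write(buf.array(), 0, buf.limit());
                buf.clear();
            }
            return received.toByteArray();
        }
    }

    // Sends the file over a loopback connection and returns what the receiving side got.
    public static byte[] roundTrip(Path file) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: let the OS pick a free port
            CompletableFuture<byte[]> receiver = CompletableFuture.supplyAsync(() -> {
                try {
                    return receiveAll(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                send(file, client);
            }
            return receiver.get();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("sender", ".txt");
        Files.write(tmp, "sent with transferTo".getBytes());
        System.out.println(new String(roundTrip(tmp)));
        Files.delete(tmp);
    }
}
```

Whether the JVM actually uses sendfile() for this call depends on the operating system and the channel types involved, so the sketch shows the API usage rather than guaranteeing a particular system call.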

All of the above assumes that the program does not need to manipulate the file data. What if you need both this kind of speed and the ability to manipulate the data? Then use NIO's direct memory!

3.2 mmap method

Zero copy through mmap in Java
Java NIO also offers zero copy based on the mmap+write approach, using the MappedByteBuffer returned by FileChannel's map() method. map() establishes a virtual memory mapping between an open file and a MappedByteBuffer. MappedByteBuffer extends ByteBuffer, and the buffer's memory is a memory-mapped region of the file. map() is implemented on top of mmap, so once the file data has been read from disk into the kernel buffer, user space and kernel space share that buffer. The usage is as follows:

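import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapWriteDemo { // class name is arbitrary
    public static void main(String[] args) {
        // try-with-resources closes both channels even if an exception is thrown
        try (FileChannel readChannel = FileChannel.open(Paths.get("./cscw.txt"), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"), StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            // Map the whole file; a read-only mapping cannot extend past the end of the file
            MappedByteBuffer data = readChannel.map(FileChannel.MapMode.READ_ONLY, 0, readChannel.size());
            // Transfer the data; write may consume only part of the buffer, so loop
            while (data.hasRemaining()) {
                writeChannel.write(data);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}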

Direct memory sits between traditional I/O (BIO) and zero copy. Why?

With traditional I/O, the disk file is read through kernel space into JVM memory, where it can be processed in any way and finally written to disk or sent over the network. This is slow, but the program can manipulate the file data.
With zero copy, the file is read in kernel space and transferred straight to disk (or to the network). It is very efficient, but because the file data never reaches the JVM, the program cannot manipulate it.
Direct memory is in between: its efficiency is moderate and the file data can still be manipulated. Direct memory (the mmap technique) maps the file directly into kernel-space memory and returns an address to operate on. This removes the need to copy the file data into the JVM before working on it; the program operates on the kernel-space memory directly, saving the copy from kernel space to user space.

NIO's direct memory is implemented by MappedByteBuffer. The core is the map() method, which maps the file to the memory, obtains the memory address addr, and then constructs the MappedByteBuffer class through this addr to expose various file operation APIs.

Because MappedByteBuffer uses off-heap memory, that memory is not reclaimed by a Minor GC and is typically released only when a Full GC occurs. DirectByteBuffer improves on this. It is a subclass of MappedByteBuffer, implements the DirectBuffer interface, and maintains a Cleaner object that performs the memory reclamation. Its memory can therefore be reclaimed either by a Full GC or by calling the Cleaner's clean() method.

In addition, the size of direct memory can be limited with the JVM parameter -XX:MaxDirectMemorySize.

MappedByteBuffer also has a sibling in NIO called HeapByteBuffer. As the name suggests, it allocates its memory on the heap and is essentially a wrapper around a byte array. Because it lives on the heap, it is managed by the GC and easy to reclaim.
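The three kinds of buffer can be told apart through the public ByteBuffer API, without touching the internal classes named above. The sketch below is only an illustration; the class name BufferKinds and the helper describe are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferKinds {
    // Reports whether a buffer is off-heap (direct) and whether it exposes a backing array.
    static String describe(ByteBuffer b) {
        return "direct=" + b.isDirect() + ", hasArray=" + b.hasArray();
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a byte[] on the JVM heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // off-heap memory
        System.out.println("heap:   " + describe(heap));
        System.out.println("direct: " + describe(direct));

        Path tmp = Files.createTempFile("buffers", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "x".getBytes());
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println("mapped: " + describe(mapped)); // a mapped buffer is also a direct buffer
        }
    }
}
```

Only the heap buffer exposes a backing array; the direct and mapped buffers both report themselves as direct, which matches the class relationship described above.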

4. References

https://zhuanlan.zhihu.com/p/268713849
https://www.cnblogs.com/huxiao-tee/p/4660352.html
https://blog.csdn.net/u014303647/article/details/82081451
https://www.pianshen.com/article/46781264586/
https://www.jianshu.com/p/497e7640b57c

Origin blog.csdn.net/shy871/article/details/120326674