Kafka zero copy mechanism

One of the main reasons Kafka is so fast is zero-copy technology. Zero-copy is not unique to Kafka; it is a capability provided by the operating system, and Netty, for example, uses it as well. Below I explain how zero copy works, step by step.

Traditional IO

Kafka persists its data to disk, so disk I/O is unavoidable. Traditional disk I/O, also called buffered (cached) I/O, is inefficient for this kind of workload. To see why, we first need a little operating system background.

The concepts of user space and kernel space:

Modern operating systems use virtual memory. On a 32-bit operating system, the addressable (virtual) space is 4 GB (2 to the 32nd power bytes). The core of the system is the kernel, which is independent of ordinary applications and has access to protected memory and full access to the underlying hardware. To ensure that user processes cannot operate on the kernel directly, and so to keep the kernel safe, the system divides the virtual address space into two parts: kernel space and user space. On 32-bit Linux, the highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called kernel space, while the lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called user space. Every process can enter the kernel through system calls, so the Linux kernel is shared by all processes in the system. From the point of view of any single process, it therefore has 4 GB of virtual address space.
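The address ranges above can be checked with a few lines of arithmetic. This is only a sanity check of the 3 GB / 1 GB split described in the text for 32-bit Linux; the variable names are made up for illustration.

```python
# Sanity check of the 32-bit Linux 3 GB / 1 GB split described above.
GIB = 1 << 30  # bytes in one GiB

total = 1 << 32  # 2^32 addressable bytes

user_start, user_end = 0x00000000, 0xBFFFFFFF
kernel_start, kernel_end = 0xC0000000, 0xFFFFFFFF

user_size = user_end - user_start + 1        # inclusive range
kernel_size = kernel_end - kernel_start + 1  # inclusive range

print(total // GIB)        # 4
print(user_size // GIB)    # 3
print(kernel_size // GIB)  # 1
```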

Traditional file reads and writes, or network transfers, usually require copying data from kernel space to user space. When the application then wants to write that user-space data to a file or socket, the data has to be copied from user space back to kernel space before it reaches the disk or the network card. We can call this the read/write mode. Its steps are:

  1. When read is called, the file data is copied from disk into a kernel-space buffer (DMA copy);
  2. The CPU then copies the data from the kernel buffer to the user-space buffer;
  3. When write is called, the CPU copies the user-space data to the socket buffer in kernel space;
  4. Finally, the data in the kernel socket buffer is copied to the network card for transmission (DMA copy).
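The four steps above can be sketched in Python, whose os.read and socket.sendall map onto the read and write system calls. This is a minimal sketch, not Kafka's code: the function names send_with_read_write and demo are hypothetical, a socketpair stands in for a real network connection, and a Unix-like system is assumed.

```python
import os
import socket
import tempfile


def send_with_read_write(file_fd, sock, chunk_size=64 * 1024):
    """Send a file over a socket with the classic read/write loop."""
    total = 0
    while True:
        # read: the kernel fills its buffer from disk, then the CPU copies
        # the bytes into this process (the chunk object lives in user space).
        chunk = os.read(file_fd, chunk_size)
        if not chunk:  # empty bytes means end of file
            break
        # write: the CPU copies the user-space bytes into the kernel socket buffer.
        sock.sendall(chunk)
        total += len(chunk)
    return total


def demo():
    payload = b"kafka-record\n" * 100  # small enough to fit in the socket buffer
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        sender, receiver = socket.socketpair()
        with sender, receiver:
            sent = send_with_read_write(f.fileno(), sender)
            sender.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
            received = b""
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return sent, received


if __name__ == "__main__":
    sent, received = demo()
    print(sent, len(received))
```

Every chunk passes through a user-space bytes object even though the program never looks at its contents, which is exactly the redundancy discussed in this section.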

DMA

DMA (Direct Memory Access) is an important feature of all modern computers. It lets hardware devices of different speeds exchange data without placing a heavy interrupt load on the CPU. Put simply, a DMA transfer copies data from one address space to another. The CPU only initiates the transfer; the transfer itself is carried out by the DMA controller, between the two pieces of hardware. Because the CPU is not involved in moving the data, it is free to do other work, which greatly improves efficiency. Common hardware devices such as network cards, disk devices, graphics cards, and sound cards all support DMA.

So the read/write mode mentioned above is roughly as shown in the figure:

 

Traditional IO has two big shortcomings that make it very slow:

  1. A total of 4 copies are made. The copies between the disk (or network card) and the kernel support DMA, but there is no hardware support for copying between kernel space and user space, so those two copies cannot use DMA and must be done by the CPU.
  2. Kafka only stores the data on disk and sends it out over the network, without modifying it in between, so the two CPU copies made by read and write are completely redundant.

Zero copy

mmap

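mmap is one form of zero copy. Its main purpose is to eliminate the CPU copy from the kernel buffer to user space and so improve performance. The application calls mmap() in place of the read call:

/* Map the file into the process address space instead of read()-ing it. */
buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, diskfd, 0);
write(sockfd, buf, len);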

The steps for this mode are:

  1. The user program calls mmap(), and the data on disk is copied into a kernel buffer by DMA;
  2. The operating system then shares this kernel buffer with the user program, so the contents do not need to be copied to user space;
  3. The user program calls write(), and the operating system copies the contents of the kernel buffer directly to the socket buffer (CPU copy);
  4. Finally, the data in the socket buffer is sent to the network card (DMA copy).


This is a clear improvement: the number of data copies drops from 4 to 3. The number of context switches, however, stays at 4, because mmap() and write() are still two separate system calls, and each one switches into the kernel and back.
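The same mmap-then-write pattern can be tried from Python with the standard mmap module. This is a minimal sketch under a few assumptions: send_with_mmap and demo are hypothetical names, a socketpair stands in for a network connection, and the file is non-empty (an empty file cannot be mapped).

```python
import mmap
import socket
import tempfile


def send_with_mmap(file_fd, length, sock):
    """Send a file over a socket by mapping it instead of reading it."""
    # mmap: the file's pages are mapped into this process, so the data is
    # not copied into a separate user-space buffer.
    with mmap.mmap(file_fd, length, access=mmap.ACCESS_READ) as buf:
        # write: the kernel copies the mapped pages into the socket buffer.
        sock.sendall(buf)
    return length


def demo():
    payload = b"kafka-record\n" * 100  # small enough to fit in the socket buffer
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()  # make sure the bytes are in the file before mapping it
        sender, receiver = socket.socketpair()
        with sender, receiver:
            sent = send_with_mmap(f.fileno(), len(payload), sender)
            sender.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
            received = b""
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return sent, received


if __name__ == "__main__":
    sent, received = demo()
    print(sent, len(received))
```

Passing the mapping straight to sendall avoids building an intermediate bytes object, which is what keeps the sketch close to the two-line C version above.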

 

sendfile

The Linux 2.1 kernel introduced the sendfile function, which transfers a file to a socket. Because it is a single system call, it needs only 2 context switches. At first it copied the data the same number of times as the mmap approach, but Linux 2.4 made a major optimization that takes zero copy as far as it can go.

The optimized process is as follows:

  1. Copy the file data into the kernel buffer (DMA copy);
  2. Append only the location and length of that data in the kernel buffer to the socket buffer, instead of the data itself;
  3. Copy the data directly from the kernel buffer to the network card, using the location and length recorded in the socket buffer (DMA copy; this requires a network card that supports gather operations).

As shown in the picture:

After this process, the data travels from the disk to the network card in only two copies, both done by DMA. This is true zero copy: "zero" refers to CPU copies, since the data is never copied between kernel buffers or into user space.

This improvement comes from the Linux 2.4 kernel. In Java, FileChannel.transferTo() provides zero copy, and on Linux it is backed by sendfile.
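Python exposes the same system call as os.sendfile, so the call shape is easy to try out. This is a minimal sketch that assumes Linux; send_with_sendfile and demo are hypothetical names, and the socketpair used here has no network card behind it, so it shows the API rather than the DMA gather path.

```python
import os
import socket
import tempfile


def send_with_sendfile(file_fd, length, sock):
    """Send a file over a socket without the data entering user space."""
    offset = 0
    while offset < length:
        # os.sendfile(out_fd, in_fd, offset, count) asks the kernel to move
        # the bytes from the file to the socket and returns how many it sent.
        sent = os.sendfile(sock.fileno(), file_fd, offset, length - offset)
        if sent == 0:  # nothing more to send
            break
        offset += sent
    return offset


def demo():
    payload = b"kafka-record\n" * 100  # small enough to fit in the socket buffer
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()  # make sure the bytes are in the file before sending it
        sender, receiver = socket.socketpair()
        with sender, receiver:
            sent = send_with_sendfile(f.fileno(), len(payload), sender)
            sender.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
            received = b""
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return sent, received


if __name__ == "__main__":
    sent, received = demo()
    print(sent, len(received))
```

Unlike the read/write loop, the Python process never holds the file contents; it only passes two file descriptors, an offset, and a count to the kernel.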

Test

Tested on Windows 10:

 

The test results are for reference only and are not averages, so there may be large deviations.

Copyright statement

Original link: https://blog.csdn.net/yxf19034516/article/details/108518194
