An analysis of zero-copy technology [the basis of Kafka's million-level throughput]

Folks, this topic may cost you a few hairs. To make it as easy to follow as possible, let's look at an example before getting into the subject.

A gift for your girlfriend

Suppose, and I mean suppose, you have a girlfriend (don't be discouraged if you don't; read the article carefully and you can ask the blogger for one).

Valentine's Day is coming and you want to buy her a gift online, but sending a ready-made gift straight from the store seems insincere. To avoid being handed a "nice guy" card, you decide to have it delivered to you first and add a DIY touch before giving it to her.

Because of the epidemic, the courier cannot enter the residential community and you cannot leave it. So the courier first drops the gift at the community's courier point. You go to the courier point to pick it up, take it home, do your DIY, and bring it back to the courier point. Then the courier picks it up and delivers it to your girlfriend's house.

I drew a simple flow chart:
[Image: flow chart]
Looking at this process, there is a lot that is unreasonable. Human time and energy are precious and should be spent playing games, not sending and fetching packages.

A few years later, the logistics company developed a delivery robot. From then on, the courier no longer had to deliver to the door, and the process became:
[Image: flow chart]
With the courier robot, the courier is freed, but you still have to go to the courier point to pick up the gift and bring it back after the DIY, which is obviously unreasonable. Wouldn't that time be better spent playing Yasuo mid?

So you make a request to the gift shop owner: could you do the DIY for me and send the gift directly to my girlfriend's house?

The owner thinks it makes sense and launches a new service: customers who want DIY just add a note to the order.

So the process becomes:
[Image: flow chart]
When you place the order, you ask the store to do the DIY and ship directly to your girlfriend's house. She receives the package and is moved to tears.

With that, your two trips are saved.

Looking back

Over the whole gift delivery process, at first the courier had to make two runs and you also had to make two runs; now only the robot makes two runs.

Put simply, four runs became two, and those two are done by a robot, so the number of human runs in the whole process went from 4 to 0. I will call this process "zero run".


OK, what has to come has come. The example above was only to arouse your interest. To understand zero copy, let's now substitute the technical terms:

  • Your home ==> application cache (user space)
  • Courier point ==> OS kernel cache
  • Durex flagship store ==> source disk
  • Girlfriend's home ==> target disk or socket

Why implement zero copy

Background: File operations are dangerous operations, so the operating system cannot hand the permission for them directly to applications or users. Only the kernel has that permission.
When an application operates on a file, for example to copy or transfer it, the data goes through several steps:
1. It is copied from the disk to the system kernel
2. It is copied from the system kernel to the application
3. It is copied from the application back to the system kernel
4. It is copied from the system kernel to the consumer, such as a disk or a socket
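
As a rough illustration of where these four steps hide in everyday code, here is a minimal sketch of a copy that goes through an application buffer. The class name FourCopySketch, the method copyThroughUserSpace, and the 8192-byte buffer size are assumptions made for this example; they do not come from the benchmark later in the article.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class: a copy that passes through a user-space buffer.
public class FourCopySketch {

    /** Copies source to target through an application buffer; returns the bytes copied. */
    public static long copyThroughUserSpace(Path source, Path target) throws IOException {
        long total = 0;
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] userBuffer = new byte[8192]; // lives in the application (user space)
            int n;
            // read(): steps 1 and 2, disk -> kernel cache -> userBuffer
            while ((n = in.read(userBuffer)) != -1) {
                // write(): steps 3 and 4, userBuffer -> kernel cache -> target
                out.write(userBuffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("four-copy-src", ".bin");
        Path target = Files.createTempFile("four-copy-dst", ".bin");
        try {
            Files.write(source, new byte[100_000]);
            System.out.println(copyThroughUserSpace(source, target) + " bytes copied");
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```

The application never looks at the bytes in userBuffer; it only passes them along. That detour through user space is what zero copy sets out to remove.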

Here is a simple flow chart.

Note: The focus of this article is to help you understand zero copy. The internals of the operating system kernel are off topic, so the flow chart is very simple. For example, the "operating system kernel cache" can actually be subdivided into the "PageCache" and the "socket buffer". I won't draw details like these, since they only make the picture more complicated. But rest assured: although the chart is simple, no key point is omitted, and it will not affect your understanding of zero copy at all.

[Image: flow chart of the four copy steps]
Steps 1, 2, 3, and 4 above are all carried out by the CPU. We know that the CPU and the disk do not work at the same order of magnitude of speed, so the CPU spends much of that time waiting for the disk.

For a server, CPU resources are scarce. These four copy operations keep the CPU occupied the whole time, which obviously drags down the performance of the whole system.

So the experts of the computer industry came up with a solution: add a component to the motherboard called the DMA controller (DMAC; DMA stands for Direct Memory Access). It is a piece of physical hardware dedicated to copying data between the disk and the operating system kernel cache.

So the process becomes:
[Image: flow chart with DMA]
About the red text in the chart: almost all current network cards support SG-DMA (Scatter-Gather Direct Memory Access) technology (you can check whether yours does with the command ethtool -k eth0 | grep scatter-gather). It is enough to be aware of this; there is no need to dig deeper, since it belongs to the field of embedded development and we don't need to spend much time on it.

If you are a little confused by now, it doesn't matter. Just remember that with DMA, the CPU no longer has to move the data between the device and the kernel cache itself. It only sends an instruction telling the DMA controller which data to copy or transfer, and the DMA controller notifies the CPU when the operation is done.


Let's talk more about the implementation. Zero copy depends on both the operating system and the hardware:

On the operating system side, the OS provides the sendfile() function, which takes four parameters: the target file descriptor, the source file descriptor, the offset, and the data length.

We don't need to study how to call this function, because Java has already wrapped it for us in the transferTo() method. You don't need to care which class this method is in for now; I will give a code demonstration later.

Kafka's million-level throughput relies on the zero-copy model. If you read the Kafka source code carefully, you will find that it ultimately calls the transferTo() method.

On the hardware side, a DMAC is added to the motherboard. In fact, it is not only the motherboard: in today's computers, almost every component involved in IO has an integrated DMAC.

So far, we can make a brief summary: zero-copy technology can be understood as a combination of software (the operating system), hardware, and the programming language. Its purpose is to reduce the time the CPU spends copying and waiting, and to improve data transfer efficiency.
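
To make the software side concrete, here is a minimal sketch of sending a file to a socket with FileChannel.transferTo(), the same method the article says Kafka ends up calling. It is not Kafka's code: the class name TransferToSocketSketch, the helpers sendFile and demo, the loopback connection, and the 1MB file size are all assumptions for illustration. Whether the JVM actually uses sendfile() underneath depends on the operating system and the channel types.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class: file -> socket transfer without an application buffer.
public class TransferToSocketSketch {

    /** Sends the whole file to the socket with transferTo; returns the bytes sent. */
    public static long sendFile(Path file, SocketChannel socket) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop until done.
                position += in.transferTo(position, size - position, socket);
            }
            return position;
        }
    }

    /** Sends a temp file of fileSize bytes over loopback; returns {sent, received}. */
    public static long[] demo(int fileSize) throws Exception {
        Path file = Files.createTempFile("zero-copy", ".bin");
        AtomicLong received = new AtomicLong();
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            Files.write(file, new byte[fileSize]);
            // Port 0 lets the OS pick any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            Thread receiver = new Thread(() -> {
                // The receiving side simply counts the bytes until end of stream.
                try (SocketChannel peer = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(8192);
                    int n;
                    while ((n = peer.read(buf)) != -1) {
                        received.addAndGet(n);
                        buf.clear();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            receiver.start();
            long sent;
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                sent = sendFile(file, client);
            } // closing the client signals end of stream to the receiver
            receiver.join();
            return new long[] {sent, received.get()};
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        long[] result = demo(1 << 20);
        System.out.println("sent=" + result[0] + ", received=" + result[1]);
    }
}
```

Note that sendFile never allocates a byte array: the sending application only names the source, the range, and the target, which mirrors the four parameters of sendfile() described above.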


Talk is cheap, show me the code

Let's take copying files as an example to test the performance improvement brought by the zero-copy model.

Create a new 600MB test file and try the traditional IO copy first (the code is at the end).

600MB took about 800ms, which is already fast for stream IO. With plain IO and no buffer, the time is about 3 times longer.

Now let's look at zero-copy NIO.

About 300ms, which is more than twice as fast as IO. That may not look like much, but think of high-concurrency scenarios, where tens of thousands of similar operations are routine. The time saved is enough to make the client slap their wheelchair in excitement.

Code:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;

public class ZeroCopyTest {

    public static void main(String[] args) throws IOException, InterruptedException {
        long start = System.currentTimeMillis();
        File source = new File("D:/fileTest/copy.txt");
        File target = new File("D:/fileTest/copy2.txt");

        // Enable exactly one of the three copy methods per run.
        ioCopy(source, target);
//        ioCopyWithBuffer(source, target);
//        nioCopy(source, target);
        System.out.println(System.currentTimeMillis() - start);
        Thread.sleep(200);
        target.delete();
    }

    // Plain stream IO: every read and write goes through a 1KB application buffer.
    public static void ioCopy(File source, File target) throws IOException {
        try (InputStream is = new FileInputStream(source);
             OutputStream os = new FileOutputStream(target)) {
            byte[] buffer = new byte[1024];
            int length;
            while ((length = is.read(buffer)) != -1) {
                os.write(buffer, 0, length);
            }
        }
    }

    // Stream IO wrapped in buffered streams, which batch the underlying reads and writes.
    public static void ioCopyWithBuffer(File source, File target) throws IOException {
        try (InputStream is = new BufferedInputStream(new FileInputStream(source));
             OutputStream os = new BufferedOutputStream(new FileOutputStream(target))) {
            byte[] buffer = new byte[1024];
            int length;
            while ((length = is.read(buffer)) != -1) {
                os.write(buffer, 0, length);
            }
        }
    }

    // Zero copy: the channel hands the transfer over to the operating system.
    public static void nioCopy(File source, File target) throws IOException {
        try (FileChannel sourceChannel = new FileInputStream(source).getChannel();
             FileChannel targetChannel = new FileOutputStream(target).getChannel()) {
            long position = 0;
            long size = sourceChannel.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += sourceChannel.transferTo(position, size - position, targetChannel);
            }
        }
    }
}

There go two more hairs, boo-hoo ~

The girlfriend promised at the beginning was actually a lie.

OK, I'm done.


Origin blog.csdn.net/qq_33709582/article/details/123043821