Kafka's high-performance read and write principles

1. Page cache technology + sequential disk writes
First of all, every time Kafka receives data, it writes that data to a file on disk.

This naturally raises a question: if data is constantly being written to disk files, won't performance suffer? Most people assume that disk writes are extremely slow.

Yes, if writing were really that simple, performance would indeed be relatively poor. But Kafka has an excellent design here precisely to guarantee write performance: first of all, Kafka writes files through the operating system's page cache.

The operating system maintains its own layer of cache in memory, called the page cache (or the os cache, since it is managed by the operating system itself). When writing a disk file, a process can write directly into the os cache, that is, into memory only, and the operating system later decides when to flush the data from the os cache to the disk file.

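As a minimal sketch of what this looks like from Java (the file name is made up, and this is not Kafka's actual code): a plain write() returns as soon as the bytes have been copied into the os cache, and only an explicit force() would wait for the physical disk. By default Kafka leaves flushing to the operating system, although the broker settings log.flush.interval.messages and log.flush.interval.ms can force earlier flushes.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PageCacheWriteDemo {
    public static void main(String[] args) throws Exception {
        try (FileChannel channel = FileChannel.open(Path.of("demo-segment.log"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {

            ByteBuffer record = ByteBuffer.wrap("key:value\n".getBytes());

            // Returns once the bytes are in the OS page cache; the kernel flushes
            // them to the physical disk later, on its own schedule.
            channel.write(record);

            // Only an explicit fsync forces the data to disk right now.
            // Kafka normally does not do this for every record.
            // channel.force(true);
        }
    }
}
```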

This step alone already improves file write performance enormously, because the write is essentially a write to memory rather than to disk.

The other key point is that when Kafka writes data, it writes to disk sequentially.

In other words, it only appends data to the end of the file; it never modifies data at random positions within the file.

Ordinary mechanical disks perform extremely poorly under random writes, where each write first seeks to some arbitrary position in a file. But when data is only ever appended to the end of a file, this sequential disk write can come close to the performance of writing to memory itself.

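To make the contrast concrete, here is a small sketch (file names and sizes are arbitrary, and this is an illustration rather than a rigorous benchmark) of append-only writes versus writes that first seek to a random offset:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ThreadLocalRandom;

public class SequentialVsRandomWrite {
    public static void main(String[] args) throws Exception {
        ByteBuffer record = ByteBuffer.allocate(1024); // 1 KB dummy record

        // Sequential: every write lands at the current end of the file (Kafka-style append).
        try (FileChannel log = FileChannel.open(Path.of("sequential.log"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (int i = 0; i < 10_000; i++) {
                record.rewind();
                log.write(record);
            }
        }

        // Random: every write first seeks to an arbitrary offset, forcing extra disk seeks.
        try (FileChannel data = FileChannel.open(Path.of("random.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            for (int i = 0; i < 10_000; i++) {
                record.rewind();
                data.position(ThreadLocalRandom.current().nextLong(10_000) * 1024L);
                data.write(record);
            }
        }
    }
}
```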

To sum up: when Kafka writes data, on one hand it writes through the OS-level page cache, so writes are essentially memory writes and performance is high; on the other hand it writes to disk sequentially, so even when the data is flushed to disk the performance remains extremely high, close to that of memory.

Based on these two points, Kafka achieves extremely high write performance.

So, if it takes 1 millisecond for Kafka to write one record, does that mean it can write roughly 1,000 records per second?

And if Kafka is fast enough that writing one record takes only 0.01 milliseconds, can it then write roughly 100,000 records per second?

The key to writing tens or even hundreds of thousands of records per second is therefore to make each individual write as fast as possible, so that more data can be written per unit of time and throughput goes up.

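The arithmetic behind those figures is simply throughput = 1 second ÷ per-record write latency; a tiny sketch using the hypothetical latencies above:

```java
public class ThroughputEstimate {
    public static void main(String[] args) {
        double[] latenciesMs = {1.0, 0.01}; // hypothetical per-record write latencies
        for (double latencyMs : latenciesMs) {
            double recordsPerSecond = 1000.0 / latencyMs; // 1,000 ms in one second
            System.out.printf("latency %.2f ms -> ~%,.0f records/sec%n", latencyMs, recordsPerSecond);
        }
    }
}
```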

2. Zero-copy technology
With writes covered, let's talk about consumption.

We often consume data from Kafka, and when we do, the broker is actually reading data from its disk files and sending it over the network to downstream consumers.

So if data is constantly being read from disk and sent to consumers, where is the performance bottleneck?

Assuming Kafka did nothing to optimize this and simply read data from disk and sent it to the downstream consumer, the process would look roughly like this:

1. Check whether the data to be read is already in the os cache; if it is not, read it from the disk file into the os cache.
2. Copy the data from the operating system's os cache into the application process's buffer.
3. Copy the data from the application process's buffer into the operating-system-level Socket buffer.
4. Finally, copy the data from the Socket buffer to the network card, which sends it out to the downstream consumer.

Two of these copies are completely unnecessary: the copy from the os cache into the application process's buffer, and the copy from the application buffer into the operating system's Socket buffer. To perform those two copies, several context switches happen along the way: for a while the application is executing, then execution switches into the operating system, and back again.

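A rough sketch of this unoptimized path in plain Java (this is not Kafka's code; the file name and consumer address are made up): read() pulls the data from the os cache into a user-space buffer, and write() copies it again into the socket buffer.

```java
import java.io.FileInputStream;
import java.io.OutputStream;
import java.net.Socket;

public class NaiveFileSend {
    public static void main(String[] args) throws Exception {
        byte[] buffer = new byte[8192];

        try (FileInputStream file = new FileInputStream("demo-segment.log");
             Socket socket = new Socket("consumer-host", 9999);
             OutputStream out = socket.getOutputStream()) {

            int n;
            while ((n = file.read(buffer)) != -1) {
                // read(): disk -> os cache -> this user-space buffer (one extra copy)
                // write(): user-space buffer -> socket buffer -> network card (another copy)
                out.write(buffer, 0, n);
            }
        }
    }
}
```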

Reading data this way therefore wastes a great deal of performance.

To solve this problem, Kafka applies zero-copy technology on the read path.

In other words, the data in the operating system's cache is sent directly to the network card and then on to the downstream consumer, skipping the two intermediate copies; only a descriptor is copied into the Socket buffer, and the data itself is never copied into the Socket buffer.


With zero-copy technology, there is no need to copy data from the os cache into the application buffer and then from the application buffer into the socket buffer; both copies are skipped, which is why it is called zero copy.

Only a descriptor is copied into the Socket buffer; the data itself is sent directly from the os cache to the network card. This greatly improves the performance of reading file data during consumption.

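On the JVM, this path is exposed through Java NIO's FileChannel.transferTo(), which on Linux is backed by the sendfile system call; this is the mechanism Kafka relies on when serving log data to consumers. A minimal sketch (the file name and consumer address are made up, and this is not Kafka's actual code):

```java
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws Exception {
        try (FileChannel file = FileChannel.open(Path.of("demo-segment.log"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("consumer-host", 9999))) {

            long position = 0;
            long remaining = file.size();

            // transferTo() asks the kernel to move the bytes from the file's page cache
            // straight to the socket: no copy into a user-space buffer, and only a
            // descriptor reaches the socket buffer.
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```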

Also note that when reading data, Kafka first checks whether the data is already in the os cache; if it is, the data is read directly from memory and the disk is never touched.

3. Summary
Combining the read and write characteristics above, and with good tuning, you will find that a large amount of data is written directly into the os cache, and reads are likewise served from the os cache.

This is effectively Kafka reading and writing data entirely through memory, which is why its overall performance is so high.

Source: blog.csdn.net/Erica_1230/article/details/114693768