
Kafka’s underlying principles for achieving high throughput and low latency

Although Kafka stores its data on disk, it delivers high concurrency, high throughput, and low latency: its throughput can easily reach tens of thousands, hundreds of thousands, or even millions of messages per second. So how does Kafka do it?

Sequential write

As we all know, Kafka persists message records to local disks. Many people assume that disk read/write performance is poor and question how Kafka's performance can be guaranteed. In fact, for both memory and disk, speed depends on the access pattern: each can be read and written sequentially or randomly. Random disk access is indeed very slow, but sequential disk access is very fast, generally around three orders of magnitude faster than random disk access, and in some cases even faster than random memory access.

Kafka's messages are continuously appended to the end of the local disk file instead of being written randomly, which significantly improves Kafka's write throughput.
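The append-only pattern can be sketched as follows. This is a minimal illustration, not Kafka's actual on-disk record format: it appends length-prefixed records to the end of a log file, the way Kafka appends messages to the active segment rather than seeking to random positions.

```python
import os
import struct
import tempfile

def append_record(log, payload: bytes) -> int:
    """Append one length-prefixed record at the current end of the log."""
    offset = log.tell()                          # always the end of the file
    log.write(struct.pack(">I", len(payload)))   # 4-byte big-endian length prefix
    log.write(payload)
    return offset

def read_records(path):
    """Scan the log from the start and return all payloads in order."""
    records = []
    with open(path, "rb") as log:
        while True:
            header = log.read(4)
            if len(header) < 4:
                break
            (size,) = struct.unpack(">I", header)
            records.append(log.read(size))
    return records

path = os.path.join(tempfile.mkdtemp(), "00000000000000000000.log")
with open(path, "ab") as log:                    # append-only: purely sequential writes
    for msg in [b"hello", b"kafka", b"log"]:
        append_record(log, msg)

print(read_records(path))  # → [b'hello', b'kafka', b'log']
```

Because every write lands at the end of the file, the disk head (or SSD write path) never has to seek, which is exactly what makes sequential writes fast.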

Page Cache

When the CPU needs to access a file on disk, the file's contents must first be copied into memory, and due to hardware limitations the transfer from disk to memory is slow. If there is free physical memory, why not put it to use? The operating system uses this otherwise idle memory to cache disk file contents, and this portion of memory is called the page cache.

Thanks to the operating system's page cache, Kafka's reads and writes mostly operate on memory, which greatly improves read and write speed.

Zero copy

Here we focus on Kafka's use of the Linux operating system's “zero-copy” mechanism on the consumer side. First, consider the usual path data takes when it is sent from a file to a network socket:


  1. The operating system reads the data from disk into the page cache in kernel space.
  2. The application reads the data from the page cache into a buffer in user space.
  3. The application writes the data from the user-space buffer back into kernel space, into the socket buffer.
  4. The operating system copies the data from the socket buffer into the NIC buffer, from which it is sent over the network.

This process involves four copy operations and two system calls (four user/kernel context switches), which is quite inefficient. Linux's “zero-copy” mechanism, exposed through the sendfile system call, lets the operating system send data from the page cache to the network directly: only the final copy into the NIC buffer remains, and the round trip through user space is avoided entirely.

Through this “zero-copy” mechanism, page cache combined with sendfile, the performance of the Kafka consumer side is also greatly improved. This is also why a consumer can sometimes keep consuming data without the disk showing much I/O: at that moment, the operating system's cache is serving the data.
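The zero-copy transfer can be sketched with Python's standard library. This is an illustration, not Kafka's code: `socket.sendfile()` uses the `os.sendfile()` system call where the OS supports it (copying from the page cache to the socket entirely inside the kernel) and transparently falls back to an ordinary read/send loop elsewhere. A connected socket pair stands in for a real network connection.

```python
import os
import socket
import tempfile

# Write a small "log segment" to disk.
payload = b"message-bytes-from-a-log-segment" * 100
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(payload)

sender, receiver = socket.socketpair()   # stands in for a real network socket
with open(path, "rb") as segment:
    # Kernel-to-kernel transfer when the platform supports sendfile:
    # the bytes never pass through a user-space buffer in that case.
    sent = sender.sendfile(segment)
sender.close()

# Drain the receiving side and verify the transfer.
chunks = []
while True:
    chunk = receiver.recv(65536)
    if not chunk:
        break
    chunks.append(chunk)
receiver.close()

print(sent)  # → 3200 bytes transferred
```

The application never touches the file's bytes here; it only tells the kernel which file to push into which socket, which is the essence of the consumer-side fetch path described above.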

Partition segment + index

Kafka stores messages by topic, and the data within a topic is distributed across different broker nodes partition by partition. Each partition corresponds to a directory on the operating system, and each partition is in turn stored as a series of segments. This closely follows the classic partitioning-and-bucketing design of distributed systems.

Through this partitioned, segmented design, Kafka's messages are actually spread across many small segment files, and each file operation acts directly on a single small segment. To speed up queries further, Kafka by default creates an index file for each segment's data file, the .index file you see on the file system. This partition + segment + index design improves both the efficiency of data reads and the parallelism of data operations.
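The index lookup can be sketched as a binary search. This is a toy model of a segment's sparse .index file (the real format is binary and memory-mapped): the index maps every Nth record's offset to its byte position in the .log file, so finding an offset means binary-searching for the nearest index entry at or below it, then scanning forward in the log from that position.

```python
import bisect

# Hypothetical sparse index: (message_offset, byte_position_in_log_file).
# Only every 100th offset is indexed, keeping the index small.
index = [(0, 0), (100, 4096), (200, 8192), (300, 12288)]

def lookup(target_offset: int) -> int:
    """Return the byte position from which to start scanning the .log file."""
    offsets = [offset for offset, _ in index]
    i = bisect.bisect_right(offsets, target_offset) - 1  # nearest entry <= target
    return index[i][1]

print(lookup(150))  # → 4096 (nearest indexed offset at or below 150 is 100)
```

Because the index is sparse, it stays small enough to search very quickly, while the forward scan it leaves behind covers at most one indexing interval.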

Batch read and write

Kafka reads and writes data in batches rather than one record at a time.
Beyond the low-level techniques above, Kafka also provides ways to improve performance at the application level, most visibly batching. When writing data to Kafka, batch writes can be enabled, which avoids the latency and bandwidth overhead of frequently transmitting individual messages over the network. Assuming a network bandwidth of 10 MB/s, transmitting a single 10 MB batch in one go is obviously much faster than transmitting a 1 KB message 10,000 times.
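Producer-side batching can be sketched as a simple accumulator. This is a simplified analogue of what Kafka's producer does under its batch.size and linger.ms settings, not the real client: records accumulate in a buffer and go out as one network send once the batch is full, instead of one send per record.

```python
class BatchingProducer:
    """Toy producer that flushes accumulated records in fixed-size batches."""

    def __init__(self, batch_size: int, transport):
        self.batch_size = batch_size
        self.transport = transport   # callable that "sends" a list of records
        self.batch = []
        self.sends = 0               # how many network sends actually happened

    def send(self, record: bytes):
        self.batch.append(record)
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.batch:
            self.transport(self.batch)  # one send carries the whole batch
            self.sends += 1
            self.batch = []

sent_batches = []
producer = BatchingProducer(batch_size=100, transport=sent_batches.append)
for i in range(1000):
    producer.send(f"msg-{i}".encode())
producer.flush()  # push out any partial final batch

print(producer.sends)  # → 10 network sends for 1000 records
```

Each send carries a fixed per-request cost (system calls, network round trip, protocol headers), so amortizing that cost over 100 records per send is where the throughput gain comes from.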

Batch compression

In many cases, the bottleneck of the system is neither the CPU nor the disk, but network I/O. This is especially true for data pipelines that send messages between data centers over a WAN. Compression consumes a small amount of CPU, but for Kafka the savings in network I/O make it worthwhile.

  1. Compressing each message individually yields a relatively low compression ratio, so Kafka uses batch compression: multiple messages are compressed together rather than one at a time.
  2. Kafka allows recursive message aggregation: batches of messages can be transmitted in compressed form and remain compressed in the log until decompressed by the consumer.
  3. Kafka supports multiple compression codecs, including Gzip, Snappy, LZ4, and Zstd.
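The advantage of batch compression can be demonstrated with gzip from the Python standard library (a stand-in for whichever codec Kafka is configured with). A compressor's dictionary can exploit redundancy across similar messages, so compressing a whole batch at once produces far fewer bytes than compressing each message individually.

```python
import gzip

# 1000 structurally similar messages, as a typical event stream would produce.
messages = [f'{{"user_id": {i}, "event": "page_view", "page": "/home"}}'.encode()
            for i in range(1000)]

# Compress each message on its own: every message pays gzip's fixed overhead
# and cannot share a dictionary with its neighbours.
per_message = sum(len(gzip.compress(m)) for m in messages)

# Compress the whole batch at once: repeated field names and values
# across messages are encoded just once.
whole_batch = len(gzip.compress(b"".join(messages)))

print(per_message, whole_batch)  # the batch total is far smaller
```

This is exactly why point 1 above matters: the low per-message compression ratio disappears once many similar messages are compressed together.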

Source: Kafka - Why can Kafka achieve high throughput and low latency | Can you drink a cup of nothing (liaosi.site)


Origin blog.csdn.net/weixin_45483322/article/details/133217533