Kafka File Storage Mechanism: The Details

What is Kafka

Kafka was originally developed by LinkedIn. It is a distributed, partitioned, replicated, multi-subscriber log system coordinated by ZooKeeper (it can also be used as an MQ system), and it is commonly used for web/nginx logs, access logs, messaging services, and more. LinkedIn contributed it to the Apache Foundation in 2010, and it became a top-level open source project.

1. Introduction

For a commercial message queue, the design of the file storage mechanism is one of the most critical indicators of the technical level of the service.
The following analyzes how Kafka achieves efficient file storage, and how it behaves in practice, from the perspective of its file storage mechanism and physical structure.

2. Kafka file storage mechanism

Some terms in Kafka are explained as follows:

  • Broker: A message middleware processing node. One Kafka node is one broker, and multiple brokers form a Kafka cluster.
  • Topic: A category of messages. Page view logs, click logs, and so on can each exist as a topic, and a Kafka cluster can handle the distribution of multiple topics at the same time.
  • Partition: The physical grouping of a topic. A topic can be divided into multiple partitions, and each partition is an ordered queue.
  • Segment: A partition is physically composed of multiple segments, which are described in detail in 2.2 and 2.3 below.
  • Offset: Each partition consists of a sequence of ordered, immutable messages that are appended to the partition one after another. Each message in the partition has a consecutive sequence number called the offset, which uniquely identifies the message within the partition.

The analysis process is divided into the following 4 steps:

  • Partition storage distribution in topic
  • File storage method in partition
  • Segment file storage structure in partition
  • How to find message by offset in partition

A detailed analysis of these 4 steps makes the inner workings of the Kafka file storage mechanism clear.

2.1 Partition storage distribution in topic

Assume the Kafka cluster in the test environment has only one broker, and xxx/message-folder is the root directory for data file storage, configured in the broker's server.properties file (parameter log.dirs=xxx/message-folder). For example, create two topics named report_push and launch_info, each with partitions=4. The storage paths and directory rules are:
xxx/message-folder

              |--report_push-0
              |--report_push-1
              |--report_push-2
              |--report_push-3
              |--launch_info-0
              |--launch_info-1
              |--launch_info-2
              |--launch_info-3

In Kafka file storage, a topic has multiple partitions, and each partition is a directory. A partition directory is named topic name + sequence number; the first partition is numbered 0, and the largest number is the number of partitions minus 1.
For the multi-broker case, please refer to the article "Kafka cluster partition distribution principle analysis".
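
The naming rule can be sketched in a few lines of Python. The helper name partition_dirs is hypothetical; it only mirrors the directory layout shown above and is not Kafka code.

```python
def partition_dirs(topic: str, num_partitions: int) -> list[str]:
    """Return directory names <topic>-<n> for n = 0 .. num_partitions - 1."""
    return [f"{topic}-{n}" for n in range(num_partitions)]


# The two example topics from the text, 4 partitions each.
for topic in ("report_push", "launch_info"):
    for name in partition_dirs(topic, 4):
        print(name)
```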

2.2 File storage method in partition

The following schematic diagram illustrates how files are stored in a partition:
[Figure 1]

  • Each partition (directory) is equivalent to one giant file that is split evenly into multiple segment data files of equal size. The number of messages in each segment file is not necessarily equal, however. This design makes it easy to delete old segment files quickly.
  • Each partition only needs to support sequential reads and writes, and the life cycle of a segment file is determined by server configuration parameters.

The advantage of this is that it can quickly delete useless files and effectively improve disk utilization.
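
A minimal in-memory sketch of this idea, assuming a hypothetical Partition class, a size-based rule for starting a new segment, and a retention rule that drops the oldest whole segments. The names and thresholds are illustrative and are not Kafka configuration parameters.

```python
class Partition:
    """Toy partition: an append-only list of segments, each a list of messages."""

    def __init__(self, segment_bytes: int):
        self.segment_bytes = segment_bytes   # roll threshold (assumed parameter)
        self.segments = [[]]                 # oldest segment first

    def append(self, message: bytes) -> None:
        active = self.segments[-1]
        if active and sum(map(len, active)) + len(message) > self.segment_bytes:
            active = []                      # roll: start a new active segment
            self.segments.append(active)
        active.append(message)               # writes always go to the tail

    def delete_old_segments(self, keep: int) -> int:
        """Drop whole segments from the head, keeping the newest `keep`."""
        removed = max(0, len(self.segments) - keep)
        del self.segments[:removed]
        return removed


p = Partition(segment_bytes=10)
for payload in (b"aaaa", b"bbbb", b"cccccc", b"dd", b"eeeeeeee"):
    p.append(payload)
print([len(s) for s in p.segments])      # message count per segment
print(p.delete_old_segments(keep=1))     # number of segments removed
```

Because cleanup removes whole segments from the head, it never has to rewrite the data that remains.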

2.3 Segment file storage structure in partition

Section 2.2 described how a partition stores its files. This section analyzes in depth the composition and physical structure of the segment files in a partition.

  • Segment file composition: A segment consists of two parts, an index file and a data file. The two files correspond one-to-one and always appear in pairs. The suffixes ".index" and ".log" denote the segment index file and the data file respectively.
  • Segment file naming rules: The first segment of a partition starts from 0, and each subsequent segment file is named after the offset of the last message in the previous segment file. The value is at most a 64-bit long (up to 19 decimal digits), and unused leading positions in the file name are padded with 0s.

The following file list comes from an experiment the author ran on a Kafka broker: create a topicXXX containing 1 partition, set the size of each segment to 500MB, and start a producer that writes a large amount of data to the broker. The resulting segment file list, shown in Figure 2, illustrates the 2 rules above:

[Figure 2]
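
The naming rule can be illustrated with a short Python helper. The name segment_file_names is hypothetical, and the 20-character width is taken from the file names quoted in section 2.4 of this article.

```python
def segment_file_names(name_offset: int) -> tuple[str, str]:
    """Return the (.index, .log) file pair for a segment, zero-padded to 20 digits."""
    if not 0 <= name_offset < 2**63:          # must fit a signed 64-bit long
        raise ValueError("offset out of range for a 64-bit long")
    stem = f"{name_offset:020d}"
    return f"{stem}.index", f"{stem}.log"


print(segment_file_names(0))
print(segment_file_names(368769))
```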

Taking one pair of segment files from Figure 2 as an example, the physical structure of the index <-> data file correspondence within a segment is as follows:

[Figure 3]

In Figure 3 above, the index file stores a large amount of metadata and the data file stores a large number of messages. Each metadata entry in the index file points to the physical offset address of a message in the corresponding data file.
Take the metadata entry 3,497 in the index file as an example: it denotes the 3rd message in the data file (message 368772 in the partition as a whole, i.e. 368769 + 3), and the physical offset address of that message is 497.

Figure 3 shows that a segment data file is composed of many messages. The physical structure of a message is described in detail below:

[Figure 4]

Parameter description:

Keyword                Explanation
8 byte offset          Each message in a partition has an ordered ID number called the offset, which uniquely determines the position of the message in the partition. That is, the offset is the message's sequence number within the partition.
4 byte message size    Size of the message.
4 byte CRC32           CRC32 checksum used to verify the message.
1 byte "magic"         Protocol version number of this release of the Kafka service.
1 byte "attributes"    Indicates a standalone version, or identifies the compression type or the encoding type.
4 byte key length      Length of the key. When it is -1, the K byte key field is omitted.
K byte key             Optional.
value bytes payload    The actual message data.
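
The layout above can be sketched with Python's struct module. This is an illustration under stated assumptions, not Kafka's serializer: the table lists no payload-length field, so the sketch infers the payload length from message size; it also assumes big-endian encoding, that message size counts everything after the size field, and that the CRC32 covers the bytes that follow it. The function names encode and decode are hypothetical.

```python
import struct
import zlib
from typing import Optional

# offset (8), message size (4), CRC32 (4), magic (1), attributes (1), key length (4)
HEADER = struct.Struct(">qiIbbi")


def encode(offset: int, payload: bytes, key: Optional[bytes] = None,
           magic: int = 0, attributes: int = 0) -> bytes:
    key_len = -1 if key is None else len(key)       # -1 means "no key field"
    body = struct.pack(">bbi", magic, attributes, key_len) + (key or b"") + payload
    crc = zlib.crc32(body)                          # assumption: CRC covers bytes after it
    size = 4 + len(body)                            # assumption: size = CRC field + body
    return struct.pack(">qiI", offset, size, crc) + body


def decode(buf: bytes) -> dict:
    offset, size, crc, magic, attributes, key_len = HEADER.unpack_from(buf, 0)
    end = 12 + size                                 # offset and size fields precede the message
    if zlib.crc32(buf[16:end]) != crc:
        raise ValueError("CRC32 mismatch")
    pos, key = HEADER.size, None
    if key_len >= 0:
        key = buf[pos:pos + key_len]
        pos += key_len
    return {"offset": offset, "magic": magic, "attributes": attributes,
            "key": key, "payload": buf[pos:end]}


print(decode(encode(368772, b"hello", key=b"k")))
```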

2.4 How to find message by offset in partition

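For example, reading the message with offset=368776 takes the following 2 steps.

  • Step 1: find the segment file
    Take Figure 2 above as an example. 00000000000000000000.index is the first file, with a starting offset of 0. The second file, 00000000000000368769.index, has a starting message offset of 368770 = 368769 + 1. Likewise, the third file, 00000000000000737337.index, has a starting offset of 737338 = 737337 + 1, and the remaining files follow the same pattern. Because the files are named by starting offset and sorted, a binary search over the file list by offset quickly locates the right file.
    With offset=368776, this locates 00000000000000368769.index|log.

  • Step 2: find the message through the segment file
    With the segment file located in step 1, for offset=368776 first locate the metadata entry in 00000000000000368769.index and, from it, the physical offset address in 00000000000000368769.log. Then scan 00000000000000368769.log sequentially from that address until offset=368776 is reached.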
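
The two steps can be sketched as follows. This is a toy model that follows the naming convention described in this article, not Kafka's implementation: the segment names and the index entry (3, 497) come from the text, while every other index entry and log position is made up for illustration, and the function names are hypothetical.

```python
import bisect

# Segment names quoted in the text (name = offset of the last message of the previous segment).
SEGMENT_NAMES = [0, 368769, 737337]

# Sparse index of segment 368769 as (n-th message in segment, byte position in .log).
# Only (3, 497) comes from the text; the other entries are illustrative.
INDEX = [(1, 0), (3, 497), (6, 1407)]

# Toy .log contents as (byte position, offset); positions are illustrative.
LOG = [(0, 368770), (250, 368771), (497, 368772), (800, 368773),
       (1100, 368774), (1407, 368775), (1550, 368776), (1686, 368777)]


def find_segment(names, offset):
    """Step 1: binary search for the segment whose starting offset is <= offset."""
    starts = [0] + [n + 1 for n in names[1:]]
    i = bisect.bisect_right(starts, offset) - 1
    if i < 0:
        raise KeyError(offset)
    return names[i]


def find_position(index, name, offset):
    """Step 2a: binary search the sparse index for the nearest entry at or before offset."""
    relative = offset - name
    i = bisect.bisect_right([n for n, _ in index], relative) - 1
    return index[max(i, 0)][1]


def scan(log, start_position, offset):
    """Step 2b: scan the data file sequentially from start_position."""
    for position, message_offset in log:
        if position >= start_position and message_offset == offset:
            return position
    raise KeyError(offset)


name = find_segment(SEGMENT_NAMES, 368776)
start = find_position(INDEX, name, 368776)
print(name, start, scan(LOG, start, 368776))
```

The sequential scan only covers the gap between two index entries, which is why a sparse index keeps lookups cheap.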

The advantage of this approach can be seen from Figure 3 above. The segment index file uses sparse index storage, which reduces the size of the index file and allows it to be operated on directly in memory through mmap. A sparse index does not set a metadata pointer for every message in the data file; compared with a dense index it saves more storage space, but lookups take more time.

3. Kafka File Storage Mechanism – Actual Operational Effect

Test environment:

  • Kafka cluster: 2 virtual machines
  • CPU: 4 cores
  • Physical memory: 8GB
  • Network card: Gigabit network card
  • JVM heap: 4GB
  • For detailed Kafka server configuration and optimization, please refer to: kafka server.properties configuration details

[Figure 5]

As Figure 5 shows, Kafka rarely performs a large number of disk reads; disk activity consists mainly of periodic batched writes, so disk operations are very efficient. This is closely related to how Kafka's file storage handles reading and writing messages, which has the following characteristics:

Writing messages

  • The message is transferred from the Java heap to the page cache (i.e. physical memory).
  • An asynchronous thread flushes the disk, writing the message from the page cache to the disk.
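
The write path can be imitated in Python as a rough analogy. The class name LogWriter and the flush interval are hypothetical, and os.fsync stands in for whatever flush call the broker uses: an unbuffered write returns once the bytes are in the operating system's page cache, and a background thread forces them to disk periodically.

```python
import os
import tempfile
import threading


class LogWriter:
    """Appends go to the OS page cache; a background thread fsyncs periodically."""

    def __init__(self, path: str, flush_interval_s: float):
        self._file = open(path, "ab", buffering=0)   # unbuffered: write() hands bytes to the OS
        self._stop = threading.Event()
        self._flusher = threading.Thread(
            target=self._flush_loop, args=(flush_interval_s,), daemon=True)
        self._flusher.start()

    def append(self, message: bytes) -> None:
        self._file.write(message)                    # returns once data is in the page cache

    def _flush_loop(self, interval_s: float) -> None:
        while not self._stop.wait(interval_s):
            os.fsync(self._file.fileno())            # force page cache contents to disk

    def close(self) -> None:
        self._stop.set()
        self._flusher.join()                         # stop flushing before the final sync
        os.fsync(self._file.fileno())
        self._file.close()


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.log")
    writer = LogWriter(path, flush_interval_s=0.05)
    for i in range(3):
        writer.append(f"message-{i}\n".encode())
    writer.close()
    print(os.path.getsize(path))
```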

Reading messages

  • The message is transferred directly from the page cache to the socket and sent out.
  • When the data is not found in the page cache, disk IO occurs: the message is loaded from the disk into the page cache and then sent directly through the socket.
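
Sending file data to a socket without copying it through application memory is what the sendfile system call provides. A minimal Python sketch, assuming a hypothetical helper send_log_range and a local socket pair in place of a real consumer connection; socket.sendfile uses os.sendfile where the platform supports it and falls back to ordinary send() otherwise.

```python
import os
import socket
import tempfile


def send_log_range(sock: socket.socket, path: str, position: int, count: int) -> int:
    """Send `count` bytes of the file starting at `position`; return bytes sent."""
    with open(path, "rb") as f:
        return sock.sendfile(f, offset=position, count=count)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.log")
    with open(path, "wb") as f:
        f.write(b"0123456789")
    producer_side, consumer_side = socket.socketpair()
    with producer_side, consumer_side:
        sent = send_log_range(producer_side, path, position=3, count=4)
        print(sent, consumer_side.recv(16))
```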

4. Summary

Kafka efficient file storage design features

  • Kafka divides the large partition file of a topic into multiple small segment files. With multiple small segment files, it is easy to periodically clear or delete files that have already been consumed and so reduce disk usage.
  • The index information can quickly locate the message and determine the maximum size of the response.
  • By mapping all index metadata to memory, segment file IO disk operations can be avoided.
  • By sparse storage of index files, the space occupied by the metadata of index files can be greatly reduced.
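
Mapping an index file into memory and searching it in place can be sketched as below. The entry layout (two big-endian 4-byte integers: message number within the segment, then byte position) is an assumption made for this example, as are the helper names write_index and lookup; only the entry (3, 497) comes from the text.

```python
import mmap
import os
import struct
import tempfile

ENTRY = struct.Struct(">ii")   # assumed layout: 4-byte message number, 4-byte position


def write_index(path: str, entries) -> None:
    with open(path, "wb") as f:
        for number, position in entries:
            f.write(ENTRY.pack(number, position))


def lookup(index_map: mmap.mmap, relative: int) -> int:
    """Binary search the mapped entries for the last one with number <= relative."""
    lo, hi = 0, len(index_map) // ENTRY.size - 1
    best = 0                                   # fall back to the start of the data file
    while lo <= hi:
        mid = (lo + hi) // 2
        number, position = ENTRY.unpack_from(index_map, mid * ENTRY.size)
        if number <= relative:
            best, lo = position, mid + 1
        else:
            hi = mid - 1
    return best


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.index")
    write_index(path, [(1, 0), (3, 497), (6, 1407)])
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as index_map:
            print(lookup(index_map, 7))        # position to start scanning from
```

The search reads entries straight from the mapping, so no explicit read calls are issued once the pages are resident.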
