[Kafka from Entry to Abandonment, Part 3] Kafka Architecture in Depth: Workflow and Storage Mechanism

In the previous post, [Kafka from Entry to Abandonment, Part 2], I built a distributed Kafka cluster and successfully produced and consumed messages, giving an initial verification of Kafka's functionality. Building on the brief overview of theory and basic architecture in [Kafka from Entry to Abandonment, Part 1] and the hands-on practice of Part 2, this post digs into Kafka's architecture and some of its strategies, focusing on the overall workflow and the file storage mechanism. The next few posts will cover the producer strategy, the consumer strategy, and some advanced features separately.

Kafka workflow

From the basic concepts we already know that Kafka messages are organized by Topic, and that a Topic can be logically divided into Partitions. Note the following points about Topics and Partitions:

  • A Topic can be thought of as a category of messages. Each Topic is divided into multiple Partitions, and at the storage level each Partition is an append-only log file.
  • Any message published to a partition is appended to the end of its log file. The position of a message in the file is called its offset: a long integer that uniquely identifies the message within the partition.


  • In Kafka, messages pushed by producers are appended to a Partition. This is a sequential disk write, which is far more efficient than random writes, and can even outperform random writes to memory.
  • Each consumer in a consumer group records in real time which offset it has consumed, so that after a failure it can resume consuming from the last position.
  • Kafka only guarantees message order within a Partition; it cannot guarantee global ordering across a Topic.
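The points above can be illustrated with a minimal sketch (not Kafka's actual implementation): an append-only partition log plus a consumer that tracks its own offset, showing why order is guaranteed only within a single partition. The class and method names here are made up for illustration.

```python
class Partition:
    """An append-only log; each message's position is its offset."""

    def __init__(self):
        self.log = []  # append-only list of messages

    def append(self, msg):
        self.log.append(msg)      # sequential write: always at the end
        return len(self.log) - 1  # offset of the newly written message


class Consumer:
    """Tracks its own read position (offset) in one partition."""

    def __init__(self, partition):
        self.partition = partition
        self.offset = 0  # next offset to read; recorded per consumer

    def poll(self):
        if self.offset < len(self.partition.log):
            msg = self.partition.log[self.offset]
            self.offset += 1  # after a failure, resume from this position
            return msg
        return None


p = Partition()
for m in ["m0", "m1", "m2"]:
    p.append(m)

c = Consumer(p)
print(c.poll(), c.poll())  # reads in partition order: m0 m1
```

Because each consumer only advances its own offset, two consumers reading different partitions see no ordering relationship between their messages.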

The overall process is shown in the following figure:
(figure: overall Kafka workflow)
So why does the concept of a partition exist? Mainly for two properties common to all distributed middleware: load balancing and horizontal scaling. A Topic is only a logical concept, oriented toward producers and consumers, while a Partition is a physical concept. Imagine a Topic that is not partitioned: all of its messages would live on a single broker, and every read and write request for that Topic would be handled by that one broker, so throughput would quickly hit a bottleneck. That clearly does not fit high-throughput scenarios, so Topics must be partitioned to distribute traffic across different servers.
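How a message is routed to one of those partitions can be sketched as a simple key-based partitioner. This is a hypothetical simplification: Kafka's Java client actually uses murmur2 hashing, and crc32 is used here only so the example stays in the standard library.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Messages with the same key always land in the same partition,
    while different keys spread load across partitions (and brokers).
    crc32 stands in for Kafka's real murmur2 hash."""
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3
for key in [b"user-1", b"user-2", b"user-1"]:
    print(key, "->", choose_partition(key, NUM_PARTITIONS))
# "user-1" maps to the same partition both times, preserving
# per-key ordering while balancing load across partitions.
```

This is also why Kafka can only guarantee per-partition ordering: only messages with the same key are forced onto the same log.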

Kafka file storage mechanism

Of course, even within a single Partition, if the volume of messages grows too large the log risks becoming unmanageable, so messages need to be cleaned up periodically, naturally starting from the oldest. If a Partition were a single file, cleanup would mean rewriting that entire file, which would seriously complicate the maintenance of message files and the deletion of already-consumed messages. Kafka therefore subdivides each Partition physically into segments. Each partition (a directory on disk) is in effect one huge log spread evenly across multiple equal-sized segment data files (though the number of messages in each segment file is not necessarily equal). This makes it easy to delete old segments, simplifies the cleanup of consumed messages, and improves disk utilization, while each partition still only needs to support sequential reads and writes.
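The benefit of segments for cleanup can be sketched directly: retention becomes "drop whole files from the head" instead of rewriting one giant log. The file names below follow the naming convention discussed later in this post; the retention function itself is an illustration, not Kafka's code.

```python
from collections import deque

# (file name, base offset) pairs, oldest segment first
segments = deque([
    ("00000000000000000000.log", 0),
    ("00000000000000170410.log", 170410),
    ("00000000000000239430.log", 239430),
])


def expire_before(min_offset):
    """Delete whole old segments: a segment can go as soon as the NEXT
    segment's base offset is still at or below the retention boundary,
    meaning the dropped file contains only expired messages."""
    while len(segments) > 1 and segments[1][1] <= min_offset:
        segments.popleft()  # O(1) file deletion, no rewriting


expire_before(200000)
print([name for name, _ in segments])
# the segment starting at 0 is gone; 170410 survives because it may
# still contain messages newer than offset 200000's boundary segment
```

Deleting a file is a cheap, atomic operation for the filesystem, which is exactly why segment-based retention keeps disk maintenance simple.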

Partition&Segment

Next, let's send some messages and see what they look like in physical storage. I sent three messages: hello, dashuaige, and hhh. Opening the storage directory on the 103 machine, you can see a pair of index and log files, which together make up one segment.

A segment consists of two files: an ".index" file and a ".log" file, which are the segment's index file and data file respectively. The naming rule for these files is: the first segment of the partition starts from 0, and each subsequent segment file is named after the offset of the last message in the previous segment file. The value is 64 bits, rendered as a 20-digit number padded with leading zeros. Since I only have a few messages here, the name starts from 0. Opening the log file, you can see:
Although the contents look garbled (the file is binary), you can vaguely make out the messages that were sent in sequence. The overall storage architecture is as follows:
(figure: partition directory containing segment .index/.log file pairs)

Segment storage structure

From the on-disk segment files above, we have basically figured out the structure of a segment. Of course, my example has only a single segment, which does not show much, so here is an example from the Internet with a larger number of files:

// first segment, starting at offset 0
00000000000000000000.index
00000000000000000000.log
// second segment, starting at offset 170410
00000000000000170410.index
00000000000000170410.log
// third segment, starting at offset 239430
00000000000000239430.index
00000000000000239430.log
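The 20-digit, zero-padded naming rule can be reproduced in a couple of lines (the helper name is made up for illustration):

```python
def segment_files(base_offset: int):
    """Render a 64-bit base offset as a 20-digit, zero-padded name,
    yielding the .index/.log file pair for one segment."""
    name = f"{base_offset:020d}"
    return [name + ".index", name + ".log"]


print(segment_files(170410))
# ['00000000000000170410.index', '00000000000000170410.log']
```

Because the names sort lexicographically in offset order, a plain directory listing already gives the segments in log order, which is what makes the binary search described below possible.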

Taking the segment 00000000000000170410 from the listing above as an example, the correspondence between its ".index" file and its ".log" file is shown in the figure below:
(figure: index-to-log correspondence for segment 00000000000000170410)
As shown above, the ".index" file stores a large amount of metadata, while the ".log" file stores the actual messages; each metadata entry in the index points to the physical position of a message in the corresponding data file. Take the metadata entry [3, 348] in the ".index" file as an example: it refers to the 3rd message in this segment's ".log" file, i.e. message 170410 + 3 = 170413 in the global partition, whose physical address in the file is byte 348. Note that this physical address is not the offset; the global offset here is 170413.
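The arithmetic of that index entry can be checked directly (values taken from the example above):

```python
# An index entry is (relative offset within segment, physical byte position).
BASE = 170410            # segment base offset, from the file name
index_entry = (3, 348)   # 3rd message in this segment, at byte 348 of the .log

global_offset = BASE + index_entry[0]
physical_position = index_entry[1]

print(global_offset, physical_position)
# 170413 348 -- 348 is a byte address in the file, NOT an offset
```

Keeping only the relative offset in the index (rather than the full 64-bit global offset) is what lets the entries stay small.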

Quickly locating a message in a partition

Since messages within a Partition are divided into segments, how do we quickly locate a message and read it precisely? Taking the figure above as an example, suppose we want to read the message with offset = 170418:

  • First, find the segment file. 00000000000000000000.index is the first file; the second is 00000000000000170410.index (its starting offset is 170410 + 1 = 170411); the third is 00000000000000239430.index (starting offset 239430 + 1 = 239431). So offset = 170418 falls into the second file. Subsequent files follow the same pattern, named and ordered by their offsets, so the right file can be located quickly with a binary search.
  • Second, within 00000000000000170410.index, compute 170418 - 170410 = 8: this is the 8th message in the segment. A binary search over the index then yields the [message offset, physical offset] coordinate [8, 1325], which points to position 1325 in the 00000000000000170410.log file, where reading begins.
  • After seeking to position 1325, read messages sequentially. But how do we know where this message ends? The message has a fixed physical structure, including: offset (8 bytes), message size (4 bytes), crc32 (4 bytes), magic (1 byte), attributes (1 byte), key length (4 bytes), key (K bytes), payload (N bytes), and other fields. These determine the size of the message, i.e. where reading ends.
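The two-step lookup above can be sketched with the standard-library `bisect` module, using the segment names and the two index entries given in the text (the index is hardcoded here; a real implementation would read it from the `.index` file):

```python
import bisect

# base offsets recovered from the .index/.log file names
segment_bases = [0, 170410, 239430]

# sparse index of segment 170410: (relative offset, physical byte position);
# only the two entries mentioned in the text are included
index_170410 = [(3, 348), (8, 1325)]


def locate(offset):
    """Return (segment base offset, byte position to start scanning from)."""
    # step 1: binary-search the segment list for the largest base <= offset
    seg = segment_bases[bisect.bisect_right(segment_bases, offset) - 1]
    rel = offset - seg
    # step 2: binary-search the segment's index for the last entry whose
    # relative offset is <= rel, then scan the .log forward from that byte
    entries = index_170410  # assumption: we only modeled this one segment
    i = bisect.bisect_right([r for r, _ in entries], rel) - 1
    return seg, entries[i][1]


print(locate(170418))
# (170410, 1325): segment 170410, start scanning the .log at byte 1325
```

The fixed message header described above is then what tells the scanner how many bytes each message occupies, so it can step forward record by record until it reaches the target offset.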

That is the detailed procedure for locating a message. Thanks to the index, Kafka can quickly find any given message even though it only ever writes to disk sequentially.

This post has explained Kafka's workflow and storage mechanism. In fact, most of Kafka's strategies show up on the producer and consumer side; the next two posts will introduce the producer strategy and the consumer strategy respectively.

Part of the content references https://gitbook.cn/books/5ae1e77197c22f130e67ec4e/index.html


Origin blog.csdn.net/sinat_33087001/article/details/108335097