Kafka internal structure

It took me three days and two nights, but I finally understand Kafka's internal structure. The many articles I read before never made it clear, and I could not form a clear picture of the overall composition in my mind...

1. Introduction to Kafka

Apache Kafka was originally a distributed messaging system open sourced by LinkedIn. It is now a top-level Apache project and has become one of the most widely used messaging systems in the open source world. The Kafka community is very active. Since version 0.9, Kafka's slogan has changed from "a high-throughput, distributed messaging system" to "a distributed streaming platform".

Kafka differs from traditional messaging systems in:

  • Kafka is a distributed system that is easy to scale out.
  • It provides high throughput for both publishing and subscribing.
  • It supports multiple subscribers and automatically rebalances consumers when one fails.
  • It persists messages to disk (durability).

2. Kafka core composition

The main components are the Producer, Broker, Topic, Partition, Consumer Group, and Consumer.

In a Kafka deployment there are multiple Producers, multiple Brokers, and multiple Consumers. Each Producer can write to multiple Topics, and each Consumer belongs to exactly one Consumer Group.

| Component | Description |
| --- | --- |
| Producer | Message producer: the client that sends messages to a Broker |
| Consumer | Message consumer: the client that reads messages from a Broker |
| Broker | Intermediate message-processing node: one Kafka node is one Broker, and one or more Brokers form a Kafka cluster (an odd number of nodes is generally recommended, at least three) |
| Topic | Each message published to the Kafka cluster must specify a topic; Kafka stores messages by topic |
| Partition | A physical storage concept. A topic can be divided into multiple Partitions. Each Partition is internally ordered, but the data combined across multiple partitions is not ordered |
| Consumer Group | Each Consumer belongs to a specific Consumer Group (CG). A message can be delivered to multiple different CGs, but within one CG each partition's data is consumed by only one Consumer |

Here are more detailed explanations of some of these components:

2.0 Zookeeper

ZooKeeper (responsible for leader election, balancing, metadata records, and consumption records)

ZooKeeper interacts with the brokers and consumers in the cluster to maintain cluster metadata and high availability.

  • Records the positions (offsets) up to which consumers have consumed;
  • Performs leader election when a partition's leader fails;
  • Stores Kafka's cluster metadata.

Note: Producers do not register in ZooKeeper; consumers do. (In newer Kafka versions, consumer offsets have moved out of ZooKeeper into an internal Kafka topic, __consumer_offsets.)

The structure of Kafka's nodes in ZooKeeper is shown in a diagram (omitted here).

2.1 Partition (horizontal scaling, high concurrency)

Kafka maintains a distributed, partitioned log for each topic; at the storage level, each partition is an append-only log. Any message published to a partition is appended to the end of its log file. Each message in the partition is assigned a monotonically increasing sequence number, the offset, a long integer that uniquely identifies a message within the partition. Order is guaranteed within a partition, but not across the topic as a whole.

This illustrates what the table above says: each Partition is internally ordered, but the data combined across multiple partitions is not.
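A tiny plain-Python sketch (no Kafka client; all names here are illustrative) of why order holds inside a partition but not across the topic:

```python
# Minimal model of a topic with two partitions: each partition is an
# append-only log, and each appended message gets the next offset.
class Partition:
    def __init__(self):
        self.log = []                # append-only message log

    def append(self, msg):
        offset = len(self.log)       # monotonically increasing offset
        self.log.append(msg)
        return offset

topic = [Partition(), Partition()]

# Interleave writes across the two partitions.
for i in range(6):
    topic[i % 2].append(f"msg-{i}")

# Within one partition, messages stay in send order...
print(topic[0].log)   # ['msg-0', 'msg-2', 'msg-4']
print(topic[1].log)   # ['msg-1', 'msg-3', 'msg-5']
# ...but there is no single global order across partitions: a consumer
# reading both partitions may well see msg-1 before msg-0.
```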

When a producer sends a message:

  • If a partition is specified, that partition is used directly.
  • If no partition is specified but a key is, the key is hashed and taken modulo the number of partitions, which guarantees that messages with the same key are routed to the same partition. If you need strict ordering, you can give all messages the same key.
  • If neither a partition nor a key is specified, a partition is chosen round-robin.
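The three routing rules can be sketched as follows. This is an illustration, not Kafka's actual partitioner (Kafka hashes the key bytes with murmur2, and newer clients use sticky partitioning rather than plain round-robin for keyless messages):

```python
import itertools

NUM_PARTITIONS = 3
_round_robin = itertools.count()   # illustrative round-robin counter

def choose_partition(key=None, partition=None):
    if partition is not None:          # rule 1: explicit partition wins
        return partition
    if key is not None:                # rule 2: hash key, mod partition count
        return hash(key) % NUM_PARTITIONS
    # rule 3: neither given -> round-robin across partitions
    return next(_round_robin) % NUM_PARTITIONS

# The same key always lands on the same partition, which is how
# per-key ordering is preserved.
assert choose_partition(key="user-42") == choose_partition(key="user-42")
```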

Reasons for partitioning:

1. It makes horizontal scaling in a cluster easy. Each partition can be sized to fit the machine hosting it, and a topic can be composed of multiple partitions, so the cluster can accommodate data of any size.

2. It improves concurrency, because reads and writes happen at the partition level (concurrent requests can be spread across different partitions).

2.2 Replica mechanism (high availability)

Consider a cluster of three Brokers. Broker1 holds a TopicA, and this TopicA has two partitions: Partition0 and Partition1.

As the table above noted, a topic can have multiple Partitions.

Broker2 also holds TopicA, likewise with the two partitions Partition0 and Partition1.

In Broker1, Partition0 is the Leader and Partition1 is a Follower;
in Broker2, Partition0 is a Follower and Partition1 is the Leader.

Partition0 in Broker2 is called a replica of Partition0 in Broker1. Strictly speaking, both are replicas: Partition0 in Broker1 is the leader replica.

Question: Why place the replica of Partition0 on Broker2?

Answer: If the Leader and the Follower sat on the same machine, then when that machine went down, all the data in the partition would be gone at once. What would the replica be for?

Kafka's replication mechanism has multiple server nodes replicate the logs of topic partitions on other nodes. When a node in the cluster fails, requests to it are transferred to healthy nodes (a process usually accompanied by rebalancing). Each partition of each topic has one leader replica and zero or more follower replicas; followers keep their data in sync with the leader and take over when the leader fails.

The same partition may have multiple replicas (set via default.replication.factor=N in server.properties). Without replication, once a Broker goes down, none of the partitions on it can be consumed, and producers can no longer write to them. With replication, a leader is elected among each partition's replicas; producers and consumers interact only with the leader, while the other replicas act as followers, copying data from it.

In Kafka, not every replica is eligible to replace the leader, so the leader node maintains an ISR (In-Sync Replicas) set. Membership in this set requires two conditions:

  • The node must maintain its connection to ZooKeeper;
  • The replica must not lag too far behind the leader during synchronization.

The ISR records where all the replicas live, in a structure like [0, 2, 1], where 0, 2, 1 are Broker IDs (each ID is unique).

[0, 2, 1] means the current Leader is on the machine with ID 0, and 2 and 1 are the machines holding the other two replicas.

Note: The order is meaningful. That 2 appears before 1 means the replica on Broker 2 is more closely in sync with the leader, so when the leader dies, Broker 2's replica is elected as the new leader.
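The described failover can be modeled in a few lines. This is a deliberate simplification of Kafka's controller-driven leader election, for illustration only:

```python
# Toy model of the ISR list [0, 2, 1]: index 0 is the current leader,
# and on failure the next broker ID in the list takes over.
isr = [0, 2, 1]          # broker IDs; leader first, followers in sync order

def elect_new_leader(isr, failed_broker):
    """Drop the failed broker and promote the next in-sync replica."""
    survivors = [b for b in isr if b != failed_broker]
    if not survivors:
        raise RuntimeError("no in-sync replica available")
    return survivors[0], survivors

# Leader (broker 0) dies: broker 2, the most closely synced replica,
# becomes the new leader.
leader, isr = elect_new_leader(isr, failed_broker=0)
print(leader)   # 2
```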

Question: Kafka reports an error when the replica count is set larger than the number of Brokers. Why?

Answer: As noted above, Kafka places replicas on different Brokers. If the replica count exceeded the Broker count, at least two replicas of the same partition would land on the same Broker, which defeats the purpose of replication.

2.3 High-performance log storage

In Kafka, all messages under a topic are distributed across multiple nodes by partition. On disk, each Partition corresponds to a log directory containing multiple log segments (LogSegment). Each LogSegment consists of two files: an ".index" file (the segment's index) and a ".log" file (the data). The naming rule is: the first segment of a partition starts from offset 0, and each subsequent segment is named after the offset of its first message (its base offset); the value is 64 bits, left-padded with zeros to 20 digits. For example, suppose there are 1000 messages and each LogSegment holds 100; the figure below (omitted) showed the index and log for offsets 900-1000.
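The naming rule can be shown directly. `segment_files` below is an illustrative helper, not a Kafka API:

```python
# Kafka names each log segment after its base offset, zero-padded to
# 20 digits (room for a 64-bit offset). The first segment starts at 0.
def segment_files(base_offset):
    name = f"{base_offset:020d}"
    return f"{name}.index", f"{name}.log"

print(segment_files(0))
# ('00000000000000000000.index', '00000000000000000000.log')
print(segment_files(900))
# ('00000000000000000900.index', '00000000000000000900.log')
```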

Because Kafka's message volume is huge, indexing every message would waste space and add overhead. Kafka therefore uses a sparse index, small enough to fit in memory, which speeds up lookups.

Briefly, how a read works. Suppose we want to read message 911. Step one is to find which segment it belongs to, using binary search over the segment base offsets; that gives us the files 00000900.index and 00000900.log. Then, in the index, we binary-search for the relative offset (911 − 900) = 11, or the nearest index entry below 11; say we find the entry [10, 1367]. Using that entry's physical position 1367, we scan forward in the log file until we reach message 911.

The above describes looking up a specific offset, but most of the time we don't need to: we just read in order. For sequential reads, the operating system inserts a page cache between memory and disk (the read-ahead we usually see), so sequential reads are very fast. But Kafka has a problem: with too many partitions there are many log segments, and batched writes effectively become random writes, and random I/O badly hurts performance. So, generally speaking, a Kafka broker should not have too many partitions. RocketMQ addresses this by writing all logs to one file, turning writes into sequential writes; with certain optimizations, reads can also approach sequential reads.

Think about it: 1. Why do we need partitions at all; couldn't a topic have just one partition? 2. Why does the log need to be segmented?

1. Partitioning enables horizontal scaling. 2. If the log lived in one ever-growing file, performance would suffer and queries would slow down as it grew.

3. Consumption model

After a producer sends messages to the Kafka cluster, consumers consume them. Broadly speaking, there are two consumption models: push and pull.

In a push-based messaging system, the message broker records consumption state: after pushing a message to a consumer, it marks the message as consumed. But this gives weak processing guarantees. For example, if the consumer process crashes after the push, or never receives the message due to a network problem, yet the broker has already marked it consumed, the message is lost forever. If instead the broker waits for the consumer's acknowledgement before marking, it must track per-message state, which is costly. Moreover, with push the consumption rate is entirely controlled by the broker, so a blocked consumer causes problems.

Kafka instead adopts a pull model (poll): consumers control their own consumption rate and progress, and can consume from any offset. For example, a consumer can re-consume messages it has already processed, or jump straight to the most recent messages.
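A minimal model of the pull semantics (plain Python; `PullConsumer` is an illustrative stand-in, not the Kafka consumer API):

```python
log = [f"msg-{i}" for i in range(10)]   # one partition's log

class PullConsumer:
    """The consumer owns its offset: it decides what to read and when."""
    def __init__(self):
        self.offset = 0

    def poll(self, max_records=3):
        batch = log[self.offset:self.offset + max_records]
        self.offset += len(batch)       # progress is tracked client-side
        return batch

    def seek(self, offset):
        self.offset = offset            # replay or skip at will

c = PullConsumer()
print(c.poll())      # ['msg-0', 'msg-1', 'msg-2']
c.seek(0)            # rewind to reprocess already-consumed messages
print(c.poll())      # ['msg-0', 'msg-1', 'msg-2'] again
```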

When the producer sends data to the leader, the reliability level can be set with the request.required.acks parameter:

  • 0: The producer does not wait for any acknowledgement from the broker and immediately sends the next (batch of) messages. This gives the lowest latency but the weakest durability: data loss is likely when a server fails. For example, if the leader is dead, the producer keeps sending without knowing it, and any data the broker never received is lost.

  • 1: The producer waits until the leader has successfully received the data and acknowledged it before sending the next message. This trades some latency for better durability, but if the partition's leader dies before its followers have copied the data, that data is lost.

  • -1: The producer waits for the followers' acknowledgement as well before sending the next piece of data. Durability is the best and latency is the worst.

Across the three settings, from 0 to 1 to -1, performance decreases while reliability increases.
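A toy way to see the trade-off: count how many replicas are guaranteed to hold a message at the moment the producer receives its acknowledgement. This is a simplification (real Kafka semantics also involve settings such as min.insync.replicas), for illustration only:

```python
def replicas_with_message_at_ack(acks, num_followers=2):
    """Replicas guaranteed to hold the message when the producer is acked."""
    if acks == 0:
        return 0                      # producer did not wait: no guarantee
    if acks == 1:
        return 1                      # leader only
    if acks == -1:
        return 1 + num_followers      # leader plus all in-sync followers
    raise ValueError("acks must be 0, 1, or -1")

for level in (0, 1, -1):
    n = replicas_with_message_at_ack(level)
    # An acked message survives a leader crash only if a follower has it too.
    print(level, n, n >= 2)
# 0 0 False
# 1 1 False
# -1 3 True
```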

4. Storage Strategy

Kafka retains messages regardless of whether they have been consumed, which means messages can be consumed repeatedly.

There are two strategies to delete old data:

  1. Time-based: log.retention.hours=168 (the configuration-file default: keep one week of data)
  2. Size-based: log.retention.bytes=1073741824 (1 GB)
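A sketch of how these two checks might combine (illustrative only; the broker applies them periodically, and always deletes whole segments, oldest first):

```python
RETENTION_HOURS = 168            # log.retention.hours
RETENTION_BYTES = 1_073_741_824  # log.retention.bytes (1 GB)

def expired_segments(segments, now_hours, size_limit=RETENTION_BYTES):
    """segments: list of (last_write_hour, size_bytes), oldest first."""
    to_delete = []
    total = sum(size for _, size in segments)
    for written_hour, size in segments:
        too_old = now_hours - written_hour > RETENTION_HOURS
        too_big = total > size_limit
        if too_old or too_big:
            to_delete.append((written_hour, size))
            total -= size
        else:
            break   # segments are oldest-first; the rest are newer
    return to_delete

# A segment last written 200 hours ago is past the one-week window:
print(expired_segments([(0, 100), (190, 100)], now_hours=200))  # [(0, 100)]
```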

Note that because Kafka reads a given offset in O(1) time, independent of file size, deleting expired files does not improve Kafka's read performance; it only reclaims disk space.


Origin blog.csdn.net/RookiexiaoMu_a/article/details/105452515