How the Kafka client finds the leader partition

Under normal circumstances, each topic in Kafka has multiple partitions, and each partition has multiple replicas. Among these replicas one is elected leader and the rest act as followers, and all reads and writes for a partition go through its leader. Therefore, before we can write messages to Kafka or read messages from it, we must first find the leader of the corresponding partition and the address of the broker that hosts it, so that the subsequent operations can be carried out. This article explains how a Kafka client finds the leader of a partition.

We know that Kafka is written in Scala, yet it supports clients in many languages, including C/C++, PHP, Go, Ruby, and so on (see https://cwiki.apache.org/confluence/display/KAFKA/Clients). Why is that? It is because Kafka implements its own protocol on top of TCP; any language that speaks this protocol can be used to operate Kafka.

At present, Kafka's protocol defines roughly 30 kinds of requests. The leader lookup described in this article relies on Kafka's Metadata protocol, which mainly answers the following four questions (a short sketch after the list shows one way a client can retrieve these answers):

  • What topics exist in Kafka?

  • How many partitions does each topic have?

  • Which broker (address and port) hosts the leader of each partition?

  • What is the address and port of each broker?
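
Here is that sketch: a minimal, hypothetical example using Kafka's Java AdminClient (a newer API than the legacy classes used later in this article). The bootstrap address "localhost:9092" and the topic name "test-topic" are placeholder assumptions, and the numbered comments map back to the four questions above.

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.TopicPartitionInfo;

public class MetadataQuestions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address: any reachable broker can answer metadata queries.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 1) Which topics exist?
            System.out.println("Topics: " + admin.listTopics().names().get());

            // 4) The address and port of every broker in the cluster.
            for (Node node : admin.describeCluster().nodes().get()) {
                System.out.println("Broker " + node.id() + " -> " + node.host() + ":" + node.port());
            }

            // 2) and 3) How many partitions a topic has, and which broker leads each partition.
            TopicDescription desc = admin.describeTopics(Collections.singletonList("test-topic"))
                    .all().get().get("test-topic");
            for (TopicPartitionInfo p : desc.partitions()) {
                System.out.println("Partition " + p.partition()
                        + " leader=" + p.leader().host() + ":" + p.leader().port()
                        + " replicas=" + p.replicas()
                        + " isr=" + p.isr());
            }
        }
    }
}
```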

The client only needs to construct the corresponding request and send it to a broker in order to get the answers to these four questions (the program at the end of this article walks through exactly these steps). The whole exchange works as follows:

  • The client constructs a Metadata request;

  • The client sends the request to a broker;

  • The broker processes the request and sends the result back to the client.

At present there are five versions of the Metadata request protocol, and versions v0-v3 share the same format: the request body is essentially a list of the names of the topics to query (TopicNames). The client therefore only needs to construct a TopicMetadataRequest containing the topics it is interested in; several topics can be queried at once simply by putting all of their names into the list, and if no topic name is passed at all, Kafka returns the metadata of every topic to the client.

These early versions have one problem, though: when the broker side sets the auto.create.topics.enable parameter to true and a queried topic does not exist, Kafka automatically creates that topic, which is probably not the result we want. To address this, the fifth version (v4) of the Metadata request changes the format and adds an allow_auto_topic_creation field, so the client itself can tell Kafka whether a missing topic should be created; the control is back in our hands.
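
As an aside, newer Java consumers expose this protocol field directly through the allow.auto.create.topics setting. The sketch below is a minimal, hypothetical configuration in which the bootstrap address and group id are placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NoAutoCreateConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "metadata-demo");           // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Ask the broker not to auto-create topics referenced by this consumer's
        // metadata requests, even if auto.create.topics.enable=true on the broker.
        props.put(ConsumerConfig.ALLOW_AUTO_CREATE_TOPICS_CONFIG, "false");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.close();
    }
}
```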

After the broker receives the client's request and processes it, it constructs a TopicMetadataResponse and sends it back to the client. The response carries, for every partition, its Leader, Replicas and Isr, and it also includes the address and port of every broker in the Kafka cluster. If a problem occurs while handling the request, the response carries a corresponding error code for the affected topic or partition, for example LEADER_NOT_AVAILABLE, UNKNOWN_TOPIC_OR_PARTITION or INVALID_TOPIC_EXCEPTION.

Also, the Metadata request is currently the only request that can be sent to any broker, because every broker stores this metadata after it starts up. Moreover, the client shipped with Kafka keeps the metadata it obtains in memory, and this cached metadata is refreshed in the following situations:

  • When a request sent to Kafka comes back with a "not leader for partition" error, meaning the cached leader information is stale;

  • After the interval configured by the metadata.max.age.ms parameter has elapsed (300,000 ms, i.e. five minutes, by default).

In both of these cases, the client shipped with Kafka automatically sends another Metadata request so that it always works with reasonably fresh metadata. The refresh interval itself is just a client setting, as the sketch below shows.
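
A minimal, hypothetical producer configuration that shortens the periodic refresh, assuming a placeholder broker address and a 60-second interval:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class FrequentMetadataRefreshProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Force a full metadata refresh every 60 seconds even when no errors occur;
        // the default is 300000 ms (five minutes).
        props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "60000");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}
```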

Well, after all this theory, let's take a look at how a program constructs a TopicMetadataRequest and processes the resulting TopicMetadataResponse.
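
The sketch below is a minimal, hypothetical version of such a program built on the legacy kafka.javaapi classes; the broker address "localhost:9092" and the topic name "test-topic" are placeholders.

```java
import java.util.Collections;
import java.util.List;

import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class LeaderLookup {
    public static void main(String[] args) {
        // Any live broker can answer a Metadata request; the address here is a placeholder.
        SimpleConsumer consumer =
                new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "leaderLookup");
        try {
            List<String> topics = Collections.singletonList("test-topic");
            TopicMetadataRequest request = new TopicMetadataRequest(topics);
            TopicMetadataResponse response = consumer.send(request);

            for (TopicMetadata topicMetadata : response.topicsMetadata()) {
                for (PartitionMetadata partition : topicMetadata.partitionsMetadata()) {
                    System.out.println("Topic: " + topicMetadata.topic()
                            + ", Partition: " + partition.partitionId()
                            + ", Leader: " + partition.leader()
                            + ", Replicas: " + partition.replicas()
                            + ", Isr: " + partition.isr());
                }
            }
        } finally {
            consumer.close();
        }
    }
}
```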

The TopicMetadataRequest is sent through SimpleConsumer's send method, which returns a TopicMetadataResponse containing the information we need. Running such a program prints, for every partition of the queried topic, its leader, its ISR and its full replica list.

From this output we can see, for each partition, which broker is the leader, which replicas are in the ISR, and where all the replicas live. One thing to pay attention to: because there are several versions of the Metadata request protocol, a client may use a lower protocol version to talk to a higher-version Kafka cluster, since newer Kafka brokers still support the older versions of the protocol; the reverse does not work, however, so we cannot use a higher version of the Metadata request protocol against a lower-version Kafka cluster.
