Kafka producer acks, min.insync.replicas, and replication factor: a mechanism analysis

Kafka has three very important configuration parameters: acks, min.insync.replicas, and the replication factor. Of these, acks is a producer configuration parameter, while min.insync.replicas is a broker configuration parameter. Together, these three parameters play a big role in preventing the producer from losing data.

1. Partition replicas


1. A Kafka topic can be split into partitions, and each partition can be configured with multiple replicas. The number of replicas is controlled by the replication factor, referred to in this article as replication.factor.

2. There are two types of partition replicas in Kafka: the leader replica and the follower replica. When a partition is created, one replica is elected as the leader replica, and the remaining replicas automatically become follower replicas.

3. In Kafka, follower replicas do not serve clients by default. That is, a follower replica does not respond to read or write requests from consumers and producers. All requests are handled by the leader replica; in other words, every read and write request must be sent to the broker that hosts the leader replica, and that broker is responsible for processing it. A follower replica's only task is to asynchronously pull messages from the leader replica and write them to its own commit log, so that it stays synchronized with the leader replica.

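4. A replication factor of 3 is a common choice, in which case each partition has exactly 1 leader replica and 2 follower replicas. Note that the broker-level default, default.replication.factor, is 1, so a factor of 3 has to be configured explicitly. The details are as shown in the following figure: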

5. As mentioned above, the producer client only writes to the leader broker, and the followers copy data asynchronously. Since Kafka is a distributed system, there is always a risk that a follower cannot keep up with the leader in real time. A method is therefore needed to determine whether the followers have kept up with the leader, that is, whether they have synchronized the latest data. In other words, Kafka has to state clearly under what conditions a follower replica counts as synchronized with the leader. This is the ISR (in-sync replica) mechanism discussed below.
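
The pull-based replication described above can be sketched as a toy in-memory model. This is only an illustration, not Kafka code: the Replica and Partition classes, their method names, and the broker IDs are hypothetical, and a real follower fetches over the network in batches of bytes rather than list items.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    broker_id: int
    log: list = field(default_factory=list)  # this replica's local commit log


@dataclass
class Partition:
    leader: Replica
    followers: list

    def produce(self, message):
        # Producers write only to the leader replica.
        self.leader.log.append(message)

    def follower_fetch(self, follower, max_messages=1):
        # A follower pulls the messages it is missing from the leader,
        # starting at its own log end offset.
        start = len(follower.log)
        batch = self.leader.log[start:start + max_messages]
        follower.log.extend(batch)
        return len(batch)

    def lag(self, follower):
        # Number of messages the follower is behind the leader.
        return len(self.leader.log) - len(follower.log)


# Hypothetical brokers: 1 hosts the leader, 2 and 3 host followers.
partition = Partition(Replica(1), [Replica(2), Replica(3)])
for msg in ["m0", "m1", "m2"]:
    partition.produce(msg)

fast, slow = partition.followers
partition.follower_fetch(fast, max_messages=10)  # catches up completely
partition.follower_fetch(slow, max_messages=1)   # fetches only one message

print(partition.lag(fast), partition.lag(slow))
```

Because the followers pull on their own schedule, the two followers end up at different offsets, which is exactly why Kafka needs a rule for deciding which followers count as synchronized.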

2. In-sync replicas


1. ISR stands for in-sync replicas. The replicas in the ISR are the ones that are synchronized with the leader, so a follower that is not in the list is considered out of sync with the leader. Which replicas are in the ISR? First of all, the leader replica is always in the ISR. Whether a follower replica is in the ISR depends on whether it is "synchronized" with the leader replica.

Key point: "whether the follower replica is synchronized with the leader replica" should be understood as follows:

1. The synchronization mentioned above does not mean complete synchronization. A follower replica is not removed from the ISR list the moment it falls behind the leader replica.

2. The Kafka broker has a parameter, replica.lag.time.max.ms, which sets the maximum time a follower replica may lag behind the leader replica. This article uses the 10-second default of older releases (since Kafka 2.5 the default is 30 seconds). As long as the follower replica does not lag behind the leader replica for longer than this interval, the two are considered synchronized. Therefore, even if a follower replica is currently a few messages behind the leader replica, it will not be removed from the ISR as long as it catches up within 10 seconds.

3. If a follower replica is removed from the ISR list, it is added back once it catches up with the leader replica. The ISR is therefore a dynamic list, not a static one.

2. As shown in the figure above: the partition1 replica on Broker3 has not synchronized with the leader replica within the specified time, so it is removed from the ISR list. The ISR at this time is [1,3].
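
The time-based membership rule can be sketched as follows. This is a simplified model under stated assumptions: the function current_isr, the "last caught up" timestamps, and the broker IDs 101 to 103 are hypothetical and chosen only for illustration, and the 10-second threshold is the value used in the text above.

```python
REPLICA_LAG_TIME_MAX_MS = 10_000  # the 10-second value used in the text


def current_isr(leader_id, last_caught_up_ms, now_ms,
                max_lag_ms=REPLICA_LAG_TIME_MAX_MS):
    """Return the sorted ISR: the leader plus every follower that was
    fully caught up with the leader within the last max_lag_ms."""
    isr = {leader_id}  # the leader is always in the ISR
    for broker_id, caught_up_at in last_caught_up_ms.items():
        if now_ms - caught_up_at <= max_lag_ms:
            isr.add(broker_id)
    return sorted(isr)


# Hypothetical broker IDs: 101 is the leader, 102 and 103 are followers.
# Follower 103 last caught up 15 seconds ago, so it drops out of the ISR.
before = current_isr(101, {102: 99_000, 103: 85_000}, now_ms=100_000)

# A few seconds later 103 has caught up again and rejoins the ISR.
after = current_isr(101, {102: 104_500, 103: 104_800}, now_ms=105_000)

print(before, after)
```

Recomputing the list from timestamps on every call mirrors the point that the ISR is dynamic: a follower leaves when it lags too long and returns once it catches up.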

3. acks confirmation mechanism


1. The acks parameter specifies how many partition replicas must receive the message before the producer considers the message to be successfully written. This parameter plays an important role in whether the message is lost. The configuration of this parameter is as follows:

  • acks=0 means that the producer does not wait for any response from the server and treats the message as written as soon as it is sent. If a problem occurs and the server does not receive the message, the producer has no way of knowing, and the message is lost. Because this setting never waits for a server response, messages can be sent at the maximum speed the network supports, which gives very high throughput.

  • acks=1 means that as soon as the leader replica of the partition receives the message, it sends a successful response (ack) to the producer. Once the producer receives the ack, it considers the message written successfully. If the message cannot be written to the leader replica (for example, because of a network problem or a leader node crash), the producer receives an error response and, to avoid data loss, resends the data. The throughput of this mode depends on whether asynchronous or synchronous sending is used.

    Key point: data loss can still occur with acks=1, even with resending. For example, if the leader crashes after acknowledging a message but before the followers have copied it, and a node that has not received the message becomes the new leader, the message is lost.

  • acks=all means that the producer receives a response from the server only when all replicas participating in replication (the replicas in the ISR list) have received the message. This mode gives the strongest guarantee and is the safest, since it can ensure that more than one broker has received the message. The latency in this mode is higher.
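
The three settings can be summarized in a small sketch of how many replicas must hold a message before the producer sees a success response. The function names replicas_required and lost_after_leader_crash are hypothetical helpers for this illustration; the only Kafka fact relied on beyond the text is that acks=-1 is an alias for acks=all.

```python
def replicas_required(acks, isr):
    """How many replicas must hold the message before the producer
    gets a success response, under the rules described above."""
    if acks == "0":
        return 0          # fire and forget: no response is awaited
    if acks == "1":
        return 1          # the leader replica alone
    if acks in ("all", "-1"):
        return len(isr)   # every replica currently in the ISR
    raise ValueError(f"unsupported acks value: {acks!r}")


def lost_after_leader_crash(replicas_with_message, new_leader_id):
    """A message is lost if the newly elected leader never received it."""
    return new_leader_id not in replicas_with_message


isr = [1, 2, 3]
print(replicas_required("0", isr),
      replicas_required("1", isr),
      replicas_required("all", isr))

# acks=1: leader 1 acknowledged the message and then crashed before the
# followers copied it; follower 2 takes over without the message.
print(lost_after_leader_crash(replicas_with_message={1}, new_leader_id=2))
```

The second helper shows why acks=1 is not fully safe: the acknowledgement only proves that the old leader had the message, not that any surviving replica does.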

4. Minimum in-sync replicas


1. As mentioned above, when acks=all, all in-sync replicas must have the message before a successful response is sent to the producer. There is a problem here: what happens if the leader replica is the only in-sync replica? In that case acks=all is equivalent to acks=1, so it is unsafe.

2. The Kafka broker provides a parameter, min.insync.replicas, for this purpose. It controls the minimum number of replicas a message must be written to before it is considered "really written", and it takes effect when acks=all. The default value is 1. Setting it to a value greater than 1 in production improves message durability: if the number of in-sync replicas is lower than the configured value, the producer receives an error response instead of an ack, so a message is never acknowledged while too few replicas hold it.
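
A minimal sketch of this check, assuming the simplified rule stated above. The exception class NotEnoughReplicas is a local stand-in for the error the producer receives, and check_min_isr and tolerated_failures are hypothetical helper names. The second helper works out the arithmetic that follows from the rule: with replication factor N and min.insync.replicas M, up to N - M replicas can drop out of the ISR while acks=all writes are still accepted.

```python
class NotEnoughReplicas(Exception):
    """Local stand-in for the error response the producer receives."""


def check_min_isr(acks, isr, min_insync_replicas):
    # The check applies to acks=all only; acks=0 and acks=1 bypass it.
    if acks == "all" and len(isr) < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {len(isr)} is below "
            f"min.insync.replicas={min_insync_replicas}"
        )


def tolerated_failures(replication_factor, min_insync_replicas):
    # Replicas that may leave the ISR while acks=all writes still succeed.
    return replication_factor - min_insync_replicas


check_min_isr("all", isr=[1, 2], min_insync_replicas=2)   # accepted
try:
    check_min_isr("all", isr=[1], min_insync_replicas=2)  # rejected
except NotEnoughReplicas as err:
    print("write rejected:", err)

for min_isr in (1, 2, 3):
    print(min_isr, tolerated_failures(3, min_isr))
```

With a replication factor of 3, min.insync.replicas=2 tolerates one lagging or failed replica, whereas min.insync.replicas=3 tolerates none, so any single replica falling out of the ISR blocks acks=all writes.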

4.1 Case 1

1. As shown in the figure below, when min.insync.replicas=2 and acks=all, and the ISR list is only [1,2] because replica 3 has been removed from it, the producer receives a successful response as soon as the two in-sync replicas have the message.

4.2 Case 2

1. As shown in the figure below, when min.insync.replicas=2 and the ISR list is only [1] because replicas 2 and 3 have been removed from it, data cannot be written with acks=all. With acks=0 or acks=1, data can still be written.

4.3 Case 3

1. This case is easy to misunderstand. If acks=all and min.insync.replicas=2, and the ISR list is [1,2,3], the leader still waits until all three in-sync replicas have the message before sending a successful ack to the producer. min.insync.replicas=2 is only a lower limit: if the number of in-sync replicas falls below it, an exception is thrown. acks=all, on the other hand, requires every replica in the ISR list to have the message before a successful response is sent. As shown in the figure below:
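
The three cases can be replayed in one small function. This is a sketch of the rules as described in this section, not broker code; the function produce and its return shape (accepted, replicas_awaited) are assumptions made for the illustration.

```python
def produce(acks, isr, min_insync_replicas):
    """Return (accepted, replicas_awaited) for a single write."""
    if acks == "all":
        if len(isr) < min_insync_replicas:
            return (False, 0)    # rejected: the ISR is below the minimum
        return (True, len(isr))  # waits for the whole ISR, not just the minimum
    if acks == "1":
        return (True, 1)         # leader only; min.insync.replicas is ignored
    if acks == "0":
        return (True, 0)         # no acknowledgement is awaited
    raise ValueError(f"unsupported acks value: {acks!r}")


case_1 = produce("all", isr=[1, 2], min_insync_replicas=2)
case_2_all = produce("all", isr=[1], min_insync_replicas=2)
case_2_one = produce("1", isr=[1], min_insync_replicas=2)
case_3 = produce("all", isr=[1, 2, 3], min_insync_replicas=2)

print(case_1, case_2_all, case_2_one, case_3)
```

Case 3 is the one to watch: the write waits for three replicas even though the configured minimum is two, because the minimum only decides whether the write is accepted at all.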

5. Summary


  • acks=0: the producer does not wait for any response from the server and treats the message as written as soon as it is sent.

  • acks=1: as soon as the leader replica of the partition receives the message, it sends a successful ack to the producer.

  • acks=all: the producer receives a response from the server only when all replicas participating in replication (the replicas in the ISR list) have received the message. If the number of in-sync replicas in the ISR is less than the value of min.insync.replicas, the message will not be written.

Original address: Analysis of Kafka producer ack mechanism

Origin blog.csdn.net/qq_38263083/article/details/133121727