Kafka Advanced Series: Message Anomalies (Loss and Duplication)

Message loss

Messages lost by the producer

When will the message be lost?

Scenario 1: the message size exceeds the broker's message.max.bytes; the broker returns an error immediately.

Scenario 2: after the producer sends a message, the network is suddenly interrupted, so the message never reaches the broker.

Scenario 3: the message format is wrong (this scenario should be ruled out by self-testing during development).

Solutions

Limit the message size (corresponds to scenario 1)

The producer's max.request.size caps the size of a single request the producer sends; since one request can batch several messages, it effectively also caps the total size of all messages in that request. This value must be no larger than the broker's message.max.bytes.
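To make this concrete, a minimal sketch in Java (the 1 MB value and the bootstrap address are illustrative assumptions; the props object is reused in the sketches below):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address
    // Producer-side cap on one request; keep it below the broker's message.max.bytes.
    props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1048576); // 1 MB, illustrative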

Send asynchronously and provide a callback (corresponds to scenario 2)

Conclusion: use the send API with a callback and handle failures there (for example, store the failed message in another medium and compensate later).

Details:

In production we use the Kafka producer's asynchronous send, which offers two APIs:

  • producer.send(msg): without a callback
  • producer.send(msg, callback): with a callback
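A minimal sketch of the callback variant; the topic name, serializers, and the compensation action are assumptions, not prescriptions:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.*;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class CallbackSendDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                ProducerRecord<String, String> msg =
                        new ProducerRecord<>("demo-topic", "key", "value"); // assumed topic
                producer.send(msg, (metadata, exception) -> {
                    if (exception != null) {
                        // Send failed: compensate here, e.g. store the message elsewhere for replay.
                        System.err.println("send failed: " + exception.getMessage());
                    } else {
                        System.out.printf("sent to %s-%d@%d%n",
                                metadata.topic(), metadata.partition(), metadata.offset());
                    }
                });
            }
        }
    }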

Message retries (corresponds to scenario 2)

  • Set the retry count: retries //Experience value: 3
  • Set the retry interval: retry.backoff.ms //Experience value: 20000 (the default is 100 ms)
    Give the network time to recover before the next attempt. If retries fire too close together, they can all be used up before a short outage ends, wasting the retry budget.
    Note that retries can change message order. To keep ordering, set max.in.flight.requests.per.connection=1 (this limits the number of unacknowledged requests the client may have on a single connection; with a value of 1, the client sends no further requests to a broker until the outstanding request has been answered).
  • Set the reconnection interval: reconnect.backoff.ms //Experience value: 20000
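Expressed as producer properties, a sketch using the experience values above (extending the props object from the earlier sketch):

    props.put(ProducerConfig.RETRIES_CONFIG, 3);                  // retry count
    props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 20000);     // wait between retries
    props.put(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, 20000); // wait before reconnecting
    // Preserve ordering while retrying: at most one unacknowledged request per connection.
    props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);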

Acknowledgements (corresponds to scenario 2)

Conclusion

Set acks = all

Details

The acks parameter has three possible values: 0, 1, and -1 (all).

acks is a producer parameter; it defines when the producer considers a message "committed".

  • 0: the producer considers the send successful as soon as the message leaves the client.
    Possible data loss: the network goes down, or the broker fails to persist the message.
  • 1: the send is reported successful once the leader replica on the server has written the message.
    Possible data loss: the leader replica acknowledges the write, but its broker fails before any follower replica has synchronized the data; the data is then lost along with the leader.
  • all: a success response is returned only after all in-sync replicas have written the message.
    Suppose a partition has three replicas (one leader, two followers). acks=-1 means the message is written to the leader and both followers have synchronized it before the producer receives a success response. To ensure no loss, set acks to -1 and also make sure there is more than one replica (see min.insync.replicas below).
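As a sketch, the conclusion above is a single producer property (again extending props; it only pays off together with the broker-side replica settings covered below):

    // "all" is equivalent to -1: wait until every in-sync replica has written the message.
    props.put(ProducerConfig.ACKS_CONFIG, "all");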

Send delay (corresponds to scenario 2)

Conclusion

Set linger.ms //Experience value: 50. Default: 0

Details

This setting delays sending so the producer can batch messages into fewer, larger requests, which to some extent reduces the chance that a transient network problem affects an individual send.
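Sketched as properties (the batch size shown is illustrative; linger.ms works together with batch.size, and whichever limit is hit first triggers the send):

    props.put(ProducerConfig.LINGER_MS_CONFIG, 50);      // wait up to 50 ms to fill a batch
    props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);  // 16 KB per batch, illustrative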

For details, please refer to: https://kafka.apachecn.org/documentation.html

Messages lost by the broker

When will the message be lost?

Scenario 1: the broker hosting the leader replica suddenly goes down and a new leader must be elected from the follower replicas; any leader data that the followers had not yet synchronized is lost.

Background: Kafka uses a multi-replica (Replica) mechanism per partition (Partition). Among a partition's replicas, one is the leader and the rest are followers. Messages are sent to the leader replica, and the follower replicas pull from the leader to stay in sync. Producers and consumers interact only with the leader; the other replicas are copies of it whose sole purpose is to keep the stored messages safe.

Solutions

Eligibility for leader election

Conclusion

Set unclean.leader.election.enable to false

Details

This parameter controls which follower replicas are eligible to run for leader after the leader replica fails.

Set it to false so that a follower replica that has fallen too far behind the leader cannot take part in the election (and thus cannot become a leader that is missing data).

Replica count

Conclusion

Set replication.factor to a value greater than 1 //Experience value: 3

Details

This parameter sets the number of replicas per partition. To keep data from being lost, it must be greater than 1.

Minimum number of in-sync replicas

Conclusion

Set min.insync.replicas to a value greater than 1, and keep it smaller than replication.factor

Details

  • min.insync.replicas greater than 1 must be used together with the producer's acks parameter. With acks=all, the broker returns success only after the message has been written to enough replicas.
    min.insync.replicas controls how many replicas a message must be written to before it counts as "committed". If min.insync.replicas=1, a single replica is enough, and that replica is the leader: even with acks=all, the message may in fact reach only the leader replica before a success response is returned.
  • Keep min.insync.replicas smaller than replication.factor.
    To keep the whole Kafka service highly available, ensure replication.factor > min.insync.replicas. Why? If the two are equal, losing just one replica makes the whole partition stop working, which plainly defeats high availability. The usual recommendation is replication.factor = min.insync.replicas + 1.
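To tie the three broker-side settings together, a sketch that creates a topic with these values through the Java AdminClient (the topic name, partition count, and bootstrap address are assumptions; unclean.leader.election.enable and min.insync.replicas can equally be set cluster-wide in server.properties):

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class SafeTopicDemo {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
            try (AdminClient admin = AdminClient.create(props)) {
                NewTopic topic = new NewTopic("safe-topic", 3, (short) 3) // replication.factor = 3
                        .configs(Map.of(
                                "min.insync.replicas", "2",                  // > 1 and < replication.factor
                                "unclean.leader.election.enable", "false")); // lagging followers stay ineligible
                admin.createTopics(Collections.singletonList(topic)).all().get();
            }
        }
    }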

Messages lost by the consumer

When will the message be lost?

Scenario 1: the offset is committed first and the message processed afterwards; if processing then fails, the message is lost.

(If the message is processed first and the offset committed afterwards, a failed commit causes the message to be repeated instead. Duplicates only require idempotent handling and are not the focus of this part.)

Solutions

Commit the offset manually

Conclusion

Set enable.auto.commit to false and commit manually after the message has been processed. There are two ways to commit:

  1. Asynchronous commit
    consumer.commitAsync(), or consumer.commitAsync(OffsetCommitCallback) to be notified of the result
  2. Synchronous commit
    consumer.commitSync()

Details

A synchronous commit blocks and keeps retrying until the broker confirms success; an asynchronous commit does not wait for the response, which improves throughput.
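A minimal process-then-commit sketch; the topic, group id, and handleRecord stand-in are assumptions:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ManualCommitDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // assumed
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // the key setting
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("demo-topic")); // assumed topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        handleRecord(record); // process first ...
                    }
                    consumer.commitSync();    // ... commit only after processing succeeded
                }
            }
        }

        private static void handleRecord(ConsumerRecord<String, String> record) {
            System.out.printf("%s@%d: %s%n", record.topic(), record.offset(), record.value());
        }
    }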

Zero message loss configuration

If all of the above is in place, message loss can in practice be reduced to zero; many write-ups report that with this configuration, production runs without losing messages.

The following situation can still lose messages, but it can usually be ignored:

Kafka first writes data to the operating system's page cache. Even with the configuration above, data that has been "written successfully" but not yet flushed to disk is lost if the whole cluster loses power at that moment. Configuring an immediate flush to disk would prevent this, but it reduces throughput, so it is generally not done.

Duplicate messages

Producer-side duplicates

When do duplicates occur?

Scenario 1: the producer does not receive the broker's success response and retries the send.

Detailed explanation: the producer sends a message and the broker persists it, but due to network problems the producer receives a failure response or the connection is interrupted. The producer then sees a recoverable exception and retries the message, so the same message is stored twice.

Solutions

Enable Kafka's idempotent producer

Conclusion

Set enable.idempotence=true. This requires acks=all and retries > 0, both of which are then enabled by default.

Details

Kafka 0.11.0.0 introduced the idempotent producer, which gives the producer idempotence out of the box.

Each producer has a unique id, and every message it sends carries a sequence number that is incremented by one each time a message is persisted. The broker simply compares an incoming message's sequence number with the highest one already stored: if it is greater, the data has not been persisted yet and is accepted normally; if not, the message is a retransmission of data already on disk, and the broker rejects it to avoid the duplicate.
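Sketched as a producer property (extending the props object from the producer sketches above):

    // Idempotent producer: the client then requires acks=all and retries > 0.
    props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);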

Limitations

During initialization, Kafka generates a unique ID for the producer, called the Producer ID or PID. The PID and the sequence number are bundled with each message sent to the broker. Since sequence numbers start at zero and increase monotonically, the broker accepts a message only when its sequence number is exactly one greater than that of the last committed message for the same PID/TopicPartition pair; otherwise it assumes the producer is resending and rejects the message. Note that this only de-duplicates retransmissions within a single producer session and a single partition.

Consumer-side duplicates

When do duplicates occur?

Scenario 1: the consumer uses auto-commit. The message is processed successfully, but the offset commit fails because of a network problem; after a restart or rebalance, the message is delivered and processed again.

Scenario 2: the consumer uses manual commit. It finishes processing a message but crashes before the offset is committed; after restart, the message is delivered again.

Solution 1

Handle it in the business code.

Conclusion

Use manual commit mode, and make the processing itself idempotent.

Details

There are two common ways to achieve idempotence:

(1) Store a unique key in a third-party medium (a database or a cache). Before handling a message, check whether that unique key already exists there.

(2) Store a version number (the offset) with the data and use it as an optimistic lock: an update is applied only when the incoming version number is greater than the stored one.
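A minimal sketch of approach (1). The in-memory set is only a stand-in for a real third-party medium such as a database unique index or a cache; the class and method names are assumptions:

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class IdempotentHandler {
        // Stand-in for a durable store (DB unique constraint, Redis SETNX, ...).
        private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();

        /** Processes the payload only if its unique key has not been seen before. */
        public void handle(String uniqueKey, String payload) {
            if (!processedKeys.add(uniqueKey)) {
                return; // duplicate delivery: already processed, skip
            }
            process(payload);
        }

        private void process(String payload) {
            System.out.println("processing " + payload);
        }
    }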

Solution 2

Use Kafka's stream-processing engine, Kafka Streams (this approach is rarely used for this purpose).

Conclusion

Set processing.guarantee=exactly_once to get exactly-once processing.
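A configuration sketch (application id and bootstrap address are assumptions; newer Kafka versions prefer the value exactly_once_v2):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties streamsProps = new Properties();
    streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "demo-streams-app");  // assumed
    streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
    streamsProps.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);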

Source: blog.csdn.net/feiying0canglang/article/details/113886464