Kafka producer idempotence and transactions

Idempotence

Introduction
Take HTTP as an example: sending the same request once or several times produces the same result (setting aside failures such as network timeouts). In other words, performing an operation multiple times has the same effect as performing it once.

If a system is not idempotent, repeated submissions can cause harm. For example, if a user clicks the Submit Order button several times in the browser, the backend creates several identical orders.
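
As a minimal sketch of the order example, a backend could make submission idempotent by having the client attach a unique request ID to each form, so that repeated clicks map to the same order. The names `OrderService`, `submit`, and `requestId` are hypothetical and only illustrate the idea; they are not part of Kafka.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory order service, for illustration only.
class OrderService {
    // Maps each client-supplied request ID to the order created for it.
    private final Map<String, Integer> orderIdByRequestId = new HashMap<>();
    private int nextOrderId = 1;

    // Returns the order ID for this request. A repeated request ID
    // returns the existing order instead of creating a new one.
    synchronized int submit(String requestId) {
        Integer existing = orderIdByRequestId.get(requestId);
        if (existing != null) {
            return existing;
        }
        int orderId = nextOrderId++;
        orderIdByRequestId.put(requestId, orderId);
        return orderId;
    }

    // Number of distinct orders created so far.
    synchronized int orderCount() {
        return orderIdByRequestId.size();
    }
}
```

Calling `submit("req-1")` twice returns the same order ID and leaves a single order in the map, which is the behavior the user expects from a double click.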

Kafka producer idempotence

  • Duplicate messages

When the Kafka producer sends a message to a partition, by default the partition saves the message and returns an ack to the producer, indicating whether the send succeeded. If the ack fails to reach the producer (for example, it is lost or times out), the producer retries and sends the same message again. Kafka then saves a second, identical copy of the data.

  • Preventing duplicate messages: enable Kafka's idempotence
  1. When idempotence is enabled, the producer is assigned a PID (a unique producer ID), and each message it sends is given a sequence number (an increasing counter).
  2. The PID and sequence number are sent along with each message.
  3. When the partition receives a message, it saves the PID and sequence number together with it.
  4. If the ack fails and the producer sends the message again, the partition uses the PID and sequence number to decide whether the message needs to be saved again.
  5. The check: if the sequence number sent by the producer is less than or equal to the latest sequence number the partition has saved for that PID, the message is a duplicate and is not saved; otherwise it is saved.
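
The check in step 5 can be sketched as a small in-memory model of one partition. This is a simplified illustration under stated assumptions, not the broker's actual implementation: the class `PartitionLogSketch` and its methods are hypothetical, and the model tracks only the latest sequence number per PID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a single partition, for illustration only.
class PartitionLogSketch {
    private final List<String> log = new ArrayList<>();
    // Latest sequence number saved so far for each producer ID.
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    // Returns true if the message was appended, false if it was a duplicate.
    boolean append(long pid, int seq, String message) {
        Integer lastSeq = lastSeqByPid.get(pid);
        if (lastSeq != null && seq <= lastSeq) {
            // Already saved: the retry is dropped instead of stored twice.
            return false;
        }
        log.add(message);
        lastSeqByPid.put(pid, seq);
        return true;
    }

    // Number of messages actually stored in the partition.
    int size() {
        return log.size();
    }
}
```

In this model, a retry of sequence number 1 from the same PID is not stored a second time, while a message with sequence number 0 from a different PID is accepted. A real broker is stricter than this sketch: it also rejects a sequence number that skips ahead of the expected next value.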

Configure idempotence
props.put("enable.idempotence",true);
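
For context, here is a minimal sketch of a fuller producer configuration around that setting. The helper class `IdempotentProducerConfig`, the broker address passed in, and the choice of string serializers are assumptions made for illustration; only the configuration keys themselves come from Kafka.

```java
import java.util.Properties;

// Hypothetical helper that assembles producer settings, for illustration only.
class IdempotentProducerConfig {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Broker address is an assumption supplied by the caller, e.g. "localhost:9092".
        props.put("bootstrap.servers", bootstrapServers);
        // String keys and values are assumed here for simplicity.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Turn on idempotence so that retries do not produce duplicates.
        props.put("enable.idempotence", "true");
        // Idempotence requires acknowledgement from all in-sync replicas.
        props.put("acks", "all");
        return props;
    }
}
```

The resulting `Properties` object could then be passed to `new KafkaProducer<String, String>(props)` from the kafka-clients library.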

Principle of idempotence

To make producers idempotent, Kafka introduces two concepts: the Producer ID (PID) and the Sequence Number.

  • PID: Each producer is assigned a unique PID when it is initialized. The PID is handled internally and is invisible to the user.
  • Sequence Number: For each producer (that is, each PID), the messages sent to a given topic partition carry a sequence number that starts at 0 and increases monotonically.
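
The producer side of the two concepts above can be sketched as a counter kept per topic partition. This is an illustrative model only: the class `SequenceAssignerSketch` is hypothetical, the PID is passed in by hand rather than assigned by a broker, and the counter is assumed to advance by one per message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of producer-side sequence numbering, for illustration only.
class SequenceAssignerSketch {
    private final long pid;
    // Next sequence number to use for each "topic-partition" key.
    private final Map<String, Integer> nextSeqByPartition = new HashMap<>();

    SequenceAssignerSketch(long pid) {
        this.pid = pid;
    }

    long pid() {
        return pid;
    }

    // Returns the sequence number for the next message sent to the given
    // topic partition. Each partition has its own counter, starting at 0.
    int nextSequence(String topic, int partition) {
        String key = topic + "-" + partition;
        int seq = nextSeqByPartition.getOrDefault(key, 0);
        nextSeqByPartition.put(key, seq + 1);
        return seq;
    }
}
```

Because the counters are independent, the first message to partition 1 of a topic gets sequence number 0 even after several messages have already gone to partition 0.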

Origin blog.csdn.net/zh2475855601/article/details/115317385