Kafka notes: reliability, idempotence, and transactions

Category: Message Queuing | Tags: Kafka

I have been very busy lately, but my self-imposed requirement is at least one article a week, so take these notes as an appetizer; I expect to have the source-code analysis written within the next couple of days. Keep pushing yourself, even if no one is watching.

Reliability #

How to ensure that messages are not lost #

Kafka provides a limited durability guarantee, and only for committed messages.

Committed message:
once a certain number of brokers have successfully received a message and written it to their log files, they tell the producer program that the message has been successfully committed.

Limited durability guarantee:
if a message is stored on N Kafka brokers, then at least one of those N brokers must stay alive for the message not to be lost.

Data loss cases #

Producer-side data loss #

Since the Kafka producer sends asynchronously, returning from a call to producer.send(msg) does not mean the message has been sent successfully.

So on the producer side, always use the callback-based API, producer.send(msg, callback). When a send fails, the failure can then be handled specifically.
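A minimal sketch of the callback-style send, assuming an already-configured producer; the topic name and the handling logic are illustrative, not from the original:

producer.send(new ProducerRecord<>("demo-topic", key, value), (metadata, exception) -> {
    if (exception != null) {
        // The send failed: handle it specifically, e.g. log it, resend, or alert.
        exception.printStackTrace();
    } else {
        // The broker acknowledged the write; metadata tells us where it landed.
        System.out.printf("sent to partition %d at offset %d%n",
                metadata.partition(), metadata.offset());
    }
});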

Consumer-side data loss #

One case: the consumer updates the offset first and then consumes the message. If the consumer suddenly crashes in between, the message is lost.

So the consumer must consume the message first and only then update the offset. The cost is that this can lead to a message being consumed repeatedly.

Another case: the consumer fetches messages and hands them to multiple threads for asynchronous processing, while the consumer automatically advances the offset. If one of those threads fails, its messages are lost.

In this situation, the consumer should not enable automatic offset commits; the application should commit offsets manually instead.
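A minimal sketch of the consume-first-then-commit pattern, assuming a consumer configured with enable.auto.commit=false; process() is a stand-in for the application's handling logic:

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record); // consume the message first...
    }
    consumer.commitSync(); // ...and only then commit the offsets synchronously
}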

Best practices #

  1. Use producer.send(msg, callback).
  2. Set acks = all. acks is a producer parameter; it means a message counts as "committed" only after all replicas have received it.
  3. Set retries to a large value. retries is a producer parameter that enables automatic retries; on transient network jitter, the producer resends the message automatically, avoiding message loss.
  4. Set unclean.leader.election.enable = false. This broker parameter controls which replicas are qualified to be elected partition leader; brokers that have fallen too far behind must not run for leader.
  5. Set replication.factor >= 3. A broker-side parameter: keep redundant replicas of every message.
  6. Set min.insync.replicas > 1. A broker-side parameter: the minimum number of replicas a message must be written to before it counts as "committed".
  7. Ensure replication.factor > min.insync.replicas. If the two are equal, the loss of a single replica stops the whole partition from working. Recommended: replication.factor = min.insync.replicas + 1.
  8. Commit offsets only after consumption is complete. Set the consumer parameter enable.auto.commit to false and commit offsets manually (a configuration sketch for these settings follows below).

A note on items 2 and 6:
if the ISR contains only one replica, then acks = all degenerates into acks = 1. min.insync.replicas exists to impose a lower bound: it is not enough to be satisfied that every replica currently in the ISR has the write; the number of in-sync replicas that have it must also be no less than min.insync.replicas.
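As a sketch (not from the original post), here is how those settings map onto configuration. The producer and consumer settings go into client Properties; items 4 to 7 are broker- or topic-level and would live in server.properties or the topic config:

// Producer side (items 2 and 3)
Properties producerProps = new Properties();
producerProps.put(ProducerConfig.ACKS_CONFIG, "all");
producerProps.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

// Consumer side (item 8): disable auto commit, commit manually instead
Properties consumerProps = new Properties();
consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);

// Broker / topic side (items 4 to 7), e.g. in server.properties:
//   unclean.leader.election.enable=false
//   default.replication.factor=3
//   min.insync.replicas=2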

Idempotence #

Version 0.11.0.0 introduced the idempotent producer. Enabling it only requires setting props.put("enable.idempotence", true), or equivalently props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true).

With enable.idempotence set to true, the producer is automatically upgraded to an idempotent producer, and Kafka deduplicates messages automatically. The broker saves a few extra fields for each message; when the producer sends messages carrying the same field values, the broker recognizes them as duplicates.
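A minimal sketch of enabling the idempotent producer; the broker address and serializers are illustrative assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // the only switch needed

// Within this producer session, retries of the same record will not
// produce duplicates in any single partition.
KafkaProducer<String, String> producer = new KafkaProducer<>(props);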

Scope of the guarantee:

  1. Idempotence is only guaranteed within a single partition: an idempotent producer guarantees that no duplicate messages appear in one partition of one topic.
  2. Idempotence only holds within a single session, meaning one run of the producer process. Once the producer process restarts, the guarantee no longer holds.

Transactions #

Kafka has supported transactions since version 0.11, providing a read-committed isolation level. Transactions guarantee that multiple messages are written atomically to the target partitions, and also that consumers see only messages from successfully committed transactions.

Transactional Producer #

A transactional producer guarantees that multiple messages are written atomically across multiple partitions: they either all succeed or all fail. A transactional producer is also not afraid of process restarts.

Producer-side settings:

  1. Enable idempotence: enable.idempotence = true.
  2. Set the producer-side parameter transactional.id.

In addition, the code must call the transaction APIs: initTransactions, beginTransaction, commitTransaction, and abortTransaction, which handle transaction initialization, start, commit, and abort respectively.
For example:

producer.initTransactions();
try {
    producer.beginTransaction();
    producer.send(record1);
    producer.send(record2);
    // Commit both sends as one atomic unit.
    producer.commitTransaction();
} catch (KafkaException e) {
    // Abort on any failure so that read_committed consumers never see partial writes.
    producer.abortTransaction();
}

This code guarantees that record1 and record2 are committed to Kafka as a single transaction: either both are written successfully or neither is.

Consumer-side setting:
set the isolation.level parameter, which has two values:

  1. read_uncommitted: the default. The consumer reads everything a transactional producer wrote, whether the transaction was eventually committed or aborted.
  2. read_committed: the consumer reads only messages written by transactions that the transactional producer committed successfully. Note that it can still see all messages written by non-transactional producers.
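A minimal sketch of a read_committed consumer; the group id and topic name are illustrative assumptions:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // hide aborted transactions

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("demo-topic"));
// poll() now returns only messages from committed transactions,
// plus everything written by non-transactional producers.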
