Apache Kafka series: Producer processing logic

Kafka Producer processing logic

 

The Kafka Producer generates data and sends it to the Kafka brokers. The partition distribution and load-balancing logic is maintained entirely on the producer side.

Kafka structure diagram

Kafka Producer default invocation logic

Default partitioning logic

1. Distribution logic when there is no key

A partition is chosen at random once every topic.metadata.refresh.interval.ms. All records within that time window are sent to the same partition.

If sending to that partition fails, a new partition is selected.

2. Distribution logic when there is a key

Hash the key, then take the result modulo the number of partitions (a sketch covering both distribution cases appears after the formula below):

Utils.abs(key.hashCode) % numPartitions
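
As a rough illustration of both cases, the sketch below assumes a hypothetical partitioner class; it is not the actual kafka.producer.DefaultPartitioner source, and the timer-based "sticky" handling only approximates how the real producer ties the keyless case to its metadata refresh.

import scala.util.Random

class DefaultPartitionSketch(numPartitions: Int, refreshIntervalMs: Long) {
  // Keyless records stick to one randomly chosen partition for a whole window.
  private var stickyPartition: Int = Random.nextInt(numPartitions)
  private var lastRefreshMs: Long = System.currentTimeMillis()

  def partition(key: AnyRef): Int = {
    if (key != null) {
      // Keyed records: Utils.abs(key.hashCode) % numPartitions.
      // Kafka's Utils.abs masks the sign bit so the result is never negative.
      (key.hashCode & 0x7fffffff) % numPartitions
    } else {
      // Keyless records: re-pick the sticky partition only after
      // topic.metadata.refresh.interval.ms has elapsed (a send error, not
      // modeled here, also triggers a re-pick).
      val now = System.currentTimeMillis()
      if (now - lastRefreshMs >= refreshIntervalMs) {
        stickyPartition = Random.nextInt(numPartitions)
        lastRefreshMs = now
      }
      stickyPartition
    }
  }
}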

How the producer obtains a partition's leader information (metadata)

After deciding which partition a record goes to, the producer must find out which broker is the leader of that partition before it knows where to send the request.

The implementation is located at:

kafka.client.ClientUtils#fetchTopicMetadata

Implementation approach

1. Obtain the partition metadata from a broker. Since every Kafka broker holds the full cluster metadata, any broker can return it.

2. Broker selection strategy: shuffle the broker list into random order, then start from the first broker; if a request fails, move on to the next one (see the sketch after this list).

3. Error handling: after an error, request the metadata from the next broker in the shuffled list.
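
A minimal sketch of this fetch-and-fallback strategy is shown below. Broker, TopicMetadata and fetchFrom are illustrative placeholders rather than Kafka APIs; the real logic lives in kafka.client.ClientUtils#fetchTopicMetadata.

import scala.util.{Random, Try}

case class Broker(host: String, port: Int)
case class TopicMetadata(raw: String)

def fetchTopicMetadataSketch(brokers: Seq[Broker],
                             fetchFrom: Broker => TopicMetadata): TopicMetadata = {
  // Step 2: put the broker list in random order.
  val shuffled = Random.shuffle(brokers)
  var result: Option[TopicMetadata] = None
  val it = shuffled.iterator
  // Steps 1 and 3: ask a broker for the metadata; on error, try the next one.
  while (result.isEmpty && it.hasNext) {
    result = Try(fetchFrom(it.next())).toOption
  }
  result.getOrElse(
    throw new RuntimeException("fetching topic metadata failed for all brokers"))
}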

Notes

  • The producer obtains metadata from the brokers and does not talk to ZooKeeper.
  • When the brokers change, the list of brokers the producer uses for metadata requests is not updated dynamically.
  • The broker list used for metadata requests is set by metadata.broker.list in the producer configuration. As long as at least one machine in this list is serving normally, the producer can obtain metadata (a minimal configuration example follows this list).
  • After obtaining the metadata, the producer can write data to brokers that are not in metadata.broker.list.
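
For reference, a minimal configuration for the old (0.8.x) Scala producer that sets this broker list could look like the sketch below; the broker addresses, topic name, key and value are placeholders.

import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

val props = new Properties()
// Brokers used only for bootstrapping metadata requests.
props.put("metadata.broker.list", "broker1:9092,broker2:9092")
props.put("serializer.class", "kafka.serializer.StringEncoder")
props.put("request.required.acks", "1")

val producer = new Producer[String, String](new ProducerConfig(props))
// Data is then written to whichever broker leads the target partition,
// whether or not that broker appears in metadata.broker.list.
producer.send(new KeyedMessage[String, String]("my-topic", "my-key", "my-value"))
producer.close()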

Error handling

The producer's send function has no return value by default; error handling is implemented by the EventHandler.

The error handling of DefaultEventHandler is as follows:

  • Collect the data that failed to send
  • Wait for an interval whose length is set by the retry.backoff.ms configuration
  • Refetch the metadata
  • Resend the data

The number of retries on error is determined by the message.send.max.retries configuration.
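
The retry flow above can be sketched roughly as follows. This is an illustration of the flow only, not the actual kafka.producer.async.DefaultEventHandler implementation, and all names are placeholders.

def sendWithRetries[M](messages: Seq[M],
                       trySend: Seq[M] => Seq[M],      // returns the messages that failed
                       refreshMetadata: Seq[M] => Unit,
                       maxRetries: Int,                 // message.send.max.retries
                       retryBackoffMs: Long): Unit = {  // retry.backoff.ms
  var outstanding = messages
  var remainingAttempts = maxRetries + 1
  while (outstanding.nonEmpty && remainingAttempts > 0) {
    outstanding = trySend(outstanding)   // collect the data that failed
    if (outstanding.nonEmpty) {
      Thread.sleep(retryBackoffMs)       // wait retry.backoff.ms before retrying
      refreshMetadata(outstanding)       // refetch metadata for the failed topics
      remainingAttempts -= 1
    }
  }
  if (outstanding.nonEmpty)
    throw new RuntimeException(s"Failed to send messages after $maxRetries tries.")
}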

When all retries fail, DefaultEventHandler throws an exception. The relevant code is shown below:

 

if(outstandingProduceRequests.size > 0) {
  producerStats.failedSendRate.mark()
  val correlationIdEnd = correlationId.get()
  error("Failed to send requests for topics %s with correlation ids in [%d,%d]"
    .format(outstandingProduceRequests.map(_.topic).toSet.mkString(","),
      correlationIdStart, correlationIdEnd - 1))
  throw new FailedToSendMessageException("Failed to send messages after " + config.messageSendMaxRetries + " tries.", null)
}
