Understanding Kafka internals: message collection and consumption mechanisms

1. Kafka data collection mechanism

In a Kafka cluster, the producer generates data and sends it to the corresponding topic. The producer pushes the data to a partition of that topic.

The data sent by the Producer to a Topic consists of key/value pairs. Kafka uses the key to decide which Partition receives each record. By default, a hash of the key selects the Partition within the Topic. The partitioning strategy is configured by the parameter { partitioner.class }.
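
The key-to-partition mapping described above can be sketched in a few lines. This is only an illustration: the function name choose_partition is hypothetical, and zlib.crc32 is a stand-in hash chosen because it is deterministic, not the hash function Kafka itself uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index by hashing the key.

    zlib.crc32 is an assumed stand-in for Kafka's own hash function.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # Records with the same key always land in the same partition.
    for key in (b"user-1", b"user-2", b"user-1"):
        print(key, "->", choose_partition(key, 3))
```

Because the result depends only on the key and the partition count, all records that share a key end up in the same Partition, as long as the number of partitions does not change.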

The Producer sends data in one of two modes: sync (synchronous) or async (asynchronous). The default is synchronous; the mode is set by the parameter { producer.type }. In asynchronous mode, the Producer provides a retry mechanism: by default, a failed send is retried 3 times.
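
The difference between the two modes, and the retry behaviour, can be sketched with a small simulation. Nothing here is Kafka's API: send_with_retry and AsyncProducer are hypothetical names, the send callable stands in for the network call, and reading "retried 3 times" as up to 3 extra attempts after the first one is an assumption.

```python
import queue
import threading

MAX_RETRIES = 3  # the default retry count mentioned in the text


def send_with_retry(send, message, max_retries=MAX_RETRIES):
    """Synchronous send: block until success or until retries run out.

    Returns the number of attempts made; re-raises the last error
    once max_retries extra attempts have failed (assumed semantics).
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(message)
            return attempts
        except ConnectionError:
            if attempts > max_retries:
                raise


class AsyncProducer:
    """Asynchronous send: buffer messages, deliver from a background thread."""

    def __init__(self, send):
        self._send = send
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, message):
        self._queue.put(message)  # returns immediately, no waiting on I/O

    def _run(self):
        while True:
            message = self._queue.get()
            try:
                if message is None:  # sentinel placed by close()
                    return
                send_with_retry(self._send, message)
            except ConnectionError:
                pass  # this sketch drops the message once retries are exhausted
            finally:
                self._queue.task_done()

    def close(self):
        self._queue.put(None)
        self._queue.join()  # wait until every buffered message was handled
```

In synchronous mode the caller waits inside send_with_retry; in asynchronous mode send only enqueues the message, and the retries happen on the worker thread.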

Kafka Producer related parameters:

2. Kafka data consumption mechanism

Kafka supports two modes of consuming data: queue and publish-subscribe. In queue mode, a record is delivered to only one consumer in a consumer group; in publish-subscribe mode, a record is delivered to multiple consumers.

A Kafka consumer reads data by offset, and all consumers in a consumer group share the group's offsets.

The consumption mode is determined by the Consumer parameter { group.id }. If all consumers use the same value, Kafka behaves like a queue: each record is delivered to only one consumer, which amounts to load balancing. Otherwise, Kafka behaves as publish-subscribe. In queue mode, Kafka's Consumer Rebalance may be triggered.
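
The effect of { group.id } can be illustrated with a toy delivery function. The name deliver is hypothetical, and handing out records round-robin inside a group is a simplification for this sketch; Kafka itself assigns whole partitions to group members.

```python
from collections import defaultdict


def deliver(messages, consumers):
    """Simulate delivery given a mapping of consumer name -> group.id.

    Every group sees every message, but inside a group each message
    goes to exactly one member (round-robin here, as a simplification).
    Returns a mapping of consumer name -> list of messages received.
    """
    groups = defaultdict(list)
    for name, group_id in consumers.items():
        groups[group_id].append(name)

    received = {name: [] for name in consumers}
    for i, message in enumerate(messages):
        for members in groups.values():
            received[members[i % len(members)]].append(message)
    return received


if __name__ == "__main__":
    msgs = ["m0", "m1", "m2", "m3"]
    # Same group.id: queue-like, the records are split between consumers.
    print(deliver(msgs, {"c1": "g", "c2": "g"}))
    # Different group.id: publish-subscribe, each consumer gets everything.
    print(deliver(msgs, {"c1": "g1", "c2": "g2"}))
```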

Kafka data is ordered within each partition (by insertion order). A Consumer also reads each partition's data in order, but there is no ordering guarantee across partitions.
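
A small simulation makes the ordering guarantee concrete. The helper names produce and consume_partition_by_partition are hypothetical, and reading one partition fully before the next is just one possible order a consumer might observe.

```python
def produce(records, num_partitions):
    """Append (sequence_number, partition) records to per-partition logs.

    Each log keeps its records in arrival (insertion) order.
    """
    logs = [[] for _ in range(num_partitions)]
    for seq, partition in records:
        logs[partition].append(seq)
    return logs


def consume_partition_by_partition(logs):
    """Drain each partition log in order, one partition after another."""
    return [seq for log in logs for seq in log]


if __name__ == "__main__":
    # Sequence numbers reflect the global order in which records were sent.
    logs = produce([(0, 0), (1, 1), (2, 0), (3, 0), (4, 1), (5, 1)], 2)
    print(logs)
    print(consume_partition_by_partition(logs))
```

Within each log the sequence numbers stay ascending, while the combined stream read by the consumer no longer matches the global send order.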

Consumer Rebalance: when the number of Consumers in a consumer group equals the number of Partitions of the topic, each Consumer consumes one Partition. If the numbers differ, a Consumer may consume several Partitions or none at all. The assignment changes dynamically as the number of Consumers and Partitions changes.
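
The three cases above (equal counts, fewer Consumers, more Consumers) can be reproduced with a simplified assignment function. The name assign_partitions is hypothetical, and plain round-robin is an assumption made for this sketch; Kafka ships several assignment strategies and the one in use may distribute partitions differently.

```python
def assign_partitions(partitions, consumers):
    """Distribute partitions over consumers round-robin.

    Returns a mapping of consumer -> list of assigned partitions.
    Consumers beyond the partition count end up with an empty list.
    """
    assignment = {consumer: [] for consumer in consumers}
    ordered = sorted(consumers)
    if not ordered:
        return assignment
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


if __name__ == "__main__":
    print(assign_partitions([0, 1, 2], ["c1", "c2", "c3"]))  # one each
    print(assign_partitions([0, 1, 2], ["c1", "c2"]))        # c1 gets two
    print(assign_partitions([0, 1], ["c1", "c2", "c3"]))     # c3 stays idle
```

A rebalance then amounts to calling the function again whenever a Consumer joins or leaves the group, or the number of Partitions changes.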

The Consumer actively pulls data from the Kafka cluster by polling.
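
The pull model can be sketched as a loop that asks for records starting at the current offset and advances the offset by however many records came back. PartitionLog and poll_all are hypothetical names for this in-memory simulation and are not part of any Kafka client.

```python
class PartitionLog:
    """In-memory stand-in for one partition on a broker."""

    def __init__(self, records):
        self._records = list(records)

    def fetch(self, offset, max_records):
        """Return up to max_records records starting at offset."""
        return self._records[offset:offset + max_records]


def poll_all(log, start_offset=0, max_records=2):
    """Poll the log in batches until no records are returned.

    Returns the consumed records and the next offset to read from.
    """
    offset = start_offset
    consumed = []
    while True:
        batch = log.fetch(offset, max_records)
        if not batch:
            break  # a long-running consumer would wait and poll again
        consumed.extend(batch)
        offset += len(batch)  # the consumer, not the broker, tracks progress
    return consumed, offset


if __name__ == "__main__":
    log = PartitionLog(["a", "b", "c"])
    print(poll_all(log))
    print(poll_all(log, start_offset=1))
```

Because the consumer decides when to fetch and from which offset, it can resume after a restart by starting from the last offset it recorded.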

Kafka Consumer related parameter description:

