Getting Started with Kafka: Consumer Workflow (18)

Kafka Consumption Modes

Pull mode:
The consumer actively pulls data from the broker. Kafka takes this approach.
Push mode:
Kafka does not use this mode, because the broker would determine the sending rate, and it is difficult to match the rate of every consumer. For example, if the broker pushed at 50 MB/s, consumer1 and consumer2 might have no time to process the messages.
The disadvantage of pull mode is that if Kafka has no data, consumers may spin in a loop, repeatedly polling and getting back empty responses.
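The pull loop looks roughly like this with the Java client. This is a minimal sketch: the broker address, group id, and topic name are placeholder assumptions, and it requires a running Kafka broker.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PullLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "demo-group");              // placeholder group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));    // placeholder topic
            while (true) {
                // Pull mode: the consumer fetches at its own pace. If the topic
                // has no new data, poll() simply returns an empty batch after the
                // timeout -- the "empty loop" drawback described above.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```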

Overall Consumer Workflow

1) Each consumer commits its offsets to the internal system topic (__consumer_offsets) for storage.
2) The data of each partition can only be consumed by one consumer within a consumer group.
3) A single consumer can consume data from multiple partitions.
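Point 1 above can be made explicit by disabling auto-commit and committing offsets manually. A minimal sketch, assuming a running broker and placeholder topic/group names:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "demo-group");              // placeholder group id
        props.put("enable.auto.commit", "false");         // we commit offsets ourselves
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));    // placeholder topic
            consumer.poll(Duration.ofMillis(500));        // fetch and process a batch
            // The committed offsets for this group end up in the internal
            // __consumer_offsets topic, as described above.
            consumer.commitSync();
        }
    }
}
```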

Consumer Group Principles

Consumer Group (CG): a consumer group is composed of multiple consumers. The condition for forming a consumer group is that all consumers share the same group.id.
Each consumer in a group is responsible for consuming different partitions, and a partition can only be consumed by one consumer within the group.
Consumer groups do not affect each other. Every consumer belongs to some consumer group, that is, a consumer group is the logical subscriber.
1) If more consumers are added to a consumer group than the topic has partitions, the extra consumers sit idle and receive no messages.
2) Consumer groups do not affect each other; every consumer belongs to some consumer group, that is, a consumer group is the logical subscriber.

Consumer initialization process

Coordinator: assists in the initialization of consumer groups and the assignment of partitions.
Coordinator node selection: hashCode(group.id) % 50 (50 is the number of partitions of __consumer_offsets).
For example, if the hashCode of the group.id is 1, then 1 % 50 = 1, so partition 1 of __consumer_offsets is chosen; the coordinator on the broker hosting that partition manages this consumer group, and all consumers in the group commit their offsets to that partition.
1) Each consumer sends a JoinGroup request to the coordinator.
2) The coordinator selects one consumer as the leader.
3) The coordinator sends the topics to be consumed to the leader consumer.
4) The leader draws up the consumption plan (the partition assignment).
5) The leader sends the consumption plan back to the coordinator.
6) The coordinator distributes the consumption plan to each consumer.
7) Each consumer keeps a heartbeat with the coordinator (default every 3 s). Once the heartbeat times out (session.timeout.ms = 45 s), the consumer is removed from the group and a rebalance is triggered; if a consumer takes too long to process messages (max.poll.interval.ms = 5 minutes), a rebalance is also triggered.
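The coordinator-selection rule above can be reproduced in plain Java. This is a sketch of the modulo computation only (the group id is a placeholder; 50 is the default partition count of __consumer_offsets):

```java
public class CoordinatorPartition {
    // partition = abs(hashCode(groupId)) % number of __consumer_offsets partitions.
    // Masking with 0x7fffffff keeps the value non-negative even for
    // Integer.MIN_VALUE, where Math.abs would fail.
    static int coordinatorPartition(String groupId, int numOffsetsPartitions) {
        return (groupId.hashCode() & 0x7fffffff) % numOffsetsPartitions;
    }

    public static void main(String[] args) {
        // "a".hashCode() == 97, so the group lands on partition 97 % 50 = 47.
        System.out.println(coordinatorPartition("a", 50));        // → 47
        System.out.println(coordinatorPartition("demo-group", 50)); // placeholder group id
    }
}
```

Whichever broker hosts the leader replica of the resulting __consumer_offsets partition acts as this group's coordinator.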
fetch.min.bytes: the minimum amount of data fetched per batch; default 1 byte.
fetch.max.wait.ms: the maximum time to wait when a batch has not yet reached fetch.min.bytes; default 500 ms.
fetch.max.bytes: the maximum amount of data fetched per batch; default 50 MB.
max.poll.records: the maximum number of messages returned by one poll; default 500.
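These parameters, together with the heartbeat and rebalance timeouts from step 7, are set on the consumer's Properties. A sketch that just collects the defaults named above:

```java
import java.util.Properties;

public class FetchConfig {
    // Consumer properties with the fetch/poll parameters above, set to their defaults.
    static Properties fetchProps() {
        Properties props = new Properties();
        props.put("fetch.min.bytes", "1");            // minimum bytes per fetch
        props.put("fetch.max.wait.ms", "500");        // max wait below fetch.min.bytes
        props.put("fetch.max.bytes", String.valueOf(50 * 1024 * 1024)); // 50 MB cap
        props.put("max.poll.records", "500");         // max messages per poll()
        props.put("heartbeat.interval.ms", "3000");   // heartbeat every 3 s
        props.put("session.timeout.ms", "45000");     // heartbeat timeout → rebalance
        props.put("max.poll.interval.ms", "300000");  // 5 min processing limit → rebalance
        return props;
    }

    public static void main(String[] args) {
        System.out.println(fetchProps());
    }
}
```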



Origin blog.csdn.net/weixin_43205308/article/details/131522465