Kafka: consumers

A Kafka application uses KafkaConsumer to subscribe to a topic and receive messages from it.

 

Consumers and consumer groups

An application uses a consumer object to subscribe to a topic, receive messages, validate them, and save the results.

Kafka consumers belong to consumer groups. When the consumers in a group subscribe to the same topic, each consumer receives messages from a subset of the topic's partitions.

If a group has more consumers than the topic has partitions, some of the consumers will sit idle and receive no messages.

When a new consumer joins a group, it starts reading from partitions that were previously read by other consumers. When a consumer shuts down or crashes, it leaves the group, and the partitions it was reading are taken over by other consumers in the group.

Adding consumers to a group is the main way to scale consumption capacity.

Topics should be created with a large number of partitions so that more consumers can be added when the load increases.

Unlike traditional messaging systems, Kafka can scale to many consumers and consumer groups without hurting performance.

 

 

Partition rebalancing

When ownership of a partition moves from one consumer to another, the move is called a rebalance.

Rebalancing gives a consumer group high availability and scalability.

Under normal circumstances, however, we would rather it did not happen.

During a rebalance, consumers cannot read messages, so the whole group is unavailable for a short time. In addition, when a partition is reassigned to another consumer, the consumer's current read state is lost. It may also need to refresh its caches, which slows the application down until the consumer has rebuilt its state.

 

Consumers maintain their group membership and their ownership of partitions by sending heartbeats to the broker designated as the group coordinator (different groups may have different coordinators).

As long as a consumer sends heartbeats at regular intervals, it is considered alive and still reading messages from its partitions.

Consumers send heartbeats when they poll (to fetch messages) and when they commit offsets.

If a consumer stops sending heartbeats for long enough, its session expires, the group coordinator considers it dead, and a rebalance is triggered.

If a consumer crashes and stops reading messages, the group coordinator waits a few seconds to confirm that it is dead before triggering a rebalance.

When a consumer shuts down cleanly, it notifies the coordinator that it is leaving the group, and the coordinator triggers a rebalance immediately to keep the processing pause as short as possible.
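The session-expiry rule above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how the broker implements it: the class HeartbeatTracker, its methods, and the 10-second timeout are all made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the coordinator's view of a group (not a Kafka class).
class HeartbeatTracker {
    private final long sessionTimeoutMs;
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();

    HeartbeatTracker(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // Record a heartbeat from a consumer at the given time.
    void heartbeat(String consumerId, long nowMs) {
        lastHeartbeatMs.put(consumerId, nowMs);
    }

    // Clean shutdown: the member is removed at once, without waiting for a timeout.
    // Returns true if a rebalance is needed.
    boolean leave(String consumerId) {
        return lastHeartbeatMs.remove(consumerId) != null;
    }

    // Drop every member whose last heartbeat is older than the session timeout.
    // Returns true if at least one member expired, i.e. a rebalance is needed.
    boolean expire(long nowMs) {
        return lastHeartbeatMs.values().removeIf(last -> nowMs - last > sessionTimeoutMs);
    }

    boolean isMember(String consumerId) {
        return lastHeartbeatMs.containsKey(consumerId);
    }

    public static void main(String[] args) {
        HeartbeatTracker tracker = new HeartbeatTracker(10_000); // assumed timeout
        tracker.heartbeat("consumer-1", 0);
        tracker.heartbeat("consumer-2", 0);
        tracker.heartbeat("consumer-1", 8_000);
        System.out.println(tracker.expire(12_000));          // true
        System.out.println(tracker.isMember("consumer-2"));  // false
        System.out.println(tracker.leave("consumer-1"));     // true
    }
}
```

In the run above, consumer-2 has been silent for 12 seconds and is expired, while consumer-1 only leaves the group when it says so explicitly.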

 

If messages take a long time to process, increase max.poll.interval.ms to allow a longer interval between polls.

 

 

Partition allocation

When a consumer wants to join a group, it sends a JoinGroup request to the group coordinator. The first consumer to join the group becomes the group leader.

The leader gets the list of group members from the coordinator and is responsible for assigning partitions to each consumer. It uses a class that implements the PartitionAssignor interface to decide which partitions go to which consumer.

Kafka has two built-in assignment strategies. After the assignment is complete, the leader sends the assignment list to the group coordinator, which then sends the information to all consumers.

Each consumer sees only its own assignment; only the leader knows the assignments of every consumer in the group.

This process repeats every time a rebalance happens.
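To make the leader's job more concrete, here is a rough sketch of a range-style assignment for a single topic: consumers are sorted by name and each one gets a contiguous block of partitions. It is modeled on the idea behind Kafka's range strategy but does not use the PartitionAssignor interface; RangeAssignmentSketch and assign are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Kafka client library.
class RangeAssignmentSketch {
    // Splits partitions 0..partitionCount-1 of one topic into contiguous ranges,
    // one range per consumer, in sorted consumer order.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        if (consumers.isEmpty()) {
            return assignment;
        }
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);
        int perConsumer = partitionCount / sorted.size();
        int extra = partitionCount % sorted.size(); // the first `extra` consumers get one more
        for (int i = 0; i < sorted.size(); i++) {
            int start = perConsumer * i + Math.min(i, extra);
            int length = perConsumer + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = start; p < start + length; p++) {
                owned.add(p);
            }
            assignment.put(sorted.get(i), owned);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(Arrays.asList("c2", "c1"), 3)); // {c1=[0, 1], c2=[2]}
    }
}
```

With more consumers than partitions, the consumers at the end of the sorted list get an empty list, which is the idle-consumer case described earlier.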

 

 

Polling

The poll loop is the core of the consumer API: a simple poll requests data from the server.

Once a consumer has subscribed to a topic, polling handles all the details, including group coordination, partition rebalancing, sending heartbeats, and fetching data.

Consumers must keep polling Kafka; otherwise they are considered dead.

The poll() method returns a list of records. Each record contains the topic and partition it came from, its offset within the partition, and its key and value.

Call the close() method to close the consumer before the application exits. This also closes the network connections and sockets and triggers a rebalance immediately, rather than waiting for the group coordinator to notice that the consumer has stopped sending heartbeats and declare it dead. That takes longer and leaves the group unable to read those messages for a while.

 

Polling does more than fetch data. The first time poll() is called on a new consumer, it finds the GroupCoordinator, joins the group, and receives its partition assignment.

If a rebalance happens, the whole process also takes place inside the poll. Heartbeats are sent from inside the poll as well.

 

Multiple consumers from the same group cannot run in one thread, and multiple threads cannot safely share one consumer.

The rule is one consumer per thread.

To run multiple consumers in the same group, each consumer must run in its own thread.

It is best to wrap the consumer logic in its own object and then use a Java thread pool to start multiple threads, so that each consumer runs on its own thread.
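A minimal sketch of that layout, using only the JDK: each ConsumerLoop object stands in for the wrapped consumer logic, and a fixed thread pool gives every loop its own thread. ConsumerLoop, ConsumerPool, and runAll are hypothetical names, and the list of strings is a placeholder for what a real consumer would get back from poll().

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around one consumer's logic (no Kafka classes are used here).
class ConsumerLoop implements Runnable {
    private final List<String> records;      // placeholder for records returned by poll()
    private final AtomicInteger processed;   // shared counter, safe to use across threads

    ConsumerLoop(List<String> records, AtomicInteger processed) {
        this.records = records;
        this.processed = processed;
    }

    @Override
    public void run() {
        // A real implementation would create its own KafkaConsumer here, poll in a
        // loop, and close the consumer on exit. The consumer itself is never shared.
        for (String record : records) {
            processed.incrementAndGet();
        }
    }
}

class ConsumerPool {
    // Starts one thread per consumer loop and returns the total number of records handled.
    static int runAll(List<List<String>> perConsumerRecords) {
        if (perConsumerRecords.isEmpty()) {
            return 0;
        }
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(perConsumerRecords.size());
        for (List<String> records : perConsumerRecords) {
            pool.submit(new ConsumerLoop(records, processed));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c")))); // 3
    }
}
```

The pool size equals the number of consumer loops, so no two loops ever take turns on the same thread.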

 

 

Consumer Configuration

bootstrap.servers

 

 

group.id

 

 

key.deserializer

 

 

value.deserializer

 

 

fetch.min.bytes

 

 

fetch.max.wait.ms

 

 

max.partition.fetch.bytes

 

 

session.timeout.ms

 

 

auto.offset.reset

 

 

enable.auto.commit

 

 

partition.assignment.strategy

 

 

client.id

 

 

max.poll.records

 

 

receive.buffer.bytes

 

 

send.buffer.bytes
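For orientation, this is roughly how a few of the properties above could be set in Java. Only java.util.Properties is used, so the snippet runs without the Kafka client on the classpath; the broker address, group name, and chosen values are assumptions for the example, and ConsumerConfigSketch is a hypothetical class name.

```java
import java.util.Properties;

// Hypothetical holder for an example configuration.
class ConsumerConfigSketch {
    static Properties baseConfig() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", "example-group");           // assumed group name
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("enable.auto.commit", "false");  // commit offsets manually
        props.setProperty("auto.offset.reset", "earliest"); // start from the oldest record
        props.setProperty("max.poll.records", "500");       // cap on records per poll
        return props;
    }

    public static void main(String[] args) {
        // With the Kafka client available, these properties would be passed to
        // the KafkaConsumer constructor.
        baseConfig().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```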

 

 

Commits and offsets

 


Origin www.cnblogs.com/microcat/p/11444248.html