Kafka core components: the coordinators

1. Consumers and consumer groups

Suppose a topic has four partitions and a consumer group contains only one consumer. That consumer then consumes the data of all four partitions.

If the consumer group has two consumers, each consumer consumes two partitions.

If the consumer group has four consumers, each consumer consumes one partition.

If the consumer group has five consumers, one of them is idle.

Note: within a single consumer group, do not let the number of consumers exceed the number of partitions.

Different consumer groups do not affect each other.
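The four cases above can be reproduced with a small sketch. This is only a toy model, not Kafka's assignor: the class name GroupSizingSketch, the method assign, and the "consumer-N" ids are made up for illustration, and partitions are simply handed out in contiguous blocks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model (hypothetical, not Kafka code) of partitions spread over one consumer group. */
final class GroupSizingSketch {

    /**
     * Splits partitions 0..partitionCount-1 into contiguous blocks, one block per consumer.
     * Consumers beyond the partition count receive an empty list, i.e. they are idle.
     */
    static Map<String, List<Integer>> assign(int partitionCount, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int base = partitionCount / consumerCount;   // every consumer gets at least this many
        int extra = partitionCount % consumerCount;  // the first `extra` consumers get one more
        int next = 0;
        for (int c = 0; c < consumerCount; c++) {
            int size = base + (c < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                owned.add(next++);
            }
            result.put("consumer-" + c, owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // Four partitions, group sizes taken from the text above.
        for (int consumers : new int[] {1, 2, 4, 5}) {
            System.out.println(consumers + " consumer(s): " + assign(4, consumers));
        }
    }
}
```

With five consumers the last entry comes back with an empty list, which is the idle consumer the note warns about.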

2. Coordinators

As of Kafka 0.10, the server side has a group coordinator (GroupCoordinator). Each Kafka server creates a GroupCoordinator instance when it starts, and each instance manages a subset of the consumer groups together with the consumption offsets of those groups.

The consumer client has a matching consumer coordinator (ConsumerCoordinator). Every consumer instance creates its own ConsumerCoordinator object, which is responsible for the communication between that consumer and the server-side GroupCoordinator of its group.

2.1 Consumer Coordinator (ConsumerCoordinator)

Where ConsumerCoordinator is defined:

public class KafkaConsumer<K, V> implements Consumer<K, V> {

    // each KafkaConsumer holds its own ConsumerCoordinator
    private final ConsumerCoordinator coordinator;

    // ... other members omitted
}

ConsumerCoordinator is a private member variable of KafkaConsumer, so the information it stores is visible only to its own consumer. Different consumers cannot see the information held in each other's ConsumerCoordinator.

Role of ConsumerCoordinator:

  • Handles requests to update the consumer's cached metadata
  • Sends the consumer's request to join the group to the group coordinator
  • Performs the processing needed before and after the consumer joins the group
  • Sends the request to leave the group (for example, when the consumer unsubscribes)
  • Sends offset commit requests to the group coordinator
  • Runs a periodic heartbeat task so that the group coordinator stays aware of the consumer's running state
  • The ConsumerCoordinator of the leader consumer is also responsible for partition assignment. The group coordinator elects one consumer in each group as leader; the leader's ConsumerCoordinator assigns partitions to consumers and sends the result to the group coordinator, which then returns the assignment to the consumer coordinators of the other consumers. This reduces the load on the server side.

The components that implement these functions are members of the ConsumerCoordinator class or of its parent class:

public final class ConsumerCoordinator extends AbstractCoordinator {
    private final List<PartitionAssignor> assignors;
    private final OffsetCommitCallback defaultOffsetCommitCallback;
    private final SubscriptionState subscriptions;
    private final ConsumerInterceptors<?, ?> interceptors;
    private boolean isLeader = false;
    private MetadataSnapshot metadataSnapshot;
    private MetadataSnapshot assignmentSnapshot;

    // part of the code omitted ...
}


public abstract class AbstractCoordinator implements Closeable {
    private enum MemberState {
        UNJOINED,    // the client is not part of a group
        REBALANCING, // the client has begun rebalancing
        STABLE,      // the client has joined and is sending heartbeats
    }

    private final Heartbeat heartbeat;
    protected final ConsumerNetworkClient client;
    private HeartbeatThread heartbeatThread = null;
    private MemberState state = MemberState.UNJOINED;
    private RequestFuture<ByteBuffer> joinFuture = null;

    // part of the code omitted ...
}

 

2.2 Group coordinator (GroupCoordinator)

Role of GroupCoordinator:

  • Handles the requests submitted by the group members (consumers) it manages
  • Establishes connections with consumers and elects a leader from among the consumers connected to it
  • Once the leader consumer has worked out the assignment of partitions to subscribers, it sends the result to the group coordinator, which then returns the result to the individual consumers
  • Manages the offset commits of the consumers connected to it, saving each consumer's offsets to Kafka's internal topic
  • Tracks the connection state of its consumers through heartbeats
  • At startup, creates a scheduled task that cleans up expired consumer group metadata and expired offset information

Components that GroupCoordinator depends on, and their roles:

  • KafkaConfig: used to instantiate OffsetConfig and GroupConfig
  • ZkUtils: fetches the partition metadata of the internal topic from ZooKeeper when a group coordinator is assigned to a consumer.
  • GroupMetadataManager: manages GroupMetadata and offset commits, and provides a set of group management methods for the group coordinator to call. GroupMetadataManager not only writes GroupMetadata to Kafka's internal topic but also keeps a copy in an in-memory cache. This includes the metadata of the members (consumers), such as the consumer memberId, the leaderId, the partition assignment, and the state metadata. The state can be one of the following five:
    • PreparingRebalance: the consumer group is preparing to rebalance
    • AwaitingSync: waiting for the leader consumer to send the partition assignment to the group coordinator
    • Stable: the consumers are running normally and heartbeats are normal
    • Dead: the consumer group has no members and its metadata has been deleted
    • Empty: the consumer group has no members but its metadata has not been deleted; the group stays in this state until all of its offset metadata has expired.
  • ReplicaManager: GroupMetadataManager has to write the consumers' committed offsets and the consumer group metadata to Kafka's internal topic. Operations on the internal topic work the same way as on any other topic: a message is first written to the leader replica, and ReplicaManager is responsible for managing the leader replica and the follower replicas.
  • DelayedJoin: a delayed-operation class that monitors the operation in which all group members join the group coordinator
  • DelayedHeartbeat: a delayed-operation class that monitors heartbeat timeouts between group members and the group coordinator
  • GroupConfig: defines the session timeout range between group members and the group coordinator
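The five group states listed under GroupMetadataManager can be pictured as a small state machine. The sketch below is a simplified model and not the broker code: the transition table reflects my understanding of the valid previous states in the broker's group metadata for this version, so treat it as an assumption, and the names GroupStateSketch and canTransition are invented for the example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of the consumer group state machine. */
final class GroupStateSketch {

    enum GroupState { PREPARING_REBALANCE, AWAITING_SYNC, STABLE, DEAD, EMPTY }

    // For each target state, the states a group may come from (assumed, simplified).
    private static final Map<GroupState, Set<GroupState>> VALID_PREVIOUS =
            new EnumMap<>(GroupState.class);

    static {
        VALID_PREVIOUS.put(GroupState.PREPARING_REBALANCE,
                EnumSet.of(GroupState.STABLE, GroupState.AWAITING_SYNC, GroupState.EMPTY));
        VALID_PREVIOUS.put(GroupState.AWAITING_SYNC,
                EnumSet.of(GroupState.PREPARING_REBALANCE));
        VALID_PREVIOUS.put(GroupState.STABLE,
                EnumSet.of(GroupState.AWAITING_SYNC));
        VALID_PREVIOUS.put(GroupState.EMPTY,
                EnumSet.of(GroupState.PREPARING_REBALANCE));
        // A group can be removed (Dead) from any state.
        VALID_PREVIOUS.put(GroupState.DEAD, EnumSet.allOf(GroupState.class));
    }

    /** Returns true if the model allows moving from `from` to `to`. */
    static boolean canTransition(GroupState from, GroupState to) {
        return VALID_PREVIOUS.get(to).contains(from);
    }

    public static void main(String[] args) {
        // The usual path of a new group: Empty, PreparingRebalance, AwaitingSync, Stable.
        System.out.println(canTransition(GroupState.EMPTY, GroupState.PREPARING_REBALANCE));
        System.out.println(canTransition(GroupState.PREPARING_REBALANCE, GroupState.AWAITING_SYNC));
        System.out.println(canTransition(GroupState.AWAITING_SYNC, GroupState.STABLE));
    }
}
```

Keying the table by target state keeps the check to a single lookup; a Stable group, for example, cannot go straight back to AwaitingSync without first entering PreparingRebalance.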

3. Interaction between the consumer coordinator and the group coordinator

3.1 Heartbeat

The consumer coordinator sends heartbeats to the group coordinator to maintain the consumer's membership in the group and its ownership of partitions. As long as the consumer sends heartbeats at regular intervals, it is considered alive and is assumed to be reading the messages in its partitions. A consumer sends heartbeats when it polls for messages and when it commits offsets.

If a consumer stops sending heartbeats for long enough, its session expires; the group coordinator considers it dead and triggers a rebalance.
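The session expiry rule can be sketched as a small bookkeeping class on the coordinator side. This is an illustration under assumptions, not the broker implementation (which uses delayed operations rather than a scan); SessionTrackerSketch and its methods are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of how a coordinator could detect expired sessions. */
final class SessionTrackerSketch {

    private final long sessionTimeoutMs;
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();

    SessionTrackerSketch(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Records a heartbeat (or the initial join) of a member at time nowMs. */
    void onHeartbeat(String memberId, long nowMs) {
        lastHeartbeatMs.put(memberId, nowMs);
    }

    /**
     * Drops every member whose last heartbeat is older than the session timeout.
     * Returns true if at least one member was dropped, i.e. a rebalance is needed.
     */
    boolean expireMembers(long nowMs) {
        return lastHeartbeatMs.values().removeIf(last -> nowMs - last > sessionTimeoutMs);
    }

    boolean isMember(String memberId) {
        return lastHeartbeatMs.containsKey(memberId);
    }

    public static void main(String[] args) {
        SessionTrackerSketch tracker = new SessionTrackerSketch(10_000L);
        tracker.onHeartbeat("a", 0L);
        tracker.onHeartbeat("b", 0L);
        tracker.onHeartbeat("a", 8_000L);   // only "a" keeps sending heartbeats
        System.out.println("rebalance needed: " + tracker.expireMembers(12_000L));
        System.out.println("b still a member: " + tracker.isMember("b"));
    }
}
```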

In version 0.10, heartbeats are sent by a dedicated heartbeat thread, so they can also be sent in the gaps between polls. As a result, the heartbeat frequency (which determines how quickly the group coordinator detects the state of a consumer) and the polling frequency (which is determined by how long message processing takes) are independent of each other. In Kafka 0.10 a consumer can also specify how long it may go without polling before it leaves the group and triggers a rebalance. This avoids livelock, where the application has not crashed but for some reason cannot make progress. This setting is independent of session.timeout.ms, which controls how long it takes to detect that a consumer has crashed and stopped sending heartbeats.
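As a configuration sketch, the three timing settings might be put together as follows. I am assuming that the "how long without polling" setting the text refers to is max.poll.interval.ms; the values are illustrative only, not recommendations, and the class and helper names are made up. Plain java.util.Properties is used so the example runs without the Kafka client on the classpath.

```java
import java.util.Properties;

/** Illustrative consumer timing settings; values are examples, not tuning advice. */
final class ConsumerTimeoutConfigSketch {

    static Properties timeoutProps() {
        Properties props = new Properties();
        // How long the group coordinator waits without heartbeats before declaring the consumer dead.
        props.setProperty("session.timeout.ms", "10000");
        // How often the heartbeat thread sends a heartbeat.
        props.setProperty("heartbeat.interval.ms", "3000");
        // Assumed name of the "maximum time between polls" setting described in the text.
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }

    /**
     * Rule of thumb from the Kafka documentation: the heartbeat interval should be
     * no more than one third of the session timeout.
     */
    static boolean heartbeatFitsSession(Properties props) {
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long heartbeat = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        return heartbeat * 3 <= session;
    }

    public static void main(String[] args) {
        Properties props = timeoutProps();
        System.out.println(props);
        System.out.println("heartbeat interval fits session timeout: " + heartbeatFitsSession(props));
    }
}
```

The same keys can be passed to a KafkaConsumer together with the usual connection and deserializer settings.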

3.2 Partition rebalancing

Partition rebalancing occurs in three cases:

  • When a new consumer joins the group, it starts reading messages that were previously read by other consumers.
  • When a consumer shuts down or crashes, it leaves the group, and the partitions it was reading are taken over by the other consumers in the group. If a consumer leaves voluntarily, it notifies the group coordinator that it is leaving, and the group coordinator triggers a rebalance immediately to keep the processing pause as short as possible. If a consumer crashes unexpectedly, it stops reading without notifying the group coordinator; the group coordinator waits a few seconds to confirm that the consumer is dead before triggering a rebalance. During those few seconds, no messages are read from the dead consumer's partitions.
  • When the topic changes, for example when an administrator adds new partitions, the partitions are reassigned.

The transfer of partition ownership from one consumer to another is called partition rebalancing.
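To make "ownership is transferred" concrete, the sketch below compares the partition owners before and after a rebalance and lists the partitions that changed hands. It is a toy illustration; RebalanceDiffSketch, owners and movedPartitions are hypothetical names, not part of Kafka.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper that shows which partitions change owner in a rebalance. */
final class RebalanceDiffSketch {

    /** Builds a partition-to-owner map: ownersByPartition[i] owns partition i. */
    static Map<Integer, String> owners(String... ownersByPartition) {
        Map<Integer, String> result = new HashMap<>();
        for (int p = 0; p < ownersByPartition.length; p++) {
            result.put(p, ownersByPartition[p]);
        }
        return result;
    }

    /**
     * Returns the partitions whose owner differs between the two assignments.
     * Partitions that had no owner before (newly added ones) also count as moved.
     */
    static SortedSet<Integer> movedPartitions(Map<Integer, String> before,
                                              Map<Integer, String> after) {
        SortedSet<Integer> moved = new TreeSet<>();
        for (Map.Entry<Integer, String> entry : after.entrySet()) {
            String oldOwner = before.get(entry.getKey());
            if (!entry.getValue().equals(oldOwner)) {
                moved.add(entry.getKey());
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Consumer c1 crashes: its partitions 2 and 3 are taken over by c0.
        Map<Integer, String> before = owners("c0", "c0", "c1", "c1");
        Map<Integer, String> after = owners("c0", "c0", "c0", "c0");
        System.out.println("moved partitions: " + movedPartitions(before, after));
    }
}
```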

Rebalancing is important because it gives consumer groups high availability and scalability (we can safely add or remove consumers), but under normal circumstances we do not want it to happen. During a rebalance, consumers cannot read messages, so the whole group is unavailable for a short time. In addition, when a partition is reassigned to another consumer, the consumer's current read state is lost, and it may need to refresh its caches, which slows the application down until the consumer has restored its state.

3.3 Partition assignment by the leader consumer

When a consumer wants to join a group, it sends a JoinGroup request to the group coordinator. The first consumer to join the group becomes the leader. The leader obtains the list of group members from the group coordinator (the list contains every consumer that has sent a heartbeat recently and is therefore considered alive) and is responsible for assigning partitions to each consumer.

When a consumer's coordinator asks the group coordinator to join the group, it reports the partition assignment strategies that it supports (round-robin, range, or others). The group coordinator selects a strategy that is supported by all consumers in the group and sends it to the leader consumer, which assigns partitions according to that strategy.
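The selection of a common strategy can be sketched as an intersection followed by a vote. The source only states that the chosen strategy must be supported by every consumer; the voting step (each member votes for its most preferred common strategy) is my understanding of the broker's behaviour, and the tie-break used here (first strategy to reach the top count) is an assumption. StrategySelectionSketch and select are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of how a coordinator could pick an assignment strategy. */
final class StrategySelectionSketch {

    /**
     * @param supportedByMember one list per member, strategies in order of preference
     * @return the selected strategy, or null if the members have none in common
     */
    static String select(Collection<List<String>> supportedByMember) {
        // 1. Candidates: strategies that every member supports.
        Set<String> candidates = null;
        for (List<String> supported : supportedByMember) {
            if (candidates == null) {
                candidates = new LinkedHashSet<>(supported);
            } else {
                candidates.retainAll(supported);
            }
        }
        if (candidates == null || candidates.isEmpty()) {
            return null;
        }
        // 2. Each member votes for its most preferred candidate.
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (List<String> supported : supportedByMember) {
            for (String strategy : supported) {
                if (candidates.contains(strategy)) {
                    votes.merge(strategy, 1, Integer::sum);
                    break;
                }
            }
        }
        // 3. The strategy with the most votes wins (assumed tie-break: first seen).
        String best = null;
        int bestVotes = 0;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (entry.getValue() > bestVotes) {
                best = entry.getKey();
                bestVotes = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(select(Arrays.asList(
                Arrays.asList("range", "roundrobin"),
                Arrays.asList("roundrobin"))));
    }
}
```

If the members share no strategy, the sketch returns null; a real group would have to reject the join in that case.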

When the assignment is complete, the leader sends it to the group coordinator, and the group coordinator then sends this information to all consumers. Each consumer sees only its own assignment; only the leader knows the assignments of all consumers in the group. This process is repeated every time a rebalance occurs.
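The division of labour described above (the leader computes the full assignment, the group coordinator hands each member only its own part) can be modelled in a few lines. This is a simplified sketch: roundRobin is a plain modulo distribution rather than Kafka's built-in assignor, and SyncGroupSketch and assignmentFor are names invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the leader-assigns, coordinator-distributes flow. */
final class SyncGroupSketch {

    // Held by the group coordinator after the leader has sent its result.
    private final Map<String, List<Integer>> fullAssignment;

    SyncGroupSketch(Map<String, List<Integer>> assignmentFromLeader) {
        this.fullAssignment = new HashMap<>(assignmentFromLeader);
    }

    /** Leader side: partition p goes to member number (p mod member count). */
    static Map<String, List<Integer>> roundRobin(List<String> memberIds, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String memberId : memberIds) {
            assignment.put(memberId, new ArrayList<Integer>());
        }
        if (memberIds.isEmpty()) {
            return assignment;
        }
        for (int p = 0; p < partitionCount; p++) {
            assignment.get(memberIds.get(p % memberIds.size())).add(p);
        }
        return assignment;
    }

    /** Coordinator side: a member only ever receives its own partitions. */
    List<Integer> assignmentFor(String memberId) {
        List<Integer> own = fullAssignment.get(memberId);
        return own == null
                ? Collections.<Integer>emptyList()
                : Collections.unmodifiableList(own);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> fromLeader = roundRobin(Arrays.asList("m1", "m2", "m3"), 4);
        SyncGroupSketch coordinator = new SyncGroupSketch(fromLeader);
        for (String member : fromLeader.keySet()) {
            System.out.println(member + " receives " + coordinator.assignmentFor(member));
        }
    }
}
```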

3.4 Process of a consumer joining a group

  • When a consumer is created, its consumer coordinator picks a lightly loaded node and sends that node a request to find the group coordinator
  • KafkaApis handles the request and returns the node on which the group coordinator resides
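The original listing for this step is not reproduced here. As a rough sketch of the idea, and assuming the usual scheme in which a group is mapped to a partition of the internal offsets topic by hashing the group id (the broker leading that partition then acts as the group coordinator), the lookup could look like this. CoordinatorLookupSketch and its methods are hypothetical names, and 50 is what I understand to be the default partition count of the internal topic.

```java
/** Hypothetical sketch of mapping a group id to a partition of the internal offsets topic. */
final class CoordinatorLookupSketch {

    // Assumed default of offsets.topic.num.partitions.
    static final int DEFAULT_OFFSETS_TOPIC_PARTITIONS = 50;

    /** Absolute value that maps Integer.MIN_VALUE to 0 instead of overflowing. */
    static int nonNegative(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    static int partitionForHash(int hash, int offsetsTopicPartitions) {
        return nonNegative(hash) % offsetsTopicPartitions;
    }

    /** The leader of the returned partition is taken to be the group's coordinator. */
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        return partitionForHash(groupId.hashCode(), offsetsTopicPartitions);
    }

    public static void main(String[] args) {
        String groupId = "demo-group"; // made-up group id
        System.out.println(groupId + " maps to offsets partition "
                + partitionFor(groupId, DEFAULT_OFFSETS_TOPIC_PARTITIONS));
    }
}
```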

  • Once the group coordinator has been found, the consumer coordinator applies to join the group by sending a JoinGroupRequest
  • KafkaApis calls the handleJoinGroup() method to process the request
    • Registers the consumer with the consumer group
    • Generates the consumer's memberId from its clientId plus a UUID
    • Constructs the MemberMetadata for the consumer
    • Registers the consumer's MemberMetadata in GroupMetadata
    • The first consumer to join the group becomes the leader
  • The result of processing the JoinGroupRequest is returned to the consumer
  • After the consumer has joined the group successfully, partition rebalancing takes place
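The registration steps handled on the server side can be sketched as follows. The class names MemberMetadata and GroupMetadata are borrowed from the text, but these are toy versions written for illustration, not the broker classes; the memberId format (clientId, a dash, then a random UUID) is my understanding of what the broker generates.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Toy model (not broker code) of the join handling steps listed above. */
final class JoinGroupSketch {

    static final class MemberMetadata {
        final String memberId;
        final String clientId;

        MemberMetadata(String memberId, String clientId) {
            this.memberId = memberId;
            this.clientId = clientId;
        }
    }

    static final class GroupMetadata {
        final String groupId;
        final Map<String, MemberMetadata> members = new LinkedHashMap<>();
        String leaderId; // null until the first member joins

        GroupMetadata(String groupId) {
            this.groupId = groupId;
        }

        void add(MemberMetadata member) {
            if (members.isEmpty()) {
                leaderId = member.memberId; // the first consumer to join becomes the leader
            }
            members.put(member.memberId, member);
        }
    }

    /** Registers a new consumer in the group and returns its metadata. */
    static MemberMetadata handleJoinGroup(GroupMetadata group, String clientId) {
        String memberId = clientId + "-" + UUID.randomUUID(); // clientId plus a UUID
        MemberMetadata member = new MemberMetadata(memberId, clientId);
        group.add(member);
        return member;
    }

    public static void main(String[] args) {
        GroupMetadata group = new GroupMetadata("demo-group");
        MemberMetadata first = handleJoinGroup(group, "client-a");
        handleJoinGroup(group, "client-b");
        System.out.println("members: " + group.members.keySet());
        System.out.println("leader is first joiner: " + group.leaderId.equals(first.memberId));
    }
}
```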

Origin www.cnblogs.com/hyunbar/p/12527014.html