[Kafka from entry to abandonment series 6] Kafka architecture in depth: high-concurrency reads and writes and ZooKeeper management

Having covered producer strategy in [Kafka from entry to abandonment series 4] Kafka architecture in-depth-producer strategy and consumer strategy in [Kafka from entry to abandonment series 5] Kafka architecture in-depth-consumer strategy, let us now look at how Kafka reads and writes data efficiently and how ZooKeeper manages the cluster.

Kafka efficiently reads and writes data

How does Kafka ensure efficient data reading and writing? It rests on three things: distributed reads and writes, sequential writes to disk, and zero-copy technology. The first two were already touched on in the previous posts.

  • Distributed reading and writing: the various strategies discussed in the earlier posts all serve distributed, reliable, and efficient reads and writes.
  • Sequential writes to disk: the data a producer sends is written to a log file, and each write appends to the end of the file, so the writes are sequential. On the same disk, sequential writes can reach about 600 MB/s, while random writes manage only about 100 KB/s. This comes from the mechanics of a spinning disk: sequential writes are fast because they avoid most of the head seek time.
  • Zero-copy technology: put simply, the data does not have to pass through user space on its way from disk to the network. For the detailed principles, see how Kafka uses zero-copy to improve performance.
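The append-only write pattern from the second bullet can be sketched as follows. This is a toy illustration, not Kafka's storage code: the class name `AppendOnlyLog`, the file name, and the in-memory index are all assumptions made for the example.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy log segment: records are only ever appended to the end of one file."""

    def __init__(self, path):
        self.path = path
        self.positions = []  # positions[i] = (byte position, length) of record i
        open(path, "ab").close()  # create the file if it does not exist yet

    def append(self, record: bytes) -> int:
        """Append one record and return its logical offset."""
        with open(self.path, "ab") as f:
            start = f.seek(0, os.SEEK_END)  # writes always land at the end
            f.write(record)
        self.positions.append((start, len(record)))
        return len(self.positions) - 1

    def read(self, offset: int) -> bytes:
        """Read back the record stored at a logical offset."""
        start, length = self.positions[offset]
        with open(self.path, "rb") as f:
            f.seek(start)
            return f.read(length)


def demo_append_only_log():
    path = os.path.join(tempfile.mkdtemp(), "00000000.log")
    log = AppendOnlyLog(path)
    offsets = [log.append(m) for m in (b"first", b"second", b"third")]
    return offsets, log.read(1), os.path.getsize(path)
```

Because no record is ever rewritten in place, the disk head never has to jump back into the middle of the file, which is the property the sequential-write argument relies on.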

Together, these techniques are what give Kafka its high-concurrency read and write performance.
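To make the zero-copy idea more concrete, here is a small Python sketch of sending a file straight to a socket. Kafka itself is written in Java, so this only illustrates the operating-system mechanism, not Kafka's code; the helper names and the file name are made up for the example.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path, sock):
    """Send a whole file over a connected socket with socket.sendfile()."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform supports it,
        # so the bytes go from the page cache to the socket inside the kernel.
        # On other platforms it falls back to ordinary send() calls.
        return sock.sendfile(f)


def demo_zero_copy():
    payload = b"kafka-log-bytes" * 100
    path = os.path.join(tempfile.mkdtemp(), "segment.log")
    with open(path, "wb") as f:
        f.write(payload)

    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(path, sender)
        sender.close()  # signal end of stream to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        receiver.close()
    return sent, b"".join(chunks) == payload
```

The application code never reads the file contents into its own buffer; it only hands the file and the socket to the kernel.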

Zookeeper management

In a Kafka-based distributed message queue, ZooKeeper's responsibilities include: broker registration, topic registration, producer and consumer load balancing, maintaining the relationship between partitions and consumers, recording message consumption progress, and consumer registration.

Register broker

Brokers are registered in ZooKeeper. Recall that when we built the distributed cluster, we added the ZooKeeper service nodes to the configuration file; those are the nodes the brokers register with.

  • To record broker registration information, a node belonging to Kafka is created in ZooKeeper at the path /brokers.
  • When a Kafka broker starts, it registers with ZooKeeper by reporting its broker.id, which must be unique across the whole cluster, and creates its own node at /brokers/ids/{broker.id}.
  • After the node is created, Kafka records the broker's host name and port number in it.
  • The broker node is an ephemeral node: when the broker's session expires, ZooKeeper deletes the node. This makes it easy to watch for changes in the set of brokers and adjust load balancing in time.
Of course, once the brokers are registered, the topics need to be registered as well.
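The broker registration steps above can be sketched with a small in-memory stand-in for ZooKeeper. `ToyZooKeeper` is not a real client library and the session ids are invented; the sketch only shows how an ephemeral node under /brokers/ids appears on registration and disappears when the session ends.

```python
class ToyZooKeeper:
    """In-memory stand-in for ZooKeeper; not a real client."""

    def __init__(self):
        self.nodes = {}      # path -> data
        self.ephemeral = {}  # path -> owning session id
        self.watchers = []   # callbacks fired when /brokers/ids changes

    def create(self, path, data, session=None):
        if path in self.nodes:
            raise ValueError(f"node already exists: {path}")
        self.nodes[path] = data
        if session is not None:
            self.ephemeral[path] = session
        self._notify()

    def close_session(self, session):
        # Ephemeral nodes disappear together with the session that made them.
        for path in [p for p, s in self.ephemeral.items() if s == session]:
            del self.nodes[path]
            del self.ephemeral[path]
        self._notify()

    def children(self, parent):
        prefix = parent.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self.nodes
                      if p.startswith(prefix) and "/" not in p[len(prefix):])

    def _notify(self):
        for callback in self.watchers:
            callback(self.children("/brokers/ids"))


def register_broker(zk, broker_id, host, port):
    """Create the ephemeral node /brokers/ids/{broker.id} holding host and port."""
    session = f"session-{broker_id}"  # hypothetical session id
    zk.create(f"/brokers/ids/{broker_id}", {"host": host, "port": port}, session)
    return session


def demo_broker_registration():
    zk = ToyZooKeeper()
    seen = []  # every broker list the watcher was told about
    zk.watchers.append(seen.append)
    register_broker(zk, 0, "host-a", 9092)
    session1 = register_broker(zk, 1, "host-b", 9092)
    zk.close_session(session1)  # broker 1 "crashes"
    return seen
```

Registering a second broker with an id that is already taken raises an error in this sketch, which mirrors the requirement that broker.id be unique in the cluster.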

Register topic

In Kafka, the mapping between topics and brokers is maintained in ZooKeeper, under a dedicated node at /brokers/topics/{topic_name}. As mentioned earlier, to ensure data reliability each partition of a topic has replicas, and the number of replicas is controlled by the topic's replication factor.

  • Elect a leader: for each partition, Kafka picks one replica as the leader, and the rest become followers.
  • Replica synchronization: when a producer's message is written to a partition, the broker (Kafka node) acting as leader writes it to its own copy of the partition, and the followers replicate it from the leader to stay in sync.
  • Maintain the ISR: if a follower goes down or falls too far behind, it is removed from the in-sync replica set (ISR), and it rejoins once it has caught up with the leader again.
  • Re-elect the leader: if the leader fails, a new leader is chosen from the remaining in-sync replicas so that service continues.

All of these operations are coordinated through ZooKeeper.
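As a rough sketch of the ISR and leader re-election logic listed above, the following functions track one partition's state when a broker fails. The state layout and function names are assumptions for illustration, and the selection rule (first live in-sync replica in assignment order) is a simplification of what a real cluster does.

```python
def elect_leader(replicas, isr, alive):
    """Pick the first replica, in assignment order, that is alive and in sync."""
    for broker in replicas:
        if broker in isr and broker in alive:
            return broker
    return None  # no eligible replica: the partition has no leader


def handle_broker_failure(state, failed_broker):
    """Return the new partition state after one broker goes down."""
    alive = state["alive"] - {failed_broker}
    isr = [b for b in state["isr"] if b != failed_broker]  # shrink the ISR
    leader = state["leader"]
    if leader == failed_broker:
        leader = elect_leader(state["replicas"], isr, alive)
    return {"replicas": state["replicas"], "isr": isr,
            "leader": leader, "alive": alive}


def demo_failover():
    state = {"replicas": [1, 2, 3], "isr": [1, 2, 3],
             "leader": 1, "alive": {1, 2, 3}}
    after_follower_loss = handle_broker_failure(state, 3)
    after_leader_loss = handle_broker_failure(after_follower_loss, 1)
    return after_follower_loss, after_leader_loss
```

Losing a follower only shrinks the ISR, while losing the leader also triggers a new election among the replicas that are still in sync.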

Register consumer

What does ZooKeeper do on the consumer side?

  • Register a new consumer group: when a new consumer group registers, ZooKeeper creates a dedicated node for it at /consumers/{group_id}. That node has three children: ids, owners, and offsets. The ids node records the consumers currently consuming in the group; the owners node records the topic information consumed by the group; the offsets node records the offset of each partition of each topic. Newer versions of Kafka no longer record offsets in ZooKeeper.
  • Register a new consumer: when a new consumer registers with Kafka, an ephemeral child node is created under /consumers/{group_id}/ids and the consumer's information is recorded in it.
  • Watch for changes in the group's consumers: each consumer must watch for changes in the membership of its own consumer group, that is, changes to the children of /consumers/{group_id}/ids. As soon as a consumer is added or removed, consumer load balancing is triggered.
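The node layout and the membership watch described above can be sketched with a few helper functions. The function names are made up for this example, and comparing two snapshots of the ids children stands in for a real ZooKeeper watch.

```python
def group_paths(group_id):
    """Return the three child paths described for one consumer group."""
    base = f"/consumers/{group_id}"
    return {name: f"{base}/{name}" for name in ("ids", "owners", "offsets")}


def consumer_path(group_id, consumer_id):
    """Path of the ephemeral node that one consumer creates under ids."""
    return f"{group_paths(group_id)['ids']}/{consumer_id}"


def membership_change(before, after):
    """Compare two snapshots of the ids children; any difference means rebalance."""
    joined = sorted(set(after) - set(before))
    left = sorted(set(before) - set(after))
    return {"joined": joined, "left": left, "rebalance": bool(joined or left)}
```

For example, `membership_change(["c1", "c2"], ["c2", "c3"])` reports that `c3` joined and `c1` left, so the group would need to rebalance.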

In fact, ZooKeeper handles not only consumer registration but also the management of consumer strategies.

Producers load balancing

Kafka tries to spread the different partitions of the same topic across different broker servers. This balancing strategy is implemented on top of ZooKeeper.

  • Watch for broker changes: after a producer starts, it registers with ZooKeeper and creates an ephemeral node so that it can watch the broker server list. Because the broker nodes in ZooKeeper are also ephemeral, producers are notified whenever the brokers change and can update their broker list.
  • Watch for topic changes: changes to topics, and to the relationship between brokers and topics, are also tracked through ZooKeeper's Watcher mechanism.

When brokers or topics change, ZooKeeper notices, and the distribution of messages and partitions can be adjusted accordingly.
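A minimal sketch of what a producer might do with those notifications: keep a partition-to-broker table up to date and route each message through it. The class `ProducerMetadata` and its callback are hypothetical, and `zlib.crc32` is only a stand-in for whatever hash a real producer uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(key) % num_partitions


class ProducerMetadata:
    """Toy view of which broker currently leads each partition of one topic."""

    def __init__(self, leaders):
        self.leaders = dict(leaders)  # partition number -> broker id

    def on_leader_change(self, partition, broker_id):
        # Called when a watch reports that a partition moved to another broker.
        self.leaders[partition] = broker_id

    def route(self, key: bytes):
        """Return (partition, broker id) for a message key."""
        partition = choose_partition(key, len(self.leaders))
        return partition, self.leaders[partition]
```

The same key always maps to the same partition, so after a leader change only the target broker differs, not the partition.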

Consumer load balancing

When a consumer starts, it creates an ephemeral node in ZooKeeper under its own consumer ID, at /consumers/[group-id]/ids/[consumer-id], and registers watches on /consumers/[group-id]/ids:

  • Watch the list of consumers: when a consumer joins or leaves, the other consumers in the same group are notified.
  • Watch the list of brokers: consumers also watch for changes in the list of brokers.

The consumers then sort the partitions and consumers and divide up the partitions according to the assignment strategies covered earlier.
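As an illustration of that sort-then-assign step, here is a sketch that assumes a range-style strategy: sort both lists, then hand each consumer a contiguous block of partitions. The function name and the consumer ids are invented for the example.

```python
def range_assign(partitions, consumers):
    """Give each consumer a contiguous range of the sorted partitions."""
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    if not consumers:
        return {}
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        count = per_consumer + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment
```

With seven partitions and consumers `c1`, `c2`, `c3`, the first consumer gets three partitions and the others two each. If `c3` leaves, running the function again over the remaining two consumers redistributes all seven partitions.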

Record consumption progress (offset)

While a consumer is consuming a given partition, it must periodically record its consumption progress (the offset) for that partition in ZooKeeper, so that after the consumer restarts, or after another consumer takes over the partition, consumption can continue from where it left off. Newer versions of Kafka recommend recording offsets in a dedicated internal topic (__consumer_offsets) instead.
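The commit-and-resume behaviour can be sketched independently of where the offsets are stored. `OffsetStore` below is a plain dictionary standing in for either ZooKeeper or the offsets topic, and all names are hypothetical.

```python
class OffsetStore:
    """Toy offset store keyed by (group, topic, partition)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, topic, partition, next_offset):
        # Store the offset of the next message the group should read.
        self._committed[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition, default=0):
        return self._committed.get((group, topic, partition), default)


def consume(messages, store, group, topic, partition, max_messages):
    """Read up to max_messages starting at the committed offset, then commit."""
    start = store.fetch(group, topic, partition)
    batch = messages[start:start + max_messages]
    store.commit(group, topic, partition, start + len(batch))
    return batch


def demo_resume():
    messages = ["m0", "m1", "m2", "m3", "m4"]
    store = OffsetStore()
    first = consume(messages, store, "g1", "orders", 0, 2)   # original consumer
    second = consume(messages, store, "g1", "orders", 0, 2)  # consumer taking over
    return first, second, store.fetch("g1", "orders", 0)
```

Because the offset is stored per group rather than per consumer, whichever consumer takes over the partition continues from the committed position.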

Record the relationship between Partition and Consumer

A consumer group contains multiple consumers. Each consumer group is identified by a globally unique group ID, which is shared by all consumers in the group. Each partition of a subscribed topic can be assigned to only one consumer within a given group (the same partition can, of course, also be assigned to consumers in other groups). Kafka also assigns each consumer a consumer ID, usually of the form hostname:UUID.

  • Record the relationship between partition and consumer: Kafka requires that each partition be consumed by only one consumer within the same group, so the relationship between partitions and consumers has to be recorded in ZooKeeper.

Once a consumer has established its right to consume a partition, it writes its consumer ID to the ephemeral node for that partition in ZooKeeper.
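The one-owner-per-partition-per-group rule can be sketched as a small registry. `OwnershipRegistry` is a hypothetical in-memory substitute for the ephemeral owner nodes; in ZooKeeper the "first writer wins" behaviour would come from node creation failing when the node already exists.

```python
class OwnershipRegistry:
    """Toy record of which consumer owns each partition within each group."""

    def __init__(self):
        self.owners = {}  # (group, topic, partition) -> consumer id

    def claim(self, group, topic, partition, consumer_id):
        """Try to become the owner; return True only if this consumer owns it."""
        key = (group, topic, partition)
        current = self.owners.setdefault(key, consumer_id)
        return current == consumer_id

    def release_all(self, consumer_id):
        """Drop every claim of one consumer, as when its session ends."""
        for key in [k for k, v in self.owners.items() if v == consumer_id]:
            del self.owners[key]
```

A second consumer in the same group cannot claim a partition that is already owned, while a consumer in a different group can, since the key includes the group.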

Part of the content comes from: https://gitbook.cn/books/5ae1e77197c22f130e67ec4e/index.html

Origin blog.csdn.net/sinat_33087001/article/details/108398136