Kafka's ZooKeeper node structure and on-disk log structure

 

Log in to ZooKeeper:

zkCli.sh -server centos1:2181

 

Create the topic mytopic:

kafka-topics.sh --create --zookeeper centos1:2181,centos2:2181,centos3:2181 --replication-factor 3 --partitions 2 --topic mytopic
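
To verify the result, the same script can describe the topic; the --describe output lists the leader, replicas, and ISR of each partition, matching the znode contents shown below:

kafka-topics.sh --describe --zookeeper centos1:2181,centos2:2181,centos3:2181 --topic mytopic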

 

ZooKeeper node structure

/controller      data={"version":1,"brokerid":2,"timestamp":"1495002238024"}      //the broker with id=2 is the leader (controller)

/controller_epoch     data=1
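
Inside the zkCli.sh session opened above, the usual ls and get commands are enough to walk this whole tree, for example:

ls /                  //list the top-level nodes (controller, brokers, consumers, config, ...)

ls /brokers/ids       //list the active broker ids

get /controller       //show which broker is the current controller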

 

/brokers

/brokers/ids //maintains the list of active brokers in real time (ephemeral nodes, so a broker drops off the list when it goes down)

/brokers/ids/0

/brokers/ids/1

/brokers/ids/2

/brokers/topics

/brokers/topics/mytopic/partitions/0/state data={"controller_epoch":7,"leader":1,"version":1,"leader_epoch":0,"isr":[1,0,2]} //"leader":1 means broker 1 is the leader of this partition; every partition has exactly one leader. "isr":[1,0,2] is the in-sync replica set: this partition has three replicas, located on brokers 1, 0, and 2. The leader maintains the synchronization state of the other replicas.

/brokers/topics/mytopic/partitions/1/state    data={"controller_epoch":7,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1,0]}
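
The same partition state can be read directly from the zkCli.sh session:

get /brokers/topics/mytopic/partitions/0/state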

/brokers/seqid

 

/admin/delete_topics

 

/isr_change_notification

 

/consumers/

/consumers/console-consumer-24372     data=

/consumers/console-consumer-24372/ids     data=

/consumers/console-consumer-24372/ids/console-consumer-24372_centos1-1495075711403-999aec1a     data={"version":1,"subscription":{"mytopic":1},"pattern":"white_list","timestamp":"1495075711460"}

/consumers/console-consumer-24372/owners     data=null

/consumers/console-consumer-24372/owners/mytopic     data=null

/consumers/console-consumer-24372/owners/mytopic/0     data=console-consumer-24372_centos1-1495075711403-999aec1a-0

/consumers/console-consumer-24372/owners/mytopic/1     data=console-consumer-24372_centos1-1495075711403-999aec1a-0

/consumers/console-consumer-24372/offsets     data=null

/consumers/console-consumer-24372/offsets/mytopic     data=null

/consumers/console-consumer-24372/offsets/mytopic/0     data=153

/consumers/console-consumer-24372/offsets/mytopic/1 data=582 //console-consumer-24372 is the group name of a console consumer, and 582 is the offset up to which this group has consumed partition 1 of mytopic. The value can be modified directly in ZooKeeper, so that the consumer starts reading messages from the new offset.
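
For example, to rewind this group to the beginning of partition 1 from zkCli.sh (do this while the consumer is stopped; it applies only to the old ZooKeeper-based offset storage shown here):

get /consumers/console-consumer-24372/offsets/mytopic/1     //returns 582

set /consumers/console-consumer-24372/offsets/mytopic/1 0   //the consumer will re-read from offset 0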

 

/config

/config/changes

/config/clients

/config/topics

 

 

Kafka log directory structure (the broker's data directory on disk)

 ./.lock

./meta.properties

./cleaner-offset-checkpoint

./replication-offset-checkpoint

./recovery-point-offset-checkpoint

./mytopic-0 // naming convention: <topic>-<partition id>

./mytopic-0/00000000000000000000.index

./mytopic-0/00000000000000000000.timeindex

./mytopic-0/00000000000000000000.log //this is where the messages are stored

./mytopic-1

./mytopic-1/00000000000000000000.index

./mytopic-1/00000000000000000000.timeindex

./mytopic-1/00000000000000000000.log //this is where the messages are stored
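
The .log files are binary, but the DumpLogSegments tool that ships with Kafka can print their contents; a minimal sketch, run from the Kafka installation directory against the data directory above:

bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files ./mytopic-0/00000000000000000000.log --print-data-log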

 

 

Kafka replication mechanism:

1. Each partition stores n replicas and can tolerate up to n-1 node failures.

2. One of the n replicas is the leader; the leader also maintains the synchronization state of all the other replicas.

3. If the leader fails, the producer is notified and resends the message to the new leader.

4. A new leader is selected as follows: the brokers register themselves in ZooKeeper by racing to create an ephemeral node; the first one to succeed becomes the controller (the /controller node above), and the controller then picks the new partition leader from that partition's ISR.

5. The replication mechanisms supported by Kafka are:

Synchronous mechanism: the producer looks up the leader via ZooKeeper and sends the message to it; the leader writes the message to its local log; each follower pulls the message from the leader, writes it to its own local log, and sends an acknowledgement back to the leader; once the leader has collected the followers' acknowledgements, it sends an acknowledgement to the producer. On the consumer side, all messages are pulled from the leader.

Asynchronous mechanism: unlike the synchronous mechanism, the leader sends the acknowledgement to the producer as soon as it has written the message to its own log, so this mode does not guarantee that the message reaches a failed follower.
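
With the console producer, these two modes map onto the acks setting (assuming a Kafka version whose console producer supports --producer-property; acks=all waits for the ISR like the synchronous mechanism, acks=1 acknowledges after the leader's local write like the asynchronous one):

bin/kafka-console-producer.sh --broker-list centos1:9092 --topic mytopic --producer-property acks=all

bin/kafka-console-producer.sh --broker-list centos1:9092 --topic mytopic --producer-property acks=1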

 

 

Consumer groups and partitions

1. Multiple consumers listening to the same topic can belong to one group. Consumers in the same group will not receive the same message twice; if every consumer should receive all messages, put the consumers in different groups (a demonstration follows at the end of this list).

2. Suppose the number of partitions is m, and the number of consumers belonging to the same group is n:

a. m=n: each consumer is assigned exactly one partition

b. m>n: every consumer is assigned at least one partition, and some consumers are assigned several

c. m<n: only m consumers are assigned a partition, and the remaining n-m consumers receive no messages. If a new consumer is started at this point, it may take a partition away from an old consumer, which then stops receiving messages.

The number of consumers can be adjusted at any time without missing messages.

Command: bin/kafka-console-consumer.sh --bootstrap-server centos1:9092  --topic mytopic --consumer-property group.id=group1

3. You can specify that the consumer only receives messages from a certain partition

Command: bin/kafka-console-consumer.sh --bootstrap-server centos1:9092  --topic mytopic --consumer-property group.id=group1  --partition 0

4. The following ZooKeeper node information is shared by the whole group (newer Kafka versions store offsets in the internal __consumer_offsets topic instead of in ZooKeeper):

/consumers/mygroup/offsets/mytopic/0     data=153

/consumers/mygroup/offsets/mytopic/1     data=582

153 and 582 are the offsets up to which mygroup has read partition 0 and partition 1 of mytopic.
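
A quick way to observe the assignment rules from point 2 (assuming mytopic still has the two partitions created above): start the consumers below in separate terminals; the two group1 consumers split the partitions between them, while the group2 consumer independently receives every message:

bin/kafka-console-consumer.sh --bootstrap-server centos1:9092 --topic mytopic --consumer-property group.id=group1

bin/kafka-console-consumer.sh --bootstrap-server centos1:9092 --topic mytopic --consumer-property group.id=group1

bin/kafka-console-consumer.sh --bootstrap-server centos1:9092 --topic mytopic --consumer-property group.id=group2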

 

Reference: http://www.jasongj.com/2015/01/02/Kafka%E6%B7%B1%E5%BA%A6%E8%A7%A3%E6%9E%90/
