Kafka: Message format, log and index files, consumer groups, rebalance

This post records what I learned about Kafka's Message format, log files, index files, and consumer offsets. Much of the understanding draws on the blog posts and books listed at the end, and on senior colleagues.

kafka messages

A Kafka message (Message; the V1 format is described here) is made up of several parts, chiefly the key and the value.

(1) key: when a message needs to be written to a specific partition of a topic, a key must be provided.

(2) value: the actual message content is stored here.

(3) all other metadata of the message: users generally do not care about it; it is transparent to them.

To store message data, Kafka uses a ByteBuffer holding a compact byte array. Compared with keeping each message as a Java object on the heap, this saves space and improves memory efficiency.
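To make this concrete, below is a minimal sketch, not Kafka's actual record code, of packing a key and a value into one compact ByteBuffer instead of holding them as separate heap objects.

// Minimal sketch: length-prefixed key and value packed into one ByteBuffer.
// This illustrates the idea only, not Kafka's real record format.
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CompactMessage {
    public static ByteBuffer pack(byte[] key, byte[] value) {
        ByteBuffer buf = ByteBuffer.allocate(4 + key.length + 4 + value.length);
        buf.putInt(key.length);   // 4-byte length prefix, then the key bytes
        buf.put(key);
        buf.putInt(value.length); // 4-byte length prefix, then the value bytes
        buf.put(value);
        buf.flip();               // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer msg = pack("user-1".getBytes(StandardCharsets.UTF_8),
                              "hello kafka".getBytes(StandardCharsets.UTF_8));
        System.out.println("packed size in bytes: " + msg.remaining());
    }
}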

log and index files

basic introduction

Looking inside a topic's partition directory, you find three kinds of files: .log, .index, and .timeindex. They have the following characteristics.

(1) A log file is named after the offset of the first message it contains. Offsets are 64-bit, but a 20-digit zero-padded file name is enough to cope with production workloads. As can be seen below, the first log file's name starts at 0 and the second log file's name is 4161281, so the first log file holds the messages with offsets 0 through 4161280.

(2) A set of .index + .log + .timeindex files shares the same name. When a log file is full (1 GB by default), it rolls and a new segment is created to record new messages. This is controlled on the broker side by log.segment.bytes=1073741824, which can be adjusted.

(3) .index and .timeindex files are pre-allocated at 10 MB when created; after the log rolls, they are trimmed to their actual size. That is why the index files of the first few segments are only a few hundred KB.

# Contents of a partition directory; made up with reference to the book cited at the end, mainly to illustrate the concept
[root@hadoop01 /home/software/kafka-2/kafka-logs/football-0]# ll -h
-rw-r--r--. 1 root root 514K Mar 20 16:04 00000000000000000000.index
-rw-r--r--. 1 root root 1.0G Mar 17 03:36 00000000000000000000.log
-rw-r--r--. 1 root root 240K Mar 20 16:04 00000000000000000000.timeindex

-rw-r--r--. 1 root root 512K Mar 20 16:04 00000000000004161281.index
-rw-r--r--. 1 root root 1.0G Mar 17 03:36 00000000000004161281.log
-rw-r--r--. 1 root root 177K Mar 20 16:04 00000000000004161281.timeindex

-rw-r--r--. 1 root root 10M Mar 20 16:04 00000000000008749921.index
-rw-r--r--. 1 root root 390M Mar 17 03:36 00000000000008749921.log
-rw-r--r--. 1 root root 10M Mar 20 16:04 00000000000008749921.timeindex

If you want to inspect these files, you can use the shell tools Kafka provides. The key fields in the output are as follows:

(1) offset: a monotonically increasing integer.

(2) position: increments batch by batch; it can be understood as the byte offset of the message within the log file.

(3) CreateTime: the message timestamp.

(4) magic: 2 means the message format is V2; 0 would mean V0, and 1 would mean V1. This machine uses V2, but for now it can still be understood along the lines of the V1 description above; see the book at the end for the details.

(5) compresscodec: NONE means no compression type is specified. Kafka currently offers four choices: 0-None, 1-GZIP, 2-Snappy, 3-LZ4.

(6) crc: a checksum computed over all the fields.

# Dump and print the contents of a log file
[root@hadoop01 /home/software/kafka-2/kafka-logs/football-0]# ../../bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files 00000000000000000004.log  --print-data-log
Dumping 00000000000000000004.log
Starting offset: 4
baseOffset: 4 lastOffset: 4 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 49 isTransactional: false position: 0 CreateTime: 1584368524633 isvalid: true size: 85 magic: 2 compresscodec: NONE crc: 3049289418
baseOffset: 5 lastOffset: 5 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 49 isTransactional: false position: 85 CreateTime: 1584368668414 isvalid: true size: 73 magic: 2 compresscodec: NONE crc: 2267711305
baseOffset: 6 lastOffset: 6 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 49 isTransactional: false position: 158 CreateTime: 1584368679882 isvalid: true size: 78 magic: 2 compresscodec: NONE crc: 789213838
baseOffset: 7 lastOffset: 7 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 49 isTransactional: false position: 236 CreateTime: 1584368695371 isvalid: true size: 95 magic: 2 compresscodec: NONE crc: 703634716

Structure and principles

(1) Message contents are stored in the .log file, which is the carrier of the message records. Messages are packaged as records and appended to the end of the log file using sequential writes. In the picture on the official site, each partition of a topic can be thought of as a queue, with messages written to it sequentially. The numbers 0, 1, 2, 3 and so on in partition0 in the figure are the offsets of the messages within the partition, an ever-increasing sequence.

Note that consumers also have an offset; the two are easy to confuse when you first start learning. The consumer offset is the consumption position, a continuously updated number whose main purpose is to mark where consumption should continue next time. As the official picture shows, consumer A's offset is 9 and consumer B's offset is 11; each consumer records its own offset separately.

(2) Position indexes are stored in the .index file. By default, an index entry is written each time 4 KB of log has been written (set by log.index.interval.bytes), so the index file is a sparse index: it does not contain an entry for every log record.

The figure below is taken directly from online sources. Records in the log file are written sequentially, each consisting of the message plus its offset and position. The index file's entries are a data structure of relative offset (4 bytes) + position (4 bytes). Storing the offset relative to the segment's first message needs only 4 bytes, which saves space; on lookup, the actual offset is computed by adding the base offset back, which is transparent to the user. Because the log file's name below is 0, the actual offset for relative offset 3 is 3 + 0, still 3.

Although a sparse index has low index density, the offsets in it are ordered. To find the actual message for a given offset, Kafka binary-searches the index to get the nearest entry with a lower (or equal) offset, then scans the actual log file forward from that entry's position until it finds the message. To find the message with offset = 5, it first binary-searches the index file down to the entry (3, 4597), then reads the log file starting from byte 4597 until it reaches the record with offset = 5. This is much faster than reading the log file from the beginning: binary search is O(log N), whereas a full traversal is O(N).

Note that the commas shown in the index figure do not actually exist in the file; they were added to the image to make it easier to understand.
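To make the lookup concrete, here is a minimal sketch of the binary search, assuming a small in-memory index; the entry (3, 4597) is the illustrative pair from the text, not real data.

// Minimal sketch of the sparse-index lookup described above.
public class SparseIndexLookup {
    // One index entry: relative offset -> byte position in the .log file.
    record IndexEntry(int relativeOffset, int position) {}

    // Binary search for the entry with the largest relativeOffset <= target.
    static IndexEntry floorEntry(IndexEntry[] index, int target) {
        int lo = 0, hi = index.length - 1, best = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (index[mid].relativeOffset() <= target) { best = mid; lo = mid + 1; }
            else { hi = mid - 1; }
        }
        return index[best];
    }

    public static void main(String[] args) {
        IndexEntry[] index = {
            new IndexEntry(0, 0), new IndexEntry(3, 4597), new IndexEntry(8, 9201)
        };
        long baseOffset = 0;   // taken from the segment file name
        long targetOffset = 5; // the message we are looking for
        IndexEntry e = floorEntry(index, (int) (targetOffset - baseOffset));
        // From e.position(), scan the .log file forward until the record
        // with offset == targetOffset is reached.
        System.out.println("start scanning .log at byte " + e.position());
    }
}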

(3) The timestamp index is stored in the .timeindex file; its role is to let users query messages within a certain time range. Its entries are a data structure of timestamp (8 bytes) + relative offset (4 bytes). To use it, you first find the relative offset for a timestamp, then look up the corresponding position in the .index file, and finally scan the log file, so it also relies on the offset index described above.
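On the client side, this is the index behind KafkaConsumer's offsetsForTimes() call, which returns the earliest offset whose timestamp is at or after a given time. A minimal sketch follows; the broker address and topic reuse this post's setup, and the group id is made up.

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class TimeLookup {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop03:9092");
        props.put("group.id", "football-group"); // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("football", 0);
            long oneHourAgo = System.currentTimeMillis() - 3600_000L;
            // The broker answers this by consulting .timeindex, then .index.
            Map<TopicPartition, OffsetAndTimestamp> result =
                    consumer.offsetsForTimes(Map.of(tp, oneHourAgo));
            OffsetAndTimestamp ot = result.get(tp);
            if (ot != null) {
                System.out.println("first offset at/after that time: " + ot.offset());
            }
        }
    }
}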

One caveat: because a producer may specify a message's timestamp when producing, message timestamps are not necessarily in order, so try not to specify timestamps explicitly when producing.
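If you do need an explicit timestamp, it goes into the ProducerRecord constructor. A minimal sketch, reusing this post's broker and topic:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TimestampedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop03:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The third argument is the explicit timestamp; pass null to let
            // the producer stamp the record with the current time, which
            // keeps timestamps in order.
            producer.send(new ProducerRecord<>("football", null,
                    System.currentTimeMillis(), "key1", "value1"));
        }
    }
}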

Consumer groups and coordinator

When a consumer consumes messages, it records its offset (note this is not the partition's message offset; the two must be distinguished by context). Consumer offsets are stored in a special internal topic called __consumer_offsets, whose role is to store the offsets of the consumers in each consumer group. By default it is created with 50 partitions (set by offsets.topic.num.partitions) and one replica. If those 50 partitions are spread across 50 servers, the pressure of consumer offset commits is greatly relieved. This special topic is created when a consumer group's consumer is created.

# If only the single broker hadoop03 is started, all 50 partitions will be created on it
[root@hadoop03 /home/software/kafka-2/bin]# sh kafka-console-consumer.sh --bootstrap-server hadoop03:9092 --topic football --from-beginning --new-consumer

So the question is: to which partition is a consumer's offset saved? Kafka computes the partition number from the consumer group's group.id using Math.abs(groupId.hashCode()) % 50, so all consumers within one consumer group can determine the partition their offsets are saved to.
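The computation can be reproduced in a few lines of Java; the group id here is made up:

public class OffsetsPartition {
    public static void main(String[] args) {
        String groupId = "football-group"; // hypothetical group.id
        int numPartitions = 50;            // default offsets.topic.num.partitions
        int partition = Math.abs(groupId.hashCode()) % numPartitions;
        System.out.println("offsets of group '" + groupId
                + "' go to __consumer_offsets-" + partition);
    }
}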

The next question: since all consumers within a consumer group commit their offsets to the same partition of __consumer_offsets, how are different consumers' offsets kept apart? In the messages committed to that partition, the key is groupId + topic + partition number, and the value is the consumer's offset. The key includes a partition number, namely the number of the topic partition the consumer group is consuming. Since in practice a topic partition can be consumed by only one consumer within a consumer group, there is no need to worry about offsets getting mixed up.

In fact, distributing a topic's partitions evenly among the consumers of a consumer group is the job of the coordinator. It listens to the consumers; if a consumer goes down or a new consumer is added, a rebalance occurs, and a chosen strategy reassigns the partitions to consumers. As shown below, the broker that holds the partition storing the consumer group's offsets is elected as the group's coordinator. It is responsible for monitoring each consumer's heartbeat to know its health status, and for distributing the topic's corresponding partitions as evenly as possible to the group's consumers; changes among the consumers, such as a new consumer joining, trigger the coordinator to rebalance.

One more detail: what communication takes place between the coordinator and the consumer group, and how do the individual consumers reach a tacit agreement so they don't grab each other's resources? A summary from predecessors follows.

(1) The consumers send a join group request to the selected coordinator of the consumer group.

(2) The coordinator picks one consumer as the leader of the consumer group, and returns the topic's consumption information to that leader.

(3) Based on the topic information, the leader draws up an assignment plan suited to its own consumer group and returns it to the coordinator via a sync group request.

(4) The coordinator receives the assignment plan and distributes it to the consumers.

(5) Finally, each consumer holds its own assignment and consumes in accordance with it.
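On the client side, the outcome of this exchange can be observed with a ConsumerRebalanceListener. A minimal sketch, reusing this post's setup with a made-up group id:

import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RebalanceWatcher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop03:9092");
        props.put("group.id", "football-group"); // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("football"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> parts) {
                    // fires before reassignment: a good place to commit offsets
                    System.out.println("revoked: " + parts);
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> parts) {
                    // fires after the coordinator hands out the new plan
                    System.out.println("assigned: " + parts);
                }
            });
            // poll loop omitted; the callbacks fire inside poll()
        }
    }
}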

rebalance

A rebalance is how the consumers within a consumer group come to agree on consuming a topic's partitions. The book cited at the end mentions three trigger conditions; only the first is recorded here because it is the most common: a consumer joins the group, leaves it, or crashes (rather like life). The other two are: the number of a topic's partitions is increased with the Kafka shell; and a consumer subscribes to topics by regular expression, so that when a new topic matching the pattern appears, a rebalance is triggered.

There are three assignment strategies: range, round robin, and sticky.

Assume topic TopicA has six partitions in total, p0 ~ p5, and a consumer group has three consumers; use this as a basis to get an intuitive feel for the three strategies.

(1)range

Somewhat like Python's range: partitions are divided into contiguous ranges by partition number. The result is:

consumer 1: p0 p1; consumer 2: p2 p3; consumer 3: p4 p5

(2)round robin

Partitions are handed out one by one in turn, so the distribution is even; the result is omitted here.

(3)sticky

The two strategies above share a small problem: after a consumer goes down and partitions are reassigned, a partition that one consumer had been consuming happily may be handed to a different consumer. For example, under the range strategy, if consumer 3 dies, reassignment yields consumer 1: p0 p1 p2 and consumer 2: p3 p4 p5, so p2 gets moved to a new owner. Given the complexity of managing consumer offsets, we would rather keep the old habits as much as possible; under the sticky strategy the result becomes consumer 1: p0 p1 p4 and consumer 2: p2 p3 p5, so existing assignments are preserved.
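The strategy is chosen per consumer with the partition.assignment.strategy property. A minimal sketch selecting sticky, reusing this post's setup; the other built-in assignors in the same package are RangeAssignor (the default) and RoundRobinAssignor:

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class StickyConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop03:9092");
        props.put("group.id", "football-group"); // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.StickyAssignor");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("football"));
            // on each rebalance the sticky assignor keeps as many existing
            // partition-to-consumer assignments as possible
        }
    }
}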

The understanding above is not necessarily correct, and the writing is rather long-winded, but learning is a continuous process of understanding and correction.

References:

(1) https://blog.csdn.net/xiaoyu_bd/article/details/52398265

(2) The book "Apache Kafka实战" (Apache Kafka in Practice)
