Kafka study notes: knowledge point sorting

1. Why do you need a message system?
1. Decoupling:
　　Allows you to independently extend or modify the processing on both sides, as long as you make sure they follow the same interface constraints.
2. Redundancy:
　　Message queues persist data until they have been fully processed, thus avoiding the risk of data loss. In the "insert-get-delete" paradigm used by many message queues, before removing a message from the queue, your processing system needs to explicitly indicate that the message has been processed, thus ensuring that your data is kept safe until you are done using it.
3. Scalability:
　　Because the message queue decouples your processing, it is easy to increase the rate at which messages are enqueued and processed: you only need to add more processing instances.
4. Flexibility & peak processing capacity:
　　In the case of a sharp increase in traffic, the application still needs to continue to function, but such burst traffic is not common. It would be a huge waste to invest resources on standby at any time based on the ability to handle such peak visits. Using message queues enables critical components to withstand sudden access pressures without completely crashing due to sudden overloaded requests.
5. Recoverability:
　　When a part of the system fails, it will not affect the entire system. Message queues reduce the coupling between processes, so even if a process processing a message hangs, messages added to the queue can still be processed after the system is restored.
6. Order Guarantee:
　　In most usage scenarios, the order of data processing is very important. Most message queues are inherently ordered and guarantee that data will be processed in a specific order. (Kafka guarantees the ordering of messages in a Partition)
7. Buffering:
　　It helps to control and optimize the speed of data flow through the system, and solve the inconsistency of the processing speed of production and consumption messages.
8. Asynchronous communication:
　　Many times, users do not want or need to process messages immediately. Message queues provide asynchronous processing mechanisms that allow users to put a message on the queue, but not process it immediately. Put as many messages as you want into the queue, and then process them when needed.

2. Kafka architecture
2.1 Topological structure

As shown in the figure below:

Figure 1

2.2 Related concepts

The kafka terms shown in Figure 1 are explained as follows:

1. producer:
　　The message producer: a terminal or service that publishes messages to the kafka cluster.
2.broker:
　　The server included in the kafka cluster.
3.topic:
　　The category to which each message published to the kafka cluster belongs, that is, kafka is topic-oriented.
4.partition:
　　partition is a physical concept, each topic contains one or more partitions. The unit of kafka allocation is partition.
5.consumer:
　　A terminal or service that consumes messages from the kafka cluster.
6.Consumer group:
　　In the high-level consumer API, each consumer belongs to a consumer group, and each message can only be consumed by one consumer in the consumer group, but can be consumed by multiple consumer groups.
7.replica:
　　A copy of the partition to ensure the high availability of the partition.
8.leader:
　　A role in the replica, the producer and the consumer only interact with the leader.
9. Follower:
　　A role in the replica that replicates data from the leader.
10.controller:
　　One of the servers in the kafka cluster, used for leader election and various failovers.
11.zookeeper:
　　Kafka stores the meta information of the cluster in zookeeper.
2.3 The storage structure of zookeeper node

The storage structure of kafka in zookeeper is shown in the following figure:

Figure 2

3. Producer publishes messages
3.1 Writing method

The producer uses the push mode to publish messages to the broker. Each message is appended to a partition, which is a sequential write to disk (sequential disk writes can be more efficient than random writes to memory, which helps ensure kafka's throughput).

3.2 Message Routing

When the producer sends a message to the broker, it will choose which partition to store it in according to the partition algorithm. The routing mechanism is:

1. If the partition is specified, use it directly;
2. If the partition is not specified but the key is, select a partition by hashing the key;
3. If neither the partition nor the key is specified, select a partition by round-robin.
The partitioning source code of the java client is attached below and makes this clear at a glance:

//create message instance
public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
     if (topic == null)
          throw new IllegalArgumentException("Topic cannot be null");
     if (timestamp != null && timestamp < 0)
          throw new IllegalArgumentException("Invalid timestamp " + timestamp);
     this.topic = topic;
     this.partition = partition;
     this.key = key;
     this.value = value;
     this.timestamp = timestamp;
}

// Compute the partition: use the specified partition if there is one, otherwise compute it from the key
private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
     Integer partition = record.partition();
     if (partition != null) {
          List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
          int lastPartition = partitions.size() - 1;
          if (partition < 0 || partition > lastPartition) {
               throw new IllegalArgumentException(String.format("Invalid partition given with record: %d is not in the range [0...%d].", partition, lastPartition));
          }
          return partition;
     }
     return this.partitioner.partition(record.topic(), record.key(), serializedKey, record.value(), serializedValue, cluster);
}

// Select a partition using the key
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
     List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
     int numPartitions = partitions.size();
     if (keyBytes == null) {
          int nextValue = counter.getAndIncrement();
          List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
          if (availablePartitions.size() > 0) {
               int part = DefaultPartitioner.toPositive(nextValue) % availablePartitions.size();
               return availablePartitions.get(part).partition();
          } else {
               return DefaultPartitioner.toPositive(nextValue) % numPartitions;
          }
     } else {
          // Hash keyBytes to select a partition
          return DefaultPartitioner.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
     }
}
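The three routing rules can also be condensed into a small, self-contained sketch. It is only an illustration, not the client's actual code: `RoutingSketch` and `choosePartition` are hypothetical names, and `Arrays.hashCode` is used as a stand-in for kafka's murmur2 hash so that the example runs without the kafka client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the three routing rules; not the kafka client's own classes. */
final class RoutingSketch {
    // Counter behind the round-robin rule used when no key is given.
    private final AtomicInteger counter = new AtomicInteger(0);

    /** Clears the sign bit so the value can safely be used with %. */
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    int choosePartition(Integer partition, byte[] keyBytes, int numPartitions) {
        if (partition != null) {
            // Rule 1: an explicitly specified partition is used directly.
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalArgumentException("Invalid partition: " + partition);
            }
            return partition;
        }
        if (keyBytes != null) {
            // Rule 2: hash the key. Arrays.hashCode is a stand-in (assumption);
            // the real client hashes the key bytes with murmur2.
            return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
        }
        // Rule 3: neither partition nor key, so rotate through the partitions.
        return toPositive(counter.getAndIncrement()) % numPartitions;
    }

    public static void main(String[] args) {
        RoutingSketch router = new RoutingSketch();
        byte[] key = "user-42".getBytes(StandardCharsets.UTF_8);
        System.out.println("explicit:    " + router.choosePartition(2, key, 3));
        System.out.println("keyed:       " + router.choosePartition(null, key, 3));
        System.out.println("round-robin: " + router.choosePartition(null, null, 3)
                + ", " + router.choosePartition(null, null, 3));
    }
}
```

The point to notice is that the same key always maps to the same partition, which is what gives kafka its per-key ordering within a partition.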
3.3 Write process

The sequence diagram of a producer writing a message is shown below:

Figure 3

Process description:

1. The producer first finds the leader of the partition from the "/brokers/.../state" node of zookeeper
2. The producer sends the message to the leader
3. The leader writes the message to its local log
4. The followers pull the message from the leader, write it to their local logs, and then send an ACK to the leader
5. After the leader receives the ACK from all replicas in the ISR, it advances the HW (high watermark, the offset of the last commit)

3.4 Producer delivery guarantee

Generally, there are three situations:

1. At most once: messages may be lost, but are never retransmitted
2. At least once: messages are never lost, but may be retransmitted
3. Exactly once: each message is transmitted once and only once

When the producer sends a message to the broker, once the message is committed it will not be lost, thanks to replication. However, if the producer runs into a network problem after sending data to the broker and the communication is interrupted, the producer cannot tell whether the message was committed. Although Kafka cannot determine what happened during a network failure, the producer could generate something like a primary key and retry idempotently on failure, thus achieving Exactly once. This had not yet been implemented in the kafka versions discussed here. Therefore, by default, a message from the producer to the broker is delivered At least once. You can achieve At most once by setting the producer to send asynchronously.

4. The broker saves the message
4.1 Storage method

A topic is physically divided into one or more partitions (corresponding to the num.partitions=3 configuration in server.properties). Each partition physically corresponds to a folder that stores all of the partition's messages and index files, as follows:

Figure 4

4.2 Storage strategy

Kafka keeps all messages regardless of whether they have been consumed. There are two strategies for deleting old data:

1. Time-based: log.retention.hours=168
2. Size-based: log.retention.bytes=1073741824
It should be noted that because Kafka reads a specific message in O(1) time, independent of file size, deleting expired files has nothing to do with improving Kafka performance.

4.3 Topic Creation and Deletion

4.3.1 Creating a topic

The sequence diagram of creating a topic is as follows:

Figure 5

Process description:

1. The controller registers a watcher on the /brokers/topics node of zookeeper. When a topic is created, the controller gets the partition/replica allocation of the topic through the watch.
2. The controller reads the list of all currently available brokers from /brokers/ids. For each partition in set_p (the set of partitions of the new topic):
2.1 Select an available broker from all replicas assigned to the partition (called AR) as the new leader, and set AR as the new ISR
2.2 Write the new leader and ISR to /brokers/topics/[topic]/partitions/[partition]/state
3. The controller sends a LeaderAndISRRequest to the relevant brokers through RPC.
4.3.2 Deleting a topic

The sequence diagram of deleting a topic is as follows:

Figure 6

Process description:

1. The controller registers a watcher on the /brokers/topics node of zookeeper. When a topic is deleted, the controller gets the partition/replica allocation of the topic through the watch.
2. If delete.topic.enable=false, end; otherwise, the watch registered by the controller on /admin/delete_topics is fired, and the controller sends a StopReplicaRequest to the corresponding broker through the callback.

5. Kafka HA
5.1 replication

As shown in Figure 1, the same partition may have multiple replicas (corresponding to default.replication.factor=N in the server.properties configuration). Without replicas, once a broker goes down, the data of all the partitions on it can no longer be consumed, and the producer can no longer write data to those partitions. With replication, the same partition may have multiple replicas, and a leader needs to be selected among them. The producer and consumer only interact with this leader, and the other replicas act as followers that copy data from the leader.

Kafka's algorithm for assigning replicas is as follows:

1. Sort all brokers (assuming a total of n brokers) and the partitions to be allocated
2. Assign the i-th partition to the (i mod n)-th broker
3. Assign the j-th replica of the i-th partition to the ((i + j) mod n)-th broker
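The assignment rule can be sketched as follows. This is a minimal illustration of the rule as stated, not kafka's own code: `ReplicaAssignmentSketch` and `assign` are hypothetical names, brokers are identified by their index in the sorted list, and the replica with j = 0 is taken to be the first replica of a partition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the (i + j) mod n replica assignment rule. */
final class ReplicaAssignmentSketch {

    /** Returns, for each partition i, the broker indexes holding replicas j = 0..replicationFactor-1. */
    static List<List<Integer>> assign(int numBrokers, int numPartitions, int replicationFactor) {
        if (numBrokers <= 0 || replicationFactor <= 0 || replicationFactor > numBrokers) {
            throw new IllegalArgumentException("need 0 < replicationFactor <= numBrokers");
        }
        List<List<Integer>> assignment = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            List<Integer> replicas = new ArrayList<>();
            for (int j = 0; j < replicationFactor; j++) {
                // The j-th replica of the i-th partition goes to broker (i + j) mod n.
                replicas.add((i + j) % numBrokers);
            }
            assignment.add(replicas);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 3 brokers, 4 partitions, 2 replicas per partition.
        List<List<Integer>> assignment = assign(3, 4, 2);
        for (int i = 0; i < assignment.size(); i++) {
            System.out.println("partition " + i + " -> brokers " + assignment.get(i));
        }
    }
}
```

Because consecutive replicas land on consecutive brokers, the replicas of one partition are always spread over different brokers as long as the replication factor does not exceed the number of brokers.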
5.2 Leader failover

When the leader corresponding to the partition is down, a new leader needs to be elected from the followers. When electing a new leader, a basic principle is that the new leader must have all the messages committed by the old leader.

Kafka dynamically maintains an ISR (in-sync replicas) in zookeeper (/brokers/.../state). From the write process in Section 3.3, it can be seen that all replicas in the ISR have kept up with the leader, and only members of the ISR can be elected leader. With f + 1 replicas, a partition can tolerate the failure of f replicas without losing committed messages.

When none of the replicas are working, there are two possible solutions:

1. Wait for any replica in the ISR to come back to life and elect it as the leader. Data is guaranteed not to be lost, but the wait may be relatively long.
2. Select the first replica that comes back to life (not necessarily an ISR member) as the leader. There is no guarantee against data loss, but the period of unavailability is relatively short.
kafka 0.8.* uses the second way.
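The trade-off between the two options can be contrasted in a short sketch. The names `LeaderChoiceSketch` and `electLeader` and the `allowNonIsr` flag are hypothetical and introduced only for illustration; this is not the controller's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch contrasting the two leader-selection options. */
final class LeaderChoiceSketch {

    /**
     * @param assignedReplicas all replicas of the partition, in assignment order
     * @param isr              broker ids currently in the ISR
     * @param live             broker ids that are currently alive
     * @param allowNonIsr      false = option 1 (wait for an ISR member),
     *                         true  = option 2 (accept any surviving replica)
     */
    static Optional<Integer> electLeader(List<Integer> assignedReplicas, Set<Integer> isr,
                                         Set<Integer> live, boolean allowNonIsr) {
        // A live ISR member has every committed message, so it is always preferred.
        for (int replica : assignedReplicas) {
            if (isr.contains(replica) && live.contains(replica)) {
                return Optional.of(replica);
            }
        }
        if (allowNonIsr) {
            // Option 2: shorter unavailability, but this replica may lack committed messages.
            for (int replica : assignedReplicas) {
                if (live.contains(replica)) {
                    return Optional.of(replica);
                }
            }
        }
        // Option 1: the partition stays unavailable until an ISR member returns.
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Integer> replicas = Arrays.asList(1, 2, 3);
        Set<Integer> isr = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> live = new HashSet<>(Arrays.asList(3)); // only a non-ISR replica is alive
        System.out.println("option 1: " + electLeader(replicas, isr, live, false));
        System.out.println("option 2: " + electLeader(replicas, isr, live, true));
    }
}
```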

Kafka elects the leader through the Controller. Please refer to Section 5.3 for the process.

5.3 Broker failover

The kafka broker failover sequence diagram is as follows:

Figure 7

Process description:

2. The controller reads the available brokers from the /brokers/ids node
3. The controller decides set_p, which contains all partitions on the down broker
4. For each partition in set_p
    4.1 Read the ISR from the /brokers/topics/[topic]/partitions/[partition]/state node
    4.2 Determine the new leader (as described in Section 4.3)
    4.3 Write the new leader, ISR, controller_epoch and leader_epoch information to the state node
5. Send the LeaderAndISRRequest command to the relevant brokers via RPC
5.4 Controller failover

Controller failover is triggered when the controller goes down. Each broker registers a watcher on the "/controller" node of zookeeper. When the controller goes down, its ephemeral node in zookeeper disappears and all surviving brokers receive a fire notification. Each broker then tries to create a new controller path, but only one succeeds and is elected as the controller.
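The "only one broker succeeds" behaviour can be imitated in memory. The sketch below does not talk to zookeeper at all: `ControllerElectionSketch` is a hypothetical class, and an `AtomicReference` stands in for the "/controller" ephemeral node, with `null` meaning that the node does not exist.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory stand-in for the election on the "/controller" node. */
final class ControllerElectionSketch {
    // null means the "/controller" node does not currently exist.
    private final AtomicReference<Integer> controllerNode = new AtomicReference<>(null);

    /** A broker tries to create the node; only the first attempt succeeds. */
    boolean tryBecomeController(int brokerId) {
        return controllerNode.compareAndSet(null, brokerId);
    }

    /** Simulates the ephemeral node disappearing when the controller goes down. */
    void controllerDown() {
        controllerNode.set(null);
    }

    Integer currentController() {
        return controllerNode.get();
    }

    public static void main(String[] args) {
        ControllerElectionSketch cluster = new ControllerElectionSketch();
        System.out.println("broker 1 elected: " + cluster.tryBecomeController(1));
        System.out.println("broker 2 elected: " + cluster.tryBecomeController(2));
        cluster.controllerDown(); // surviving brokers are notified and race again
        System.out.println("broker 2 elected: " + cluster.tryBecomeController(2));
        System.out.println("controller is now broker " + cluster.currentController());
    }
}
```

The atomic compare-and-set plays the role of zookeeper's guarantee that a node can only be created once, which is what makes the election safe without any further coordination between brokers.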

When a new controller is elected, the KafkaController.onControllerFailover method will be triggered, and the following operations will be done in this method:

1. Read and increase the Controller Epoch.
2. Register a watcher on the reassignedPartitions Path (/admin/reassign_partitions).
3. Register a watcher on the preferredReplicaElection Path (/admin/preferred_replica_election).
4. Register a watcher on the broker Topics Path (/brokers/topics) via partitionStateMachine.
5. If delete.topic.enable=true (default is false), partitionStateMachine registers a watcher on the Delete Topic Path (/admin/delete_topics).
6. Register a watcher on the Broker Ids Path (/brokers/ids) via replicaStateMachine.
7. Initialize the ControllerContext object, setting all the current topics, the list of "live" brokers, the leaders and ISRs of all partitions, etc.
8. Start replicaStateMachine and partitionStateMachine.
9. Set the brokerState state to RunningAsController.
10. Send the leadership information for each partition to all "live" brokers.
11. If auto.leader.rebalance.enable=true (the default value is true), start the partition-rebalance thread.
12. If delete.topic.enable=true and there is a value in the Delete Topic Path (/admin/delete_topics), delete the corresponding topic.

6. Consumer consumes messages
6.1 consumer API

kafka provides two sets of consumer APIs:

1. The high-level Consumer API
2. The SimpleConsumer API
The high-level consumer API provides a high-level abstraction for consuming data from kafka, while SimpleConsumer APIs require developers to pay more attention to details.

6.1.1 The high-level consumer API

The high-level consumer API provides the semantics of the consumer group: a message can only be consumed by one consumer in the group, the consumer does not need to track the offset when consuming messages, and the last offset is saved by zookeeper.

An application that uses the high-level consumer API can be multi-threaded, in which case the following should be noted:

1. If the number of consumer threads is greater than the number of partitions, some threads will not receive messages
2. If the number of partitions is greater than the number of threads, some threads will receive messages from multiple partitions
3. If a thread consumes multiple partitions, the order of the messages it receives is not guaranteed, although the messages within a single partition are ordered
6.1.2 The SimpleConsumer API

If you want more control over the partition, you should use the SimpleConsumer API, for example to:

1. Read a message multiple times
2. Consume only part of the messages in a partition
3. Use transactions to ensure that a message is consumed only once
However, when using this API, the partition, offset, broker, leader, etc. are no longer transparent to you and must be managed by yourself. You need to do a lot of extra work:

1. The offset must be tracked in the application to determine which message should be consumed next
2. The application needs to find out programmatically which broker is the leader of each partition
3. Leader changes need to be handled
The general flow of using the SimpleConsumer API is as follows:

1. Find a "live" broker and find the leader of each partition
2. Find the followers of each partition
3. Define the request, which should describe what data the application needs
4. Fetch data
5. Identify leader changes and respond as necessary
The following description is for the high-level Consumer API.

6.2 consumer group

As mentioned in Section 2.2, the allocation unit of kafka is the partition. Each consumer belongs to a group, and a partition can only be consumed by one consumer in the same group (which ensures that a message is consumed by only one consumer in the group), but multiple groups can consume the partition at the same time.

One of the design goals of Kafka is to support offline processing and real-time processing at the same time. Accordingly, a real-time processing system such as Spark/Storm can be used to process messages online, a batch processing system such as Hadoop can be used for offline processing, and the data can also be backed up to another data center. You only need to ensure that these three belong to different consumer groups, as shown in the following figure:

Figure 8

6.3 Consumption Mode
The consumer uses the pull mode to read data from the broker.

The push mode has difficulty adapting to consumers with different consumption rates, because the sending rate is determined by the broker. Its goal is to deliver messages as quickly as possible, but this can easily leave the consumer unable to keep up, typically resulting in denial of service and network congestion. In the pull mode, messages can be consumed at an appropriate rate according to the consumer's capacity.

For Kafka, the pull mode is more suitable. It simplifies the design of the broker, and the consumer can independently control the rate at which it consumes messages. The consumer can also control how it consumes (in batches or one message at a time) and can choose different commit methods to achieve different delivery semantics.

6.4 consumer delivery guarantee

If the consumer is set to autocommit, the consumer will automatically commit once it reads the data. If only this process of reading messages is discussed, then Kafka ensures Exactly once.

However, in actual use, the application does not end when the consumer reads the data, but needs to perform further processing, and the order of data processing and commit largely determines the consumer delivery guarantee:

1. Read the message, commit first, and then process the message.
    In this mode, if the consumer crashes after the commit but before it has had time to process the message, it will not be able to read the committed-but-unprocessed message after it restarts. This corresponds to At most once.
2. Read the message, process first, and then commit.
    In this mode, if the consumer crashes after processing the message but before committing, the uncommitted message will be processed again when it restarts, even though it has in fact already been processed. This corresponds to At least once.
3. If you must achieve Exactly once, you need to coordinate the offset with the output of the actual operation.
    The classic approach is to introduce two-phase commit. It is more concise and general if the offset and the operation's output can be stored in the same place, and this may be the better way, since many output systems do not support two-phase commit. For example, after the consumer gets the data, it may put the data in HDFS. If the latest offset is written to HDFS together with the data itself, the data output and the offset update either both complete or both do not, which indirectly achieves Exactly once. (With the high-level API, the offset is stored in Zookeeper and cannot be stored in HDFS, while with the SimpleConsumer API the offset is maintained by the application and can be stored in HDFS.)
In short, Kafka guarantees At least once by default, and allows At most once to be achieved by setting the producer to send asynchronously (see the article "kafka consumer to prevent data loss"). Exactly once requires cooperation with an external storage system. Fortunately, the offset provided by kafka makes this very direct and easy.

For more information on kafka delivery semantics, please refer to "Message Delivery Semantics".
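The effect of the commit/process order can be illustrated with a small in-memory simulation. Nothing here uses the kafka client: `DeliveryOrderSketch`, `run`, and the simulated crash are hypothetical, and the "committed offset" is just a local integer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical simulation of commit-then-process versus process-then-commit. */
final class DeliveryOrderSketch {

    /**
     * Consumes the log from the committed offset, crashing once at offset crashAt
     * between the two steps and then restarting. Returns the messages that were processed.
     */
    static List<String> run(List<String> log, boolean commitFirst, int crashAt) {
        List<String> processed = new ArrayList<>();
        int committed = 0;        // next offset to read after a (re)start
        boolean crashed = false;  // the simulated crash happens only once
        while (committed < log.size()) {
            int offset = committed;
            boolean crashNow = !crashed && offset == crashAt;
            if (commitFirst) {
                committed = offset + 1;            // commit ...
                if (crashNow) { crashed = true; continue; } // ... crash: message is lost
                processed.add(log.get(offset));    // ... then process
            } else {
                processed.add(log.get(offset));    // process ...
                if (crashNow) { crashed = true; continue; } // ... crash: message is redone
                committed = offset + 1;            // ... then commit
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "c");
        System.out.println("commit first  (at most once):  " + run(log, true, 1));
        System.out.println("process first (at least once): " + run(log, false, 1));
    }
}
```

With a crash at offset 1, committing first skips "b" entirely, while processing first handles "b" twice, which matches the At most once and At least once cases described above.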

6.5 Consumer rebalance

Rebalance is triggered when a consumer joins or exits, and when the partitions change (for example, when a broker joins or exits). The consumer rebalance algorithm is as follows:

1. Sort all partitions under the target topic and store them in PT
2. Sort all consumers under a consumer group and store them in CG; the i-th consumer is denoted Ci
3. N = size(PT)/size(CG), rounded up
4. Release Ci's consumption rights to its originally allocated partitions (i starts from 0)
5. Allocate partitions i*N through (i+1)*N - 1 to Ci
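The range computed in the last step can be sketched as below. `RebalanceSketch` and `partitionsFor` are hypothetical names, partitions and consumers are represented by their positions in the sorted lists PT and CG, and clipping the range at the last partition is an assumption added here so that trailing consumers do not receive partitions that do not exist.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the range assignment used by the rebalance algorithm. */
final class RebalanceSketch {

    /** Returns the partition indexes owned by consumer Ci. */
    static List<Integer> partitionsFor(int i, int numPartitions, int numConsumers) {
        // N = size(PT) / size(CG), rounded up.
        int n = (numPartitions + numConsumers - 1) / numConsumers;
        List<Integer> owned = new ArrayList<>();
        // Ci gets partitions i*N .. (i+1)*N - 1, clipped to the partitions that exist.
        for (int p = i * n; p < Math.min((i + 1) * n, numPartitions); p++) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        int numPartitions = 5;
        int numConsumers = 3;
        for (int i = 0; i < numConsumers; i++) {
            System.out.println("C" + i + " -> " + partitionsFor(i, numPartitions, numConsumers));
        }
    }
}
```

When there are more consumers than partitions, N is 1 and the trailing consumers get an empty list, which is the "some threads will not receive messages" case from Section 6.1.1.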
In version 0.8.*, each consumer is only responsible for adjusting the partitions it consumes itself. To keep the whole consumer group consistent, when one consumer triggers a rebalance, all other consumers in the group should trigger a rebalance at the same time. This leads to the following problems:

1. Herd effect
　　Any addition or removal of a broker or consumer triggers a rebalance of all consumers.
2. Split brain
　　Each consumer separately judges through zookeeper which brokers and consumers are down, so different consumers may see different views from zookeeper at the same moment. This is determined by the characteristics of zookeeper, and it results in incorrect rebalance attempts.
3. The adjustment result is uncontrollable
　　No consumer knows whether the rebalance of the other consumers succeeded, which may cause kafka to work in an incorrect state.
Based on the above problems, the kafka designers considered using the central coordinator to control the consumer rebalance in version 0.9.*, and then from the perspective of simplicity and verification requirements, they planned to implement the allocation scheme on the consumer client. (See the articles "Kafka Detailed Consumer Coordinator Design" and "Kafka Client-side Assignment Proposal"), which will not be repeated here.

7. Precautions
7.1 The problem that the producer cannot publish messages

At first, I built a kafka pseudo-cluster on my local machine, and the local producer client published messages to the broker successfully. Then I built a kafka cluster on a server and connected to it from my local machine, but the producer could not publish messages to the broker (strangely, without any error). At first I suspected that iptables was blocking the port, so I opened the port, but that did not help (I then suspected a code problem, a version problem, and so on, and spent a long time on it). In the end, with no other option, I checked the server.properties configuration item by item and found the following two settings:

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = security_protocol://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

In other words, advertised.listeners is the address the broker gives to producers and consumers to connect to. If it is not set, listeners is used, and if host_name is not set, the host name returned by the java.net.InetAddress.getCanonicalHostName() method is used.

Modification method:

1. listeners=PLAINTEXT://121.10.26.XXX:9092
2. advertised.listeners=PLAINTEXT://121.10.26.XXX:9092

After the modification, restart the service and it works normally. For more kafka configuration instructions, see the article "Kafka Learning and Finishing Three (borker (0.9.0 and 0.10.0) Configuration)".

8. Reference articles
1. "Kafka Analysis (1): Kafka Background and Architecture Introduction"

2. "Kafka Design Analysis (2): Kafka High Availability (Part 1)"

3. "Kafka Design Analysis (2): Kafka High Availability (Part 2)"

4. "Kafka Design Analysis (4): Kafka Consumer Analysis"

5. "Kafka Design Analysis (5): Kafka Benchmark"

6. "Kafka Learning and Finishing Three (borker (0.9.0 and 0.10.0) Configuration)"

7. "Using the High Level Consumer"

8. "Using SimpleConsumer"

9. "Consumer Client Re-Design"

10. "Message Delivery Semantics"

11. "Kafka Detailed Consumer Coordinator Design"

12. "Kafka Client-side Assignment Proposal"

13. "Kafka and DistributedLog technology comparison"

14. "kafka installation and startup"

15. "kafka consumer to prevent data loss"

Author: cyfonly
Source: http://www.cnblogs.com/cyfonly/
