Kafka producer write flow analysis (important interview questions!)

1. Point-to-point mode (one-to-one: the consumer actively pulls data, and a message is removed once it has been received). The point-to-point model is usually a pull-based or polling-based messaging model: the client requests messages from the queue rather than having messages pushed to it. Its defining characteristic is that a message sent to the queue is received and processed by one and only one receiver, even when several message listeners are attached.

2. Publish/subscribe mode (one-to-many: once data is produced, it is pushed to all subscribers). The publish/subscribe model is a push-based messaging model. It can have several kinds of subscribers: a temporary subscriber receives messages only while it is actively listening to the topic, whereas a durable subscriber receives every message on the topic, including those published while it was unavailable or offline.
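
The difference between the two models can be sketched in a few lines of Python. This is only an illustration of the delivery semantics described above, not Kafka code: the class names `PointToPointQueue` and `PubSubTopic` are made up for this example, and only temporary (currently listening) subscribers are modeled.

```python
from collections import deque


class PointToPointQueue:
    """Each message is handed to exactly one receiver, then removed."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def poll(self):
        # Receivers pull; once polled, a message is gone for everyone else.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Each published message is pushed to every current subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


# Point-to-point: two receivers poll, only the first one gets the message.
queue = PointToPointQueue()
queue.send("m1")
first_receiver = queue.poll()
second_receiver = queue.poll()

# Publish/subscribe: both subscribers receive the same message.
topic = PubSubTopic()
inbox_a, inbox_b = [], []
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("m1")
```

Here `first_receiver` holds the message and `second_receiver` is `None`, while both `inbox_a` and `inbox_b` end up with their own copy.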

1. Producer: the message producer, i.e. the client that sends messages to a Kafka broker.

2. Consumer: the message consumer, i.e. the client that fetches messages from a Kafka broker.

3. Topic: can be understood as a queue.

4. Consumer Group (CG): the mechanism Kafka uses to broadcast a topic's messages (deliver them to every consumer) or unicast them (deliver them to any one consumer). A topic can have multiple CGs. The topic's messages are copied (conceptually, not physically) to every CG, but within a CG each partition delivers its messages to only one consumer. To broadcast, give each consumer its own CG. To unicast, put all consumers in the same CG. CGs also let you group consumers freely without having to send the same message multiple times to different topics.

5. Broker: one Kafka server is one broker. A cluster consists of multiple brokers, and a single broker can host multiple topics.

6. Partition: for scalability, a very large topic can be spread across multiple brokers (i.e. servers). A topic can be divided into multiple partitions, and each partition is an ordered queue. Every message in a partition is assigned a sequential id (the offset). Kafka guarantees that messages are delivered to consumers in order only within a single partition; it does not guarantee ordering across a whole topic (that is, across its partitions).

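7. Offset: Kafka's storage files are named after an offset, in the form offset.kafka. The benefit of using the offset as the file name is that lookups are easy. For example, to locate offset 2049 you only need to find the file 2048.kafka, that is, the file whose starting offset is the largest one not exceeding 2049. The first file is, of course, 00000000000.kafka. (The .kafka extension comes from early Kafka versions; current releases name segment files with a 20-digit zero-padded base offset and a .log extension, for example 00000000000000000000.log.)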
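
The lookup described in the Offset entry is a "largest starting offset not greater than the target" search, which can be sketched with a binary search. The function name `segment_for_offset` and the list of starting offsets are assumptions made for this example; the file naming follows the offset.kafka form used in the text.

```python
from bisect import bisect_right


def segment_for_offset(base_offsets, target):
    """Return the starting offset of the file that contains `target`.

    `base_offsets` must be sorted ascending; each value is the offset of
    the first message stored in that file.
    """
    index = bisect_right(base_offsets, target) - 1
    if index < 0:
        raise ValueError(f"offset {target} is before the first file")
    return base_offsets[index]


# Hypothetical starting offsets, chosen to match the example in the text.
base_offsets = [0, 1024, 2048, 4096]
found = segment_for_offset(base_offsets, 2049)
file_name = f"{found}.kafka"
```

With these assumed starting offsets, offset 2049 resolves to the file `2048.kafka`. The sketch does not check whether the target lies beyond the end of the last file.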

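The Consumer Group and Partition entries above can also be made concrete with a small in-memory model. This is a sketch under stated assumptions, not Kafka's real behavior: the `Topic` class is invented for this example, the partitioner is CRC32 of the key modulo the partition count (a stand-in, not Kafka's own hash), and partitions are assigned to group members by a simple modulo rule.

```python
import zlib


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.groups = {}  # group name -> list of consumer names

    def join(self, group, consumer):
        self.groups.setdefault(group, []).append(consumer)

    def owner(self, group, partition):
        # Simplified assignment: partition p goes to member p mod group size.
        members = self.groups[group]
        return members[partition % len(members)]

    def send(self, key, value):
        # Stand-in partitioner: the same key always maps to the same partition.
        partition = zlib.crc32(key.encode()) % len(self.partitions)
        log = self.partitions[partition]
        offset = len(log)  # sequential id within this partition only
        log.append(value)
        # Every group sees the message, but only one consumer per group.
        receivers = {g: self.owner(g, partition) for g in self.groups}
        return partition, offset, receivers


topic = Topic(num_partitions=2)
# Unicast: c1 and c2 share one group, so only one of them gets each message.
topic.join("shared", "c1")
topic.join("shared", "c2")
# Broadcast: c3 has a group of its own, so it also gets every message.
topic.join("solo", "c3")

deliveries = [topic.send("order-42", f"event-{i}") for i in range(3)]
```

All three messages share a key, so they land in one partition with offsets 0, 1, 2 and keep their order there. Within the `shared` group they all go to the same single consumer, while `c3` in the `solo` group receives them as well.
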
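1. The producer first finds the leader of the partition from the "/brokers/.../state" node in ZooKeeper. (Clients configured with bootstrap.servers obtain the same leader information from the brokers' metadata rather than reading ZooKeeper themselves.)

2. The producer sends the message to that leader.

3. The leader writes the message to its local log.

4. The followers pull the message from the leader, write it to their local logs, and then send an ACK to the leader.

5. After the leader has received ACKs from all replicas in the ISR (the in-sync replica set), it advances the HW (high watermark, the last committed offset) and sends an ACK to the producer.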
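
Steps 3 to 5 can be simulated in memory to see when the high watermark moves. This is a minimal sketch, not broker code: the `Replica` and `Leader` classes are invented for this example, leader discovery and the network send (steps 1 and 2) are left out, and the HW is modeled as an exclusive bound, so every offset below it counts as committed.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []

    @property
    def log_end_offset(self):
        return len(self.log)

    def fetch_from(self, leader):
        # Step 4: pull whatever is missing, write it locally, and
        # acknowledge by reporting the new log end offset.
        self.log.extend(leader.log[self.log_end_offset:])
        return self.log_end_offset


class Leader(Replica):
    def __init__(self, name, followers):
        super().__init__(name)
        self.isr = list(followers)  # followers currently in sync
        self.high_watermark = 0     # offsets below this are committed

    def append(self, message):
        # Step 3: the leader writes to its own log first.
        self.log.append(message)
        return self.log_end_offset - 1

    def advance_high_watermark(self):
        # Step 5: HW is the smallest log end offset across leader and ISR.
        ends = [self.log_end_offset] + [f.log_end_offset for f in self.isr]
        self.high_watermark = min(ends)
        return self.high_watermark


f1, f2 = Replica("f1"), Replica("f2")
leader = Leader("leader", [f1, f2])

offset = leader.append("m0")
f1.fetch_from(leader)
hw_partial = leader.advance_high_watermark()  # f2 has not fetched yet
f2.fetch_from(leader)
hw_full = leader.advance_high_watermark()     # every ISR member has m0
producer_acked = hw_full > offset
```

While only `f1` has fetched, the HW stays where it was; it advances past the message only after `f2` has also written it, and that is the point at which the leader would ACK the producer.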

Origin www.cnblogs.com/liujinqq7/p/12404828.html