[Messaging] Kafka middleware

First, Kafka overall architecture

Kafka is a publish-subscribe message queue. Producers and consumers are in a many-to-many relationship, so senders and receivers are truly decoupled;

Producers send messages to the broker;

Consumers subscribe to and consume messages in pull mode;

 

Second, producers

Partitioning: a producer can name a partition explicitly, or partition by the message key, in which case messages with the same key are guaranteed to be delivered to the same partition. For example, using the last two digits of the uid as the key, there can be 100 partitions, and all messages for the same uid land in one partition.

Sending: Kafka sends messages in batches and asynchronously. Performance is high, but message ordering cannot be guaranteed;
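The uid example above can be sketched in a few lines. This is only an illustration: `partition_for_uid` and `NUM_PARTITIONS` are hypothetical names, and the plain modulo stands in for the client's real partitioner (the default Java partitioner hashes the key bytes rather than taking digits directly).

```python
NUM_PARTITIONS = 100  # assumed partition count, matching the uid example


def partition_for_uid(uid: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Use the last two digits of the uid as the key and map the key to a partition."""
    key = uid % 100               # last two decimal digits of the uid
    return key % num_partitions   # same key -> same partition


# Every message for uid 1234567 goes to the same partition.
first = partition_for_uid(1234567)
second = partition_for_uid(1234567)
```

The mapping only stays stable while the number of partitions stays the same; changing the partition count changes where a key lands.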

Third, consumers

1, A consumer is any entity that calls the poll method; it can be a thread or a whole service;

2, To avoid idle consumers, the number of consumers in a consumer group should not exceed the number of partitions;

3, Pulling and message-processing models

(1) Synchronous processing: one thread per partition. Consumption within a partition stays ordered, but throughput is limited by how fast each message is processed.

(2) Asynchronous processing: one thread pulls messages and hands them to a thread pool for processing. Ordering within a partition is no longer guaranteed, but consumption is fast and TCP connection overhead is saved.
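The two processing models can be compared with a small simulation. It uses in-memory lists instead of a real Kafka consumer, and `consume_per_partition` and `consume_with_pool` are hypothetical helper names chosen for this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def consume_per_partition(partitions):
    """Model (1): one thread per partition, each processing its messages sequentially."""
    processed = {p: [] for p in partitions}

    def worker(partition, messages):
        for msg in messages:  # the next message waits until this one is done
            processed[partition].append(msg)

    threads = [threading.Thread(target=worker, args=(p, msgs))
               for p, msgs in partitions.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


def consume_with_pool(messages, handle, workers=4):
    """Model (2): a single puller submits messages to a pool; completion order may differ."""
    completed = []
    lock = threading.Lock()

    def task(msg):
        handle(msg)
        with lock:
            completed.append(msg)  # records the order in which processing finished

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for msg in messages:       # submitted in partition order
            pool.submit(task, msg)
    return completed               # every message is here, but not necessarily in order
```

In model (1) each partition's output list always matches its input order. In model (2) only the set of processed messages is guaranteed, which is the trade-off described above.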

 

Fourth, message ordering and duplication

1, Causes of out-of-order messages:

(1) Sender: with asynchronous sending, a failed send that is retried can land behind messages that were sent later, so the order is scrambled;

(2) Receiver: with a single pulling thread and multiple processing threads, messages are processed concurrently and may finish in a different order from the one in which they were pulled;

With multiple pulling threads, a pause such as GC in one thread delays its messages, so they are processed out of order;

(3) Broker: messages are ordered within a single partition, but there is no ordering across partitions;
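The sender-side case (1) can be reproduced with a toy model of a producer writing to one partition. The function and its parameters are hypothetical. `max_in_flight` only loosely mirrors the real producer setting `max.in.flight.requests.per.connection`, and the retry logic is a deliberate simplification.

```python
def send_async_with_retry(batches, fails_first_attempt, max_in_flight):
    """Return the order in which batches end up in the partition log.

    batches: batch ids in the order the producer sends them.
    fails_first_attempt: batch ids whose first send fails and is retried.
    max_in_flight: how many batches may be outstanding at the same time.
    """
    log = []
    pending = list(batches)
    failed_once = set()
    while pending:
        window, pending = pending[:max_in_flight], pending[max_in_flight:]
        retry = []
        for batch in window:
            if batch in fails_first_attempt and batch not in failed_once:
                failed_once.add(batch)
                retry.append(batch)  # re-sent only after the rest of the window
            else:
                log.append(batch)
        pending = retry + pending
    return log


reordered = send_async_with_retry([1, 2, 3], {1}, max_in_flight=2)  # [2, 1, 3]
in_order = send_async_with_retry([1, 2, 3], {1}, max_in_flight=1)   # [1, 2, 3]
```

With two batches in flight, batch 2 is written while batch 1 is still waiting for its retry. Allowing only one outstanding batch keeps the order at the cost of throughput.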

2, Causes of duplicate messages

There is a time gap between a consumer actually processing a message and committing its offset; a failure inside that gap causes the message to be consumed again;

Adding partitions or consumers triggers a "rebalance", which can also cause messages to be consumed more than once;

Conclusion: as long as performance must be preserved, a messaging middleware cannot guarantee that a message is never delivered twice, unless it sacrifices performance and high availability. Downstream services therefore need to be idempotent.
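A minimal sketch of both halves of this conclusion: a consumer that crashes between processing and committing re-reads a message after restart, and an idempotent downstream handler absorbs the duplicate. The log is an in-memory list, and `IdempotentHandler`, `consume`, and `crash_before_commit_at` are hypothetical names for this illustration.

```python
class IdempotentHandler:
    """Downstream handler that applies each message id at most once."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def handle(self, msg_id, amount):
        if msg_id in self.seen_ids:
            return False          # duplicate delivery: ignore it
        self.seen_ids.add(msg_id)
        self.balance += amount
        return True


def consume(log, start_offset, handler, crash_before_commit_at=None):
    """Process log[start_offset:], committing after each message.

    Returns the committed offset, i.e. where the next run resumes.
    """
    committed = start_offset
    for offset in range(start_offset, len(log)):
        msg_id, amount = log[offset]
        handler.handle(msg_id, amount)
        if offset == crash_before_commit_at:
            return committed      # processed, but the offset was never committed
        committed = offset + 1
    return committed


log = [("m1", 10), ("m2", 20), ("m3", 30)]
handler = IdempotentHandler()
resume_at = consume(log, 0, handler, crash_before_commit_at=1)  # crash after handling m2
final_offset = consume(log, resume_at, handler)                 # m2 is delivered again
```

After the restart `m2` is delivered a second time, but the balance still ends at 60 because the handler recognizes the message id.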

3, Message ordering from a business point of view

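Guaranteeing strict message ordering requires tight coordination among producers, consumers, and brokers, and it sacrifices the system's concurrency, for example by giving the topic a single partition. For 99% of business requirements, however, strict global ordering by timestamp is not needed.

Global ordering can be relaxed into ordering within partitions split by business type. For example, the create, complete, and pay events of order A need no strict ordering relative to the create, complete, and pay events of order B, but the order of events within a single order matters: the business must see the create event first, and the later pay event depends on some attributes of that earlier create event.

So we can split the global message stream into locally ordered streams according to business attributes.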
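The order example can be sketched as follows: events from two orders arrive interleaved, are routed by order id, and each order still sees its own events in sequence. `route_by_order` and `events_per_order` are hypothetical helpers, and `zlib.crc32` is just a stable stand-in for whatever hash the real partitioner uses.

```python
import zlib
from collections import defaultdict


def route_by_order(events, num_partitions):
    """Append each (order_id, event) to the partition chosen by the order id."""
    partitions = defaultdict(list)
    for order_id, event in events:
        p = zlib.crc32(order_id.encode()) % num_partitions  # stable hash of the key
        partitions[p].append((order_id, event))
    return partitions


def events_per_order(partitions):
    """Read every partition in order and collect the events seen for each order."""
    per_order = defaultdict(list)
    for messages in partitions.values():
        for order_id, event in messages:
            per_order[order_id].append(event)
    return dict(per_order)


events = [("A", "created"), ("B", "created"), ("A", "completed"),
          ("B", "completed"), ("A", "paid"), ("B", "paid")]
per_order = events_per_order(route_by_order(events, num_partitions=4))
```

No ordering is enforced between order A and order B, yet each order's events stay in created, completed, paid sequence because they share a key and therefore a partition.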

 

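From the analysis above, guaranteeing message ordering means reducing the system's parallelism, which lowers overall throughput.

Strict ordering by message production timestamp is almost impossible to achieve. A usable system therefore guarantees ordering under normal conditions, tolerates out-of-order messages in a few abnormal cases, and monitors and remedies the disorder those cases cause.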
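One possible way to monitor for disorder, sketched under the assumption that the producer attaches a per-key sequence number to every message (the text does not prescribe this; `find_out_of_order` is a hypothetical helper):

```python
def find_out_of_order(messages):
    """Return (key, seq) pairs that arrive with a sequence number not above the last one seen.

    messages: iterable of (key, seq) in the order the consumer receives them.
    """
    last_seq = {}
    violations = []
    for key, seq in messages:
        if key in last_seq and seq <= last_seq[key]:
            violations.append((key, seq))  # late or repeated message for this key
        else:
            last_seq[key] = seq
    return violations


received = [("order-A", 1), ("order-B", 1), ("order-A", 3), ("order-A", 2)]
late = find_out_of_order(received)  # [("order-A", 2)]
```

The returned list can feed an alert or a compensation job; what the remedy looks like depends on the business.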

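For duplicate messages, downstream services should be required to be idempotent rather than relying entirely on the MQ, because an MQ that guarantees high availability and high throughput cannot also guarantee that messages are never duplicated.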

 

      

 


Origin www.cnblogs.com/jlf0103/p/11923343.html