RocketMQ: A Painstaking Summary

Architecture

(architecture diagram)

Conceptual Model

The most basic conceptual model; an extended version appears in a later section.

Storage Model

(storage model diagram)

User Guide

  • RocketMQ is a distributed messaging middleware originally developed by Alibaba for production systems that must absorb massive volumes of online messages. It was donated to the Apache Foundation as an incubator project in late 2016 and became a top-level Apache project in under a year. Alibaba had earlier built a messaging system on top of ActiveMQ, but bottlenecks emerged as message volume grew; Kafka was also considered, but it could not meet the low-latency and high-reliability requirements, so RocketMQ was developed in-house, and it outperforms existing message queues in many respects. RocketMQ and Kafka are very similar in concepts and principles, so the two are often compared. RocketMQ defaults to a long-polling pull mode, supports a very large message backlog on a single machine, and is well suited to high-volume message systems.
  • Multiple NameServers can be deployed, independent of one another, and the other roles report status information to every NameServer machine, achieving hot backup. NameServer itself is stateless: it does not persist Broker, Topic, or other status information; each role reports it periodically and it is held in memory. (NameServer does have a configuration parameter for persistence, but it is rarely used.)
  • Why not ZooKeeper? ZooKeeper is powerful and includes automatic master election, but RocketMQ's architecture is designed so that no master election is needed; such complex features are unnecessary, and a lightweight metadata server is sufficient. Notably, NameServer provides no watcher mechanism like ZooKeeper's; it relies instead on heartbeats every 30s.
  • Heartbeat mechanism
    • Each Broker keeps a heartbeat with every Namesrv at a 30-second interval, and each heartbeat request carries all of that Broker's Topic information. Namesrv checks the Brokers' heartbeat records: if a Broker has sent no heartbeat for 2 minutes, it is considered offline and the Topic-to-Broker mapping is adjusted. Even then, Namesrv does not actively notify Producers or Consumers that a Broker is down.
    • Consumers hold long-lived connections to Brokers and send a heartbeat every 30 seconds. The Broker checks the currently live Consumers every 10 seconds; if a Consumer has sent no heartbeat for 2 minutes, the Broker closes that Consumer's connection and notifies the other instances of the consumer group, triggering a rebalance of that consumer cluster.
    • Producers fetch the Topic-to-Broker mapping from Namesrv every 30 seconds and update their local cache, then keep long-lived connections to all Brokers involved in their Topics, sending a heartbeat every 30 seconds. The Broker likewise scans registered Producers every 10 seconds and disconnects any that has gone more than 2 minutes without a heartbeat. (A toy sketch of this liveness rule follows.)
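
  A toy sketch (not RocketMQ source) of the liveness rule above: senders heartbeat every 30 seconds, and the receiving side scans every 10 seconds for peers silent longer than 2 minutes. All names here are illustrative.

  import java.util.Map;
  import java.util.concurrent.ConcurrentHashMap;
  import java.util.concurrent.Executors;
  import java.util.concurrent.ScheduledExecutorService;
  import java.util.concurrent.TimeUnit;

  public class HeartbeatSketch {
      static final Map<String, Long> lastBeat = new ConcurrentHashMap<>();

      public static void main(String[] args) {
          ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
          // sender side: one heartbeat every 30 seconds
          pool.scheduleAtFixedRate(
                  () -> lastBeat.put("broker-a", System.currentTimeMillis()),
                  0, 30, TimeUnit.SECONDS);
          // receiver side: every 10 seconds, evict anyone silent for over 2 minutes
          pool.scheduleAtFixedRate(
                  () -> lastBeat.entrySet().removeIf(
                          e -> System.currentTimeMillis() - e.getValue() > 120_000),
                  10, 10, TimeUnit.SECONDS);
      }
  }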
  • The pressure on Namesrv is not heavy; its routine cost is maintaining heartbeats and serving the Topic-to-Broker mapping. One caveat: since a Broker's heartbeat carries all the Topics it is responsible for, a very large Topic count (on the order of tens of thousands) can put tens of MB of Topic data into a single heartbeat. On a poor network the transfer fails, the heartbeat fails, and Namesrv wrongly concludes the Broker is down.
  • The queue count of each topic is configurable; auto-created topics default to 4 queues. Messages that must be consumed in order are sent to the same queue, e.g., several order-dependent messages sharing one order number. Ordered consumption guarantees that no two consumers ever consume the same queue, and when there are fewer consumers than queues, one consumer consumes multiple queues. Message duplication is handled on the consumer side. RocketMQ 4.3+ supports transactional messages, usable for distributed transactions (eventual consistency). A send-side sketch of ordered routing follows.
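
  A minimal sketch of routing messages with the same order ID to one queue via the stock MessageQueueSelector API; the topic, group, and address are placeholders.

  import java.util.List;
  import org.apache.rocketmq.client.producer.DefaultMQProducer;
  import org.apache.rocketmq.client.producer.MessageQueueSelector;
  import org.apache.rocketmq.common.message.Message;
  import org.apache.rocketmq.common.message.MessageQueue;

  public class OrderedSendDemo {
      public static void main(String[] args) throws Exception {
          DefaultMQProducer producer = new DefaultMQProducer("order_producer_group");
          producer.setNamesrvAddr("127.0.0.1:9876");
          producer.start();
          long orderId = 10086L;
          Message msg = new Message("OrderTopic", ("order " + orderId).getBytes());
          // same orderId -> same queue index -> strict per-order consumption order
          producer.send(msg, new MessageQueueSelector() {
              @Override
              public MessageQueue select(List<MessageQueue> mqs, Message m, Object arg) {
                  long id = (Long) arg;
                  return mqs.get((int) (id % mqs.size()));
              }
          }, orderId);
          producer.shutdown();
      }
  }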
  • About queueNums:
    • When the client auto-creates a topic, a Math.min calculation means at most 8 queues are created (BrokerConfig); to exceed 8, create or modify the topic through the console. Topic configuration is kept in store/config/topics.json.
    • The smallest granularity of consumer load balancing is the queue, so the number of Consumers should not exceed the queue count.
    • Read and write queue counts (writeQueueNums/readQueueNums) are a RocketMQ-specific concept and can be changed via the console. What happens when readQueueNums differs from writeQueueNums? Producers spread messages only across the write queues, while consumers are assigned only the read queues: setting read > write lets the surplus queues drain so a topic can be shrunk safely, whereas write > read leaves messages in the extra write queues unconsumed.

   
   
  // Excerpt from the RocketMQ client's route-update path for auto-created topics:
  // the new topic's queue count is capped by Math.min against the default topic's
  // read queue count, which is why auto-creation tops out as described above.
  topicRouteData = this.mQClientAPIImpl.getDefaultTopicRouteInfoFromNameServer(
          defaultMQProducer.getCreateTopicKey(), 1000 * 3);
  if (topicRouteData != null) {
      for (QueueData data : topicRouteData.getQueueDatas()) {
          int queueNums = Math.min(defaultMQProducer.getDefaultTopicQueueNums(),
                  data.getReadQueueNums());
          data.setReadQueueNums(queueNums);
          data.setWriteQueueNums(queueNums);
      }
  }
  • Brokers store Topic information. A Topic is made up of multiple queues, which are spread evenly across the Brokers. The Producer's send mechanism distributes messages as evenly as possible over all queues, so the net effect is that messages land evenly on every Broker.
  • Message storage in RocketMQ is a collaboration between ConsumeQueue and CommitLog: a ConsumeQueue stores only a small amount of data, while message bodies are read and written through the CommitLog. If a message has data only in the CommitLog and none in a ConsumeQueue, consumers cannot consume it; RocketMQ's transactional message implementation exploits exactly this.
    • CommitLog: the storage body for messages and their metadata. ConsumeQueues are built on top of the CommitLog, and each ConsumeQueue corresponds to one MessageQueue of the conceptual model, so as long as the CommitLog exists, a ConsumeQueue can be rebuilt even if its data is lost.
    • ConsumeQueue: a logical message queue that stores, for each message, its starting offset in the CommitLog, the message size, and the hashCode of the MessageTag. Every queue under every Topic has its own ConsumeQueue file; for instance, in a Topic with three queues, the message index of each queue is numbered from 0 upward. This gives rise to the notion of a consume offset, with which each Consumer's per-queue progress can be tracked. (A sketch of the entry layout follows.)
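
  A minimal sketch of the fixed 20-byte ConsumeQueue entry just described (8-byte CommitLog offset, 4-byte message size, 8-byte tag hashCode); the field order and sizes match RocketMQ's on-disk format, but the helper class itself is illustrative, not RocketMQ source.

  import java.nio.ByteBuffer;

  class ConsumeQueueEntry {
      static final int ENTRY_SIZE = 20; // fixed width: offset(8) + size(4) + tagsCode(8)

      static ByteBuffer encode(long commitLogOffset, int size, long tagsCode) {
          return (ByteBuffer) ByteBuffer.allocate(ENTRY_SIZE)
                  .putLong(commitLogOffset) // where the full message lives in the CommitLog
                  .putInt(size)             // how many bytes to read there
                  .putLong(tagsCode)        // tag hashCode, used for server-side tag filtering
                  .flip();
      }

      // Because entries are fixed width, consume offset N maps directly to a file position.
      static long filePosition(long consumeOffset) {
          return consumeOffset * ENTRY_SIZE;
      }
  }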
  • RocketMQ's high performance comes from sequential disk writes (CommitLog), zero-copy, and read patterns that mostly hit the PageCache; its high reliability comes from disk flushing and Master/Slave replication. In addition, even if all NameServers go down, Brokers, Producers, and Consumers that are already running are unaffected. (A sketch of the mmap-style append follows.)
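
  A minimal sketch (not RocketMQ source) of the mmap-based sequential append this point relies on: the file is mapped once and records are appended at a moving position, so writes are sequential and served through the OS PageCache. File name and sizes are illustrative.

  import java.io.RandomAccessFile;
  import java.nio.MappedByteBuffer;
  import java.nio.channels.FileChannel;
  import java.nio.charset.StandardCharsets;

  public class MappedAppendDemo {
      public static void main(String[] args) throws Exception {
          final int fileSize = 1 << 20; // 1 MB demo segment (real CommitLog segments are 1 GB)
          try (RandomAccessFile raf = new RandomAccessFile("demo_commitlog", "rw");
               FileChannel ch = raf.getChannel()) {
              MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
              byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
              mapped.putInt(msg.length); // length-prefixed record, appended sequentially
              mapped.put(msg);
              mapped.force();            // explicit flush, analogous to a synchronous flush
          }
      }
  }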
  • Message sending is load balanced and thread safe (multiple instances can send in tight loops), and in clustering mode consumption is load balanced across consumers. These properties, together with the high-performance reads and writes above, give RocketMQ its high-concurrency read/write capability.
  • When both disk flushing and master-slave replication are asynchronous (the default), messages still survive a broker process exit (e.g., a restart), because the broker persists its state on shutdown. Only a physical machine crash carries a risk of message loss. Also, after the master goes down, consumers consume from the slave, but the slave cannot accept writes. (A config example follows.)
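
  For stronger durability than these defaults, both knobs can be switched to synchronous in broker.conf; the keys below are standard broker configuration options, and the values shown are one possible choice, not the stock defaults.

  # broker.conf: trading latency for durability (both default to the async variants)
  brokerRole = SYNC_MASTER      # wait for the slave's ack before confirming a send
  flushDiskType = SYNC_FLUSH    # flush to disk before confirming a send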
  • RocketMQ scales dynamically very well (for non-ordered messages), along two dimensions: Topic and Broker.
    • Topic dimension: if one Topic carries a huge message volume while the cluster watermark is still low, raise that Topic's queue count; send and consume throughput scale in proportion to the queue count.
    • Broker dimension: if the cluster watermark is already high, scale out by simply deploying Brokers on more machines. A new Broker registers with Namesrv; Producers and Consumers discover it via Namesrv and immediately connect to it directly to send and receive messages. (An example command follows.)
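
  For instance, an existing topic's queue count can be raised with the stock mqadmin updateTopic command; the address, cluster, and topic names below are placeholders.

  sh mqadmin updateTopic -n 127.0.0.1:9876 -c DefaultCluster -t OrderTopic -r 16 -w 16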
  • Producer: failed sends are retried 2 times by default; supports sync/async sending. The ProducerGroup matters in the transactional message mechanism: if the producer that sent a half message dies before commit/rollback, the broker will after a while check back with the other instances of the ProducerGroup to decide whether the message should be committed or rolled back. (A sketch follows.)
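
  A minimal sketch of the RocketMQ 4.3+ transactional producer API mentioned above; group/topic names and the address are placeholders, and real local-transaction logic is omitted.

  import org.apache.rocketmq.client.producer.LocalTransactionState;
  import org.apache.rocketmq.client.producer.TransactionListener;
  import org.apache.rocketmq.client.producer.TransactionMQProducer;
  import org.apache.rocketmq.common.message.Message;
  import org.apache.rocketmq.common.message.MessageExt;

  public class TxProducerDemo {
      public static void main(String[] args) throws Exception {
          TransactionMQProducer producer = new TransactionMQProducer("tx_producer_group");
          producer.setNamesrvAddr("127.0.0.1:9876");
          producer.setTransactionListener(new TransactionListener() {
              @Override
              public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                  // run the local transaction; returning UNKNOW defers to the check-back
                  return LocalTransactionState.COMMIT_MESSAGE;
              }
              @Override
              public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                  // the broker calls back here (possibly on another group instance)
                  // when it never saw a commit/rollback for the half message
                  return LocalTransactionState.COMMIT_MESSAGE;
              }
          });
          producer.start();
          producer.sendMessageInTransaction(
                  new Message("OrderTopic", "order-created".getBytes()), null);
          producer.shutdown();
      }
  }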
  • Consumer: DefaultMQPushConsumer/DefaultMQPullConsumer; push is implemented on top of pull, using long polling. In CLUSTERING mode, a message is consumed by only one instance within a ConsumerGroup but may be consumed by several different ConsumerGroups; in BROADCASTING mode, a message is consumed by every instance of the ConsumerGroup. (A push-consumer sketch follows.)
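
  A minimal clustering-mode push consumer sketch; the group/topic names and address are placeholders.

  import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
  import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
  import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
  import org.apache.rocketmq.common.message.MessageExt;
  import org.apache.rocketmq.common.protocol.heartbeat.MessageModel;

  public class PushConsumerDemo {
      public static void main(String[] args) throws Exception {
          DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("order_consumer_group");
          consumer.setNamesrvAddr("127.0.0.1:9876");
          consumer.setMessageModel(MessageModel.CLUSTERING); // or BROADCASTING
          consumer.subscribe("OrderTopic", "*");
          consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) -> {
              for (MessageExt msg : msgs) {
                  System.out.println(new String(msg.getBody()));
              }
              // returning RECONSUME_LATER instead would put the batch on the retry schedule
              return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
          });
          consumer.start();
      }
  }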
  • DefaultMQPushConsumer: when the Broker receives a pull request and the queue holds no new message, it does not return immediately; it loops checking the state, calling waitForRunning for a stretch (5s) and then re-checking. If there is still no new message and, by the third check, the wait has exceeded suspendMaxTimeMills (15s), it returns an empty result. If a new message arrives during the wait, the Broker calls notifyMessageArriving directly to complete the request. The core of "long polling" is that the Broker holds (suspends) the client's request for a short time, and if a message arrives within that window it is returned immediately over the existing connection. The initiative still lies with the Consumer: even with a large backlog, the Broker never pushes messages to the Consumer unasked. The limitation of long polling is that holding requests ties up resources; it suits scenarios such as message queues where the number of client connections is manageable.
  • DefaultMQPullConsumer: the user iterates over the MessageQueues and saves offsets personally, so the pull consumer offers more autonomy and flexibility. (A sketch follows.)
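
  A minimal pull-consumer sketch showing that the caller owns both the queue iteration and the offset bookkeeping; names and the address are placeholders, and offsets here are kept only in memory rather than persisted.

  import java.util.HashMap;
  import java.util.Map;
  import org.apache.rocketmq.client.consumer.DefaultMQPullConsumer;
  import org.apache.rocketmq.client.consumer.PullResult;
  import org.apache.rocketmq.common.message.MessageQueue;

  public class PullConsumerDemo {
      public static void main(String[] args) throws Exception {
          DefaultMQPullConsumer consumer = new DefaultMQPullConsumer("pull_consumer_group");
          consumer.setNamesrvAddr("127.0.0.1:9876");
          consumer.start();
          // the caller walks every queue of the topic and tracks its own offsets
          Map<MessageQueue, Long> offsets = new HashMap<>();
          for (MessageQueue mq : consumer.fetchSubscribeMessageQueues("OrderTopic")) {
              long offset = offsets.getOrDefault(mq, consumer.fetchConsumeOffset(mq, true));
              PullResult result = consumer.pullBlockIfNotFound(mq, null, offset, 32);
              offsets.put(mq, result.getNextBeginOffset()); // persist this somewhere durable
          }
          consumer.shutdown();
      }
  }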
  • For non-ordered messages in clustering mode, failed consumption is retried 16 times by default, using delay levels 3 through 18 (messageDelayLevel = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h"), i.e., the first retry comes after 10s and the last after 2h.
  • MQClientInstance is the underlying class of both Producer and Consumer; it is what actually talks to NameServer and the Brokers. If instanceName is not manually specified when creating a Consumer or Producer, the process holds only one MQClientInstance object, so a Java program that must connect to multiple MQ clusters has to set different instanceNames manually. Worth mentioning: when consumers in different JVM instances on the same physical machine specify the same instanceName, clustered load-balanced consumption fails (each instance consumes all messages). Likewise, to simulate clustered consumption within a single JVM, you must specify different instanceNames, otherwise startup reports that the ConsumerGroup already exists. (A sketch follows.)
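
  A brief sketch of the instanceName rule above (cluster addresses and names are placeholders): distinct instanceNames force separate MQClientInstances, letting one JVM talk to two different MQ clusters.

  import org.apache.rocketmq.client.producer.DefaultMQProducer;

  public class TwoClustersDemo {
      public static void main(String[] args) throws Exception {
          // without distinct instanceNames, both producers would share one
          // MQClientInstance and both would point at a single cluster
          DefaultMQProducer p1 = new DefaultMQProducer("group_a");
          p1.setNamesrvAddr("ns-cluster-1:9876");
          p1.setInstanceName("cluster1");
          DefaultMQProducer p2 = new DefaultMQProducer("group_a");
          p2.setNamesrvAddr("ns-cluster-2:9876");
          p2.setInstanceName("cluster2");
          p1.start();
          p2.start();
      }
  }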

More

Original summary: https://blog.csdn.net/javahongxi/article/details/84931747
