Summary of the RocketMQ source code

1. Architecture Overview

The RocketMQ architecture is divided into four main parts:

  • Producer: the message-publishing role, supporting distributed cluster deployment. A producer selects the appropriate queue in the broker cluster for message delivery through MQ's load-balancing module; the delivery process supports fast failure and low latency.
  • Consumer: the message-consuming role, supporting distributed cluster deployment. It supports both push and pull consumption modes, as well as clustering and broadcasting consumption, and provides a real-time message subscription mechanism that meets the needs of most users.
  • NameServer: a very lightweight topic-routing registry, similar in role to ZooKeeper in Dubbo, supporting dynamic registration and discovery of brokers. It has two main functions. Broker management: the NameServer accepts registration information from the broker cluster, keeps it as the basic routing data, and uses a heartbeat mechanism to check whether each broker is still alive. Routing management: each NameServer holds the complete routing information of the broker cluster and the queue information for client queries, so producers and consumers can learn the routing of the whole broker cluster from the NameServer and deliver or consume messages accordingly. NameServers are usually deployed as a cluster whose instances do not communicate with each other; each broker registers its routing information with every NameServer, so every NameServer instance stores a complete copy of the routing information. When one NameServer goes offline for any reason, brokers can still synchronize their routing information with the remaining NameServers, and producers and consumers can still dynamically perceive broker routing.
  • BrokerServer: the broker is mainly responsible for message storage, delivery, and query, as well as guaranteeing high service availability.

2. The relationship between topic, broker and queue

The relationship between the three is as follows:

In the figure, there are two brokers and two topics, and each topic has four queues. When a producer sends a message, it is sent to one specific queue; when a consumer fetches messages, it also fetches them from a queue.

Note: RocketMQ topics can be created manually in the console or automatically (the broker setting autoCreateTopicEnable=true must be turned on). The official recommendation is to disable automatic creation in production environments.

3. Message processing flow

The overall flow of RocketMQ message processing is as follows:

  1. Message reception: receiving messages sent by producers. The processing class is SendMessageProcessor; once the message has been written to the commitLog file, reception is complete;
  2. Message distribution: the broker class that handles message distribution is ReputMessageService. It starts a thread that continuously dispatches commitLog entries to the corresponding consumeQueue. This step writes two kinds of files, consumeQueue and indexFile; once they are written, distribution is complete;
  3. Message delivery: delivering messages to consumers. A consumer initiates a request to fetch messages; after the broker receives the request, the PullMessageProcessor class looks up the consumeQueue file to locate the messages and returns them to the consumer, completing delivery.

4. The "three highs" (high concurrency, high availability, high scalability)

4.1 High concurrency

  1. High-performance Netty transport: Netty is used for high-performance communication among producers, brokers, and consumers; for business processing a custom worker thread pool is used, and NettyServerHandler hands the actual processing off to that worker pool.
  2. Spin lock to reduce context switching: to serialize concurrent writes, RocketMQ's CommitLog uses a PutMessageLock. PutMessageLock has two implementations: PutMessageReentrantLock and PutMessageSpinLock. PutMessageReentrantLock is based on Java's ReentrantLock, a blocking wait/wake-up mechanism; PutMessageSpinLock uses Java's CAS primitive and locks/unlocks by spinning. RocketMQ uses PutMessageSpinLock by default to make locking and unlocking more efficient under highly concurrent writes and to reduce thread context switches (a minimal sketch of the idea appears after this list).
  3. Sequential file writes: the commitLog is written sequentially, which performs far better than random writes. The commitLog is not written straight to disk; data is first written to the PageCache, and the operating system later flushes the PageCache to disk asynchronously.
  4. MappedFile warm-up and the zero-copy mechanism: when writing data, Linux does not write directly to disk; it writes to the corresponding PageCache page and marks that page as dirty. When dirty pages accumulate to a certain amount, or after a certain period of time, the data is flushed to disk (of course, if the system loses power in the meantime, the dirty-page data is lost).
  5. Multi-broker, multi-queue mode: deploying multiple brokers and multiple queues per topic improves the parallelism of message processing.
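As a rough illustration of the spin-lock idea, here is a minimal CAS-based lock in Java, in the spirit of PutMessageSpinLock; this is a simplified sketch, not the actual RocketMQ source.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of a CAS-based spin lock; true = free, false = held.
public class SpinLockSketch {
    private final AtomicBoolean free = new AtomicBoolean(true);

    public void lock() {
        // Spin until we flip the flag from "free" to "held".
        // No thread is parked, so there is no context switch while waiting.
        while (!free.compareAndSet(true, false)) {
            // busy-wait; only acceptable when the critical section is very short
        }
    }

    public void unlock() {
        free.compareAndSet(false, true);
    }
}
```

Spinning only pays off when the critical section (appending to the in-memory commitLog buffer) is very short, which is why RocketMQ makes the spin lock the default.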

4.2 High availability

RocketMQ's high availability is provided by DLedger. The broker's high-availability architecture is as follows:

DLedger is a multi-node cluster. Internally it uses the Raft algorithm to elect a leader node, and it performs failover for the broker's master node when the leader fails.

  1. Multiple NameServer instances avoid a NameServer single point of failure.
  2. Multiple broker groups: when one broker group fails, the other broker groups can still work normally.
  3. Each broker group has one master node and several slave nodes. When the master fails, DLedger detects the failure and switches one of the slave nodes to master, ensuring the cluster continues to work normally.

4.3 High scalability

Brokers have no coupling with producers or consumers. When you need to add a broker group (one master, multiple slaves), you only need to configure the NameServer address for it and bring it online; in theory brokers can be scaled out arbitrarily.

When a broker group is added to the cluster, the newly added group registers itself with the NameServer, and producers and consumers can then discover it.

5. Message reliability

RocketMQ message reliability can be divided into the following stages:

  1. Reliability in the message sending phase
  2. Reliability in the message storage phase
  3. Reliability in the message consumption phase

Below we will introduce how the reliability of these three stages is achieved.

5.1 Reliability in the message sending phase

Reliability in the message sending phase is handled on the producer side. RocketMQ supports three main ways of sending messages (a minimal code sketch follows the list below):

  • Synchronous: after the message is sent, the calling thread blocks until the result is returned.
  • Asynchronous: when sending, a callback for the send result can be registered. The calling thread is not blocked; the callback receives the result once the send completes.
  • One-way: the thread does not block, no result is returned, and no result callback can be set. The message is simply fired off with no concern for whether the send succeeded.
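A minimal sketch of the three send modes, assuming the 4.x open-source Java client; the producer group, topic, and name-server address are placeholders:

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendCallback;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;
import java.nio.charset.StandardCharsets;

public class SendModesSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder address
        producer.start();

        Message msg = new Message("DemoTopic", "TagA", "hello".getBytes(StandardCharsets.UTF_8));

        // 1. Synchronous: blocks until the broker returns a result.
        SendResult syncResult = producer.send(msg);
        System.out.println("sync result: " + syncResult.getSendStatus());

        // 2. Asynchronous: returns immediately; the callback receives the result.
        producer.send(msg, new SendCallback() {
            @Override public void onSuccess(SendResult result) {
                System.out.println("async ok: " + result.getSendStatus());
            }
            @Override public void onException(Throwable e) {
                e.printStackTrace();
            }
        });

        // 3. One-way: fire and forget, no result at all.
        producer.sendOneway(msg);

        Thread.sleep(3000); // give the async callback time to fire before shutdown
        producer.shutdown();
    }
}
```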

In terms of message reliability,

  • Synchronous sending: when a send fails, it is retried internally (by default 1 send + 2 retries, 3 attempts in total; configurable, see the snippet after this list). In addition, because the send result is available when the call returns, failures can also be handled by the caller itself.
  • Asynchronous sending: when a send fails, the same internal retries apply (by default 1 send + 2 retries, 3 attempts in total). In addition, a callback can be registered when sending, so failed messages can be handled by your own code when the send fails.
  • One-way sending: in this mode there is no retry when a send fails (only a warn-level log is written), no result is returned, and no callback can be registered.
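These retry settings are configurable on the producer. A small sketch, assuming the 4.x Java client (the values shown are the defaults):

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;

public class ProducerRetryConfigSketch {
    public static DefaultMQProducer newProducer() {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setRetryTimesWhenSendFailed(2);             // sync sends: 1 send + 2 retries
        producer.setRetryTimesWhenSendAsyncFailed(2);        // async sends: same default
        producer.setRetryAnotherBrokerWhenNotStoreOK(true);  // optionally try another broker if storing fails
        return producer;
    }
}
```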

5.2 Reliability in the message storage stage

Reliability in the message storage stage is guaranteed by the broker.

In a single-master broker architecture, a message is first written to the in-memory PageCache and then flushed to disk. There are two flush modes:

  • SYNC_FLUSH (synchronous flush): after the message is written to the PageCache, the flush thread is notified immediately and the writing thread waits for the flush to finish. When the flush thread completes, it wakes the waiting thread and the write is reported as successful. This guarantees that data is not lost, but throughput is limited.
  • ASYNC_FLUSH (asynchronous flush, the default): as soon as the message is written to the PageCache, success is returned to the client. When the messages in the PageCache accumulate to a certain amount, or on a timer-like policy, they are written to disk in a batch. This gives high throughput and high performance, but data still in the PageCache can be lost, so absolute data safety cannot be guaranteed.

Summary: synchronous flushing does not lose data but costs performance; asynchronous flushing performs well, but if a power failure happens before a message is flushed, that message is lost.

In a one-master, multiple-slave broker architecture, the master node has two role options:

  • SYNC_MASTER (synchronous master): after receiving a message, the master replicates it to the slave immediately and returns success only once the slave has synchronized it. High reliability.
  • ASYNC_MASTER (asynchronous master): after receiving a message, the master does not replicate it to the slave immediately; replication is done by a background thread. If a master-slave switch happens before that replication has run, data may be lost.

Summary: a synchronous master is highly reliable and does not lose data on a master-slave switch, but because it must wait for the slave to replicate successfully before returning, performance is slightly lower; an asynchronous master performs better, but if a master-slave switch happens before replication, data that exists only on the original master may not have reached the slave, causing message loss.

Summary: to maximize message reliability, a single-master broker can use the SYNC_FLUSH (synchronous flush) mode; in a one-master, multiple-slave architecture, the master can use ASYNC_FLUSH (asynchronous flush) together with the SYNC_MASTER (synchronous master) role. In practice, choose based on the concrete scenario.

5.3 Reliability in the message consumption stage

Reliability in the message consumption phase is guaranteed by the consumer. When consuming a message, one of two results can be returned (a minimal listener sketch follows the list):

  • CONSUME_SUCCESS: consumption success
  • RECONSUME_LATER: Consumption failed, please consume again later
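A minimal push-consumer sketch returning these two statuses, assuming the 4.x Java client; the group, topic, and address are placeholders:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;

public class ConsumeStatusSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // placeholder address
        consumer.subscribe("DemoTopic", "*");
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
            try {
                // process the batch of messages here
                return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;   // ack
            } catch (Exception e) {
                return ConsumeConcurrentlyStatus.RECONSUME_LATER;   // ask the broker to retry later
            }
        });
        consumer.start();
    }
}
```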

After a consumer fails to consume a message, RocketMQ provides a retry mechanism so the message can be consumed again. Consumption failures can usually be attributed to the following situations:

  • The message itself is the problem, e.g. deserialization fails or the message data cannot be processed (say, a phone-bill top-up where the phone number in the message has been cancelled). Such errors usually require skipping this message and consuming the others; retrying the failed message immediately will almost never succeed, so it is best to provide a scheduled retry, for example retry after 10 seconds.
  • A downstream dependency is unavailable, e.g. the database connection is down or an external system is unreachable over the network. With this kind of error, skipping the failed message does not help, because consuming other messages will fail as well. In this case it is recommended that the application sleep for, say, 30 s before consuming the next message, which reduces the broker's retry pressure.

RocketMQ sets up a retry queue whose topic name is %RETRY%+consumerGroup for each consumer group (note that this retry topic is per consumer group, not per topic); it temporarily holds messages that the consumer could not consume because of various exceptions. Since recovering from an exception takes some time, the retry queue has multiple retry levels, each with its own re-delivery delay; the more retries, the larger the delivery delay. RocketMQ handles retry messages by first saving them to the delay queue whose topic name is SCHEDULE_TOPIC_XXXX; a background scheduled task then, after the corresponding delay, saves them again into the %RETRY%+consumerGroup retry queue.

6. Load balancing

6.1 producer load balancing

When a Producer sends a message, it first finds the TopicPublishInfo for the specified topic. After obtaining the TopicPublishInfo routing information, the RocketMQ client by default calls selectOneMessageQueue() to pick one queue (MessageQueue) from TopicPublishInfo's messageQueueList to send the message to. The concrete fault-tolerance strategy is defined in the MQFaultStrategy class.

MQFaultStrategy has a switch variable, sendLatencyFaultEnable. When it is turned on, brokers currently marked "not available" are filtered out of the random-increment-modulo queue selection. The so-called latencyFaultTolerance means backing off from a broker for a period of time after earlier failures: for example, if the latency of the last request exceeded 550 ms, the broker is backed off for 30000 ms; if it exceeded 1000 ms, the back-off is 60000 ms. When the switch is off, a plain random-increment-modulo method is used to select the MessageQueue. This latencyFaultTolerance mechanism is the key to achieving high availability for message sending.
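A much-simplified sketch of that idea, not the actual MQFaultStrategy code (class, method, and field names here are illustrative): round-robin over the queues by an incrementing counter, skipping brokers that are temporarily backed off.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical sketch of latency-aware queue selection.
public class QueueSelectSketch {
    private final AtomicInteger counter = new AtomicInteger();
    // brokerName -> timestamp (ms) until which the broker is considered unavailable
    private final Map<String, Long> notAvailableUntil = new ConcurrentHashMap<>();

    public String selectQueue(List<String> queues /* "brokerName#queueId" */, boolean latencyFaultEnable) {
        int start = counter.getAndIncrement();
        for (int i = 0; i < queues.size(); i++) {
            String q = queues.get(Math.abs(start + i) % queues.size());
            String broker = q.split("#")[0];
            Long until = notAvailableUntil.get(broker);
            // Without the switch, take the round-robin pick; with it, skip backed-off brokers.
            if (!latencyFaultEnable || until == null || System.currentTimeMillis() >= until) {
                return q;
            }
        }
        return queues.get(Math.abs(start) % queues.size()); // fall back to plain round-robin
    }

    // After a slow or failed send, back the broker off for a while (longer for higher latency).
    public void updateFault(String broker, long latencyMillis) {
        long backoff = latencyMillis >= 1000 ? 60_000 : latencyMillis >= 550 ? 30_000 : 0;
        if (backoff > 0) {
            notAvailableUntil.put(broker, System.currentTimeMillis() + backoff);
        }
    }
}
```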

6.2 consumer load balancing

In RocketMQ, both consumption modes on the Consumer side (Push and Pull) are based on pulling messages; Push mode is just a wrapper around pull. Its essence is that the message-pulling thread pulls a batch of messages from the server, submits them to the message-consuming thread pool, and then immediately keeps trying to pull from the server "non-stop". If no messages are fetched, it waits a short delay and pulls again. In both pull-based modes, the Consumer side needs to know which message queues on the Broker side it should pull from. Load balancing is therefore needed on the Consumer side, i.e., deciding which consumers in the same ConsumerGroup the multiple MessageQueues on the Broker side are allocated to.

  1. Heartbeat packet sending on the Consumer side

After the Consumer starts, it continuously sends heartbeat packets via a scheduled task to all Broker instances in the RocketMQ cluster (the packet contains the consumer group name, the subscription set, the message communication mode, the client id, and so on). After receiving a Consumer's heartbeat, the Broker maintains it in the ConsumerManager's local cache variable consumerTable and saves the encapsulated client network channel information in the local cache variable channelInfoTable, providing metadata that the subsequent consumer-side load balancing can rely on.

  2. The core class for implementing load balancing on the Consumer side: RebalanceImpl

In the startup process of the Consumer instance, the startup of the MQClientInstance instance will complete the startup of the load balancing service thread - RebalanceService (executed every 20 seconds). By looking at the source code, we can find that the run() method of the RebalanceService thread ultimately calls the rebalanceByTopic() method of the RebalanceImpl class, which is the core of achieving consumer-side load balancing. Here, the rebalanceByTopic() method will perform different logical processing depending on whether the consumer communication type is "broadcast mode" or "cluster mode". Here we mainly look at the main processing procedures in cluster mode:

(1) Obtain the message consumption queue set (mqSet) for the topic from the RebalanceImpl instance's local cache variable topicSubscribeInfoTable;

(2) Call the mQClientFactory.findConsumerIdList() method with the topic and consumerGroup as parameters to send an RPC request to the Broker, obtaining the list of consumer IDs in the consumer group (the Broker answers from the consumerTable it built from the heartbeat data previously reported by consumers; request code: GET_CONSUMER_LIST_BY_GROUP);

(3) First sort both the message consumption queues and the consumer IDs under the topic, then use the message-queue allocation strategy (by default, the average allocation algorithm) to compute which queues the current consumer should pull from. The average allocation algorithm works like paging: all MessageQueues are the records and all consumers are the pages; it computes how many records each page should hold and the record range of the current page, then walks that range to find the records (MessageQueues) assigned to the current consumer.
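A hedged sketch of that average allocation, in the spirit of AllocateMessageQueueAveragely (simplified; it assumes both lists are already sorted, and the class and variable names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class AverageAllocateSketch {
    // mqAll: all queues of the topic (sorted); cidAll: all consumer ids in the group (sorted).
    public static <Q> List<Q> allocate(String currentCid, List<Q> mqAll, List<String> cidAll) {
        List<Q> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCid);
        if (index < 0) {
            return result; // current consumer not (yet) known in the group
        }
        int mod = mqAll.size() % cidAll.size();
        int averageSize = mqAll.size() <= cidAll.size() ? 1
                : (mod > 0 && index < mod ? mqAll.size() / cidAll.size() + 1
                                          : mqAll.size() / cidAll.size());
        int startIndex = (mod > 0 && index < mod) ? index * averageSize
                                                  : index * averageSize + mod;
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        for (int i = 0; i < range; i++) {
            result.add(mqAll.get(startIndex + i)); // contiguous slice for this consumer
        }
        return result;
    }
}
```

For example, with 8 queues and 3 consumers, the consumers receive 3, 3, and 2 queues respectively.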

(4) Then call the updateProcessQueueTableInRebalance() method, which first compares the newly assigned message queue set (mqSet) against the existing processQueueTable and filters accordingly:

  • Entries in processQueueTable that are not contained in the newly assigned set mqSet (the red part in the original figure): set their Dropped attribute to true and then check whether each queue can be removed from the processQueueTable cache. Concretely, removeUnnecessaryMessageQueue() is executed: it waits up to 1 s to acquire the lock of the current consumption processing queue and returns true if it succeeds; if the lock still cannot be acquired after 1 s, it returns false. If true is returned, the corresponding entry is removed from processQueueTable;
  • Entries in processQueueTable that are also in the assigned set mqSet (the green part in the original figure): check whether the ProcessQueue has expired. In Pull mode this can be ignored; in Push mode, set the Dropped attribute to true, call removeUnnecessaryMessageQueue(), and try to remove the entry as above;

Finally, for each MessageQueue in the filtered message queue set (mqSet), create a ProcessQueue object and store it in RebalanceImpl's processQueueTable (calling the RebalanceImpl instance's computePullFromWhere(MessageQueue mq) method to obtain the next consumption offset for that MessageQueue, which is then filled into the pullRequest object created next). Then create the pull request objects (pullRequest) and add them to the pull list (pullRequestList). Finally, dispatchPullRequest() puts these PullRequest objects, in order, into the blocking queue pullRequestQueue of the PullMessageService thread, which takes them out and issues pull requests to the Broker. It is worth comparing the dispatchPullRequest() implementations of the two subclasses RebalancePushImpl and RebalancePullImpl: the method in RebalancePullImpl is empty, which answers the last question of the previous article.

The core design idea of message-queue load balancing among consumers in the same consumer group is: at any given time a message consumption queue is consumed by only one consumer in the group, while a single consumer may consume multiple message queues.

7. Broadcast mode and cluster mode

7.1 Broadcast mode

In broadcast mode, the same message is consumed by every consumer in the same consumerGroup.

As shown, a topic has three MessageQueues, and there is one consumerGroup containing two consumers. In broadcast mode, both consumer1 and consumer2 consume the messages of MessageQueue1, MessageQueue2, and MessageQueue3.

7.2 Cluster mode

In cluster mode, the same message is consumed by only one consumer within the same consumerGroup.

As shown in the figure, a topic has three MessageQueues, and two consumerGroups consume the messages of these three queues. In consumerGroup1, consumer1 consumes the messages of MessageQueue1 and MessageQueue2, and consumer2 consumes the messages of MessageQueue3; likewise, in consumerGroup2, consumer1 consumes the messages of MessageQueue1 and MessageQueue2, and consumer2 consumes the messages of MessageQueue3.
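The consumption mode is selected per consumer. A minimal sketch, assuming the 4.x open-source Java client; group, topic, and address are placeholders:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;
import org.apache.rocketmq.common.protocol.heartbeat.MessageModel;

public class MessageModelSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // placeholder
        consumer.subscribe("DemoTopic", "*");
        // BROADCASTING: every consumer in the group receives every message.
        // MessageModel.CLUSTERING (the default) delivers each message to only one consumer in the group.
        consumer.setMessageModel(MessageModel.BROADCASTING);
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) ->
                ConsumeConcurrentlyStatus.CONSUME_SUCCESS);
        consumer.start();
    }
}
```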

8. Sequential messages

Message ordering means that a class of messages is consumed in the same order in which it was sent. For example, an order produces three messages: order created, order paid, and order completed. It only makes sense to consume them in that order, but different orders can be consumed in parallel at the same time.

RocketMQ can strictly guarantee the order of messages.

Sequential messages are divided into globally ordered messages and partition-ordered messages. Global order means all messages under a topic are ordered; partition-ordered messages only need to guarantee that each group of messages is consumed in order (a producer/consumer sketch follows the list below).

  • Global order: For a specified Topic, all messages are published and consumed in strict first-in, first-out (FIFO) order. Applicable scenarios: Scenarios where performance requirements are not high and all messages are released and consumed strictly in accordance with the FIFO principle
  • Partition order: For a specified Topic, all messages are partitioned into blocks based on the sharding key. Messages within the same partition are published and consumed in strict FIFO order. Sharding key is a key field used to distinguish different partitions in sequential messages. It is a completely different concept from the key of ordinary messages. Applicable scenarios: Scenarios with high performance requirements, using sharding key as the partition field, and strictly following the FIFO principle for message publishing and consumption in the same block.
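A minimal sketch of partition-ordered sending and consuming, assuming the 4.x Java client; the topic, groups, and the order id used as the sharding key are placeholders:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeOrderlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerOrderly;
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.MessageQueueSelector;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageQueue;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class OrderedMessageSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_order_producer");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder
        producer.start();

        long orderId = 10001L; // sharding key: all messages of one order go to one queue
        Message msg = new Message("OrderTopic", "create",
                "order created".getBytes(StandardCharsets.UTF_8));
        producer.send(msg, new MessageQueueSelector() {
            @Override
            public MessageQueue select(List<MessageQueue> mqs, Message m, Object arg) {
                long id = (Long) arg;
                return mqs.get((int) (id % mqs.size())); // same order id -> same queue
            }
        }, orderId);
        producer.shutdown();

        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_order_consumer");
        consumer.setNamesrvAddr("127.0.0.1:9876");
        consumer.subscribe("OrderTopic", "*");
        // MessageListenerOrderly consumes each queue single-threaded, preserving per-queue order.
        consumer.registerMessageListener((MessageListenerOrderly) (msgs, ctx) ->
                ConsumeOrderlyStatus.SUCCESS);
        consumer.start();
    }
}
```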

9. Transaction messages

The general scheme of transaction messages is divided into two processes: the sending and committing of a normal transaction message, and the compensation process for transaction messages.

  1. Transaction message sending and submission:
  • Send a message (half message).
  • The server responds to the message and writes the result.
  • Execute local transactions based on the sending results (if the write fails, the half message will not be visible to the business and the local logic will not be executed).
  • Execute Commit or Rollback based on the local transaction status (the Commit operation generates the message index and makes the message visible to consumers).
  2. Compensation process:
  • For transaction messages that never received a Commit/Rollback (messages in the pending state), the server initiates a "check back" (review).
  • The Producer receives the check-back message and checks the status of the local transaction corresponding to it.
  • Based on the local transaction status, it issues the Commit or Rollback again.

The compensation phase is used to handle the case where a message's Commit or Rollback times out or fails.
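A minimal producer-side sketch using TransactionMQProducer, assuming the 4.x Java client; the topic, group name, and the local-transaction logic are placeholders:

```java
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;
import java.nio.charset.StandardCharsets;

public class TransactionMessageSketch {
    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("demo_tx_producer");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder
        producer.setTransactionListener(new TransactionListener() {
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                // Run the local transaction after the half message has been stored.
                boolean ok = doLocalWork(); // placeholder business logic
                return ok ? LocalTransactionState.COMMIT_MESSAGE
                          : LocalTransactionState.ROLLBACK_MESSAGE;
            }

            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                // The broker calls this back ("check back") when Commit/Rollback never arrived.
                return LocalTransactionState.COMMIT_MESSAGE; // decide from the real transaction state
            }
        });
        producer.start();

        Message msg = new Message("TxTopic", "half message".getBytes(StandardCharsets.UTF_8));
        producer.sendMessageInTransaction(msg, null);
    }

    private static boolean doLocalWork() {
        return true; // placeholder for real business logic
    }
}
```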

10. Delayed messages

When RocketMQ implements delayed messages, there are 18 default delay levels. The delay time for each level is as follows:

Level:  1   2   3    4    5   6   7   8   9   10  11  12  13  14   15   16   17  18
Delay:  1s  5s  10s  30s  1m  2m  3m  4m  5m  6m  7m  8m  9m  10m  20m  30m  1h  2h

When sending a message, just set the delay level with msg.setDelayTimeLevel(level). The level has three cases (a snippet follows the list):

  • level == 0: the message is not delayed
  • 1 <= level <= maxLevel: the message is delayed by the corresponding time; for example, level == 1 means a 1 s delay
  • level > maxLevel: level is treated as maxLevel; for example, level == 20 gives a 2 h delay
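A small sketch of sending a delayed message, assuming the 4.x Java client (level 3 corresponds to 10 s in the default table above; names and address are placeholders):

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.common.message.Message;
import java.nio.charset.StandardCharsets;

public class DelayMessageSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_delay_producer");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder
        producer.start();

        Message msg = new Message("DelayTopic", "delayed hello".getBytes(StandardCharsets.UTF_8));
        msg.setDelayTimeLevel(3); // level 3 -> delivered to consumers after about 10 s
        producer.send(msg);

        producer.shutdown();
    }
}
```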

Scheduled messages are temporarily stored in a topic named SCHEDULE_TOPIC_XXXX, in a queue determined by the delayTimeLevel (queueId = delayTimeLevel - 1), i.e., each queue only stores messages with the same delay, which guarantees that messages with the same delay can be consumed in order. The broker consumes SCHEDULE_TOPIC_XXXX on a schedule and writes the messages back to their real topics.

11. Message filtering

RocketMQ's message filtering differs from that of other MQ middleware: messages are filtered when the Consumer subscribes to them. RocketMQ does this because the Producer side writes messages and the Consumer side subscribes to them through separate storage structures: the Consumer obtains an index from the ConsumeQueue, the logical queue for message consumption, and then reads the real message body from the CommitLog, so filtering ultimately cannot bypass this storage structure. In the ConsumeQueue storage structure there is an 8-byte field holding the hash value of the message tag, and tag-based message filtering is built precisely on this field.

Two filtering methods are mainly supported (a subscription sketch follows the list):

  1. Tag filtering: when subscribing, the consumer specifies a TAG in addition to the topic; a message can carry multiple TAGs separated by ||. The Consumer side builds this subscription into a SubscriptionData and sends it with the pull request to the Broker. Before reading data from RocketMQ's file storage layer (Store), the Broker uses that data to build a MessageFilter and passes it down to the Store. After the Store reads an entry from the ConsumeQueue, it filters by the recorded tag hash value. Because the server compares only hash codes, it cannot filter exactly on the original tag string, so after the consumer pulls a message it compares the original tag string again and, if it does not match, discards the message without consuming it.
  2. SQL92 filtering: the overall flow is the same as tag filtering; only the concrete filtering done at the Store layer differs. The construction and evaluation of the actual SQL expression is handled by the rocketmq-filter module. Evaluating the SQL expression on every filtering pass would hurt efficiency, so RocketMQ uses a BloomFilter to avoid evaluating it every time. The expression context of SQL92 is the message's properties.
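A minimal consumer-side sketch of both subscription styles, assuming the 4.x Java client (topic, group, tags, and the SQL expression are placeholders; SQL92 filtering additionally requires property filtering to be enabled on the broker, typically enablePropertyFilter=true):

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.MessageSelector;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;

public class FilterSubscribeSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_filter_consumer");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // placeholder

        // Tag filtering: subscribe only to messages carrying TagA or TagB.
        consumer.subscribe("DemoTopic", "TagA || TagB");

        // SQL92 filtering (filters on message properties set by the producer), as an alternative:
        // consumer.subscribe("DemoTopic", MessageSelector.bySql("region = 'EU' AND price > 100"));

        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) ->
                ConsumeConcurrentlyStatus.CONSUME_SUCCESS);
        consumer.start();
    }
}
```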
