RocketMQ interview questions (23 questions)

Basics

1. Why use a message queue?

The message queue has three main uses. Let’s take an order in an e-commerce system as an example:

  • Decoupling: Before a message queue is introduced, once an order is placed the order service has to call the inventory service to reduce stock, call the marketing service to add marketing data, and so on. After a message queue is introduced, the order service simply publishes an order-completed message to the queue and each downstream service consumes it on its own, which decouples the order service from the other services.

  • Asynchronous: After an order is paid, we need to deduct inventory, add points, send notifications, and so on, so the call chain grows and the response time grows with it. With a message queue, everything except updating the order status can be done asynchronously, which shortens the response time considerably.

  • Peak shaving: A message queue can also be used to absorb traffic peaks. In a flash sale system, for example, traffic is normally low, but it surges during a flash sale. Our servers, Redis, and MySQL each tolerate a different load, so passing all of that traffic straight through is bound to cause problems and, in serious cases, can bring the system down.

We can put requests into the queue and release only as much traffic as our services can handle, so the system can withstand heavy traffic over a short period.

Decoupling, asynchronous processing, and peak shaving are the three most important uses of message queues.

2. Why choose RocketMQ?

A comparison of several major message queues on the market is as follows:

| Feature | RabbitMQ | ActiveMQ | RocketMQ | Kafka |
| --- | --- | --- | --- | --- |
| Company / community | Rabbit | Apache | Alibaba | Apache |
| Language | Erlang | Java | Java | Scala & Java |
| Protocol support | AMQP | OpenWire, STOMP, REST, XMPP, AMQP | Custom | Custom protocol; the community provides HTTP protocol wrappers |
| Client languages | Officially supports Erlang, Java, Ruby, etc.; community APIs cover almost all languages | Java, C, C++, Python, PHP, Perl, .NET, etc. | Java, C++ (immature) | Officially supports Java; the community provides APIs for PHP, Python, etc. |
| Single-machine throughput | 10,000 level (rank 3) | 10,000 level (rank 4) | 100,000 level (rank 1) | 100,000 level (rank 2) |
| Message latency | Microsecond level | Millisecond level | Millisecond level | Within milliseconds |
| Availability | High, based on master-slave architecture | High, based on master-slave architecture | Very high, distributed architecture | Very high, distributed architecture, multiple replicas of each piece of data |
| Message reliability | - | Low probability of losing data | Zero loss achievable with parameter tuning and configuration | Zero loss achievable with parameter configuration |
| Function support | Built on Erlang, so concurrency is very strong, performance is excellent, and latency is low | Very complete set of MQ functions | Relatively complete MQ functions and good distributed scalability | Relatively simple functions, mainly basic MQ features |
| Advantages | Developed in Erlang, excellent performance, low latency, 10,000-level throughput, complete MQ functions, very good management interface, active community; widely used by Internet companies | Very mature and powerful; used by a large number of companies and projects in the industry | Simple, easy-to-use interface, backed by Alibaba, high throughput, convenient distributed scaling, active community, supports large numbers of topics and complex business scenarios, can be customized from the source code | Ultra-high throughput, millisecond-level latency, extremely high availability and reliability, convenient distributed scaling |
| Disadvantages | Low throughput; being written in Erlang makes customization hard; dynamic cluster expansion is troublesome | Occasionally loses messages with low probability; community activity is low | The interface does not follow the standard JMS specification, so some system migrations require modifying a lot of code; the technology is at risk of being abandoned | Messages may be consumed more than once |
| Typical use | Used in all kinds of scenarios | Mainly used for decoupling and asynchronous processing; rarely used in high-throughput scenarios | Used in high-throughput and complex business scenarios | Used at large scale for real-time computing and log collection in big data; an industry standard |

To summarize:

When choosing middleware, consider these dimensions: reliability, performance, functionality, operability, scalability, and community activity. Of the commonly used options, ActiveMQ is an "old antique" that is rarely used today. The others compare as follows:

  • RabbitMQ:

    • Advantages: lightweight, fast, easy to deploy and use, with flexible routing configuration
    • Disadvantages: performance and throughput are not ideal, and secondary development is not easy
  • RocketMQ:

    • Advantages: good performance, high throughput, stable and reliable, active Chinese community
    • Disadvantages: Not very good in terms of compatibility
  • Kafka:

    • Advantages: powerful performance and throughput, good compatibility
    • Disadvantages: latency is relatively high because messages are batched ("save up a wave, then process it")

Our system is a consumer-facing (C-side) system with a fair amount of concurrency and relatively high performance requirements, so we chose RocketMQ for its low latency, relatively high throughput, and good availability.

3. What are the advantages and disadvantages of RocketMQ?

RocketMQ advantages:

  • Single-machine throughput: 100,000 level
  • Availability: very high, distributed architecture
  • Message reliability: zero message loss can be achieved with parameter tuning and configuration
  • Function support: relatively complete MQ functions, distributed, with good scalability
  • Supports message accumulation on the order of one billion messages without performance degradation caused by the backlog
  • The source code is Java, which makes secondary development for the company's own business convenient
  • Built for the financial Internet field, where reliability requirements are high, especially for order deductions in e-commerce and for peak shaving when a flood of transactions arrives and the backend cannot process them in time
  • RocketMQ is arguably more trustworthy in terms of stability: these business scenarios have been tested repeatedly during Alibaba's Double 11. If your business has similar concurrency scenarios, RocketMQ is a recommended choice.

RocketMQ disadvantages:

  • Few client languages are supported: currently Java and C++, of which C++ is immature.
  • The MQ core does not implement JMS or similar interfaces, so migrating some systems requires modifying a lot of code.

4. What message models does the message queue have?

There are two message queue models: the queue model and the publish/subscribe model.

  • Queue model

    This is the original message queue model, corresponding to the "send, store, receive" pattern. Producers send messages to a queue. A queue can store messages from multiple producers and can also have multiple consumers. However, the consumers compete with each other, which means each message can be consumed by only one consumer.

  • Publish/subscribe model

If a message needs to be distributed to multiple consumers, and each consumer needs to receive the full set of messages, the queue model obviously cannot meet the requirement. The solution is the publish/subscribe model.

In the publish/subscribe model, the sender of a message is called the publisher, the receiver is called the subscriber, and the container that stores messages on the server is called the topic. The publisher sends messages to the topic, and a subscriber must "subscribe to the topic" before it can receive messages. A "subscription" here is not only an action; it can also be regarded as a logical copy of the topic for consumption purposes. Within each subscription, the subscriber can receive all messages of the topic.

How it compares with the queue model: the producer is the publisher, the queue is the topic, and the consumer is the subscriber. There is no essential difference; the only difference is whether a message can be consumed more than once.

5. What about RocketMQ’s message model?

The message model used by RocketMQ is the standard publish-subscribe model. In RocketMQ's glossary, producers, consumers, and topics are exactly the same as the concepts in the publish-subscribe model.

RocketMQ's message model is made up of the following parts:

  • Message

A Message is the information to be transmitted.

A message must have a topic (Topic), which can be thought of as the address a letter is mailed to.

A message can also carry an optional tag (Tag) and extra key-value pairs. For example, you can set a business key on the message and later look the message up on the Broker by that key when troubleshooting during development.

  • Topic

Topic can be regarded as the classification of messages, which is the first-level type of messages. For example, an e-commerce system can be divided into: transaction messages, logistics messages, etc. A message must have a Topic.

Topic has a very loose relationship with producers and consumers. A Topic can have 0, 1, or multiple producers sending messages to it, and a producer can also send messages to different Topics at the same time.

A Topic can also be subscribed by 0, 1, or multiple consumers.

  • Tag

Tags can be thought of as subtopics: they are the second-level type of a message and give users additional flexibility. With tags, messages that serve different purposes within the same business module can share a Topic but carry different Tags. For example, transaction messages can be divided into transaction-created messages, transaction-completed messages, and so on. A message does not have to have a tag.

Tags help keep your code clean and coherent, and they also help with the query system provided by RocketMQ.

  • Group

In RocketMQ, the concept of a subscriber is represented by the Consumer Group. Each consumer group consumes the full set of messages in the topic, and the consumption progress of different consumer groups is independent. In other words, a message consumed by Consumer Group1 will also be consumed by Consumer Group2.

A consumer group contains multiple consumers, and consumers in the same group compete with each other: each one consumes part of the group's messages. By default, if a message is consumed by Consumer1, the other consumers in the same group will not receive it.

  • Message Queue

A Topic can contain multiple Message Queues. If a Consumer needs to obtain all the messages under a Topic, it must traverse all of the Topic's Message Queues.

RocketMQ also has some other queues, such as ConsumeQueue.

  • Offset

During Topic consumption, messages are not deleted immediately after being consumed, because different groups each need to consume them. RocketMQ therefore maintains a consumption position (Consumer Offset) on each queue for each consumer group: messages before this position have been consumed, and messages after it have not. Every time a message is successfully consumed, the position is increased by one.

In other words, a Queue can be viewed as an array of unbounded length, with the Offset as the index.
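
The array-and-index picture can be sketched in a few lines of plain Java. This is a toy in-memory model, not RocketMQ code: `ToyQueue`, `poll`, and `offsetOf` are names made up for illustration, and a real Broker persists both messages and offsets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one Message Queue: an append-only list plus one consume offset per consumer group. */
class ToyQueue {
    private final List<String> messages = new ArrayList<>();
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Appends a message; consumed messages are never removed. */
    void append(String msg) {
        messages.add(msg);
    }

    /** Returns the next unconsumed message for the group and advances its offset by one, or null if caught up. */
    String poll(String group) {
        int offset = offsets.getOrDefault(group, 0);
        if (offset >= messages.size()) {
            return null;
        }
        offsets.put(group, offset + 1);
        return messages.get(offset);
    }

    /** Current consume offset of the group on this queue. */
    int offsetOf(String group) {
        return offsets.getOrDefault(group, 0);
    }
}
```

Because each group has its own offset, two groups polling the same queue both see every message, while the queue itself stores each message only once.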

In RocketMQ's message model, these are the key concepts.

6. Do you understand the consumption pattern of messages?

There are two message consumption modes: Clustering (cluster consumption) and Broadcasting (broadcast consumption).

The default is cluster consumption. In this mode, a consumer group jointly consumes the queues of a topic, and each queue is consumed by only one consumer in the group. If a consumer dies, the other consumers in the group take over and continue consuming.

In broadcast consumption, each message is delivered to every consumer in the consumer group.
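
The difference between the two modes can be sketched as a queue-assignment function. The code below is a simplified illustration in plain Java, assuming queues are numbered from 0 and consumers in a group are identified by their index; `QueueAllocator` is a hypothetical name and the even-split strategy is only one possible allocation, not necessarily the one a given RocketMQ version uses.

```java
import java.util.ArrayList;
import java.util.List;

class QueueAllocator {
    /** Clustering: split queue ids 0..queueCount-1 into contiguous, near-equal shares and return the share of one consumer. */
    static List<Integer> clustering(int queueCount, int consumerCount, int consumerIndex) {
        if (consumerCount <= 0 || consumerIndex < 0 || consumerIndex >= consumerCount) {
            throw new IllegalArgumentException("invalid consumer index or count");
        }
        int base = queueCount / consumerCount;   // every consumer gets at least this many queues
        int extra = queueCount % consumerCount;  // the first `extra` consumers get one more
        int size = base + (consumerIndex < extra ? 1 : 0);
        int start = consumerIndex * base + Math.min(consumerIndex, extra);
        List<Integer> mine = new ArrayList<>();
        for (int q = start; q < start + size; q++) {
            mine.add(q);
        }
        return mine;
    }

    /** Broadcasting: every consumer reads every queue. */
    static List<Integer> broadcasting(int queueCount) {
        List<Integer> all = new ArrayList<>();
        for (int q = 0; q < queueCount; q++) {
            all.add(q);
        }
        return all;
    }
}
```

With 8 queues and 3 consumers, the consumers get queues 0 to 2, 3 to 5, and 6 to 7. With 2 queues and 3 consumers, the third consumer gets nothing, which is why adding consumers beyond the queue count does not increase throughput in cluster mode.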

7. Do you understand the basic architecture of RocketMQ?

RocketMQ consists of four parts: NameServer, Broker, Producer, and Consumer, responsible for discovery, storing, sending, and receiving respectively. To ensure high availability, each part is generally deployed as a cluster.

8. Can you introduce these four parts?

Consider an analogy with the postal system.

A working postal system needs four roles: the sender, the recipient, the post office responsible for temporary storage and delivery, and the management agency that coordinates the local post offices. In RocketMQ, these four roles are the Producer, the Consumer, the Broker, and the NameServer.

NameServer

NameServer is a stateless server whose role is similar to that of the Zookeeper used by Kafka, but it is more lightweight than Zookeeper.

Features:

  • NameServer nodes are independent of each other and do not exchange any information.
  • NameServer is designed to be almost stateless, and deploying multiple nodes yields what is effectively a pseudo-cluster. Before sending a message, the Producer obtains the Topic's routing information from the NameServer, that is, which Broker to send to. The Consumer also periodically fetches the Topic's routing information from the NameServer. When a Broker starts, it registers with the NameServer, sends heartbeats regularly, and periodically synchronizes the Topics it maintains to the NameServer.

There are two main functions:

  • 1. Maintain a long connection with the Broker nodes.
  • 2. Maintain the routing information of Topics.

Broker

The message storage and relay role is responsible for storing and forwarding messages.

  • The Broker maintains ConsumeQueues internally, which store message indexes. The messages themselves are stored in the CommitLog (log file).

  • A single Broker maintains long connections and heartbeats with all Nameservers, and regularly synchronizes Topic information to the NameServer. The underlying communication with the NameServer is implemented through Netty.

Producer

The message producer is the business side responsible for sending messages. It is implemented and deployed, in a distributed manner, by the user.

  • Producers are deployed in a distributed manner by users and send messages to the Broker cluster through several load-balancing modes. Sending has low latency and supports fail-fast.

  • RocketMQ provides three ways to send messages: synchronous, asynchronous, and one-way.

    • Synchronous sending: the sender sends a message and sends the next one only after receiving the response from the receiver. It is generally used for important notifications, such as important notification emails and marketing text messages.
    • Asynchronous sending: the sender sends a message and goes on to send the next one without waiting for the receiver's response. It is generally used where the call chain may take a long time and the business is sensitive to response time, for example notifying the transcoding service to start after a user uploads a video.
    • One-way sending: the sender only sends the message, does not wait for the server's response, and no callback is triggered. It suits scenarios that must be very fast but do not need high reliability, such as log collection.

Consumer

The message consumer is responsible for consuming messages; generally a backend system consumes them asynchronously.

  • Consumers are also deployed by users. They support two consumption styles, PUSH and PULL, support both cluster consumption and broadcast consumption, and provide a real-time message subscription mechanism.
  • Pull: a pull consumer actively pulls messages from the message server. Once a batch of messages has been pulled, the user application starts the consumption process, so Pull is called active consumption.
  • Push: a push consumer encapsulates message pulling, consumption progress, and other internal bookkeeping, and leaves the user application only a callback interface to execute when a message arrives. Push is therefore called passive consumption, but from an implementation point of view it still pulls messages from the message server. Unlike Pull, Push must first register a consumption listener, and messages are consumed only when the listener is triggered.

Advanced

9. How to ensure the availability/reliability/non-loss of messages?

At what stages might messages be lost? Loss may occur in these three stages: production stage, storage stage, and consumption stage.

So consider these three stages:

Production

In the production stage, reliable delivery is ensured mainly through the request confirmation mechanism.

  • 1. When sending synchronously, handle the response and exceptions carefully. If the response is OK, the message was successfully sent to the Broker. If the response indicates failure or an exception occurs, retry.
  • 2. When sending asynchronously, check the result in the callback method. If sending failed or threw an exception, retry.
  • 3. If a timeout occurs, you can also check whether the message was stored on the Broker by using the log query API.

Storage

In the storage stage, you can configure Broker parameters that prioritize reliability to avoid losing messages when a machine goes down. Simply put, use the synchronous options (synchronous flushing and synchronous replication) when reliability comes first.

  • 1. As long as a message has been persisted to the CommitLog (log file), unconsumed messages can be recovered and consumed again even if the Broker goes down.
  • 2. Broker flush mechanism: synchronous flushing and asynchronous flushing. With either mode the message is guaranteed to reach the page cache (in memory), but synchronous flushing is more reliable: after the Producer sends a message, the response is returned to the Producer only once the data has been persisted to disk.
  • 3. The Broker ensures high availability through master-slave mode. It supports synchronous and asynchronous replication between Master and Slave. Producers send messages to the Master, while consumers can consume from either the Master or the Slave. Synchronous replication ensures that even if the Master goes down, the message has already been backed up on the Slave, so it is not lost.

Consumption

From the consumer's perspective, how do we ensure that messages are successfully consumed?

  • The key is the timing of the confirmation. Do not send the consumption confirmation as soon as a message is received; send it only after all of the consumption business logic has run. Because the message queue maintains the consumption position, if the logic fails and no confirmation is sent, the next pull from the queue returns the same message again.

10. How to deal with the problem of message duplication?

For distributed message queues, it is difficult to guarantee both that a message is always delivered and that it is never delivered twice, which is the so-called "exactly once" guarantee. RocketMQ chooses to guarantee delivery so that messages are not lost, at the cost of possible message duplication.

Message duplication is mainly handled by the business side. There are two main approaches: business idempotence and message deduplication.

Business idempotence: make the consumption logic idempotent, so that calling it several times has the same effect as calling it once. Then no matter how many times a message is consumed, the business is unaffected.

Message deduplication: have the business side refuse to consume a message it has already consumed. This requires every message to have a unique identifier, usually a business-related one such as an order number. The consumption record must be stored in the database, and recording it must be atomic with the message confirmation step.

Concretely, create a consumption record table and insert a row for each message as it is consumed, with a unique primary key or unique constraint on the message identifier. If the same message is consumed again, the insert fails with a key conflict and the message is not processed a second time.
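
The deduplication idea can be sketched without a database. In the snippet below an in-memory set stands in for the consumption record table, and a failed `add` plays the role of the unique-key conflict; `DedupConsumer` and its methods are hypothetical names, and the "business logic" is just a counter.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class DedupConsumer {
    // Stands in for the consumption record table; add() returning false models a unique-key conflict.
    private final Set<String> consumedKeys = ConcurrentHashMap.newKeySet();
    // Placeholder business effect, e.g. points granted for paid orders.
    private final AtomicInteger pointsGranted = new AtomicInteger();

    /** Returns true if the business logic ran, false if the message was recognized as a duplicate. */
    boolean onMessage(String orderId) {
        if (!consumedKeys.add(orderId)) {
            return false; // already recorded: skip the duplicate
        }
        pointsGranted.incrementAndGet(); // runs at most once per order ID
        return true;
    }

    int pointsGranted() {
        return pointsGranted.get();
    }
}
```

In a real system the record insert and the business update would normally share one database transaction, so that a crash between the two cannot leave a message recorded but unprocessed.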

11. How to deal with the message backlog?

When a message backlog occurs, you need to consume the backlog quickly, which means increasing consumption capacity. There are generally two methods:

  • Consumer expansion: If the Topic has more Message Queues than consumers, you can add consumers to increase consumption capacity and work through the backlog as soon as possible.
  • Message migration and queue expansion: If the Topic's Message Queue count is less than or equal to the number of consumers, adding consumers is useless and you must expand the Message Queues. Create a temporary Topic with more Message Queues, then use some consumers to forward the backlogged messages to the temporary Topic. Because this is pure forwarding with no business processing, it is fast. Next, use the expanded set of consumers to consume from the temporary Topic. After the backlog is consumed, restore the original setup.

12. How to implement sequential messages?

Sequential messages are messages whose consumption order matches their production order. Some business logic requires ordering: for example, the order-created, payment, and delivery messages of an order must be processed in sequence.

Sequential messages are divided into globally sequential messages and partially sequential messages:

Globally sequential messages: all messages under a Topic must be kept in order.

Partially sequential messages: only each group of messages needs to be consumed in order. For order messages, for example, it is enough that messages with the same order ID are consumed in order.

Partially sequential messages

Partially sequential messages are relatively easy to implement. The producer must send messages with the same ID to the same Message Queue, and the consumer must process the messages read from a given Message Queue one after another, never concurrently. Together these give partial ordering.

The sender uses the MessageQueueSelector class to control which Message Queue the message is sent to.

The consumer side uses MessageListenerOrderly to solve the problem of messages in a single Message Queue being processed concurrently.
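
The producer-side half of partial ordering comes down to a stable mapping from business ID to queue. Here is a minimal sketch in plain Java, assuming queues are addressed by index; `OrderQueueSelector` is a made-up name, and in a real producer this logic would sit inside a MessageQueueSelector implementation.

```java
class OrderQueueSelector {
    /** Maps an order ID to a queue index in [0, queueCount); the same ID always lands on the same queue. */
    static int select(String orderId, int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(orderId.hashCode(), queueCount);
    }
}
```

Note that the mapping depends on the queue count: if the number of queues changes while messages for an order are still in flight, later messages for that order may land on a different queue.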

Globally sequential messages

RocketMQ does not guarantee order by default. For example, a Topic is created with eight write queues and eight read queues by default, so a message may be written to any of them. On the read side there may be multiple Consumers, and each Consumer may start multiple threads for parallel processing, so it is uncertain which Consumer will consume a given message and whether the consumption order matches the write order.

To guarantee global ordering, first set the Topic's read and write queue counts to one, then set the concurrency of both the Producer and the Consumer to one. Put simply, to keep the entire Topic in order you have to eliminate all concurrent processing and make every part single-threaded. This completely sacrifices RocketMQ's high concurrency and high throughput.

13. How to implement message filtering?

There are two options:

  • One is to filter on the Broker side according to the Consumer's filtering logic. The advantage is that useless messages are never transmitted to the Consumer; the disadvantage is that it adds load to the Broker and is relatively complicated to implement.
  • The other is to filter on the Consumer side, for example by the tag set on the message. The advantage is that it is simple to implement; the disadvantage is that many useless messages reach the Consumer only to be discarded.

Generally, Consumer-side filtering is used. If you want to improve throughput, Broker-side filtering can be used.

There are three ways to filter messages:

  • Filtering based on Tag: This is the most common one and is efficient and simple to use.

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("CID_EXAMPLE");
    consumer.subscribe("TOPIC", "TAGA || TAGB || TAGC");
    
  • SQL expression filtering: more flexible than tags

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("please_rename_unique_group_name_4");
    // Only messages that carry property a with 0 <= a <= 3 are delivered
    consumer.subscribe("TopicTest", MessageSelector.bySql("a between 0 and 3"));
    consumer.registerMessageListener(new MessageListenerConcurrently() {
        @Override
        public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs, ConsumeConcurrentlyContext context) {
            // Business logic for the filtered messages goes here
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        }
    });
    consumer.start();

  • Filter Server method: the most flexible and most complex method, allowing users to supply custom filtering functions

14. Do you understand the delayed message?

Automatically canceling an e-commerce order after a timeout is a typical use of delayed messages. After the user submits an order, the system sends a delayed message. One hour later, the order status is checked; if payment still has not been made, the order is canceled and the reserved stock is released.

RocketMQ supports delayed messages. You only need to set the delay level of the message when producing the message:

// Create a producer for delayed messages
DefaultMQProducer producer = new DefaultMQProducer("ExampleProducerGroup");
// Start the producer
producer.start();
int totalMessagesToSend = 100;
for (int i = 0; i < totalMessagesToSend; i++) {
    Message message = new Message("TestTopic", ("Hello scheduled message " + i).getBytes());
    // Delay level 3: the message is delivered after 10s (only a fixed set of delays is supported; see delayTimeLevel)
    message.setDelayTimeLevel(3);
    // Send the message
    producer.send(message);
}

However, the delay levels currently supported by RocketMQ are limited:

private String messageDelayLevel = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";
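
To see how a level number maps to a delay, the level string above can be parsed as follows. This is an illustrative sketch, not the Broker's own parsing code: `DelayLevels` and `delayMillis` are hypothetical names, and levels are assumed to be 1-based, matching the example in which level 3 means 10s.

```java
class DelayLevels {
    static final String LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    /** Returns the delay in milliseconds for a 1-based delay level. */
    static long delayMillis(int level) {
        String[] parts = LEVELS.split(" ");
        if (level < 1 || level > parts.length) {
            throw new IllegalArgumentException("unknown delay level: " + level);
        }
        String token = parts[level - 1];
        // The last character is the unit; everything before it is the numeric value.
        long value = Long.parseLong(token.substring(0, token.length() - 1));
        switch (token.charAt(token.length() - 1)) {
            case 's': return value * 1_000L;
            case 'm': return value * 60_000L;
            case 'h': return value * 3_600_000L;
            default: throw new IllegalArgumentException("unknown unit in " + token);
        }
    }
}
```

Under this mapping, the one-hour order timeout from the earlier example would use level 17.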

How does RocketMQ implement delayed messages?

In short: temporary storage plus a scheduled task.

When the Broker receives a delayed message, it first stores it in the Message Queue for the corresponding delay level under an internal topic (SCHEDULE_TOPIC_XXXX), and a scheduled task polls these queues. When the delay expires, the message is delivered to the queue of the target Topic, and Consumers can then consume it normally.

15. How to implement distributed message transactions? What is a half message?

Half message: a message that temporarily cannot be consumed by the Consumer. The Producer has successfully sent it to the Broker, but it is marked as "temporarily undeliverable"; only after the Producer finishes its local transaction and confirms the message can the Consumer consume it.

Building on half messages, distributed message transactions can be implemented. The key lies in the second confirmation and the message check-back:

  • 1. The Producer sends a half message to the Broker.
  • 2. The Producer receives the response: the message was sent successfully. At this point it is a half message, marked as "undeliverable", and cannot be consumed by the Consumer.
  • 3. The Producer executes its local transaction.
  • 4. Normally, once the local transaction completes, the Producer sends a Commit or Rollback to the Broker. On Commit, the Broker marks the half message as a normal message and the Consumer can consume it. On Rollback, the Broker discards the message.
  • 5. In abnormal cases the Broker never receives the second confirmation. After a certain period, it scans all half messages and asks the Producer for the status of each one.
  • 6. The Producer looks up the status of the local transaction.
  • 7. The Producer submits Commit or Rollback to the Broker according to the transaction status. (Steps 5, 6, and 7 make up the message check-back.)
  • 8. After the consumer side consumes the message, it executes its own local transaction.
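
The broker side of steps 1 to 7 can be modeled as a small state machine. The following is a toy in-memory sketch, not RocketMQ's implementation: `ToyTxBroker`, `TxState`, and the method names are all hypothetical, and the producer's status lookup is passed in as a plain function.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ToyTxBroker {
    enum TxState { COMMIT, ROLLBACK, UNKNOWN }

    private final Map<String, String> halfMessages = new HashMap<>(); // stored but invisible to consumers
    private final List<String> visible = new ArrayList<>();           // consumable messages

    /** Steps 1-2: store the half message; consumers cannot see it yet. */
    void sendHalf(String txId, String body) {
        halfMessages.put(txId, body);
    }

    /** Step 4 (and step 7): second confirmation from the producer. */
    void endTransaction(String txId, TxState state) {
        if (state == TxState.UNKNOWN) {
            return; // keep the half message; a later check-back decides
        }
        String body = halfMessages.remove(txId);
        if (body != null && state == TxState.COMMIT) {
            visible.add(body); // commit makes the message consumable; rollback just drops it
        }
    }

    /** Steps 5-7: for every pending half message, ask the producer for its local transaction state. */
    void checkBack(Function<String, TxState> producerCheck) {
        for (String txId : new ArrayList<>(halfMessages.keySet())) {
            endTransaction(txId, producerCheck.apply(txId));
        }
    }

    List<String> visibleMessages() {
        return visible;
    }

    int pendingHalfMessages() {
        return halfMessages.size();
    }
}
```

The important property is that a message becomes visible only through a commit, whether that commit arrives directly from the producer or is recovered later through the check-back.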

16. Do you know about the dead letter queue?

The dead letter queue is used to handle messages that cannot be consumed normally, that is, dead letter messages.

When a message fails to be consumed for the first time, RocketMQ automatically retries it. If consumption still fails after the maximum number of retries, the consumer evidently cannot consume the message correctly under normal circumstances. At this point RocketMQ does not discard the message immediately but sends it to a special queue corresponding to the consumer. This special queue is called the dead letter queue.

Characteristics of dead letter messages:

  • They are no longer consumed normally by consumers.
  • Their validity period is the same as that of normal messages, 3 days, after which they are deleted automatically. Dead letter messages therefore need to be handled within 3 days of being generated.

Characteristics of the dead letter queue :

  • A dead letter queue corresponds to a Group ID, not to a single consumer instance.
  • If a Group ID has not produced any dead letter messages, RocketMQ does not create a dead letter queue for it.
  • A dead letter queue contains all dead letter messages generated by the corresponding Group ID, regardless of which Topic the message belongs to.
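
The retry-then-dead-letter behavior and the per-group queue can be sketched with a toy model. Everything here is illustrative: `ToyDlqBroker` and its methods are hypothetical names, the retry limit is a constructor parameter rather than RocketMQ's actual default, and retries happen immediately instead of after a backoff.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class ToyDlqBroker {
    private final int maxRetries;
    // One dead letter queue per consumer group, created only when the first dead letter appears.
    private final Map<String, List<String>> dlqByGroup = new HashMap<>();

    ToyDlqBroker(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Delivers once plus up to maxRetries retries; on repeated failure the message goes to the group's DLQ. */
    boolean deliver(String group, String msg, Predicate<String> handler) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (handler.test(msg)) {
                return true; // consumed successfully
            }
        }
        dlqByGroup.computeIfAbsent(group, g -> new ArrayList<>()).add(msg);
        return false;
    }

    boolean hasDlq(String group) {
        return dlqByGroup.containsKey(group);
    }

    List<String> dlq(String group) {
        return dlqByGroup.getOrDefault(group, new ArrayList<>());
    }
}
```

Messages from different Topics that fail for the same group end up in the same list, mirroring the point that a dead letter queue belongs to a Group ID rather than to a Topic or a consumer instance.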

The RocketMQ console provides functions for querying, exporting, and resending dead letter messages.

17. How to ensure the high availability of RocketMQ?

Because NameServers are stateless and do not communicate with each other, high availability can be guaranteed as long as they are deployed in a cluster.

image-20230918205657048

The high availability of RocketMQ is mainly reflected in the high availability of Broker reads and writes. Broker high availability is achieved through clustering and master-slave replication.

image-20230918205744712

Broker can be configured with two roles: Master and Slave. The Broker in the Master role supports reading and writing, and the Broker in the Slave role only supports reading. The Master will synchronize messages to the Slave.

In other words, the Producer can only write messages to a Broker in the Master role, while the Consumer can read messages from Brokers in both the Master and Slave roles.

In the Consumer's configuration file, there is no need to set whether to read from the Master or the Slave. When the Master is unavailable or busy, the Consumer's read request will be automatically switched to the Slave. With the mechanism of automatic switching of Consumers, when a machine in the Master role fails, the Consumer can still read messages from the Slave without affecting the Consumer's reading of messages. This achieves high read availability.

How is high availability of writes achieved on the sending side? When creating a Topic, create its Message Queues across multiple Broker groups (machines with the same Broker name but different brokerId values form a Broker group). When the Master of one Broker group becomes unavailable, the Masters of the other groups are still available, so the Producer can still send messages. In this classic master-slave deployment, RocketMQ does not automatically promote a Slave to Master. If machine resources are insufficient and a Slave must be converted to a Master, you have to stop the Slave-role Broker manually, change its configuration file, and restart the Broker with the new configuration.

Principle

18. Tell me about the overall workflow of RocketMQ?

Simply put, RocketMQ is a distributed message queue, that is, a message queue plus a distributed system.

As a message queue, it follows a send-store-receive model, corresponding to Producer, Broker, and Consumer. As a distributed system, it needs a server, clients, and a registration center, corresponding to Broker, Producer/Consumer, and NameServer.

So let’s take a look at its main workflow: RocketMQ consists of NameServer registration center cluster, Producer cluster, Consumer cluster and several Brokers (RocketMQ processes):

  1. Broker registers with all NameServers when it starts, maintains a long connection, and sends a heartbeat every 30 seconds.
  2. The Producer obtains the Broker server address from the NameServer when sending a message, and selects a server to send the message based on the load balancing algorithm.
  3. When the Consumer consumes messages, it also obtains the Broker address from the NameServer, and then actively pulls messages for consumption.
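
The three steps above can be sketched with a toy registry. Everything here is illustrative: `NameServerSketch` and its methods are hypothetical, the 30-second heartbeat comes from the text, and the 120-second expiry window is an assumption used only to show how a silent Broker drops out of the route list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry model; not the real NameServer implementation. */
final class NameServerSketch {
    static final long HEARTBEAT_INTERVAL_MS = 30_000; // from the text
    static final long EXPIRY_MS = 120_000;            // assumed expiry window

    // broker address -> time of the last heartbeat
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    /** Step 1: a Broker registers, then refreshes its entry with each heartbeat. */
    void registerOrHeartbeat(String brokerAddr, long nowMs) {
        lastHeartbeat.put(brokerAddr, nowMs);
    }

    /** Steps 2 and 3: clients ask for the Brokers that are still considered alive. */
    List<String> liveBrokers(long nowMs) {
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMs - e.getValue() <= EXPIRY_MS) {
                live.add(e.getKey());
            }
        }
        Collections.sort(live); // stable order so every client sees the same list
        return live;
    }

    /** Client-side load balancing: simple round robin over the returned list. */
    static String pick(List<String> brokers, int counter) {
        return brokers.get(Math.floorMod(counter, brokers.size()));
    }
}
```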

image-20230918214743408

19.Why doesn’t RocketMQ use Zookeeper as the registration center?

We all know that Kafka uses ZooKeeper as its registration center, although it has gradually begun to move away from ZooKeeper. RocketMQ does not use ZooKeeper, probably for the following reasons:

CAP theory states that in a distributed system, Consistency, Availability, and Partition Tolerance cannot all be guaranteed at the same time.

  1. Availability. According to CAP theory, only two of the three properties can be satisfied at once, and ZooKeeper chooses CP, which means it cannot guarantee the availability of the service. A ZooKeeper leader election takes too long, and during the election the entire cluster is unavailable, which is unacceptable for a registration center. A service discovery component should be designed for availability.
  2. Performance. NameServer is very lightweight and can be scaled horizontally by adding machines to increase the cluster's capacity. ZooKeeper writes, however, do not scale. The only way around this is to partition the load across multiple ZooKeeper clusters, which is too complicated to operate and also works against the A in CAP, because services in different clusters are cut off from each other.
  3. Cost of the persistence mechanism. ZooKeeper's ZAB protocol writes a transaction log on every ZooKeeper node for each write request and also periodically snapshots in-memory data to disk to ensure consistency and durability. For a simple service discovery scenario this is unnecessary and far too heavy, and the data being stored should be highly customizable.
  4. Weak dependence on the registration center. Message sending should depend only weakly on the registration center, and RocketMQ is designed around this idea. When the Producer sends its first message, it obtains the Broker address from the NameServer and caches it locally. If the entire NameServer cluster becomes unavailable, the local cache keeps Producers and Consumers working for a short time without much impact.

20.How does Broker save data?

RocketMQ's main storage files include CommitLog files, ConsumeQueue files, and Indexfile files.

CommitLog : The storage body for message bodies and metadata. It stores the message content written by the Producer, which is not of fixed length. A single file is 1G by default. The file name is 20 digits long, left-padded with zeros, and equals the file's starting offset. For example, 00000000000000000000 is the first file, with a starting offset of 0 and a file size of 1G = 1073741824 bytes. When the first file is full, the second file is 00000000001073741824, with a starting offset of 1073741824, and so on. Messages are written to the log file sequentially; when a file is full, writing continues in the next file.

The CommitLog file is saved in the ${Rocket_Home}/store/commitlog directory. From the figure, we can clearly see the offset of the file name. Each file defaults to 1G. When full, a new file is automatically generated.
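
The naming rule is easy to check with a little arithmetic. The sketch below uses a hypothetical helper class, `CommitLogNames`, to map a physical offset to the file that contains it and to the position inside that file, assuming the default 1G file size.

```java
/** Hypothetical helper illustrating CommitLog file naming; not part of RocketMQ. */
final class CommitLogNames {
    static final long FILE_SIZE = 1024L * 1024 * 1024; // 1G = 1073741824 bytes

    /** The file name is the file's starting offset, left-padded with zeros to 20 digits. */
    static String fileNameFor(long physicalOffset) {
        long startOffset = physicalOffset - (physicalOffset % FILE_SIZE);
        return String.format("%020d", startOffset);
    }

    /** Position of the offset inside its file. */
    static long positionInFile(long physicalOffset) {
        return physicalOffset % FILE_SIZE;
    }
}
```

For example, offset 1073741825 lives in file 00000000001073741824 at position 1.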

ConsumeQueue : The message consumption queue, introduced mainly to improve consumption performance. RocketMQ uses a topic-based subscription model, so messages are consumed per Topic. Traversing the CommitLog files to find the messages of one Topic would be very inefficient.

The Consumer can use the ConsumeQueue to find the messages to be consumed. The ConsumeQueue (logical consumption queue) serves as an index for consumption: for each message in a queue under a given Topic, it stores the starting physical offset in the CommitLog, the message size, and the hash code of the message Tag.

The ConsumeQueue file can be regarded as a Topic-based index of the CommitLog. The ConsumeQueue folder is therefore organized in three layers, topic/queue/file, and the specific storage path is: $HOME/store/consumequeue/{topic}/{queueId}/{fileName}. The ConsumeQueue file uses a fixed-length design. Each entry is 20 bytes: an 8-byte CommitLog physical offset, a 4-byte message length, and an 8-byte tag hash code. A single file holds 300,000 entries, each of which can be accessed randomly like an array element, so each ConsumeQueue file is about 5.72M.

IndexFile : IndexFile (index file) provides a way to query messages by key or by time interval. The storage location of the Index file is: {fileName}. The file name fileName is the timestamp at which the file was created. A single IndexFile has a fixed size of about 400M and can hold 20 million index entries. The underlying storage of IndexFile implements a HashMap structure in the file system, so RocketMQ's index file is essentially a hash index.
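
The fixed 20-byte layout is what makes array-style access possible. The sketch below is a minimal illustration with a hypothetical `ConsumeQueueEntry` class: it encodes an entry with `java.nio.ByteBuffer` and reads the entry at a given logical index by pure offset arithmetic.

```java
import java.nio.ByteBuffer;

/** Hypothetical model of one ConsumeQueue entry; field names are illustrative. */
final class ConsumeQueueEntry {
    static final int ENTRY_SIZE = 8 + 4 + 8;     // offset + size + tag hash = 20 bytes
    static final int ENTRIES_PER_FILE = 300_000; // 300,000 * 20 = 6,000,000 bytes, about 5.72M

    final long commitLogOffset;
    final int size;
    final long tagHash;

    ConsumeQueueEntry(long commitLogOffset, int size, long tagHash) {
        this.commitLogOffset = commitLogOffset;
        this.size = size;
        this.tagHash = tagHash;
    }

    /** Serializes the entry into exactly ENTRY_SIZE bytes. */
    ByteBuffer encode() {
        ByteBuffer buf = ByteBuffer.allocate(ENTRY_SIZE);
        buf.putLong(commitLogOffset).putInt(size).putLong(tagHash);
        buf.flip();
        return buf;
    }

    /** Reads the entry at a logical index using absolute gets, like indexing an array. */
    static ConsumeQueueEntry decode(ByteBuffer file, long logicalIndex) {
        int pos = (int) (logicalIndex % ENTRIES_PER_FILE) * ENTRY_SIZE;
        return new ConsumeQueueEntry(file.getLong(pos), file.getInt(pos + 8), file.getLong(pos + 12));
    }
}
```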
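
The "about 400M" figure can be reproduced with a short calculation. The sketch below is an assumption-laden illustration: the 40-byte header, the 5,000,000 hash slots of 4 bytes each, and the 20-byte index entry are assumed layout values, and `IndexFileLayout` is a hypothetical class. Only the 20 million entry count comes from the text.

```java
/** Hypothetical layout arithmetic for an IndexFile; constants other than INDEX_COUNT are assumptions. */
final class IndexFileLayout {
    static final int HEADER_SIZE = 40;         // assumed header size in bytes
    static final int SLOT_COUNT = 5_000_000;   // assumed number of hash slots
    static final int SLOT_SIZE = 4;            // assumed bytes per slot
    static final int INDEX_COUNT = 20_000_000; // 20 million entries, from the text
    static final int INDEX_SIZE = 20;          // assumed bytes per index entry

    /** Total file size: header + hash slots + index entries. */
    static long fileSize() {
        return HEADER_SIZE + (long) SLOT_COUNT * SLOT_SIZE + (long) INDEX_COUNT * INDEX_SIZE;
    }

    /** Hash-index lookup starts by mapping the key to a slot, as a HashMap maps a key to a bucket. */
    static int slotFor(String key) {
        int h = key.hashCode();
        int nonNegative = (h == Integer.MIN_VALUE) ? 0 : Math.abs(h);
        return nonNegative % SLOT_COUNT;
    }

    /** Byte position of a slot inside the file. */
    static long slotPosition(int slot) {
        return HEADER_SIZE + (long) slot * SLOT_SIZE;
    }
}
```

Under these assumptions the file is 420,000,040 bytes, which is roughly 400M.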

image-20230918214945104

To summarize: RocketMQ uses a hybrid storage structure, in which all queues under a single Broker instance share one log data file (CommitLog).

In this hybrid storage structure (the message bodies of multiple Topics are stored in one CommitLog), the data and index parts are stored separately, serving the Producer and the Consumer respectively. The Producer sends messages to the Broker, and the Broker persists them to the CommitLog using synchronous or asynchronous flushing.

As long as a message has been persisted to the CommitLog file on disk, the message sent by the Producer will not be lost, so the Consumer will always have a chance to consume it. When no message can be pulled, the Consumer can wait for the next pull. The server also supports long polling: if a pull request finds no message, the Broker is allowed to hold it for up to 30 seconds, and as soon as a new message arrives within that time it is returned to the Consumer directly.

Here, RocketMQ's specific approach is to use the background service thread on the Broker side - ReputMessageService to continuously distribute requests and asynchronously build ConsumeQueue (logical consumption queue) and IndexFile (index file) data.

image-20230918215018016

21. Talk about how RocketMQ reads and writes files?

RocketMQ's file reading and writing cleverly uses several efficient mechanisms of the operating system: PageCache, sequential reads and writes, and zero copy.

  • PageCache, sequential reading

In RocketMQ, the ConsumeQueue logical consumption queue stores little data and is read sequentially. Thanks to the read-ahead of the page cache, reading ConsumeQueue files is almost as fast as reading memory, and performance is not affected even when messages accumulate. For the CommitLog files that store the message data, reading message content produces more random reads, which seriously affects performance. Choosing an appropriate I/O scheduling algorithm, for example setting it to "Deadline" (when the block storage is SSD), also improves random read performance.

Page cache (PageCache) is the OS's cache of files, which is used to speed up reading and writing of files. Generally speaking, the speed of a program's sequential reading and writing of files is almost close to the reading and writing speed of the memory. The main reason is that the OS uses the PageCache mechanism to optimize the performance of read and write access operations and uses a part of the memory as PageCache. For data writing, the OS will first write it into the Cache, and then the pdflush kernel thread will flush the data in the Cache to the physical disk in an asynchronous manner. For data reading, if a PageCache miss occurs when reading a file, the OS will sequentially pre-read data files of other adjacent blocks while accessing the read file from the physical disk.

  • zero copy

In addition, RocketMQ mainly reads and writes files through MappedByteBuffer. It uses the FileChannel model in NIO to map physical files on disk directly to user-mode memory addresses. Compared with traditional IO, this mmap approach removes the overhead of copying file data back and forth between a buffer in the kernel address space and a buffer in the user application's address space. File operations become direct operations on memory addresses, which greatly improves read and write efficiency. Precisely because it relies on memory mapping, RocketMQ stores its files in fixed-length structures, so that an entire file can be mapped into memory at once.

What is zero copy?

In the operating system, using the traditional method, the data needs to be copied several times and needs to go through user mode/kernel mode switching.

image-20230918215058881

  1. Copy data from disk to kernel mode memory;
  2. Copy from kernel mode memory to user mode memory;
  3. Then copy from user mode memory to kernel mode memory of network driver;
  4. Finally, it is copied from the kernel mode memory of the network driver to the network card for transmission.

Therefore, zero copy can be used to reduce the number of context switches and memory copies between user mode and kernel mode and so improve I/O performance. A common implementation of zero copy is mmap, which Java exposes through MappedByteBuffer.
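
A minimal sketch of the mmap approach in plain Java follows. It is not RocketMQ code: `MmapSketch` is a hypothetical class that maps a fixed-size temporary file with `FileChannel.map`, writes through the mapping, calls `force()` to push the bytes to disk, and then reads them back from the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo of writing a file through a memory mapping. */
final class MmapSketch {
    /** Writes the message through a mapping of fileSize bytes and reads it back from disk. */
    static String writeAndReadBack(String message, int fileSize) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-sketch", ".log");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping in READ_WRITE mode grows the file to the fixed size in one step.
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
                mapped.put(payload); // a plain memory write, no write() system call
                mapped.force();      // flush the mapped region to the storage device
            }
            byte[] onDisk = Files.readAllBytes(file);
            return new String(onDisk, 0, payload.length, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The payload must fit within `fileSize`, which is one reason fixed-size files pair naturally with memory mapping.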

image-20230918215120209

22. How is message flushing implemented?

RocketMQ provides two disk flushing strategies: synchronous flushing and asynchronous flushing.

  • Synchronous flushing: After the message reaches the Broker's memory, it must be flushed to the commitLog log file to be considered successful, and then it will be returned to the Producer that the data has been sent successfully.
  • Asynchronous flushing: Asynchronous flushing means that after the message reaches the Broker memory, it returns to the Producer that the data has been sent successfully, and a thread is awakened to persist the data to the CommitLog log file.

Broker directly operates memory (memory mapped files) when accessing messages. This can improve the throughput of the system, but it cannot avoid data loss when the machine is powered off, so it needs to be persisted to disk.

Flushing is ultimately implemented with MappedByteBuffer.force() in NIO, which writes the data in the mapped region to disk. With synchronous flushing, after the Broker writes the message to the CommitLog mapped region, it waits for the write to disk to complete.

In terms of asynchronous operation, it only wakes up the corresponding thread and does not guarantee the timing of execution. The process is as shown in the figure.
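
The difference between the two strategies is only whether the acknowledgement waits for the flush. The sketch below models this with counters instead of real files. `FlushSketch`, `Mode`, and the method names are hypothetical, and a single-threaded executor stands in for the flush thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of the two flush strategies; not RocketMQ's real flush services. */
final class FlushSketch implements AutoCloseable {
    enum Mode { SYNC_FLUSH, ASYNC_FLUSH }

    private final Mode mode;
    private final AtomicInteger inMemory = new AtomicInteger(); // messages written to memory
    private final AtomicInteger onDisk = new AtomicInteger();   // messages known to be flushed
    private final ExecutorService flushThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "flush-sketch");
        t.setDaemon(true); // do not keep the JVM alive
        return t;
    });
    private volatile Future<?> lastFlush;

    FlushSketch(Mode mode) {
        this.mode = mode;
    }

    /** Writes one message to memory; returns how many messages are on disk at acknowledgement time. */
    int putMessage() {
        inMemory.incrementAndGet();
        lastFlush = flushThread.submit(() -> onDisk.set(inMemory.get())); // wake the flush thread
        if (mode == Mode.SYNC_FLUSH) {
            awaitFlush(); // acknowledge only after the flush has completed
        }
        return onDisk.get(); // in async mode this may still lag behind memory
    }

    /** Blocks until the most recently requested flush has finished. */
    void awaitFlush() {
        Future<?> flush = lastFlush;
        if (flush == null) {
            return;
        }
        try {
            flush.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    int flushedCount() {
        return onDisk.get();
    }

    @Override
    public void close() {
        flushThread.shutdown();
    }
}
```

In synchronous mode the value returned by `putMessage()` always includes the message just written. In asynchronous mode there is no such guarantee, which is exactly the window in which a power failure can lose data.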

image-20230918215157713

22. Can you tell me how RocketMQ’s load balancing is implemented?

Load balancing in RocketMQ is completed on the client side. Specifically, it can be divided into load balancing when the Producer side sends messages and load balancing when the Consumer side subscribes to messages.

Producer load balancing

When the Producer sends a message, it first looks up the TopicPublishInfo for the Topic. With this routing information, the RocketMQ client by default selects a queue (MessageQueue) from the messageQueueList in TopicPublishInfo in the selectOneMessageQueue() method and sends the message to it. There is a sendLatencyFaultEnable switch here. If it is turned on, Brokers that are currently marked unavailable are filtered out on top of the random-increment modulo selection.

image-20230918215220144

The so-called "latencyFaultTolerance" refers to backing off for a certain period after previous failures. For example, if the latency of the last request exceeds 550 ms, the Broker is backed off for 30000 ms; if it exceeds 1000 ms, it is backed off for 60000 ms. If the switch is off, a queue (MessageQueue) is selected by random-increment modulo alone. The latencyFaultTolerance mechanism is the key to achieving high availability for message sending.
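
The back-off idea can be sketched as follows. `LatencyFaultSketch` is a hypothetical class, and the two threshold tables are assumptions modeled on the client's default latency and back-off tables; treat the exact numbers as illustrative. The selection loop is the increment-modulo round robin, with Brokers still in their back-off window skipped.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of latency-based back-off; thresholds are assumed, not authoritative. */
final class LatencyFaultSketch {
    private static final long[] LATENCY_MAX = {50L, 100L, 550L, 1000L, 2000L, 3000L, 15000L};
    private static final long[] NOT_AVAILABLE_DURATION = {0L, 0L, 30000L, 60000L, 120000L, 180000L, 600000L};

    private final Map<String, Long> availableFrom = new HashMap<>(); // broker -> time it may be used again
    private int counter;

    /** Maps an observed latency to a back-off duration using the highest threshold reached. */
    static long backoffFor(long latencyMs) {
        for (int i = LATENCY_MAX.length - 1; i >= 0; i--) {
            if (latencyMs >= LATENCY_MAX[i]) {
                return NOT_AVAILABLE_DURATION[i];
            }
        }
        return 0L;
    }

    /** Records the latency of the last send to a broker and schedules its back-off. */
    void recordLatency(String broker, long latencyMs, long nowMs) {
        availableFrom.put(broker, nowMs + backoffFor(latencyMs));
    }

    /** Round robin over brokers, skipping those that are still backing off. */
    String selectBroker(List<String> brokers, long nowMs) {
        for (int i = 0; i < brokers.size(); i++) {
            String candidate = brokers.get(Math.floorMod(counter++, brokers.size()));
            if (availableFrom.getOrDefault(candidate, 0L) <= nowMs) {
                return candidate;
            }
        }
        // Every broker is backing off: fall back to plain round robin rather than failing.
        return brokers.get(Math.floorMod(counter++, brokers.size()));
    }
}
```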

Consumer load balancing

In RocketMQ, both consumption modes on the Consumer side (Push and Pull) obtain messages by pulling. Push mode is just an encapsulation of pull mode: the message pulling thread pulls a batch of messages from the server, submits them to the message consumption thread pool, and then immediately tries to pull again. If no message is fetched, it waits briefly and then pulls again. In both modes, the Consumer needs to know which message queue on the Broker to pull from. Load balancing is therefore needed on the Consumer side, that is, deciding which Consumers in the same ConsumerGroup consume which of the MessageQueues on the Broker.

  1. Heartbeat packet sending on the Consumer side

After the Consumer is started, a scheduled task continuously sends heartbeat packets to all Broker instances in the RocketMQ cluster. Each packet contains the message consumption group name, the set of subscription relationships, the message communication mode, and the client id. After receiving a Consumer's heartbeat, the Broker records it in consumerTable, a local cache variable of the ConsumerManager, and saves the encapsulated client network channel information in the local cache variable channelInfoTable. This provides dependable metadata for the subsequent load balancing on the Consumer side.

  2. The core class for implementing load balancing on the Consumer side: RebalanceImpl

In the startup process of the Consumer instance, the startup of the MQClientInstance instance will complete the startup of the load balancing service thread - RebalanceService (executed every 20 seconds).

By looking at the source code, we can find that the run() method of the RebalanceService thread ultimately calls the rebalanceByTopic() method of the RebalanceImpl class. This method is the core of achieving consumer-side load balancing.

The rebalanceByTopic() method will perform different logical processing depending on whether the consumer communication type is "broadcast mode" or "cluster mode". Here we mainly look at the main processing procedures in cluster mode:

(1) Obtain the message consumption queue set (mqSet) under the Topic topic from the local cache variable of the rebalanceImpl instance—topicSubscribeInfoTable;

(2) Call the mQClientFactory.findConsumerIdList() method based on topic and consumerGroup as parameters to send a communication request to the Broker to obtain the list of consumer IDs under the consumer group;

(3) First sort the message consumption queues and the consumer IDs under the Topic, then use the message queue allocation strategy (by default, the average allocation algorithm) to calculate the message queues to pull from. The average allocation algorithm is similar to paging: the sorted MessageQueues are treated as records and the sorted Consumers as pages. The algorithm works out the average number of records per page and the record range of each page, and then traverses that range to find the MessageQueues that should be allocated to the current Consumer.

image-20230918215304192

(4) Then, call the updateProcessQueueTableInRebalance() method. The specific method is to first perform a filtering comparison between the assigned message queue set (mqSet) and processQueueTable.

image-20230918215322343

  • The red part of processQueueTable in the figure above contains queues that are not in the allocated message queue set mqSet. Set the Dropped attribute of these queues to true, and then check whether they can be removed from the processQueueTable cache variable. This is done by the removeUnnecessaryMessageQueue() method, which tries for up to 1s to obtain the lock of the current consumption processing queue. It returns true if the lock is obtained, and false if the lock is still not obtained after waiting 1s. If true is returned, the corresponding Entry is removed from the processQueueTable cache variable.
  • The green part of processQueueTable in the figure above is the intersection with the allocated message queue set mqSet. Check whether each ProcessQueue has expired. In Pull mode nothing needs to be done. In Push mode, set the Dropped attribute to true, call the removeUnnecessaryMessageQueue() method, and try to remove the Entry as above.
  • Finally, create a ProcessQueue object for each MessageQueue in the filtered message queue set (mqSet) and store it in the processQueueTable of RebalanceImpl. During this step, the computePullFromWhere(MessageQueue mq) method of the RebalanceImpl instance is called to obtain the next consumption offset of the MessageQueue, which is then filled into the pullRequest object created next. Each pull request object (pullRequest) is added to the pull list (pullRequestList). Then the dispatchPullRequest() method is executed, which puts the PullRequest objects one by one into pullRequestQueue, the blocking queue of the PullMessageService service thread. The service thread takes them out and sends pull requests to the Broker. Note that the dispatchPullRequest() method differs between the two implementation classes RebalancePushImpl and RebalancePullImpl: in RebalancePullImpl the method is empty.

The core design idea of message queue load balancing among the consumers in one consumer group is that a message queue may be consumed by only one consumer in the group at any time, while one consumer may consume multiple message queues at the same time.
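
The average allocation from step (3) can be sketched as below. This is a minimal illustration of the paging analogy, assuming queues are identified by integer IDs and consumers by string IDs, both already sorted. `AverageAllocation` is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of average (paging-style) queue allocation. */
final class AverageAllocation {
    /** Returns the queues assigned to currentCid; both input lists must already be sorted. */
    static List<Integer> allocate(String currentCid, List<Integer> sortedQueueIds, List<String> sortedCids) {
        List<Integer> result = new ArrayList<>();
        int index = sortedCids.indexOf(currentCid);
        if (index < 0 || sortedQueueIds.isEmpty()) {
            return result; // unknown consumer or nothing to allocate
        }
        int queues = sortedQueueIds.size();
        int consumers = sortedCids.size();
        int mod = queues % consumers;
        // "Page size": the first `mod` consumers take one extra queue.
        int averageSize = queues <= consumers
                ? 1
                : (mod > 0 && index < mod ? queues / consumers + 1 : queues / consumers);
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        int range = Math.min(averageSize, queues - startIndex);
        for (int i = 0; i < range; i++) { // a non-positive range means no queue for this consumer
            result.add(sortedQueueIds.get(startIndex + i));
        }
        return result;
    }
}
```

With 8 queues and 3 consumers, the consumers receive queues 0 to 2, 3 to 5, and 6 to 7. With more consumers than queues, the surplus consumers receive nothing, which keeps each queue bound to exactly one consumer.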

23.Do you understand RocketMQ message long polling?

The so-called long polling means that when the Consumer pulls messages and the corresponding Queue has no data, the Broker does not return immediately. Instead, it holds the PullRequest until the queue has messages or the long-polling hold time expires, and then processes all PullRequests held on that queue again.

image-20230918215353728

  • PullMessageProcessor#processRequest

    // No message was found for this pull
    case ResponseCode.PULL_NOT_FOUND:
        // Both broker and consumer allow suspension (enabled by default)
        if (brokerAllowSuspend && hasSuspendFlag) {
            long pollingTimeMills = suspendTimeoutMillisLong;
            if (!this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                pollingTimeMills = this.brokerController.getBrokerConfig().getShortPollingTimeMills();
            }

            String topic = requestHeader.getTopic();
            long offset = requestHeader.getQueueOffset();
            int queueId = requestHeader.getQueueId();
            // Wrap the request in a PullRequest
            PullRequest pullRequest = new PullRequest(request, channel, pollingTimeMills,
                                                      this.brokerController.getMessageStore().now(), offset, subscriptionData, messageFilter);
            // Suspend (hold) the PullRequest
            this.brokerController.getPullRequestHoldService().suspendPullRequest(topic, queueId, pullRequest);
            // A null response means nothing is written back to the consumer yet
            response = null;
            break;
        }

For pending requests, a service thread will constantly check to see if there is data in the queue, or it will time out.

  • PullRequestHoldService#run()

    @Override
    public void run() {
        log.info("{} service started", this.getServiceName());
        while (!this.isStopped()) {
            try {
                // Wake up every 5 seconds with long polling, otherwise at the short-polling interval
                if (this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                    this.waitForRunning(5 * 1000);
                } else {
                    this.waitForRunning(this.brokerController.getBrokerConfig().getShortPollingTimeMills());
                }

                long beginLockTimestamp = this.systemClock.now();
                // Check the held requests
                this.checkHoldRequest();
                long costTime = this.systemClock.now() - beginLockTimestamp;
                if (costTime > 5 * 1000) {
                    log.info("[NOTIFYME] check hold request cost {} ms.", costTime);
                }
            } catch (Throwable e) {
                log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }

        log.info("{} service end", this.getServiceName());
    }
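
The hold-and-check cycle in the two excerpts above can be modeled without any networking. The sketch below is a simplified, single-queue illustration: `PullHoldSketch`, `HeldRequest`, and the string responses are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical single-queue model of long polling; not the Broker's real classes. */
final class PullHoldSketch {
    private static final class HeldRequest {
        final long pullFromOffset;
        final long suspendAtMs;
        final long timeoutMs;

        HeldRequest(long pullFromOffset, long suspendAtMs, long timeoutMs) {
            this.pullFromOffset = pullFromOffset;
            this.suspendAtMs = suspendAtMs;
            this.timeoutMs = timeoutMs;
        }
    }

    private final List<HeldRequest> held = new ArrayList<>();
    private long maxOffset; // number of messages currently in the queue

    /** Answers at once if data exists; otherwise holds the request and returns null. */
    String pull(long offset, long nowMs, long timeoutMs) {
        if (offset < maxOffset) {
            return "FOUND@" + offset;
        }
        held.add(new HeldRequest(offset, nowMs, timeoutMs));
        return null; // like response = null: nothing is written back yet
    }

    /** A new message arrives in the queue. */
    void appendMessage() {
        maxOffset++;
    }

    /** Periodic check: answer held requests that now have data or have timed out. */
    List<String> checkHoldRequests(long nowMs) {
        List<String> responses = new ArrayList<>();
        Iterator<HeldRequest> it = held.iterator();
        while (it.hasNext()) {
            HeldRequest r = it.next();
            if (r.pullFromOffset < maxOffset) {
                responses.add("FOUND@" + r.pullFromOffset);
                it.remove();
            } else if (nowMs >= r.suspendAtMs + r.timeoutMs) {
                responses.add("TIMEOUT@" + r.pullFromOffset);
                it.remove();
            }
        }
        return responses;
    }

    int heldCount() {
        return held.size();
    }
}
```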

Source of data: Counterattack of scumbags: RocketMQ Twenty-three Questions (qq.com)

Origin blog.csdn.net/weixin_45483322/article/details/132998780