Basic Concepts and System Architecture of RocketMQ

Basic Concepts

Message

A Message is the physical carrier of the information transmitted by the messaging system and the smallest unit of production and consumption; every message must belong to a Topic.

Topic

A Topic represents a collection of messages. Each Topic contains several messages, and each message can belong to only one Topic; the Topic is the basic unit of message subscription in RocketMQ. topic:message is 1:n; message:topic is 1:1.

A producer can send messages of multiple Topics at the same time (producer:topic is 1:n), while a consumer is only interested in one specific Topic (consumer:topic is 1:1), that is, a consumer subscribes to and consumes the messages of only one Topic.

Tags

A Tag is a label set on a message, used to distinguish different kinds of messages under the same Topic. For messages from the same business unit, different Tags can be set under one Topic according to different business purposes. Tags help keep code clear and coherent and work well with the query facilities RocketMQ provides. Consumers can apply different consumption logic to different sub-topics based on Tags, which gives better extensibility.
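As an illustration of the idea, here is a minimal sketch (a simplified model, not the real RocketMQ client API; the topic and tag names are made up) of how a consumer-side tag expression such as "TagA || TagB" selects messages that share one Topic but carry different Tags:

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of RocketMQ-style tag filtering, for illustration only.
public class TagFilterSketch {

    // Return true when the message's tag satisfies the subscription expression.
    // "*" subscribes to every tag, mirroring RocketMQ's wildcard subscription.
    static boolean matches(String subExpression, String messageTag) {
        if ("*".equals(subExpression)) return true;
        return Arrays.stream(subExpression.split("\\|\\|"))
                     .map(String::trim)
                     .anyMatch(tag -> tag.equals(messageTag));
    }

    public static void main(String[] args) {
        // Three messages of a hypothetical "OrderTopic", distinguished by tag.
        List<String> tags = List.of("create", "pay", "refund");

        // A consumer interested only in creation and payment events:
        long consumed = tags.stream()
                            .filter(t -> matches("create || pay", t))
                            .count();
        System.out.println(consumed); // 2
    }
}
```

In the real client the expression is passed to `subscribe(topic, "TagA || TagB")` and the Broker does most of the filtering, but the matching semantics are the same.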

Queue

A Queue is a physical entity that stores messages. A Topic can contain multiple Queues, each of which stores part of the Topic's messages. A Queue is also called a partition of the messages in a Topic.

The messages in one Queue of a Topic can be consumed by only one consumer of a consumer group; a Queue does not allow multiple consumers of the same consumer group to consume it at the same time.
When consulting other related materials you may also encounter the concept of sharding. A shard is different from a partition: in RocketMQ, a shard refers to a Broker that stores the given Topic, so the number of shards equals the number of Brokers holding that Topic. In each shard a corresponding number of partitions, i.e. Queues, is created, and every Queue has the same size.

Message ID (MessageId/Key)

Each message in RocketMQ has a unique MessageId and can carry a Key containing a business identifier to make message lookup easier. Note that there are actually two MessageIds: one automatically generated when the producer sends the message (msgId), and one automatically generated by the Broker when the message reaches the Broker (offsetMsgId). msgId, offsetMsgId, and key are all called message identifiers.

  • msgId: generated on the producer side; its generation rule is: producerIp + process PID + hashcode of the classloader of the MessageClientIDSetter class + current time + AtomicInteger auto-increment counter
  • offsetMsgId: generated on the broker side; its generation rule is: brokerIp + physical partition offset
  • key: a business-related unique identifier specified by the user
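The msgId recipe above can be approximated with a hedged sketch (the real client packs these fields as bytes; this version concatenates readable strings purely for illustration, and the format string is an assumption):

```java
import java.net.InetAddress;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of the msgId ingredients: producer IP + process PID +
// classloader hashcode + current time + auto-increment counter.
public class MsgIdSketch {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    private static String localIp() {
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (Exception e) {
            return "127.0.0.1"; // fallback when the hostname cannot be resolved
        }
    }

    static String nextMsgId() {
        long pid = ProcessHandle.current().pid();
        int loaderHash = System.identityHashCode(MsgIdSketch.class.getClassLoader());
        return String.format("%s-%d-%08x-%d-%d",
                localIp(), pid, loaderHash,
                System.currentTimeMillis(), COUNTER.getAndIncrement());
    }

    public static void main(String[] args) {
        // The counter makes consecutive ids unique within one producer process.
        System.out.println(nextMsgId().equals(nextMsgId())); // false
    }
}
```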

System Architecture

The RocketMQ architecture is mainly divided into four parts: Producer, Consumer, NameServer, and Broker.

Producer

The message producer is responsible for producing messages. The Producer selects the corresponding queue in the Broker cluster through MQ's load-balancing module and delivers the message to it. The delivery process supports fast failure and low latency.

For example, a business system writing the logs it generates to MQ is message production.
Likewise, an e-commerce platform writing users' flash-sale (seckill) requests to MQ is message production.

Message producers in RocketMQ always appear in the form of producer groups (Producer Group). A producer group is a collection of producers of the same kind that send messages of the same Topic type. A producer group can send messages of multiple Topics at the same time.

Consumer

The message consumer is responsible for consuming messages. A message consumer will obtain messages from the Broker server and perform related business processing on the messages.

For example, a QoS system reading logs from MQ and parsing and processing them is message consumption.
Likewise, the e-commerce platform's business system reading flash-sale requests from MQ and handling them is message consumption.

Message consumers in RocketMQ also appear in the form of consumer groups (Consumer Group). A consumer group is a collection of consumers of the same kind; these Consumers consume messages of the same Topic type. Consumer groups make two goals easy to achieve on the consumption side: load balancing (distributing the different Queues of one Topic evenly among the different Consumers of the same Consumer Group; note that it is the Queues, not the individual messages, that are balanced) and fault tolerance (if one Consumer fails, the other Consumers in the Consumer Group can take over the Queues it was consuming).
The number of Consumers in a consumer group should be less than or equal to the number of Queues of the subscribed Topic. If the Consumers outnumber the Queues, the extra Consumers will be unable to consume messages.
Messages of one Topic can be consumed by multiple consumer groups at the same time.

Note: 1) The consumers in a consumer group must subscribe to exactly the same Topic.
2) A consumer group can only consume the messages of one Topic; it cannot consume several Topics at the same time.
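The queue-to-consumer constraints above can be sketched in a few lines (an assumption-laden toy resembling RocketMQ's "average" allocation strategy, not client source code): each Queue goes to exactly one Consumer of the group, and surplus Consumers get nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of dividing a Topic's Queues among the Consumers of one group.
public class QueueAllocationSketch {
    static List<List<Integer>> allocate(int queueCount, int consumerCount) {
        List<List<Integer>> result = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) result.add(new ArrayList<>());
        for (int q = 0; q < queueCount; q++) {
            result.get(q % consumerCount).add(q); // round-robin assignment
        }
        return result;
    }

    public static void main(String[] args) {
        // 4 queues, 3 consumers: consumer 0 gets two queues.
        System.out.println(allocate(4, 3)); // [[0, 3], [1], [2]]
        // 2 queues, 3 consumers: consumer 2 sits idle, as the note warns.
        System.out.println(allocate(2, 3)); // [[0], [1], []]
    }
}
```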

NameServer

Functions

The NameServer is a registry for Brokers and Topic routing; it supports dynamic registration and discovery of Brokers.

RocketMQ's design idea comes from Kafka, and Kafka depends on Zookeeper. So the early versions of RocketMQ, MetaQ v1.0 and v2.0, also depended on Zookeeper. Starting with MetaQ v3.0 (i.e. RocketMQ), the Zookeeper dependency was removed in favor of RocketMQ's own NameServer.

It mainly provides two functions:

  • Broker management:
    accepts the registration information of the Broker cluster and stores it as the basic data for routing; provides a heartbeat mechanism to check whether each Broker is still alive.
  • Routing information management:
    each NameServer holds the complete routing information of the Broker cluster together with the queue information used by clients. Through the NameServer, Producers and Consumers can obtain the routing information of the whole Broker cluster and thus deliver and consume messages.

Route Registration

NameServers are usually also deployed as a cluster, but the NameServer is stateless: the nodes of a NameServer cluster are identical and do not communicate with one another. Then how is the data in the nodes kept consistent? When a Broker node starts, it iterates over the NameServer list, establishes a long connection with every NameServer node, and sends a registration request. Each NameServer maintains an internal Broker list in which the Broker information is stored dynamically.

To prove that it is alive and keep its long connections active, a Broker node reports its latest information to the NameServers in heartbeat packets, sending one every 30 seconds. A heartbeat packet contains the BrokerId, the Broker address (IP + Port), the Broker name, the name of the cluster the Broker belongs to, and so on. On receiving a heartbeat packet, a NameServer updates the heartbeat timestamp, recording the latest time this Broker was known to be alive.

Note that this is different from other registries such as Zookeeper, Eureka, and Nacos.
What are the advantages and disadvantages of this stateless NameServer design?
Advantage: NameServer clusters are easy to build and easy to scale out or in.
Disadvantage: every Broker must list all NameServer addresses explicitly; any NameServer not listed will never receive the Broker's registration. For the same reason the NameServer cluster cannot be expanded casually: unless the Brokers are reconfigured, a newly added NameServer is invisible to them, and they will not register with it.


Route Removal

If the NameServer stops receiving a Broker's heartbeats, whether because the Broker was shut down, crashed, or due to network jitter, the NameServer will remove that Broker from the Broker list.
A scheduled task inside the NameServer scans the Broker table every 10 seconds and checks whether each Broker's latest heartbeat timestamp is more than 120 seconds before the current time. If it is, the Broker is judged to have failed and is removed from the Broker list.
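The eviction rule just described can be sketched as follows (a simplified model, not NameServer source code; the broker addresses are made up):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the NameServer's 120-second heartbeat expiry rule.
public class BrokerLivenessSketch {
    static final long EXPIRE_MS = 120_000;
    final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    void onHeartbeat(String brokerAddr, long nowMs) {
        lastHeartbeat.put(brokerAddr, nowMs); // refresh the survival timestamp
    }

    // Invoked by a scheduled task every 10 seconds in the real NameServer.
    void scanNotActiveBroker(long nowMs) {
        lastHeartbeat.entrySet().removeIf(e -> nowMs - e.getValue() > EXPIRE_MS);
    }

    public static void main(String[] args) {
        BrokerLivenessSketch ns = new BrokerLivenessSketch();
        ns.onHeartbeat("192.168.0.1:10911", 0);
        ns.onHeartbeat("192.168.0.2:10911", 100_000);
        ns.scanNotActiveBroker(130_000); // broker 1 is 130 s stale: evicted
        System.out.println(ns.lastHeartbeat.keySet()); // [192.168.0.2:10911]
    }
}
```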

Extension: daily operation and maintenance of RocketMQ, e.g. upgrading a Broker, requires stopping the Broker's work. What does the OP need to do?
The OP disables the Broker's read and write permissions. Any client (Consumer or Producer) that then sends a request to the Broker receives a NO_PERMISSION response and retries against other Brokers.
When the OP observes that the Broker no longer carries traffic, the Broker is shut down, completing its removal from the NameServer.
OP: Operations Engineer
SRE: Site Reliability Engineer

Route Discovery

RocketMQ's route discovery uses a Pull model. When Topic routing information changes, the NameServer does not actively push it to clients; instead, each client periodically pulls the latest Topic routing information. By default, a client pulls the latest routes every 30 seconds.

Extension:
1) Push model: the server pushes. It has better real-time behavior and is a publish-subscribe model that must maintain long-lived connections, which cost resources. It suits scenarios with:
  • high real-time requirements
  • a small number of clients whose server-side data changes frequently
2) Pull model: the client pulls. Its problem is poor real-time behavior.
3) Long-polling model: an integration of the Push and Pull models that exploits the advantages of both while masking their disadvantages.
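The long-polling idea can be shown in a minimal sketch (a toy under simplifying assumptions, not RocketMQ code): the client "pulls", but the server holds the request open until data arrives or a timeout fires, combining pull semantics with push-like latency.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy long-polling server: poll() blocks until data arrives or times out.
public class LongPollingSketch {
    private final Queue<String> data = new ArrayDeque<>();

    synchronized void publish(String item) {
        data.add(item);
        notifyAll(); // wake any parked long-poll request
    }

    // Block for up to timeoutMs waiting for data; return null on timeout.
    synchronized String poll(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (data.isEmpty()) {
            long wait = deadline - System.currentTimeMillis();
            if (wait <= 0) return null; // timed out: the client will re-poll
            wait(wait);
        }
        return data.poll();
    }

    public static void main(String[] args) throws Exception {
        LongPollingSketch server = new LongPollingSketch();
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            server.publish("route-update");
        }).start();
        // The poll call parks until the update arrives, well before the timeout.
        System.out.println(server.poll(1000)); // route-update
    }
}
```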

Client NameServer Selection Policy

A client must be configured with the addresses of the NameServer cluster, so which NameServer node does it connect to? The client first generates a random number and takes it modulo the number of NameServer nodes; the result is the index of the node to connect to, and the connection is attempted. If that connection fails, the client tries the other nodes one by one using a round-robin strategy.

So the first choice uses a random strategy, and after a failure a round-robin strategy is used.
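The random-then-round-robin policy can be sketched like this (simplified; in the real client this logic lives in the remoting layer, and the predicate standing in for "can connect" is an assumption for testability):

```java
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Sketch of the client's NameServer selection: random first pick,
// round-robin fallback over the remaining nodes.
public class NameServerPickSketch {
    static String pick(List<String> servers, Predicate<String> canConnect, long seed) {
        // Random number modulo the node count gives the first candidate index.
        int start = Math.floorMod(new Random(seed).nextInt(), servers.size());
        for (int i = 0; i < servers.size(); i++) {
            String candidate = servers.get((start + i) % servers.size()); // round-robin
            if (canConnect.test(candidate)) return candidate;
        }
        return null; // every node was unreachable
    }

    public static void main(String[] args) {
        List<String> ns = List.of("ns1:9876", "ns2:9876", "ns3:9876");
        // Suppose ns2 is down: the policy skips it and settles on another node.
        String chosen = pick(ns, addr -> !addr.startsWith("ns2"), 42);
        System.out.println(chosen != null && !chosen.startsWith("ns2")); // true
    }
}
```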

Extension: how does a Zookeeper client choose a Zookeeper server?
In short: after two shuffles, the first Zookeeper server is selected.
In detail: the Zookeeper server addresses in the configuration file, which are usually hostnames, are shuffled once and one is chosen at random. All the IPs that this hostname resolves to are then obtained and shuffled a second time, and the client connects to the first address in the shuffled result.

Broker

Features

The Broker plays the role of message relay, responsible for storing and forwarding messages. In the RocketMQ system the Broker receives and stores the messages sent by producers and stands ready for consumers' pull requests. The Broker also stores message-related metadata, including consumer groups' consumption progress (offsets), Topics, Queues, and so on.

After Kafka version 0.8, offsets are stored in the Broker; earlier versions stored them in Zookeeper.

Module Composition

The following figure is a schematic diagram of the functional modules of Broker Server.
Remoting Module: the entire Broker entity, responsible for processing requests from clients; it is composed of the following modules.
Client Manager: client manager. Receives and parses client (Producer/Consumer) requests and manages the clients, e.g. maintaining Consumers' Topic subscription information.
Store Service: storage service. Provides a simple, convenient API for storing messages on the physical disk and for querying them.
HA Service: high-availability service. Provides data synchronization between the Master Broker and its Slave Brokers.
Index Service: index service. Builds an index, keyed by Message Key, over the messages delivered to the Broker, and provides fast message lookup by Message Key.

Cluster Deployment

To improve Broker performance and throughput, Brokers usually appear as a cluster, and the different Queues of one Topic may be stored on different cluster nodes. But this raises a question: if a Broker node goes down, how do we keep its data from being lost? The answer is to expand each Broker node horizontally, i.e. to build each Broker node out into an HA cluster, eliminating the single point of failure.

The Broker node cluster is a master-slave cluster, i.e. it has two roles, Master and Slave. The Master handles read and write requests, while the Slave backs up the Master's data; when the Master goes down, the Slave switches over and works as Master. The Broker cluster is therefore an active-standby cluster. A Master can have multiple Slaves, but a Slave belongs to exactly one Master. The Master-Slave relationship is established by giving both the same BrokerName but different BrokerIds: BrokerId 0 means Master, non-zero means Slave. Every Broker keeps long connections with all nodes of the NameServer cluster and periodically registers its Topic information with every NameServer.
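As an illustration, a Master/Slave pair sharing a BrokerName but differing in BrokerId might be configured like this (property names follow the standard broker.conf conventions; the cluster name, broker name, and addresses are made-up examples):

```properties
# ---- master node ----
brokerClusterName = DefaultCluster
brokerName = broker-a
brokerId = 0              # 0 => Master
brokerRole = SYNC_MASTER
namesrvAddr = ns1:9876;ns2:9876

# ---- slave node ----
brokerClusterName = DefaultCluster
brokerName = broker-a
brokerId = 1              # non-zero => Slave
brokerRole = SLAVE
namesrvAddr = ns1:9876;ns2:9876
```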

Workflow

Specific steps

  • 1) Start the NameServer. Once started, it listens on its port, waiting for connections from Brokers, Producers, and Consumers.

  • 2) Start the Brokers. Each Broker establishes and maintains a long connection with every NameServer and then sends them a heartbeat packet every 30 seconds.

  • 3) Before sending messages, a Topic can be created first. Creating a Topic requires specifying which Brokers it will be stored on, and the Topic-Broker relationship is then also written into the NameServer. This step is optional: a Topic can also be created automatically when a message is first sent.

  • 4) When a Producer starts, it first establishes a long connection with one node of the NameServer cluster and obtains routing information from it, i.e. the mapping between the Queues of the Topic being sent and the addresses (IP + Port) of the Brokers holding them. It then selects a Queue according to an algorithmic strategy, establishes a long connection with the Broker holding that Queue, and sends the message to it. After obtaining the routing information, the Producer caches it locally and refreshes it from the NameServer every 30 seconds.

  • 5) A Consumer is similar to a Producer: it establishes a long connection with one NameServer, obtains the routing information of the Topic it subscribes to, chooses the Queues it will consume according to an algorithmic strategy, then establishes long connections directly with the Brokers and starts consuming. The Consumer also refreshes the routing information from the NameServer every 30 seconds. Unlike a Producer, however, a Consumer also sends heartbeats to the Broker so the Broker knows the Consumer is alive.

Topic Creation Modes

When creating a Topic manually, there are two modes:

  • Cluster mode: a Topic created in this mode has the same number of Queues on every Broker in the cluster.
  • Broker mode: a Topic created in this mode may have a different number of Queues on each Broker.

When a Topic is created automatically, Broker mode is used by default, creating 4 Queues on each Broker.

Read/Write Queues

Physically, a read queue and a write queue are the same queue, so there is no data-synchronization problem between read and write queues. Read and write queues are distinct only as logical concepts. In general, the numbers of read and write queues are configured to be equal.

For example, when a Topic is created with 8 write queues and 4 read queues, the system creates 8 Queues, numbered 0 through 7. Producers write messages to all 8 queues, but Consumers consume only from queues 0-3; the messages in queues 4-7 are never consumed.

Conversely, when a Topic is created with 4 write queues and 8 read queues, the system again creates 8 Queues, numbered 0 through 7. Producers write messages only to queues 0-3, while Consumers consume from all 8 queues, though queues 4-7 contain no messages. Suppose the Consumer Group contains two Consumers, with Consumer1 consuming queues 0-3 and Consumer2 consuming queues 4-7: in reality Consumer2 then has no messages to consume.

That is, whenever the numbers of read and write queues differ, something is always wasted. So why is it designed this way?

The purpose of this design is to make shrinking a Topic's Queues easy.

For example, a Topic was created with 16 Queues; how can they be reduced to 8 without losing messages? Dynamically set the number of write queues to 8 while leaving the number of read queues unchanged. New messages can then enter only the first 8 queues, but Consumers still consume from all 16. Once the messages in the last 8 Queues have all been consumed, the number of read queues can be dynamically set to 8 as well. No message is lost during the whole shrinking process.
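The shrink procedure can be modeled in a few lines (a simplified sketch, not broker code; the modulo queue selection stands in for whatever strategy the producer actually uses):

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the shrink step: writeQueueNums lowered to 8 while
// readQueueNums stays 16, so new messages land only in queues 0..7.
public class ReadWriteQueueSketch {
    // Producers pick among the write queues only.
    static int pickWriteQueue(int msgSeq, int writeQueueNums) {
        return msgSeq % writeQueueNums;
    }

    public static void main(String[] args) {
        int writeQueueNums = 8;  // shrink step 1: halve the write queues
        int readQueueNums = 16;  // readers still cover all the old queues

        Set<Integer> touched = new HashSet<>();
        for (int seq = 0; seq < 100; seq++)
            touched.add(pickWriteQueue(seq, writeQueueNums));

        // Every new message stays inside the first 8 queues ...
        System.out.println(touched.stream().allMatch(q -> q < 8)); // true
        // ... so once queues 8..15 drain, readQueueNums can drop to 8 too.
        System.out.println(readQueueNums > writeQueueNums); // true
    }
}
```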

perm sets the operation permission of the Topic being created: 2 means write-only, 4 means read-only, and 6 means read-write.
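These perm values are simply bit flags: write is bit 1 (value 2), read is bit 2 (value 4), and read-write is their bitwise OR. A small sketch (the constant names are illustrative, not RocketMQ source identifiers):

```java
// Topic perm as bit flags: 2 = write, 4 = read, 6 = read | write.
public class TopicPermSketch {
    static final int PERM_WRITE = 1 << 1; // 2
    static final int PERM_READ  = 1 << 2; // 4

    static boolean isReadable(int perm)  { return (perm & PERM_READ) != 0; }
    static boolean isWriteable(int perm) { return (perm & PERM_WRITE) != 0; }

    public static void main(String[] args) {
        int perm = PERM_READ | PERM_WRITE;
        System.out.println(perm);           // 6
        System.out.println(isReadable(4));  // true
        System.out.println(isWriteable(4)); // false
    }
}
```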


Origin blog.csdn.net/Java_Fly1/article/details/129153467