[Alibaba Middleware Technology Series] "RocketMQ Technology Special Topic": a beginner's first taste of the most basic RocketMQ!

In response to private messages from readers asking for an introduction to the basics of RocketMQ, let's start from zero: we will walk through the basic concepts of RocketMQ and lay a solid foundation for learning and using it.

RocketMQ is a fast, reliable, distributed, and easy-to-use message middleware developed by Alibaba. Its predecessor is MetaQ, which can be regarded as a Java version of LinkedIn's Kafka (written in Scala) with transaction support added.

RocketMQ is MetaQ 3.0. Compared with the original Kafka, it is good not only at the original log-collection scenario but also adds features such as HA and transactions, so that functionally it can replace most traditional MQs.

Reliable FIFO and strict message order
Pub/Sub and P2P messaging models
Ability to hold millions of messages in a single queue
Pull and push queues
Various messaging protocols, such as JMS, MQTT, etc.
Distributed cluster with fault tolerance
Docker images for isolated testing and cloud-isolated clusters
Rich management features for configuration and monitoring

A Topic is a category of messages. Within a system, we can classify messages by Topic so that different kinds of messages are sent to different queues.

Under a topic we can set up multiple queues, and each queue is what we usually call a message queue.
Because a queue belongs entirely to one specific topic, we always specify which topic a message belongs to when we send it.
From the topic, equeue knows how many queues exist under it, but which queue should a given message go to? For example, if there are 4 queues under the topic, which of them receives a message sent to this topic?
At present, equeue requires the user to specify, when sending a message, the topic of the message and a parameter of type object that is used for routing.
equeue looks up all the queues of the topic, takes the hash code of the object parameter, and takes that value modulo the number of queues to obtain the index of the queue to send to.
This routing is done on the sending side, that is, by the producer described below. It is not done on the message server because doing it on the sending side lets users decide how messages are routed, which gives greater flexibility.
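The hash-and-modulo routing described above can be sketched as follows. This is a minimal illustration, not equeue's or RocketMQ's actual code: the class `QueueRouter`, the method `selectQueue`, and the sample order IDs are hypothetical names chosen for this example.

```java
import java.util.List;

final class QueueRouter {
    // Maps a routing key to a queue index: hash code modulo the queue count.
    static int selectQueue(Object routingKey, int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        // floorMod keeps the result in [0, queueCount) even when hashCode() is negative.
        return Math.floorMod(routingKey.hashCode(), queueCount);
    }

    public static void main(String[] args) {
        int queueCount = 4; // assumed: the topic has 4 queues, as in the example above
        for (String orderId : List.of("order-1001", "order-1002", "order-1001")) {
            System.out.println(orderId + " -> queue " + selectQueue(orderId, queueCount));
        }
    }
}
```

Because the same key always produces the same hash code, messages that share a routing key land in the same queue, which is what makes per-key ordering possible.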

The Producer is the producer of the message queue. The essence of a message queue is to implement the publish-subscribe pattern, that is, the producer-consumer pattern: producers produce messages and consumers consume them. So the Producer here is what produces and sends messages.

The Consumer is the consumer of the message queue. A message can have multiple consumers.

The consumer group may be a new concept for many readers. Consumer groups exist to support the cluster consumption described below. A consumer group contains a number of consumers; under cluster consumption, these consumers share the group's messages evenly among themselves.

The broker in equeue is responsible for relaying messages: it receives the messages sent by producers, persists them to disk, accepts pull requests from consumers, and returns the corresponding messages to each consumer according to its request.
A broker can therefore be understood as a message queue server that provides services for receiving, storing, and fetching messages.
The broker is the core of equeue and must not go down; once it does, producers and consumers can no longer publish or subscribe.
CPU resources are traded for network card traffic;
FilterServer and Broker are deployed on the same machine and communicate over the local loopback interface, so the data does not pass through the network card;
One Broker deploys multiple FilterServers to make full use of CPU resources, because a single JVM can hardly saturate the CPUs of a high-end physical machine;
Because the filtering code is written in Java, the application can perform almost any form of server-side message filtering, such as filtering on the Message Header or even on the Message Body;
Using Java as the filtering language is a double-edged sword: it makes filtering convenient for the application but introduces security risks on the server side. The application is responsible for keeping its filtering code safe. For example, the filtering code should avoid allocating large amounts of memory or creating threads, so as not to leak resources on the Broker server.
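To make the idea of header-based and body-based filtering concrete, here is a small sketch. It does not use the FilterServer's real interface: the `Message` class and the helper methods `headerEquals` and `bodyAtMost` are hypothetical, introduced only for illustration. Note that both filters are pure checks that allocate nothing and start no threads, in line with the caution above.

```java
import java.util.Map;
import java.util.function.Predicate;

final class FilterSketch {
    // Hypothetical message shape: header properties plus a raw body.
    static final class Message {
        final Map<String, String> headers;
        final byte[] body;

        Message(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Header filter: accept messages whose header `name` equals `expected`.
    static Predicate<Message> headerEquals(String name, String expected) {
        return m -> expected.equals(m.headers.get(name));
    }

    // Body filter: accept messages whose body is at most `maxBytes` long.
    static Predicate<Message> bodyAtMost(int maxBytes) {
        return m -> m.body.length <= maxBytes;
    }

    public static void main(String[] args) {
        Message msg = new Message(Map.of("region", "eu"), new byte[] {1, 2, 3});
        Predicate<Message> filter = headerEquals("region", "eu").and(bodyAtMost(16));
        System.out.println("accepted: " + filter.test(msg));
    }
}
```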
SEND_OK: the message was sent successfully;
FLUSH_DISK_TIMEOUT: the message was sent successfully, but flushing to disk on the server timed out. The message has entered the server queue and is lost only if the server goes down at this moment;
FLUSH_SLAVE_TIMEOUT: the message was sent successfully, but the server timed out while synchronizing it to the slave. The message has entered the server queue and is lost only if the server goes down at this moment;
SLAVE_NOT_AVAILABLE: the message was sent successfully, but no slave was available at the time. The message has entered the server queue and is lost only if the server goes down at this moment;
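As an illustration of how a sender might react to these four statuses, the sketch below uses a local enum that merely mirrors the names listed above; it is a stand-in, not the client library's own type. The class `SendResultHandling` and its methods are hypothetical, and what an application actually does with a non-`SEND_OK` result (log it, resend, alert) is a policy choice assumed here for the example.

```java
final class SendResultHandling {
    // Local stand-in mirroring the four status names above (not the library's type).
    enum SendStatus { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }

    // True only when there is no remaining window in which the message could be lost.
    static boolean isFullyAcknowledged(SendStatus status) {
        return status == SendStatus.SEND_OK;
    }

    // Human-readable summary of what each status means for durability.
    static String describe(SendStatus status) {
        switch (status) {
            case SEND_OK:
                return "sent successfully";
            case FLUSH_DISK_TIMEOUT:
                return "queued on server, disk flush timed out";
            case FLUSH_SLAVE_TIMEOUT:
                return "queued on server, sync to slave timed out";
            case SLAVE_NOT_AVAILABLE:
                return "queued on server, no slave available";
            default:
                throw new IllegalStateException("unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        for (SendStatus status : SendStatus.values()) {
            String note = isFullyAcknowledged(status) ? "" : " (lost only if the server goes down now)";
            System.out.println(status + ": " + describe(status) + note);
        }
    }
}
```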

Cluster consumption means that the consumers in a consumer group evenly share the queues under a topic.

If a topic has 4 queues and a consumer group has 4 consumers, each consumer is assigned one queue under the topic, so the topic's queues are consumed evenly.
If there are only two consumers in the consumer group, each consumer consumes two queues.
If there are 3 consumers, the first consumes 2 queues and each of the other two consumes 1 queue, so that consumption is as even as possible.
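The allocation examples above can be reproduced with a short sketch. It assumes the simplest averaging rule consistent with those examples (queues are split into contiguous blocks, and the first consumers absorb any remainder); the class `AverageAllocation` and method `allocate` are hypothetical names, and a real client may order or rebalance queues differently.

```java
import java.util.ArrayList;
import java.util.List;

final class AverageAllocation {
    // Returns the queue indices assigned to the consumer at position consumerIndex.
    static List<Integer> allocate(int queueCount, int consumerCount, int consumerIndex) {
        if (queueCount < 0 || consumerCount <= 0
                || consumerIndex < 0 || consumerIndex >= consumerCount) {
            throw new IllegalArgumentException("invalid allocation arguments");
        }
        int base = queueCount / consumerCount;      // queues every consumer gets
        int remainder = queueCount % consumerCount; // extra queues, one each to the first consumers
        int size = base + (consumerIndex < remainder ? 1 : 0);
        int start = consumerIndex * base + Math.min(consumerIndex, remainder);
        List<Integer> assigned = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            assigned.add(start + i);
        }
        return assigned;
    }

    public static void main(String[] args) {
        // 4 queues shared by 3 consumers, as in the example above.
        for (int consumer = 0; consumer < 3; consumer++) {
            System.out.println("consumer " + consumer + " -> queues " + allocate(4, 3, consumer));
        }
    }
}
```

With 4 queues and 3 consumers, this assigns queues 0 and 1 to the first consumer and one queue each to the other two, matching the 2/1/1 split described above.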

Try to keep the number of consumers in a consumer group equal to the number of queues under the topic, or choose the two so that the queue count is an exact multiple of the consumer count. That way every consumer consumes the same number of queues, and the load on each consumer server is similar. The premise, of course, is that the queues under the topic hold roughly the same number of messages, which we can ensure by hash-routing messages on a user-defined key.

Broadcast consumption means that as long as a consumer subscribes to the messages of a topic, it receives the messages in all queues under that topic, regardless of which group the consumer belongs to. For broadcast consumption, therefore, the consumer group has no practical significance. When instantiating a consumer, we can specify whether it uses cluster consumption or broadcast consumption.

For cluster consumption and broadcast consumption, the place where the consumption progress is persisted is different. The consumption progress of cluster consumption is placed on the broker, that is, the message queue server, while the consumption progress of broadcast consumption is stored on the consumer's local disk.

The consumer of a queue may be replaced: the number of consumers in a consumer group can grow or shrink, and when it does, the queues assigned to each consumer are recalculated. So when the consumer of a queue changes, how does the new consumer know where to start consuming that queue?

If the consumption progress of the queue were stored on the previous consumer's server, it would be hard to retrieve, because that server may have crashed or been decommissioned. The broker, on the other hand, is always serving all consumers, so under cluster consumption the consumption position of each subscribed queue is stored on the broker, isolated by consumer group. This ensures that the consumption progress of consumers in different consumer groups does not interfere with each other.

For broadcast consumption, the consumer of a queue never changes, so there is no need for the broker to save the consumption position; it is saved on the consumer's own server.

Consumption progress means that while a consumer in a consumer group is consuming the messages in a queue, equeue records the consumption position (offset) so that it knows how far consumption has reached. The consumer can then continue from this position after restarting.

For example, if a topic has 4 queues and a consumer group has 4 consumers, each consumer is allocated one queue and consumes the messages in its own queue.

equeue records the consumption progress of each consumer on its queue separately, so that each consumer knows where to resume after a restart.

In fact, after the next restart the queue may no longer be consumed by the same consumer but by another consumer in the group. That does not matter, because the consumption position of the queue has already been recorded.

The consumption position has nothing to do with any particular consumer. It is purely an attribute of the queue, recording how far the queue has been consumed. Another important point is that a topic can be subscribed to by consumers in multiple consumer groups.

Even if consumers in different consumer groups consume the same queue under the same topic, their consumption progress is stored separately. In other words, consumption by different consumer groups is completely isolated, and the groups do not affect each other.
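The isolation of consumption progress by consumer group can be sketched with an in-memory store keyed by group, topic, and queue. This is only an illustration of the bookkeeping, not the broker's actual storage: the class `OffsetStore`, its methods, and the choice to return 0 for a position that was never committed are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class OffsetStore {
    // One offset per (consumer group, topic, queue); no consumer identity is involved.
    private final Map<String, Long> offsets = new ConcurrentHashMap<>();

    private static String key(String group, String topic, int queueId) {
        return group + "@" + topic + "@" + queueId;
    }

    // Records how far the given group has consumed the given queue.
    void commit(String group, String topic, int queueId, long offset) {
        offsets.put(key(group, topic, queueId), offset);
    }

    // Returns the recorded position, or 0 (assumed default) if none was committed.
    long read(String group, String topic, int queueId) {
        return offsets.getOrDefault(key(group, topic, queueId), 0L);
    }

    public static void main(String[] args) {
        OffsetStore store = new OffsetStore();
        store.commit("groupA", "orders", 0, 42L);
        // groupB consumes the same queue but has its own, independent progress.
        System.out.println("groupA: " + store.read("groupA", "orders", 0));
        System.out.println("groupB: " + store.read("groupB", "orders", 0));
    }
}
```

Because the key contains the group but not the consumer, any consumer in the group that takes over the queue can resume from the recorded position.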
