RabbitMQ reliability: guaranteeing reliable message delivery

Author: Zen and the Art of Computer Programming

1. Introduction

RabbitMQ is an open source, general-purpose message broker that implements the AMQP protocol. AMQP originated in the financial industry, and RabbitMQ is used to store, forward and exchange data in distributed systems. Over time it has spread to many other applications and fields, and many companies now use it for internal messaging and communication between services, so understanding RabbitMQ is increasingly important. Drawing on the author's own work experience, this article explains the message reliability mechanisms in RabbitMQ.

2. Basic concepts and terms

2.1 Message model

First, we need a basic understanding of the message model in RabbitMQ. There are four types of entities in RabbitMQ: producers, exchanges, queues and consumers.

producer

Producers are applications that send messages to RabbitMQ through the client API. In AMQP 0-9-1 a producer always publishes to an exchange; publishing "directly to a queue" goes through the default exchange, which routes by queue name. The producer chooses the exchange and the routing key, and can set message properties such as priority, expiration (per-message TTL) and delivery mode. If a message is published with the mandatory flag and cannot be routed to any queue, the broker returns it to the producer, which can then retry. Alternatively, an alternate exchange can be configured to collect unroutable messages.

exchange

An exchange in RabbitMQ plays a role similar to a network switch. It does not store messages; it receives them from producers and passes them to the matching queues according to routing rules. Each queue is bound to an exchange with a binding key. When a producer publishes a message to the exchange, the exchange compares the routing key of the message with the binding keys and delivers the message to the queues that match.
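
The matching rule of a direct exchange can be sketched in plain Java. This is only an in-memory model for illustration: the class DirectExchangeSketch and the queue and key names are hypothetical and are not part of the RabbitMQ client API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a direct exchange; not a RabbitMQ API.
class DirectExchangeSketch {
    // binding key -> names of the queues bound with that key
    private final Map<String, List<String>> bindings = new LinkedHashMap<>();

    // Bind a queue to this exchange under the given binding key.
    void bind(String queueName, String bindingKey) {
        bindings.computeIfAbsent(bindingKey, k -> new ArrayList<>()).add(queueName);
    }

    // Return the queues that would receive a message published with routingKey.
    // A direct exchange matches the routing key against binding keys exactly.
    List<String> route(String routingKey) {
        return new ArrayList<>(bindings.getOrDefault(routingKey, List.of()));
    }

    public static void main(String[] args) {
        DirectExchangeSketch exchange = new DirectExchangeSketch();
        exchange.bind("orders", "order.created");
        exchange.bind("audit", "order.created");
        exchange.bind("billing", "invoice.paid");
        // Both queues bound with "order.created" receive the message.
        System.out.println(exchange.route("order.created"));
        // No binding matches, so the message is unroutable.
        System.out.println(exchange.route("unknown.key"));
    }
}
```

An empty result corresponds to an unroutable message, which a real broker drops unless the mandatory flag or an alternate exchange is used.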

queue

A queue is a container that holds messages until consumers retrieve them. A broker can host many queues, and an application typically uses different queues for different kinds of messages, for example one queue for on-demand messages, one for live broadcast messages and one for private messages. A single queue can also carry several kinds of messages, such as both on-demand and live broadcast messages. Since this article only discusses message reliability, the kind of message a queue carries does not matter here.

consumer

A consumer is an application that obtains messages from RabbitMQ and processes them. When there are messages in the queue, consumers can receive and process them. A consumer can set QoS (Quality of Service) parameters to control how fast RabbitMQ delivers messages to it. AMQP 0-9-1 defines two such parameters: prefetch count and prefetch size. Prefetch count is the maximum number of unacknowledged messages the broker will deliver to the consumer at one time; raising it can improve throughput. Prefetch size is a limit in bytes on the unacknowledged content in flight, not on each message; RabbitMQ only supports prefetch count.

2.2 Reliability

Message reliability is one of the most important features provided by RabbitMQ. Without a reliable delivery mechanism, there is no guarantee that a message reaches the target exchange or queue, and even a message that is correctly enqueued is not guaranteed to be received and processed by a consumer. RabbitMQ therefore provides several mechanisms to make message transmission reliable. The main ones are introduced below.

2.2.1 Transactions

RabbitMQ supports transactions on a channel, that is, a series of actions is either performed completely or not at all. If the transaction commits successfully, all messages published in it are accepted; if it is rolled back, none of them are. Transactions therefore give an all-or-nothing guarantee for publishing, but they considerably reduce throughput, which is why publisher confirms are usually preferred.

// Put the channel into transactional mode
channel.txSelect();
try {
    // Publish the message through the default exchange.
    // queueName is assumed to be a queue declared elsewhere; message is a byte[].
    channel.basicPublish("", queueName, null, message);
    // Commit the transaction; txCommit() throws IOException on failure
    channel.txCommit();
    System.out.println("Transaction committed.");
} catch (IOException e) {
    try {
        // Roll back so that none of the messages published in the transaction are enqueued
        channel.txRollback();
        System.out.println("Transaction rolled back.");
    } catch (IOException ex) {
        ex.printStackTrace();
    }
} finally {
    // Close the channel and the connection
    channel.close();
    connection.close();
}

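2.2.2 Publisher Confirms

Publisher confirms are a reliability mechanism between the producer and the broker. Once confirms are enabled on a channel, the broker acknowledges (acks) each published message after it has taken responsibility for it, or negatively acknowledges (nacks) it if it could not. A confirm says nothing about whether a consumer has received the message. With confirms, the producer knows which messages may have been lost and can publish them again.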

// Enable publisher confirms on this channel
channel.confirmSelect();
// Register the listener before publishing so that no confirm is missed
channel.addConfirmListener(new PublishConfirmListener());
// Sequence number the broker will use when it confirms the next published message
long seqNo = channel.getNextPublishSeqNo();
// queueName is assumed to be a queue declared elsewhere; basicPublish returns void
channel.basicPublish("", queueName, null, message);
...
class PublishConfirmListener implements ConfirmListener {
    @Override
    public void handleAck(long deliveryTag, boolean multiple) throws IOException {
        // Called when the broker has accepted the message(s).
        // If multiple is true, all sequence numbers up to and including deliveryTag are confirmed.
    }

    @Override
    public void handleNack(long deliveryTag, boolean multiple) throws IOException {
        // Called when the broker could not handle the message(s); publish them again
        doRetry(deliveryTag);
    }

    private void doRetry(long deliveryTag) {
        // Republish the nacked message(s) identified by deliveryTag
       ...
    }
}
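
To retry on a nack, the producer has to remember which messages are still unconfirmed. One common approach is a sorted map keyed by publish sequence number, so that a confirm with multiple set to true can clear a whole range at once. The sketch below uses only the Java standard library; OutstandingConfirms and its method names are hypothetical helpers, not part of the RabbitMQ client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical helper that tracks messages awaiting a publisher confirm.
class OutstandingConfirms {
    // publish sequence number -> message body still awaiting a confirm
    private final ConcurrentNavigableMap<Long, String> pending = new ConcurrentSkipListMap<>();

    // Record a message just before it is published.
    void track(long seqNo, String body) {
        pending.put(seqNo, body);
    }

    // Broker acked: forget the message, or every message up to seqNo if multiple is set.
    void onAck(long seqNo, boolean multiple) {
        if (multiple) {
            pending.headMap(seqNo, true).clear();
        } else {
            pending.remove(seqNo);
        }
    }

    // Broker nacked: remove and return the affected messages so the caller can republish them.
    List<String> onNack(long seqNo, boolean multiple) {
        List<String> toRetry = new ArrayList<>();
        if (multiple) {
            ConcurrentNavigableMap<Long, String> nacked = pending.headMap(seqNo, true);
            toRetry.addAll(nacked.values());
            nacked.clear();
        } else {
            String body = pending.remove(seqNo);
            if (body != null) {
                toRetry.add(body);
            }
        }
        return toRetry;
    }

    // Number of messages that have not been confirmed yet.
    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        OutstandingConfirms confirms = new OutstandingConfirms();
        for (long seq = 1; seq <= 4; seq++) {
            confirms.track(seq, "m" + seq);
        }
        confirms.onAck(2, true);                          // confirms m1 and m2 together
        System.out.println(confirms.onNack(3, false));    // messages to republish
        System.out.println(confirms.size());              // messages still pending
    }
}
```

In a real producer, track would be called with the value of channel.getNextPublishSeqNo() before each publish, and onAck and onNack would be called from the ConfirmListener callbacks.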

2.2.3 Persistence

RabbitMQ supports message persistence. A message published as persistent (delivery mode 2) to a durable queue is written to disk rather than held only in memory. If the broker crashes or restarts, such messages are recovered from disk. Persistence is therefore a prerequisite for reliable delivery, and it is normally combined with publisher confirms so that the producer knows when a message has actually been stored.

2.2.4 Message persistence tracking

When a message is persisted, RabbitMQ also records metadata about it, such as whether the message is persistent and whether it has been delivered and acknowledged. This lets the broker track the status of each message and know when it can be safely deleted.

2.2.5 Dead letter queue

When a message is dropped from a queue, for example because the queue has exceeded its length limit, the message's TTL (Time To Live) has expired, or a consumer has rejected it without requeueing, RabbitMQ can republish it to a dead letter exchange instead of discarding it. The queues bound to that exchange are commonly called dead letter queues. Dead-lettering is enabled per queue, by setting a dead letter exchange in the queue's arguments or in a policy.
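
As a rough sketch, the queue arguments that enable dead-lettering can be built as a plain map. The argument keys x-dead-letter-exchange, x-dead-letter-routing-key and x-message-ttl are RabbitMQ's own; the class DeadLetterArgs and the exchange and routing key names are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that builds queue arguments for dead-lettering.
class DeadLetterArgs {
    // Messages that expire after ttlMillis, are rejected without requeue, or overflow
    // the queue are republished to deadLetterExchange with deadLetterRoutingKey.
    static Map<String, Object> build(String deadLetterExchange, String deadLetterRoutingKey, int ttlMillis) {
        Map<String, Object> arguments = new HashMap<>();
        arguments.put("x-dead-letter-exchange", deadLetterExchange);
        arguments.put("x-dead-letter-routing-key", deadLetterRoutingKey);
        arguments.put("x-message-ttl", ttlMillis);
        return arguments;
    }

    public static void main(String[] args) {
        // "orders.dlx" and "orders.dead" are illustrative names.
        Map<String, Object> arguments = build("orders.dlx", "orders.dead", 60_000);
        // With the Java client, this map would be the last parameter of
        // channel.queueDeclare(queue, durable, exclusive, autoDelete, arguments).
        System.out.println(new TreeMap<>(arguments));
    }
}
```

The dead letter exchange and the queue bound to it still have to be declared separately, like any other exchange and queue.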

2.2.6 Flow control

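Flow control means limiting the number of messages in flight according to the consumer's capacity to process them. In RabbitMQ the consumer sets a prefetch count, which caps how many unacknowledged messages the broker will deliver to it. It is a limit on outstanding messages, not on messages per second: the broker sends more only as acknowledgements arrive, so the delivery rate follows the consumer's processing rate.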
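
Prefetch count is best read as a cap on unacknowledged deliveries rather than a rate. The following in-memory sketch models that window; PrefetchWindow is a hypothetical class for illustration and does not reflect how the broker is implemented.

```java
// Hypothetical model of a consumer's prefetch window.
class PrefetchWindow {
    private final int prefetchCount;
    private int unacked = 0;

    PrefetchWindow(int prefetchCount) {
        this.prefetchCount = prefetchCount;
    }

    // Try to deliver one message; refused while the consumer already
    // holds prefetchCount unacknowledged messages.
    boolean tryDeliver() {
        if (unacked >= prefetchCount) {
            return false;
        }
        unacked++;
        return true;
    }

    // The consumer acknowledged one message, which frees one slot.
    void ack() {
        if (unacked > 0) {
            unacked--;
        }
    }

    // Number of delivered but not yet acknowledged messages.
    int inFlight() {
        return unacked;
    }

    public static void main(String[] args) {
        PrefetchWindow window = new PrefetchWindow(2);
        System.out.println(window.tryDeliver()); // first delivery fits in the window
        System.out.println(window.tryDeliver()); // second delivery fills the window
        System.out.println(window.tryDeliver()); // refused until something is acked
        window.ack();
        System.out.println(window.tryDeliver()); // a slot is free again
    }
}
```

With the Java client, the corresponding setting is channel.basicQos(prefetchCount), used together with manual acknowledgements.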

2.2.7 Replicated queues

To protect messages against node failure, RabbitMQ can replicate a queue across several nodes of a cluster. All replicas hold the same messages, but only one of them is elected as the leader that serves publishers and consumers. When the node hosting the leader fails, one of the remaining replicas is promoted to leader and continues with the messages it already holds. This reduces the possibility of message loss. Older releases implemented this with classic mirrored queues; newer releases provide quorum queues for the same purpose.

3. Core principles, operating steps and formulas

3.1 Multi-level flow control

RabbitMQ applies flow control at several levels to limit the rate at which messages flow in. Queues are not priority queues by default: a queue becomes one only when it is declared with the x-max-priority argument, after which higher-priority messages are delivered before lower-priority ones. The rate at which a queue is drained is bounded by the number of consumers attached to it and by how quickly each consumer processes and acknowledges messages. As a simple model, if a queue has k consumers and each consumer handles one message at a time with an average processing latency of d seconds, the queue's consumption rate is at most about k / d messages per second.

RabbitMQ does not assign weights to consumers. Within a queue it dispatches messages round-robin to the consumers that still have free prefetch capacity. A consumer that processes slowly also acknowledges slowly, keeps its prefetch window full for longer, and therefore receives fewer messages, while a fast consumer frees capacity sooner and receives more. The effective share of each consumer thus adapts to its observed processing latency without any explicit weight.

For example, assume that a queue has 10 consumers, each with a prefetch count of 1. If each consumer takes 0.5 s per message on average, the queue is drained at about 10 / 0.5 = 20 msg/s. If the latency rises to 1 s, the rate drops to about 10 / 1 = 10 msg/s. With 100 consumers at 0.5 s per message, the rate is about 100 / 0.5 = 200 msg/s.

If a consumer slows down and reaches its prefetch limit of unacknowledged messages, RabbitMQ stops delivering to it until it acknowledges some of them. In the meantime the remaining messages go to the other consumers or wait in the queue.

In addition, a message backlog can be kept in check from two sides: the publisher can limit its own publishing rate, and the broker can apply backpressure.

A publish rate limit is the number of messages a publisher is allowed to publish per second, for example 100 messages per second. RabbitMQ has no broker-side setting for this, so the publisher has to throttle itself in application code. Backpressure is applied by the broker to avoid being overloaded: when memory use crosses the configured high watermark (vm_memory_high_watermark) or free disk space falls below disk_free_limit, RabbitMQ blocks the connections that publish until the alarm clears. A queue can also be given a maximum length (x-max-length), beyond which messages are dropped, dead-lettered or rejected depending on the overflow setting.
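
A publisher can also cap its own rate in application code. The sketch below is one minimal way to do that, assuming a fixed minimum interval between publishes; PublishThrottle is a hypothetical class, and the caller is assumed to pass a non-negative elapsed time, for example System.nanoTime() minus a start value.

```java
// Hypothetical client-side throttle: at most maxPerSecond publishes per second,
// enforced as a minimum interval between two publishes.
class PublishThrottle {
    private final long intervalNanos;
    private long nextAllowedNanos = 0;

    PublishThrottle(int maxPerSecond) {
        this.intervalNanos = 1_000_000_000L / maxPerSecond;
    }

    // elapsedNanos is the non-negative time since some fixed starting point.
    // Returns true if a publish is allowed now, and reserves the slot.
    boolean tryAcquire(long elapsedNanos) {
        if (elapsedNanos < nextAllowedNanos) {
            return false;
        }
        nextAllowedNanos = elapsedNanos + intervalNanos;
        return true;
    }

    public static void main(String[] args) {
        PublishThrottle throttle = new PublishThrottle(100);
        int allowed = 0;
        // Simulate one publish attempt per millisecond for one second.
        for (long ms = 0; ms < 1000; ms++) {
            if (throttle.tryAcquire(ms * 1_000_000L)) {
                allowed++; // a real publisher would call channel.basicPublish here
            }
        }
        System.out.println("allowed publishes: " + allowed);
    }
}
```

A fixed interval is the simplest choice; a token bucket would allow short bursts while keeping the same average rate.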

3.2 Meet persistent message requirements

RabbitMQ uses the AMQP protocol for messaging, with TCP/IP as the transport layer, and its server-side logic is implemented in Erlang. Erlang is a functional programming language whose code is compiled to bytecode and runs on a virtual machine (BEAM).

To make messages persistent, RabbitMQ writes them to disk; with replicated queues, copies are additionally kept on several nodes. Persistent messages are appended sequentially to a message store on disk, and each message is identified by a unique number. After RabbitMQ starts, it reads the store and rebuilds each durable queue with its messages in their original order. When a consumer acknowledges a message, RabbitMQ marks it for deletion. A consumer that subscribes to the queue later still receives the messages that remain, in order.

If a consumer goes down after receiving a message but before acknowledging it, the message remains in the store and RabbitMQ requeues it for delivery to another consumer. Only after the message has been acknowledged does RabbitMQ remove it, and the files that held it are reclaimed from disk once they are no longer needed. When the message backlog is large, disk usage can be significant, so free disk space should be monitored.
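
The life cycle of a message under manual acknowledgement can be modelled in a few lines: a delivered message stays tracked until it is acknowledged, and goes back to the queue if the consumer fails first. This is an in-memory sketch only; AckTrackingQueue and its methods are hypothetical and say nothing about RabbitMQ's on-disk format.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a queue with manual acknowledgements.
class AckTrackingQueue {
    private final Deque<String> ready = new ArrayDeque<>();
    // delivery tag -> message delivered but not yet acknowledged
    private final Map<Long, String> unacked = new LinkedHashMap<>();
    private long nextTag = 1;

    void publish(String body) {
        ready.addLast(body);
    }

    // Deliver the next ready message; returns its delivery tag, or -1 if the queue is empty.
    long deliver() {
        String body = ready.pollFirst();
        if (body == null) {
            return -1;
        }
        long tag = nextTag++;
        unacked.put(tag, body);
        return tag;
    }

    // An acknowledged message is removed for good.
    void ack(long tag) {
        unacked.remove(tag);
    }

    // The consumer failed: put its unacknowledged messages back at the
    // front of the queue, preserving their original order.
    void requeueUnacked() {
        List<String> bodies = new ArrayList<>(unacked.values());
        for (int i = bodies.size() - 1; i >= 0; i--) {
            ready.addFirst(bodies.get(i));
        }
        unacked.clear();
    }

    int readyCount() {
        return ready.size();
    }

    int unackedCount() {
        return unacked.size();
    }

    public static void main(String[] args) {
        AckTrackingQueue queue = new AckTrackingQueue();
        queue.publish("a");
        queue.publish("b");
        long first = queue.deliver();
        queue.deliver();
        queue.ack(first);          // "a" is done and removed
        queue.requeueUnacked();    // consumer crashed while holding "b"
        System.out.println("ready: " + queue.readyCount() + ", unacked: " + queue.unackedCount());
    }
}
```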

Origin blog.csdn.net/universsky2015/article/details/133502538