The interviewer asked me how to ensure RocketMQ messages are not lost. This time I laughed!

I recently read an article published by @JavaGuide, "The interviewer asked me how to ensure Kafka messages are not lost? I cried!" This article picks up that topic and discusses how to ensure RocketMQ does not lose messages.

0x00. Message transmission process

From production to consumption, a message goes through three stages:

  • Production stage: the Producer creates a message and delivers it over the network to the MQ Broker
  • Storage stage: the message is stored on disk on the Broker side
  • Consumption stage: the Consumer pulls the message from the Broker

A message can be lost at any of these stages. As long as we identify why messages are lost at each of the three stages and take reasonable measures against it, we can solve the message loss problem.

0x01. Production stage

The Producer sends a message to the Broker over the network. When the Broker receives it, it returns an acknowledgment to the Producer. So as long as the Producer receives that acknowledgment, the message has not been lost in the production stage.

The sample code for sending a message with RocketMQ is as follows:

DefaultMQProducer mqProducer = new DefaultMQProducer("test");
// Set the NameServer address
mqProducer.setNamesrvAddr("namesrvAddr");
mqProducer.start();
Message msg = new Message("test_topic" /* Topic */,
        "Hello World".getBytes(RemotingHelper.DEFAULT_CHARSET) /* Message body */
);
// Send the message to a Broker
try {
    SendResult sendResult = mqProducer.send(msg);
} catch (MQClientException e) {
    e.printStackTrace();
} catch (RemotingException e) {
    e.printStackTrace();
} catch (MQBrokerException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
}

The send method is a synchronous operation. As long as it does not throw an exception, the message has been sent successfully.

A successful send only means that the message has reached the Broker. Depending on the Broker's configuration, it may return different response statuses:

  • SendStatus.SEND_OK
  • SendStatus.FLUSH_DISK_TIMEOUT
  • SendStatus.FLUSH_SLAVE_TIMEOUT
  • SendStatus.SLAVE_NOT_AVAILABLE

The official documentation describes these statuses as follows:

[Figure: official descriptions of the send statuses]

The Broker configurations mentioned in the figure above are explained in detail below.

In addition, RocketMQ provides an asynchronous send mode, which suits business scenarios where the call chain is long and response time matters.

DefaultMQProducer mqProducer = new DefaultMQProducer("test");
// Set the NameServer address
mqProducer.setNamesrvAddr("127.0.0.1:9876");
mqProducer.setRetryTimesWhenSendFailed(5);
mqProducer.start();
Message msg = new Message("test_topic" /* Topic */,
        "Hello World".getBytes(RemotingHelper.DEFAULT_CHARSET) /* Message body */
);

try {
    // Send asynchronously; the calling thread is not blocked and returns immediately
    mqProducer.send(msg, new SendCallback() {
        @Override
        public void onSuccess(SendResult sendResult) {
            // The message was sent successfully
        }

        @Override
        public void onException(Throwable e) {
            // Sending failed; persist this message so it can be compensated later
        }
    });
} catch (MQClientException e) {
    e.printStackTrace();
} catch (RemotingException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
}

When sending asynchronously, be sure to override the callback methods and check the send result inside them.
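The "persist and compensate later" idea from the onException comment could look roughly like the sketch below. FailedSendStore and its methods are hypothetical names invented for this illustration, not part of RocketMQ, and the in-memory queue only stands in for whatever durable storage (a database table, a local file) a real system would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Predicate;

/** Hypothetical helper: remembers failed sends so a later job can resend them. */
class FailedSendStore {
    // In-memory stand-in; a real system would persist this durably.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    /** Intended to be called from SendCallback.onException with the message body. */
    void recordFailure(String messageBody) {
        pending.add(messageBody);
    }

    /**
     * Tries to resend everything recorded so far. The resend function returns true
     * on success; messages that fail again are put back for the next pass.
     * Returns the number of messages resent successfully.
     */
    int retryAll(Predicate<String> resend) {
        List<String> batch = new ArrayList<>();
        for (String m; (m = pending.poll()) != null; ) {
            batch.add(m);
        }
        int resent = 0;
        for (String m : batch) {
            boolean sent;
            try {
                sent = resend.test(m);
            } catch (RuntimeException e) {
                sent = false; // treat an exception like a failed send
            }
            if (sent) {
                resent++;
            } else {
                pending.add(m);
            }
        }
        return resent;
    }

    /** Number of messages still waiting for compensation. */
    int size() {
        return pending.size();
    }
}
```

A scheduled task would then call retryAll periodically, passing a function that wraps the real send call.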

Whether you send synchronously or asynchronously, network problems can cause a send to fail. To handle this, we can set a reasonable number of retries so that the producer retries automatically when a network problem occurs. The settings are as follows:

// Retry count for synchronous sends, default 2
mqProducer.setRetryTimesWhenSendFailed(3);
// Retry count for asynchronous sends, default 2
mqProducer.setRetryTimesWhenSendAsyncFailed(3);

0x02. Broker storage stage

By default, once a message reaches the Broker it is first saved to memory, and the Broker immediately returns an acknowledgment to the producer. The Broker then periodically flushes a batch of messages from memory to disk asynchronously.

This reduces the number of I/O operations and gives better performance, but if the machine loses power or crashes before the messages have been flushed to disk, those messages are lost.

To make sure the Broker does not lose messages, we need to change the persistence mechanism to synchronous flush: the Broker returns a response only after the message has been successfully written to disk.

The Broker-side configuration is modified as follows:

## The default is ASYNC_FLUSH
flushDiskType = SYNC_FLUSH

If the Broker does not finish flushing within the synchronous flush timeout (default 5 s), it returns the SendStatus.FLUSH_DISK_TIMEOUT status to the producer.
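The trade-off between the two flush modes can be illustrated with a tiny append-only log built on the JDK's FileChannel. This is only a sketch of the general idea: TinyLog is a made-up class, and it is not how RocketMQ implements its own storage.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative append-only log; not RocketMQ's storage implementation. */
class TinyLog implements AutoCloseable {
    private final FileChannel channel;
    private final boolean syncFlush;

    TinyLog(Path file, boolean syncFlush) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.syncFlush = syncFlush;
    }

    /** Appends one message and returns at the point where it would be acknowledged. */
    void append(String message) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((message + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf); // the data may still sit in the OS page cache
        }
        if (syncFlush) {
            channel.force(false); // block until the data has been forced to the storage device
        }
        // With syncFlush == false we acknowledge here and rely on a later flush() call.
    }

    /** What a periodic background flusher would call in the asynchronous mode. */
    void flush() throws IOException {
        channel.force(false);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

With syncFlush set to false, anything appended since the last flush() can disappear on a power failure even though the caller already got its acknowledgment, which is the loss window described above.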

Cluster deployment

To ensure availability, the Broker is usually deployed as one master with multiple slaves. To ensure messages are not lost, they also need to be replicated to the slave nodes.

By default, once a message has been written to the master successfully, the acknowledgment can be returned to the producer, and the message is then replicated to the slave nodes asynchronously.

Note: this assumes the master is configured with flushDiskType = SYNC_FLUSH.

If the master then suddenly goes down and cannot be recovered, the messages that have not yet been replicated to the slave are lost.

To further improve reliability, we can use synchronous replication: the master waits until the slave has finished replicating before it returns the acknowledgment.

The difference between asynchronous and synchronous replication is illustrated below:

[Figure: asynchronous replication vs. synchronous replication. Source: the Internet]

Note: do not be misled by the figure. A Broker master can be configured with only one replication mode; the figure is only meant to explain the concepts of synchronous and asynchronous replication.

The replication mode of the Broker master node is configured as follows:

## The default is ASYNC_MASTER
brokerRole=SYNC_MASTER

If the slave does not return a replication response within the specified time, the producer receives the SendStatus.FLUSH_SLAVE_TIMEOUT status.
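To make the loss window concrete, here is a toy in-memory model of one master and one slave. Everything in it (ReplicationDemo, its fields and methods) is hypothetical and exists only to contrast "acknowledge, then replicate" with "replicate, then acknowledge"; it does not reflect RocketMQ's actual replication protocol.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one master and one slave; all names are hypothetical. */
class ReplicationDemo {
    final List<String> master = new ArrayList<>();
    final List<String> slave = new ArrayList<>();
    private final boolean syncReplication;

    ReplicationDemo(boolean syncReplication) {
        this.syncReplication = syncReplication;
    }

    /** Stores a message on the master and returns when the producer would get its acknowledgment. */
    void send(String message) {
        master.add(message);
        if (syncReplication) {
            replicate(); // SYNC_MASTER-like: copy to the slave before acknowledging
        }
        // ASYNC_MASTER-like: acknowledge now; replicate() runs some time later
    }

    /** Copies everything the slave is missing; stands in for background replication. */
    void replicate() {
        for (int i = slave.size(); i < master.size(); i++) {
            slave.add(master.get(i));
        }
    }

    /** Simulates an unrecoverable master crash: only the slave's copy survives. */
    List<String> survivorsAfterMasterCrash() {
        return new ArrayList<>(slave);
    }

    public static void main(String[] args) {
        ReplicationDemo async = new ReplicationDemo(false);
        async.send("order-1");
        System.out.println("async: " + async.survivorsAfterMasterCrash()); // prints "async: []"

        ReplicationDemo sync = new ReplicationDemo(true);
        sync.send("order-1");
        System.out.println("sync: " + sync.survivorsAfterMasterCrash()); // prints "sync: [order-1]"
    }
}
```

In the asynchronous case the producer has already been told the send succeeded, yet the message does not survive the crash.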

Summary

Combining the production stage and the storage stage: if messages must strictly not be lost, the Broker needs the following configuration:

## Master node configuration
flushDiskType = SYNC_FLUSH
brokerRole=SYNC_MASTER

## Slave node configuration
brokerRole=SLAVE
flushDiskType = SYNC_FLUSH

The producer also has to cooperate by checking whether the returned status is SendStatus.SEND_OK. For any other status, we need to consider compensation and retry.

This configuration improves message reliability but reduces performance, so in production you need to weigh both when choosing.
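The producer-side status check could be sketched as follows. To keep the example runnable without the RocketMQ client on the classpath, SendStatus here is a local enum that only mirrors the four status names listed earlier, and Sender is a hypothetical abstraction that in real code would wrap mqProducer.send(msg) and return sendResult.getSendStatus().

```java
/** Local mirror of RocketMQ's SendStatus names so this sketch compiles on its own. */
enum SendStatus { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }

class ReliableSend {
    /** Hypothetical sender abstraction standing in for the real producer call. */
    interface Sender {
        SendStatus send(String body) throws Exception;
    }

    /**
     * Sends until the Broker reports SEND_OK or maxAttempts is used up.
     * Returns true only for SEND_OK; on false the caller should persist the
     * message for later compensation.
     */
    static boolean sendReliably(Sender sender, String body, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (sender.send(body) == SendStatus.SEND_OK) {
                    return true;
                }
                // Any other status: the Broker has the message, but it was not
                // flushed or replicated as configured, so try again.
            } catch (Exception e) {
                if (e instanceof InterruptedException) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop
                    return false;
                }
                // Network or Broker error: fall through and retry.
            }
        }
        return false;
    }
}
```

Resending after a non-SEND_OK status can deliver the same message more than once, so this approach assumes the consumer can tolerate duplicates.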

0x03. Consumption stage

The consumer pulls messages from the Broker and then executes the corresponding business logic. Once that succeeds, it returns the ConsumeConcurrentlyStatus.CONSUME_SUCCESS status to the Broker.

If the Broker does not receive the consumption acknowledgment, or receives any other status, the consumer will pull the same message again next time and retry. This effectively prevents messages from being lost when the consumer hits an exception during processing or when a message is lost in network transmission.

The consumer code is as follows:

// Instantiate the consumer
DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("test_consumer");

// Set the NameServer address
consumer.setNamesrvAddr("namesrvAddr");

// Subscribe to one or more Topics, with a Tag expression to filter the messages to consume
consumer.subscribe("test_topic", "*");
// Register a callback to process the messages pulled from the Broker
consumer.registerMessageListener(new MessageListenerConcurrently() {
    @Override
    public ConsumeConcurrentlyStatus consumeMessage(List<MessageExt> msgs, ConsumeConcurrentlyContext context) {
        // Execute the business logic
        // Mark the message as successfully consumed
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }
});
// Start the consumer instance
consumer.start();

In the consumption code above, we need to pay attention to the returned status. Only when the business logic has really executed successfully may we return ConsumeConcurrentlyStatus.CONSUME_SUCCESS. Otherwise, we need to return ConsumeConcurrentlyStatus.RECONSUME_LATER so that the message is retried later.
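One way to follow this rule is to wrap the business logic so that any failure turns into RECONSUME_LATER. The sketch below is standalone: ConsumeConcurrentlyStatus is a local enum that only mirrors the two status names used in this article, and SafeListener with its String messages is a hypothetical simplification of the real listener, which receives MessageExt objects.

```java
import java.util.List;
import java.util.function.Consumer;

/** Local mirror of the two RocketMQ status names so the sketch runs standalone. */
enum ConsumeConcurrentlyStatus { CONSUME_SUCCESS, RECONSUME_LATER }

class SafeListener {
    /**
     * Runs the business logic for every message in the batch and reports success
     * only if all of them were processed without an exception.
     */
    static ConsumeConcurrentlyStatus consume(List<String> msgs, Consumer<String> businessLogic) {
        try {
            for (String msg : msgs) {
                businessLogic.accept(msg);
            }
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        } catch (RuntimeException e) {
            // Do not swallow the failure and report success: ask for redelivery instead.
            return ConsumeConcurrentlyStatus.RECONSUME_LATER;
        }
    }
}
```

Because a failure in the middle of a batch causes the whole batch to be redelivered, messages processed before the failure will be seen again.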

0x04. Summary

Now that you have seen how RocketMQ avoids losing messages, look back at the Kafka article mentioned earlier. Have you noticed that the two take the same approach, and that only the configuration parameters differ?

So the next time an interviewer asks how some XX message queue ensures that messages are not lost, do not cry even if you have never used that message queue. Smile, calmly analyze which steps can lose messages, and outline the solutions.

Finally, we can also share a further thought: improving message reliability in this way may lead to messages being resent and consumed more than once. So the consumer side needs to guarantee idempotency.
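As a rough illustration of what idempotent consumption can mean, the sketch below skips messages whose business key has already been handled. IdempotentHandler is a hypothetical class, the business key is assumed to be some unique ID carried by each message, and the in-memory set only stands in for a durable store such as a unique database key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical deduplicating wrapper keyed by a business ID carried in each message. */
class IdempotentHandler {
    // In-memory for illustration; a real system would need durable, shared storage.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    /** Runs the action at most once per key; returns false if the key was already handled. */
    boolean handleOnce(String businessKey, Runnable action) {
        if (!processed.add(businessKey)) {
            return false; // duplicate delivery: skip
        }
        try {
            action.run();
            return true;
        } catch (RuntimeException e) {
            processed.remove(businessKey); // let a redelivery try again
            throw e;
        }
    }
}
```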

But be careful: the interviewer may then follow up with "let's talk about how you guarantee idempotency", so make sure you are prepared before you bring it up.

What? You don't know how to implement idempotency? Then follow @Interpreter program right away; we will cover idempotency in a later article.

One last word (please follow)

My knowledge is limited, so there are bound to be flaws. If you find a mistake, please leave a comment pointing it out and I will correct it.

Thanks again for reading. I am "downstairs small Heige", a tool ape who has not gone bald yet. See you in the next article ~

You are welcome to follow my public account, Interpreter program, for daily practical content. If you are interested in my topics, you can also follow my blog: studyidea.cn

Origin www.cnblogs.com/goodAndyxublog/p/12563813.html