RocketMQ: message consumption, progress management, and accumulation analysis

Recently our ONS instance showed serious message accumulation, and consumers that had almost nothing to consume were often reported as accumulating as well. This made it necessary to dig into RocketMQ's design, see how the accumulation amount is calculated, and learn how to use Topic, Consumer, and the other components correctly.

The background to the problem: because we did not understand RocketMQ well enough at the beginning (and could not be bothered to dig in), all of our businesses shared a single topic, with each line of business subscribing by tag. After this deep dive we will refactor the business to use RocketMQ correctly.

This investigation covers two questions:
1. In the pull model, will messages the consumer did not subscribe to also be pulled to the client?
2. How is the accumulation amount calculated?

Question 1 essentially asks whether message filtering happens on the client or on the server. Question 2 essentially asks how messages are stored and counted. Answering both requires understanding RocketMQ's underlying storage model and taking a top-down look at the whole message queue design.

The underlying storage model

(Figure: underlying storage model, taken from "RocketMQ Technology Insider")

The commitlog is the core storage file of the entire message queue. A consumequeue is a logical message queue; each of its entries stores a commitlog offset, the message length, and the tag's hashcode, so that the message's location in the commitlog file can be found quickly for consumption. The IndexFile, commonly called the index file, mainly stores the hashcode of the message key together with the commitlog offset, so that a message can be located in the commitlog quickly by key.
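To make the entry layout concrete, here is a minimal sketch (not RocketMQ source code) that encodes and decodes one consumequeue entry as described above: an 8-byte commitlog offset, a 4-byte message size, and an 8-byte tag hashcode, 20 bytes in total.

```java
import java.nio.ByteBuffer;

public class CqEntrySketch {
    static final int CQ_STORE_UNIT_SIZE = 20; // fixed entry size in bytes

    // Pack one entry the way the consumequeue stores it.
    static ByteBuffer encode(long commitLogOffset, int size, long tagsCode) {
        ByteBuffer buf = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE);
        buf.putLong(commitLogOffset); // where the full message lives in the commitlog
        buf.putInt(size);             // message length in bytes
        buf.putLong(tagsCode);        // hashcode of the tag, used for broker-side filtering
        buf.flip();                   // prepare the buffer for reading
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer entry = encode(1024L, 256, "TagA".hashCode());
        System.out.println("commitlog offset = " + entry.getLong());
        System.out.println("size = " + entry.getInt());
        System.out.println("tagsCode = " + entry.getLong());
    }
}
```

Reading the three fields back in the same order (long, int, long) mirrors how the broker later scans the consumequeue buffer.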

Message pull model

Before answering question 1, first think about what happens when a message is delivered to the queue:

DefaultMQProducer producer = new DefaultMQProducer("please_rename_unique_group_name");
// Specify name server addresses.
producer.setNamesrvAddr("localhost:9876");
//Launch the instance.
producer.start();
for (int i = 0; i < 100; i++) {
    //Create a message instance, specifying topic, tag and message body.
    Message msg = new Message("TopicTest" /* Topic */,
        "TagA" /* Tag */,
        ("Hello RocketMQ " +
            i).getBytes(RemotingHelper.DEFAULT_CHARSET) /* Message body */
    );
    //Call send message to deliver message to one of brokers.
    SendResult sendResult = producer.send(msg);
    System.out.printf("%s%n", sendResult);
}

The code above is copied from the official website. Although basic, it is enough to show what message delivery requires: a namesrvAddr, a topic, and a tag.

Message delivery

// DefaultMQProducerImpl#sendDefaultImpl()
// Most of the code is omitted; focus on the comments.
TopicPublishInfo topicPublishInfo = this.tryToFindTopicPublishInfo(msg.getTopic()); // read topic info from the local cache or remotely from the namesrv
        if (topicPublishInfo != null && topicPublishInfo.ok()) {
            boolean callTimeout = false;
            MessageQueue mq = null;
            Exception exception = null;
            SendResult sendResult = null;
            int timesTotal = communicationMode == CommunicationMode.SYNC ? 1 + this.defaultMQProducer.getRetryTimesWhenSendFailed() : 1;
            int times = 0;
            String[] brokersSent = new String[timesTotal];
            for (; times < timesTotal; times++) {
                String lastBrokerName = null == mq ? null : mq.getBrokerName();
                // select one logical message queue according to some strategy
                MessageQueue mqSelected = this.selectOneMessageQueue(topicPublishInfo, lastBrokerName);

From the code above we can see that during delivery the client selects one of the topic's logical queues according to a policy; a logical queue corresponds to a consumequeue file. On the server side the write path is mainly the SendMessageProcessor class, which finally writes the data through DefaultMessageStore. You will not see the consumequeue being written there, because consumequeue files are written by a separate thread; see ReputMessageService for that implementation. We will not go deeper here.

The key takeaway: besides uploading the message's basic properties, the client also chooses the logical message queue to be written.

Pulling messages

The client side of message pulling will not be repeated here; if interested, see PullMessageService#run(). We focus on the server side: PullMessageProcessor#processRequest().


MessageFilter messageFilter;
if (this.brokerController.getBrokerConfig().isFilterSupportRetry()) {
    messageFilter = new ExpressionForRetryMessageFilter(subscriptionData, consumerFilterData,
        this.brokerController.getConsumerFilterManager());
} else {
    // build the message filter
    messageFilter = new ExpressionMessageFilter(subscriptionData, consumerFilterData,
        this.brokerController.getConsumerFilterManager());
}

// The core filtering logic is in ExpressionMessageFilter#isMatchedByConsumeQueue
@Override
public boolean isMatchedByConsumeQueue(Long tagsCode, ConsumeQueueExt.CqExtUnit cqExtUnit) {
    if (null == subscriptionData) {
        return true;
    }

    if (subscriptionData.isClassFilterMode()) {
        return true;
    }

    // by tags code.
    if (ExpressionType.isTagType(subscriptionData.getExpressionType())) {

        if (tagsCode == null) {
            return true;
        }

        if (subscriptionData.getSubString().equals(SubscriptionData.SUB_ALL)) {
            return true;
        }
        // tagsCode is simply the tag's hashcode
        return subscriptionData.getCodeSet().contains(tagsCode.intValue());
    }
  /// ....
}
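The tag check above only compares hashcodes stored in the consumequeue against the subscription's hashcode set. A minimal sketch (hypothetical, not RocketMQ source) of that comparison, which also shows why the broker can only do an approximate match: two different tag strings could in principle collide on the same hashcode, so the client re-verifies the actual tag string after receiving the message.

```java
import java.util.HashSet;
import java.util.Set;

public class TagFilterSketch {
    // Same idea as subscriptionData.getCodeSet().contains(tagsCode.intValue())
    static boolean matchedByConsumeQueue(long tagsCode, Set<Integer> subscribedCodes) {
        return subscribedCodes.contains((int) tagsCode);
    }

    public static void main(String[] args) {
        Set<Integer> codes = new HashSet<>();
        codes.add("TagA".hashCode()); // the subscription "TagA" stored as a hashcode

        System.out.println(matchedByConsumeQueue("TagA".hashCode(), codes)); // true
        System.out.println(matchedByConsumeQueue("TagB".hashCode(), codes)); // false
    }
}
```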


// Continue reading PullMessageProcessor#processRequest()
final GetMessageResult getMessageResult =
            this.brokerController.getMessageStore().getMessage(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                requestHeader.getQueueId(), requestHeader.getQueueOffset(), requestHeader.getMaxMsgNums(), messageFilter);
// Note the parameters of this read: topic, queueId, queueOffset, and the max number of messages

// Continue into DefaultMessageStore#getMessage()
// Note: the offset here is a queue offset, not a commitlog offset
public GetMessageResult getMessage(final String group, final String topic, final int queueId, final long offset,
    
    // ...
    // look up the consumequeue
    ConsumeQueue consumeQueue = findConsumeQueue(topic, queueId);
    
    // 
    SelectMappedBufferResult bufferConsumeQueue = consumeQueue.getIndexBuffer(offset);
    if (bufferConsumeQueue != null) {
        try {
            status = GetMessageStatus.NO_MATCHED_MESSAGE;

            long nextPhyFileStartOffset = Long.MIN_VALUE;
            long maxPhyOffsetPulling = 0;

            int i = 0;
            final int maxFilterMessageCount = Math.max(16000, maxMsgNums * ConsumeQueue.CQ_STORE_UNIT_SIZE);
            final boolean diskFallRecorded = this.messageStoreConfig.isDiskFallRecorded();
            ConsumeQueueExt.CqExtUnit cqExtUnit = new ConsumeQueueExt.CqExtUnit();
            for (; i < bufferConsumeQueue.getSize() && i < maxFilterMessageCount; i += ConsumeQueue.CQ_STORE_UNIT_SIZE) {
                long offsetPy = bufferConsumeQueue.getByteBuffer().getLong();
                int sizePy = bufferConsumeQueue.getByteBuffer().getInt();
                long tagsCode = bufferConsumeQueue.getByteBuffer().getLong();
                                /// .....
                // message matching; this object is the MessageFilter built earlier
                if (messageFilter != null
                    && !messageFilter.isMatchedByConsumeQueue(isTagsCodeLegal ? tagsCode : null, extRet ? cqExtUnit : null)) {
                    if (getResult.getBufferTotalSize() == 0) {
                        status = GetMessageStatus.NO_MATCHED_MESSAGE;
                    }

                    continue; // non-matching messages are skipped and reading continues
                }
 
                SelectMappedBufferResult selectResult = this.commitLog.getMessage(offsetPy, sizePy); // use offsetPy and sizePy to locate the message content stored in the commitlog
                
  ///....
}

Having read the source above, question 1 is settled: the concern was unfounded, because messages are indeed filtered on the server. But the complete source also makes something else clear: not every pull returns the messages the consumer wants. A consumer may pull from a consumequeue and get nothing, because the queue is filled with messages of the same topic but other tags. This means many pulls are wasted, and since Aliyun ONS bills per pull request, those empty pulls still cost money.

Message accumulation

Message accumulation is measured against the consumption progress that the server maintains for each consumer.

First look at the formula behind the console view: brokerOffset - consumerOffset = diffTotal, where diffTotal is the accumulation amount, expressed as a number of messages.
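The formula brokerOffset - consumerOffset = diffTotal can be sketched in a few lines. This is an illustrative toy, not console code; it just sums the per-queue backlog across a topic's queues, with both offsets being logical message counts rather than byte positions.

```java
public class DiffTotalSketch {
    // diffTotal = sum over queues of (brokerOffset - consumerOffset)
    static long diffTotal(long[] brokerOffsets, long[] consumerOffsets) {
        long total = 0;
        for (int i = 0; i < brokerOffsets.length; i++) {
            total += brokerOffsets[i] - consumerOffsets[i]; // backlog of queue i
        }
        return total;
    }

    public static void main(String[] args) {
        long[] broker = {100, 200, 300, 400};   // max logical offset per queue
        long[] consumer = {100, 150, 300, 350}; // committed consume offset per queue
        System.out.println(diffTotal(broker, consumer)); // prints 100
    }
}
```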

(Figure: accumulation view from the console, showing brokerOffset, consumerOffset, and diffTotal)

Viewed from the commitlog alone this is hard to compute: the commitlog stores a large volume of messages from all topics, and consumers do not consume them sequentially, so a consumer's accumulation cannot be read off the commitlog.

So where can the number of messages be described clearly? First, look into the design of the ConsumeQueue.

ConsumerQueue

The consumequeue is designed as a logical partition of a topic: a topic is divided into multiple message queues, the number of which is determined by a broker configuration parameter. Queues are numbered from zero; for example, a topic configured with four queues has queues 0, 1, 2, 3.

For the configuration parameter, see BrokerConfig, which contains private int defaultTopicQueueNums = 8;

Semantically, consumption progress should not be understood as the number of messages stored on the broker; this distinction is basic but important.

commitlogStores brokerall messages on Imagine if every time you want to query the message and consumption needs from the file traversal query performance difference may want
conceivable, in order to improve the query message, the priority is to think of such MySQLan index on the design. Similarly, consumerqueuethe beginning of the design is to
quickly navigate to the corresponding consumer can consume the news, of course, RocketMQalso provided indexfile, commonly known as index files, mainly resolved through key
way to quickly locate messages.

consumequeue entry structure

(Figure: consumequeue entry layout, taken from "RocketMQ Technology Insider")

In the consumequeue design, each entry has a fixed size and corresponds to exactly one message. A single consumequeue file holds 300,000 entries by default, so its length is 300,000 * 20 bytes. As the storage model shows, the consumequeue's storage dimension is the topic, not the consumer. So how do we find a consumer's accumulation amount?
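The size arithmetic above is worth making explicit. A minimal sketch (illustrative only): with fixed 20-byte entries and 300,000 entries per file, a consumequeue file is 6,000,000 bytes, and a byte offset divides by 20 to give a logical message index.

```java
public class CqSizeSketch {
    static final int CQ_STORE_UNIT_SIZE = 20;   // bytes per entry
    static final int ENTRIES_PER_FILE = 300_000; // default entries per consumequeue file

    static long fileSizeBytes() {
        return (long) ENTRIES_PER_FILE * CQ_STORE_UNIT_SIZE; // 300,000 * 20
    }

    // Convert a byte offset in the consumequeue into a logical message index,
    // the same division done by ConsumeQueue#getMaxOffsetInQueue().
    static long maxOffsetInQueue(long maxByteOffset) {
        return maxByteOffset / CQ_STORE_UNIT_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(fileSizeBytes());             // 6000000
        System.out.println(maxOffsetInQueue(6_000_000L)); // 300000
    }
}
```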

Hypothesis

Suppose a topic has exactly one consumer; then the topic's backlog equals that consumer's accumulation amount. Reasoning from this dimension, the earlier observation becomes reasonable: a consumer with almost no messages of its own can still show accumulation, because the accumulated messages are not messages that consumer needs to consume, but messages accumulated in the consumequeues of the topic it subscribes to.

Verification

The accumulation count shown in the rocketmq-console backend comes from AdminBrokerProcessor#getConsumeStats():


private RemotingCommand getConsumeStats(ChannelHandlerContext ctx,   
    // ...
    for (String topic : topics) {
        // ...
        for (int i = 0; i < topicConfig.getReadQueueNums(); i++) {
            MessageQueue mq = new MessageQueue();
            mq.setTopic(topic);
            mq.setBrokerName(this.brokerController.getBrokerConfig().getBrokerName());
            mq.setQueueId(i);

            OffsetWrapper offsetWrapper = new OffsetWrapper();
            // the key is to pin down the semantics of brokerOffset and consumerOffset
            long brokerOffset = this.brokerController.getMessageStore().getMaxOffsetInQueue(topic, i);
            if (brokerOffset < 0)
                brokerOffset = 0;

            long consumerOffset = this.brokerController.getConsumerOffsetManager().queryOffset(
                requestHeader.getConsumerGroup(),
                topic,
                i);
            if (consumerOffset < 0)
                consumerOffset = 0;

    // ....
}

// maximum index of the queue
public long getMaxOffsetInQueue(String topic, int queueId) {
    ConsumeQueue logic = this.findConsumeQueue(topic, queueId);
    if (logic != null) {
        long offset = logic.getMaxOffsetInQueue();
        return offset;
    }
    return 0;
}
public long getMaxOffsetInQueue() {
    return this.mappedFileQueue.getMaxOffset() / CQ_STORE_UNIT_SIZE;
  // total logical byte offset / 20 = total number of messages
}

public static final int CQ_STORE_UNIT_SIZE = 20; // as mentioned earlier, each entry is a fixed 20 bytes

// the current consumer's consumption progress
long consumerOffset = this.brokerController.getConsumerOffsetManager().queryOffset(requestHeader.getConsumerGroup(),topic,i);
if (consumerOffset < 0)
    consumerOffset = 0;
public long queryOffset(final String group, final String topic, final int queueId) {
    // key format: topic@group
    String key = topic + TOPIC_GROUP_SEPARATOR + group;
    ConcurrentMap<Integer, Long> map = this.offsetTable.get(key); // read from offsetTable
    if (null != map) {
        Long offset = map.get(queueId);
        if (offset != null)
            return offset;
    }
    return -1;
}
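The queryOffset lookup above is easy to model. This is a self-contained sketch (not RocketMQ source): the table maps "topic@group" keys to a per-queueId consume offset, and a missing entry yields -1, mirroring the shape that consumerOffset.json is deserialized into.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class OffsetTableSketch {
    static final String TOPIC_GROUP_SEPARATOR = "@";
    // "topic@group" -> (queueId -> committed consume offset)
    static final Map<String, ConcurrentMap<Integer, Long>> offsetTable = new HashMap<>();

    static long queryOffset(String group, String topic, int queueId) {
        String key = topic + TOPIC_GROUP_SEPARATOR + group;
        ConcurrentMap<Integer, Long> map = offsetTable.get(key);
        if (map != null) {
            Long offset = map.get(queueId);
            if (offset != null) return offset;
        }
        return -1; // unknown group/topic/queue
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Long> queues = new ConcurrentHashMap<>();
        queues.put(0, 42L); // queue 0 consumed up to logical offset 42
        offsetTable.put("TopicTest@please_rename_unique_group_name", queues);

        System.out.println(queryOffset("please_rename_unique_group_name", "TopicTest", 0)); // 42
        System.out.println(queryOffset("please_rename_unique_group_name", "TopicTest", 1)); // -1
    }
}
```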

So the offset is read from an in-memory cache (offsetTable). Where does that data come from?


// the following code is easy to find with an IDE
@Override
public String configFilePath() {
    return
    BrokerPathConfigHelper.getConsumerOffsetPath(this.brokerController.getMessageStoreConfig().
        getStorePathRootDir());
}
@Override
public void decode(String jsonString) {
    if (jsonString != null) {
        ConsumerOffsetManager obj = RemotingSerializable.fromJson(jsonString,ConsumerOffsetManager.class);
        if (obj != null) {
            this.offsetTable = obj.offsetTable;
        }
    }
}
public static String getConsumerOffsetPath(final String rootDir) {
    return rootDir + File.separator + "config" + File.separator + "consumerOffset.json";
}

So the offset data is loaded from a JSON file.

(Figure: contents of consumerOffset.json)

This file records the relationship between topics and consumer groups, with each queue's consumption progress. Since consumption happens continuously, the progress must be updated in real time, and the update is driven by the message pull.

DefaultMessageStore

We saw part of this class earlier in the pull path; here we complete the semantics of the offset used when pulling.


ConsumeQueueExt.CqExtUnit cqExtUnit = new ConsumeQueueExt.CqExtUnit();
for (; i < bufferConsumeQueue.getSize() && i < maxFilterMessageCount; i += ConsumeQueue.CQ_STORE_UNIT_SIZE) {
    long offsetPy = bufferConsumeQueue.getByteBuffer().getLong();
    int sizePy = bufferConsumeQueue.getByteBuffer().getInt();
    long tagsCode = bufferConsumeQueue.getByteBuffer().getLong();

    // ...
        // offsetPy is the physical offset into the commitlog
    SelectMappedBufferResult selectResult = this.commitLog.getMessage(offsetPy, sizePy);
    if (null == selectResult) {
        if (getResult.getBufferTotalSize() == 0) {
            status = GetMessageStatus.MESSAGE_WAS_REMOVING;
        }

        nextPhyFileStartOffset = this.commitLog.rollNextFile(offsetPy);
        continue;
    }
        // message filtering against the commitlog content
    if (messageFilter != null
        && !messageFilter.isMatchedByCommitLog(selectResult.getByteBuffer().slice(), null)) {
        if (getResult.getBufferTotalSize() == 0) {
            status = GetMessageStatus.NO_MATCHED_MESSAGE;
        }
        // release...
        selectResult.release();
        continue;
    }
    // ....
}

// ...
//
// compute the next begin offset (the queue offset from earlier)
// i is a multiple of ConsumeQueue.CQ_STORE_UNIT_SIZE
// CQ_STORE_UNIT_SIZE is the size of one consumequeue entry: 20 bytes
nextBeginOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);

long diff = maxOffsetPy - maxPhyOffsetPulling;
long memory = (long) (StoreUtil.TOTAL_PHYSICAL_MEMORY_SIZE
    * (this.messageStoreConfig.getAccessMessageInMemoryMaxRatio() / 100.0));
getResult.setSuggestPullingFromSlave(diff > memory);

From this we can see that when a consumer pulls messages, nextBeginOffset is a consumequeue offset (byte offset / 20), i.e., something like an array index.
To confirm that the consumer's pull really updates its progress into offsetTable, look at the core RemoteBrokerOffsetStore class.
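The nextBeginOffset arithmetic can be sketched on its own (illustrative only): offset is a logical queue index, i is the number of consumequeue bytes scanned, so i / 20 converts bytes back into a message count.

```java
public class NextOffsetSketch {
    static final int CQ_STORE_UNIT_SIZE = 20; // bytes per consumequeue entry

    // nextBeginOffset = offset + (i / CQ_STORE_UNIT_SIZE), as in getMessage()
    static long nextBeginOffset(long offset, int bytesScanned) {
        return offset + (bytesScanned / CQ_STORE_UNIT_SIZE);
    }

    public static void main(String[] args) {
        // starting at logical offset 100, after scanning 32 entries (640 bytes):
        System.out.println(nextBeginOffset(100L, 32 * CQ_STORE_UNIT_SIZE)); // 132
    }
}
```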

Reporting consumption progress

A few screenshots outline how the client reports consumption progress:

(Figures: client-side flow of reporting consumption progress to the broker)

At this point it is clear that the accumulation amount is counted per topic, which confirms the hypothesis made at the beginning. This explains why consumers with no messages of their own still show accumulation.

Summary

Message filtering during consumption happens on the server side; there are other filtering modes (such as the Class filter mode) that this article did not cover. Because our topic was used unreasonably, pulls often returned no message data, yet ONS still charged for them. With the meaning of accumulation now clear, the correct way to use RocketMQ is self-evident: design topics and tags sensibly per business line.

Reference material

Source code: https://github.com/apache/rocketmq
Official website: http://rocketmq.apache.org/docs/rmq-deployment/
Book: "RocketMQ Technology Insider" — highly recommended for a deeper understanding of RocketMQ's architecture design and code

Reproduced from: https://www.jianshu.com/p/fcfc662368f4
