MQ selection in project development

0. Summary

RocketMQ, from getting started to the grave (1): Principles and hands-on practice that beginners can follow

RocketMQ, from getting started to the grave (2): Transactional messages and ordered messages

RocketMQ, from getting started to the grave (3): How does RocketMQ ensure that messages are not lost?

RocketMQ, from getting started to the grave (4): Source code analysis of how the producer sends messages

RocketMQ, from getting started to the grave (5): Source code analysis of persistent message storage

RocketMQ, from getting started to the grave (6): Which algorithms select the queue when sending a message?

RocketMQ, from getting started to the grave (7): Why do different tags within the same consumer group cause strange behavior?

RocketMQ, from getting started to the grave (8): How does the RocketMQ Consumer do load balancing?

RocketMQ, from getting started to the grave (9): A step-by-step guide to building a RocketMQ dual-master, dual-slave synchronous cluster

RocketMQ, from getting started to the grave (10): RocketMQ cluster workflow and core concepts

1. Which message middleware does your company use in its production environment?

See question 2, "How do you choose among multiple MQs?"

2. How do you choose among multiple MQs?

| MQ | Description |
| --- | --- |
| RabbitMQ | Written in Erlang. It handles message backlogs poorly: when a large number of messages pile up, performance drops sharply. It can process tens of thousands to hundreds of thousands of messages per second. |
| RocketMQ | Written in Java. It offers rich cluster features aimed at Internet workloads and many optimizations for the response latency of online services. In most cases it responds within milliseconds and can process hundreds of thousands of messages per second. |
| Kafka | Written in Scala. It offers rich log-oriented features and the highest throughput. When the number of messages per second in your business scenario is low, Kafka's latency is relatively high, so Kafka is not well suited to online business scenarios. |
| ActiveMQ | Written in Java. It is simple and stable, but its performance is below that of the other three. It is acceptable for small systems but not recommended; prefer one of the mainstream options used by Internet companies. |

3. Why use MQ?

Because the project is fairly large and built as a distributed system, every remote service call used to run synchronously, and failures were frequent. MQ was introduced to address this.

| Benefit | Description |
| --- | --- |
| Decoupling | Systems are less tightly coupled and no longer strongly depend on one another. |
| Asynchronous processing | Remote calls that do not need to run synchronously are made asynchronous, which improves response time. |
| Peak shaving | When requests peak, back-end services keep consuming at a steady rate and are not overwhelmed. |

4. What roles does RocketMQ consist of, and what are the functions and characteristics of each role?

| Role | Function |
| --- | --- |
| NameServer | Stateless; keeps a dynamic list of Brokers. This is one of the important differences from ZooKeeper, which is stateful. |
| Producer | The message producer; sends messages to the Broker. |
| Broker | The MQ itself; receives and delivers messages, persists them, and so on. |
| Consumer | The message consumer; pulls messages from the Broker, consumes them, and acks after consumption. |

5. What is the difference between a Topic in RocketMQ and a queue in JMS?

A queue is the FIFO queue known from data structures. A Topic is an abstract concept: each Topic is backed by N queues, and the data is actually stored in those queues.

6. Will the messages in RocketMQ Broker be deleted immediately after being consumed?

No. Every message is persisted to the CommitLog. Each Consumer that connects to the Broker maintains its own consumption progress. When a message is consumed, only that Consumer's consumption progress (its offset) is updated; the message itself stays in the CommitLog.

Follow-up: So will messages pile up? When are expired messages cleaned up?

By default, version 4.6 deletes CommitLog files that are no longer in use after 48 hours:

  • Check the file's last access time

  • Determine whether the file is older than the expiration time

  • Delete at the configured time, which defaults to 4 a.m.

The source code is as follows:

/**
 * {@link org.apache.rocketmq.store.DefaultMessageStore.CleanCommitLogService#isTimeToDelete()}
 */
private boolean isTimeToDelete() {
    // deleteWhen defaults to "04", i.e. 4 a.m.
    String when = DefaultMessageStore.this.getMessageStoreConfig().getDeleteWhen();
    // Returns true only when the current hour matches the configured one.
    return UtilAll.isItTimeToDo(when);
}

/**
 * {@link org.apache.rocketmq.store.DefaultMessageStore.CleanCommitLogService#deleteExpiredFiles()}
 */
private void deleteExpiredFiles() {
    // isTimeToDelete() checks whether it is 4 a.m.; only then does the deletion logic run.
    if (isTimeToDelete()) {
        // The code default is 72 hours, but the default broker config file sets 48,
        // so newer versions effectively use 48.
        long fileReservedTime = 48 * 60 * 60 * 1000;
        deleteCount = DefaultMessageStore.this.commitLog.deleteExpiredFile(fileReservedTime, xx, xx, xx);
    }
}

/**
 * {@link org.apache.rocketmq.store.CommitLog#deleteExpiredFile()}
 */
public int deleteExpiredFile(xxx) {
    // Main logic: walk the files and delete each one whose
    // last-modified time + expiry time is earlier than the current system time.
    // The first argument is the expiry time handed in by the caller (48 hours here).
    return this.mappedFileQueue.deleteExpiredFileByTime(48 * 60 * 60 * 1000, xx, xx, xx);
}
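As a rough illustration of the expiry rule in the comments above, the following sketch checks whether a file's last-modified time plus the retention period lies in the past. `RetentionCheck` and `isExpired` are hypothetical names made up for this example; this is not RocketMQ's code.

```java
import java.time.Duration;

// Hypothetical helper, not part of RocketMQ. It mirrors the rule
// "last-modified time + retention period < now => the file may be deleted".
final class RetentionCheck {
    private RetentionCheck() {}

    static boolean isExpired(long lastModifiedMillis, Duration reserved, long nowMillis) {
        return lastModifiedMillis + reserved.toMillis() < nowMillis;
    }

    public static void main(String[] args) {
        Duration reserved = Duration.ofHours(48);
        long now = System.currentTimeMillis();
        long threeDaysAgo = now - Duration.ofHours(72).toMillis();
        long oneDayAgo = now - Duration.ofHours(24).toMillis();
        System.out.println(isExpired(threeDaysAgo, reserved, now)); // true
        System.out.println(isExpired(oneDayAgo, reserved, now));    // false
    }
}
```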

7. What are the consumption modes of RocketMQ?

The consumption mode is decided by the Consumer, and consumption is organized per Topic.

  • Cluster consumption

1. A message is consumed by only one Consumer within the same Group

2. When multiple Groups consume the same Topic, each Group receives the message once, and one Consumer in that Group consumes it

  • Broadcast consumption

The message is consumed by every Consumer instance in a Consumer Group. Even though these Consumers belong to the same Consumer Group, each of them consumes the message once.
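To make the difference concrete, here is a small in-memory simulation of the two modes for a single consumer group. It assumes a simple round-robin hand-out as a stand-in for real queue allocation; `ConsumptionModes` and its methods are hypothetical and do not use the RocketMQ client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of one consumer group, not RocketMQ code.
final class ConsumptionModes {
    private ConsumptionModes() {}

    // Clustering: each message goes to exactly one consumer of the group.
    // Round-robin is only a stand-in for real queue allocation.
    static Map<String, List<String>> clustering(List<String> consumers, List<String> messages) {
        Map<String, List<String>> inboxes = emptyInboxes(consumers);
        for (int i = 0; i < messages.size(); i++) {
            inboxes.get(consumers.get(i % consumers.size())).add(messages.get(i));
        }
        return inboxes;
    }

    // Broadcasting: every consumer of the group receives every message.
    static Map<String, List<String>> broadcasting(List<String> consumers, List<String> messages) {
        Map<String, List<String>> inboxes = emptyInboxes(consumers);
        for (String consumer : consumers) {
            inboxes.get(consumer).addAll(messages);
        }
        return inboxes;
    }

    private static Map<String, List<String>> emptyInboxes(List<String> consumers) {
        Map<String, List<String>> inboxes = new LinkedHashMap<>();
        for (String consumer : consumers) {
            inboxes.put(consumer, new ArrayList<>());
        }
        return inboxes;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("c1", "c2");
        List<String> messages = Arrays.asList("m1", "m2", "m3");
        System.out.println(clustering(group, messages));   // {c1=[m1, m3], c2=[m2]}
        System.out.println(broadcasting(group, messages)); // {c1=[m1, m2, m3], c2=[m1, m2, m3]}
    }
}
```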

8. Do consumers get messages by push or by pull?

RocketMQ has no push in the true sense; everything is a pull. There are push classes, but the underlying implementation uses long polling, which is still a pull.

The broker property longPollingEnable controls whether long polling is enabled. It is on by default.

The source code is as follows:

// {@link org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl#pullMessage()}
// A wolf in sheep's clothing: the class is called PushConsumerImpl, but it actually does the work of a pull.

// Pull messages; the result is delivered to pullCallback.
this.pullAPIWrapper.pullKernelImpl(pullCallback);

Follow-up: Why pull messages actively instead of using event-driven push?

The event-driven approach keeps a long-lived connection open and pushes data in real time as events occur.

If the broker pushed messages actively, it could push faster than the consumer can consume. Messages would then pile up on that consumer and could not be consumed by other consumers. With pull, the consumer fetches according to its own capacity, so it is not put under excessive pressure and does not become a bottleneck. That is why pull is used.

9. How does the broker handle pull requests?

The Consumer sends its first request to the Broker

  • Does the Broker have eligible messages?

  • Yes

    • Respond to the Consumer

    • Wait for the Consumer's next request

  • No

    • Suspend the Consumer's request, that is, neither close the connection nor return data

    • Use consumer's offset,

    • PullRequestHoldService holds the request and runs every 5s, checking whether any request in pullRequestTable now has messages; if so, it responds immediately

    • DefaultMessageStore#ReputMessageService#run checks the commitLog for new messages every 1ms and, when there are any, wakes the matching requests held in pullRequestTable

    • The held request returns as soon as a new message arrives
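The hold-and-release idea behind long polling can be sketched in a few lines of plain Java. This is a simplified model under stated assumptions, not the broker's implementation: `LongPollingBroker`, `pull`, `append`, and `expireAll` are hypothetical names, the log is an in-memory list, and the timeout is triggered manually instead of by a timer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical in-memory model of long polling, not RocketMQ code.
final class LongPollingBroker {
    private static final class HeldRequest {
        final int offset;
        final CompletableFuture<List<String>> reply;

        HeldRequest(int offset, CompletableFuture<List<String>> reply) {
            this.offset = offset;
            this.reply = reply;
        }
    }

    private final List<String> log = new ArrayList<>();
    private final List<HeldRequest> held = new ArrayList<>();

    // Answers at once if messages exist at the offset; otherwise parks the request.
    synchronized CompletableFuture<List<String>> pull(int offset) {
        if (offset < log.size()) {
            return CompletableFuture.completedFuture(snapshotFrom(offset));
        }
        CompletableFuture<List<String>> reply = new CompletableFuture<>();
        held.add(new HeldRequest(offset, reply));
        return reply;
    }

    // A new message arrives: store it and release every held request it satisfies.
    synchronized void append(String message) {
        log.add(message);
        Iterator<HeldRequest> it = held.iterator();
        while (it.hasNext()) {
            HeldRequest request = it.next();
            if (request.offset < log.size()) {
                request.reply.complete(snapshotFrom(request.offset));
                it.remove();
            }
        }
    }

    // Stand-in for the hold timeout: release all held requests with an empty reply.
    synchronized void expireAll() {
        for (HeldRequest request : held) {
            request.reply.complete(Collections.<String>emptyList());
        }
        held.clear();
    }

    private List<String> snapshotFrom(int offset) {
        return new ArrayList<>(log.subList(offset, log.size()));
    }

    public static void main(String[] args) {
        LongPollingBroker broker = new LongPollingBroker();
        CompletableFuture<List<String>> reply = broker.pull(0);
        System.out.println(reply.isDone()); // false: the request is held
        broker.append("m1");
        System.out.println(reply.join());   // [m1]
    }
}
```

The consumer sees a single request that simply takes longer to answer, which is why this still counts as a pull.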

10. How does RocketMQ do load balancing?

Topics are stored in a distributed fashion across multiple Brokers.

Producer side

The sender chooses which message queue to send to, and therefore which Broker, so that writes are load balanced

  • Improves write throughput: when many producers write to a single broker at the same time, performance drops

  • Messages are spread across multiple brokers, which prepares for load-balanced consumption

The default strategy is round-robin from a random starting point:

  • The producer maintains an index

  • Each time a queue is picked, the index is incremented

  • The index modulo the number of queues gives the queue to use

  • A fault-tolerance strategy is built in

Other implementations:

  • SelectMessageQueueByHash

    • Hashes the arg that is passed in

  • SelectMessageQueueByRandom

  • SelectMessageQueueByMachineRoom (not implemented)

You can also write your own by implementing the select method of the MessageQueueSelector interface

MessageQueue select(final List<MessageQueue> mqs, final Message msg, final Object arg);
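The index-and-remainder selection described above can be sketched without the RocketMQ client. `RoundRobinSelector` is a hypothetical class written for this example; the only assumptions are a shared counter and a list of queues.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector, not the RocketMQ implementation.
final class RoundRobinSelector {
    private final AtomicInteger index;

    // Start from a random position so that producers do not all hit the same queue first.
    RoundRobinSelector() {
        this(ThreadLocalRandom.current().nextInt());
    }

    RoundRobinSelector(int start) {
        this.index = new AtomicInteger(start);
    }

    <T> T select(List<T> queues) {
        // floorMod keeps the result non-negative even after the counter overflows.
        int position = Math.floorMod(index.getAndIncrement(), queues.size());
        return queues.get(position);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector(0);
        List<String> queues = Arrays.asList("q0", "q1", "q2");
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.select(queues)); // q0, q1, q2, q0
        }
    }
}
```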

Consumer side

Load balancing uses the average allocation algorithm by default.

Available allocation strategies:

  • Average allocation (default): AllocateMessageQueueAveragely

  • Circular allocation: AllocateMessageQueueAveragelyByCircle

  • Manually configured allocation: AllocateMessageQueueByConfig

  • Machine-room allocation: AllocateMessageQueueByMachineRoom

  • Consistent-hash allocation: AllocateMessageQueueConsistentHash

  • Nearby machine-room allocation: AllocateMachineRoomNearby

Follow-up: What happens when the number of consumers and the number of queues are not equal?

Queues are divided among Consumers as evenly as possible. If there are fewer Consumers than queues, some Consumers consume several queues. If the numbers are equal, each Consumer consumes one queue. If there are more Consumers than queues, the extra Consumers get no queue and sit idle, which is a waste.
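The three cases can be reproduced with a short sketch of average allocation. This is a simplified version written for illustration, assuming every consumer sees the same sorted list of queues and consumer ids; `AverageAllocation` is a hypothetical name and the code is not copied from AllocateMessageQueueAveragely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of average allocation, not the RocketMQ class.
final class AverageAllocation {
    private AverageAllocation() {}

    // Returns the queues assigned to currentId; both lists must be in the same
    // order on every consumer so that all consumers compute the same split.
    static <T> List<T> allocate(List<T> queues, List<String> consumerIds, String currentId) {
        List<T> result = new ArrayList<>();
        int index = consumerIds.indexOf(currentId);
        if (index < 0 || queues.isEmpty()) {
            return result;
        }
        int base = queues.size() / consumerIds.size();
        int extra = queues.size() % consumerIds.size();
        // The first `extra` consumers take one additional queue each.
        int start = index * base + Math.min(index, extra);
        int count = base + (index < extra ? 1 : 0);
        for (int i = start; i < start + count; i++) {
            result.add(queues.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("q0", "q1", "q2", "q3", "q4");
        List<String> consumers = Arrays.asList("c1", "c2");
        System.out.println(allocate(queues, consumers, "c1")); // [q0, q1, q2]
        System.out.println(allocate(queues, consumers, "c2")); // [q3, q4]
    }
}
```

With two queues and three consumers, the third consumer receives an empty list, which is the idle case described above.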

11. Repeated consumption of messages

A major factor that affects normal sending and consumption of messages is the unreliability of the network.

Causes of repeated consumption

  • ACK

Normally, after the consumer has actually consumed a message, it sends an ack to tell the broker that the message was consumed successfully, so the broker no longer delivers it

When the ack cannot reach the broker because of network problems, the broker assumes the message was not consumed and triggers redelivery, so the message is delivered to the consumer again

  • Consumption mode

In CLUSTERING mode the broker ensures that a message is consumed once by the consumers of a single group, but it is delivered separately to each different group, so it is consumed several times across groups

Solutions

  • Database table

Before processing a message, insert its unique key into a column with a unique constraint; a failed insert means the message was already handled

  • Map

In single-node mode you can use an in-memory map: ConcurrentHashMap with putIfAbsent, or a Guava cache

  • Redis

Use a distributed lock.
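For the single-node Map option, a minimal sketch of deduplication with `ConcurrentHashMap.putIfAbsent` might look like this. `IdempotentHandler` and the message keys are hypothetical; a real system would also need to bound or expire the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical single-node deduplicator; the map grows without bound in this sketch.
final class IdempotentHandler {
    private final ConcurrentMap<String, Boolean> seen = new ConcurrentHashMap<>();

    // Returns true if the business logic ran, false if the key was a duplicate.
    boolean handle(String messageKey, Runnable businessLogic) {
        // putIfAbsent is atomic: only the first caller for a key gets null back.
        if (seen.putIfAbsent(messageKey, Boolean.TRUE) != null) {
            return false;
        }
        try {
            businessLogic.run();
        } catch (RuntimeException e) {
            // Processing failed, so forget the key and let a redelivery try again.
            seen.remove(messageKey);
            throw e;
        }
        return true;
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        System.out.println(handler.handle("msg-1", () -> System.out.println("processing msg-1"))); // true
        System.out.println(handler.handle("msg-1", () -> System.out.println("processing msg-1"))); // false
    }
}
```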

12. How to make RocketMQ guarantee the sequential consumption of messages

When you use message middleware for online business, do you need to guarantee message order?

If not, why not? And if a scenario does require ordered messages, how would you guarantee it?

First, with multiple queues, order is guaranteed only within a single queue. A queue is a classic FIFO structure and is ordered by nature. Consuming from several queues at once cannot strictly guarantee message order. In summary:

Use the same topic and the same queue, send with a single thread, and consume each queue with a single thread.

Follow-up: How do you make sure messages are sent to the same queue?

RocketMQ provides the MessageQueueSelector interface. You can implement it with your own algorithm. The simplest example: if i % 2 == 0, put the message in queue1, otherwise put it in queue2.

for (int i = 0; i < 5; i++) {
    Message message = new Message("orderTopic", ("hello!" + i).getBytes());
    producer.send(
        // The message to send
        message,
        // Queue selector: decides which queue of the topic receives the message
        new MessageQueueSelector() {
            // Pick a queue manually
            @Override
            public MessageQueue select(
                // All queues of the current topic
                List<MessageQueue> mqs,
                // The message being sent
                Message msg,
                // The arg passed to send(), i.e. the i in front of 2000 below
                Object arg) {
                // Even values go to the first queue, odd values to the second
                if (Integer.parseInt(arg.toString()) % 2 == 0) {
                    return mqs.get(0);
                } else {
                    return mqs.get(1);
                }
            }
        },
        // Custom argument handed to the selector: i
        // 2000 is the send timeout in milliseconds
        i, 2000);
}
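The parity check above is a toy. A common variation, shown here only as a sketch, is to derive the queue index from a business key so that all messages sharing the key land in the same queue. `OrderKeyRouting`, `queueIndexFor`, and the order ids are hypothetical; inside a real selector the returned index would be used with `mqs.get(...)`.

```java
// Hypothetical helper: maps a business key to a stable queue index.
final class OrderKeyRouting {
    private OrderKeyRouting() {}

    static int queueIndexFor(String orderId, int queueCount) {
        // floorMod avoids a negative index when hashCode() is negative.
        return Math.floorMod(orderId.hashCode(), queueCount);
    }

    public static void main(String[] args) {
        int queueCount = 4;
        // Messages of the same order always map to the same queue index.
        System.out.println(queueIndexFor("order-1001", queueCount));
        System.out.println(queueIndexFor("order-1001", queueCount));
        System.out.println(queueIndexFor("order-1002", queueCount));
    }
}
```

Note that the mapping changes if the number of queues changes, so ordering only holds while the queue count stays fixed.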

13. How does RocketMQ ensure that messages are not lost?

First, messages can be lost in any of the following three places:

  • Producer side

  • Broker side

  • Consumer side

13.1 How to ensure that messages are not lost on the Producer side

  • Send messages synchronously with send(), so the result of each send is known immediately.

  • Retry after a failed send and set the number of retries. The default is 2 retries, which means up to 3 send attempts in total.

producer.setRetryTimesWhenSendFailed(10);

  • Deploy a cluster. For example, a send may fail because the current Broker is down; the retry then goes to another Broker.
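The retry-on-another-Broker behavior can be pictured with the sketch below. It is an illustration under simple assumptions (brokers tried in list order, a boolean result per attempt), not the producer's actual logic; `RetryingSender` and the broker names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical retry loop, not the RocketMQ producer.
final class RetryingSender {
    private RetryingSender() {}

    // Makes up to 1 + retries attempts, moving to the next broker each time.
    // Returns the broker that accepted the message, or empty if all attempts failed.
    static Optional<String> send(List<String> brokers, int retries, Predicate<String> trySend) {
        for (int attempt = 0; attempt <= retries; attempt++) {
            String broker = brokers.get(attempt % brokers.size());
            if (trySend.test(broker)) {
                return Optional.of(broker);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("broker-a", "broker-b");
        // Pretend broker-a is down: only broker-b accepts the message.
        System.out.println(send(brokers, 2, b -> b.equals("broker-b"))); // Optional[broker-b]
    }
}
```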

13.2 How to ensure that messages are not lost on the Broker side

  • Change the disk flush strategy to synchronous flushing. The default is asynchronous flushing.

flushDiskType = SYNC_FLUSH

  • Deploy a cluster in master-slave mode for high availability.

13.3 How to ensure that messages are not lost on the Consumer side

  • Acknowledge (ack) manually, and only after the message has been fully and successfully consumed.

14. How to deal with message accumulation in RocketMQ

If the downstream consumer system goes down and millions of messages back up in the message middleware, what should you do?

Have you ever had a production incident caused by a message backlog? If not, how would you plan to handle one?

First, find out what caused the backlog: too many Producers, too few Consumers, or something else. In short, locate the problem first.

Then check whether the consumption rate is normal. If it is, you can relieve the backlog temporarily by bringing more consumers online

Follow-up: What if the number of Consumers and Queues is not equal, and bringing several machines online still cannot drain the backlog quickly?

  • Prepare a temporary topic

  • Give it several times as many queues as the topic with the backlog

  • Spread the queues across multiple Brokers

  • Bring up a Consumer that acts as a message mover: it copies messages from the original Topic to the new Topic without any business processing

  • Bring N Consumers online to consume the temporary topic in parallel

  • Fix the bug

  • Restore the original Consumers and resume consuming the original Topic

Follow-up: If the backlog lasts too long, do messages time out?

A message in RocketMQ disappears only when its CommitLog file is deleted. There is no per-message timeout: an unconsumed message is not removed merely because it has waited too long.

Follow-up: Do accumulated messages enter the dead-letter queue?

No. After consumption fails, a message first enters the retry queue (%RETRY%+ConsumerGroup). Only when the maximum number of retries is exhausted does it enter the dead-letter queue (%DLQ%+ConsumerGroup). The default maximum is 16 retries, not 18: messageDelayLevel defines 18 delay levels, but consumption retries start at the third level (10s), which leaves 16 levels from 10s to 2h.

The source code is as follows:

public class MessageStoreConfig {
    // Retries are spaced by these delay levels; if the final retry also fails, the message goes to the dead-letter queue.
    private String messageDelayLevel = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";
}
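To see what the level string amounts to, the sketch below parses it into millisecond delays. `DelayLevels` and `parseMillis` are hypothetical names for this example, and the parsing rules (a number followed by one of s, m, h, d) are an assumption based on the format of the string above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for a delay-level string such as "1s 5s 10s ... 2h".
final class DelayLevels {
    private DelayLevels() {}

    static List<Long> parseMillis(String levels) {
        List<Long> result = new ArrayList<>();
        for (String token : levels.trim().split("\\s+")) {
            long value = Long.parseLong(token.substring(0, token.length() - 1));
            char unit = token.charAt(token.length() - 1);
            switch (unit) {
                case 's': result.add(value * 1_000L); break;
                case 'm': result.add(value * 60_000L); break;
                case 'h': result.add(value * 3_600_000L); break;
                case 'd': result.add(value * 86_400_000L); break;
                default: throw new IllegalArgumentException("Unknown unit in: " + token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> delays = parseMillis(
            "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h");
        System.out.println(delays.size());  // 18
        System.out.println(delays.get(0));  // 1000
        System.out.println(delays.get(17)); // 7200000
    }
}
```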

15. The underlying principle of RocketMQ's distributed transaction support mechanism?

Have you used RocketMQ? One of its major features is support for distributed transactions. What is the underlying principle behind this mechanism?

In distributed systems, approaches such as TCC (Try, Confirm, Cancel) and 2PC can be used to make operations atomic

RocketMQ 4.3+ provides transactional messages, which can be used to achieve eventual consistency for distributed transactions

How RocketMQ implements it:

**Half message:** a prepared message. When the broker receives one, it stores it in the consume queue of RMQ_SYS_TRANS_HALF_TOPIC

**Transaction status check:** the Broker starts a scheduled task that consumes the messages in RMQ_SYS_TRANS_HALF_TOPIC. On each run it asks the message sender for the transaction status (commit, rollback, unknown). If the status is unknown, the Broker checks back again periodically.

**Timeout:** if the maximum number of check-backs is exceeded, the message is rolled back by default.

In other words, the message does not enter the real Topic queue at first. It is held as a so-called half message in a temporary queue, and only after the transaction commits is it moved to the queue under the real topic.
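The half-message life cycle can be modeled in memory as follows. This is a teaching sketch, not the broker's implementation: `HalfMessageBroker`, `TxState`, and the method names are hypothetical, and the check-back is driven by calling `checkPending` by hand rather than by a scheduled task.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory model of half messages, not RocketMQ code.
final class HalfMessageBroker {
    enum TxState { COMMIT, ROLLBACK, UNKNOWN }

    // Message id -> number of check-backs performed so far.
    private final Map<String, Integer> halfMessages = new LinkedHashMap<>();
    // Messages that consumers are allowed to see.
    private final List<String> visible = new ArrayList<>();
    private final int maxChecks;

    HalfMessageBroker(int maxChecks) {
        this.maxChecks = maxChecks;
    }

    // Step 1: the producer sends a half message; consumers cannot see it yet.
    void sendHalf(String id) {
        halfMessages.put(id, 0);
    }

    // Step 2: the producer reports the outcome of its local transaction.
    void resolve(String id, TxState state) {
        if (!halfMessages.containsKey(id) || state == TxState.UNKNOWN) {
            return;
        }
        halfMessages.remove(id);
        if (state == TxState.COMMIT) {
            visible.add(id);
        }
    }

    // Step 3: one pass of the periodic check-back over pending half messages.
    void checkPending(Function<String, TxState> producerCheck) {
        List<String> ids = new ArrayList<>(halfMessages.keySet());
        for (String id : ids) {
            int checks = halfMessages.get(id) + 1;
            TxState state = producerCheck.apply(id);
            if (state == TxState.UNKNOWN && checks >= maxChecks) {
                state = TxState.ROLLBACK; // too many check-backs: roll back
            }
            if (state == TxState.UNKNOWN) {
                halfMessages.put(id, checks);
            } else {
                resolve(id, state);
            }
        }
    }

    boolean isPending(String id) {
        return halfMessages.containsKey(id);
    }

    List<String> visibleMessages() {
        return visible;
    }

    public static void main(String[] args) {
        HalfMessageBroker broker = new HalfMessageBroker(2);
        broker.sendHalf("tx-1");
        broker.resolve("tx-1", TxState.COMMIT);
        System.out.println(broker.visibleMessages()); // [tx-1]
    }
}
```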

16. If you were asked to implement a distributed message middleware, how would you design the overall architecture?

I personally would answer along these points:

  • Fast scale-out, with native cluster support

  • Persistence strategy

  • High availability

  • Zero data loss

  • Easy to deploy on the server side and easy to use on the client side

17. Have you read the RocketMQ source code? If so, what is your understanding of it?

If you really want my opinion, I will start with a complaint. First, there are no comments. Perhaps Alibaba originally wrote Chinese comments, and after the project was donated to Apache, the Chinese comments could not be kept and nobody bothered to write English ones, so they were all deleted. The most typical design patterns are singleton, factory, strategy, and facade. Singletons and factories are everywhere, and the use of strategy is impressive. For example, queue load balancing for sending and consuming messages comes in N strategy implementations, including random, hash, and so on. This is also one of the prerequisites for fast scale-out and native cluster support. Persistence is also fairly complete: the CommitLog is flushed to disk either synchronously or asynchronously.

18. How to optimize producer and consumer performance under high throughput?

Development

  • Deploy on multiple machines and consume in parallel within the same group

  • Increase the number of consumer threads in a single Consumer

  • Batch consumption

    • Pull messages in batches

    • Process business logic in batches

Operations

  • Network card tuning

  • JVM tuning

  • Multithreading and CPU tuning

  • Page cache

19. How does RocketMQ achieve high fault tolerance?

  • When the fault-tolerance strategy is not enabled, queues are polled in turn for sending. If a send fails, the failed Broker is skipped on retry

  • When the fault-tolerance strategy is enabled, RocketMQ's prediction mechanism estimates whether a Broker is available

  • If the Broker that failed last time is available again, a queue on that Broker is still selected

  • If none of the above works, a queue is picked at random

  • Each send records the call's latency and whether it raised an error, and the Broker's time of availability is predicted from that latency

In practice this is the queue selection made when sending a message. The source code is here:

org.apache.rocketmq.client.latency.MQFaultStrategy#selectOneMessageQueue()
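The prediction idea (slow or failed sends make a Broker unavailable for a while) can be sketched as a small table. The thresholds below are made-up values for illustration, and `LatencyFaultTable` is a hypothetical class; the real strategy lives in MQFaultStrategy and differs in detail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical latency-based availability table, not MQFaultStrategy itself.
final class LatencyFaultTable {
    // Illustrative thresholds: observed latency (ms) -> time (ms) to avoid the broker.
    private static final long[] LATENCY_MAX = {50, 100, 550, 1000};
    private static final long[] NOT_AVAILABLE = {0, 0, 30_000, 60_000};

    // Broker name -> timestamp (ms) from which it counts as available again.
    private final Map<String, Long> availableAt = new HashMap<>();

    // Record the outcome of one send; a failure gets the largest penalty.
    void record(String broker, long latencyMillis, boolean failed, long nowMillis) {
        long penalty = failed
            ? NOT_AVAILABLE[NOT_AVAILABLE.length - 1]
            : penaltyFor(latencyMillis);
        availableAt.put(broker, nowMillis + penalty);
    }

    boolean isAvailable(String broker, long nowMillis) {
        return nowMillis >= availableAt.getOrDefault(broker, 0L);
    }

    // Pick the penalty of the highest threshold that the latency reaches.
    static long penaltyFor(long latencyMillis) {
        for (int i = LATENCY_MAX.length - 1; i >= 0; i--) {
            if (latencyMillis >= LATENCY_MAX[i]) {
                return NOT_AVAILABLE[i];
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        LatencyFaultTable table = new LatencyFaultTable();
        table.record("broker-a", 600, false, 0);
        System.out.println(table.isAvailable("broker-a", 1_000));  // false
        System.out.println(table.isAvailable("broker-a", 30_000)); // true
    }
}
```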

20. What happens if a Broker suddenly goes down?

Brokers use a master-slave architecture with multiple replicas. After the master receives a message, it synchronizes it to the slave, so more than one copy exists. If the master goes down, the copy on the slave is still available, which ensures the reliability and high availability of MQ. Since version 4.5.0, RocketMQ has also supported DLedger mode, which is based on Raft and provides HA in the true sense.

21. Which NameServer does a Broker register its information with?

This is clearly a trick question. A Broker registers its information with all NameServers: not just one, but every single one!


Origin blog.csdn.net/My_SweetXue/article/details/107381108