Why use a message queue? What are the advantages and disadvantages of message queues? What are the differences between Kafka, ActiveMQ, RabbitMQ, and RocketMQ, and which scenarios is each suitable for?

Why use a message queue?

What the interviewer is really asking is: what are the usage scenarios for message queues, which specific scenario exists in your project, and how do you use a message queue in that scenario?

One expected answer describes a business scenario at your company and the technical challenge it poses: handling it without MQ would be very troublesome, and introducing MQ has brought you a lot of benefits.

Let's first talk about the common usage scenarios of message queues. There are many, but three are core: decoupling, asynchronous processing, and peak clipping.

Decoupling

Look at this scenario. System A sends data to three systems, B, C, and D, through interface calls. What if system E also needs this data? What if system C no longer needs it? The person in charge of system A is close to collapse...

In this scenario, system A is tightly coupled with a tangle of other systems. System A generates a fairly critical piece of data, and many systems need system A to send it to them. System A must constantly think about B, C, D, and E: what if one of the four fails? Should the message be resent, or saved for later? It is enough to turn your hair white!

If MQ is used, system A generates a piece of data and sends it to MQ, and whichever system needs the data consumes it from MQ. A new system that needs the data just consumes it from MQ; a system that no longer needs it just cancels its consumption of the MQ messages. System A no longer has to consider who to send data to, maintain that calling code, or handle call success, failure, and timeouts.

Summary: With MQ and its Pub/Sub (publish/subscribe) messaging model, system A is completely decoupled from the other systems.
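The publish/subscribe idea can be sketched in a few lines of Python. This is only an in-process illustration, not a real broker: the `Broker` class, the `orders` topic, and the consumer names are all hypothetical, and delivery here is a plain synchronous function call, whereas a real MQ stores messages and delivers them asynchronously.

```python
from collections import defaultdict


class Broker:
    """Toy stand-in for MQ (hypothetical): routes messages by topic."""

    def __init__(self):
        # topic -> {subscriber name: callback}
        self._subscribers = defaultdict(dict)

    def subscribe(self, topic, name, callback):
        self._subscribers[topic][name] = callback

    def unsubscribe(self, topic, name):
        self._subscribers[topic].pop(name, None)

    def publish(self, topic, message):
        # The publisher does not know, or care, who is subscribed.
        for callback in list(self._subscribers[topic].values()):
            callback(message)


received = defaultdict(list)  # system name -> messages it consumed


def make_consumer(name):
    def on_message(message):
        received[name].append(message)
    return on_message


broker = Broker()
for system in ("B", "C", "D"):
    broker.subscribe("orders", system, make_consumer(system))

# System A publishes once; B, C and D each get a copy.
broker.publish("orders", {"order_id": 1})

# System E starts needing the data and system C stops needing it.
# Neither change touches system A's code.
broker.subscribe("orders", "E", make_consumer("E"))
broker.unsubscribe("orders", "C")
broker.publish("orders", {"order_id": 2})

for system in ("B", "C", "D", "E"):
    print(system, [m["order_id"] for m in received[system]])
```

The point of the sketch is that adding E and removing C only changes the subscription calls; the `publish` call in system A stays the same.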

Interview tip: Consider whether the system you are responsible for has a similar scenario, that is, a system or module that calls multiple other systems or modules, where the calls are complicated and troublesome to maintain. If those calls do not actually need to be synchronous interface calls, they can be decoupled asynchronously with MQ. Work out whether MQ can decouple the systems in your project, and reflect that in your resume.

Asynchronous

Let's look at another scenario. System A receives a request and needs to write to its local database, and it also needs systems B, C, and D to write to theirs. The local write takes 3 ms, and the writes in B, C, and D take 300 ms, 450 ms, and 200 ms respectively. The total request latency is 3 + 300 + 450 + 200 = 953 ms, close to 1 s, which feels extremely slow to the user. A user who sends a request from the browser and waits 1 s for a response will find that almost unacceptable.

At a typical Internet company, the general requirement for direct user operations is that each request completes within 200 ms, which is almost imperceptible to users.

If MQ is used, system A sends 3 messages in a row to the MQ queue. If that takes 5 ms, the total time from system A accepting the request to returning a response to the user is 3 + 5 = 8 ms. To the user, it feels like clicking a button and getting a result 8 ms later. Great! The website is so well made and so fast!
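The two latency figures, and the "enqueue and return" pattern behind the second one, can be sketched as follows. The timings are the ones from the text; everything else (the in-process `queue.Queue` standing in for MQ, the `handle_request` and `consumer` names, the order id) is an assumption made for illustration.

```python
import queue
import threading

# Figures from the text, in milliseconds.
LOCAL_WRITE_MS = 3
DOWNSTREAM_WRITE_MS = {"B": 300, "C": 450, "D": 200}
ENQUEUE_3_MESSAGES_MS = 5


def sync_latency_ms():
    """System A calls B, C and D one after another and waits for each."""
    return LOCAL_WRITE_MS + sum(DOWNSTREAM_WRITE_MS.values())


def async_latency_ms():
    """System A writes locally, enqueues three messages, and returns."""
    return LOCAL_WRITE_MS + ENQUEUE_3_MESSAGES_MS


# In-process stand-in for MQ (hypothetical; a real broker is a separate service).
mq = queue.Queue()
written = []  # (system, order_id) pairs whose downstream write has finished


def handle_request(order_id):
    # The local database write would happen here.
    for system in DOWNSTREAM_WRITE_MS:
        mq.put((system, order_id))
    return "ok"  # respond without waiting for B, C and D


def consumer():
    while True:
        item = mq.get()
        if item is None:  # sentinel: stop the worker
            mq.task_done()
            break
        written.append(item)  # the slow downstream write would happen here
        mq.task_done()


worker = threading.Thread(target=consumer, daemon=True)
worker.start()

response = handle_request(42)
mq.join()      # only the demo waits here; the user already has a response
mq.put(None)
worker.join()

print(sync_latency_ms(), "ms synchronous vs", async_latency_ms(), "ms with MQ")
```

Note that the downstream writes still take their 300 ms, 450 ms, and 200 ms; they simply happen after the user has received a response.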

Peak clipping

From 0:00 to 12:00 every day, system A is quiet, with about 50 requests per second. But from 12:00 to 13:00 every day, the request rate suddenly jumps to 5k+ per second. The system works directly against MySQL, so all of these requests pour into MySQL and about 5k SQL statements are executed on it every second.

A typical MySQL instance can handle about 2k requests per second. If the rate reaches 5k per second, MySQL may simply go down, crashing the system so that users can no longer use it.

Once the peak is over, the afternoon is off-peak again. There may be only 10,000 users on the website at the same time, and perhaps only 50 requests per second, which puts almost no pressure on the system.

If MQ is used, 5k requests are written to MQ per second, while system A processes at most 2k requests per second, because that is all MySQL can handle. System A pulls requests from MQ at its own pace, 2k per second, never exceeding the maximum it can process, so system A will not go down even during the peak. MQ takes in 5k requests per second and only 2k go out, so during the one-hour midday peak the backlog in MQ can grow to hundreds of thousands or even millions of requests (about 10.8 million if the full 5k rate were sustained for the whole hour).

This short-lived backlog is fine, because after the peak only 50 requests enter MQ per second while system A keeps processing 2k per second. As soon as the peak is over, system A quickly works through the backlog.
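The backlog arithmetic can be checked with a rough calculation. The rates are the ones from the text; the function names are hypothetical, and the sketch assumes the worst case where the full 5k requests per second arrive for the entire hour, which real traffic rarely does.

```python
def backlog_after_peak(arrival_per_s, capacity_per_s, peak_seconds):
    """Messages left in MQ when the peak ends (arrivals minus what was processed)."""
    return max(0, arrival_per_s - capacity_per_s) * peak_seconds


def drain_seconds(backlog, off_peak_arrival_per_s, capacity_per_s):
    """Time to clear the backlog while off-peak traffic keeps arriving."""
    net_drain_per_s = capacity_per_s - off_peak_arrival_per_s
    if net_drain_per_s <= 0:
        raise ValueError("the consumer can never catch up at these rates")
    return backlog / net_drain_per_s


# 5k in, 2k out, for one hour.
peak_backlog = backlog_after_peak(5000, 2000, 3600)

# After the peak: 50 in, 2k out.
drain = drain_seconds(peak_backlog, 50, 2000)

print(f"backlog at end of peak: {peak_backlog:,} messages")
print(f"time to drain: {drain / 60:.1f} minutes")
```

Under these worst-case assumptions the queue holds 10.8 million messages at 13:00 and is empty again roughly an hour and a half later, so the broker needs enough storage for the backlog and the business must tolerate that processing delay.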

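What are the advantages and disadvantages of message queues?

The advantages are the ones described above: decoupling, asynchronous processing, and peak clipping in the scenarios that call for them. The disadvantages are as follows: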

  • Reduced system availability

The more external dependencies a system introduces, the more easily it fails. Originally, system A simply called the interfaces of systems B, C, and D; as long as the four systems A, B, C, and D were healthy, there was no problem. Once you add MQ, what happens if MQ fails? If MQ goes down, the whole system goes down with it. So how do you ensure the high availability of the message queue?

  • Increased system complexity

Once MQ is added, how do you ensure that messages are not consumed more than once? How do you handle message loss? How do you preserve message order? These are a lot of problems and a lot of headaches.

  • Consistency issue

System A returns success as soon as it finishes its own processing, and the user assumes the request succeeded. But what if, of the three systems B, C, and D, systems B and D write to their databases successfully and system C fails? The data is now inconsistent.

So a message queue is actually a fairly complex piece of architecture. Introducing it brings many benefits, but you also need additional technical solutions and architecture to offset its disadvantages. Once that is done, you will find that system complexity has gone up by an order of magnitude, perhaps 10 times more complicated. At critical moments, though, you still have to use it.
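As one example of the extra work involved, the repeated-consumption question raised under "Increased system complexity" is commonly handled by making the consumer idempotent, so that a redelivered message has no additional effect. The sketch below is a minimal illustration under assumptions that are not in the text: each message carries a unique `id`, and the set of processed ids lives in memory (a real system would keep it in a database or cache, ideally updated in the same transaction as the business write). The class and field names are hypothetical.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by its unique id."""

    def __init__(self):
        self._seen_ids = set()  # ids of messages already applied
        self.balance = 0        # example business state

    def handle(self, message):
        """Return True if the message was applied, False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self._seen_ids:
            return False  # already processed: ignore the redelivery
        self.balance += message["amount"]
        self._seen_ids.add(msg_id)
        return True


consumer = IdempotentConsumer()

# The broker redelivers message m1, for example after a lost acknowledgement.
deliveries = [
    {"id": "m1", "amount": 100},
    {"id": "m2", "amount": 50},
    {"id": "m1", "amount": 100},
]
results = [consumer.handle(m) for m in deliveries]

print(results, consumer.balance)
```

Without the id check the balance would end at 250 instead of 150. Message loss and ordering need their own, separate measures.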

What are the advantages and disadvantages of Kafka, ActiveMQ, RabbitMQ, and RocketMQ?

| Feature | ActiveMQ | RabbitMQ | RocketMQ | Kafka |
| --- | --- | --- | --- | --- |
| Single-machine throughput | On the order of 10,000, one order of magnitude lower than RocketMQ and Kafka | Same as ActiveMQ | On the order of 100,000, supporting high throughput | On the order of 100,000, high throughput; generally used with big data systems for real-time computing, log collection, and similar scenarios |
| Impact of the number of topics on throughput | - | - | Topics can reach the hundreds or thousands with only a slight drop in throughput. This is a major advantage of RocketMQ: a single machine can support a large number of topics | When the number of topics grows from dozens to hundreds, throughput drops significantly. On a single machine, Kafka should keep the number of topics small; supporting topics at large scale requires adding more machines |
| Timeliness | Millisecond level | Microsecond level; this is a major feature of RabbitMQ, which has the lowest latency | Millisecond level | Latency within the millisecond level |
| Availability | High; achieves high availability with a master-slave architecture | Same as ActiveMQ | Very high; distributed architecture | Very high; distributed, with multiple replicas of each piece of data, so a few machines going down causes no data loss or unavailability |
| Message reliability | A low probability of losing data | Basically no loss | Zero loss can be achieved after parameter tuning and configuration | Same as RocketMQ |
| Feature support | Extremely complete feature set for the MQ domain | Built on Erlang, so it has strong concurrency, excellent performance, and low latency | Relatively complete MQ features; distributed, with good scalability | Relatively simple features, mainly basic MQ functions; used at large scale for real-time computing and log collection in the big data field |

In summary, after various comparisons, we have the following suggestions:

General business systems that need MQ started out with ActiveMQ, but not many people use it now. It has not been proven in large-scale, high-throughput scenarios, and the community is not very active, so I personally do not recommend it;

Later, everyone started to use RabbitMQ. The Erlang language keeps most Java engineers from studying and mastering it in depth, so for a company it is close to uncontrollable, but it is open source, offers relatively stable support, and has a very active community;

Now more and more companies are using RocketMQ, which is indeed very good. It comes from Alibaba, but there is a risk that the community could suddenly go dormant (RocketMQ has since been donated to Apache, but its activity on GitHub is actually not high). If you have absolute confidence in your company's technical strength, RocketMQ is recommended. Otherwise, fall back on RabbitMQ, whose active open source community is very unlikely to be abandoned.

Therefore, for small and medium-sized companies with average technical strength and no especially demanding technical challenges, RabbitMQ is a good choice; for large companies with strong infrastructure R&D capabilities, RocketMQ is a good choice.

For real-time computing, log collection, and similar scenarios in the big data field, Kafka is the industry standard and is a safe choice. The community is very active and very unlikely to be abandoned, and Kafka is close to a de facto standard in this field worldwide.

PS: Study notes, architecture notes from Hushan

Origin blog.csdn.net/qq_31686241/article/details/125731476