Key concepts of MQ message middleware

1. Knowledge of MQ message middleware

1 Overview

Message queues have gradually become a core means of internal communication in enterprise IT systems. They offer low coupling, reliable delivery, broadcasting, flow control, and eventual consistency, and have become one of the main ways to implement asynchronous RPC. Many mainstream messaging middleware products are available today, such as the long-established ActiveMQ and RabbitMQ, the popular Kafka, and RocketMQ, which Alibaba developed in-house.

2 Composition of message middleware

2.1 Broker

The message server. It runs as a server and provides the core messaging service.

2.2 Producer

The message producer is the initiator of a business operation. It produces messages and sends them to the broker.

2.3 Consumer

The message consumer is the business processor. It obtains messages from the broker and runs the business logic on them.

2.4 Topic

A topic is the unified collection point for messages in publish/subscribe mode. Different producers send messages to the topic, and the MQ server distributes them to the different subscribers, which achieves message broadcasting.

2.5 Queue

A queue is used in PTP (point-to-point) mode. A specific producer sends messages to a specific queue, and a consumer subscribes to that queue to receive the messages intended for it.

2.6 Message

A message is a data packet encoded in a fixed format defined by the communication protocol in use. It encapsulates business data and carries it between systems.

3 Classification of message middleware patterns

3.1 Point to point

PTP (point-to-point): uses a queue as the communication carrier.
Note: The message producer produces messages and sends them to the queue, and message consumers then take them out of the queue and consume them.
Once a message has been consumed, it is no longer stored in the queue, so a consumer cannot consume a message that has already been consumed. A queue supports multiple consumers, but each message can be consumed by only one of them.
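
The point-to-point behavior described above can be sketched with a small in-memory queue. This is only an illustration under simple assumptions, not the API of any real broker: the class PointToPointQueue and its send and receive methods are hypothetical names, and the two consumers simply take turns.

```python
from collections import deque


class PointToPointQueue:
    """Hypothetical in-memory stand-in for a broker queue (illustration only)."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # Producer side: append the message to the tail of the queue.
        self._messages.append(message)

    def receive(self):
        # Consumer side: remove and return the oldest message, or None if empty.
        # Removing it is what prevents a second consumer from seeing it again.
        return self._messages.popleft() if self._messages else None


queue = PointToPointQueue()
for order_id in ("order-1", "order-2", "order-3"):
    queue.send(order_id)

# Two consumers take turns pulling from the same queue.
received = {"consumer-a": [], "consumer-b": []}
consumers = list(received)
turn = 0
while (message := queue.receive()) is not None:
    received[consumers[turn % 2]].append(message)
    turn += 1

print(received)
```

Every message ends up with exactly one of the two consumers, and the queue is empty afterwards.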

3.2 Publish/Subscribe

Pub/Sub (publish/subscribe, broadcast): uses a topic as the communication carrier.
Description: The message producer (publisher) publishes messages to the topic, and multiple message consumers (subscribers) consume them at the same time. Unlike the point-to-point mode, a message published to a topic is consumed by all subscribers.

A queue implements load balancing: messages produced by the producer are sent to the queue and shared among multiple consumers, but each message is accepted by only one consumer. When no consumer is available, the message is kept until one becomes available.
A topic implements publish and subscribe: when a message is published, every service subscribed to the topic receives it, so anywhere from 1 to N subscribers get a copy of the message.
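
To contrast with the queue, the topic behavior can be sketched as a fan-out to per-subscriber inboxes. This is a minimal illustration rather than a real broker API: the Topic class, its subscribe and publish methods, and the subscriber names are all hypothetical.

```python
class Topic:
    """Hypothetical in-memory topic (illustration only, not a real broker API)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        # Each subscriber gets its own inbox.
        inbox = []
        self._subscribers[name] = inbox
        return inbox

    def publish(self, message):
        # Every current subscriber receives its own copy of the message.
        for inbox in self._subscribers.values():
            inbox.append(message)


topic = Topic()
billing_inbox = topic.subscribe("billing")
shipping_inbox = topic.subscribe("shipping")

topic.publish("order-1 paid")
topic.publish("order-2 paid")

print(billing_inbox)
print(shipping_inbox)
```

Both inboxes contain both messages, whereas a queue would have split the two messages between the consumers.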

4 Advantages of message middleware

4.1 System decoupling

The interacting systems do not call each other directly; they communicate only through messages, so the integration is minimally intrusive and coupling is low.

4.2 Improve system response time

For example, in the original logic, completing a payment may involve updating the order status, calculating member points, and notifying logistics and distribution before the call can return. With an MQ-based design, the urgent and important work (which needs an immediate response) stays in the calling method, while work with low response-time requirements is placed in an MQ queue for a consumer to process.
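
The payment example can be sketched with Python's standard queue and threading modules standing in for the MQ. The function handle_payment, the task names, and the order ID are assumptions made for illustration; a real system would use a broker client instead of an in-process queue.

```python
import queue
import threading

tasks = queue.Queue()          # stands in for the MQ queue
completed = []                 # records what the background consumer did


def handle_payment(order_id):
    # Urgent step: done synchronously so the caller gets an immediate answer.
    status = f"{order_id}: PAID"
    # Non-urgent steps: only enqueued here, processed later by the consumer.
    tasks.put(("member_points", order_id))
    tasks.put(("notify_logistics", order_id))
    return status


def consumer():
    while True:
        task = tasks.get()
        if task is None:        # sentinel: stop the worker
            tasks.task_done()
            break
        completed.append(task)  # real business logic would run here
        tasks.task_done()


worker = threading.Thread(target=consumer)
worker.start()

response = handle_payment("order-42")
print(response)

tasks.put(None)
tasks.join()    # wait until all enqueued work has been processed
worker.join()
print(completed)
```

The caller receives its response as soon as the urgent step is done; the points calculation and logistics notification finish in the background.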

4.3 Provide services for big data processing architecture

With messages as the integration point, message queues are also integrated with real-time processing architectures in big data scenarios, providing performance support for data processing.

4.4 Java Message Service-JMS

The Java Message Service (JMS) application programming interface is a message-oriented middleware (MOM) API on the Java platform. It is used to send messages between two applications, or within a distributed system, for asynchronous communication.
The P2P and Pub/Sub messaging modes, point-to-point (queue) and publish/subscribe (topic), were originally defined by JMS. The main difference between the two modes is whether a message that has been sent can be consumed more than once (multiple subscriptions).

5 Application Scenarios of Message Middleware

5.1 Asynchronous communication

Some business operations do not want or need to process messages immediately. A message queue provides an asynchronous processing mechanism: a message can be put into the queue without being processed right away. You can put as many messages as you like into the queue and process them when needed.

5.2 Decoupling

Reduce strong dependencies between projects and accommodate heterogeneous systems. At the start of a project it is extremely difficult to predict its future needs. A messaging system inserts an implicit, data-based interface layer into the processing flow, and the processing on both sides implements this interface. When the application changes, either side can be extended or modified independently, as long as both continue to follow the same interface contract.
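
One way to picture the data-based interface is a shared message format that is the only thing the two sides agree on. The sketch below is hypothetical throughout: the field names, the order_paid event type, and the points rule are invented for illustration.

```python
import json


def produce_order_paid(order_id, amount_cents):
    # The producer's only obligation is to emit the agreed message format.
    event = {"type": "order_paid", "order_id": order_id, "amount_cents": amount_cents}
    return json.dumps(event)


def points_consumer(raw_message):
    # The consumer relies on the message fields, not on the producer's code.
    event = json.loads(raw_message)
    # Made-up rule for the example: 1 point per 100 cents.
    return event["order_id"], event["amount_cents"] // 100


raw = produce_order_paid("order-7", 2599)
result = points_consumer(raw)
print(result)
```

Either function can be rewritten, or moved to a system written in another language, without touching the other, as long as the message keeps the same fields.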

5.3 Redundancy

In some cases the processing of data fails, and unless the data has been persisted it is lost. Message queues persist data until it has been completely processed, which avoids this risk of data loss. In the "insert-get-delete" paradigm adopted by many message queues, your processing system must explicitly indicate that a message has been processed before the message is deleted from the queue, which ensures that your data is stored safely until you have finished using it.
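
The "insert-get-delete" paradigm can be sketched as a queue that keeps a delivered message until it is explicitly acknowledged. This is a simplified in-memory model, not a real broker: the AckQueue class and the requeue_unacked method are hypothetical, and an actual queue would persist messages to disk.

```python
class AckQueue:
    """Hypothetical queue that deletes a message only after an explicit ack."""

    def __init__(self):
        self._pending = []      # messages waiting for delivery
        self._in_flight = {}    # delivered but not yet acknowledged
        self._next_id = 0

    def insert(self, body):
        self._pending.append((self._next_id, body))
        self._next_id += 1

    def get(self):
        # Hand out the oldest message but keep a copy until it is acked.
        if not self._pending:
            return None
        msg_id, body = self._pending.pop(0)
        self._in_flight[msg_id] = body
        return msg_id, body

    def delete(self, msg_id):
        # Explicit acknowledgement: only now is the message really gone.
        self._in_flight.pop(msg_id)

    def requeue_unacked(self):
        # Called when a consumer fails: unacked messages become available again.
        for msg_id in sorted(self._in_flight):
            self._pending.append((msg_id, self._in_flight[msg_id]))
        self._in_flight.clear()


q = AckQueue()
q.insert("charge card")

msg_id, body = q.get()
q.requeue_unacked()          # simulate a crash before the ack was sent

redelivered_id, redelivered_body = q.get()
q.delete(redelivered_id)     # processing succeeded this time
leftover = q.get()
print(redelivered_body, leftover)
```

The message survives the simulated consumer failure and is only removed once delete is called.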

5.4 Scalability

Because the message queue decouples your processing, it is easy to increase the rate at which messages are enqueued and processed: simply add more processing instances. There is no need to change code or tune parameters, which makes distributed scale-out straightforward.

5.5 Overload protection

When traffic surges, applications still need to keep working, but such bursts are hard to predict, and keeping resources on standby sized for peak traffic would be a huge waste. A message queue lets key components withstand sudden access pressure instead of collapsing completely under a burst of requests.
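
The effect of a queue on a traffic spike can be shown with a small simulation. The numbers are made up for illustration: requests arrive in ticks, and the consumer is assumed to handle at most a fixed number per tick.

```python
from collections import deque


def simulate(arrivals_per_tick, capacity_per_tick):
    """Return (backlog after each tick, requests processed in each tick)."""
    backlog = deque()
    backlog_sizes, processed_counts = [], []
    for arrivals in arrivals_per_tick:
        backlog.extend(range(arrivals))            # the burst lands in the queue
        processed = min(capacity_per_tick, len(backlog))
        for _ in range(processed):                 # consumer works at its own pace
            backlog.popleft()
        backlog_sizes.append(len(backlog))
        processed_counts.append(processed)
    return backlog_sizes, processed_counts


# A spike of 50 requests hits a service that can handle 10 per tick.
backlog_sizes, processed_counts = simulate([5, 50, 5, 0, 0, 0, 0], capacity_per_tick=10)
print(backlog_sizes)
print(processed_counts)
```

The consumer never sees more than 10 requests per tick; the spike is absorbed by the backlog, which drains over the following ticks.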

5.6 Recoverability

When one part of the system fails, the failure does not affect the whole system. Because the message queue reduces coupling between processes, even if a message-processing process dies, the messages already added to the queue can still be processed after the system recovers.

5.7 Order guarantee

In many usage scenarios, the order in which data is processed matters. Many message queues are ordered by nature and can guarantee that data is processed in a specific order, although the scope of that guarantee varies by product.

5.8 Buffer

In any significant system, some components need different processing times. A message queue provides a buffer layer that helps tasks run as efficiently as possible. This buffering helps control and optimize the speed at which data flows through the system, and thus the system's response time.

5.9 Data stream processing

Distributed systems generate massive data streams, such as business logs, monitoring data, and user behavior. Collecting and aggregating these streams in real time or in batches, and then running big data analysis on them, is an essential capability for today's Internet services. A message queue is the best choice for this kind of data collection.

6 Common protocols for message middleware

6.1 AMQP protocol

AMQP (Advanced Message Queuing Protocol) is an application-layer standard that provides unified messaging services. It is an open standard designed for message-oriented middleware. Clients and message middleware that implement this protocol can exchange messages regardless of the client or middleware product, the development language, and other such conditions.
Advantages: reliable and general-purpose

6.2 MQTT protocol

MQTT (Message Queuing Telemetry Transport) is a lightweight publish/subscribe messaging protocol developed by IBM that may become an important part of the Internet of Things. It is supported on a wide range of platforms, can connect almost any networked object to the outside world, and is used as a communication protocol for sensors and actuators (for example, getting a house online via Twitter).
Advantages: simple format, low bandwidth usage, and a good fit for mobile communication, push, and embedded systems

6.3 STOMP protocol

STOMP (Simple Text Oriented Messaging Protocol, also expanded as Streaming Text Oriented Messaging Protocol) is a simple text-based protocol designed for MOM (message-oriented middleware). STOMP provides an interoperable wire format that allows STOMP clients to interact with any STOMP message broker.
Advantages: command-based model (rather than a topic/queue model)
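
To make the text-based, command-oriented format concrete: a STOMP frame consists of a command line, header lines in name:value form, a blank line, an optional body, and a terminating NUL character. The sketch below encodes and decodes such a frame. The helper names and the destination /queue/orders are hypothetical, and header escaping and the content-length header are deliberately left out.

```python
def build_stomp_frame(command, headers, body=""):
    """Encode a STOMP frame: command line, header lines, blank line, body, NUL."""
    lines = [command]
    lines += [f"{name}:{value}" for name, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body + "\x00"


def parse_stomp_frame(frame):
    """Decode a frame produced by build_stomp_frame (no header escaping handled)."""
    head, _, rest = frame.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, rest.rstrip("\x00")


frame = build_stomp_frame(
    "SEND",
    {"destination": "/queue/orders", "content-type": "text/plain"},
    "hello",
)
print(repr(frame))
print(parse_stomp_frame(frame))
```

Because the frame is plain text, it can be inspected or produced with ordinary string handling, which is a large part of why STOMP clients are easy to write.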

6.4 XMPP protocol

XMPP (Extensible Messaging and Presence Protocol) is a protocol based on Extensible Markup Language (XML), used mostly for instant messaging (IM) and presence detection. It is suited to near-real-time operation between servers, and its core is XML streaming. The protocol may eventually allow Internet users to send instant messages to anyone else on the Internet, even if their operating systems and browsers differ.
Advantages: general-purpose, open, highly compatible, extensible, and secure. The drawback is that the XML encoding takes up a lot of bandwidth.

6.5 Other custom protocols based on TCP/IP

Some frameworks (such as Redis, Kafka, and ZeroMQ) do not strictly follow an MQ specification. Instead, to suit their own needs, they define their own protocols on top of TCP/IP and transmit over network sockets to provide MQ functionality.

7 Introduction to common message middleware MQ

7.1 RocketMQ

A distributed message middleware based on the queue model, developed by Alibaba. It was originally named MetaQ and was renamed RocketMQ as of version 3.0. It is an MQ implemented by Alibaba in Java, following Kafka's design philosophy. It also consolidates several of Alibaba's internal MQ products (Notify, MetaQ), keeps only the core functions, and removes all other runtime dependencies so that the core stays as lean as possible. On this basis, it works with Alibaba's other open source products to cover different scenarios. It is currently used mainly in order and trading systems.

It has the following characteristics:

  • Can guarantee strict message ordering
  • Provides message filtering
  • Provides rich message pull modes
  • Efficient horizontal scaling of subscribers
  • Real-time message subscription mechanism
  • Billion-level message accumulation capability

The official documentation describes how it differs from Kafka:
https://rocketmq.apache.org/docs/motivation/

7.2 RabbitMQ

An open source message queue written in Erlang. It supports many protocols: AMQP, XMPP, SMTP, and STOMP. Because of this, it is quite heavyweight and better suited to enterprise-level development. It implements a broker architecture: the core idea is that the producer does not send messages directly to a queue, and messages are queued centrally in the broker before being sent to the client. It has good support for routing, load balancing, and data persistence, and is mostly used for enterprise-level ESB integration.

7.3 ActiveMQ

A sub-project under Apache. It is a JMS provider written in Java that fully supports the JMS 1.1 and J2EE 1.4 specifications, and it can implement advanced application scenarios efficiently with a small amount of code. It supports pluggable transport protocols, such as in-VM, TCP, SSL, NIO, UDP, multicast, JGroups, and JXTA. RabbitMQ, ZeroMQ, and ActiveMQ all support clients in commonly used languages, including C++, Java, .NET, Python, PHP, and Ruby.

7.4 Redis

A key-value NoSQL database written in C, with very active development and maintenance. Although it is a key-value storage system, it supports MQ functions, so it can be used as a lightweight queue service. In one test, enqueue and dequeue operations were each executed 1 million times against RabbitMQ and Redis, with the execution time recorded every 100,000 operations. The test data came in four sizes: 128 bytes, 512 bytes, 1 KB, and 10 KB. The experiment showed that for enqueue, Redis outperforms RabbitMQ when the data is small, but becomes unbearably slow if the data size exceeds 10 KB; for dequeue, Redis performs very well regardless of data size, and RabbitMQ's dequeue performance is much lower than Redis's.

7.5 Kafka

A sub-project under Apache: a high-performance distributed publish/subscribe message queue system implemented in Scala. It has the following characteristics:

  • Fast persistence: through sequential disk reads and writes and a zero-copy mechanism, messages can be persisted with O(1) system overhead;

  • High throughput: an ordinary server can reach a throughput of 100,000 messages per second (10W/s);

  • High accumulation: large message backlogs are supported, for example when the consumers of a topic are offline for a long time;

  • Fully distributed system: Broker, Producer, and Consumer all natively support distribution and rely on ZooKeeper to achieve load balancing automatically;

  • Support for parallel data loading into Hadoop: for log data and offline analysis systems such as Hadoop that also require real-time processing, this is a feasible solution.
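
The ideas behind fast persistence and high accumulation can be sketched as an append-only log that consumers read by offset. This is a toy in-memory model, not Kafka's implementation: the AppendOnlyLog class is hypothetical, offsets here are simply list indices, and a real broker appends to files on disk.

```python
class AppendOnlyLog:
    """Hypothetical in-memory log; a real broker would append to files on disk."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # Writing at the tail costs the same regardless of how large the log is.
        self._records.append(record)
        return len(self._records) - 1   # offset of the new record

    def read(self, offset, max_records=10):
        # Reading does not delete anything; each consumer just remembers its offset.
        return self._records[offset:offset + max_records]


log = AppendOnlyLog()
for i in range(5):
    log.append(f"event-{i}")

# A consumer that was offline starts from its last saved offset and catches up.
consumer_offset = 2
batch = log.read(consumer_offset)
consumer_offset += len(batch)
print(batch, consumer_offset)
```

Because reads never remove records, messages can pile up while a consumer is offline, and the consumer resumes from where it stopped.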

7.6 ZeroMQ

Known as the fastest message queuing system, it was developed specifically for high-throughput, low-latency scenarios. It is often used in financial applications and focuses on real-time data communication. ZMQ can implement advanced or complex queues that RabbitMQ is not good at, but developers have to combine multiple technical frameworks themselves, so the development cost is high. ZeroMQ has a unique non-middleware model and is more like a socket library: you do not need to install and run a message server or middleware, because your application itself takes on that service role through the ZeroMQ API. However, ZeroMQ provides only non-persistent queues, so data is lost if the machine goes down. As an example, Twitter's Storm uses ZeroMQ for data stream transmission.

ZeroMQ sockets are independent of the transport layer: a ZeroMQ socket defines a unified API for all transport protocols. In-process (inproc), inter-process (IPC), multicast, and TCP transports are supported by default. To switch between them, simply change the prefix of the connection string, so you can move from local inter-process communication to distributed TCP communication at any time with minimal cost. ZeroMQ handles connection establishment, disconnection, and reconnection behind the scenes.
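
The connection strings mentioned above take the form transport://address. The sketch below does not use the ZeroMQ library; it only parses example endpoint strings to show that the transport is selected by the prefix. The function describe_endpoint and the example addresses are hypothetical.

```python
# Transport prefixes as they appear in ZeroMQ endpoint strings.
TRANSPORTS = {
    "inproc": "in-process (between threads)",
    "ipc": "inter-process (same machine)",
    "tcp": "TCP (across machines)",
    "pgm": "multicast",
    "epgm": "multicast encapsulated in UDP",
}


def describe_endpoint(endpoint):
    """Split 'transport://address' and describe the transport (illustration only)."""
    transport, sep, address = endpoint.partition("://")
    if not sep or transport not in TRANSPORTS:
        raise ValueError(f"unsupported endpoint: {endpoint!r}")
    return transport, address, TRANSPORTS[transport]


# Moving from local to distributed communication changes only the endpoint string.
for endpoint in ("inproc://orders", "ipc:///tmp/orders.ipc", "tcp://127.0.0.1:5555"):
    print(describe_endpoint(endpoint))
```

In an application, the same socket code would stay unchanged and only the endpoint string passed to bind or connect would differ.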

Characteristics:

  • Lock-free queue model: the pipe used to exchange data between threads (user side and session) uses a lock-free queue algorithm based on CAS. Asynchronous events are registered at both ends of the pipe, and reading from or writing to the pipe automatically triggers the corresponding read or write events.
  • Batch processing algorithm: batches of messages are handled with adaptive optimization, so messages can be received and sent in batches.
  • Thread binding on multi-core machines, no CPU switching: unlike traditional multi-threaded concurrency based on semaphores or critical sections, ZeroMQ makes full use of multiple cores by binding one worker thread to each core, which avoids the CPU switching overhead between threads.

2. Comparison of main message middleware


Overall choice: RabbitMQ

[Message Queuing MQ] Comparison of various types of MQ

At present, there are many MQ products in the industry. We make the following comparison:

RabbitMQ

It is an open source message queue written in Erlang. It supports many protocols: AMQP, XMPP, SMTP, and STOMP. Because of this, it is quite heavyweight and better suited to enterprise-level development. It implements a broker architecture, which means that messages are queued in a central queue before they are sent to the client. It has good support for routing, load balancing, and data persistence.

Redis

It is a key-value NoSQL database with very active development and maintenance. Although it is a key-value storage system, it supports MQ functions, so it can be used as a lightweight queue service. In one test, enqueue and dequeue operations were each executed 1 million times against RabbitMQ and Redis, with the execution time recorded every 100,000 operations. The test data came in four sizes: 128 bytes, 512 bytes, 1 KB, and 10 KB. The experiment showed that for enqueue, Redis outperforms RabbitMQ when the data is small, but becomes unbearably slow if the data size exceeds 10 KB; for dequeue, Redis performs very well regardless of data size, and RabbitMQ's dequeue performance is much lower than Redis's.

Operation  Data size  Redis  RabbitMQ
Enqueue    128B       16088  10627
Enqueue    512B       15961  9916
Enqueue    1K         17094  9370
Enqueue    10K        25     2366
Dequeue    128B       15955  3219
Dequeue    512B       20449  3174
Dequeue    1K         18098  2982
Dequeue    10K        9355   1588
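
To make the comparison easier to read, the ratio between the Redis and RabbitMQ figures can be computed for each row. The figures are copied from the comparison above; the source does not state their unit, but higher appears to mean faster, since the Redis value of 25 at 10K is the case the text calls unbearably slow. The variable names are chosen for this example.

```python
# Figures copied from the Redis vs. RabbitMQ comparison; the source gives no unit.
results = {
    ("enqueue", "128B"): (16088, 10627),
    ("enqueue", "512B"): (15961, 9916),
    ("enqueue", "1K"): (17094, 9370),
    ("enqueue", "10K"): (25, 2366),
    ("dequeue", "128B"): (15955, 3219),
    ("dequeue", "512B"): (20449, 3174),
    ("dequeue", "1K"): (18098, 2982),
    ("dequeue", "10K"): (9355, 1588),
}

# A ratio above 1 means the Redis figure is higher than the RabbitMQ figure.
ratios = {key: round(redis / rabbitmq, 2) for key, (redis, rabbitmq) in results.items()}

for (operation, size), ratio in ratios.items():
    print(f"{operation:<8}{size:<6}Redis/RabbitMQ = {ratio}")
```

The only row where the ratio falls below 1 is enqueue at 10K, which matches the observation in the text.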

ZeroMQ

Known as the fastest message queuing system, especially for scenarios that demand high throughput. ZMQ can implement advanced or complex queues that RabbitMQ is not good at, but developers have to combine multiple technical frameworks themselves, and this technical complexity is a challenge to applying the MQ successfully. ZeroMQ has a unique non-middleware model: you do not need to install and run a message server or middleware, because your application plays that service role. You only need to reference the ZeroMQ library, which can be installed using NuGet, and then you can send messages between applications. However, ZeroMQ provides only non-persistent queues, which means that data is lost if the machine goes down. Twitter's Storm, for example, uses ZeroMQ for data stream transmission.

ActiveMQ

It is a sub-project under Apache. Similar to ZeroMQ, it can implement queues with broker and peer-to-peer technology. Similar to RabbitMQ, it can implement advanced application scenarios efficiently with a small amount of code. RabbitMQ, ZeroMQ, and ActiveMQ all support clients in commonly used languages, including C++, Java, .NET, Python, PHP, and Ruby.

Jafka / Kafka

Kafka is a sub-project under Apache and a high-performance, cross-language, distributed publish/subscribe message queue system. Jafka was incubated on top of Kafka as an upgraded version of it. It has the following characteristics: fast persistence, with messages persisted at O(1) system overhead; high throughput, reaching 100,000 messages per second (10W/s) on an ordinary server; a fully distributed design, in which Broker, Producer, and Consumer all natively support distribution and load balancing is automatic; and support for parallel data loading into Hadoop, which makes it a feasible solution for log data and offline analysis systems such as Hadoop that also require real-time processing. Kafka uses Hadoop's parallel loading mechanism to unify online and offline message processing, which is also valued by the system studied in this topic. Compared with ActiveMQ, Apache Kafka is a very lightweight messaging system; besides very good performance, it is also a distributed system that works well.

Some other queues, such as HornetQ, Apache Qpid, Sparrow, Starling, Kestrel, Beanstalkd, and Amazon SQS, are not analyzed one by one here.


Origin blog.csdn.net/virtual_users/article/details/109677832