How to design message middleware? The overall architecture of message middleware

MQ concept

1. Message

A message is the smallest concept in MQ; it is essentially a piece of data that can be understood by one or more applications, and it is the information carrier passed between applications.

2. Queue

2.1 Local queue

By function, local queues can be divided into initiation queues, transmission queues, target queues, and dead letter queues.

The initiation queue is used to trigger message processing.

The transmission queue only temporarily stores messages in transit; when conditions permit, messages are moved through a channel to another queue manager.

The target queue is the destination of messages and can store them long-term.

If a message cannot be delivered to its target queue and can no longer be routed anywhere else, it is automatically placed in the dead letter queue.

2.2 Alias Queue & Remote Queue

These are only queue definitions that point to a queue on a remote queue manager. With a remote queue, the program does not need to know the physical location of the target queue.

2.3 Model queue

A model queue defines a set of local queue attributes. When a model queue is opened, the queue manager dynamically creates a local queue from those attributes.

3. Queue Manager

A queue manager is the component responsible for providing message services to applications. If you compare a queue manager to a database, then a queue is one of its tables.

4. Channel

A channel is a unidirectional point-to-point communication connection between two queue managers. If two-way communication is required, a pair of channels is established.

5. Listener

A listener waits for incoming network connections on a port and starts the corresponding channel when one arrives.

Features of MQ products

Reliable transmission

This feature is the foundation of message middleware: once an application successfully hands data to the middleware, reliable delivery becomes the middleware's responsibility.

No duplicate transmission

No duplicate transmission means supporting resume-from-breakpoint: interrupted transfers pick up where they left off instead of resending everything, which is especially useful on unstable networks and saves network resources.

Asynchronous transmission

Asynchronous transmission means the sender and receiver do not have to be online at the same time; the middleware provides offline capability and security.

Message-driven

After the middleware receives a message, it actively notifies the message recipient.

Support transaction

An application can combine several data updates into one unit of work. These updates are usually logically related; to preserve data integrity, all of them must succeed or fail together.
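This all-or-nothing unit of work can be sketched in a few lines of plain Java. This is only an illustration of the idea, not any real MQ API; the `TxSession` class and its methods are invented for this example. Sends are buffered locally and only become visible on commit:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative sketch of a transactional unit of work (names are invented):
// sends are buffered locally and only become visible to consumers on
// commit(); rollback() discards the whole batch, so updates succeed or
// fail together.
class TxSession {
    private final Queue<String> queue;                     // the destination queue
    private final List<String> pending = new ArrayList<>();

    TxSession(Queue<String> queue) { this.queue = queue; }

    // buffered locally; consumers cannot see it yet
    void send(String msg) { pending.add(msg); }

    // publish every buffered message atomically
    void commit() {
        queue.addAll(pending);
        pending.clear();
    }

    // discard the whole unit of work
    void rollback() { pending.clear(); }
}
```

A JMS transacted session works the same way conceptually: messages sent within the session are delivered only when `commit()` is called.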


Introduction to MQ applicable scenarios

The MQ message queue grew out of the idea of loose coupling. It uses queues and publish/subscribe as its message transmission mechanisms, and it is a basic product for reliably delivering messages to consumers asynchronously.

It is widely used between cross-platform and cross-system distributed systems to provide them with efficient and reliable asynchronous transmission mechanisms.

  • Message Channel

    Use MQ to connect clients and servers that collaborate with each other so that they can exchange messages.


If the client and server need safe, reliable interaction, an MQ queue can act as a secure channel through which they communicate asynchronously, safely, and efficiently.

  • Message Bus

    For a distributed system composed of many independently developed services, the services must interact reliably to form a complete system; at the same time, for the robustness of the system, there should be no overly tight dependencies between them. A message bus connects the different services and lets them transfer data asynchronously.
  • Message routing (Message Router)

    Message routing forwards messages sent to a designated MQ queue on to different queues according to rules.


In addition, the JMS specification supports filtering messages with selector conditions: multiple consumers can consume from the same queue, each receiving only the messages it is interested in.

  • Publish / Subscribe (Publisher / Subscriber)

    The publish / subscribe mode is used for one-to-many communication. When a message publisher sends a message to a topic, all subscribers of the topic will receive the message.
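The selector-based filtering mentioned above can be mimicked in a few lines of plain Java. This is only a toy model: the `Msg` class and `consume` helper are invented here, while real JMS expresses the filter as an SQL-like string passed to `createConsumer`, e.g. `"type = 'order'"`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy illustration of JMS-style message selectors: each consumer declares a
// predicate over a message attribute and only receives matching messages.
class SelectorDemo {
    static class Msg {
        final String type;
        final String body;
        Msg(String type, String body) { this.type = type; this.body = body; }
    }

    // deliver only the messages this consumer is interested in
    static List<Msg> consume(List<Msg> queue, Predicate<Msg> selector) {
        List<Msg> out = new ArrayList<>();
        for (Msg m : queue) {
            if (selector.test(m)) out.add(m);
        }
        return out;
    }
}
```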

The simplest possible message middleware

You must have thought of it already: the queue!

ArrayBlockingQueue can serve as a simple MQ:

use put() to send a message and take() to consume one.

To support multiple topics, we wrap ArrayBlockingQueue in a map:

Map<String, ArrayBlockingQueue<Message>> topics = new HashMap<>();

Such a simple MQ is now realized.
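Putting the pieces together, a minimal sketch of this single-process, multi-topic MQ might look as follows (class and method names are our own; `ConcurrentHashMap` is used so topics can be created safely from multiple threads):

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// A toy in-memory MQ: one bounded blocking queue per topic.
class SimpleMQ {
    private final Map<String, ArrayBlockingQueue<String>> topics = new ConcurrentHashMap<>();
    private final int capacity;

    SimpleMQ(int capacity) { this.capacity = capacity; }

    private ArrayBlockingQueue<String> topic(String name) {
        return topics.computeIfAbsent(name, t -> new ArrayBlockingQueue<>(capacity));
    }

    // put() blocks when the topic queue is full, giving back-pressure for free
    void send(String topic, String msg) throws InterruptedException {
        topic(topic).put(msg);
    }

    // take() blocks when the topic queue is empty
    String consume(String topic) throws InterruptedException {
        return topic(topic).take();
    }
}
```

Note that every message lives only in the JVM heap, which is exactly the first weakness discussed next.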

But this MQ is far too simple. Its shortcomings:

Messages can be lost: if the JVM crashes, every in-memory message disappears.

It does not support high availability.

It is not distributed and cannot scale horizontally.

Let's solve these one by one.

Making it distributed and highly available

First, solve distribution. The architecture evolves as follows: the MQ system is split across multiple machines, which makes it distributed. Any producer can send a message as long as it connects to one of our MQ nodes, and the same holds on the consumer side.

With distribution solved, next comes high availability.

The current problem is that each MQ node is a single point: once it goes down, the whole messaging system collapses, which is never acceptable in a production environment.

Let's design our high availability architecture

There are two common approaches to high availability in distributed systems:

The Kafka approach: each node is divided into multiple partitions, and nodes back up each other's partitions.

The RocketMQ approach: each node is paired with a hot standby.

We adopt the second approach, and the architecture evolves again: each MQ node gains a hot standby to achieve high availability. The standby synchronizes the master's data in real time, so that when the master goes down, producers automatically switch to the standby and keep running.

But what if the standby also goes down? Will the messages on it be lost?

This brings in a new concept: flushing to disk.

To prevent message loss on crash, data must be flushed from memory to disk.

Common flush strategies:

Real-time (synchronous) flush: a send returns only after the message is written to disk, guaranteeing the message is 100% persisted; the drawback is poor send performance.

Periodic (asynchronous) flush: write to disk once per time unit; high performance.

Buffered (asynchronous) flush: reserve a buffer (say 32 KB) and flush once it fills; high performance.

No flush at all: return as soon as the message is in memory. This is the fastest strategy, but the drawback is obvious: a crash loses everything not yet written.
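The difference between synchronous and asynchronous flushing comes down to whether the write waits for `FileChannel.force()` before returning. A minimal sketch (the `writeLog` helper and file naming are invented for this example):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sync vs. async flushing: with sync=true each write returns only after
// force() has pushed the OS page cache to disk (durable but slow); with
// sync=false the write returns as soon as the kernel has the bytes
// (fast, but data still in the page cache is lost on power failure).
class FlushDemo {
    static long writeLog(String[] msgs, boolean sync) throws IOException {
        Path log = Files.createTempFile("mq", ".log");
        try (FileChannel ch = FileChannel.open(log, StandardOpenOption.WRITE)) {
            for (String m : msgs) {
                ch.write(ByteBuffer.wrap(m.getBytes()));
                if (sync) {
                    ch.force(false); // block until the data reaches the disk
                }
            }
        }
        return Files.size(log);
    }
}
```

The buffered strategy above simply batches writes in memory and calls `force()` once the buffer (say 32 KB) fills, amortizing the flush cost.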

Let's walk through the working steps of this architecture:

1.1 The active and standby MQ nodes start and register with the registry.

1.2 A producer starts and pulls the full MQ list from the registry.

1.3 The producer picks a master MQ to send to (after horizontal scaling there will be multiple masters).

1.4 If a master goes down, the registry detects it immediately and notifies producers to stop sending to it and send to the standby instead.
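The producer-side behavior in steps 1.1-1.4 can be sketched as follows. Everything here is illustrative; a real registry would push liveness changes to producers over the network:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy producer-side failover matching steps 1.1-1.4: the producer holds the
// broker list pulled from the registry; when the registry marks the master
// dead, sends are routed to the standby.
class FailoverDemo {
    private final Map<String, Boolean> brokers = new LinkedHashMap<>(); // name -> alive

    void register(String broker) { brokers.put(broker, true); }   // step 1.1

    void markDown(String broker) { brokers.put(broker, false); }  // step 1.4: registry notifies us

    // pick the first live broker (master first, then standby)
    String route() {
        for (Map.Entry<String, Boolean> e : brokers.entrySet()) {
            if (e.getValue()) return e.getKey();
        }
        throw new IllegalStateException("no live broker");
    }
}
```

Step 1.4 then amounts to the registry calling `markDown("master")`, after which `route()` returns the standby.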

High concurrency without data loss

For MQ, high concurrency and zero data loss pull in opposite directions, much like the trade-offs in CAP theory.

We can only weigh the pros and cons, find a balance, and choose a strategy per scenario.

In most cases, periodic or buffered flushing plus a hot standby keeps 99% of data safe while maintaining decent performance.

If your data is not critical, such as log files, you can skip flushing entirely and take the performance.

Summary

Let's review what we covered above.

From the simplest queue, through distribution, to high availability, we have built a basic MQ framework.

But we have not yet covered sending and consuming messages, duplicate consumption, ordered consumption, and so on.

Kafka

First, let's look at Kafka's system architecture (anyone building message middleware inevitably needs to understand Kafka).

The Kafka ecosystem contains the following content:

  • Producer
  • Consumer
  • Kafka cluster
  • ZooKeeper

Among them, ZooKeeper plays the role of a NameServer; it also stores the system's metadata and provides functions such as leader election and coordination.

A Broker is an actual server that stores messages.

Availability

First, look at the availability of external dependencies. If your system strongly depends on other external services, its availability is bounded by theirs. (A strong dependency is one the service cannot keep running without.)

From the architecture above, Kafka depends only on ZooKeeper, and ZooKeeper itself is highly available (a ZK ensemble of 2N+1 nodes tolerates N node failures), so it does not drag down the availability of the whole cluster.
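The 2N+1 rule follows from majority quorum: a ZK ensemble needs a strict majority of nodes alive, so n nodes tolerate floor((n-1)/2) failures. A quick check:

```java
// Majority-quorum arithmetic behind "2N+1 nodes tolerate N failures":
// an ensemble of n nodes needs floor(n/2) + 1 live nodes for a quorum,
// so it tolerates n - (floor(n/2) + 1) = floor((n - 1) / 2) failures.
class Quorum {
    static int tolerableFailures(int ensembleSize) {
        return (ensembleSize - 1) / 2;
    }
}
```

This is also why ZK ensembles use odd sizes: 4 nodes tolerate no more failures than 3.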

Then look at Kafka's own availability. Any discussion of availability leads to replication: without replicas there is a single point of failure and no high availability. So let's look at Kafka's replication strategy.

(figure: Kafka Replication data flow, from an InfoQ article discussing Kafka availability)

From the Kafka Replication data flow we can read off the following:

  1. Partitions are replicated; topic1-part1, for example, has 3 replicas.
  2. The replicas of a partition live on different brokers. In the figure, topic1-part1 is on broker1, broker2, and broker3, with broker1 as the leader.
  3. Partition leaders are spread across brokers. In the figure, the leader of topic1-part1 is on broker1, the leader of topic2-part1 on another broker, and the leader of topic3-part1 on broker4.
  4. Messages are written to the leader partition and then replicated from the leader to the follower partitions.

For the details of Kafka Replication, see the official documentation (a separate article may follow); we will not expand on it here.

A replication strategy like Kafka's keeps the system available when any broker fails. If broker1 fails, the leader of topic1-part1 is re-elected, and the replica on broker2 or broker3 becomes the new leader responsible for writes.

So the availability of the system depends on the number of partition replicas, and that number is configurable.

Kafka itself achieves high availability through Replication, and the dependent ZooKeeper is also highly available, so the availability of the entire system is better guaranteed.

Reliability

For message middleware, reliability mainly means that a written message can be consumed and is never lost.

Not losing messages in a distributed environment comes down to two points:

  1. Once a message is successfully written to a node, it is persisted.
  2. The message is replicated to other physical nodes.

As long as these two points hold, data is lost only if every node fails permanently at once.

Messages written to a Kafka broker are flushed to disk (synchronously or asynchronously) and replicated to other physical nodes, so both points are satisfied.

Even asynchronous flushing combined with multi-node replication gives good reliability; data is lost only if something like a data-center power failure takes down all nodes before they flush.

Of course, a "lost" message was not necessarily destroyed on disk or never stored. If a message is stored but can never be consumed, that is also message loss from the client's point of view. For example, if a consumer acknowledges a message before processing it and then crashes, the message will not be redelivered and is effectively lost; we will not discuss that case in this article.

Evaluation

Advantages

  1. Some responsibilities are delegated to ZooKeeper, so Kafka itself only has to care about message-related work; from this angle the design is simplified.
  2. High machine utilization. With the replication strategy above, brokers back up each other's data, which uses machines far better than a master-slave layout (in most master-slave setups the slave sits idle).

Disadvantages

  1. Introducing ZooKeeper adds an external dependency and increases operational complexity.

Architecturally, mutual replication between peers is the better approach, but it is more complex to implement. If you want to build an MQ yourself, starting from master-slave mode is easier.

(Kafka's replication strategy and the underlying WAL implementation are fairly involved; more on this another time.)

RocketMQ

(figure: RocketMQ architecture, taken from the RocketMQ design document)

RocketMQ consists of the following components:

  • Producer
  • Consumer
  • NameServer
  • Broker

Producer and Consumer are the same as in Kafka (every MQ provides a Producer and a Consumer), and RocketMQ also has a Broker cluster. The biggest difference from Kafka is that RocketMQ implements its own clustered NameServer service.

Availability

RocketMQ's availability also splits into NameServer and Broker.

The NameServer runs in cluster mode and is "almost" stateless, so it can be deployed as multiple independent nodes and poses no availability problem. (Stateless means each node serves independently; deploying several nodes solves availability.)

Broker availability has two parts. A Topic can be spread across multiple Master Brokers, so when one Broker becomes unavailable the others keep accepting writes. However, after a Master Broker dies, some data already written to it may become unconsumable; RocketMQ solves this with the Master-Slave model.

After a Master fails permanently, its read requests are redirected to the Slave, keeping the system available. (Master-Slave replication is asynchronous, so a small amount of data may not yet have reached the Slave; more on that in the reliability section.)

Putting these together, RocketMQ also provides high availability, and that availability depends only on its own services; unlike Kafka, no extra service such as ZooKeeper is introduced.

Reliability

Reliability is considered from two angles: how a single broker persists written messages, and how messages are backed up.

RocketMQ persists written messages with synchronous flushing.


The only difference between synchronous and asynchronous flushing is that an asynchronous flush returns as soon as the page cache is written, while a synchronous flush waits for the disk write to complete. The synchronous write path is:

  1. Write to the page cache; the calling thread waits and notifies the flush thread.
  2. The flush thread flushes to disk, then wakes the waiting front-end threads, possibly a whole batch of them.
  3. The front-end threads return the write result to the user.

(Synchronous flushing necessarily costs more time than asynchronous flushing; how to soften that performance loss is discussed later.)
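The batch wake-up in step 2 is essentially group commit. A minimal sketch of the three steps (the class and its names are invented; RocketMQ's real implementation is considerably more elaborate):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Group commit: writers append to an in-memory buffer and block; a single
// flush thread "flushes" the whole batch at once and then wakes every
// writer waiting on that batch.
class GroupCommit {
    private final List<String> buffer = new ArrayList<>();
    private CountDownLatch flushed = new CountDownLatch(1);

    // step 1: append, remember which batch we belong to, then wait
    void write(String msg) throws InterruptedException {
        CountDownLatch myBatch;
        synchronized (this) {
            buffer.add(msg);
            myBatch = flushed;
        }
        myBatch.await(); // step 3: return the write result only after the flush
    }

    synchronized int pending() { return buffer.size(); }

    // step 2: flush the batch (here: pretend-write to disk), wake all waiters
    synchronized int flush() {
        int n = buffer.size();
        buffer.clear();
        flushed.countDown();             // wake the whole batch of waiting threads
        flushed = new CountDownLatch(1); // the next batch waits on a fresh latch
        return n;
    }
}
```

Because one disk flush acknowledges a whole batch of producers, the per-message cost of synchronous flushing drops as concurrency rises.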

With synchronous flushing, a single node is more reliable than with asynchronous flushing: once the producer receives a success response, the message is guaranteed to be on disk and cannot be lost to a power failure or the like.

A single node still faces single-point failure: if the node fails permanently, even a persisted message is useless. Where Kafka uses mutual replication between peers, RocketMQ uses the Master-Slave (MS) approach.

The MS mode suffers from master-slave replication lag (asynchronous replication always lags), so some data can be lost when the Master becomes unavailable. For that scenario RocketMQ offers a synchronous double-write mode.

Evaluation

Advantage

  1. No external dependencies (the system needs no extra services; from both an operations and an availability standpoint this is a genuine advantage).

Disadvantages

  1. The machine-utilization problem of the MS structure (the Slave may sit idle most of the time).

Because of this, one-master-many-slaves is rarely used in practice; most deployments are one master and one slave, and services with lower reliability requirements sometimes run without a slave at all. Engineers at Alibaba have confirmed this, and it is a real defect of the MS model.

Some other architectures of MQ

Kafka pulls in an external ZooKeeper, and RocketMQ's master-slave mode is not ideal, so can the two approaches be combined?

Next, we discuss several architectures the author has considered.

Combining Kafka and RocketMQ


This architecture mainly removes the ZooKeeper dependency from a Kafka-like design. ZooKeeper is introduced chiefly to solve distributed coordination; in addition, Kafka stores metadata on ZK (topic configuration, consumption progress, and so on) and uses it as a NameServer.

Here I agree with RocketMQ's approach: metadata can live on the Broker. Because the Broker is stateful, the consumption progress it holds is irrelevant to other Brokers (with mutual backup you would need to synchronize this data), so the NameServer can be made very lightweight and stateless. RocketMQ does exactly this; its NameServer is about 1,000 lines of code and is relatively simple.

The biggest implementation problem in this architecture is that once ZK is removed, the mutually replicating brokers must elect a leader for every Topic Partition themselves. Leader election without a central node is genuinely hard, including the need to handle network partitions. We can compromise: first elect a central node, then let the central node handle the remaining leader-election work.

The central node can even be designated manually, and its own availability is not critical: the system runs normally without it; it just cannot elect new leaders. System availability then depends on the central node and other nodes failing at the same time. (We sacrifice some automated operations by not making the central node highly available, but we remove the external dependency; system design is always a tradeoff.)

Remove NameServer

Consider in depth:

  • Metadata is just topic configuration and consumption progress; it is small and can be stored directly on the Broker.
  • The Broker cluster is already multi-node, so metadata replication comes naturally.

Storing metadata on the brokers raises one problem: every broker needs all of the metadata, so all brokers must communicate to obtain topic data (whereas message data only needs replicating among a few brokers).

This can be solved by introducing a protocol such as Gossip, after which the architecture can drop the NameServer entirely.

At this point only a Broker cluster remains: message data between Brokers uses Kafka's replication strategy, and metadata between Brokers is replicated via the Gossip protocol.

The architecture is now very simple; there seems to be nothing left to remove or change (the author's belief: simple is beautiful).

But one problem glossed over above is the tradeoff itself: ultimately we want the system automated enough, so the high availability of the central node must still be solved.

Choosing a unique leader among Brokers is the classic distributed consensus problem; it can be handled by introducing a consensus protocol such as Raft or Paxos.

So this architecture is theoretically feasible:

  • No NameServer;
  • Brokers back each other up to provide availability and reliability;
  • A Gossip protocol replicates metadata;
  • A consensus protocol solves leader election;
    • For simplicity, use the consensus protocol to elect a central node, which then coordinates everything else;
    • Or elect the leader of each Topic Partition directly through the consensus protocol.

If we write an MQ ourselves

As mentioned earlier, this public account hopes to write a "from scratch" style series, so the initial design should not be so complex that we cannot implement it ourselves. A simpler architecture also makes it easier to discuss the core problems of MQ and to actually experiment in spare time.

Therefore, subsequent articles build on the architecture below (once that content is finished, various protocols will be introduced to simplify the architecture or to improve its availability and reliability).

Similar to the RocketMQ architecture, and simplified:

  • A single-node NameServer (the NameServer's own service discovery can be done via DNS)
  • Brokers in master-slave mode
  • Metadata stored on the Brokers and reported to the NameServer (each Broker stores only part of the metadata; the NameServer aggregates it)

This architecture is relatively simple to implement yet still offers good availability and reliability. The NameServer is stateless, and a NameServer failure does not affect the system's core services (sending and consuming messages), so a single node is tolerable. The Broker side resembles RocketMQ: synchronous flushing plus master-standby gives good availability and reliability, at the cost of machine utilization (given the original intent of this series, we will set the utilization problem aside for now).

Conclusion

This article introduced the architectures of Kafka and RocketMQ, discussed how they achieve availability and reliability, and combined the two into the MQ architecture the author has in mind.

Finally, we fixed the architectural basis for the rest of the series: the simplest mode, on which the follow-up questions will be discussed. This is the common ground we need before the next article.




Origin blog.csdn.net/universsky2015/article/details/105531342