[System Architecture] A look at the architecture and principles of open source messaging middleware

When message middleware comes up, engineers from Internet companies instinctively think of high concurrency, high-performance IO scheduling and the like. But for real applications, and especially for transactional business platforms dealing with finance, there is much more to it than raw performance.

So let me walk through which scenarios in a financial trading platform call for message middleware, why we use it, and how to design a middleware private cloud that makes development more pleasant. (Since readers come from different language backgrounds, I will stick to design principles and mechanisms. The article touches on popular open source products such as ActiveMQ, RabbitMQ, Kafka and MetaQ.)

Message middleware serves as the carrier for asynchronous, concurrent processing. But that is not all: the architecture also has to provide high availability, high concurrency, scalability, reliability, integrity, ordering guarantees and more, which is already enough to give any designer a headache. On top of that come the awkward requirements such as slow consumers and deduplication. The design cost is considerable, so do not blindly trust the open source "experts": many of these mechanisms almost have to be rebuilt, and building a friendly, general-purpose private cloud that satisfies every business is not that simple.

If a payment system has to process billions of business orders every day, then the message middleware behind it needs a processing capacity approaching ten billion messages, because many systems rely on the clustering capability of the middleware and nothing is allowed to go wrong. So let's analyze, from several architectural angles, how the middleware pulls this off.
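A rough back-of-envelope check on what that means for the cluster, taking the ~10 billion messages per day above at face value:

    10,000,000,000 msgs/day ÷ 86,400 s/day ≈ 116,000 msgs/s on average

Peak traffic is usually several times the average, so the middleware has to sustain several hundred thousand messages per second while still losing nothing.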

High availability

High availability is an eternal topic, and in the financial world it is the yardstick of whether a system can be trusted. Architects in the financial industry will go to great lengths to avoid losing even a single piece of data; yet in theory, under extreme conditions, whether you keep every byte also comes down to luck, and that is no joke.

For example, in typical Internet data architecture, keeping at least three replicas of every piece of data is considered a strong guarantee. Yet after lightning struck Google's Belgian data center on August 13, about 0.000001% of the data was permanently lost and less than 0.05% of the disks could not be recovered. The point is that time and place matter: under extreme conditions nothing is impossible, and every architecture has its weak spots. Let's look at the common HA practices of MQ products; the following figure shows the ActiveMQ HA scheme:
[Figure: ActiveMQ HA scheme (master/slave failover)]

ActiveMQ's HA is handled through master/slave failover, and the master/slave switch can be implemented in several ways:
1: Take a shared lock on an NFS volume or another shared disk device; ownership of the shared file lock marks the master. When the master goes down, the corresponding slave grabs the shared lock and is promoted to master.

2: More commonly, the cluster is managed through ZooKeeper, which needs no further introduction here. The following figure shows the HA scheme of MetaQ.
[Figure: MetaQ HA scheme (ZooKeeper-managed master/slave brokers)]
As the figure above shows, the idea is the same: the master and slave broker nodes are managed through ZK.
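As a rough illustration of scheme 1 (the shared file lock), the sketch below uses the JDK's FileLock on a file that would sit on the shared NFS/SAN volume: whichever broker process grabs the exclusive lock acts as master, and the standby blocks in lock() until the holder dies and the lock is released. This is a minimal sketch of the idea, not ActiveMQ's actual implementation, and the lock path is made up.

    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;

    // Minimal sketch of shared-file-lock master election (not ActiveMQ's real code).
    // Both broker processes point at the same file on the shared NFS/SAN volume.
    public class SharedLockElection {
        public static void main(String[] args) throws Exception {
            String lockPath = args.length > 0 ? args[0] : "/shared/broker.lock"; // hypothetical path
            try (RandomAccessFile raf = new RandomAccessFile(lockPath, "rw");
                 FileChannel channel = raf.getChannel()) {
                System.out.println("waiting for master lock...");
                // lock() blocks until the exclusive lock is free; the slave sits here
                // until the current master process dies and the OS releases its lock.
                FileLock lock = channel.lock();
                System.out.println("lock acquired, acting as MASTER");
                runAsMaster();          // placeholder: start accepting producers/consumers
                lock.release();
            }
        }

        private static void runAsMaster() throws InterruptedException {
            Thread.sleep(Long.MAX_VALUE); // stand-in for the broker's main loop
        }
    }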

Of course, this is only the failover part. It only guarantees that traffic is switched to the slave when a broker goes down; it does not by itself guarantee that messages are not lost along the way.

While a message is flowing through the broker, downtime or other hardware failures can easily cause it to be lost, so a storage medium is needed to protect it.

Let's take Kafka's storage mechanism as a reference. Storage for message middleware must not only be fast but also extremely cheap in terms of IO. Kafka designed a storage mechanism to meet both requirements; here is a brief introduction.

First, a topic in Kafka is split into multiple partitions when deployed in a distributed fashion; the partitions spread the message load and the routing across multiple machines. For example, a topic named debit_account_msg would be split into debit_account_msg_0, debit_account_msg_1, debit_account_msg_2 and so on up to N partitions, and each partition gets its own directory on the broker's local disk (e.g. one directory per debit_account_msg partition).
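As a small illustration of how this partitioning is used in practice (assuming a Kafka cluster and the standard Kafka Java client; the broker address and topic name below are placeholders), using the business id as the record key lets the client hash all messages for one account onto the same partition, which spreads load across the debit_account_msg_N partitions while keeping per-account ordering:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Sketch: route debit messages by account id so one account always lands on one partition.
    public class DebitMsgProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");                 // placeholder address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String accountId = "ACC-10086";                              // business id used as the key
                String payload   = "{\"accountId\":\"ACC-10086\",\"amount\":100}";
                // Same key -> same partition (debit_account_msg_0 / _1 / _2 ... on disk).
                producer.send(new ProducerRecord<>("debit_account_msg", accountId, payload));
            }
        }
    }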

Each partition directory is further divided into many segments. Each segment has a fixed size, say 500 MB, and consists of two files, an index and a log:
00000000000000000.index
00000000000000000.log
00000000000065535.index
00000000000065535.log
where the number in the file name is the starting msgId of that segment's index. The corresponding index entry structure is shown below:
[Figure: Kafka segment index entry structure]
An entry such as "1,0" means the message with msgId 1 sits at offset 0 in this segment; after reading the index, the corresponding segment log file is opened to read the msg itself, which is stored as a fixed-format message body:
[Figure: Kafka log message body format]
Obviously, applying this mechanism naively is not enough for highly concurrent IO: a lookup first binary-searches the segment files, then finds the position via the offset, then reads the msgsize, and finally reads the message body, which costs at least four disk IOs and is expensive. However, since pulling messages reads the log sequentially, the impact in practice is small.
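To make that lookup path concrete, here is a heavily simplified, in-memory sketch of the idea (the class, fields and sample numbers are invented for illustration): binary-search the segment base values to pick a segment, take the nearest sparse-index entry at or below the wanted msgId, then scan forward in the log from that byte position:

    import java.util.Arrays;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Simplified sketch of the segment/index lookup described above (names invented).
    public class SegmentLookup {
        // Base msgIds of the segments: 00000000000000000, 00000000000065535, ...
        static long[] segmentBases = {0L, 65_535L};

        // Per-segment sparse index: msgId -> byte position inside the .log file
        // (only segment 0's index is modeled here).
        static NavigableMap<Long, Long> indexOfSegment0 = new TreeMap<>();
        static {
            indexOfSegment0.put(1L, 0L);       // the "1,0" entry from the figure
            indexOfSegment0.put(100L, 8_192L); // made-up sample entry
        }

        static void locate(long wantedMsgId) {
            // 1st lookup: which segment file holds this msgId (binary search over base values)
            int idx = Arrays.binarySearch(segmentBases, wantedMsgId);
            if (idx < 0) idx = -idx - 2;                       // largest base <= wantedMsgId
            long segmentBase = segmentBases[idx];

            // 2nd lookup: nearest index entry at or below the wanted msgId (sparse index)
            long startPos = indexOfSegment0.floorEntry(wantedMsgId).getValue();

            // 3rd/4th IO: seek to startPos in the .log file, read msgsize, then the body,
            // scanning forward until the exact msgId is reached (omitted here).
            System.out.printf("segment %017d.log, scan from byte %d for msgId %d%n",
                              segmentBase, startPos, wantedMsgId);
        }

        public static void main(String[] args) {
            locate(100L);
        }
    }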

Besides the read path described above, writes do not go straight to disk either: all reads and writes happen against the OS page cache first, and an asynchronous thread flushes to the hard disk periodically (with an LRU-style strategy). This is actually quite risky, because an OS crash loses whatever has not been flushed, especially when consumers are slow and a large backlog has piled up. Kafka's sibling MetaQ has reworked this area substantially and added a replication mechanism over these partition files (this is what Ali uses), so at that level, no matter how the lightning strikes, the chance of losing messages is much smaller. Of course, that still does not cover the fiber-optic cable outside the machine room being dug up.
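A tiny sketch of that write path, using nothing beyond the JDK: FileChannel.write() only lands the bytes in the OS page cache, and a background task calls force() on a timer, so whatever was written after the last force() is exactly the window an OS-level crash can lose. The one-second interval and file name are arbitrary choices for the example.

    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Sketch: appends go to the OS page cache; an async thread flushes to disk periodically.
    public class AsyncFlushLog {
        public static void main(String[] args) throws Exception {
            FileChannel log = FileChannel.open(Paths.get("commit.log"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);

            ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
            flusher.scheduleAtFixedRate(() -> {
                try {
                    log.force(false);   // flush page cache to disk; data written since the
                } catch (Exception e) { // previous force() is what an OS crash can lose
                    e.printStackTrace();
                }
            }, 1, 1, TimeUnit.SECONDS); // arbitrary flush interval

            // "Writing" here only reaches the page cache until the next force().
            log.write(ByteBuffer.wrap("msg-1\n".getBytes(StandardCharsets.UTF_8)));

            TimeUnit.SECONDS.sleep(2);  // let one flush happen before shutting down
            flusher.shutdown();
            log.close();
        }
    }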

Having said all that, it sounds nearly perfect, but the operations cost is actually huge. Everything lives in files on a single machine, so once something goes wrong, handling it by hand is quite painful; considerable investment is needed in operational standards and API tooling around it.

Therefore, in this area we could also store the data in a NoSQL store such as MongoDB. MySQL is possible too, but its IO capability is not in the same league as a NoSQL db, unless strong transaction processing is genuinely required. Ali is indeed very strict on this point: MetaQ, for example, is what runs behind Alipay, because the earlier middleware, tbnotify, handled slow consumption very passively, while MetaQ has a big advantage there. Why? More on that later.

High concurrency

In the beginning, most engineers reached for MQ to solve performance and asynchronization problems; strictly speaking, a single round of IO scheduling is not that resource-hungry in itself. Let's look at where the high concurrency in MQ comes from, starting with some background on a few well-known products:

ActiveMQ was a dedicated enterprise-grade solution in its day, compliant with the JMS specification in JEE, and its performance was actually decent; but dragged into Internet-scale workloads it was out of its depth, like the proverbial rabbit carrying a watermelon.

RabbitMQ is written in Erlang, complies with the AMQP protocol specification, is cross-platform, offers richer exchange and routing modes, and does well in distributed deployments.

RocketMQ (now essentially the latest version of MetaQ 3.0; MetaQ in turn descends from Kafka, the log-based messaging system originally open-sourced by LinkedIn) basically reimplements Kafka's principles and mechanisms in Java. After many modifications it supports transactions, its development pace is very fast, and very active communities inside Ali and across China maintain it.

Performance comparison: the numbers below are taken from the Internet, for reference only:
[Figure: performance comparison of ActiveMQ, RabbitMQ and RocketMQ (numbers from the Internet)]

To be honest, at these orders of magnitude the differences are not outrageous, but we can pick out some common threads: where do the main performance differences come from?
RocketMQ is the successor of MetaQ; aside from some new features and mechanisms, the performance principles are similar. Here are the highlights behind that high performance:

  • RocketMQ's consumption is mainly pull-based, so many consumption features do not need to be implemented on the broker; the consumer simply pulls the relevant data itself. ActiveMQ and RabbitMQ, by contrast, take the older approach of having the broker dispatch messages, along with some of the standard delivery modes of JMS or AMQP. (A minimal pull-loop sketch follows the consume-queue discussion below.)

  • Files are written and stored sequentially, so pulling messages only needs to read segment data in order; the consumer drains messages as efficiently as possible and a backlog is unlikely to build up. You can also pick an IO scheduling algorithm such as noop to squeeze more out of sequential reads.

  • The page cache is used so that hot data is served straight from the OS cache, giving "hot" consumption.

  • MetaQ batches both disk IO and network IO, trying to move as much data as possible per IO; messages travel in batches so IO scheduling does not burn excessive resources.

  • NIO transport. The figure below shows an early MetaQ architecture; initially MetaQ distributed messages using gecko and notify-remoting, high-performance NIO frameworks built inside Taobao.
    [Figure: early MetaQ NIO transport architecture (gecko / notify-remoting)]
  • Lightweight consumption queues. Remember that consumption capability is obtained through the queue.

Look at the following figure:
[Figure: MetaQ logical consume queues layered over the physical queue]
MetaQ adds logical queues on top of the physical queue for consumption. The disk data behind a queue is serialized, so adding queues does not add disk iowait: writes stay sequential, but reads still involve a random read, first through the logical queue and then from the disk. That is why the page cache matters so much; give the machine as much memory as you can and it will be put to full use.
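To make the pull model from the first bullet concrete, here is a minimal poll loop using the standard Kafka Java consumer (the broker address, group id and topic are placeholders). The broker does no dispatching: the consumer decides when and how much to pull, which is also why a slow consumer merely builds up backlog on disk instead of overwhelming the broker:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    // Sketch of the pull model: the consumer asks the broker for data, not the other way round.
    public class DebitMsgPuller {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // placeholder
            props.put("group.id", "debit-consumer-group");    // consumers in one group share the partitions
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("debit_account_msg"));
                while (true) {
                    // Each poll is a batch read of the broker's sequential segment files.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> r : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                          r.partition(), r.offset(), r.value());
                    }
                }
            }
        }
    }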

In practice, doing the above is basically enough to keep performance at a fairly high level. But sometimes performance is not the most important thing; what matters most is striking the right balance with the other architectural properties, since those mechanisms still have to be satisfied. After all, the industry's three hardest problems, high concurrency, high availability and consistency, pull against each other.

Scalability

This is an old topic. Ordinary systems and middleware can usually be scaled out fairly easily, but for message middleware it has always been a pain. Why?

Let's start with the limits of ActiveMQ's scalability. Scaling ActiveMQ requires awareness of the business: a broker must know the source and the destination of messages, and once messages are transmitted across a distributed set of brokers it gets complicated. Let's look at how ActiveMQ spreads load.
[Figure: ActiveMQ network-of-brokers load distribution]
Suppose a producer sends topicA messages. In the normal case, where consumers are connected to every broker, there is no problem: whichever broker receives the producer's message can forward it to the corresponding consumer.
But what if no matching consumer is connected on broker2? Since both the producing application and the consuming systems of the same topic have many nodes, how do you scale out? ActiveMQ can handle the normal part of the figure above, but changing the corresponding producer, broker and consumer configurations is quite troublesome.
ActiveMQ can also do dynamic discovery via multicast (some people suggest LVS or F5 for load balancing, but that causes big problems on the consumer side, and such load balancing does nothing substantial for topic distribution). Even then the problem I mentioned remains: if a topic is large, every broker needs to connect to all producers or consumers, otherwise the situation above occurs. Scaling ActiveMQ out is quite painful.

Now let's see how MetaQ does it. Look at the figure:
[Figure: MetaQ topic partitions spread across brokers]
MetaQ shards by topic partitions. At this level we only need to configure the number of partitions per topic, and the business id serves as the routing rule. Usually several topics are configured on one broker machine, and a topic normally has only one partition per machine; if one machine is not enough, multiple partitions per machine are also supported. In general, we can customize partitioning by taking the modulus of the business id, simply by supplying the routing parameter when sending.
[Figure: routing messages to partitions by business id]
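The "modulus of the business id" routing mentioned above boils down to something like the sketch below (a generic helper, not any particular client's API); note that the partition count has to stay in sync with the broker-side topic configuration, otherwise expansion reshuffles where each id lands:

    // Sketch: pick a partition by taking the modulus of the business id (not a real client API).
    public class BizIdPartitionRouter {
        private final int partitionCount;

        public BizIdPartitionRouter(int partitionCount) {
            this.partitionCount = partitionCount;   // must match the topic's partition config
        }

        public int partitionFor(String bizId) {
            // Math.abs(Integer.MIN_VALUE) is still negative, so mask the sign bit instead.
            return (bizId.hashCode() & 0x7fffffff) % partitionCount;
        }

        public static void main(String[] args) {
            BizIdPartitionRouter router = new BizIdPartitionRouter(3);   // debit_account_msg_0..2
            System.out.println(router.partitionFor("ACC-10086"));        // same id -> same partition
        }
    }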
MetaQ consumers likewise pull from the partitions as a load-balanced group (the group is generally sized according to the partition count). If there are more consumers than partitions, the surplus consumers simply do not take part in consumption; this is the common situation online, since there are, after all, far more application servers than message servers.
[Figure: a MetaQ consumer group load-balancing over partitions]
The other case is when there are many more partitions than consumers, as shown below:
[Figure: many partitions loaded onto few consumers]
When the load depends heavily on core messages, the demands on the broker are still relatively high, simply because so much depends on it; if the messages also have broadcast semantics the load can be larger still. So the broker needs high-IO disks and plenty of memory for the page cache, while the actual computation required is not that heavy.

Reliability

Reliability is a key property of message middleware. Let's look at how MQ products move messages reliably, taking ActiveMQ as the first reference: it is based on a push & push mechanism, i.e. the producer pushes to the broker and the broker pushes to the consumer.

How do we ensure that every message sent gets consumed? After an ActiveMQ producer sends a message, it must receive the broker's ack to confirm receipt; the same acknowledgement guarantee applies between the broker and the consumer.

MetaQ works the same way on the producer side, but the broker-to-consumer leg is pull-based, so delivery there depends on the consumer's capacity; in general, though, an application server cluster is unlikely to suffer an avalanche effect.

How do we ensure message idempotence? At present neither ActiveMQ nor MetaQ can guarantee it; the business has to. Once the broker's ack times out, the producer retries, and the retry produces a new message even though the broker may already have persisted the first one, so the same business operation can end up generating two messages.
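Since neither broker guarantees exactly-once delivery, the usual fix is business-level deduplication before applying a message. Below is a minimal in-memory sketch; in production the "already applied" check would be a database unique constraint or a distributed cache rather than a local set:

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: drop duplicate deliveries by remembering which business keys were already applied.
    public class IdempotentHandler {
        // In production this would be a DB unique constraint or a distributed cache, not local memory.
        private final Set<String> applied = ConcurrentHashMap.newKeySet();

        public void onMessage(String bizKey, String payload) {
            if (!applied.add(bizKey)) {          // add() returns false if the key was already present
                return;                          // duplicate delivery (e.g. producer retry) -> ignore
            }
            debitAccount(payload);               // the real business action runs once per bizKey
        }

        private void debitAccount(String payload) {
            System.out.println("debit applied: " + payload);
        }

        public static void main(String[] args) {
            IdempotentHandler h = new IdempotentHandler();
            h.onMessage("order-42", "{amount:100}");
            h.onMessage("order-42", "{amount:100}");   // retried delivery, silently skipped
        }
    }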

How do we ensure message reliability? Here ActiveMQ and MetaQ use basically the same mechanisms:
Producer guarantee: after the data is produced and reaches the broker, it must be persisted before an ACK is returned to the source.
Broker guarantee: after receiving a message, the MetaQ server flushes it to disk periodically and replicates all the data to the slave synchronously or asynchronously, so consumption is not affected after a crash.
ActiveMQ likewise persists locally, to a database or to files, for local recovery.

Consumer guarantee: the consumer processes messages one by one and only moves on to the next message after the current one has been consumed successfully. If consumption fails (for example with an exception), the message is retried, by default up to 5 times; if it still cannot be consumed after the maximum number of attempts, it is stored on the consumer's local disk and a background thread keeps retrying while the main thread moves on to subsequent messages. In other words, the meta consumer continues with the next message only after the MessageListener has confirmed successful consumption of the current one, which gives reliable consumption.
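A stripped-down sketch of that consume-and-retry contract (the 5-attempt limit and the "park it locally and move on" behaviour follow the description above; the listener interface and the local store are simplified stand-ins, not MetaQ's real classes):

    import java.util.ArrayDeque;
    import java.util.List;
    import java.util.Queue;

    // Sketch of "ack one message, then take the next; retry failures up to 5 times, then park them".
    public class RetryingConsumerLoop {
        interface MessageListener { void onMessage(String msg) throws Exception; }

        private static final int MAX_RETRIES = 5;
        private final Queue<String> localRetryStore = new ArrayDeque<>(); // stand-in for the local disk store

        void consume(Queue<String> incoming, MessageListener listener) {
            String msg;
            while ((msg = incoming.poll()) != null) {
                boolean ok = false;
                for (int attempt = 1; attempt <= MAX_RETRIES && !ok; attempt++) {
                    try {
                        listener.onMessage(msg);   // only a successful return counts as consumed
                        ok = true;
                    } catch (Exception e) {
                        // failed attempt; the loop retries the same message
                    }
                }
                if (!ok) {
                    localRetryStore.offer(msg);    // park it for a background thread to retry later
                }
                // either way, the main loop moves on to the next message
            }
        }

        public static void main(String[] args) {
            Queue<String> incoming = new ArrayDeque<>(List.of("m1", "m2"));
            new RetryingConsumerLoop().consume(incoming, msg -> {
                if (msg.equals("m2")) throw new RuntimeException("simulated failure");
                System.out.println("consumed " + msg);
            });
        }
    }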

Consistency

We discuss two consistency scenarios for MQ:
1: guaranteeing that a message is not sent or consumed more than once

2: guaranteeing transactions

The MQs described above cannot guarantee the first one. Why not? The cost is too high. It can be achieved by modifying the source code, and the scheme itself is not particularly complicated, but the extra overhead is large, for example using an additional cache cluster to guarantee non-duplication over a certain time window. I believe some MQs will offer this capability later on.

ActiveMQ supports two kinds of transactions: JMS transactions and XA distributed transactions. When a transaction is used, a transactionId is carried in the interaction with the broker, and the broker takes on some TM duties to handle the transaction. MetaQ also supports local transactions and XA and complies with the JTA standard. Both ActiveMQ and MetaQ implement their transaction guarantees through redo logs, so the approaches are basically the same.

The distributed transaction here is only guaranteed from the broker phase onward: before the commit, the prepared message is stored in a local file; in the commit phase the message is written into the queue; and finally the two-phase commit is driven by the TM.
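The prepared-then-commit flow reads roughly like the sketch below (all names are illustrative, not the actual ActiveMQ or MetaQ classes): the prepared message is persisted but invisible to consumers, the local business transaction runs, and only the commit makes the message enter the queue, while a failure triggers a rollback that discards it.

    // Sketch of the two-phase message flow described above (all names are illustrative).
    public class TwoPhaseSendSketch {
        interface Broker {
            String prepare(String payload);     // phase 1: persist the message, but keep it invisible
            void commit(String txId);           // phase 2: move it into the queue for consumers
            void rollback(String txId);         // phase 2 alternative: discard the prepared message
        }

        static void sendTransactionally(Broker broker, String payload, Runnable localTransaction) {
            String txId = broker.prepare(payload);          // prepared message stored in a local file
            try {
                localTransaction.run();                     // e.g. the debit in the local database
                broker.commit(txId);                        // the message becomes visible only now
            } catch (RuntimeException businessFailure) {
                broker.rollback(txId);                      // business failed -> message never delivered
                throw businessFailure;
            }
        }
    }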

Summary

There are also some messaging middleware products inside the company with very good performance, and I hope they can be open-sourced and benefit more people in the future. For the popular open source messaging middleware, we can tailor different architectures to different applications, costs and development styles; of course, those architectures have to be weighed from many angles.

Recommended reading:

Carefully organized | The article catalog in the second half of 2017.
Feasible solutions for strong consistency between cache and database.
User process buffer and kernel buffer
Introducing dynamic programming through gold mine stories (part 1)
