Performance comparison of Kafka and RocketMQ

With the same hardware resources invested during Double Eleven, a single topic on the Kafka-based log cluster can reach several million TPS, while the core business cluster built on RocketMQ reaches only hundreds of thousands of TPS. This difference prompted me to think about the performance of the two.

Reminder: TPS is only one of many performance indicators, and technology selection has to weigh many aspects. This article does not go deep into how to choose message middleware; its focus is to analyze the design thinking behind the performance of the two.

1. File layout


1.1 Kafka file layout

The macro layout of Kafka files is shown in the following figure:

As shown in the figure above, the main characteristics of the Kafka file layout are as follows:

Files are organized by topic + partition. Each topic can have multiple partitions, and each partition has its own folder. Partitions are replicated: each partition has a Leader and one or more Followers, and Kafka ensures internally that the leader and followers of a given partition are not placed on the same machine. Kafka also tries to spread partition leaders evenly across brokers; if the distribution becomes unbalanced at runtime, a command can be run to rebalance it manually. The leader is responsible for reads and writes of its partition, and the followers are only responsible for data backup.
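As a rough illustration of the "one folder per partition" layout, the sketch below builds the path of a partition directory and of a segment file inside it. The class and method names are hypothetical helpers, not Kafka APIs, and the log directory and topic name are made up; the naming scheme assumed here is a directory called `<topic>-<partition>` and segment files named after their base offset, zero-padded to 20 digits.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Kafka: shows where a partition's data would live on disk.
final class KafkaLayoutSketch {
    // Each partition gets its own directory named "<topic>-<partition>" under the log directory.
    static Path partitionDir(String logDir, String topic, int partition) {
        return Paths.get(logDir, topic + "-" + partition);
    }

    // Segment files are named after their base offset, zero-padded to 20 digits.
    static Path segmentFile(String logDir, String topic, int partition, long baseOffset) {
        return partitionDir(logDir, topic, partition)
                .resolve(String.format("%020d.log", baseOffset));
    }

    public static void main(String[] args) {
        // Made-up log directory and topic, for illustration only.
        System.out.println(segmentFile("/data/kafka-logs", "orders", 3, 0L));
    }
}
```

With hundreds of topics and partitions, a broker ends up appending to many such files at the same time, which matters for the comparison in section 1.3.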

Kafka's load balancing mainly depends on the distribution of partition leader nodes.

The leader of a partition handles reads and writes, while the followers handle data synchronization. If the broker hosting a partition leader goes down, a leader switch is triggered and a new leader is elected from the remaining followers. The flow of incoming data is shown in the following figure:

When the partition leader receives a send request from a client, should it return as soon as the leader has written the message, or wait until its followers have written it as well? This choice is critical because it directly affects the latency seen by the message sender. Kafka therefore provides the acks parameter for choosing the strategy:

  • acks = 0

    The client does not wait for any confirmation from the broker; as soon as the message has been handed to the network, the send is reported as successful.

  • acks = 1

    Success is returned to the client once the Leader has accepted and stored the message.

  • acks = -1

    Success is returned to the client only after the Leader and all in-sync Followers have accepted and stored the message.
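The three settings can be summarized as "how many replicas must confirm before the send counts as successful". The following is a toy model of that rule, not Kafka code; the class name, the method name, and the `inSyncReplicas` parameter (leader included) are assumptions made for illustration.

```java
// Toy model of the acks setting; not Kafka's implementation.
final class AcksSketch {
    // Returns how many replica confirmations the producer waits for.
    // inSyncReplicas counts the leader plus the followers that are currently in sync.
    static int acksAwaited(String acks, int inSyncReplicas) {
        switch (acks) {
            case "0":
                return 0;               // fire and forget
            case "1":
                return 1;               // leader only
            case "-1":
            case "all":
                return inSyncReplicas;  // leader and every in-sync follower
            default:
                throw new IllegalArgumentException("unsupported acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        System.out.println(acksAwaited("-1", 3));
    }
}
```

The more confirmations are awaited, the higher the send latency and the stronger the durability guarantee.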

1.2 RocketMQ file layout

The file layout of RocketMQ is shown in the following figure:

RocketMQ writes the messages of all topics into the commitlog file and then builds the message consumption queue files (ConsumeQueue) from the commitlog. The consumption queues are organized as /topic/{queue}. From the cluster's point of view, the layout is shown in the following figure:

RocketMQ uses master-slave synchronization by default. A multi-replica mechanism was introduced in RocketMQ 4.5, but its replication granularity is the commitlog file, so the data on the slave nodes is kept consistent with the master.
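The relationship between the shared commitlog and the per-queue index can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, the "commitlog" is a fixed 1 KB buffer, the index entry holds just an offset and a size, and the index is updated inline rather than built afterwards from the commitlog as described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory toy model of "one commitlog, many consume queues"; not RocketMQ code.
final class CommitLogSketch {
    // One shared log that receives the messages of every topic in arrival order.
    private final ByteBuffer commitLog = ByteBuffer.allocate(1024);
    // Key "topic/queueId" -> index entries of the form {commitLogOffset, size}.
    private final Map<String, List<long[]>> consumeQueues = new HashMap<>();

    // Appends the message to the shared log and records where it landed.
    long append(String topic, int queueId, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        long offset = commitLog.position();
        commitLog.put(bytes);
        consumeQueues.computeIfAbsent(topic + "/" + queueId, k -> new ArrayList<>())
                .add(new long[] {offset, bytes.length});
        return offset;
    }

    // Reads the index-th message of a queue by following the index into the shared log.
    String read(String topic, int queueId, int index) {
        long[] entry = consumeQueues.get(topic + "/" + queueId).get(index);
        byte[] bytes = new byte[(int) entry[1]];
        ByteBuffer view = commitLog.duplicate();
        view.position((int) entry[0]);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Messages from different topics interleave in the single log, while each queue's index stays small and points back into it.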

1.3 File layout comparison

Kafka lays out its files as topic/partition. Each partition has a physical folder, and sequential writing is achieved at the level of the partition file. If a Kafka cluster has hundreds of topics and each topic has hundreds of partitions, then under highly concurrent writes the IO operations become scattered and behave like random IO. In other words, as the number of topics and partitions grows, Kafka's write performance first rises and then falls.

RocketMQ, by contrast, pursues strictly sequential writes. All messages are appended to the commitlog file in order regardless of topic, so sequentiality is not affected as the number of topics and partitions grows. However, in the author's experience, even on a physical machine with SSDs, writing to a single file cannot fully utilize the disk's IO performance.

Besides the difference in sequential disk writes, the two file layouts also differ in granularity. Expanding a Kafka topic's partitions involves moving partitions between brokers, so the expansion operation is relatively heavy. RocketMQ stores data in commitlog files, so expansion does not move existing data and only affects new data. RocketMQ's operation and maintenance costs are therefore lower than Kafka's.

Finally, Kafka's acks parameter is analogous to RocketMQ's synchronous and asynchronous replication.

acks = 1 corresponds to RocketMQ's asynchronous replication, acks = -1 corresponds to RocketMQ's synchronous replication, and acks = 0 corresponds to the oneway mode of RocketMQ message sending.

2. Data writing method


2.1 Kafka message writing method

Kafka's message writing uses FileChannel, and the code screenshot is as follows:

The transferTo method is also used when the message is written. According to information found online, true zero-copy for network reads and writes in NIO requires calling the transferTo or transferFrom method of FileChannel, which internally uses the sendfile system call where the operating system supports it.
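To make the API concrete, here is a minimal sketch of FileChannel.transferTo using only the JDK. It is not taken from Kafka; the class and method names are hypothetical. The target here is an in-memory channel so the example can run anywhere, which means no sendfile is involved; the zero-copy path would only apply with a suitable target such as a socket channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Minimal transferTo demonstration; not Kafka code.
final class TransferToSketch {
    // Sends the whole file to the target channel and returns the number of bytes moved.
    static long transferAll(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = source.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop until done.
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    // Writes content to a temp file, transfers it to an in-memory sink, and returns what arrived.
    static String roundTrip(String content) {
        try {
            Path tmp = Files.createTempFile("segment", ".log");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (WritableByteChannel target = Channels.newChannel(sink)) {
                transferAll(tmp, target);
            }
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the API is that the caller never copies the file contents into its own buffer; it only names the source range and the destination.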

2.2 RocketMQ message writing method

RocketMQ's message writing supports memory mapping and FileChannel writing. The example is shown in the following figure:
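The two write paths can be illustrated with plain JDK NIO calls. This is a sketch, not RocketMQ's code: the class and method names are hypothetical, and the flush happens immediately after each write here, whereas a broker would typically flush on its own schedule.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Two ways to write a file with NIO; not RocketMQ code.
final class WritePathSketch {
    // Path 1: map the file into memory and write through the mapping.
    static void writeMapped(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            mapped.put(data);
            mapped.force(); // push the modified pages to the storage device
        }
    }

    // Path 2: write through the channel, then flush.
    static void writeChannel(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(false); // flush file content, not metadata
        }
    }

    // Writes the same content through both paths and checks that both files match it.
    static boolean bothPathsAgree(String content) {
        byte[] data = content.getBytes(StandardCharsets.UTF_8);
        try {
            Path a = Files.createTempFile("mapped", ".log");
            Path b = Files.createTempFile("channel", ".log");
            a.toFile().deleteOnExit();
            b.toFile().deleteOnExit();
            writeMapped(a, data);
            writeChannel(b, data);
            return Arrays.equals(Files.readAllBytes(a), data)
                    && Arrays.equals(Files.readAllBytes(b), data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```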

2.3 Comparison of message writing methods

Although RocketMQ and Kafka both support FileChannel writes, RocketMQ does not call transferTo when writing through FileChannel. It first calls write and then flushes to disk periodically. The code screenshot is as follows:

Why doesn't RocketMQ call the transferTo method? Personally, I think it is because RocketMQ has to assemble the MQ message format on the Broker. It needs to decode the request from the network, move it into heap memory, process the message, and finally persist it to disk.

From what can be found online, one view is roughly this: compared with memory mapping, the sendfile system call performs one more copy from the user buffer to the kernel buffer, but when more than 64K is written at a time, sendfile often performs better, possibly because sendfile works on blocks of memory.

3. Message sending method


3.1 Kafka message sending mechanism

Kafka uses a double-ended queue in the message sending client and introduces the idea of batch processing. Its message sending mechanism is shown in the following figure:

When the client sends a message through the Kafka producer, the message is first stored in a double-ended queue. Each element of the queue is a ProducerBatch, which represents one sending batch; its maximum size is controlled by the batch.size parameter, which defaults to 16K. A separate Send thread then takes a batch from the queue and sends its messages to the Kafka cluster together. The linger.ms parameter controls the sending behavior of the Send thread.

To improve throughput, linger.ms controls what the Send thread does when a batch in the buffer has not yet reached batch.size: send immediately or wait for a while. If linger.ms is 0, the batch is sent immediately. If it is greater than 0, the Send thread waits up to that many milliseconds before sending to the broker. A larger linger.ms increases response time but helps throughput, somewhat like Nagle's algorithm in TCP.

When a message is written into the ProducerBatch, Kafka already organizes the data according to the message storage format, so the server can write it directly to the file.
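The interplay of batch.size and linger.ms can be sketched as a small accumulator: producers append to the tail batch, and the sender checks whether the head batch is full or has waited long enough. This is a simplified model, not Kafka's implementation; all class and method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of client-side batching; not Kafka's implementation.
final class BatchAccumulatorSketch {
    static final class Batch {
        final long createdMs;
        int bytes;
        Batch(long createdMs) { this.createdMs = createdMs; }
    }

    private final int batchSize;  // plays the role of batch.size
    private final long lingerMs;  // plays the role of linger.ms
    private final Deque<Batch> batches = new ArrayDeque<>();

    BatchAccumulatorSketch(int batchSize, long lingerMs) {
        this.batchSize = batchSize;
        this.lingerMs = lingerMs;
    }

    // Producer side: append to the tail batch, or open a new one if it would overflow.
    void append(int recordBytes, long nowMs) {
        Batch last = batches.peekLast();
        if (last == null || last.bytes + recordBytes > batchSize) {
            last = new Batch(nowMs);
            batches.addLast(last);
        }
        last.bytes += recordBytes;
    }

    // Sender side: the head batch is sendable when it is full or has lingered long enough.
    boolean headReady(long nowMs) {
        Batch head = batches.peekFirst();
        if (head == null) {
            return false;
        }
        boolean full = head.bytes >= batchSize || batches.size() > 1;
        boolean expired = nowMs - head.createdMs >= lingerMs;
        return full || expired;
    }

    // Sender side: removes and returns the head batch, or null if there is none.
    Batch poll() {
        return batches.pollFirst();
    }
}
```

With lingerMs set to 0, every non-empty head batch is immediately sendable, which matches the "send immediately" behavior described above.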

3.2 RocketMQ message sending mechanism

On the client side, RocketMQ message sending mainly selects a queue according to the routing algorithm and then sends the message to the server. The server organizes the message according to the message storage format and then persists it and performs other operations.
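One simple routing strategy is round-robin over the topic's queues. The sketch below shows only that idea; it is not the RocketMQ client's actual selection logic, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin queue selection; an illustration, not the RocketMQ client's algorithm.
final class QueueSelectorSketch {
    private final AtomicInteger counter = new AtomicInteger();
    private final int queueCount;

    QueueSelectorSketch(int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        this.queueCount = queueCount;
    }

    // Returns the next queue id; floorMod keeps the result non-negative if the counter overflows.
    int selectQueue() {
        return Math.floorMod(counter.getAndIncrement(), queueCount);
    }
}
```

Each send picks one queue and ships a single message, with no client-side batching step in between.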

3.3 Message sending comparison

One significant advantage Kafka has over RocketMQ in message sending is that the message format is assembled on the client side. This greatly reduces CPU pressure on the Broker, because the work is distributed across the clients. The architectural difference is somewhat similar to that between shardingjdbc and MyCat.

Another feature of Kafka on the sender side is the double-ended buffer queue. Kafka pursues batch processing everywhere. The benefit is higher sending throughput, but it also increases message response time and brings the possibility of message loss: Kafka returns success once the message has been appended to the message cache, so if the sender exits abnormally, the cached messages are lost.

Setting linger.ms = 0 in Kafka is analogous to RocketMQ's message sending behavior.

However, by providing the two parameters batch.size and linger.ms, Kafka can be tuned to the scenario, which makes it more flexible than RocketMQ.

For example, in a log cluster, batch.size and linger.ms are usually increased so that the batch sending mechanism is fully exploited to improve throughput. In scenarios that are sensitive to response time, the value of linger.ms can be reduced accordingly.
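As a sketch of what two such tunings might look like, the snippet below builds two sets of producer settings with java.util.Properties. The keys acks, batch.size and linger.ms are the parameters discussed in this article; the class name and the concrete values are illustrative assumptions, not recommendations, and should be validated against a real workload.

```java
import java.util.Properties;

// Two illustrative producer tunings; the values are assumptions, not recommendations.
final class ProducerProfilesSketch {
    // Log-collection style: larger batches and a longer wait to favor throughput.
    static Properties throughputProfile() {
        Properties p = new Properties();
        p.setProperty("acks", "1");
        p.setProperty("batch.size", "65536"); // assumed value, 4x the 16K default
        p.setProperty("linger.ms", "50");     // assumed value
        return p;
    }

    // Response-time sensitive style: default batch size and no lingering.
    static Properties lowLatencyProfile() {
        Properties p = new Properties();
        p.setProperty("acks", "1");
        p.setProperty("batch.size", "16384"); // the 16K default
        p.setProperty("linger.ms", "0");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(throughputProfile());
        System.out.println(lowLatencyProfile());
    }
}
```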

4. Summary


From the above comparison, Kafka's overall performance is indeed better than RocketMQ's. However, when selecting message middleware, performance should not be the only criterion; functionality also matters. For example, RocketMQ provides rich message retrieval, transactional messages, message consumption retry, scheduled messages, and more.

The author personally thinks that Kafka is usually used in big data and stream processing scenarios, and RocketMQ is selected for business processing.

This article was first published on the WeChat public account "Middleware Interest Circle".

This article is shared from WeChat public account - middleware interest circle (dingwpmz_zjj).