Introduction to RocketMQ

Table of contents

Foreword:

1. Overview

2. Download and install, cluster construction

3. Message model

4. How to ensure throughput

4.1. Message storage

4.1.1. Sequential reading and writing

4.1.2. Asynchronous flushing

4.1.3. Zero copy

4.2. Network transmission


Foreword:

RocketMQ ships a full set of detailed example code in its installation directory, so this article does not dwell on rote material such as API usage and instead focuses on RocketMQ's characteristics. Any message middleware can be judged on three points: throughput, message reliability, and message model. The similarities and differences between message middleware show up in these three aspects, and this article explains RocketMQ from the same three angles.

This article is the last in the author's RocketMQ series. Basic concepts, download and installation, cluster setup, and the basic message models have already been covered in separate articles. Each of those is short and takes no more than two or three minutes to read, so they are linked directly here rather than repeated.

1. Overview

RocketMQ is an open source distributed messaging middleware, originally developed and open sourced by Alibaba Group. It aims to provide reliable, high-performance, and scalable messaging for distributed systems. RocketMQ, RabbitMQ, and Kafka are commonly regarded as the three mainstream message middleware.

The basic concepts and architecture of RocketMQ are covered in:

Basic Concepts of RocketMQ__BugMan's Blog-CSDN Blog

2. Download and install, cluster construction

RocketMQ download and installation tutorial, plus cluster setup tutorial:

RocketMQ download and installation, step-by-step cluster setup tutorial_BugMan's Blog-CSDN Blog

3. Message model

RocketMQ supports the following message models:

  • Sequential messages: consumers consume messages in the order in which the producer sent them.
  • Broadcast messages: each message is consumed by multiple consumers.
  • Delayed messages: a message becomes available to consumers only after a specified delay has elapsed.
  • Batch messages: producers can send several messages in one batch.
  • Filtered messages: through tags, consumers receive only the messages they are interested in under the same topic.
  • Transactional messages: message production by the producer supports transactional commit and rollback.

A detailed explanation of the RocketMQ message models:
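
To make tag filtering concrete, here is a minimal sketch in plain Java. It only imitates the matching rule behind a subscription expression such as "TagA || TagB" (with "*" meaning all tags). The class and method names are hypothetical and this is not the RocketMQ client API, which does the filtering for you.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: imitates how a tag subscription selects messages
// within one topic. Not part of the RocketMQ client library.
class TagFilterSketch {
    // Split an expression such as "TagA || TagB" into the accepted tags.
    static Set<String> parse(String expression) {
        Set<String> tags = new HashSet<>();
        for (String part : expression.split("\\|\\|")) {
            String tag = part.trim();
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }
        return tags;
    }

    // "*" accepts every tag; otherwise the message tag must be listed.
    static boolean matches(String expression, String messageTag) {
        if ("*".equals(expression.trim())) {
            return true;
        }
        return parse(expression).contains(messageTag);
    }
}
```

A consumer subscribed with "TagA || TagB" would therefore see a message tagged TagB and skip one tagged TagC, even though both sit in the same topic.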

Detailed Explanation of RocketMQ Using __BugMan's Blog - CSDN Blog

4. How to ensure throughput

4.1. Message storage

The biggest feature of RocketMQ can be summed up in one sentence:

It ensures the reliability of messages while still guaranteeing throughput.

Reliability and throughput pull in opposite directions. To ensure reliability, messages must be persisted to disk so that they survive a power loss. Once messages live on disk, the disk I/O needed to write and read them reduces throughput. So the core of RocketMQ's design is to persist data to disk and then use every available means to win the throughput back.

The production and consumption process, from the Broker's point of view:

  • Receive the message sent by the producer and save it.

  • After delivering a message to the consumer, wait for the consumer's ACK, and mark the message as consumed once the ACK is received.

  • Periodically delete expired messages.
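
The three duties above can be sketched as a tiny in-memory model. This is only an illustration: the class name, the explicit timestamps, and the retention parameter are all assumptions made for the example, and a real Broker persists messages to disk instead of a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the three broker duties.
class MiniBrokerSketch {
    private static final class Stored {
        final String body;
        final long storedAtMillis;
        boolean consumed;

        Stored(String body, long storedAtMillis) {
            this.body = body;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<Long, Stored> store = new LinkedHashMap<>();
    private long nextId = 0;

    // Duty 1: receive a message from a producer and save it.
    long receive(String body, long nowMillis) {
        long id = nextId++;
        store.put(id, new Stored(body, nowMillis));
        return id;
    }

    // Duty 2a: messages still waiting for an ACK, in arrival order.
    List<Long> pendingIds() {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Stored> entry : store.entrySet()) {
            if (!entry.getValue().consumed) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // Duty 2b: the consumer's ACK marks the message as consumed.
    void ack(long id) {
        Stored stored = store.get(id);
        if (stored != null) {
            stored.consumed = true;
        }
    }

    // Duty 3: drop messages older than the retention period.
    // In this sketch expiry ignores whether a message was consumed.
    int deleteExpired(long nowMillis, long retentionMillis) {
        int before = store.size();
        store.values().removeIf(s -> nowMillis - s.storedAtMillis >= retentionMillis);
        return before - store.size();
    }

    int size() {
        return store.size();
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real broker would read the system time and run the cleanup on a schedule.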

The means RocketMQ uses to improve throughput:

  • Sequential writing

  • Asynchronous flushing

  • Zero copy

4.1.1. Sequential reading and writing

Sequential disk reads and writes perform much better than random ones. Every time data is read from disk, the drive first has to locate the physical position of that data. On a mechanical hard disk this means moving the magnetic head, which takes time. Sequential access avoids most of this addressing cost: the drive locates the position once and then keeps reading or writing, so performance is far better than with random access.

RocketMQ takes advantage of this. All message data is stored in CommitLog, a logically unbounded, append-only log made up of a sequence of memory-mapped files of 1 GB each. Writes always continue from the current end of the log, and when one file is full, a new file is created and writing carries on sequentially.
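
A small sketch of the arithmetic behind such a file queue: given a global offset into the log, find the file that holds it and the position inside that file. The 1 GB size comes from the text above. Naming each file after its zero-padded starting offset is an assumption made for this example, and the class and method names are hypothetical.

```java
// Hypothetical helper: maps a global log offset to (file, position).
class CommitLogAddressing {
    // 1 GB per file, as described in the text.
    static final long FILE_SIZE = 1024L * 1024 * 1024;

    // Offset at which the file containing globalOffset begins.
    static long fileStartOffset(long globalOffset) {
        return globalOffset - (globalOffset % FILE_SIZE);
    }

    // Assumed naming scheme: the start offset, zero-padded to 20 digits.
    static String fileName(long globalOffset) {
        return String.format("%020d", fileStartOffset(globalOffset));
    }

    // Position of the byte inside its file.
    static long positionInFile(long globalOffset) {
        return globalOffset % FILE_SIZE;
    }
}
```

With this scheme, locating a message needs no lookup table: byte 100 of the second file is global offset 1073741924, and the reverse mapping is a division and a remainder.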

The process of RocketMQ sequential disk writing is as follows:

  1. Producers send messages to the RocketMQ Broker. The Broker appends each message to the end of the current CommitLog file, which at this point means writing it into the operating system's Page Cache (page cache) in memory.

  2. Messages therefore form one continuous run of data in the Page Cache. Messages of all Topics and Queue IDs share the same CommitLog and are appended in the order in which they arrive, so nothing is scattered randomly across the disk.

  3. A background thread then periodically flushes the CommitLog data in the Page Cache to the file on disk. Because the data is contiguous, the disk write is sequential, which reduces seek time and fragmentation and improves write performance.

It should be noted that sequential writing applies to the CommitLog as a whole, not to each queue separately: messages with different Topics and Queue IDs are interleaved on disk in the order in which the Broker received them. The per-queue order that consumers see comes from a separate, much smaller index kept for each Topic and Queue ID (the ConsumeQueue), whose entries point to message positions in the CommitLog.
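
The layout can be illustrated with a sketch that uses one shared append-only log plus a small index per Topic and Queue ID. This is a simplified model under stated assumptions: an in-memory ByteBuffer stands in for the log file, the index holds only (offset, size) pairs, and all names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one shared log, one small index per queue.
class LogWithIndexSketch {
    // Stands in for the shared log file; every message is appended here.
    private final ByteBuffer log = ByteBuffer.allocate(1024);
    // Queue key -> list of {offset, size} entries pointing into the log.
    private final Map<String, List<int[]>> index = new HashMap<>();

    void append(String topic, int queueId, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        int offset = log.position();
        log.put(bytes); // sequential write at the current end
        index.computeIfAbsent(topic + "-" + queueId, k -> new ArrayList<>())
             .add(new int[] {offset, bytes.length});
    }

    // Read one queue's messages in order by following its index entries.
    List<String> read(String topic, int queueId) {
        List<String> out = new ArrayList<>();
        List<int[]> entries = index.getOrDefault(topic + "-" + queueId, new ArrayList<>());
        for (int[] entry : entries) {
            ByteBuffer view = log.duplicate();
            view.position(entry[0]);
            byte[] bytes = new byte[entry[1]];
            view.get(bytes);
            out.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return out;
    }

    // Everything written so far, in physical order.
    String rawLog() {
        ByteBuffer view = log.duplicate();
        view.flip();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Appending "a1" for topic A, "b1" for topic B, and "a2" for topic A leaves the raw log interleaved, while reading topic A through its index still returns its two messages in send order.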

4.1.2. Asynchronous flushing

RocketMQ's asynchronous flush (Async Flush) is an optimization used to improve message write performance and throughput. With synchronous flushing, the Broker returns success to the producer only after the message has been written to disk. With asynchronous flushing, the Broker first writes the message to the operating system's Page Cache (page cache), immediately returns a success response to the producer, and a background thread later flushes the data in the Page Cache to disk. The trade-off is that messages still held only in the Page Cache can be lost if the machine loses power before the flush. The mode is selected with the Broker's flushDiskType setting (ASYNC_FLUSH or SYNC_FLUSH).
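
The idea can be sketched in Java with a FileChannel: append returns as soon as the bytes have been handed to the operating system, and a background thread calls force at intervals. This is a minimal sketch under assumptions, not RocketMQ's implementation; the class name, the counters, and the interval parameter are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of asynchronous flushing.
class AsyncFlushSketch implements AutoCloseable {
    private final FileChannel channel;
    private final ScheduledExecutorService flusher =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "flusher");
                t.setDaemon(true); // do not keep the JVM alive
                return t;
            });
    private final AtomicLong written = new AtomicLong(); // bytes handed to the OS
    private final AtomicLong flushed = new AtomicLong(); // bytes forced to disk

    AsyncFlushSketch(Path file, long flushIntervalMillis) {
        try {
            channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flusher.scheduleWithFixedDelay(this::flush, flushIntervalMillis,
                flushIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // Returns once the bytes are written to the channel (page cache),
    // without waiting for them to reach the disk.
    long append(String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        try {
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written.addAndGet(bytes.length);
    }

    // Called by the background thread: force pending data to disk.
    synchronized void flush() {
        long target = written.get();
        if (target == flushed.get()) {
            return; // nothing new to flush
        }
        try {
            channel.force(false);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flushed.set(target);
    }

    long flushedBytes() {
        return flushed.get();
    }

    @Override
    public void close() {
        flusher.shutdown();
        flush(); // final flush before closing
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The gap between the two counters is exactly the data at risk on power loss. A synchronous variant would call flush inside append before returning, trading throughput for durability.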

4.1.3. Zero copy

Zero copy is an optimization technique designed to make data transfer more efficient, especially file transfer and network transfer. The traditional transfer path copies the same data several times; zero copy reduces that overhead by avoiding the unnecessary copies, which improves system performance.

In traditional data transfer, such as reading a file from disk and sending it over a network, the following steps are usually involved:

  1. Read data from disk into a kernel-space buffer (Kernel Buffer).
  2. Copy the data from kernel space to a user-space buffer (User Buffer).
  3. Copy the data from user space back into the kernel's network buffer (Network Buffer).
  4. Finally, send the data over the network.

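This traditional path involves multiple data copies. The copies between kernel space and user space are performed by the CPU and come with switches between user mode and kernel mode, which adds overhead and latency.

The main idea of zero-copy technology is to avoid these unnecessary copies and so reduce CPU and memory usage. Two common mechanisms are memory mapping (mmap), which lets a user program access the kernel's page cache directly instead of copying it into a user buffer, and sendfile, which moves data from a file to a socket entirely inside the kernel. The memory-mapped files that make up the CommitLog are an example of the first mechanism.

For more detail on zero copy, see the author's other article: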
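
In Java, both mechanisms are reachable through FileChannel. The sketch below is illustrative only: it uses transferTo to hand data from one channel to another without a user-space buffer where the operating system supports it, and map to read a file through a memory mapping. The class and method names are hypothetical, and a file is used as the target so the example runs without a network; in a messaging system the target would typically be a socket channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of the two zero-copy mechanisms in Java NIO.
class ZeroCopySketch {
    // Transfer a whole file to another channel without reading it
    // into a user-space byte array first.
    static long transfer(Path source, Path target) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
            return position;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read a file through a memory mapping of the page cache.
    static String readMapped(Path source) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            MappedByteBuffer map = in.map(FileChannel.MapMode.READ_ONLY, 0, in.size());
            // Copying into a byte[] here is only to build a String for
            // the example; real code would work on the buffer directly.
            byte[] bytes = new byte[map.remaining()];
            map.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether transferTo actually avoids every copy depends on the operating system and the channel types involved, so treat the sketch as a demonstration of the API rather than a performance guarantee.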

The clearest zero-copy detailed explanation on the whole network, just read it once

4.2. Network transmission

Readers who are not familiar with serialization, or with why it is needed, can refer to the author's other article:

Detailed explanation of JAVA serialization__BugMan's Blog-CSDN Blog

RocketMQ messages are serialized for transmission, and serialization and deserialization are implemented by the SDKs on the producer side and the consumer side.

The serialization methods used with RocketMQ are:

  1. RocketMQ custom serialization: RocketMQ uses a custom serialization protocol to encode and decode the commands exchanged between clients and the Broker (the RemotingCommand protocol). The protocol uses a compact binary format, which keeps encoding and decoding cheap and helps achieve high performance and low latency.

  2. JSON serialization: RocketMQ also supports JSON as an encoding for transmission. JSON suits messages with complex structures and is easy to read and debug, but it adds some transmission overhead compared with the binary format.

  3. Java native serialization: the message body is a plain byte array, so an application can produce it with Java native serialization (Java Serialization), which is provided by the Java standard library. Its performance is usually not as good as that of other serialization frameworks.

  4. Hessian serialization: Hessian is a binary serialization framework that supports multiple programming languages. An application can likewise use Hessian to serialize the message body into binary data for transmission.
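
To show what a binary, length-prefixed encoding looks like in general, here is a small sketch with ByteBuffer. The frame layout (total length, header length, header bytes, body bytes) is a simplified layout assumed for this example and should not be read as RocketMQ's exact wire format. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical frame layout, big-endian:
// [4-byte total length][4-byte header length][header bytes][body bytes]
// where total length counts everything after the first 4 bytes.
class FrameCodecSketch {
    static byte[] encode(String header, byte[] body) {
        byte[] headerBytes = header.getBytes(StandardCharsets.UTF_8);
        int total = 4 + headerBytes.length + body.length;
        ByteBuffer buf = ByteBuffer.allocate(4 + total);
        buf.putInt(total);
        buf.putInt(headerBytes.length);
        buf.put(headerBytes);
        buf.put(body);
        return buf.array();
    }

    static String decodeHeader(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        buf.getInt(); // skip total length
        byte[] headerBytes = new byte[buf.getInt()];
        buf.get(headerBytes);
        return new String(headerBytes, StandardCharsets.UTF_8);
    }

    static byte[] decodeBody(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        int total = buf.getInt();
        int headerLength = buf.getInt();
        buf.position(buf.position() + headerLength); // skip the header
        byte[] body = new byte[total - 4 - headerLength];
        buf.get(body);
        return body;
    }
}
```

The length prefixes let the receiver know how many bytes to wait for before decoding, and the body stays an opaque byte array, so the application is free to fill it with JSON, Java native serialization, Hessian, or anything else.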


Origin blog.csdn.net/Joker_ZJN/article/details/131993079