RocketMQ Copilot is an intelligent auxiliary operation and maintenance system for Apache RocketMQ

1. Introduction to RocketMQ

RocketMQ is a distributed messaging middleware developed by Alibaba. It was later donated to the Apache Software Foundation and became a top-level Apache project. It offers high performance, high reliability, real-time delivery, and a distributed architecture. RocketMQ is mainly used for application decoupling, message distribution, and traffic peak shaving.

The basic concepts of RocketMQ include Topic, Message, and message attributes. A Topic is the top-level container for message transmission and storage in RocketMQ. It groups messages that belong to the same kind of business logic and is uniquely identified by its TopicName. Producers send messages to topics; a message is the carrier of the data that is ultimately delivered to consumers. The producer can also define attributes on a message, such as the Message Key and Tag. The Message Key is the business identifier of the message: it is set by the producer and uniquely identifies a piece of business logic. The Message ID is the globally unique identifier of the message: it is generated automatically by RocketMQ and uniquely identifies a single message.
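The relationship between these attributes can be sketched in a few lines of Python. This is only an illustrative model, not the RocketMQ client API: the `Message` class, its field names, and the use of a random UUID as the Message ID are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Illustrative model of a message; not the RocketMQ client class."""
    topic: str      # TopicName: the container the message is sent to
    body: bytes     # business payload
    key: str = ""   # Message Key: business identifier set by the producer
    tag: str = ""   # Tag: producer-defined label
    # Message ID: generated by the system, never by the producer.
    # A random UUID stands in for RocketMQ's real ID format here.
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)


# Two messages about the same order share a key but get distinct IDs.
first = Message(topic="OrderTopic", body=b"order created", key="ORDER-1001", tag="created")
second = Message(topic="OrderTopic", body=b"order paid", key="ORDER-1001", tag="paid")
```

The key ties both messages to one business object, while each message still has its own system-generated ID.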

In practice, RocketMQ adopts the publish-subscribe model. The basic participants are the message producer, the message server (which stores messages), the message consumer, and route discovery. Consider a simple scenario in which a user places an order and earns points based on the payment amount. In the traditional model, the order module calls the points module's interface directly, so the two modules are coupled: once the points module is modified or fails, the order module is affected. With RocketMQ, the order module only writes a message to the message queue after the user successfully places an order, and the points module consumes that message on its own, so the two modules no longer depend on each other directly.
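A minimal in-memory sketch of the order and points scenario is shown below. The `InMemoryBroker` class, the `OrderPaid` topic name, and the rule of one point per whole currency unit are hypothetical and only stand in for a real RocketMQ deployment.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy stand-in for the message server: stores messages per topic."""

    def __init__(self):
        self._queues = defaultdict(deque)       # topic -> pending messages
        self._subscribers = defaultdict(list)   # topic -> consumer callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The producer returns as soon as the message is stored.
        self._queues[topic].append(message)

    def deliver(self, topic):
        # Hand every stored message to every subscriber of the topic.
        while self._queues[topic]:
            message = self._queues[topic].popleft()
            for callback in self._subscribers[topic]:
                callback(message)


points = defaultdict(int)  # user -> accumulated points


def add_points(order):
    # Points module: assumed rule of 1 point per whole currency unit paid.
    points[order["user"]] += int(order["amount"])


def place_order(broker, user, amount):
    # Order module: it only knows the topic, not the points module.
    broker.publish("OrderPaid", {"user": user, "amount": amount})
    return "order accepted"


broker = InMemoryBroker()
broker.subscribe("OrderPaid", add_points)
status = place_order(broker, "alice", 120.5)
pending_before = points["alice"]   # nothing has been consumed yet
broker.deliver("OrderPaid")
```

The order is accepted before any points are added, and `place_order` contains no reference to `add_points`. Changing or temporarily stopping the points module therefore does not affect order placement.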

As a message middleware, RocketMQ has been battle-tested by Taobao's Double Eleven shopping festival. It provides asynchronous decoupling and peak shaving for distributed application systems, and it also offers the massive message accumulation, high throughput, reliable retry, and other features that Internet applications require.

2. Introduction to RocketMQ Copilot

RocketMQ Copilot is an intelligent auxiliary operation and maintenance system for Apache RocketMQ. Its core idea is to package real production experience with RocketMQ clusters as a product. It helps enterprise developers operate and manage their self-built clusters, and it helps them master the best practices for RocketMQ cluster operation and maintenance.

Currently, RocketMQ Copilot mainly includes three functions: system inspection, expert diagnosis, and cluster governance. With RocketMQ Copilot, faults that have occurred in the past are not repeated, and potential system faults can be discovered and handled early, before they cause damage. After a failure occurs, the problem can be located quickly and accurately. Even if you are not familiar with the source code, you can deploy RocketMQ on a production system with confidence.

3. RocketMQ Copilot core capabilities

RocketMQ Copilot is essentially a professional operation and maintenance tool that helps users systematically operate and manage their existing self-built clusters. Its overall architecture is shown in the figure below:

[Figure: overall architecture of RocketMQ Copilot]

1. System inspection

System inspection answers one question: is the cluster healthy? This question is actually hard to answer, because the state of the cluster is dynamic: as traffic changes, the health of the cluster changes too. RocketMQ Copilot periodically checks for potential system risks using preset inspection rules and generates risk reports, helping users detect and handle risks in online clusters early.

System inspection covers:

  • Kernel parameter inspection: ensures that the kernel environment in which RocketMQ runs is optimally configured.

  • Cluster parameter inspection: RocketMQ has hundreds of configuration items, and fully mastering each one is difficult. Based on the state of the cluster, Copilot dynamically ensures that each parameter is optimally configured for the current load.

  • Consumer group inspection: checks whether each client is configured correctly, for example whether subscription relationships are consistent (the most common issue), so that you can find out whether your users are using RocketMQ according to recommended practices.

  • Topic-level inspection: checks, for example, whether the routes of a Topic are consistent and whether there are hotspot partitions.
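To make the idea of preset inspection rules concrete, here is a small Python sketch of two such rules and the report they produce. It does not reflect how RocketMQ Copilot is implemented: the snapshot layout, the rule names, and the hotspot threshold of three times the average backlog are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule: str
    target: str
    detail: str


def check_subscription_consistency(groups):
    """Flag consumer groups whose instances subscribe to different topic/tag sets."""
    findings = []
    for group, instances in groups.items():
        distinct = {frozenset(subs.items()) for subs in instances.values()}
        if len(distinct) > 1:
            findings.append(Finding("subscription-consistency", group,
                                    f"{len(distinct)} different subscriptions in one group"))
    return findings


def check_hotspot_partitions(topics, ratio=3.0):
    """Flag topics where one partition holds more than `ratio` times the average backlog."""
    findings = []
    for topic, backlog in topics.items():
        if not backlog:
            continue  # no partitions reported for this topic
        average = sum(backlog) / len(backlog)
        if average > 0 and max(backlog) > ratio * average:
            findings.append(Finding("hotspot-partition", topic,
                                    f"max backlog {max(backlog)} vs average {average:.0f}"))
    return findings


def run_inspection(snapshot):
    """Run every rule against a cluster snapshot and collect a risk report."""
    report = []
    report += check_subscription_consistency(snapshot["groups"])
    report += check_hotspot_partitions(snapshot["topics"])
    return report


snapshot = {
    "groups": {
        "GID_points": {
            "instance-a": {"OrderPaid": "*"},
            "instance-b": {"OrderPaid": "vip"},   # differs from instance-a
        },
        "GID_audit": {
            "instance-a": {"OrderPaid": "*"},
            "instance-b": {"OrderPaid": "*"},
        },
    },
    "topics": {
        "OrderPaid": [10, 12, 9, 400],    # one partition far above the rest
        "OrderCreated": [10, 11, 9, 10],
    },
}
report = run_inspection(snapshot)
```

Each rule is an independent function over the same snapshot, so new checks (kernel parameters, cluster parameters) could be added without touching the existing ones.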

2. Cluster governance (SLI/SLO)

RocketMQ Copilot has a built-in set of end-to-end SLIs. Combined with customizable SLOs, they let you measure the service quality of the cluster quickly and quantitatively, and configure precise alert rules that greatly reduce invalid alerts and alert noise.
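As a rough illustration of how an SLI and a custom SLO can drive alerting, the Python sketch below alerts only when most of the error budget has been spent. The choice of SLI (end-to-end delivery success ratio), the function names, and the 25% budget threshold are assumptions for the example, not Copilot's actual definitions.

```python
def sli_success_ratio(succeeded, total):
    """SLI: fraction of probe messages delivered end to end."""
    return 1.0 if total == 0 else succeeded / total


def error_budget_remaining(sli, slo):
    """Share of the error budget still unspent (1.0 = untouched, <= 0 = exhausted).

    Assumes slo < 1.0, otherwise there is no budget to divide by.
    """
    allowed = 1.0 - slo   # failure ratio the SLO tolerates
    actual = 1.0 - sli    # failure ratio actually observed
    return 1.0 - actual / allowed


def should_alert(sli, slo, min_budget=0.25):
    """Alert only when less than `min_budget` of the error budget is left."""
    return error_budget_remaining(sli, slo) < min_budget


slo = 0.999                                     # custom target: 99.9% of probes succeed
healthy = sli_success_ratio(99_995, 100_000)    # 5 failures
degraded = sli_success_ratio(99_900, 100_000)   # 100 failures, budget fully spent
```

With these inputs `should_alert(healthy, slo)` is false and `should_alert(degraded, slo)` is true. A handful of failed probes does not page anyone, which is one way to cut alert noise.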

3. Expert diagnosis

RocketMQ Copilot provides a one-click online problem diagnosis tool with a variety of built-in expert diagnosis templates. After users enter some simple information about the problem, they quickly receive a diagnosis report that lists the suspected problem points.

A common problem that troubles users is that messages are not consumed. RocketMQ's at-least-once delivery semantics guarantee that messages are not lost. The troubleshooting path for the "message not consumed" problem is shown in the figure below. It has dozens of branches, which shows how difficult this type of problem is to troubleshoot.

[Figure: troubleshooting path for the "message not consumed" problem]
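A diagnosis template of this kind can be thought of as a checklist that turns a few observed facts into a list of suspects. The Python sketch below shows the shape of such a checklist. The branches of the original figure are not reproduced here: the five checks, the fact names, and the lag threshold are hypothetical examples, and a real template would cover far more branches.

```python
def diagnose_not_consumed(facts):
    """Walk a short checklist and return the suspected causes, in the order checked."""
    suspects = []
    if not facts["message_stored"]:
        suspects.append("message never reached the broker: check the producer send result")
        return suspects  # nothing further to check downstream
    if facts["online_consumers"] == 0:
        suspects.append("no consumer instance of the group is online")
    if not facts["subscription_consistent"]:
        suspects.append("instances in the group have inconsistent subscriptions")
    tags = facts["subscribed_tags"]
    if "*" not in tags and facts["message_tag"] not in tags:
        suspects.append("message tag is filtered out by the subscription expression")
    if facts["consumer_lag"] > facts["lag_threshold"]:
        suspects.append("consumption is backlogged: the message is waiting in the queue")
    return suspects or ["no suspect found by these checks: escalate to deeper diagnosis"]


# Hypothetical facts collected for one unconsumed message.
facts = {
    "message_stored": True,
    "online_consumers": 2,
    "subscription_consistent": True,
    "message_tag": "refund",
    "subscribed_tags": {"created", "paid"},
    "consumer_lag": 15,
    "lag_threshold": 10_000,
}
suspects = diagnose_not_consumed(facts)
```

For these facts the only suspect is tag filtering: the group subscribes to `created` and `paid`, so a message tagged `refund` is never delivered to it.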

4. Product advantages

1. Expert experience output

RocketMQ Copilot is built on the product team's more than ten years of experience in operating and maintaining RocketMQ clusters. By using RocketMQ Copilot, you can quickly pick up the experience, skills, and best practices of RocketMQ cluster operation and maintenance.

2. Lightweight output

RocketMQ Copilot adopts a monolithic architecture, has no software dependencies, and does not need Internet access to run. Whether in an offline IDC environment or on a cloud vendor's virtual machines, it can be deployed and run with one click.

3. No intrusion into the cluster

RocketMQ Copilot positions itself as an auxiliary operation and maintenance tool and follows a non-intrusive, offline, and localized design. Using RocketMQ Copilot does not intrude on users' existing RocketMQ clusters.

4. Zero threshold free trial

RocketMQ Copilot offers all developers a zero-threshold free trial subscription. New users can try it without any charge, and they can apply for a free extension if they need a longer trial.

5. Rooted in the community and continuously updated

RocketMQ Copilot stays closely integrated with the community, absorbs needs and contributions from the developer community, and keeps up rapid updates and iterations.

5. Conclusion

The first version of RocketMQ Copilot is now complete, and more planned capabilities will be released soon. The product is also permanently free for individual developers. You can go to the AutoMQ official website (https://play.automq.com) to try it and see what RocketMQ Copilot brings to your operation and maintenance work.

Origin blog.csdn.net/weixin_40381772/article/details/134733651