1. Introduction
With the spring job-hunting season ("golden March, silver April") upon us, few people can keep the restlessness out of their hearts. Here I would like to share a topic I ran into in a past interview: the application scenarios of MQ. If anything is lacking, corrections and feedback from experienced readers are very welcome.
Background: why message middleware
To improve system performance, the first thing to consider is usually database optimization. But for historical reasons, scaling a database horizontally is a very complicated project, so we generally try to intercept traffic before it reaches the database. Whether by scaling out application servers or by throttling the traffic that actually hits the database, the idea is the same: keep load away from the database. For that, caching components and messaging components are the two great weapons.
2. Introduction to MQ
MQ (Message Queue) is a container that stores messages until they are consumed.
The commonly used MQ components today are ActiveMQ, RabbitMQ, RocketMQ, and Kafka. Each has its own characteristics and strengths, but whichever one you choose, there is a core set of features that any MQ provides.
Common message queue comparison
| Feature | ActiveMQ | RabbitMQ | RocketMQ | Kafka |
|---|---|---|---|---|
| Producer-consumer model | supported | supported | supported | supported |
| Publish-subscribe model | supported | supported | supported | supported |
| Request-response model | supported | supported | not supported | not supported |
| API completeness | high | high | high | high |
| Multi-language support | yes | yes | Java only | yes |
| Single-machine throughput | ~10k msg/s | ~10k msg/s | ~10k msg/s | ~100k msg/s |
| Message latency | n/a | microsecond-level | millisecond-level | millisecond-level |
| Availability | high (master-slave) | high (master-slave) | very high (distributed) | very high (distributed) |
| Message loss | low | low | theoretically none | theoretically none |
| Documentation completeness | high | high | relatively high | high |
| Quick-start guide | yes | yes | yes | yes |
| Community activity | high | high | medium | high |
| Commercial support | none | none | commercial cloud | commercial cloud |
3. MQ features
First in, first out
First-in, first-out is the defining property of a queue: the order of messages is fixed when they are enqueued, and generally needs no manual intervention. Just as important, each message is consumed only once. These two guarantees are why MQ fits so many scenarios.
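Both guarantees can be seen with Python's in-process `queue.Queue`, which serves here as a minimal stand-in for a real MQ broker (an assumption for illustration, not an actual broker):

```python
from queue import Queue

# In-process stand-in for an MQ broker: messages leave in arrival order,
# and each get() removes the message, so it is consumed exactly once.
q = Queue()
for msg in ["order-1", "order-2", "order-3"]:
    q.put(msg)

consumed = [q.get() for _ in range(3)]
print(consumed)       # same order as enqueued
print(q.empty())      # nothing left: each message was delivered once
```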
Publish/subscribe
Publish/subscribe is a very efficient processing model. When consumers are not backed up, delivery is close to real time, almost as if the operation were synchronous. This model makes good use of server capacity, and its application scenarios are very broad.
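A minimal sketch of the model (the `Broker` class and its method names are hypothetical, not any real MQ API): every subscriber registered on a topic receives its own copy of each published message.

```python
from collections import defaultdict

class Broker:
    """Toy pub/sub broker: topic -> list of subscriber callbacks."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Fan out: every subscriber on the topic gets the message.
        for handler in self.subscribers[topic]:
            handler(message)

broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("orders", inbox_a.append)
broker.subscribe("orders", inbox_b.append)
broker.publish("orders", {"id": 1})
print(inbox_a, inbox_b)  # both subscribers received the message
```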
Persistence
Persistence means MQ is not merely an auxiliary tool for a few scenarios: like a database, it can durably store core data, which is what makes MQ reliable.
Distributed
Under today's high-traffic, big-data workloads, server software that supports only a single instance is essentially unusable; only software that supports distributed deployment sees wide adoption. This matters all the more because MQ is positioned as high-performance middleware.
4. Application scenarios
Message queue middleware is an important component of distributed systems. It mainly solves problems such as application decoupling, asynchronous messaging, traffic peak shaving, massive log data synchronization, and distributed transactions, helping to build architectures that are high-performance, highly available, scalable, and eventually consistent.
4.1 Application Decoupling
Scenario description: in a typical shopping flow, after a user places an order the order system needs to notify the inventory system, which used to be done through a direct interface call.
Disadvantages of direct interface calls: the order system and the inventory system are tightly coupled, which makes avalanche failures likely. If the inventory system is unreachable, the stock deduction fails and the order fails with it; and once call volume reaches a certain level, response times across the order system cluster grow longer.
The design after introducing a message queue:
Order system: after the user places an order, the order system persists it locally and writes a message to the queue; once the write succeeds, it tells the user the order was placed successfully.
Inventory system: subscribes to order messages in MQ, and adjusts stock up or down according to the order information.
Key point of the scheme:
Even if the inventory system is unavailable when the order is placed, ordering still works: after placing the order, the order system writes to the message queue and no longer cares about the downstream steps, as long as eventual consistency is reached. This is what decouples the order system from the inventory system.
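The decoupled flow above can be sketched with an in-process queue standing in for the broker (all names here are illustrative): the order system returns as soon as its message is enqueued, while the inventory system consumes independently.

```python
import queue
import threading

order_queue = queue.Queue()   # stand-in for the MQ broker
stock = {"sku-1": 10}

def place_order(sku, qty):
    # 1. persist the order locally (omitted), 2. write to the queue,
    # 3. return immediately without waiting on the inventory system.
    order_queue.put({"sku": sku, "qty": qty})
    return "order accepted"

def inventory_worker():
    # Inventory system: consumes order messages at its own pace.
    while True:
        msg = order_queue.get()
        if msg is None:          # shutdown sentinel for this demo
            break
        stock[msg["sku"]] -= msg["qty"]

t = threading.Thread(target=inventory_worker)
t.start()
print(place_order("sku-1", 2))   # returns without touching inventory
order_queue.put(None)
t.join()
print(stock)                     # inventory caught up asynchronously
```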
4.2 Asynchronous messages
Scenario description: after a user registers, the system needs to send a registration email and an SMS. Traditionally this is done in one of two ways: serial or parallel.
- Serial: write the registration to the database; once the write succeeds, send the email, then the SMS. Only after all three steps complete is success returned to the client.
- Parallel: write the registration to the database; once the write succeeds, send the email and the SMS concurrently. Once all three steps complete, success is returned to the client. Compared with serial, parallel reduces the total processing time.
Problem analysis :
Assume each of the three steps takes 50 ms and ignore other overhead such as the network: the serial path takes 150 ms, and the parallel path about 100 ms.
Since the number of requests a CPU can handle per unit time is fixed, suppose its throughput is 100 operations per second. Then in serial mode the system handles about 7 requests per second (1000/150), and in parallel mode about 10 (1000/100).
As described in the above case, the performance (concurrency, throughput, response time) of the traditional system will have bottlenecks.
Decoupling the email and SMS steps with a message queue:
With this architecture, the user's response time is essentially just the time to write the registration to the database: 50 ms. The email and SMS tasks are written to the message queue and the request returns immediately; the queue write is so fast it can basically be ignored, so the user still sees roughly 50 ms. After the change, system throughput rises to about 20 QPS: roughly 3x the serial design and 2x the parallel one.
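The timing argument can be checked with a small simulation (the 50 ms figure and the function names are assumptions from the example above, and a thread plus an in-process queue stands in for the MQ consumer):

```python
import queue
import threading
import time

TASK_SECS = 0.05  # assume each step takes ~50 ms

def write_db():   time.sleep(TASK_SECS)
def send_email(): time.sleep(TASK_SECS)
def send_sms():   time.sleep(TASK_SECS)

# Serial: the user waits for all three steps (~150 ms).
start = time.perf_counter()
write_db(); send_email(); send_sms()
serial_ms = (time.perf_counter() - start) * 1000

# Queued: write the DB, enqueue the notifications, return at once.
tasks = queue.Queue()
def worker():
    while True:
        fn = tasks.get()
        if fn is None:
            break
        fn()

t = threading.Thread(target=worker)
t.start()
start = time.perf_counter()
write_db()
tasks.put(send_email)   # enqueueing is near-instant
tasks.put(send_sms)
queued_ms = (time.perf_counter() - start) * 1000
tasks.put(None)
t.join()

print(f"serial ~{serial_ms:.0f} ms, with queue ~{queued_ms:.0f} ms")
```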
4.3 Traffic peak shaving
Peak shaving is another widespread use of message queues, typically in flash-sale ("seckill") or group-buying events.
Scenario description: flash sales bring a sudden traffic surge; if it is too large, the application or the database goes down. The usual fix is to put a message queue in front of the application.
The architecture is as follows :
The benefits of adding a message queue:
- It caps the number of users admitted to the event
- It absorbs the short burst of high traffic instead of letting it hit the application
After the server receives a user request, it first writes it to the message queue. If the queue has already reached its maximum length, the request is rejected outright or the user is redirected to an error page.
The flash-sale service then processes requests from the queue at its own pace.
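The admission-control step above can be sketched with a bounded queue (the capacity of 3 and the function name are illustrative): requests beyond the configured capacity are rejected up front rather than overwhelming the service behind it.

```python
from queue import Full, Queue

MAX_PENDING = 3                    # assumed event capacity
requests = Queue(maxsize=MAX_PENDING)

def accept(user_id):
    try:
        requests.put_nowait(user_id)   # admitted into the queue
        return "queued"
    except Full:
        return "rejected"              # or redirect to an error page

results = [accept(u) for u in range(5)]
print(results)  # ['queued', 'queued', 'queued', 'rejected', 'rejected']
```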
4.4 Massive log data synchronization
Scenario description: in a microservice system, projects are usually deployed as clusters, so a unified log platform is needed to query the logs of every instance. Cluster log volume is typically massive, and a single log-collection tool cannot keep up with the business, so a message queue, most commonly Kafka, is introduced into the log pipeline to handle the heavy transfer of log data.
The architecture is simplified as follows :
Architecture Description :
- Log collection clients gather log data and periodically write it to the Kafka queue.
- The Kafka message queue receives, stores, and forwards the log data.
- Log processing applications subscribe to and consume the log data in the Kafka queue.
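The pipeline above can be simulated in-process (a shared queue stands in for a Kafka topic; the names `collector` and `processor` are illustrative, not a real Kafka client API):

```python
import json
import queue
import threading

log_topic = queue.Queue()   # stand-in for a Kafka topic
indexed = []                # stand-in for the log platform's index

def collector(instance, lines):
    # Collection client: ships each log line as a JSON record.
    for line in lines:
        log_topic.put(json.dumps({"instance": instance, "line": line}))

def processor():
    # Processing application: drains the topic and indexes records.
    while True:
        record = log_topic.get()
        if record is None:       # shutdown sentinel for this demo
            break
        indexed.append(json.loads(record))

p = threading.Thread(target=processor)
p.start()
collector("app-1", ["GET /a 200"])
collector("app-2", ["GET /b 500"])
log_topic.put(None)
p.join()
print(len(indexed), "log records indexed")
```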
4.5 Distributed transactions
Distributed consistency comes in three strengths: strong, weak, and eventual.
- Strong consistency: once an update completes, any subsequent access by any process or thread returns the latest value. This is friendliest to users: whatever was last written is guaranteed to be read next. By the CAP theorem, achieving it requires sacrificing availability.
- Weak consistency: the system does not guarantee that subsequent accesses return the latest value. After a successful write, it promises neither that the new value can be read immediately nor that the latest value will be read at any given access.
- Eventual consistency: a specific form of weak consistency. The system guarantees that, absent further updates, it will eventually return the value of the last update. When no failures occur, the length of the inconsistency window depends mainly on communication delay, system load, and the number of replicas. DNS is a classic eventually consistent system.
In a distributed system it is essentially impossible to satisfy consistency, availability, and partition tolerance (the CAP theorem) at the same time. In the vast majority of Internet scenarios, strong consistency is traded away for high availability: the system only needs to guarantee eventual consistency, as long as the convergence time is acceptable to users. Sometimes a short window of inconsistency is exactly the price paid for the effect we want.
Scenario description: take orders and inventory; placing an order reduces to "add an order" plus "subtract stock". Orders and inventory are independent services, so how is data consistency guaranteed between them?
The maddening thing about remote calls is that there are three possible outcomes: success, failure, and timeout; and a timeout might turn out to be either success or failure. The common solution, in most practice, is to use MQ to achieve eventual consistency.
To achieve eventual consistency: execute the transaction locally first; if it succeeds, send a message, and the consumer takes the message and executes its own transaction.
With this architecture, some questions naturally come to mind:
Suppose a and b are two services and service a calls service b asynchronously. Whether b fails, succeeds, or times out, how does MQ make the two eventually consistent?
Borrowing from the idea of local transactions, this scenario splits into three cases:
- Case 1: both a and b execute normally, and the whole business flow completes normally;
- Case 2: b times out. MQ must redeliver the message to b (so service b must be idempotent). If redelivery keeps failing, it depends on the situation: abort the flow, keep retrying, or even bring in a human;
- Case 3: one of a and b fails. The failed service sends a message through MQ to the other services. A service that receives it checks its local transaction log: if its own transaction also failed, it deletes the message (marking it consumed); if its own transaction succeeded, it must call a compensation interface to roll back (so every service must expose a business compensation interface).
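Case 2 hinges on the consumer being idempotent under MQ redelivery. A minimal sketch (the dedup set stands in for a local transaction log; all names are illustrative): a redelivered message with the same message id must not deduct inventory twice.

```python
stock = {"sku-1": 10}
processed_ids = set()   # stand-in for a local transaction/dedup log

def consume(message):
    # Idempotency guard: a redelivered message id is ignored.
    if message["id"] in processed_ids:
        return "duplicate ignored"
    stock[message["sku"]] -= message["qty"]
    processed_ids.add(message["id"])
    return "processed"

msg = {"id": "m-1", "sku": "sku-1", "qty": 2}
print(consume(msg))   # first delivery is processed
print(consume(msg))   # MQ redelivery after a timeout is a no-op
print(stock)          # stock was deducted exactly once
```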
Special attention is required:
There is a pitfall here: this pattern generally only tolerates failure of the first operation. Once the first operation succeeds, the later operations must have no business-level obstacles, because a failure further down is hard to roll back. In other words, the pattern allows system failures but not business failures, and a business failure is unlikely to succeed on retry. A failure caused by the network or a crash can be solved by retrying; a business-level failure can only be handled by sending a message back to service a and asking it to compensate. Compensation is usually performed by a dedicated third party, and every service needs to provide a compensation interface. As a design rule, business-level failure in downstream consumers is simply not allowed.
5. Summary
MQ appears more and more in distributed system development, and its processing capabilities keep growing, so it is necessary to master MQ's usage scenarios. With MQ you can solve most business scenarios, and it also earns you points in interviews and strengthens your core competitiveness.
Finally, job hunting is never easy. I hope every reader finds a job they love, and that the Year of the Tiger brings a full harvest!
I would also be grateful if you could follow, like, bookmark, and comment to show your support. Many thanks!