RabbitMQ Advanced Features: Solutions for 100% Successful Message Delivery

Designs for guaranteeing 100% successful message delivery


What does reliable delivery mean on the producer side?

 

  1. Ensure the message is sent successfully
  2. Ensure the MQ node receives the message successfully
  3. The sender receives an acknowledgment from the MQ node (Broker)
  4. Put a sound compensation mechanism in place for messages

If you want to guarantee 100% successful delivery, the first three steps alone are not enough. In some extreme cases the producer may fail while delivering the message. Or the producer delivers the message and the MQ Broker receives it, but a network glitch while the Broker is returning the acknowledgment means the producer never gets the response, so it cannot tell whether the delivery succeeded or failed. Situations like these call for a compensation mechanism.

Solutions used by large Internet companies:

  1. Persist the message to a database and mark its status
  2. Deliver a delayed message, do a second confirmation, and check via a callback

Which one to use depends on the business scenario, the level of concurrency, and the data volume.

Scheme one: persist the message to the database and mark its status. It works as follows:

Step 1: Persist the business data. For example, when sending an order message, first store the order record, then generate the message and store it as well. The message should carry a status attribute with an initial value of, say, 0, meaning the message has been created and is being sent. The drawback of this approach is that it requires two database writes.

Step 2: First make sure the writes in step 1 succeeded without any exception; only then does the producer send the message. If the writes fail, fail fast.

Step 3: After receiving the message, the MQ Broker sends a response (confirm) back to the producer.

Step 4: The producer has a Confirm Listener that asynchronously listens for the response sent back by the Broker to determine whether the message was delivered. If it was, the listener looks the message up in the database and updates its status (for example, to 1) to indicate successful delivery.

Suppose step 2 went fine, but while the Broker is sending the response in step 3 the network suddenly glitches, so the producer's Listener never receives the confirm. The status of this message then stays at 0.
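Steps 1 to 4, and the stuck-at-0 case just described, can be sketched as follows. This is only a minimal illustration under assumptions: it uses an in-memory SQLite database, the table and function names (orders, message_log, store_order_and_message, on_confirm) are hypothetical, the value 1 for "delivered" is assumed, and writing both rows in one transaction is a design choice of the sketch, not something the scheme above prescribes. The broker itself is not modelled; on_confirm stands in for whatever the Confirm Listener would call.

```python
import sqlite3
import uuid

STATUS_SENDING = 0    # initial status from step 1
STATUS_DELIVERED = 1  # assumed value for "delivered"


def create_schema(db):
    # Hypothetical tables: one for business data, one for the message log.
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT)")
    db.execute(
        "CREATE TABLE message_log ("
        "message_id TEXT PRIMARY KEY, order_id TEXT, status INTEGER)"
    )


def store_order_and_message(db, item):
    """Step 1: store the order, then the message with status 0."""
    order_id, message_id = str(uuid.uuid4()), str(uuid.uuid4())
    with db:  # commits on success, rolls back if either insert raises
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        db.execute(
            "INSERT INTO message_log VALUES (?, ?, ?)",
            (message_id, order_id, STATUS_SENDING),
        )
    return message_id


def on_confirm(db, message_id, ack):
    """Step 4: called by the confirm listener with the broker's answer."""
    if ack:
        with db:
            db.execute(
                "UPDATE message_log SET status = ? WHERE message_id = ?",
                (STATUS_DELIVERED, message_id),
            )


def status_of(db, message_id):
    row = db.execute(
        "SELECT status FROM message_log WHERE message_id = ?", (message_id,)
    ).fetchone()
    return row[0]
```

If on_confirm is never called for a message, or is called with ack=False, its row keeps status 0, which is exactly the state a compensation step has to detect.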

Step 5: At this point we need a rule, such as a timeout measured from the moment the message was stored: if the status is still 0 after five minutes, the message needs to be picked up. A distributed scheduled task periodically fetches from the DB the messages that were created more than 5 minutes ago and still have status 0.

Step 6: Redeliver the fetched messages (Retry Send), that is, continue again from step 2.

Step 7: Of course, some messages cannot be delivered for practical reasons, for example the routingKey is not set correctly or the corresponding queue was deleted by mistake. Such a message still fails no matter how many times it is retried, so the number of retries must be limited, say to 3. If the delivery count exceeds 3, the message's status is updated to 2, indicating that delivery has finally failed.
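The compensation rule in steps 5 to 7 can be sketched as one run of the scheduled task. This is a hedged illustration, not a definitive implementation: the MessageRecord class, the compensate function, and the resend callback are hypothetical names, records live in a plain list instead of a database, and "more than 3 deliveries" is read here as "give up once 3 retries have already been made".

```python
from dataclasses import dataclass

STATUS_SENDING = 0        # still waiting for a confirm
STATUS_FAILED = 2         # final delivery failure (step 7)
TIMEOUT_SECONDS = 5 * 60  # the five-minute threshold from step 5
MAX_RETRIES = 3           # retry limit from step 7


@dataclass
class MessageRecord:
    message_id: str
    created_at: float  # epoch seconds when the message was stored
    status: int = STATUS_SENDING
    retry_count: int = 0


def compensate(records, now, resend):
    """One run of the scheduled task over the stored message records."""
    for rec in records:
        timed_out = now - rec.created_at > TIMEOUT_SECONDS
        if rec.status != STATUS_SENDING or not timed_out:
            continue  # already confirmed or failed, or still within the timeout
        if rec.retry_count >= MAX_RETRIES:
            rec.status = STATUS_FAILED  # stop retrying, mark as finally failed
        else:
            rec.retry_count += 1
            resend(rec.message_id)  # step 6: go back to step 2
```

In this sketch a timed-out message is retried on every run until the limit is reached, so the spacing between retries is simply the interval at which the task is scheduled.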

Is scheme one suitable for high-concurrency scenarios?

Scheme one requires two database writes, so the database becomes a performance bottleneck under high concurrency. In fact, only the core business data needs to be stored; the message itself does not have to be. Instead, the message can be followed by a delayed delivery, with a second confirmation and a callback check.

Scheme two: delayed message delivery with a second confirmation and a callback check. It works as follows:

 

Upstream service is the producer, Downstream service is the consumer, and Callback service is the callback service.

Step 1: First store the business data, then have the producer send the message. Note that the message must be sent only after the database operation has completed.

Step 2: After sending the message, the producer immediately sends a second one (Second Send Delay Check), a delayed check message. A delay must be set for it, for example delivery after 5 minutes.
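The producer side of steps 1 and 2 can be sketched with an in-memory stand-in for the broker. Everything here is an assumption made for illustration: FakeBroker, publish, release_due, and send_with_delay_check are hypothetical names and not a RabbitMQ API, and time is passed in explicitly so the behaviour is easy to follow.

```python
import heapq

CHECK_DELAY_SECONDS = 5 * 60  # the example delay of 5 minutes


class FakeBroker:
    """In-memory stand-in for MQ, used only to illustrate delayed delivery."""

    def __init__(self):
        self.ready = []     # messages that consumers could receive now
        self._delayed = []  # heap of (deliver_at, sequence, message)
        self._seq = 0       # tie-breaker so messages are never compared

    def publish(self, message, now, delay=0.0):
        if delay <= 0:
            self.ready.append(message)
        else:
            heapq.heappush(self._delayed, (now + delay, self._seq, message))
            self._seq += 1

    def release_due(self, now):
        """Move delayed messages whose delivery time has come to `ready`."""
        while self._delayed and self._delayed[0][0] <= now:
            self.ready.append(heapq.heappop(self._delayed)[2])


def send_with_delay_check(broker, message_id, now):
    # Called only after the business data has been stored (step 1).
    broker.publish({"type": "business", "id": message_id}, now)
    # Step 2: the second, delayed check message.
    broker.publish(
        {"type": "check", "id": message_id}, now, delay=CHECK_DELAY_SECONDS
    )
```

The business message is available immediately, while the check message only becomes visible once release_due is called with a time at least 5 minutes later.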

Step 3: The consumer listens on the specified queue and processes the messages it receives.

Step 4: After processing completes, the consumer sends a confirm message as its response. This is not a normal ACK; instead, a new message is generated and delivered to MQ.

Step 5: The Callback service above is a separate service. It effectively plays the role that the DB-stored messages play in scheme one: through MQ it listens for the confirm messages sent by the downstream service, and when it receives one it persists that message to the DB.

Step 6: After 5 minutes, the delayed message reaches MQ. The Callback service also listens on the queue for delayed messages. When it receives the check message, it looks in the DB for the corresponding record. If the record exists, nothing needs to be done. If it does not exist, or consumption failed, the Callback service makes an RPC call to the upstream service to tell it that the message for this delayed check was not found and must be resent. On receiving this, the producer queries the business data again and resends the message.
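The Callback service's part in steps 5 and 6 can be sketched as a small class. This is an illustration under assumptions rather than the actual service: the class and method names are hypothetical, a Python set stands in for the service's database, and a plain callable stands in for the RPC to the upstream service.

```python
class CallbackService:
    """Sketch of the compensation service in scheme two."""

    def __init__(self, request_resend):
        self.confirmed = set()                # stands in for the DB
        self.request_resend = request_resend  # stands in for the upstream RPC

    def on_confirm(self, message_id):
        """Step 5: persist the confirm sent by the downstream service."""
        self.confirmed.add(message_id)

    def on_delay_check(self, message_id):
        """Step 6: handle the delayed check message.

        Returns True if the message was confirmed, False if a resend
        had to be requested from the upstream service.
        """
        if message_id in self.confirmed:
            return True  # consumer already confirmed, nothing to do
        self.request_resend(message_id)
        return False
```

For example, with `svc = CallbackService(resent.append)`, calling `svc.on_confirm("m1")` and then `svc.on_delay_check("m1")` does nothing further, while `svc.on_delay_check("m2")` for an unconfirmed message appends "m2" to the `resent` list.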

Scheme two is the more classic, mainstream solution at large Internet companies:

Scheme two does not necessarily guarantee 100% successful delivery, but it basically ensures that about 99.9% of messages are fine. The particularly extreme cases can only be compensated manually or with scheduled tasks.

The main purpose of scheme two is to reduce database operations and improve concurrency. In high-concurrency scenarios, the biggest concern is not 100% delivery success but performance: the system must be able to withstand that much concurrent load. So database operations should be cut wherever possible, and compensation can be done asynchronously.

In fact, the Callback service is not part of the main flow; it is a compensation service. The core link is: the producer stores the business data and sends a message to MQ, and the consumer listens on the queue and consumes the message. The other steps are all compensation mechanisms.

 


Origin blog.csdn.net/LuuvyJune/article/details/92772358