RabbitMQ Troubleshooting Experience

Recently a business flow that relies on RabbitMQ for decoupling stopped working. I troubleshot it, and this is a brief summary:

1. First, get the overall idea straight. As I understand it, the general pattern for decoupling business logic with RabbitMQ is: the core business logic finishes -> the producer produces a message -> the message is delivered to the message queue -> the queue holds the message -> the consumer receives the message -> the consumer runs the related business processing.
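The flow above can be sketched in a few lines. This is only an in-memory illustration: a `BlockingQueue` stands in for the RabbitMQ queue, and the class name, method names and message text are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DecouplingFlowSketch {

    /** Producer side: the core business step finishes, then a message is published. */
    static void completeOrderAndPublish(String orderId, BlockingQueue<String> queue) {
        // ... the core business logic would run here ...
        queue.add("order-completed:" + orderId);
    }

    /** Consumer side: take every waiting message and run the follow-up processing. */
    static List<String> consumeAll(BlockingQueue<String> queue) {
        List<String> handled = new ArrayList<>();
        String msg;
        while ((msg = queue.poll()) != null) {
            handled.add("handled " + msg);
        }
        return handled;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the broker queue (hypothetical, not RabbitMQ itself).
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        completeOrderAndPublish("42", queue);
        System.out.println("messages waiting: " + queue.size());
        System.out.println(consumeAll(queue));
    }
}
```

Each arrow in the flow is a place where a message can go missing, which is what the list of possibilities in step 4 enumerates.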

2. With the flow clear, the next step is to confirm that everything points at the same environment: Redis (which database index is in use, db 1 or another), MongoDB, RabbitMQ (the connection address), and any persistent databases involved.

3. With the environment confirmed, run the business flow again to see whether the problem is still there. It was. So I prepared to track it down locally, which means switching all of the local configuration over to that environment.

4. Since there is a problem, there are several possibilities:

   a. the producer-side business code has a bug, so no message is produced
   b. the producer produces the message but sends it to the wrong queue
   c. there is a problem in the MQ environment: the message is sent, but the queue stays empty
   d. the consumer receives messages from the wrong queue
   e. the message reaches the queue but is consumed by another consumer
   f. the consumer receives the message, but a bug in the consumer code prevents normal processing
   g. the message format is wrong

5. Based on these hypotheses: in my case there may be several producers (the message is sent from different systems) but only one consumer, which is in fact the usual setup. Starting from the producers is clearly not ideal, because it is not obvious which one to look at. (If testers have already told you that messages from some systems work, starting with the producers that do not may be fastest.) So I decided to start from the consumer, where the possible causes are d, e and f. First I checked which queue the consumer listens on, assumed here to be test.queue, and confirmed the format of the mqMsg it expects so that I could prepare a matching message. Then I opened the RabbitMQ management UI, found test.queue, and saw that it really has only one consumer, which rules out e (shown in a screenshot in the original post).
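The same consumer count can also be read without the browser, through the management plugin's HTTP API. The sketch below assumes Java 11+, the management plugin on its default port 15672, the default vhost `/`, and the default `guest`/`guest` account on localhost; adjust all of these to your environment. The regex is a crude stand-in for a real JSON parser.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueueConsumerCheck {

    /** Percent-encodes one path segment; the vhost "/" becomes %2F. */
    static String segment(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Builds the URI of GET /api/queues/{vhost}/{queue}. */
    static URI queueUri(String baseUrl, String vhost, String queueName) {
        return URI.create(baseUrl + "/api/queues/" + segment(vhost) + "/" + segment(queueName));
    }

    /** Pulls the top-level "consumers" number out of the queue JSON. */
    static int consumerCount(String queueJson) {
        Matcher m = Pattern.compile("\"consumers\"\\s*:\\s*(\\d+)").matcher(queueJson);
        if (!m.find()) {
            throw new IllegalArgumentException("no consumers field in response");
        }
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) throws Exception {
        // Assumed credentials and address; replace with your own.
        String auth = Base64.getEncoder()
                .encodeToString("guest:guest".getBytes(StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest
                .newBuilder(queueUri("http://localhost:15672", "/", "test.queue"))
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("consumers on test.queue: " + consumerCount(response.body()));
    }
}
```

A count greater than one means another consumer may be taking the messages, which is case e.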

6. Start the consumer project locally and debug the consumer. Using the prepared message in the expected format, publish a message to test.queue from the queue's page in the management UI (I recommend starting only one consumer instance, which makes debugging easier). The original post shows this in a screenshot.
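If clicking through the UI becomes tedious, the same test message can be published through the management HTTP API, which accepts a POST to the default exchange (named `amq.default` in the API) with the queue name as routing key. This is a minimal sketch assuming Java 11+, localhost:15672, vhost `/` and `guest`/`guest`; the payload `{"userId":1}` is a made-up placeholder, not the real mqMsg format.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class PublishTestMessage {

    /** Escapes backslashes and double quotes (control characters are not handled). */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Request body for POST /api/exchanges/{vhost}/amq.default/publish. */
    static String publishBody(String queueName, String payload) {
        return "{\"properties\":{},"
                + "\"routing_key\":\"" + jsonEscape(queueName) + "\","
                + "\"payload\":\"" + jsonEscape(payload) + "\","
                + "\"payload_encoding\":\"string\"}";
    }

    public static void main(String[] args) throws Exception {
        // Assumed credentials and address; replace with your own.
        String auth = Base64.getEncoder()
                .encodeToString("guest:guest".getBytes(StandardCharsets.UTF_8));
        String body = publishBody("test.queue", "{\"userId\":1}"); // hypothetical payload
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:15672/api/exchanges/%2F/amq.default/publish"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The API answers with a small JSON object saying whether the message was routed.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Publishing straight to the queue this way bypasses the real producers, so any failure that remains must lie in the broker or the consumer.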

7. At this point the message is visible in test.queue, which rules out c (the original post shows this in a screenshot).

8. The consumer then receives the message and parses it without any exception, which rules out g. Stepping through with breakpoints, the entire consumer code runs to completion with no errors; the results check out and the business logic completes successfully, which rules out f.

9. So the consumer and the queue are both fine, which means the problem is clearly on the producer side. I first worked out which systems produce this message, and where in each system it is produced.

10. Start each project that produces the message and debug the relevant business code one by one (with several producers this step is very time-consuming). Eventually I found, in the user system, a check along the lines of if (xxOptional.isPresent()) { produce the message }. Given the business code before it, xxOptional is always empty no matter what, so the message is never produced. The check itself serves no purpose; a colleague probably did not consider this case earlier. After confirming with colleagues, the check could be removed.
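The shape of that bug can be reproduced in isolation. In this sketch the `outbox` list is a stand-in for the real publish call, and the method names and the message text are invented for illustration; only the `xxOptional.isPresent()` guard comes from the actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class OptionalGuardSketch {

    /** Buggy shape: upstream code never fills xxOptional, so nothing is published. */
    static void publishGuarded(Optional<String> xxOptional, String message, List<String> outbox) {
        if (xxOptional.isPresent()) {
            outbox.add(message); // stand-in for sending to RabbitMQ
        }
    }

    /** After the fix: the useless guard is gone and the message is always published. */
    static void publishFixed(String message, List<String> outbox) {
        outbox.add(message); // stand-in for sending to RabbitMQ
    }

    public static void main(String[] args) {
        List<String> outbox = new ArrayList<>();

        // The Optional is always empty in practice, so the guarded path sends nothing.
        publishGuarded(Optional.empty(), "user-updated", outbox);
        System.out.println("guarded: " + outbox.size() + " message(s)");

        publishFixed("user-updated", outbox);
        System.out.println("fixed: " + outbox.size() + " message(s)");
    }
}
```

A guard like this fails silently: no exception, no log line, just a missing message, which is why it only surfaced after the consumer and the queue had been ruled out.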

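11. After fixing the producer side, I tested locally and the problem was resolved. I then deployed to the test environment and tested again, and the problem was resolved there as well.

12. Sometimes several of these causes occur at once. In that situation, start from the consumer; the RabbitMQ management UI lets you narrow the problem down fairly quickly.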


Origin blog.csdn.net/qq_29231037/article/details/90409998