Pitfalls encountered using ActiveMQ in a production environment

Reference website:

ActiveMQ official website: http://activemq.apache.org/

https://blog.csdn.net/yinwenjie/article/details/51205822 (written by a true expert, highly recommended)

Summary:

This chapter records the pitfalls I ran into when using CMS (the ActiveMQ C++ client) and ActiveMQ in a production environment. It was a painful experience.

1. ActiveMQ cluster deployment

I set up the cluster following the approach in the expert's article referenced above: static bridging mode, ActiveMQ version 5.9.0, with the cluster running on CentOS release 6.8.

Schematic of the cluster:

 

2. Repackaging the CMS client as a static library for production use

The version used here is cpp-3.9.3. The demo on the official website is not sufficient for a production environment, so it needs to be repackaged. For company reasons I cannot provide the detailed source code. Below are some very important pieces of code; why they matter is explained later:

 

3. Pitfalls when using the CMS static library

I handed this library to a back-end colleague. At first it worked normally, but then he noticed something wrong: why did the connection sometimes fail to be established? He came to me, and at first I was just as puzzled. I then looked at his test environment and at the test cluster. After restarting his service many, many times, I saw an interesting phenomenon, shown in the figures below.

The colleague's program environment:

ActiveMQ cluster environment:

As you can see, two TCP connections were stuck in an intermediate state of the disconnect process. This puzzled me: it happens at random, but the probability is not small. Random problems like this are the most annoying.

4. Troubleshooting

At this point there was nothing to do but bite the bullet. Before delivering the library to my colleague I had done some initial testing and found nothing wrong. Given the problem above, I then ran thorough tests based on my colleague's program architecture.

I also ran concurrency tests: multi-threaded and multi-process connection tests. Sure enough, the problem appeared. It was reproduced when multiple processes connected concurrently. The test screenshot:

Below is the exception thrown by the demo program: DataInputStream::readLong - Reached EOF

It took a long time and a lot of effort to pin down where the problem was.

Thinking it over, this is rather unusual. When the program restarts, the initialization process starts connecting to MQ, and the MQ cluster actively disconnects. In principle this should not happen. And even when the MQ cluster does disconnect first, CMS should then send its own FIN to complete the TCP teardown, rather than staying in the CLOSE_WAIT state indefinitely.
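To make the CLOSE_WAIT point concrete, here is a minimal sketch of what the passive side of a close is expected to do. It is not the CMS internals; the function name drainAndClose is made up for illustration. A socket stays in CLOSE_WAIT after the peer's FIN arrives until the local application calls close(), which is what sends the local FIN.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cerrno>

// Hypothetical helper: reads from a stream socket until the peer has closed
// its side (recv returns 0), then closes our side.
// Returns true if EOF was seen, false if recv failed for another reason.
bool drainAndClose(int fd) {
    char buf[256];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n > 0) {
            continue;  // discard remaining payload
        }
        if (n < 0 && errno == EINTR) {
            continue;  // interrupted by a signal, retry
        }
        bool sawEof = (n == 0);
        // On a TCP socket this close() sends our FIN; without it the
        // connection would remain in CLOSE_WAIT.
        close(fd);
        return sawEof;
    }
}
```

If the client never reaches the equivalent of that close() call after the broker disconnects, the connection would stay in CLOSE_WAIT, which matches the state observed above.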

Since the problem appears with multi-process concurrency (but not with multi-threaded connections), I added a 1-second sleep between forking each child process, and then everything worked normally. Bizarre.
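The staggered fork can be sketched roughly as follows. This is an illustration under assumptions, not the original test program: spawnStaggered, waitAll and the childMain callback (standing in for "connect to MQ and run") are hypothetical names.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <functional>
#include <vector>

// Forks `count` children, sleeping `delaySeconds` between forks so the
// children do not all start connecting at the same instant.
// `childMain` is a hypothetical stand-in for the per-process MQ client logic.
std::vector<pid_t> spawnStaggered(int count, unsigned delaySeconds,
                                  const std::function<int()>& childMain) {
    std::vector<pid_t> children;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(childMain());  // child: run and exit, never return to caller
        }
        if (pid > 0) {
            children.push_back(pid);
        }
        if (i + 1 < count) {
            sleep(delaySeconds);  // the workaround: pause before the next fork
        }
    }
    return children;
}

// Waits for every child and returns how many exited with status 0.
int waitAll(const std::vector<pid_t>& children) {
    int ok = 0;
    for (pid_t pid : children) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0) {
            ++ok;
        }
    }
    return ok;
}
```

Calling spawnStaggered(n, 1, childMain) corresponds to the 1-second pause described above. It only spreads out the connection attempts; it does not explain why simultaneous attempts fail.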

Then I checked the ActiveMQ bug tracker and found some relevant information:

It looks like this version has some problems...

5. Optimization

At this point I realized I could not fix this problem myself, short of verifying against a newer version. But can the problem be avoided?

As the demo screenshot shows, when the problem occurs under multi-process concurrency, the exception callback function is invoked. Note that the exception callback must be registered, as in the code from section 2, for it to fire. The callback prototype is: virtual void onException(const CMSException& ex);

Since we know an exception has occurred, the exception callback can set a "needs reconnect" state, and the program can then reconnect. In testing, the reconnection succeeds, which avoids the problem. If anyone knows how to solve the underlying problem, please tell me in the comments below. I would be very grateful.
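A minimal sketch of this flag-and-reconnect pattern follows. It is not the original library code. To compile without the CMS headers it uses a stand-in CMSException struct; in real code the class would derive from cms::ExceptionListener and be registered with setExceptionListener() on the connection. ReconnectingClient and the connect callback are hypothetical names.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <utility>

// Stand-in for cms::CMSException so the sketch compiles on its own.
struct CMSException {
    std::string message;
};

class ReconnectingClient {
public:
    // `connect` is a hypothetical callable that (re)creates the connection
    // and returns true on success.
    explicit ReconnectingClient(std::function<bool()> connect)
        : connect_(std::move(connect)) {}

    // Exception callback. It may run on a library thread (an assumption
    // here), so it only records that a reconnect is needed.
    void onException(const CMSException& ex) {
        (void)ex;
        needsReconnect_.store(true);
    }

    // Called from the application's own loop. Returns true if no reconnect
    // was pending or a reconnect attempt succeeded.
    bool reconnectIfNeeded(int maxAttempts) {
        if (!needsReconnect_.exchange(false)) {
            return true;  // nothing to do
        }
        for (int i = 0; i < maxAttempts; ++i) {
            if (connect_()) {
                return true;
            }
        }
        needsReconnect_.store(true);  // still broken, try again later
        return false;
    }

    bool needsReconnect() const { return needsReconnect_.load(); }

private:
    std::function<bool()> connect_;
    std::atomic<bool> needsReconnect_{false};
};
```

Keeping the callback down to setting a flag, and doing the actual reconnect from the application's own thread, avoids tearing down the connection from inside the callback that is reporting its failure.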

 

 

 

 


Origin blog.csdn.net/wangdamingll/article/details/102844988