Some common problems with ActiveMQ

1. The most serious problem first: the service hangs.

To explain this, we have to start with how ActiveMQ stores messages. Normally, non-persistent messages are kept in memory and persistent messages are stored in files; the maximum size of each is configured in the <systemUsage> node of the configuration file. However, when non-persistent messages pile up and memory runs short, ActiveMQ writes the in-memory non-persistent messages to temporary files to free memory. Both kinds then live on disk, but with one difference: persistent messages are restored from their files after a restart, whereas the temporary files holding non-persistent messages are simply deleted.

So what happens when a file reaches the maximum size set in the configuration? I ran the following experiments:

Set the persistent store limit to about 2 GB and produce persistent messages in bulk until the store reaches the limit. At this point the producer blocks, but consumers can still connect and consume normally. Once some messages have been consumed and files have been deleted to free space, the producer can send again and the service recovers on its own.

Set the temporary file limit to about 2 GB and produce non-persistent messages in bulk so that they spill into temporary files. When the limit is reached, the producer blocks; consumers can still connect but cannot consume any messages, or consumers that had been consuming slowly stop abruptly. The whole system accepts connections but provides no service: it has effectively hung.

I do not know the exact cause. Solution: avoid non-persistent messages where possible. If you must use them, set the temporary file limit as high as you can.
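
As a rough illustration of where these limits live, the fragment below sketches the <systemUsage> node inside the <broker> element of activemq.xml. The 2 gb values mirror the experiments above; the 64 mb memory limit is an assumed placeholder, not a recommendation, and all three values should be sized for your own broker.

```xml
<systemUsage>
  <systemUsage>
    <!-- Memory available for messages held in RAM (assumed example value) -->
    <memoryUsage>
      <memoryUsage limit="64 mb"/>
    </memoryUsage>
    <!-- Disk limit for the persistent message store -->
    <storeUsage>
      <storeUsage limit="2 gb"/>
    </storeUsage>
    <!-- Disk limit for temporary files holding spilled non-persistent messages -->
    <tempUsage>
      <tempUsage limit="2 gb"/>
    </tempUsage>
  </systemUsage>
</systemUsage>
```

Raising the tempUsage limit only postpones the hang described above; it does not remove it.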

For detailed configuration information, see the documentation: http://activemq.apache.org/producer-flow-control.html

2. Lost messages

This one starts with Java's java.net.SocketException. Simply put: a sender transmits a batch of data over the network and then calls close to close the connection. The data already sent sits in the receiver's buffer, and if the receiver calls read, it can still read that data even though the other side has closed the connection. But when the receiver tries to send data, an exception occurs because the connection is closed, which is easy to understand. What needs attention is that once the SocketException occurs, the data still in the receive buffer is invalidated as well. If the receiver then calls read again to fetch the buffered data, it gets the error Software caused connection abort: recv failed.
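
The first half of that behaviour (data stays readable after the peer has closed) can be reproduced with plain JDK sockets. The sketch below is only an illustration over the loopback interface; the class and method names are made up for this example, and it deliberately does not try to reproduce the write-then-read failure, whose exact error message depends on the operating system.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ActiveMQ.
final class ReadAfterPeerClose {

    /** Sends a small payload over loopback, closes the sender, and only then reads on the receiving side. */
    static String readAfterPeerClose(String payload) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket sender = new Socket(loopback, server.getLocalPort());
             Socket receiver = server.accept()) {

            // Sender: write everything, then close the connection immediately.
            // The payload is assumed small enough to fit in the socket buffers.
            OutputStream out = sender.getOutputStream();
            out.write(payload.getBytes(StandardCharsets.UTF_8));
            out.flush();
            sender.close();

            // Receiver: the peer has already closed, but the bytes are still buffered locally.
            InputStream in = receiver.getInputStream();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) { // -1 marks the orderly close by the peer
                received.write(chunk, 0, n);
            }
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterPeerClose("hello"));
    }
}
```

Because the receiver never writes to the closed connection here, no SocketException is raised and the buffered data survives; the message loss described in this section only starts once a write is attempted.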

Packet captures show that ActiveMQ sends a heartbeat packet every 10 seconds. The heartbeat is sent by the server to the client to check whether the client is still alive. As described in problem 1 above, non-persistent messages are written to a file once enough of them accumulate. This write blocks everything else and lasts 20 to 30 seconds, growing longer as memory grows. When the client calls connection.close() after sending its messages, it expects the server to acknowledge the close. If no answer arrives within 15 seconds, the client closes the TCP connection directly at the socket layer. At that moment the messages the client sent are still waiting in the server's buffer to be processed, but when the server sends its heartbeat packet on the closed connection, a java.net.SocketException occurs, the buffered data is invalidated, and all unprocessed messages are lost.

Solution: use persistent messages, or make sure non-persistent messages are consumed promptly so they do not accumulate, or use a transaction. With a transaction, commit() dutifully waits for the server's response, so the connection is not closed early and the messages are not lost.

For more on java.net.SocketException, see my detailed write-up: http://blog.163.com/_kid/blog/static/3040547620160231534692/

 

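3. Persistent messages are very slow.

By default, non-persistent messages are sent asynchronously and persistent messages are sent synchronously. On a slower disk, the resulting send rate is unbearable. With a transaction open, however, all messages are sent asynchronously, and throughput improves by two orders of magnitude. So when sending persistent messages, be sure to use transacted mode. I also recommend using a transaction when sending non-persistent messages, since it does not hurt performance at all.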

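4. Uneven consumption of messages.

Sometimes, after sending some messages, you start two consumers to process them and find that one consumer handles all the messages while the other receives none. The cause is ActiveMQ's prefetch mechanism. When a consumer fetches messages, it does not fetch them one at a time but a batch at a time, 1000 by default. Until they are acknowledged, these prefetched messages are still visible in the management console, but they will not be assigned to any other consumer; their state at this point is best described as "assigned but not yet consumed". If a message is eventually consumed, it is deleted on the server; if the consumer crashes, its messages are reassigned to a new consumer. But if the consumer neither acknowledges the messages nor crashes, they sit in its buffer forever, unprocessed. The more common case is that processing each message is very time-consuming: you start 10 consumers, only to find that one machine is grinding through the work while the other 9 sit idle.

Solution: set prefetch to 1, so that each consumer handles one message at a time and fetches the next only after finishing. This is not much slower.

Detailed documentation: http://activemq.apache.org/what-is-the-prefetch-limit-for.html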
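
To make the effect of the prefetch size concrete, here is a toy model in plain Java. It is not ActiveMQ's dispatch algorithm: it simply assumes that consumers connect one after another, that each one's buffer is filled up to the prefetch limit on connect, and that a freed slot is refilled after each acknowledgment. The class and method names are invented for this sketch.

```java
import java.util.Arrays;

// Hypothetical toy model of prefetch-based dispatch, not ActiveMQ code.
final class PrefetchModel {

    /** Returns how many messages each consumer ends up processing. */
    static int[] simulate(int messages, int consumers, int prefetch) {
        int pending = messages;               // messages still held by the broker
        int[] buffered = new int[consumers];  // messages sitting in each consumer's prefetch buffer
        int[] processed = new int[consumers];

        // Consumers connect in order; each buffer is filled up to the prefetch limit.
        for (int c = 0; c < consumers; c++) {
            int batch = Math.min(prefetch, pending);
            buffered[c] = batch;
            pending -= batch;
        }

        // Each round, every consumer with buffered work processes one message,
        // acknowledges it, and the broker refills the freed slot if it can.
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int c = 0; c < consumers; c++) {
                if (buffered[c] == 0) {
                    continue;
                }
                buffered[c]--;
                processed[c]++;
                progress = true;
                if (pending > 0) {
                    buffered[c]++;
                    pending--;
                }
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(simulate(10, 2, 1000))); // prints [10, 0]
        System.out.println(Arrays.toString(simulate(10, 2, 1)));    // prints [5, 5]
    }
}
```

In ActiveMQ itself, one common way to apply the setting is a connection URI option such as jms.prefetchPolicy.queuePrefetch=1; the documentation linked in this section lists the available forms.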

5. Dead letter queue.

If you want a message that fails processing not to be deleted by the server, so that another consumer can process or retry it, you can turn off AUTO_ACKNOWLEDGE and let the program issue the ack itself. But if AUTO_ACKNOWLEDGE is in use, when is a message acknowledged, and is there any way to prevent the acknowledgment? There is!

There are two ways to consume messages. One is to call consumer.receive(), which blocks until a message is available and returns it. In this case the message is acknowledged automatically once it has been returned to the caller. The other is to use a listener callback: when a message arrives, the onMessage method of the listener interface is called. In this case the message is not acknowledged until onMessage has finished executing, so throwing an exception inside the method prevents the acknowledgment. This raises a question. A message that cannot be processed is returned to the server for redistribution. If there is only one consumer, it receives the message again and throws again. Even with multiple consumers, a message that cannot be processed on one server usually cannot be processed on another either. Does this become an endless loop of return, fetch, and fail?

It does not. After 6 retries, ActiveMQ considers the message "poisonous" and moves it to the dead letter queue. If one of your messages has gone missing, look in ActiveMQ.DLQ; it may well be lying there.
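
The retry-then-dead-letter flow can be sketched as a small plain-Java model. This is an illustration only, not ActiveMQ's implementation: the class, its fields, and the use of java.util.function.Consumer as a stand-in for a listener's onMessage are all assumptions of this example. The limit of 6 is the retry count quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of redelivery and dead-lettering, not ActiveMQ code.
final class RedeliveryModel {

    static final int MAX_REDELIVERIES = 6; // retry count quoted in the text

    final List<String> deadLetters = new ArrayList<>(); // stand-in for ActiveMQ.DLQ
    int deliveries;                                     // total delivery attempts so far

    /** Delivers until the listener returns normally or the retries run out; true means acknowledged. */
    boolean deliver(String message, Consumer<String> listener) {
        // One initial delivery plus up to MAX_REDELIVERIES retries.
        for (int attempt = 0; attempt <= MAX_REDELIVERIES; attempt++) {
            deliveries++;
            try {
                listener.accept(message);
                return true; // listener finished without throwing: message acknowledged
            } catch (RuntimeException e) {
                // listener threw: message is not acknowledged and will be redelivered
            }
        }
        deadLetters.add(message); // retries exhausted: treat the message as poison
        return false;
    }

    public static void main(String[] args) {
        RedeliveryModel broker = new RedeliveryModel();
        boolean acked = broker.deliver("bad", m -> {
            throw new IllegalStateException("cannot process " + m);
        });
        System.out.println(acked + " " + broker.deliveries + " " + broker.deadLetters);
    }
}
```

In this model a listener that always throws sees the message 7 times in total (the original delivery plus 6 retries) before it lands in the dead-letter list.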

Detailed documentation: http://activemq.apache.org/redelivery-policy.html

http://activemq.apache.org/message-redelivery-and-dlq-handling.html
