Problem Description
While a RabbitMQ consumer was listening in simple mode, the service suddenly shut down on its own, with no CPU or memory alarm beforehand.
The log from just before the shutdown showed an OOM (OutOfMemoryError), reported with this message:
Consumer thread error, thread abort.
But why does an exception cause the service to shut down?
When I first saw the OOM, I thought of the heap dump option in the startup parameters, which generates a heap dump file when an OOM occurs. However, when I checked the dump directory, no heap dump file had been generated, which was very strange.
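One thing worth ruling out at this point is whether the heap dump flag is actually active in the running process. The following is a minimal sketch, assuming a HotSpot-based JVM (it relies on com.sun.management.HotSpotDiagnosticMXBean); the class name HeapDumpFlagCheck is hypothetical and not part of the original service.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads HotSpot VM options from inside the running JVM.
class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // "true" only if the JVM was started with -XX:+HeapDumpOnOutOfMemoryError
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        // Empty unless -XX:HeapDumpPath=... was set
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

If the flag reports true and there is still no dump file, the OutOfMemoryError was probably not raised by this JVM running out of heap itself.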
Problem Analysis
After carefully reading the error log, I traced the error location to org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.AsyncMessageProcessingConsumer#run:
@Override // NOSONAR - complexity - many catch blocks
public void run() {
    // NOSONAR - line count
    if (!isActive()) {
        return;
    }
    boolean aborted = false;
    this.consumer.setLocallyTransacted(isChannelLocallyTransacted());
    String routingLookupKey = getRoutingLookupKey();
    if (routingLookupKey != null) {
        SimpleResourceHolder.bind(getRoutingConnectionFactory(), routingLookupKey); // NOSONAR both never null
    }
    if (this.consumer.getQueueCount() < 1) {
        if (logger.isDebugEnabled()) {
            logger.debug("Consumer stopping; no queues for " + this.consumer);
        }
        SimpleMessageListenerContainer.this.cancellationLock.release(this.consumer);
        if (getApplicationEventPublisher() != null) {
            getApplicationEventPublisher().publishEvent(
                    new AsyncConsumerStoppedEvent(SimpleMessageListenerContainer.this, this.consumer));
        }
        this.start.countDown();
        return;
    }
    try {
        initialize();
        while (isActive(this.consumer) || this.consumer.hasDelivery() || !this.consumer.cancelled()) {
            mainLoop();
        }
    }
    catch (InterruptedException e) {
        logger.debug("Consumer thread interrupted, processing stopped.");
        Thread.currentThread().interrupt();
        aborted = true;
        publishConsumerFailedEvent("Consumer thread interrupted, processing stopped", true, e);
    }
    catch (QueuesNotAvailableException ex) {
        logger.error("Consumer threw missing queues exception, fatal=" + isMissingQueuesFatal(), ex);
        if (isMissingQueuesFatal()) {
            this.startupException = ex;
            // Fatal, but no point re-throwing, so just abort.
            aborted = true;
        }
        publishConsumerFailedEvent("Consumer queue(s) not available", aborted, ex);
    }
    catch (FatalListenerStartupException ex) {
        logger.error("Consumer received fatal exception on startup", ex);
        this.startupException = ex;
        // Fatal, but no point re-throwing, so just abort.
        aborted = true;
        publishConsumerFailedEvent("Consumer received fatal exception on startup", true, ex);
    }
    catch (FatalListenerExecutionException ex) {
        // NOSONAR exception as flow control
        logger.error("Consumer received fatal exception during processing", ex);
        // Fatal, but no point re-throwing, so just abort.
        aborted = true;
        publishConsumerFailedEvent("Consumer received fatal exception during processing", true, ex);
    }
    catch (PossibleAuthenticationFailureException ex) {
        logger.error("Consumer received fatal=" + isPossibleAuthenticationFailureFatal() +
                " exception during processing", ex);
        if (isPossibleAuthenticationFailureFatal()) {
            this.startupException =
                    new FatalListenerStartupException("Authentication failure",
                            new AmqpAuthenticationException(ex));
            // Fatal, but no point re-throwing, so just abort.
            aborted = true;
        }
        publishConsumerFailedEvent("Consumer received PossibleAuthenticationFailure during startup", aborted, ex);
    }
    catch (ShutdownSignalException e) {
        if (RabbitUtils.isNormalShutdown(e)) {
            if (logger.isDebugEnabled()) {
                logger.debug("Consumer received Shutdown Signal, processing stopped: " + e.getMessage());
            }
        }
        else {
            logConsumerException(e);
        }
    }
    catch (AmqpIOException e) {
        if (e.getCause() instanceof IOException && e.getCause().getCause() instanceof ShutdownSignalException
                && e.getCause().getCause().getMessage().contains("in exclusive use")) {
            getExclusiveConsumerExceptionLogger().log(logger,
                    "Exclusive consumer failure", e.getCause().getCause());
            publishConsumerFailedEvent("Consumer raised exception, attempting restart", false, e);
        }
        else {
            logConsumerException(e);
        }
    }
    catch (Error e) {
        //NOSONAR
        logger.error("Consumer thread error, thread abort.", e);
        publishConsumerFailedEvent("Consumer threw an Error", true, e);
        getJavaLangErrorHandler().handle(e);
        aborted = true;
    }
    catch (Throwable t) {
        //NOSONAR
        // by now, it must be an exception
        if (isActive()) {
            logConsumerException(t);
        }
    }
    finally {
        if (getTransactionManager() != null) {
            ConsumerChannelRegistry.unRegisterConsumerChannel();
        }
    }
    // In all cases count down to allow container to progress beyond startup
    this.start.countDown();
    killOrRestart(aborted);
    if (routingLookupKey != null) {
        SimpleResourceHolder.unbind(getRoutingConnectionFactory()); // NOSONAR never null here
    }
}
Having read the RabbitMQ listener source code before, I knew that this is where the consumer thread starts executing, consuming messages in a loop through mainLoop. If an exception or error is thrown during consumption, it is caught in one of the catch blocks above and a failure event is published through publishConsumerFailedEvent.
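The service shuts down once this failure has been handled. Note that the catch (Error e) branch, which logged the "Consumer thread error, thread abort." message, also calls getJavaLangErrorHandler().handle(e); as far as I can tell from the Spring AMQP documentation, the default handler there terminates the JVM, which would explain the shutdown.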
It turned out that the OOM was thrown while calling another service. It was not caught in the listener, so it propagated into the run method of SimpleMessageListenerContainer, where it triggered the failure event and the service was closed.
After checking the listener method, I found that exceptions were in fact being caught:
try {
} catch (Exception exception) {
    log.error("exception occur={}", exception);
} finally {
    channel.basicAck(message.getMessageProperties().getDeliveryTag(), false);
}
The listener uses manual ack mode and catches Exception, so how did the error above get thrown into SimpleMessageListenerContainer?
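It turns out that OutOfMemoryError does not inherit from Exception. It extends Error (through VirtualMachineError), and Error and Exception are separate subclasses of Throwable, so catch (Exception ...) never sees it.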
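To make this concrete, here is a small self-contained sketch. The class ErrorEscapeDemo and its methods are hypothetical names used only for illustration, and the OutOfMemoryError is thrown explicitly rather than by actually exhausting memory.

```java
// Demonstrates that catch (Exception) does not intercept an Error.
class ErrorEscapeDemo {

    /** Runs the task guarded only by catch (Exception). */
    static String runCatchingException(Runnable task) {
        try {
            task.run();
            return "completed";
        } catch (Exception e) {
            return "caught Exception: " + e.getClass().getSimpleName();
        }
    }

    /** Reports whether a Throwable escaped from runCatchingException. */
    static String observe(Runnable task) {
        try {
            return runCatchingException(task);
        } catch (Throwable t) {
            return "escaped: " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Simulated OOM: passes straight through catch (Exception)
        System.out.println(observe(() -> { throw new OutOfMemoryError("simulated"); }));
        // An ordinary runtime exception is caught as expected
        System.out.println(observe(() -> { throw new IllegalStateException("boom"); }));
        // Direct superclass of OutOfMemoryError
        System.out.println(OutOfMemoryError.class.getSuperclass().getSimpleName());
    }
}
```

Running main prints "escaped: OutOfMemoryError", then "caught Exception: IllegalStateException", then "VirtualMachineError".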
If you also want to catch Error, you need to add it to the catch clause:
try {
} catch (Exception | Error e) {
    // A multi-catch declares a single variable shared by all listed types
    log.error("exception or error occurred", e);
} finally {
    channel.basicAck(message.getMessageProperties().getDeliveryTag(), false);
}
Conclusion
Therefore, when using manual ack mode, make sure that neither exceptions nor errors escape from the listener method into the consumer thread.
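As an illustration of this rule, the following is a minimal sketch of a listener body that lets nothing escape and always acknowledges. It does not use the real RabbitMQ classes: Acker and Handler are hypothetical stand-ins for channel.basicAck and the business logic, so that the pattern can be run on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a "nothing escapes" listener body for manual ack mode.
class SafeListenerSketch {

    /** Hypothetical stand-in for the ack call on a channel. */
    interface Acker {
        void ack(long deliveryTag);
    }

    /** Hypothetical stand-in for the business logic of the listener. */
    interface Handler {
        void handle(String body) throws Exception;
    }

    /** Returns true on success, false if the handler threw anything at all. */
    static boolean consume(String body, long deliveryTag, Handler handler,
                           Acker acker, List<String> errorLog) {
        try {
            handler.handle(body);
            return true;
        } catch (Throwable t) { // covers both Exception and Error
            errorLog.add(t.getClass().getSimpleName() + ": " + t.getMessage());
            return false;
        } finally {
            acker.ack(deliveryTag); // always acknowledge, as in the listener above
        }
    }

    public static void main(String[] args) {
        List<Long> acked = new ArrayList<>();
        List<String> errors = new ArrayList<>();
        // Simulated OOM thrown by the business logic
        boolean ok = consume("payload", 7L,
                body -> { throw new OutOfMemoryError("simulated"); },
                acked::add, errors);
        System.out.println(ok + " " + acked + " " + errors);
    }
}
```

Catching Throwable keeps the consumer thread alive, but after a genuine OutOfMemoryError the process may not be in a healthy state, so whether to swallow the error, alert, or restart the service deliberately remains a design decision for each service.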