Notes on an OOM just before launch
Background
Recently I have been rushing to get an application launched, and the package was basically ready to go live. Nobody wants to make a mistake at that stage.
The application had already been deployed to the pre-release environment (pre) and had passed its stress test. Yesterday afternoon, however, a colleague from the test team came to me and said that my application was not consuming Kafka data, while the other applications had already consumed it and synchronized. I was stunned.
First I checked the service status in Consul. Apart from my application on pre, every other instance was still alive, which was a small relief.
I then connected to the server to look at the application and found that its log had stopped shortly after 1 a.m. on the 19th.
[2019-07-19 01:46:33] [WARN ] [dc16dc8a-79d8-4ec5-adca-3f7bdff7bb7d] [http-nio-6030-exec-7] [AbstractHandlerExceptionResolver.java:140] [org.springframework.web.servlet.mvc.method.annotation.ExceptionHandlerExceptionResolver] : Resolved [org.springframework.web.util.NestedServletException: Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: GC overhead limit exceeded]
I restarted the application right away and asked the test colleagues to verify data synchronization.
Troubleshooting
Because server memory is tight (8 GB) and several applications have to share the machine, the default JVM startup parameters were set fairly low (-Xms256m -Xmx512m -Xmn128m -XX:MaxPermSize=64m).
Because the heap-dump-on-OOM parameters had not been added, there was nothing from the incident itself to analyze, so the problem had to be reproduced locally.
- First, set the same JVM parameters locally, plus the options that write a heap dump on OOM and the path to write it to:
-Xms256m -Xmx512m -Xmn128m -XX:MaxPermSize=64m -XX:ErrorFile=G:/heap/dump/hs_err_pid%p.log -XX:HeapDumpPath=G:/heap/dump -XX:+HeapDumpOnOutOfMemoryError
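Before waiting for the OOM to reproduce, it can be worth confirming that the dump options really reached the JVM. The following is only a minimal sketch and was not part of the original application: the class name JvmFlagCheck and the helper missingFlags are made up for illustration. It relies on the standard RuntimeMXBean.getInputArguments() call and does a plain string comparison, so the required options must be spelled exactly as they were passed.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original project): reports which
// required JVM options are absent from the arguments the JVM was started with.
public class JvmFlagCheck {

    // Returns the entries of `required` that do not appear in `actual`.
    public static List<String> missingFlags(List<String> actual, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String flag : required) {
            if (!actual.contains(flag)) {
                missing.add(flag);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Arguments this JVM was started with, e.g. -Xmx512m
        List<String> actual = ManagementFactory.getRuntimeMXBean().getInputArguments();
        // Options taken from the parameter line above
        List<String> required = Arrays.asList(
                "-XX:+HeapDumpOnOutOfMemoryError",
                "-XX:HeapDumpPath=G:/heap/dump");
        System.out.println("Missing JVM options: " + missingFlags(actual, required));
    }
}
```

If the printed list is empty, both dump options are active for the current process.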
I then let the program run for a while, and the problem did reproduce. With the options above in place, the corresponding heap dump file was written when the OOM occurred.
[2019-07-20 11:01:50] [ERROR] [c8dc279e-1e7e-4a1a-b646-0203f23e7ba6] [http-nio-6030-exec-116] [AdviceController.java:48] [com.ost.micro.scheduler.strategy.controller.AdviceController] : {}
org.springframework.web.util.NestedServletException: Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1006)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:925)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:974)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:877)
Let's take a look at the JVM heap graph at this point:
Heap memory is never reclaimed, which eventually leads to the OOM.
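When no graphing tool is at hand, the same trend can be read from periodic samples written to the console or log. The sketch below is an illustration only, not what produced the graph in this post: HeapSampler and usedPercent are hypothetical names, and the sampling relies on the standard MemoryMXBean.getHeapMemoryUsage() call. A used percentage that keeps climbing without ever dropping back is the pattern described above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples heap usage so the trend can be read from logs.
public class HeapSampler {

    // Percentage of the heap in use. Uses `max` when it is defined,
    // otherwise falls back to `committed` (getMax() returns -1 if undefined).
    public static double usedPercent(long used, long committed, long max) {
        long reference = max > 0 ? max : committed;
        return reference > 0 ? 100.0 * used / reference : 0.0;
    }

    public static void main(String[] args) throws InterruptedException {
        // Take five samples, one second apart
        for (int i = 0; i < 5; i++) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("heap used=%d MB (%.1f%%)%n",
                    heap.getUsed() / (1024 * 1024),
                    usedPercent(heap.getUsed(), heap.getCommitted(), heap.getMax()));
            Thread.sleep(1000);
        }
    }
}
```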
Next, open the generated dump file in Eclipse Memory Analyzer and analyze it. It turned out that org.drools.core.impl.KnowledgeBaseImpl occupies 72% of the heap!
Solving the problem
For work reasons I had recently started using the Drools rule engine, and the problem found above was clearly caused by the way Drools was being used. Since the KieBase in the program is set up as a singleton, the problem should not lie with the KieBase itself.
Let's take a look at the code that uses Drools:
public <T extends IRule> Integer execute(T t) {
    KieSession session = this.session();
    session.insert(t);
    int size = session.fireAllRules();
    if (Objects.nonNull(t.getIError())) {
        throw new PayStrategyException(t.getIError());
    }
    return size;
}
At this point I wondered whether the problem was that the KieSession is never closed. I took a look at the Javadoc of the method that creates a KieSession:
/**
 * Creates a new {@link KieSession} using the default session configuration.
 * Don't forget to {@link KieSession#dispose()} session when you are done.
 *
 * @return created {@link KieSession}
 */
The Javadoc is explicit: once execution is finished, dispose() must be called to release the session. The modified code is as follows:
public <T extends IRule> Integer execute(T t) {
    KieSession session = null;
    try {
        session = this.session();
        session.insert(t);
        int size = session.fireAllRules();
        if (Objects.nonNull(t.getIError())) {
            throw new PayStrategyException(t.getIError());
        }
        return size;
    } finally {
        // Always release the session, even when an exception is thrown
        if (Objects.nonNull(session)) {
            session.dispose();
        }
    }
}
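To make the effect of the try/finally above easier to see, here is a simplified model. It is not Drools code: FakeBase, FakeSession and withSession are hypothetical stand-ins, and the assumption that a long-lived base keeps a reference to each session until it is disposed is only my reading of the MAT result, not something taken from the Drools source. The sketch shows that sessions pile up in the singleton base when dispose() is skipped, and that a finally block releases them even when the work throws.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Simplified model, NOT Drools code: FakeBase and FakeSession are hypothetical
// stand-ins for a long-lived base that tracks the sessions it created.
public class DisposeDemo {

    public static class FakeBase {
        public final Set<FakeSession> liveSessions = new HashSet<>();

        public FakeSession newSession() {
            FakeSession session = new FakeSession(this);
            liveSessions.add(session); // the singleton base now references the session
            return session;
        }
    }

    public static class FakeSession {
        private final FakeBase owner;

        FakeSession(FakeBase owner) {
            this.owner = owner;
        }

        public void dispose() {
            owner.liveSessions.remove(this); // drop the reference held by the base
        }
    }

    // Runs `work` against a fresh session and always disposes it afterwards.
    public static <R> R withSession(FakeBase base, Function<FakeSession, R> work) {
        FakeSession session = base.newSession();
        try {
            return work.apply(session);
        } finally {
            session.dispose();
        }
    }

    public static void main(String[] args) {
        FakeBase leaky = new FakeBase();
        for (int i = 0; i < 1000; i++) {
            leaky.newSession(); // never disposed, like the original execute()
        }

        FakeBase clean = new FakeBase();
        for (int i = 0; i < 1000; i++) {
            try {
                withSession(clean, s -> { throw new IllegalStateException("rule error"); });
            } catch (IllegalStateException expected) {
                // the exception still propagates, as in the fixed execute()
            }
        }

        System.out.println("without dispose: " + leaky.liveSessions.size()); // 1000
        System.out.println("with dispose:    " + clean.liveSessions.size()); // 0
    }
}
```

In the model, every request that skips dispose() leaves one more object reachable from the base, so none of them can be garbage collected.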
I then set the JVM parameters (-Xmx1344M -Xms1344M -Xmn448M -XX:MaxMetaspaceSize=256M -XX:MetaspaceSize=256M -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses -XX:+CMSClassUnloading -XX:+ParallelRefProcEnabled -XX:+CMSScavengeBeforeRemark) and ran the application again. The JVM heap now looks as shown below:
Closing notes
Because I was not familiar with how the Drools engine works, I nearly caused a production incident. Clearly I still need to read the official documentation more carefully, and I am writing this incident down as a reminder to myself.
Finally, here is a site I recommend for generating JVM parameters: http://xxfox.perfma.com/jvm/generate