Basic environment
- Tomcat 7
- JDK 8
- Linux
Identify the problem
View background exceptions
Checking the system's background log shows that every request completes normally and no exceptions are thrown, so the next step is to look at the state of the system itself.
View system status
Use the top command to check CPU and memory usage:
[root@DEV-L002323 ~]# top
top - 14:52:54 up 514 days, 7:00, 8 users, load average: 2.85, 1.35, 1.62
Tasks: 147 total, 1 running, 146 sleeping, 0 stopped, 0 zombie
Cpu(s): 57.6%us, 6.3%sy, 0.0%ni, 9.2%id, 26.2%wa, 0.0%hi, 0.0%si, 0.7%st
Mem: 3922928k total, 3794232k used, 128696k free, 403112k buffers
Swap: 4194296k total, 65388k used, 4128908k free, 1492204k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6764 root 20 0 2428m 1.1g 11m S 190.0 28.3 36:38.55 java
1161 root 20 0 0 0 0 D 0.3 0.0 32:43.06 flush-253:0
1512 root 20 0 14684 4188 488 S 0.3 0.1 0:16.12 sec_agent
1 root 20 0 19356 652 436 S 0.0 0.0 0:16.64 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.05 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 1:49.34 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 17:46.61 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 root RT 0 0 0 0 S 0.0 0.0 2:02.78 watchdog/0
7 root RT 0 0 0 0 S 0.0 0.0 1:46.79 migration/1
The top output shows that the java process with PID 6764 has sustained high CPU utilization, reaching 190% (nearly two full cores), while its memory usage is 28.3%.
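For a quick scriptable check without the interactive top screen, ps can produce a similar ranking (a minimal sketch; the column set shown is just one reasonable choice):

```shell
# Non-interactive alternative to top: rank processes by CPU usage.
# Prints the header plus the five hungriest processes.
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -6
```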
Locate the problem thread
Use the command ps -mp pid -o THREAD,tid,time to view the threads of the process. Two of its threads turn out to have a high CPU occupancy:
[root@DEV-L002323 ~]# ps -mp 6764 -o THREAD,tid,time
USER %CPU PRI SCNT WCHAN USER SYSTEM TID TIME
root 71.7 - - - - - - 00:36:52
root 0.0 19 - futex_ - - 6764 00:00:00
root 0.0 19 - poll_s - - 6765 00:00:01
root 44.6 19 - futex_ - - 6766 00:23:32
root 44.6 19 - futex_ - - 6767 00:23:32
root 1.2 19 - futex_ - - 6768 00:00:38
root 0.0 19 - futex_ - - 6769 00:00:00
root 0.0 19 - futex_ - - 6770 00:00:01
root 0.0 19 - futex_ - - 6771 00:00:00
The output shows that threads 6766 and 6767 have each accumulated about 23 minutes of CPU time, at roughly 45% CPU utilization apiece. Next, the stack of the corresponding threads needs to be examined.
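The per-thread lookup and the hex conversion used in the next step can be combined into one small script (a sketch using `ps -L`, an equivalent per-thread view; the PID defaults to the current shell purely so the script can be run anywhere, whereas in this incident it would be 6764):

```shell
#!/bin/sh
# List the threads of a process sorted by CPU usage, printing each TID
# in hex, ready to match against the nid= field in jstack output.
PID=${1:-$$}

ps -L -p "$PID" -o tid=,pcpu= --sort=-pcpu | head -5 |
while read -r tid cpu; do
    printf 'tid=%s cpu=%s%% nid=0x%x\n' "$tid" "$cpu" "$tid"
done
```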
Let's take a look at the stack of the problem thread 6766.
View the problem thread's stack
Convert the thread ID to hexadecimal (jstack reports native thread IDs as hexadecimal nid values):
[root@DEV-L002323 ~]# printf "%x\n" 6766
1a6e
Use jstack to view thread stack information
The jstack command prints thread stack information; command format: jstack pid | grep tid (with tid given in hexadecimal)
[root@DEV-L002323 ~]# jstack 6764 | grep 1a6e
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007ffeb8016800 nid=0x1a6e runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007ffeb8016800 nid=0x1a6e runnable
"VM Periodic Task Thread" prio=10 tid=0x00007ffeb8016800 nid=0x3700 waiting on condition
JNI global references: 496
As can be seen above, the matching threads are GC worker threads. It can therefore be inferred that the heap is most likely exhausted, forcing garbage collection to run continuously. The next step is to check the GC and memory statistics.
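The hex conversion and the jstack lookup can be folded into one guarded snippet (a sketch; PID and TID are the values from this incident, and the jstack call is skipped when no JDK is on the PATH):

```shell
#!/bin/sh
# From decimal TID to the matching stack frames in one go.
PID=6764   # the busy java process from this incident
TID=6766   # one of the hot threads
NID=$(printf '%x' "$TID")
echo "looking for nid=0x$NID"
if command -v jstack >/dev/null 2>&1; then
    # -A 8 keeps eight lines of stack below the matching thread header
    jstack "$PID" 2>/dev/null | grep -A 8 "nid=0x$NID" \
        || echo "thread 0x$NID not found (is PID $PID alive?)"
fi
```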
jstat to view the process memory status
Command: jstat -gcutil pid [interval(ms)] [count]
[root@DEV-L002323 bin]# jstat -gcutil 6764 2000 10
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726
The output shows that utilization of both Eden and the old generation has reached an astonishing 100%, and the Full GC count (FGC, at 310) is very high and keeps climbing over time, while successive samples reclaim nothing. It can be inferred that there must be a problem in the program's object allocation, so the focus shifts to finding large objects or abnormally numerous object instances. At this point, a heap dump file can be generated and fetched locally for analysis.
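The saturation pattern above can also be flagged mechanically. The awk sketch below applies the check to one captured sample line from the output above; for a live process, one would pipe `jstat -gcutil <pid> 2000` through the same awk program and skip the header with `NR>1`:

```shell
# Flag jstat -gcutil samples whose old generation (column O, $4) is
# nearly full -- the signature of a Full GC storm. Fed one captured
# sample line here instead of a live jstat pipe.
echo "0.00 0.00 100.00 100.00 97.74 1863 33.937 310 453.788 487.726" |
awk '$4 > 95 { printf "old gen %.2f%% full after %d full GCs (%.1fs in FGC)\n", $4, $8, $9 }'
```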
Analyze the process stack and memory status with jstack and jmap
Use the jmap command to export the heap dump file, then fetch it locally and analyze it with jvisualvm.
Command: jmap [option] vmid
jmap -dump:format=b,file=dump.bin 6764
Command: jstack [option] vmid
jstack -l 6764 >> jstack.out
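When the JVM is about to be restarted, it is worth capturing both artifacts in one go so the evidence survives (a sketch; the file names are illustrative, and the dump commands are skipped when no JDK is on the PATH):

```shell
#!/bin/sh
# Capture a thread dump and a heap dump with a shared timestamp so the
# two files from one incident can be matched up later.
PID=${1:-6764}
STAMP=$(date +%Y%m%d-%H%M%S)
if command -v jstack >/dev/null 2>&1; then
    jstack -l "$PID" > "jstack-$STAMP.out" 2>&1
fi
if command -v jmap >/dev/null 2>&1; then
    jmap -dump:format=b,file="heap-$STAMP.bin" "$PID"
fi
echo "stamp: $STAMP"
```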
From the heap dump, locate the program's working state and memory usage at the time of the problem, as follows:
Thread:
"Thread-21" daemon prio=5 tid=85 WAITING
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at net.sf.jasperreports.engine.fill.AbstractThreadSubreportRunner.waitResult(AbstractThreadSubreportRunner.java:81)
Local Variable: net.sf.jasperreports.engine.fill.ThreadExecutorSubreportRunner#2
at net.sf.jasperreports.engine.fill.AbstractThreadSubreportRunner.start(AbstractThreadSubreportRunner.java:53)
at net.sf.jasperreports.engine.fill.JRFillSubreport.prepare(JRFillSubreport.java:758)
at net.sf.jasperreports.engine.fill.JRFillElementContainer.prepareElements(JRFillElementContainer.java:331)
Local Variable: net.sf.jasperreports.engine.fill.JRFillSubreport#3
at net.sf.jasperreports.engine.fill.JRFillBand.fill(JRFillBand.java:384)
at net.sf.jasperreports.engine.fill.JRFillBand.fill(JRFillBand.java:358)
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillBandNoOverflow(JRVerticalFiller.java:458)
Local Variable: net.sf.jasperreports.engine.fill.JRFillBand#3
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillPageHeader(JRVerticalFiller.java:421)
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillPageBreak(JRVerticalFiller.java:1954)
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillColumnBreak(JRVerticalFiller.java:1981)
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillDetail(JRVerticalFiller.java:754)
Local Variable: net.sf.jasperreports.engine.fill.JRFillBand[]#1
Local Variable: net.sf.jasperreports.engine.fill.JRFillBand#2
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillReportStart(JRVerticalFiller.java:288)
at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillReport(JRVerticalFiller.java:151)
at net.sf.jasperreports.engine.fill.JRBaseFiller.fill(JRBaseFiller.java:939)
at net.sf.jasperreports.engine.fill.JRFiller.fill(JRFiller.java:152)
Local Variable: net.sf.jasperreports.engine.util.LocalJasperReportsContext#1
Local Variable: net.sf.jasperreports.engine.fill.JRVerticalFiller#1
at net.sf.jasperreports.engine.JasperFillManager.fill(JasperFillManager.java:464)
at net.sf.jasperreports.engine.JasperFillManager.fill(JasperFillManager.java:300)
Local Variable: java.io.File#135
Local Variable: net.sf.jasperreports.engine.JasperFillManager#1
Local Variable: net.sf.jasperreports.engine.JasperReport#1
at net.sf.jasperreports.engine.JasperFillManager.fillReport(JasperFillManager.java:757)
at com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread.fill(AysnJasPdfConvertorThread.java:110)
Local Variable: java.lang.String#57815
Local Variable: java.lang.String#55498
Local Variable: java.util.HashMap#1682
Local Variable: java.lang.String#57807
Local Variable: java.lang.String#57809
at com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread.run(AysnJasPdfConvertorThread.java:223)
Local Variable: java.io.File#139
Local Variable: java.io.File#138
Local Variable: java.io.File#137
Local Variable: java.io.File#136
Local Variable: com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread#1
at java.lang.Thread.run(Thread.java:722)
Memory:
The heap contains a huge number of instances of the net.sf.jasperreports.engine.fill.JRTemplatePrintText class, accounting for 33.2% of all instances and 58.1% of the heap by size.
In conclusion
From this it can be judged that the problem is caused by JasperReports creating and holding objects improperly during report conversion. There is no particularly good way to solve it short of changing the library's source code or switching to a different reporting tool. Googling for others who have hit similar problems turns up the following two pages:
- http://community.jaspersoft.com/jasperreports-library/issues/4151
- http://community.jaspersoft.com/wiki/isprintwhendetailoverflowstrue-can-cause-report-render-indefinitely
Evidently even newer versions of jasperreports still have this problem. It can only be avoided by unchecking the 'Print When Detail Overflows' option, while also using jasperreports' virtualizer to optimize memory usage and alleviate the symptoms. (The virtualizer swaps report data to the filesystem; when the virtualizer object is finalized, it removes the swap files it created. The virtualized objects hold references to it, so finalization does not occur until the virtualizer and the objects using it are only weakly referenced.)
Here is a demo of usage:
- http://www.massapi.com/source/sourceforge/17/71/1771543975/oreports-code/openreports/src/org/efs/openreports/util/ScheduledReportJob.java.html#158
So the problem is worked around rather than perfectly solved.
1. ps -mp xxxx -o THREAD
Lists all threads belonging to the given PID, under the current user.
2. ps -mp xxxx -o THREAD >> /tmp/thread.txt
Same as above, but appends the output to the file /tmp/thread.txt.
3. ps -mp xxxx -o THREAD,tid
Lists all threads of the given PID along with each native thread ID (tid).
4. ps -mp xxxx -o THREAD | wc -l
Counts the threads of the given PID; wc -l counts the number of lines in the output.
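Since these notes are easiest to verify hands-on, here is the same idea pointed at the current shell ($$), which always exists (a sketch; `ps -L` is used as an equivalent per-thread view):

```shell
# Count the threads of a process without needing a busy JVM at hand:
# point the command at the current shell. The first output line of the
# first command is a header, so the thread count is one less than its
# line count; the tid= form below suppresses the header entirely.
ps -Lp $$ -o tid,pcpu
NTHREADS=$(ps -Lp $$ -o tid= | wc -l)
echo "threads: $NTHREADS"
```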