Troubleshoot high CPU usage

Basic environment

  • Tomcat 7
  • JDK 8
  • Linux

 

Identify the problem

Check the logs for exceptions

Reviewing the application's backend logs shows that every request completes normally and no exceptions are thrown, so attention turns to the state of the system itself.

View system status

Use the top command to check CPU and memory usage:

[root@DEV-L002323 ~]# top
top - 14:52:54 up 514 days,  7:00,  8 users,  load average: 2.85, 1.35, 1.62
Tasks: 147 total,   1 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s): 57.6%us,  6.3%sy,  0.0%ni,  9.2%id, 26.2%wa,  0.0%hi,  0.0%si,  0.7%st
Mem:   3922928k total,  3794232k used,   128696k free,   403112k buffers
Swap:  4194296k total,    65388k used,  4128908k free,  1492204k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                     
 6764 root      20   0 2428m 1.1g  11m S 190.0 28.3  36:38.55 java                                                                       
 1161 root      20   0     0    0    0 D  0.3  0.0  32:43.06 flush-253:0                                                                 
 1512 root      20   0 14684 4188  488 S  0.3  0.1   0:16.12 sec_agent                                                                   
    1 root      20   0 19356  652  436 S  0.0  0.0   0:16.64 init                                                                        
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.05 kthreadd                                                                    
    3 root      RT   0     0    0    0 S  0.0  0.0   1:49.34 migration/0                                                                 
    4 root      20   0     0    0    0 S  0.0  0.0  17:46.61 ksoftirqd/0                                                                 
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0                                                                 
    6 root      RT   0     0    0    0 S  0.0  0.0   2:02.78 watchdog/0                                                                  
    7 root      RT   0     0    0    0 S  0.0  0.0   1:46.79 migration/1

The top output shows that the java process with PID 6764 has persistently high CPU utilization, reaching 190% (nearly two full cores), while its memory usage is 28.3%.

Locate the problem threads

Use the command ps -mp pid -o THREAD,tid,time to inspect the threads of the process. Two of its threads turn out to have high CPU usage:

[root@DEV-L002323 ~]# ps -mp 6764 -o THREAD,tid,time
USER     %CPU PRI SCNT WCHAN  USER SYSTEM   TID     TIME
root     71.7   -    - -         -      -     - 00:36:52
root      0.0  19    - futex_    -      -  6764 00:00:00
root      0.0  19    - poll_s    -      -  6765 00:00:01
root     44.6  19    - futex_    -      -  6766 00:23:32
root     44.6  19    - futex_    -      -  6767 00:23:32
root      1.2  19    - futex_    -      -  6768 00:00:38
root      0.0  19    - futex_    -      -  6769 00:00:00
root      0.0  19    - futex_    -      -  6770 00:00:01
root      0.0  19    - futex_    -      -  6771 00:00:00

The output shows that threads 6766 and 6767 have each accumulated roughly 23 minutes of CPU time, at about 45% CPU utilization apiece. Next, we need to examine the stack of the offending threads.
Let's take a look at the stack of the problem thread 6766.

View the problem thread's stack

Convert the thread ID to hexadecimal (jstack reports native thread IDs in hex):

[root@DEV-L002323 ~]#  printf "%x\n" 6766
1a6e
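
For reference, the same conversion can be done in Java; a trivial sketch (not part of the original session):

public class TidToHex {
    public static void main(String[] args) {
        // ps reports thread IDs in decimal; jstack's "nid" field is hexadecimal.
        System.out.println(Integer.toHexString(6766)); // prints "1a6e"
    }
}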

View thread stack information with jstack

The jstack command prints thread stack information. Command format: jstack pid | grep tid, where tid is the hexadecimal thread ID:

[root@DEV-L002323 ~]# jstack 6764 | grep 1a6e
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007ffeb8016800 nid=0x1a6e runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007ffeb8016800 nid=0x1a6e runnable 
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007ffeb8016800 nid=0x1a6e runnable  
"VM Periodic Task Thread" prio=10 tid=0x00007ffeb8016800 nid=0x3700 waiting on condition 

JNI global references: 496

The matching threads are GC worker threads. It is therefore very likely that insufficient memory is forcing the GC to run continuously. The next step is to examine the GC and heap statistics.
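
As a side note (not part of the original diagnosis), GC activity can also be observed from inside the JVM through the standard java.lang.management API; a minimal sketch:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // Print the cumulative collection count and time for each collector in this JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}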

Use jstat to view the process's GC and memory status

Command format: jstat -gcutil <pid> <interval_ms> <count>; below we sample every 2 seconds, 10 times:

[root@DEV-L002323 bin]# jstat -gcutil 6764 2000 10
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726
  0.00   0.00  100.00 100.00  97.74   1863   33.937   310  453.788  487.726

Both Eden (E) and the old generation (O) are at an astonishing 100% utilization, and the permanent generation (P) is at 97.74%. The number of full GCs (FGC, 310) is also extraordinarily high and kept climbing, with over 450 seconds spent in full GC (FGCT). It can be inferred that something is wrong in the program's implementation, and the focus should be on large objects or abnormally numerous objects. At this point, generate a heap dump and pull it to a local machine for analysis.

Analyze the process stack and memory with jstack and jmap

Use the jmap command to export a heap dump file, then fetch it locally and analyze it with jvisualvm.exe:

Command: jmap [option] vmid
jmap -dump:format=b,file=dump.bin 6764

Command: jstack [option] vmid
jstack -l 6764 >> jstack.out

From the heap dump, locate what the program was doing and how its memory was distributed, as follows:
Thread:

"Thread-21" daemon prio=5 tid=85 WAITING
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at net.sf.jasperreports.engine.fill.AbstractThreadSubreportRunner.waitResult(AbstractThreadSubreportRunner.java:81)
       Local Variable: net.sf.jasperreports.engine.fill.ThreadExecutorSubreportRunner#2
    at net.sf.jasperreports.engine.fill.AbstractThreadSubreportRunner.start(AbstractThreadSubreportRunner.java:53)
    at net.sf.jasperreports.engine.fill.JRFillSubreport.prepare(JRFillSubreport.java:758)
    at net.sf.jasperreports.engine.fill.JRFillElementContainer.prepareElements(JRFillElementContainer.java:331)
       Local Variable: net.sf.jasperreports.engine.fill.JRFillSubreport#3
    at net.sf.jasperreports.engine.fill.JRFillBand.fill(JRFillBand.java:384)
    at net.sf.jasperreports.engine.fill.JRFillBand.fill(JRFillBand.java:358)
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillBandNoOverflow(JRVerticalFiller.java:458)
       Local Variable: net.sf.jasperreports.engine.fill.JRFillBand#3
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillPageHeader(JRVerticalFiller.java:421)
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillPageBreak(JRVerticalFiller.java:1954)
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillColumnBreak(JRVerticalFiller.java:1981)
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillDetail(JRVerticalFiller.java:754)
       Local Variable: net.sf.jasperreports.engine.fill.JRFillBand[]#1
       Local Variable: net.sf.jasperreports.engine.fill.JRFillBand#2
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillReportStart(JRVerticalFiller.java:288)
    at net.sf.jasperreports.engine.fill.JRVerticalFiller.fillReport(JRVerticalFiller.java:151)
    at net.sf.jasperreports.engine.fill.JRBaseFiller.fill(JRBaseFiller.java:939)
    at net.sf.jasperreports.engine.fill.JRFiller.fill(JRFiller.java:152)
       Local Variable: net.sf.jasperreports.engine.util.LocalJasperReportsContext#1
       Local Variable: net.sf.jasperreports.engine.fill.JRVerticalFiller#1
    at net.sf.jasperreports.engine.JasperFillManager.fill(JasperFillManager.java:464)
    at net.sf.jasperreports.engine.JasperFillManager.fill(JasperFillManager.java:300)
       Local Variable: java.io.File#135
       Local Variable: net.sf.jasperreports.engine.JasperFillManager#1
       Local Variable: net.sf.jasperreports.engine.JasperReport#1
    at net.sf.jasperreports.engine.JasperFillManager.fillReport(JasperFillManager.java:757)
    at com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread.fill(AysnJasPdfConvertorThread.java:110)
       Local Variable: java.lang.String#57815
       Local Variable: java.lang.String#55498
       Local Variable: java.util.HashMap#1682
       Local Variable: java.lang.String#57807
       Local Variable: java.lang.String#57809
    at com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread.run(AysnJasPdfConvertorThread.java:223)
       Local Variable: java.io.File#139
       Local Variable: java.io.File#138
       Local Variable: java.io.File#137
       Local Variable: java.io.File#136
       Local Variable: com.pingan.icore.print.asyntask.jasper.AysnJasPdfConvertorThread#1
    at java.lang.Thread.run(Thread.java:722)

Memory:
There are a very large number of instances of the net.sf.jasperreports.engine.fill.JRTemplatePrintText class, accounting for 33.2% of all instances and 58.1% of the heap size.

Conclusion

From this it can be judged that the problem is caused by JasperReports creating and retaining objects improperly during report conversion. There is no particularly good way to solve it short of changing the source code or switching to another reporting tool. Googling the symptoms described above turns up the following two pages:
http://community.jaspersoft.com/jasperreports-library/issues/4151
http://community.jaspersoft.com/wiki/isprintwhendetailoverflowstrue-can-cause-report-render-indefinitely

Evidently even newer versions of jasperreports still have this problem. It can only be worked around by unchecking the 'Print When Detail Overflows' option and by using JasperReports' virtualizer to optimize memory usage and alleviate the symptoms. (The virtualizer swaps report data out to the filesystem; when the virtualizer object is finalized, it removes the swap files it created. The virtualized objects hold references to it, so finalization does not occur until the virtualizer and the objects using it are only weakly referenced.)
Here is a usage demo:
http://www.massapi.com/source/sourceforge/17/71/1771543975/oreports-code/openreports/src/org/efs/openreports/util/ScheduledReportJob.java.html#158
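
For illustration, a minimal sketch of enabling a file-based virtualizer while filling a report might look like the following. This is an assumption-laden example, not the code from the demo above; the report path, data source, and tuning values (100 in-memory pages, swap directory /tmp) are placeholders:

import java.util.HashMap;
import java.util.Map;

import net.sf.jasperreports.engine.JREmptyDataSource;
import net.sf.jasperreports.engine.JRParameter;
import net.sf.jasperreports.engine.JasperFillManager;
import net.sf.jasperreports.engine.JasperPrint;
import net.sf.jasperreports.engine.fill.JRFileVirtualizer;

public class VirtualizedFillDemo {
    public static void main(String[] args) throws Exception {
        // Keep at most 100 report pages in memory; swap the rest to files under /tmp (assumed values).
        JRFileVirtualizer virtualizer = new JRFileVirtualizer(100, "/tmp");
        Map<String, Object> params = new HashMap<String, Object>();
        // Passing the virtualizer as a fill parameter lets the filler swap page data to disk.
        params.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);
        try {
            JasperPrint print = JasperFillManager.fillReport(
                    "report.jasper", params, new JREmptyDataSource()); // placeholder report and data source
            // ... export 'print' to PDF, etc. ...
        } finally {
            // Signal that filling is done so the virtualizer can delete its swap files.
            virtualizer.cleanup();
        }
    }
}

Note that the virtualizer only caps the fill's memory footprint; it does not fix the underlying 'Print When Detail Overflows' loop, so both measures are needed.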

This mitigates the problem rather than solving it perfectly.

Appendix: ps thread-inspection commands

1. ps -mp xxxx -o THREAD
   Under the current user, list all threads belonging to the given pid.

2. ps -mp xxxx -o THREAD >> /tmp/thread.txt
   The same listing, with the output appended to the file /tmp/thread.txt.

3. ps -mp xxxx -o THREAD,tid
   List all thread information for the pid, including each native thread ID (tid).

4. ps -mp xxxx -o THREAD | wc -l
   Count the pid's threads; "wc -l" counts the number of output lines.

 
