After the wave, talk about your understanding of jvm performance tuning

In day-to-day development work we often run into system performance problems, and at that point the system has to be tuned. There are many kinds of tuning: architecture and code optimization, JVM tuning, operating system tuning, database tuning, Tomcat tuning, network tuning, and so on. Architecture and code optimization are the most effective, but they cannot solve every performance problem. Today we revisit a familiar topic: JVM tuning.

This article covers the following:

  • Java memory model review

  • When is JVM tuning required

  • Common OOM exceptions and cases

  • Monitoring tools bundled with the JVM

  • Commonly used JVM tuning parameters

  • Third-party JVM monitoring tools

  • Tuning case

Java memory model review

First, let's review the memory layout of the HotSpot JVM, as shown in the figure below:

The HotSpot memory model is divided into 3 parts:

Class loader

The class loader loads the .class files produced by the Java compiler, extracts the class information, and stores it in the method area in an internal data structure.

Runtime data area

The thread stack and the native method stack store per-thread information such as method call frames while a thread runs. The program counter records the address of the bytecode instruction the thread is currently executing. All three are thread-private.

The heap stores the objects created while the program runs.

The method area is defined in the JVM specification. Before Java 8, HotSpot implemented it in the permanent generation. Starting with Java 7, HotSpot began dismantling the permanent generation: symbol references were moved to native memory, and string literals and class static variables were moved to the Java heap. Java 8 removed the permanent generation completely and replaced it with Metaspace as the implementation of the method area. Metaspace stores class metadata and is allocated in native memory, so it does not count against the Java heap.

The distribution of heap memory is as follows:

The heap space allocation strategy of the G1 garbage collector is as follows:

ZGC, which appeared later, allocates memory more dynamically and flexibly. This article uses Java 8 as its example and does not discuss G1 or ZGC further.

While we are here, let's review the common garbage collection algorithms:

a. Mark-sweep: causes memory fragmentation, which lowers allocation efficiency

b. Mark-compact: high performance overhead

c. Copying: low heap utilization

Common garbage collectors:

Young generation: Serial, Parallel Scavenge (throughput-oriented; cannot be combined with CMS), and ParNew. All of them use the mark-copy algorithm

Old generation: Serial Old (mark-compact), Parallel Old (mark-compact), and CMS (mark-sweep, runs concurrently with the application). CMS was deprecated in Java 9, when G1 became the default collector
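To check which of these collectors a running JVM actually uses, the standard java.lang.management API can list them. The sketch below is only an illustration; the class name ActiveCollectors is made up for this example, and the collector names it prints depend on the JVM version and startup flags (under the Parallel collector on Java 8 they are typically PS Scavenge and PS MarkSweep).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the article's source code.
class ActiveCollectors {

    /** Returns one summary line per collector registered in the running JVM. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since the JVM started.
            lines.add(String.format("%s: collections=%d, time=%dms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseParallelGC is a quick way to see the young and old generation collectors pair up.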

Execution engine

The HotSpot interpreter translates the loaded bytecode into machine code instruction by instruction and executes it. For hot code that runs repeatedly, the JIT compiler compiles the bytecode into machine code so that later executions run the compiled version.

The garbage collector reclaims the heap memory occupied by dead objects.

In the JVM architecture diagram above, the three purple areas are where tuning effort goes. The heap is where object data lives; it is managed by the garbage collector, which can be chosen when the JVM starts. JVM tuning mostly comes down to sizing the heap and choosing the most suitable garbage collector. The JIT compiler also has a large effect on application performance, but in recent JVM versions it rarely needs manual tuning.

When is JVM tuning required

On the surface, tuning is called for when the application responds slowly or stops serving requests, when throughput is low, or when it uses too much memory. These symptoms usually come with frequent garbage collection or an OOM.

JVM tuning generally targets three metrics:

Application memory

Mainly the heap allocated to the JVM, set at startup with -Xms (the initial heap size) and -Xmx (the maximum heap size the JVM may grow to at run time).

Throughput

For example, the number of transactions processed per second, the number of batch tasks completed per hour, or the number of successful database requests per hour.

Response latency

The time from the application receiving a request to returning the response, or from the browser sending a request to the page finishing rendering.
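For the memory metric, it can help to confirm what heap limits the JVM actually picked up. The following is a minimal sketch using the standard MemoryMXBean; the class name HeapSettings is hypothetical. The init and max values normally reflect -Xms and -Xmx, though either may be reported as -1 when undefined.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the article's source code.
class HeapSettings {

    /** Snapshot of the current heap usage as reported by the JVM. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heap();
        long mb = 1024 * 1024;
        // init normally corresponds to -Xms, max to -Xmx
        System.out.printf("init:      %d MB%n", heap.getInit() / mb);
        System.out.printf("used:      %d MB%n", heap.getUsed() / mb);
        System.out.printf("committed: %d MB%n", heap.getCommitted() / mb);
        System.out.printf("max:       %d MB%n", heap.getMax() / mb);
    }
}
```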

Common OOM exceptions and how to reproduce them

OOM is the exception we programmers least want to see, yet it comes up often at work. A Java application throws it when the JVM does not have enough memory to allocate a new object and the garbage collector cannot free enough to make room. Linux also has its own OOM killer: when the kernel detects that a process is using too much memory, especially when usage spikes, it kills the process to keep the system from running out of memory. The common Java OOM errors are as follows:

java.lang.OutOfMemoryError: Java heap space

This error has only two causes: a memory leak, or a heap that is genuinely too small for the workload. If the heap is too small, raise the -Xmx parameter to enlarge it. If it is a memory leak, you need to find the leaking code; the monitoring tools described later help with that.

The following code is a typical memory leak. Start it with -Xmx512m:

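import java.util.ArrayList;
import java.util.List;

// User is assumed to be any simple class, as in the article's source repository.
public class HeapSize {
    public static void main(String[] args){
        List<User> list = new ArrayList<>();
        User user = new User();
        // The list is never cleared, so its backing array grows until the heap is exhausted.
        while (true){
            list.add(user);
        }
    }
}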

After running for a while, it fails with:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3210)
at java.util.Arrays.copyOf(Arrays.java:3181)
at java.util.ArrayList.grow(ArrayList.java:261)
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
at java.util.ArrayList.add(ArrayList.java:458)
at boot.oom.HeapSize.main(HeapSize.java:18)

java.lang.OutOfMemoryError: GC overhead limit exceeded

This error means garbage collection has become extremely inefficient: the JVM is spending more than 98% of its time in GC while reclaiming less than 2% of the heap, and this has happened for more than 5 consecutive collections.

import java.util.Map;
import java.util.Random;

public class GcOverrhead {
    public static void main(String[] args){
        // The system properties table stays reachable, so every entry added survives GC.
        Map map = System.getProperties();
        Random r = new Random();
        while (true) {
            map.put(r.nextInt(), "value");
        }
    }
}

Start the code above with -Xmx45m -XX:+UseParallelGC -XX:+PrintGCDetails. After running for a while it throws the exception below. Note: these values worked in the author's local environment and may need adjusting in yours.

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Hashtable.addEntry(Hashtable.java:435)
at java.util.Hashtable.put(Hashtable.java:476)
at boot.oom.HeapSize.main(HeapSize.java:20)

Adding the parameter -XX:-UseGCOverheadLimit suppresses this error, but that is only fooling yourself; you still need to locate and fix the underlying problem.

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

This error is easy to understand: the size of the requested array exceeds the JVM limit. There are two situations:

The requested array is too large for the memory available to the JVM

The requested array length is greater than or equal to Integer.MAX_VALUE - 1

The following two snippets illustrate this:

This code throws Requested array size exceeds VM limit directly:


int[] arr = new int[Integer.MAX_VALUE - 1];

This code first throws Java heap space and then Requested array size exceeds VM limit

for (int i = 3; i >= 0; i--) {
    try {
        int[] arr = new int[Integer.MAX_VALUE-i];
        System.out.format("Successfully initialized an array with %,d elements.\n", Integer.MAX_VALUE-i);
    } catch (Throwable t) {
        t.printStackTrace();
    }
}

The results are as follows:

java.lang.OutOfMemoryError: Java heap space
at boot.oom.ArraySizeExceeds.main(ArraySizeExceeds.java:12)
java.lang.OutOfMemoryError: Java heap space
at boot.oom.ArraySizeExceeds.main(ArraySizeExceeds.java:12)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at boot.oom.ArraySizeExceeds.main(ArraySizeExceeds.java:12)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at boot.oom.ArraySizeExceeds.main(ArraySizeExceeds.java:12)

java.lang.OutOfMemoryError: Metaspace

Metaspace was introduced earlier. This error means Metaspace is exhausted. The fix is to enlarge it with the MaxMetaspaceSize parameter. If we start the application with the parameter

-XX:MaxMetaspaceSize=2m, it fails immediately:

Error occurred during initialization of VM
OutOfMemoryError: Metaspace

java.lang.OutOfMemoryError: request size bytes for reason. Out of swap space?

This error is caused by the operating system running out of swap space. The maximum memory the JVM may allocate is set by parameters such as -Xmx. If the total memory the JVM needs exceeds the physical memory the host can provide, swap space is used; if swap is also exhausted, the JVM's memory allocation fails and this error is thrown. Tracking it down is harder, because other processes on the host may be the ones consuming too much memory. For that reason I do not recommend the crude fix of simply enlarging swap; disabling swap and isolating the process are more appropriate solutions.

java.lang.OutOfMemoryError: unable to create new native thread

This error also originates at the operating system level. Java threads map to native operating system threads: each time Java starts a thread, it asks the operating system to create a native one. If the operating system fails to create the thread, the error above is thrown. The specific causes are as follows:

a. Not enough memory. The -Xss parameter set at JVM startup specifies the stack size of each thread; if there is not enough memory left for another stack, thread creation fails

b. The max user processes limit in ulimit. This limit caps the number of processes and threads the current user may create

It can be viewed with the command ulimit -a | grep 'max user processes'.

ulimit -u changes this limit. For example, after ulimit -u 10000 the user can create up to 10,000 threads.

c. The kernel.threads-max limit, which can be viewed with the command

cat /proc/sys/kernel/threads-max

To change it, add kernel.threads-max = 10000 to the /etc/sysctl.conf file

d. The kernel.pid_max limit. Every new thread is assigned a pid, and once the pid value would exceed this limit, creation fails. View it with: cat /proc/sys/kernel/pid_max

To change it, add kernel.pid_max = 10000 to the /etc/sysctl.conf file

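    private static void crateSlowThread(){
        try {
            System.out.println(Thread.currentThread());
            // sleep is a static method; it keeps each thread alive for 15 seconds
            Thread.sleep(15000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void test1() {
        // Unbounded thread creation: eventually the OS refuses to create another native thread
        while (true){
            new Thread(() -> crateSlowThread()).start();
        }
    }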

The code above creates threads continuously in an infinite loop and eventually reproduces this OOM.
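Before the operating system limits are hit, the thread count is usually already visibly climbing. As a rough sketch, the standard ThreadMXBean reports the live and peak thread counts from inside the process; the class name ThreadCountProbe and the 20-thread loop are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, not part of the article's source code.
class ThreadCountProbe {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Number of live threads, daemon and non-daemon. */
    static int liveThreads() {
        return THREADS.getThreadCount();
    }

    /** Highest live thread count since the JVM started (or since the peak was reset). */
    static int peakThreads() {
        return THREADS.getPeakThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("live before: " + liveThreads());
        for (int i = 0; i < 20; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(2000); // keep the thread alive long enough to be counted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true); // daemon threads do not block JVM exit
            t.start();
        }
        System.out.println("live after:  " + liveThreads());
        System.out.println("peak:        " + peakThreads());
    }
}
```

Logging these two numbers periodically makes a thread leak show up long before the OOM does.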

The following code simulates high concurrency

public static void test() {
        for (int i = 0; i < 20; i ++){
            System.out.println(Thread.currentThread());
            new Thread(() -> crateSlowThread()).start();
        }
    }

    private static void crateSlowThread(){
        try {
            System.out.println(Thread.currentThread());
            Thread.sleep(15000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    @RequestMapping("/createNativeThreads1")
    public String createNativeThreads1(){
        System.out.println("createNativeThreads test1");
        CreateNativeThreads.test1();
        return "Success!";
    }

Test with JMeter.

Monitoring tools bundled with the JVM

jps lists the Java processes running on the target system

Example: jps -mlvV

Main options:

-m prints the arguments passed to the main class

-l prints the full package name of the main class, or the full path of the application's JAR file

-v prints the JVM startup arguments, for example -XX:+HeapDumpOnOutOfMemoryError

-V prints the arguments passed to the JVM through the flags file
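What jps -v shows from the outside can also be read from inside the process, which is handy for logging the effective flags at startup. This is a small sketch using the standard RuntimeMXBean; the class name StartupArguments is hypothetical, and the "pid@hostname" shape of getName() is common on HotSpot but not guaranteed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

// Hypothetical helper class, not part of the article's source code.
class StartupArguments {

    /** JVM options passed at startup, such as -Xmx or -XX flags (excludes main-class arguments). */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        // Commonly formatted as pid@hostname, but the format is implementation-specific.
        System.out.println("JVM name: " + runtime.getName());
        jvmArguments().forEach(arg -> System.out.println("  " + arg));
    }
}
```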

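jstat monitors the performance statistics of a virtual machine

Example: jstat -gc -h 2 44074 1s 5

Basic options:

-t adds a timestamp column showing the time elapsed since the target JVM started

-h n prints the column headers every n rows of output, where n is greater than 0

vmid the identifier of the target virtual machine

interval the interval between samples

count the number of samples to take

Statistics options:

-class class loader statistics

-compiler JIT compiler statistics

-gc garbage-collected heap statistics

-gccapacity capacity of each generation

-gccause same as -gcutil, plus the cause of the last and the current garbage collection

-gcnew young generation statistics

-gcnewcapacity young generation space sizes

-gcold old generation statistics

-gcoldcapacity old generation space sizes

-gcmetacapacity Metaspace size information

-gcutil summary of garbage collection statistics

-printcompilation Java HotSpot VM compilation method statistics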
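Roughly the same per-area numbers that jstat -gc reports can be sampled in-process through the standard MemoryPoolMXBean list. The sketch below is an illustration only; the class name PoolUsage is made up, and the pool names (Eden, Survivor, Old Gen, Metaspace and so on) vary with the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the article's source code.
class PoolUsage {

    /** Maps each memory pool name to the number of bytes currently used. */
    static Map<String, Long> usedBytesByPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null means the pool is no longer valid
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        usedBytesByPool().forEach((name, bytes) ->
                System.out.printf("%-30s %,d KB%n", name, bytes / 1024));
    }
}
```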

jmap displays shared object memory maps or heap memory details for a given process

Examples:

Dump all live objects in the heap

jmap -dump:live,format=b,file=filename.bin 44074

Print a histogram of the live objects in the heap

jmap -histo:live 44074

Main options:

-clstats shows information about the loaded classes

-finalizerinfo shows the objects awaiting finalization

-histo shows the instance count and memory footprint of each class, sorted by memory usage in descending order

-histo:live counts only the live objects in the heap

-dump dumps a snapshot of the JVM heap

-dump:live dumps only the live objects in the heap

If the service hangs after an OOM, or the monitoring system detects that it is down and restarts it, the heap snapshot can no longer be exported with jmap. We therefore add the following two parameters when starting the virtual machine:

-XX:+HeapDumpOnOutOfMemoryError

-XX:HeapDumpPath=/dump

jinfo views or modifies the parameters of a Java process

Examples:

Show the configuration: jinfo 44074

Modify a process parameter: jinfo -flag +HeapDumpAfterFullGC 44074

Main options:

-flag name prints the value of the flag called name

-flag [+|-]name enables or disables a boolean flag

-flag name=value sets the named flag to the given value

-flags prints the flags passed to the JVM

-sysprops prints the Java system properties
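On HotSpot, the same read and write operations are exposed in-process through com.sun.management.HotSpotDiagnosticMXBean. The sketch below assumes a HotSpot-based JVM; the class name FlagInspector is hypothetical, and only flags the JVM marks as manageable (HeapDumpOnOutOfMemoryError is one of them) can be changed at run time.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the article's source code. HotSpot-specific.
class FlagInspector {

    private static HotSpotDiagnosticMXBean bean() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    /** Current value of a VM flag, similar to jinfo -flag name. */
    static String read(String flag) {
        return bean().getVMOption(flag).getValue();
    }

    /** Changes a manageable VM flag, similar to jinfo -flag name=value. */
    static void write(String flag, String value) {
        bean().setVMOption(flag, value);
    }

    public static void main(String[] args) {
        String flag = "HeapDumpOnOutOfMemoryError";
        System.out.println(flag + " before: " + read(flag));
        write(flag, "true");
        System.out.println(flag + " after:  " + read(flag));
    }
}
```

Trying to write a flag that is not manageable throws an IllegalArgumentException, which mirrors what jinfo reports for such flags.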

jstack prints the thread stacks of a Java process and the locks each thread holds

Example: jstack 44074

Common options:

-F forces a stack dump when jstack [-l] pid does not respond

-l prints additional information about locks, such as locked ownable synchronizers

-m prints a mixed-mode stack trace that has both Java and native C/C++ frames
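When attaching jstack to the process is not possible, a rough equivalent of its output can be produced from inside the application with Thread.getAllStackTraces(). This is only a sketch; the class name StackSnapshot and the output layout are made up and do not match the jstack format exactly.

```java
import java.util.Map;

// Hypothetical helper class, not part of the article's source code.
class StackSnapshot {

    /** Builds a text dump of every live thread and its current stack frames. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append(String.format("\"%s\" state=%s daemon=%b%n",
                    t.getName(), t.getState(), t.isDaemon()));
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append(System.lineSeparator());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```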

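Commonly used JVM tuning parameters

Heap settings:

-Xmx4g maximum heap size of the process; exceeding it causes an OOM

-Xms2g initial heap size

-Xmn1g young generation size; the official recommendation is 3/8 of the whole heap

-XX:NewRatio=n ratio of the old generation to the young generation (young:old = 1:n)

-Xss512k stack size of each thread

-XX:SurvivorRatio=n ratio of Eden to a single Survivor space in the young generation. For example, with n=4 the ratio of Eden to the two Survivor spaces is 4:2, so each Survivor space takes 1/6 of the young generation

-XX:MetaspaceSize=512m initial Metaspace size, the threshold at which a Metaspace garbage collection is first triggered

-XX:MaxMetaspaceSize=512m upper bound on Metaspace growth, which stops Metaspace from consuming native memory without limit in abnormal situations

-XX:MinMetaspaceFreeRatio=N

After a Metaspace GC, the JVM computes the share of Metaspace that is free. If it is below this value, the JVM grows Metaspace. On the author's machine the default is 40, that is, 40%. This parameter controls how fast Metaspace grows: too small a value makes it grow slowly, so usage creeps toward saturation and may affect later class loading, while too large a value makes it grow too fast and waste memory

-XX:MaxMetaspaceFreeRatio=N

After a Metaspace GC, the JVM computes the share of Metaspace that is free. If it is above this value, the JVM releases part of Metaspace. On the author's machine the default is 70, that is, 70%.

-XX:MaxMetaspaceExpansion=N maximum amount by which Metaspace may grow in one step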

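Garbage collector settings

-XX:+UseSerialGC use the serial collector

-XX:+UseParallelGC use the parallel collector

-XX:+UseParallelOldGC use the parallel old-generation collector

-XX:+UseConcMarkSweepGC use the concurrent (CMS) collector

-XX:ParallelGCThreads=n number of threads the parallel collector uses during collection

-XX:MaxGCPauseMillis=n maximum pause time target for parallel collection

-XX:GCTimeRatio=n target share of total run time spent in garbage collection, computed as 1/(1+n)

-XX:+DisableExplicitGC ignore explicit calls to System.gc()

-XX:MaxTenuringThreshold number of young-generation copies an object survives before it is promoted to the old generation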

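Garbage collection statistics

-XX:+PrintGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps print a timestamp (time since JVM start) with each garbage collection

-Xloggc:filename write the GC log to a file

-XX:+PrintGCApplicationStoppedTime print how long the application was paused during garbage collection

-XX:+PrintGCApplicationConcurrentTime print how long the application ran without interruption before each garbage collection

-XX:+PrintHeapAtGC print detailed heap information before and after each GC

-XX:+HeapDumpOnOutOfMemoryError

-XX:HeapDumpPath=/dump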

Third-party JVM monitoring tools

Eclipse MAT

Download link: https://www.eclipse.org/mat/downloads.php

Eclipse MAT is a widely used tool for analyzing the memory of Java applications. It can be integrated into Eclipse or installed standalone. Let's use the earlier OOM case to introduce it:

public static void test(){
        List<User> list = new ArrayList<>();
        User user = new User();
        while (true){
            list.add(user);
        }
    }

Start application command:

java -jar -XX:+PrintGC -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./ spring-boot-mybatis-1.0-SNAPSHOT.jar

Call this method after startup. The program throws an OOM and generates a heap dump file, java_pid46242.hprof. Then open MAT and import that heap dump file.

MAT measures the memory an object occupies in two ways. The first is the shallow heap: the memory consumed by the object itself. The second is the retained heap: the total memory the garbage collector could reclaim once the object is no longer referenced, which includes the object's own memory plus the memory of the objects reachable only through it. MAT's overview pie chart is based on the retained heap.

The MAT report flags a suspected memory leak. Clicking into it locates the leaking code.

Alibaba's diagnostic tool Arthas

Tool address:

https://alibaba.github.io/arthas/arthas-tutorials?language=cn&id=arthas-basics
https://alibaba.github.io/arthas/arthas-tutorials?language=cn&id=arthas-advanced

Use cases:

Find the jar package where the class is located

Find out the cause of the exception

Find the reason why the code is not executed

Online debug

Global monitoring system status

Real-time monitoring of JVM running status

IBM HeapAnalyzer

This is a graphical tool for finding heap memory leaks.

Official website address:

https://www.ibm.com/support/pages/ibm-heapanalyzer

IBM no longer updates this tool, and its official site recommends MAT instead

Tuning case

Deadlock diagnosis

The following code is a classic deadlock case

public static void test() {
        Object lockA = new Object();
        Object lockB = new Object();

        new Thread(() ->{
            synchronized (lockA){
                try {
                    Thread.sleep(2000);
                }catch (InterruptedException e){
                    e.printStackTrace();
                }
                synchronized (lockB){
                    System.out.println("thread 1");
                }
            }
        }).start();

        new Thread(() ->{
            synchronized (lockB){
                synchronized (lockA){
                    System.out.println("thread 2");
                }
            }
        }).start();
    }

After the main function starts, this method never finishes: calling it over HTTP returns no result. Run the command jstack 45309 > deadlock.txt and inspect the resulting file; it shows the threads in the BLOCKED state.
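The deadlock that jstack reveals can also be detected programmatically, for example from a health check. The sketch below uses the standard ThreadMXBean.findDeadlockedThreads(); the class name DeadlockProbe, the thread names, and the latch-based setup are made up for illustration. The latch only makes the deadlock deterministic, in place of the sleep used in the case above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper class, not part of the article's source code.
class DeadlockProbe {

    /** Ids of threads stuck in a deadlock, or an empty array when there is none. */
    static long[] deadlockedThreadIds() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    /** Starts two daemon threads that take the same two locks in opposite order. */
    static void startDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(lockA, lockB, bothHoldFirstLock), "probe-1");
        Thread t2 = new Thread(() -> lockInOrder(lockB, lockA, bothHoldFirstLock), "probe-2");
        t1.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) { // blocks forever: the other thread owns this lock
                System.out.println("never reached");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlock();
        Thread.sleep(500); // give both threads time to block
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.getThreadInfo(deadlockedThreadIds())) {
            if (info != null) {
                System.out.println(info.getThreadName() + " is " + info.getThreadState()
                        + ", waiting for " + info.getLockName());
            }
        }
    }
}
```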

Heap memory parameter settings

Adding the following two parameters when the Java application starts prints detailed garbage collection information to the log

-XX:+PrintGC

-XX:+PrintGCDetails

The following is a Full GC log entry; let's analyze it

[Full GC (Allocation Failure) [PSYoungGen: 0K->0K(150528K)] [ParOldGen: 243998K->142439K(172032K)] 243998K->142439K(322560K), [Metaspace: 47754K->47754K(1093632K)], 3.6879500 secs] [Times: user=3.91 sys=0.00, real=3.69 secs]

Full GC: this entry is a Full GC; an entry without Full is a Minor GC

Allocation Failure: the cause of this collection, namely that the young generation did not have enough memory to allocate a new object

[PSYoungGen: 0K->0K(150528K)]: the three values are the young generation's heap usage before the collection, its usage after the collection, and its total size

[ParOldGen: 243998K->142439K(172032K)]: the three values are the old generation's heap usage before the collection, its usage after the collection, and its total size

243998K->142439K(322560K): the three values are the heap usage before the collection, the heap usage after the collection, and the total heap size

[Metaspace: 47754K->47754K(1093632K)]: the three values are the Metaspace usage before the collection, its usage after the collection, and the total Metaspace size

3.6879500 secs: how long this GC took

[Times: user=3.91 sys=0.00, real=3.69 secs]: the three times are the CPU time the GC threads spent in user mode, the CPU time spent in the kernel on system calls, and the wall-clock time for which the application was paused
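Reading these numbers by eye gets tedious across a long log. As a minimal sketch, the before/after/total triple of each area can be pulled out with a regular expression; the class name GcLogLine and the pattern are made up for this example and only cover the ParallelGC log format shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the article's source code.
class GcLogLine {
    static final String SAMPLE =
            "[Full GC (Allocation Failure) [PSYoungGen: 0K->0K(150528K)] "
            + "[ParOldGen: 243998K->142439K(172032K)] 243998K->142439K(322560K), "
            + "[Metaspace: 47754K->47754K(1093632K)], 3.6879500 secs] "
            + "[Times: user=3.91 sys=0.00, real=3.69 secs]";

    // Matches "<name>: <before>K-><after>K(<total>K)" for one memory area.
    private static final Pattern AREA =
            Pattern.compile("(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {before, after, total} in KB for the named area, or null if it is absent. */
    static long[] area(String line, String name) {
        Matcher m = AREA.matcher(line);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return new long[] {
                        Long.parseLong(m.group(2)),
                        Long.parseLong(m.group(3)),
                        Long.parseLong(m.group(4))
                };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        long[] old = area(SAMPLE, "ParOldGen");
        System.out.println("old generation after GC: " + old[1] + "K");
        long[] meta = area(SAMPLE, "Metaspace");
        System.out.println("Metaspace after GC:      " + meta[1] + "K");
    }
}
```

The two "after GC" values are the inputs for the sizing rules that follow.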

Heap memory size setting

The analysis above shows that the old generation occupies 142439K ≈ 139M of heap after garbage collection

We size the heap from this value. We recommend setting -Xms and -Xmx to the same size, which reduces the number of GCs during startup and keeps the JVM from requesting more memory from the OS while running. Both are recommended to be 3 to 4 times the old generation's occupancy after garbage collection, which here is 139M * (3~4). The official recommendation is to make the young generation 3/8 of the total heap, so the young generation size is -Xmn = 139M * (3~4) * 3/8

Meta space size setting

The analysis above shows that Metaspace occupies 47754K ≈ 47M after garbage collection

-XX:MetaspaceSize and -XX:MaxMetaspaceSize are recommended to be 1.2 to 1.5 times that value, that is, 47M * (1.2~1.5)

Taking the upper end of each range, the startup parameters are:

java -jar -Xms556m -Xmx556m -Xmn208m -XX:MetaspaceSize=70m -XX:MaxMetaspaceSize=70m -XX:+PrintGC -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./ spring-boot-mybatis-1.0-SNAPSHOT.jar
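The arithmetic behind these values can be written out as a small calculation. This is only a sketch of the rules of thumb stated above (heap = old generation after GC times 3 to 4, young generation = 3/8 of the heap, Metaspace = usage times 1.2 to 1.5); the class and method names are hypothetical.

```java
// Hypothetical helper class, not part of the article's source code.
class HeapSizing {

    /** Converts KB to MB, rounding to the nearest MB (142439K -> 139M, 47754K -> 47M). */
    static long toMb(long kb) {
        return Math.round(kb / 1024.0);
    }

    /** Heap size: old-generation occupancy after GC multiplied by a factor of 3 to 4. */
    static long heapMb(long oldGenAfterGcKb, int factor) {
        return toMb(oldGenAfterGcKb) * factor;
    }

    /** Young generation: 3/8 of the total heap, rounded down. */
    static long youngMb(long heapMb) {
        return heapMb * 3 / 8;
    }

    /** Metaspace: occupancy after GC multiplied by a factor of 1.2 to 1.5, rounded down. */
    static long metaspaceMb(long metaspaceAfterGcKb, double factor) {
        return (long) (toMb(metaspaceAfterGcKb) * factor);
    }

    public static void main(String[] args) {
        long heap = heapMb(142439, 4);          // upper end of the 3~4 range
        long young = youngMb(heap);
        long meta = metaspaceMb(47754, 1.5);    // upper end of the 1.2~1.5 range
        System.out.printf("-Xms%dm -Xmx%dm -Xmn%dm -XX:MetaspaceSize=%dm -XX:MaxMetaspaceSize=%dm%n",
                heap, heap, young, meta, meta);
    }
}
```

With the values from the log, this yields 556m for the heap, 208m for the young generation, and 70m for Metaspace, matching the command line above.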

Garbage collection time

For Minor GC, the time consumed grows with the size of the young generation, while the trigger frequency of Minor GC is inversely related to it. For example:

Sampling the log, Minor GC was triggered 8 times within 10s, a frequency of 0.8 times per second, and the average duration of these 8 GCs was 0.05s = 50ms

If our tuning target is 40ms, we need to shrink the young generation. In the case above we reduce it by 20%, to 208m * 80%
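The adjustment above can be expressed as a short calculation. The sketch assumes, as the text does, that Minor GC pause time scales linearly with young generation size, which is only a rough approximation; the class and method names are hypothetical.

```java
// Hypothetical helper class, not part of the article's source code.
class YoungGenAdjustment {

    /** Minor GC frequency in collections per second over a sampling window. */
    static double gcPerSecond(int gcCount, double windowSeconds) {
        return gcCount / windowSeconds;
    }

    /**
     * Scales the young generation by targetPause / observedPause, assuming pause
     * time is proportional to young generation size. The result is rounded down.
     */
    static long scaledYoungMb(long currentMb, double observedPauseMs, double targetPauseMs) {
        return (long) (currentMb * targetPauseMs / observedPauseMs);
    }

    public static void main(String[] args) {
        System.out.println("Minor GC per second: " + gcPerSecond(8, 10));
        // 50 ms observed, 40 ms target: shrink the 208m young generation to 80%
        System.out.println("-Xmn" + scaledYoungMb(208, 50, 40) + "m");
    }
}
```

For the sampled values this gives a young generation of 166m; after changing -Xmn, the GC log should be sampled again to check both the new pause time and the higher Minor GC frequency.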

Finally, JVM tuning is a perennial topic and my abilities are limited, so criticism and corrections are welcome.

Reference article:

https://docs.oracle.com/javase/8/docs/technotes/tools/unix/index.html
https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/index.html
https://www.oracle.com/technetwork/tutorials/tutorials-1876574.html#t1s2
https://plumbr.io/outofmemoryerror
http://openjdk.java.net/jeps/122

Source code address in the article:

https://github.com/jinjunzhu/spring-boot-mybatis

Origin blog.51cto.com/15072921/2607515