Java Application Performance Monitoring and Tuning

Chapter I Monitoring with JDK command-line tools

I. JVM parameter types

1. Standard parameters

-help

-server

-client

-version

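2. -X parameters

Non-standard parameters

3. -XX parameters

Non-standardized parameters

Relatively unstable

Mainly used for JVM tuning and debugging

Boolean type: -XX:+<name> enables an option, -XX:-<name> disables it

Key-value type: -XX:<name>=<value>

-Xmx: maximum heap size (equivalent to -XX:MaxHeapSize)

-Xms: initial (minimum) heap size (equivalent to -XX:InitialHeapSize)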
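The three parameter families above can be told apart by their prefixes. The following is a minimal sketch, not part of the original notes: the class name `JvmFlagKinds` and its `classify` method are hypothetical names chosen for illustration. It also prints the flags the current JVM was started with, using the standard `RuntimeMXBean`.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration: classifies a JVM flag by its prefix.
final class JvmFlagKinds {

    static String classify(String flag) {
        if (flag.startsWith("-XX:+") || flag.startsWith("-XX:-")) {
            return "XX boolean";      // e.g. -XX:+UseG1GC
        }
        if (flag.startsWith("-XX:") && flag.contains("=")) {
            return "XX key-value";    // e.g. -XX:MaxMetaspaceSize=100M
        }
        if (flag.startsWith("-XX:")) {
            return "XX other";
        }
        if (flag.startsWith("-X")) {
            return "X";               // e.g. -Xmx600M
        }
        return "standard";            // e.g. -version, -server
    }

    public static void main(String[] args) {
        // Flags passed to this JVM on the command line.
        for (String flag : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            System.out.println(flag + " -> " + classify(flag));
        }
        // The effective maximum heap, which -Xmx controls.
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

Starting it with, for example, `-Xmx600M -XX:+UseG1GC` shows one line per flag with its family.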

II. Viewing JVM runtime parameters

1. jinfo, a JVM performance tuning tool

jinfo is a command that ships with the JDK. It shows the extended parameters of a running Java program and supports modifying some of them at runtime.

2. jinfo -help lists all jinfo options

3. jinfo -flags pid, etc.

4. Usage Details: https://www.cnblogs.com/NetKillWill/archive/2017/10/28/jinfo.html
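A single option can also be read from inside the process, much as jinfo reads it from outside. The sketch below assumes a HotSpot JVM, since `com.sun.management.HotSpotDiagnosticMXBean` is HotSpot-specific; the class name `VmOptionPeek` is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper: looks up one -XX option of the current HotSpot JVM.
final class VmOptionPeek {

    static VMOption lookup(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the option does not exist.
        return bean.getVMOption(name);
    }

    public static void main(String[] args) {
        VMOption opt = lookup("MaxHeapSize");
        // isWriteable() tells whether the option may be changed at runtime.
        System.out.println(opt.getName() + " = " + opt.getValue()
                + " (writeable: " + opt.isWriteable() + ")");
    }
}
```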

III. jstat: viewing virtual machine statistics

1. Class loading

jstat -class pid 1000 5

Shows class-loading statistics; the arguments are the process ID, the sampling interval (in milliseconds by default), and the number of samples

2. Garbage collection

jstat -gc pid 1000 3

Shows GC details

3. JIT compilation

jstat -compiler 4578
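Similar counters are available from inside the JVM through the standard management beans. The sketch below is only a rough in-process analogue of `jstat -class` and `jstat -gc`, not a reimplementation; the class name `MiniJstat` is hypothetical.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: samples class-loading and GC counters of this JVM.
final class MiniJstat {

    static String sample() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount());
        // One bean per collector, e.g. one for the young and one for the old generation.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(" | ").append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        int intervalMs = 1000; // like the "1000" in jstat -class pid 1000 5
        int samples = 5;       // like the "5"
        for (int i = 0; i < samples; i++) {
            System.out.println(sample());
            Thread.sleep(intervalMs);
        }
    }
}
```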

IV. jmap + MAT: memory overflow in practice

1. Heap overflow: keep adding objects to a List until the heap space is full

2. Non-heap memory overflow: keep loading classes and holding them in a List until the non-heap space is full

3. How to export a heap dump file

3.1 Export automatically on out-of-memory:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./

3.2 Export manually with the jmap command

jmap -help shows usage help

jmap -dump:format=b,file=heap.hprof pid exports the heap

jmap -heap 4578 and similar commands

4. Analyze the memory overflow with MAT
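The heap-overflow case in item 1 can be reproduced with a few lines. This is a minimal sketch under assumed names (`HeapFiller`, `HOLDER`, and the 32M heap size are illustrative, not from the original notes): a static List keeps every allocation reachable, so the collector can free nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: fills the heap through a List that is never cleared.
final class HeapFiller {

    // Static, so everything added stays reachable from a GC root.
    static final List<byte[]> HOLDER = new ArrayList<>();

    // Adds `chunks` arrays of `chunkBytes` bytes each and returns the list size.
    static int fill(int chunks, int chunkBytes) {
        for (int i = 0; i < chunks; i++) {
            HOLDER.add(new byte[chunkBytes]);
        }
        return HOLDER.size();
    }

    public static void main(String[] args) {
        // Compile, then run with a small heap so the error comes quickly, e.g.:
        //   java -Xms32M -Xmx32M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./ HeapFiller
        while (true) {
            fill(1, 1024 * 1024); // 1 MB per step until OutOfMemoryError
        }
    }
}
```

The resulting .hprof file can then be opened in MAT, where the List should show up as the dominant retainer.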

V. jstack: infinite loops and deadlocks in practice

1. jstack prints information about all threads inside a Java process

2. jps -l lists the running Java processes

3. jstack pid > pid.txt

4. Thread states and state transitions (still to be studied)

5. Investigating an infinite loop

Use top to find the process consuming the most CPU and combine it with jps -l to get the process ID

jstack pid > pid.txt exports the stack information of all threads in the process

top -p pid -H shows each thread of that process

printf "%x" 4584 converts the thread ID to hexadecimal

Search pid.txt (the jstack output) for that hexadecimal thread ID to see the thread's stack

6. Investigating a deadlock

Search pid.txt for "deadlock"
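To have something to practise on, the sketch below creates a two-thread deadlock and finds it with the standard `ThreadMXBean.findDeadlockedThreads()`, which reports the same kind of deadlock that jstack prints. The class and method names are hypothetical; `toNid` mirrors the `printf "%x"` step above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: two threads take the same two locks in opposite order.
final class DeadlockDemo {

    // Formats a native thread ID as hexadecimal, like printf "%x".
    static String toNid(long nativeThreadId) {
        return "0x" + Long.toHexString(nativeThreadId);
    }

    private static Thread locker(Object first, Object second, CountDownLatch bothReady) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothReady.countDown();
                try {
                    bothReady.await(); // wait until both threads hold their first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) { // blocks forever: the other thread holds it
                    System.out.println("never reached");
                }
            }
        });
        t.setDaemon(true); // let the JVM exit despite the stuck threads
        return t;
    }

    // Starts the deadlock and returns how many deadlocked threads were detected.
    static int createAndDetect() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothReady = new CountDownLatch(2);
        locker(a, b, bothReady).start();
        locker(b, a, bothReady).start();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = mx.findDeadlockedThreads(); // null when none found
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + createAndDetect());
        System.out.println("thread 4584 in hex: " + toNid(4584)); // 0x11e8
    }
}
```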

Chapter II Visual monitoring with JVisualVM

I. Location

jvisualvm.exe is under C:\Program Files\Java\jdk1.8.0_181\bin

II. Monitoring a remote jar application

1. Start the application with the following command, then add a JMX connection (to see CPU, heap and stack, class loading, threads, etc.)

nohup java -Xms512M -Xmx600M -Djava.rmi.server.hostname=101ycy.com -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -jar carloan-service-admin-0.0.1.jar > ./out &

2. Add a jstatd connection (to see Visual GC)

cd $JAVA_HOME/bin

touch jstatd.all.policy

vim jstatd.all.policy

jstatd.all.policy reads as follows:

grant codebase "file:${java.home}/../lib/tools.jar" {

permission java.security.AllPermission;

};

Run the following command in the same directory, taking care to set the hostname and port. It can also be started in the background with nohup.

Foreground run:

./jstatd -J-Djava.security.policy=jstatd.all.policy -J-Djava.rmi.server.hostname=101ycy.com -p 2099 -J-Djava.rmi.server.logCalls=true

Background process:

nohup ./jstatd -J-Djava.security.policy=jstatd.all.policy -J-Djava.rmi.server.hostname=101ycy.com -p 2099 -J-Djava.rmi.server.logCalls=true &
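The JMX port opened by the start command in step 1 can also be queried from code rather than from JVisualVM. This is a sketch only: it assumes the host `101ycy.com` and port `1099` from that command, with authentication and SSL disabled as shown there; the class name `JmxPeek` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical client: reads the live thread count of a remote JVM over JMX.
final class JmxPeek {

    // Builds the usual RMI-based JMX service URL for a host and JMX port.
    static JMXServiceURL urlFor(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Host and port taken from the start command in these notes.
        JMXServiceURL url = urlFor("101ycy.com", 1099);
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // A proxy for the remote JVM's thread bean.
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            System.out.println("live threads: " + threads.getThreadCount());
        }
    }
}
```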

Chapter III Debug-style monitoring with BTrace

I. What it does:

1. BTrace can dynamically inject tracing code into the bytecode of a running target application, for debugging

2. Intercepting controller methods

3. Intercepting constructors

4. Intercepting methods that share the same name

5. Intercepting return values

6. Intercepting exceptions

II. Usage

1. Write a script and run it from the command line: btrace pid script.java

2. Install the BTrace plug-in in VisualVM

III. Precautions

1. By default it can only be run against local processes

2. It can be used in production, but the modified bytecode is not restored; the application has to be restarted to revert it

3. Test the script thoroughly locally before running it in production

Chapter IV Tomcat performance monitoring and tuning

I. Tomcat remote debugging

1. JDWP defines the communication protocol between the debugger and the JVM being debugged

2. Usage: a Run/Debug "Remote" configuration

Configure Tomcat first, then IDEA

3. The target can also be an ordinary jar process; add the configuration to its start command

II. tomcat-manager monitoring (built in)

1. Enabled by default in older versions; disabled by default in newer versions for security reasons

2. Configuration:

Add a user in conf/tomcat-users.xml

Create conf/Catalina/localhost/manager.xml to allow remote connections (the file does not exist by default)

Restart

3. Open a browser to view it

III. psi-probe monitoring

1. GitHub: psi-probe; download the source and build it with Maven: mvn clean package

2. The same two files also need to be configured

3. Open a browser to view it

4. Features

Application statistics

JSP precompilation, which speeds up application access

Log viewing

Thread stack monitoring

Connection information

Request count, request processing time, and response size in bytes

IV. Tomcat tuning

1. Memory optimization (covered later)

2. Thread optimization

maxConnections: maximum number of connections. Tomcat 8 uses multiplexed NIO by default; see the documentation for the default values: 10000 for NIO/NIO2 and 8192 for APR

acceptCount: length of the queue that holds incoming connections once the maximum number of connections is exceeded

maxThreads: maximum number of worker threads, i.e. the number of requests that can be handled concurrently; each HTTP request that reaches the server is handled by a worker thread. The default is 200. The right value depends on the machine's CPU and memory; monitor how busy the worker threads are to judge how many to allow

minSpareThreads: minimum number of idle worker threads; setting it too small is not recommended, because requests may then have to wait

3. Configuration optimization

autoDeploy: whether Tomcat periodically checks for newly deployed applications; this affects performance. Set in conf/server.xml; it must be turned off in production

enableLookups: set to false; when enabled, DNS lookups are performed, which costs performance. Disabled by default

reloadable: set to false; when enabled, changes to classes and WEB-INF trigger a reload. Off by default; see context.html in the documentation

protocol: HTTP/1.1; by default NIO is used in place of the BIO of older versions; see server.html in the documentation

APR mode: calls native methods to handle asynchronous I/O at the operating-system level, which substantially increases processing capacity. It is Tomcat's preferred mode for highly concurrent applications and requires installing the dependent libraries

Session optimization: for JSP pages that do not use the native session, disable the session on the page: <%@ page session="false" %>

Documentation: docs/config/http.html

Chapter V Nginx performance monitoring and tuning

I. Installing nginx

1. Modify the yum source

2. yum install nginx

3. The configuration lives in /etc/nginx

4. Configuring a reverse proxy requires configuring SELinux; it needs to be turned off. Disable it with setenforce 0 and check the status with getenforce

II. ngx_http_stub_status: monitoring connection information

1. First check whether this module was compiled in: nginx -V shows the compile parameters; if --with-http_stub_status_module is listed, the module was compiled in

2. Active connections: the current number of active connections; Reading: connections where nginx is reading the request header; Writing: connections where nginx is writing the response; Waiting: idle connections waiting for a request

III. ngxtop: monitoring request information

1. Install python-pip

yum install epel-release

yum install python-pip

2. Install ngxtop

pip install ngxtop

3. Official documentation

https://github.com/lebinh/ngxtop

4. Some commands

Default monitoring: ngxtop

Top client IPs and requests: ngxtop top remote_addr request

IV. nginx-rrd graphical monitoring

1. Graphical monitoring

V. Nginx optimization

1. Increase the number of worker processes and concurrent connections

worker_processes 2; # number of worker processes, generally equal to the number of CPUs

events {

worker_connections 10240; # maximum number of connections each worker process may open

multi_accept on; # accept multiple new connections at once

use epoll; # use epoll (the high-performance event model since Linux 2.6)

}

2. Enable keep-alive (long-lived) connections

Keep-alive can also be enabled for the services called through the reverse proxy

3. Enable caching and compression

gzip on;

4. Operating system optimization

Tune kernel parameters in /etc/sysctl.conf

5. Reference

See other courses dedicated to nginx

Chapter VI JVM-level GC tuning

How much memory should be configured, and which collection mechanism should be used

I. JVM memory structure

1. Runtime data areas (as defined by the specification)

[Figure: runtime data areas as defined by the specification]

2. Program counter: the JVM supports many threads executing at once, and each thread has its own program counter. The method a thread is currently executing is called its current method. If that method is Java code, the program counter holds the address of the instruction being executed; if it is a native method, the value is undefined

3. Virtual machine stack: private to each thread, with the same lifetime as the thread. It describes the memory model of Java method execution: each method invocation creates a stack frame that stores the local variable table, operand stack, dynamic link, method exit, and other information. Each method call, from invocation to completion, corresponds to a stack frame being pushed onto and popped off the virtual machine stack

Heap: the largest piece of memory managed by the JVM, shared by all threads and created when the virtual machine starts. Its only purpose is to store objects. It may be physically discontiguous as long as it is logically contiguous

4. Method area: called Metaspace in Java 8; in Java 6 and 7 it was the permanent generation (PermGen). It is also called non-heap, to distinguish it from the heap

The runtime constant pool is part of the method area

5. Native method stacks: these serve the native methods used by the virtual machine

6. JVM memory structure

Metaspace = class, package, method, field, bytecode, constant pool, symbolic references, etc.

CCS (compressed class space): class metadata addressed through 32-bit class pointers

CodeCache: native code produced by the JIT compiler, and native code used through JNI

 

-XX:NewSize -XX:MaxNewSize: initial and maximum size of the young generation

-XX:NewRatio -XX:SurvivorRatio: ratio of the old to the young generation, and ratio of Eden to one Survivor space

-XX:MetaspaceSize -XX:MaxMetaspaceSize: size of the method area (Metaspace)

-XX:+UseCompressedClassPointers: compressed class pointers, enabled by default; the compressed class space defaults to 1G

and so on
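The areas above can be listed at runtime through the standard `MemoryPoolMXBean`. This is a sketch with a hypothetical class name (`MemoryPools`); the pool names it prints, such as Metaspace or Compressed Class Space, depend on the JVM and the collector in use, so they are not hard-coded here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one line per memory pool of the current JVM.
final class MemoryPools {

    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getType() is HEAP or NON_HEAP; getMax() is -1 when no limit is defined.
            lines.add(pool.getType() + " | " + pool.getName()
                    + " | used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```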

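II. Garbage collection algorithms

1. A memory leak occurs when objects that are no longer needed stay referenced by the application and therefore can never be released

2. Collection idea: enumerate the root nodes and perform reachability analysis

3. Root nodes: class loaders, threads, the local variable tables of the virtual machine stack, static members, constant references, variables in native method stacks, etc.

4. Garbage collection algorithms

Mark-sweep: mark the objects that need to be collected, then clear them all at once

Disadvantages: inefficient (neither the mark phase nor the sweep phase is efficient), and it produces fragmentation, which can trigger GC early

5. Copying

Memory is split into two blocks of equal capacity and only one is used at a time; when that block is full, the surviving objects are copied to the other block

Pros and cons: efficient to run, but space utilization is low

6. Mark-compact

Mark first, then move all surviving objects toward one end and clear the memory beyond the boundary

Pros and cons: no fragmentation, but compacting memory is time-consuming

7. Generational collection (JVM)

The young generation uses the copying algorithm

The old generation uses mark-sweep or mark-compact

8. Object allocation

Objects are allocated in Eden first

Large objects go straight to the old generation: -XX:PretenureSizeThreshold sets the size above which an object counts as large

Long-lived objects are promoted to the old generation:

-XX:MaxTenuringThreshold: the age in the young generation at which an object is promoted to the old generation

-XX:+PrintTenuringDistribution: print the age distribution of surviving objects at each young GC

-XX:TargetSurvivorRatio: target occupancy of a Survivor space after collection. The JVM takes the minimum of the age at which this ratio (80% in this example) is reached and the tenuring threshold; objects reaching that age are also promoted to Old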
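The root enumeration and mark-sweep steps described in this section can be sketched on a toy object graph. This is an illustration only, not how HotSpot is implemented: the "heap" is a map from hypothetical object names to the names they reference, and all class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: heap maps an object name to the names of the objects it references.
final class ToyMarkSweep {

    // Mark phase: everything reachable from the roots survives.
    static Set<String> mark(Map<String, List<String>> heap, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (!heap.containsKey(obj) || !marked.add(obj)) {
                continue; // unknown object, or already visited
            }
            pending.addAll(heap.get(obj));
        }
        return marked;
    }

    // Sweep phase: remove every unmarked object and report what was freed.
    static Set<String> sweep(Map<String, List<String>> heap, Set<String> marked) {
        Set<String> freed = new TreeSet<>(heap.keySet());
        freed.removeAll(marked);
        heap.keySet().removeAll(freed);
        return freed;
    }

    // A -> B is reachable from root A; C <-> D reference each other but are unreachable.
    static String demo() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Collections.<String>emptyList());
        heap.put("C", Arrays.asList("D"));
        heap.put("D", Arrays.asList("C"));
        Set<String> marked = mark(heap, Arrays.asList("A"));
        return sweep(heap, marked).toString();
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo()); // freed: [C, D]
    }
}
```

The C/D cycle is collected even though each object is still referenced, which is the advantage of reachability analysis over reference counting.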

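III. Garbage collectors

1. Serial collector (not used for web applications)

Single-threaded: when memory runs short it pauses the application, starts a GC thread to collect, then resumes the application

Enable with -XX:+UseSerialGC; enabling it also enables the following by default

-XX:+UseSerialOldGC

2. Parallel collector (suited to background processing)

Throughput first

-XX:+UseParallelGC; enabling it also enables the following by default

-XX:+UseParallelOldGC

The default collector in Server mode; a machine with at least 2 cores and 2 GB of memory runs in Server mode

Concept: when memory runs short the application is paused, multiple GC threads collect, then the application resumes

-XX:ParallelGCThreads=<N>: number of GC threads

CPU > 8: N is roughly 5/8 of the CPU count

CPU <= 8: N = CPU count

It has an adaptive feature that sizes the heap automatically to meet the pause-time and throughput goals; this is generally not used

3. Concurrent collectors (suited to web applications)

Response time first

CMS:

-XX:+UseConcMarkSweepGC: sets the old-generation collector; the young generation then defaults to the following

-XX:+UseParNewGC

Low pause, low latency

Collection phases: initial mark (STW, the application is paused); concurrent mark; concurrent preclean; remark (STW); concurrent sweep; concurrent reset

Disadvantages: CPU-sensitive (GC occupies a core while it runs); produces floating garbage; causes space fragmentation

iCMS:

For single-core or dual-core machines; collects in several increments; deprecated in JDK 8

G1: -XX:+UseG1GC; available since JDK 7 and recommended in JDK 8

When to consider switching to G1:

More than 50% of the heap is occupied by live objects

Object allocation and promotion rates vary significantly

Garbage collection takes particularly long, more than 1 s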

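4. Parallel vs. concurrent

Parallel: multiple GC threads work in parallel while user threads wait; suited to weakly interactive scenarios such as scientific computing and background processing

Concurrent: GC runs at the same time as user threads without pausing the application; suited to scenarios with response-time requirements, such as web applications

5. Pause time vs. throughput

Pause time: how long application execution is interrupted during GC; -XX:MaxGCPauseMillis

Throughput: the share of time spent in the application versus in GC; -XX:GCTimeRatio=<n>; GC time fraction: 1/(1+n)

6. The ideal case

Ideally throughput is high while pause time is small; in practice, the more time is spent in the application, the longer the pauses tend to be

7. How to choose a garbage collector

1. First adjust the heap size and let the JVM choose the collector itself

2. If memory is under 100M, use the serial collector

3. On a single core with no pause-time requirement, use serial or let the JVM choose

4. If pauses over 1 s are acceptable, use parallel or let the JVM choose

5. If response time matters and pauses must stay under 1 s, choose a concurrent collector

See the official GC tuning documentation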
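The two formulas in this section are easy to check with a few lines of arithmetic. The GC time fraction 1/(1+n) comes from the notes above. The exact thread-count formula used here, 8 + (cpus - 8) * 5/8 above eight CPUs, is an assumption based on the commonly cited HotSpot heuristic; the notes only say "5/8". The class name is hypothetical.

```java
// Hypothetical helper: arithmetic behind -XX:GCTimeRatio and -XX:ParallelGCThreads.
final class GcGoals {

    // Fraction of total time allowed for GC with -XX:GCTimeRatio=n.
    static double gcFraction(int n) {
        return 1.0 / (1 + n);
    }

    // Assumed default GC thread count: all CPUs up to 8, then 5/8 of the rest.
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        for (int n : new int[] {99, 19, 9}) {
            System.out.printf("GCTimeRatio=%d -> GC share %.2f%%%n", n, gcFraction(n) * 100);
        }
        for (int cpus : new int[] {4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + defaultParallelGcThreads(cpus) + " GC threads");
        }
    }
}
```

For example, -XX:GCTimeRatio=99 asks for at most 1% of time in GC, and -XX:GCTimeRatio=9 for at most 10%.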

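IV. Visual GC log analysis tools

1. Throughput and maximum response time of each garbage collector

2. gceasy: an online tool; upload a GC log file for analysis

3. GCViewer:

Download from GitHub and build with Maven

4. How to print GC logs

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps

-Xloggc:$CATALINA_HOME/logs/gc.log: where the log is written

-XX:+PrintHeapAtGC: print heap information when a GC occurs

The log format differs slightly between collector types

5. Key metrics to look at

Throughput; maximum, minimum, and average response time; the cause of each GC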

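V. GC tuning for Tomcat

1. Steps:

Print GC logs; derive the key performance metrics from the logs; analyze the causes of GC, tune the JVM parameters, and repeat

2. Parameters:

-XX:+DisableExplicitGC: disable explicit GC (calls to System.gc() are ignored)

-XX:+HeapDumpOnOutOfMemoryError: write a heap dump when an out-of-memory error occurs

-XX:HeapDumpPath=$CATALINA_HOME/logs/: where the heap dump is written

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-XX:+PrintGCDateStamps

-Xloggc:$CATALINA_HOME/logs/gc.log: write the GC log to this file

-XX:+PrintGC: print GC log entries

-XX:+PrintGCDetails: print detailed GC log entries

-XX:+PrintGCTimeStamps: print GC timestamps (relative to JVM start)

-XX:+PrintGCDateStamps: print GC timestamps as dates, e.g. 2013-05-04T21:53:59.234+0800

-XX:+PrintHeapAtGC: print heap information before and after each GC

-Xloggc:../logs/gc.log: output path of the log file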

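I. Guidelines for tuning Parallel GC

1. Don't set a maximum heap size unless you are sure of it

2. Set the throughput goal first

3. If the throughput goal cannot be met, increase the maximum heap, but don't let the OS use swap; if it still cannot be met, lower the goal

4. If the throughput goal is met but GC takes too long, set a pause-time goal

5. If GCs are triggered by Metaspace, adjust its size

-XX:MetaspaceSize=100M

-XX:MaxMetaspaceSize=100M

If young GCs are frequent, adjust

-XX:YoungGenerationSizeIncrement=30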

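II. G1 tuning best practices

1. Young generation size: avoid setting the young generation size explicitly with -Xmn, -XX:NewRatio (young-to-old ratio), and similar options; doing so overrides the pause-time goal

2. Pause-time goal: don't make it too strict. G1's throughput goal is 90% application time and 10% GC time; too strict a pause-time goal directly hurts throughput

3. Settings used in the demo

-XX:MetaspaceSize=64M: Metaspace size

-Xms128M -Xmx128M: total heap size

-XX:+UseG1GC: use the G1 garbage collector

-XX:MaxGCPauseMillis=100: target maximum pause time for each collection; with G1 this should be set first

III. Summary

1. The various JVM garbage collectors

2. The two key metrics for judging a garbage collector: throughput and maximum pause time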

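Chapter VII JVM bytecode and Java code-level optimization

I. JVM bytecode instructions and javap

Some differences cannot be seen at the source level and only become visible at the bytecode level

i++ vs ++i; how string concatenation with + works

Other code optimization techniques

II. javap, bundled with the JDK

1. javap -help

2. javap -verbose xxx.class > xxx.txt writes the disassembly of the class file

III. Stack-based architecture

JVM instructions execute on a stack-based architecture, whereas common physical machines are register-based

IV. i++ vs ++i

iconst: pushes an int constant onto the operand stack

istore: stores the top of the operand stack into the local variable table

iload: pushes a value from the local variable table onto the operand stack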
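The difference shows up when the result of the increment is assigned back to the same variable. This is a small sketch with a hypothetical class name; the bytecode in the comments is what javac typically emits for these methods and can be checked with `javap -c IncrementDemo`.

```java
// Hypothetical demo class for comparing i++ and ++i at the bytecode level.
final class IncrementDemo {

    // i = i++ : iload_0 pushes the old value (0), iinc makes the local 1,
    // then istore_0 writes the pushed 0 back, so the increment is lost.
    static int postIncrementAssign() {
        int i = 0;
        i = i++;
        return i;
    }

    // i = ++i : iinc makes the local 1 first, iload_0 pushes 1,
    // then istore_0 writes 1 back.
    static int preIncrementAssign() {
        int i = 0;
        i = ++i;
        return i;
    }

    public static void main(String[] args) {
        System.out.println("i = i++ gives " + postIncrementAssign()); // 0
        System.out.println("i = ++i gives " + preIncrementAssign());  // 1
    }
}
```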

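V. Common code optimization techniques

1. Reuse objects where possible instead of creating new ones

2. Specify the initial capacity when creating container classes

3. ArrayList is fast for random access; LinkedList is fast for insertion and removal

4. Avoid repeated computation when iterating over collections

5. Iterate over a Map using its entries

6. Copy large arrays with System.arraycopy, which calls a native method

7. Prefer primitive types to wrapper types: int rather than Integer; Integer caches values from -128 to 127

8. Don't call System.gc() manually; in production it is usually disabled with a JVM parameter

9. Clear references to obsolete objects promptly to prevent memory leaks

10. Prefer local variables and keep variable scope small; once a variable goes out of scope, the object it refers to can be collected

11. Prefer unsynchronized containers: ArrayList rather than Vector

12. Keep the scope of synchronization as small as possible

synchronized on an instance method is equivalent to locking on this;

on a static method it is equivalent to locking on the class

13. Use ThreadLocal to cache objects that are not thread-safe; SimpleDateFormat, for example, is expensive to construct

14. Prefer lazy loading

15. Minimize the use of reflection, and cache its results

16. Prefer connection pools, thread pools, object pools, and caches

17. Release resources promptly: I/O streams, sockets, database connections

18. Use exceptions sparingly; don't throw exceptions to express normal business logic

19. Use regular expressions sparingly in String operations

replace vs. replaceAll: use the second one less

20. Use appropriate levels for log output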

21. Use placeholders rather than string concatenation for log parameters

log.info("xxx{}",orderid) // this performs better
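A few of the items in this list (5, 6, and 13) are sketched below. The class and method names are hypothetical, and the date pattern and UTC time zone are assumptions chosen so the output is deterministic.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical samples for items 5, 6, and 13 of the list above.
final class OptimizationSamples {

    // Item 13: SimpleDateFormat is not thread-safe and costly to build,
    // so each thread lazily gets its own cached instance.
    private static final ThreadLocal<SimpleDateFormat> DAY_FORMAT =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                f.setTimeZone(TimeZone.getTimeZone("UTC"));
                return f;
            });

    static String formatDay(long epochMillis) {
        return DAY_FORMAT.get().format(new Date(epochMillis));
    }

    // Item 5: entrySet() yields key and value together, avoiding a get() per key.
    static int sumValues(Map<String, Integer> counts) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    // Item 6: System.arraycopy copies the whole range in one native call.
    static int[] copyOf(int[] source) {
        int[] target = new int[source.length];
        System.arraycopy(source, 0, target, 0, source.length);
        return target;
    }

    public static void main(String[] args) {
        System.out.println(formatDay(0L)); // 1970-01-01
        System.out.println(copyOf(new int[] {1, 2, 3}).length);
    }
}
```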



Origin blog.csdn.net/qq812858143/article/details/103340137