JVM performance tuning in 20 minutes: from the basics to hands-on practice

ZGC

Why ZGC was created

The Java ecosystem is strong, but Java is still at a disadvantage in some scenarios. ZGC gives Java a chance to win ground in areas currently dominated by other languages. For example:

  • Android, the mobile operating system led by Google, suffers from visible freezes.
  • Stock exchange systems have very strict real-time requirements and are currently dominated by C++.
  • Big data clusters such as HBase are sensitive to GC performance.

Characteristics

  • ZGC (the Z Garbage Collector) is a low-latency garbage collector introduced in JDK 11. Its stop-the-world (STW) pauses are designed to stay below 1 ms (the original JDK 11 target was 10 ms; later releases brought it under 1 ms), and pause time does not grow with the heap size.
      • How it is achieved: almost all GC work runs concurrently with the application; threads are paused only while the GC roots are scanned.

  • JDK 13 and later support heaps of up to 16 TB (JDK 11 is limited to 4 TB).
      • How it is achieved: Region-based heap management and colored-pointer addressing.

  • Application throughput can drop by up to 15%.
      • Why: when short-lived objects are allocated at a high rate, many of them are created while a concurrent cycle is running and cannot be marked and collected in that cycle. This produces a lot of floating garbage, which hurts throughput and leaves less and less free heap space for relocating objects.

  • It lays the groundwork for future GC features.
      • How it is achieved: the 18 reserved, currently unused bits in the colored pointer.

Memory layout

ZGC manages the heap with a paging model, similar in spirit to the way the Linux kernel manages memory in standard 4 KB pages and in the huge pages supported since kernel 2.6. Like G1, ZGC uses a Region-based heap layout, but it has no concept of generations (in the JDK versions discussed here). The difference from G1 is that ZGC's Regions are dynamic: they are created and destroyed on demand, and their capacity can vary. ZGC has three kinds of Region:

  • Small Region (small page): fixed capacity of 2 MB; stores objects smaller than 256 KB.
  • Medium Region (medium page): fixed capacity of 32 MB; stores objects larger than 256 KB and smaller than 4 MB.
  • Large Region (large page): capacity of 2 × N MB, which can change dynamically. Each large Region holds exactly one large object and is never relocated (relocation is the object copying described later), because copying a large object is expensive.
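The size thresholds above can be sketched as a small classifier. This is only an illustration of the rules in the list, not ZGC's actual code: the class and method names are hypothetical, and placing an object of exactly 256 KB or 4 MB in the smaller page type is an assumption, since the text leaves the boundaries open.

```java
// Sketch only: maps an object size to the page type described in the text.
// Names are hypothetical; boundary handling (<=) is an assumption.
final class ZPageTypes {
    enum PageType { SMALL, MEDIUM, LARGE }

    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    static final long SMALL_OBJECT_LIMIT = 256 * KB;  // up to this size: small page (2 MB)
    static final long MEDIUM_OBJECT_LIMIT = 4 * MB;   // up to this size: medium page (32 MB)
    static final long LARGE_PAGE_GRANULE = 2 * MB;    // large pages are multiples of 2 MB

    static PageType pageTypeFor(long objectSizeBytes) {
        if (objectSizeBytes <= SMALL_OBJECT_LIMIT) return PageType.SMALL;
        if (objectSizeBytes <= MEDIUM_OBJECT_LIMIT) return PageType.MEDIUM;
        return PageType.LARGE;
    }

    // Capacity of the large page that holds one object: rounded up to a multiple of 2 MB.
    static long largePageSizeFor(long objectSizeBytes) {
        long granules = (objectSizeBytes + LARGE_PAGE_GRANULE - 1) / LARGE_PAGE_GRANULE;
        return granules * LARGE_PAGE_GRANULE;
    }

    public static void main(String[] args) {
        System.out.println(pageTypeFor(100 * KB));
        System.out.println(pageTypeFor(1 * MB));
        System.out.println(pageTypeFor(5 * MB) + ", page size " + largePageSizeFor(5 * MB) / MB + " MB");
    }
}
```

A 5 MB object, for example, needs three 2 MB granules, so it gets a 6 MB large page of its own.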

Colored pointers (Color Pointers)

  • ZGC only supports 64-bit systems, so pointers are 64 bits wide.
  • In JDK 11, ZGC uses the low 42 bits of a pointer as the object address. 2^42 bytes is 4 TB, which is the heap space ZGC can manage; this layout changes in later JDK versions.
  • ZGC uses several of the higher bits as GC metadata, which enables fast concurrent marking and concurrent relocation.
  • The remaining bits are reserved as extension points for future GC features.
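To make the bit layout concrete, here is a minimal sketch in Java that treats a pointer as a plain long. It assumes the JDK 11 layout of 42 address bits followed by four metadata bits (Marked0, Marked1, Remapped, Finalizable), which leaves 18 unused bits; the exact bit positions and all names are assumptions made for illustration, and a real JVM does this in native code.

```java
// Sketch of a JDK 11 style colored pointer: 42 address bits + 4 metadata bits.
// Bit positions and names are assumptions for illustration only.
final class ColoredPointer {
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;  // low 42 bits: offset into a 4 TB heap

    static final long MARKED0     = 1L << ADDRESS_BITS;         // bit 42
    static final long MARKED1     = 1L << (ADDRESS_BITS + 1);   // bit 43
    static final long REMAPPED    = 1L << (ADDRESS_BITS + 2);   // bit 44
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);   // bit 45
    static final long COLOR_MASK  = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    // Strips the metadata bits and returns the plain heap offset.
    static long offsetOf(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    // Replaces whatever color the pointer has with the given one.
    static long withColor(long pointer, long color) {
        return (pointer & ~COLOR_MASK) | color;
    }

    static boolean hasColor(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        long remapped = withColor(0x1234_5678L, REMAPPED);
        long marked0 = withColor(remapped, MARKED0);
        // Different colors, same underlying offset.
        System.out.println(Long.toHexString(remapped) + " -> " + Long.toHexString(offsetOf(remapped)));
        System.out.println(Long.toHexString(marked0) + " -> " + Long.toHexString(offsetOf(marked0)));
    }
}
```

Because the color lives in the pointer itself, the collector can change an object's state by rewriting pointers, without touching the object.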

The C program mapping.c illustrates how ZGC's colored pointers use the 64-bit virtual address space.

Compile and run it: all three virtual addresses give access to the same data, that is, one physical address is mapped to three virtual addresses.

Overall process

Overview

A ZGC cycle has two main steps:

  • Mark phase (identify garbage)
  • Relocation phase (copy or move objects)

Garbage marking

Garbage marking uses the reachability analysis algorithm. A pointer is in one of three states:

  • Remapped
      • Before a GC starts, all memory is Remapped. An object that is still Remapped after marking is garbage.

  • M0 (taking two consecutive GCs as an example, M0 is the mark used by the first GC)
      • Live objects that were marked in the previous GC's marking phase but were not relocated in that GC.

  • M1 (in the same example, M1 is the mark used by the second GC)
      • Live objects identified in the current garbage collection.

Marking phase (objects start out as Remapped)

  • Initial mark (mark the GC roots)
  • Concurrent mark (mark the remaining reachable objects)
  • Remark (handle objects missed during concurrent marking)

When marking finishes, objects that are still Remapped are garbage. The next marking cycle uses M1 to mark live objects.
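The marking rule can be sketched as a plain graph traversal. This is a deliberately simplified, single-threaded model of one marking cycle: real ZGC marks concurrently and stores the color in the pointer rather than in the object, and the class, field and method names below are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.List;

// Simplified model of one marking cycle: reachability analysis from the GC roots.
// Hypothetical names; the real collector is concurrent and colors pointers, not objects.
final class MarkSketch {
    enum Color { REMAPPED, M0, M1 }

    static final class Obj {
        final String name;
        Color color = Color.REMAPPED;              // every object starts out as Remapped
        final List<Obj> refs = new ArrayList<>();  // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Gives every object reachable from the roots the live color (M0 or M1).
    static void mark(Collection<Obj> roots, Color liveColor) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (current.color == liveColor) continue;  // already visited in this cycle
            current.color = liveColor;
            pending.addAll(current.refs);
        }
    }

    // After marking, whatever is still Remapped was not reachable and is garbage.
    static List<String> garbage(Collection<Obj> heap) {
        List<String> names = new ArrayList<>();
        for (Obj o : heap) {
            if (o.color == Color.REMAPPED) names.add(o.name);
        }
        return names;
    }
}
```

With objects a, b and c, where only a is a root and a references b, marking with M0 leaves c as the only Remapped object, so c is reported as garbage.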

ZGC relocation

  • If objects are moved within the same page, relocation behaves like the mark-compact algorithm.
  • If objects are moved to a different page, it behaves like the copying algorithm.

Overview of JVM tuning

background

  • Problems in production environments
      • What should you do when an out-of-memory error occurs in production?
      • How much memory should the server be given in production?
      • How do you tune the garbage collector's performance?
      • How do you deal with high CPU load in production?
      • How many threads should the application be given in production?
      • Without adding logging, how do you find out whether a request executed a particular line of code?
      • Without adding logging, how do you inspect a method's arguments and return value in real time?

  • Why tune
      • Prevent OOM
      • Resolve OOM
      • Reduce the frequency of Full GC

  • When to tune
      • Full GC occurs frequently.
      • GC pauses are too long (more than 1 second).
      • The application throws memory errors such as OutOfMemoryError.
      • System throughput or response time is poor or getting worse.

  • Considerations at different stages
      • Before going live
      • While the project is running
      • When an OOM occurs in production

Tuning overview

  • What to base monitoring on
      • Application logs
      • Exception stack traces
      • GC logs
      • Thread snapshots
      • Heap dump snapshots

  • General directions for tuning
      • Write sensible code
      • Make full, sensible use of hardware resources
      • Tune the JVM sensibly

Tuning goals

The goal of JVM tuning is to get higher throughput or lower latency with a smaller memory footprint. That gives three key indicators:

  • Memory usage: the amount of memory the program needs to run normally.
  • Latency: the time the program is paused because of garbage collection.
  • Throughput: the ratio of the time spent running user code to the total time spent on user code plus garbage collection.

As with the CAP theorem in distributed systems, the three cannot all be maximized at once. A Java program cannot have a small memory footprint, low latency and high throughput at the same time: improving any one indicator almost always comes at the expense of the others. Different programs have different goals, so tuning takes different directions. You need to look at the actual scenario, set a clear optimization goal, find the performance bottleneck and optimize that bottleneck specifically.

Tuning Principles

  • Most Java applications (around 90%) do not need JVM tuning.
  • Most GC problems are caused by problems in the code.
  • Before going live, consider setting the machine's JVM parameters to their best values.
  • At the code level, create fewer objects and reduce the use of global variables and large objects.
  • Tune the architecture and the code first; JVM tuning is the last resort.
  • Analyzing GC behavior in order to optimize the code is better than adjusting JVM parameters.

Tuning steps

  • Step 1: performance monitoring (symptoms to watch for)
      • Frequent GC
      • High CPU load (check with, for example, top -Hp <pid> or top -d 2 -c)
      • OOM
      • Memory leaks
      • Deadlocks
      • Long response times

  • Step 2: performance analysis
      • Print GC logs and analyze anomalies with GCViewer or GCeasy
      • Use the command-line tools flexibly: jstack, jmap, jinfo and so on
      • Dump the heap to a file and analyze it with a memory analysis tool
      • Use Alibaba Arthas, jconsole or JVisualVM to view the JVM state in real time
      • Use jstack to view thread stack information

  • Step 3: performance tuning
      • Increase memory where appropriate and choose a garbage collector that fits the business scenario
      • Optimize the code and keep memory usage under control
      • Add machines to spread the load across nodes
      • Set a sensible number of threads in each thread pool
      • Use middleware such as caches and message queues to improve efficiency

Performance Evaluation/Test Indicators

  • Pause time (or response time)
      • The time between submitting a request and receiving its response; the average response time is usually the focus. Typical orders of magnitude:
      • Querying one record from a database (with an index): a little over ten milliseconds
      • One seek on a mechanical disk: 4 milliseconds
      • Sequentially reading 1 MB from a mechanical disk: 2 milliseconds
      • Sequentially reading 1 MB from an SSD: 0.3 milliseconds
      • Reading 1 MB from memory: about ten microseconds
      • A Java native method call: a few microseconds
      • Transferring 2 KB over the network: 1 microsecond

  • Throughput
      • A measure of the amount of work (requests) completed per unit of time.
      • For GC: the share of total running time spent running user code (total running time = user code running time + memory reclamation time).
      • Throughput is 1 - 1/(1+n), where n is set with -XX:GCTimeRatio=n.

  • Memory usage
      • The amount of memory occupied by the Java heap.

  • How the indicators relate
      • Take highway traffic as an example:
      • Throughput: the number of vehicles passing through the toll station per day
      • Concurrency: the number of vehicles on the highway at a given moment
      • Response time: the speed of the vehicles
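The GC throughput formula above is easy to check with a few lines of Java. This is a small worked example, not part of any JDK API: the class and method names are hypothetical, and the values of n are arbitrary examples.

```java
// Worked example of the throughput formula 1 - 1/(1+n) for -XX:GCTimeRatio=n.
// Hypothetical helper class; the example values of n are arbitrary.
final class GcThroughput {
    // Target share of total time spent in user code when GCTimeRatio is n.
    static double targetThroughput(int gcTimeRatio) {
        return 1.0 - 1.0 / (1.0 + gcTimeRatio);
    }

    // Measured throughput: user code time divided by user code time plus GC time.
    static double measuredThroughput(long userCodeMillis, long gcMillis) {
        return (double) userCodeMillis / (userCodeMillis + gcMillis);
    }

    public static void main(String[] args) {
        System.out.printf("n=99 -> %.2f%n", targetThroughput(99));
        System.out.printf("n=19 -> %.2f%n", targetThroughput(19));
        System.out.printf("9500 ms user code, 500 ms GC -> %.2f%n", measuredThroughput(9500, 500));
    }
}
```

With n = 99 the collector may use at most 1/(1+99) = 1% of the total time, so the target throughput is 99%; with n = 19 it is 95%.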

JVM monitoring and diagnostic command-line tools

No monitoring, no tuning! These command-line tools are installed in the JDK's bin directory. They retrieve information about different aspects and levels of the target JVM and help developers solve hard-to-diagnose problems in Java applications.

  • View running Java processes: jps
      • jps (Java Process Status) lists the HotSpot virtual machine processes on the specified system, so it can be used to find running JVM processes.
      • For a local JVM process, the local virtual machine ID is the same as the operating system process ID and is unique.
      • Basic syntax: jps [options] [hostid]

  • View JVM statistics: jstat
      • jstat (JVM Statistics Monitoring Tool) is a command-line tool for monitoring the runtime state of a virtual machine. It can show class loading, memory, garbage collection, JIT compilation and other runtime data for a local or remote JVM process. On a server with no GUI and only a plain-text console, which is typical in production, it is the first choice for locating JVM performance problems at run time. It is commonly used to detect garbage collection problems and memory leaks.
      • Basic syntax: jstat -<option> [-t] [-h<lines>] <vmid> [<interval> [<count>]], for example jstat -gc <pid> 1000 10
      • jstat can also help determine whether there is a memory leak:
      • In a long-running Java program, run jstat to collect several rows of performance data and take the minimum of the OU column (old generation memory in use) across those rows.
      • Repeat this at long intervals to collect a series of OU minimums. If the minimums trend upward, old generation usage keeps growing, which means more and more objects cannot be reclaimed, so a memory leak is likely.
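The leak check described above comes down to comparing per-run minimums. Here is a minimal sketch, assuming the OU values have already been parsed out of the jstat -gc output into arrays (one array per sampling run); the class and method names are hypothetical, and "rises from every run to the next" is just one simple way to define an upward trend.

```java
import java.util.Arrays;

// Sketch of the OU-minimum leak heuristic. Input: one array of OU samples per
// sampling run, already parsed from jstat output. Names are hypothetical.
final class OldGenTrend {
    // Smallest OU value (old generation in use) seen within one sampling run.
    static double minOu(double[] ouSamples) {
        return Arrays.stream(ouSamples)
                .min()
                .orElseThrow(() -> new IllegalArgumentException("no samples in run"));
    }

    // True when the per-run minimum rises from every run to the next one.
    static boolean minimaKeepRising(double[][] runs) {
        if (runs.length < 2) return false;  // a trend needs at least two runs
        double previous = minOu(runs[0]);
        for (int i = 1; i < runs.length; i++) {
            double current = minOu(runs[i]);
            if (current <= previous) return false;
            previous = current;
        }
        return true;
    }
}
```

Using the minimum rather than the average matters: the minimum approximates what is left in the old generation right after a collection, so normal sawtooth growth between collections does not trigger a false alarm.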

  • View and modify JVM configuration parameters in real time: jinfo
      • jinfo (Configuration Info for Java) shows the virtual machine's configuration parameters and can also be used to adjust them.
      • Basic syntax: jinfo [options] <pid>, for example jinfo -sysprops <pid>

  • Export a memory image file and view memory usage: jmap
      • jmap produces dump files (heap dump snapshots, in binary form). It can also report memory information about the target Java process, including usage of each area of the Java heap, statistics on objects in the heap and class loading information.
      • Basic syntax:
      • jmap [option] <pid>
      • jmap [option] <executable> <core>
      • jmap [option] [server_id@]<remote-hostname-or-IP>

  • Use 1: export a memory image file
      • Manually:
      • jmap -dump:format=b,file=<filename.hprof> <pid>
      • jmap -dump:live,format=b,file=<filename.hprof> <pid>

  • Use 2: display heap memory information
      • jmap -heap <pid>
      • jmap -histo <pid>

  • Use 3: other functions
      • jmap -permstat <pid>
      • Shows the system's ClassLoader information.
      • jmap -finalizerinfo <pid>
      • Shows the objects waiting in the finalizer queue.

  • Heap analysis tool shipped with the JDK: jhat
      • jhat (JVM Heap Analysis Tool) is provided by the Sun JDK and is used together with jmap to analyze the heap dump files that jmap generates. jhat has a small built-in HTTP/HTML server: once it has analyzed the dump file, you can view the results in a browser.
      • Running jhat starts an HTTP service on port 7000, so the analysis is available at http://localhost:7000/.
      • Note: jhat was removed in JDK 9 and later; the official recommendation is to use VisualVM instead.
      • Basic syntax: jhat [options] <heap-dump-file>

  • Print a thread snapshot of the JVM: jstack
      • jstack (JVM Stack Trace) generates a thread snapshot (a virtual machine stack trace) of the specified JVM process at the current moment. A thread snapshot is the collection of method stacks that each thread of that process is currently executing.
      • Thread snapshots help locate the cause of long thread stalls, such as deadlocks between threads, infinite loops and long waits on external resources. When a thread stalls, jstack shows the call stack of every thread.
      • Basic syntax: jstack [option] <pid>
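The same kind of information jstack prints can also be read from inside a running JVM through the standard java.lang.management API. The sketch below is an in-process analogue for illustration, not how jstack itself is implemented; the class and method names are hypothetical, while ThreadMXBean, dumpAllThreads and findDeadlockedThreads are part of the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// In-process analogue of a thread snapshot, built on the standard ThreadMXBean.
// Class and method names are hypothetical.
final class ThreadSnapshot {
    // Text dump of all live threads in this JVM, including held monitors and locks.
    static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            out.append(info);  // ThreadInfo.toString() prints state, lock info and a truncated stack
        }
        return out.toString();
    }

    // Number of threads currently deadlocked on monitors or ownable synchronizers.
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;  // null means no deadlock was found
    }

    public static void main(String[] args) {
        System.out.println(dumpThreads());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```

A check like deadlockedThreadCount() can be exposed through a health endpoint, so a deadlock is noticed without anyone having to log in and run jstack by hand.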

  • Multi-purpose command-line tool: jcmd
      • jcmd was added in JDK 7. It is a multi-purpose tool that covers the functions of all the previous commands except jstat. For example, it can export heap dumps, show memory usage, list Java processes, export thread information, trigger a GC and show JVM uptime.
      • jcmd -l: list all JVM processes
      • jcmd <pid> help: list all the commands supported by the specified process
      • jcmd <pid> <command>: run the given command against the specified process and display its output

  • Collect information from a remote host: jstatd
      • The commands above were only shown monitoring local Java applications, but some of them (such as jps and jstat) can also monitor remote machines. Remote monitoring requires the jstatd tool. jstatd is an RMI server program that acts as a proxy between the JVMs on the machine where it runs and the remote monitoring tool: it passes information about the Java applications on its own machine to the remote machine.

JVM monitoring and diagnostic tool GUI

Arthas, mentioned earlier, is also a JVM monitoring and diagnostic tool. This article only gives a brief overview of the GUI tools here:

  • Tools that come with the JDK
      • jconsole: the visual monitoring tool shipped with the JDK. It shows an overview of a running Java application and monitors heap information, permanent generation (or metaspace) usage, class loading and so on.
          • It has been the JDK's built-in Java monitoring and management console since Java 5: a GUI performance monitoring tool based on JMX (Java Management Extensions) for monitoring memory, threads and classes in the JVM.

      • VisualVM: a tool that provides a visual interface for viewing details of Java applications running on a Java Virtual Machine.
          • VisualVM is a powerful all-in-one visual tool for troubleshooting and performance monitoring.
          • It integrates several JDK command-line tools. It can display JVM processes together with their configuration and environment information (jps, jinfo) and monitor an application's CPU, GC, heap, method area and thread information (jstat, jstack), and it can replace JConsole.
          • Since JDK 6 Update 7, VisualVM has been released as part of the JDK (in the JDK's bin directory). It can also be installed as stand-alone software.

      • JMC: Java Mission Control, with Java Flight Recorder built in. It collects JVM performance data with very low overhead.
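Since jconsole and VisualVM read much of their data through JMX, the same numbers are available to any Java program through the platform MXBeans. The sketch below reads heap usage and GC totals for the current JVM; the class and method names are hypothetical, while the MXBean calls are standard java.lang.management API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Reads the kind of data a JMX console displays, for the current JVM only.
// Class and method names are hypothetical.
final class JvmStats {
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Collections performed so far, summed over all collectors (-1 means "undefined" and is skipped).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, summed over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Heap used: " + heapUsedBytes() / (1024 * 1024) + " MB");
        System.out.println("GC count: " + totalGcCount() + ", GC time: " + totalGcMillis() + " ms");
    }
}
```

Logging these values periodically gives a rough, tool-free view of the memory usage and GC overhead indicators discussed earlier.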

  • Third-party tools
      • MAT: MAT (Memory Analyzer Tool) is an Eclipse-based memory analysis tool. It is a fast, feature-rich Java heap analyzer that helps find memory leaks and reduce memory consumption.
          • MAT is not a universal tool and cannot handle every type of heap dump file. It does, however, parse the mainstream vendors' formats well, such as the HPROF binary heap dumps used by Sun, HP and SAP and IBM's PHD heap dumps.
          • Its most attractive feature is that it can quickly generate a memory leak report, which makes problems easier to locate and analyze. Powerful as MAT is, memory analysis is never a one-click job: many memory problems still have to be found by applying experience and intuition to the information MAT shows.

      • JProfiler: commercial, paid software; very powerful.
      • Flame graphs: when you are chasing extreme performance, it is important to understand what the CPU is doing while your program runs. Flame graphs are a very intuitive way to show how CPU time is distributed over the program's lifetime and to find CPU bottlenecks in the call stacks.

JVM runtime parameters and GC log analysis are covered in separate articles.

Origin blog.csdn.net/a448335587/article/details/129504039