JVM from Beginner to Proficient

1. JVM concept

1.1. What is JVM

  • Java Virtual Machine: the virtual machine that makes the Java language cross-platform

  • The Java virtual machine can be regarded as an abstract computer. Like a real computer, it has its own instruction set and various runtime memory areas.

  • The Java virtual machine is not inherently tied to the Java language; it is only associated with a specific binary file format (the class file format)

  • The Java virtual machine is a bytecode translator: it translates bytecode files into the machine code of each platform, ensuring that the same bytecode file runs correctly on every system.

  • Java's so-called cross-platform nature comes from running a different virtual machine on each platform, so Java files are not executed directly by the operating system.

  • Instead, they are executed through the JVM. As the diagram shows, the JVM does not deal with the hardware directly; it interacts with the operating system to execute Java programs.

Benefits:

  • Write once, run anywhere
  • Automatic memory management and garbage collection
  • Array index out-of-bounds checking
  • Polymorphism
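As a quick illustration of the bounds check, here is a minimal sketch (the class name is ours): the JVM checks every array access at run time and raises an exception instead of silently corrupting memory.

```java
public class BoundsCheckDemo {
    public static void main(String[] args) {
        int[] arr = new int[3];       // valid indexes are 0, 1, 2
        boolean caught = false;
        try {
            arr[3] = 42;              // out of bounds: the JVM checks every access
        } catch (ArrayIndexOutOfBoundsException e) {
            caught = true;            // no silent memory corruption, unlike C
        }
        System.out.println(caught);   // prints "true"
    }
}
```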

1.2. JVM composition

Insert image description here

1.3. Operation process

Insert image description here

The diagram above shows the composition of the JVM, which is divided into four parts:

  • class loader

    • The class loader's job is to load class files into memory. For example, after writing a HelloWorld.java program, we first compile it with the javac command to produce the HelloWorld.class bytecode file. How is that .class file executed? The class loader must load the bytecode file into memory, after which the subsequent JVM modules run the program. The ClassLoader is only responsible for loading; whether the code can be executed is not within its scope of responsibility and falls to the execution engine.
  • execution engine

    • The execution engine, also called the interpreter, is responsible for interpreting bytecode instructions and submitting them to the operating system for execution.
  • native interface

    • The native interface integrates other programming languages for use by Java; its original purpose was to integrate C/C++ programs. When Java was born, C/C++ was dominant, so to gain a foothold Java needed a way to call C/C++ programs. A special area of memory was therefore set aside to process code marked as native: native methods are registered in the Native Method Stack, and the native libraries are loaded when the Execution Engine runs. Today this mechanism is used less and less, except in hardware-related applications such as driving a printer or managing production equipment from a Java system. It is rare in enterprise applications because communication between heterogeneous systems is now well developed, for example via socket communication or Web Services, so we will not cover it in detail.
  • Runtime data area

    • The runtime data area is the focus of the entire JVM. Every program we write is loaded here before it starts running; the prosperity of the Java ecosystem owes much to this area's excellent design. The whole JVM pipeline works like this: the class loader loads the files, the execution engine processes the data in memory, and interaction with heterogeneous systems goes through the native interface!

2. JVM memory structure

2.1. Program counter

Insert image description here

2.1.1. Definition

Program Counter Register (program counter)

  • The program counter is a small memory area that can be regarded as the line-number indicator for the bytecode the current thread is executing. In the virtual machine's conceptual model, the bytecode interpreter selects the next bytecode instruction to execute by changing the value of this counter.

  • Branching, looping, jumping, exception handling, thread recovery, and similar functions all rely on this counter. Because the JVM implements multithreading by switching between threads and allocating processor time slices, at any given moment one processor (or one core of a multi-core processor) executes instructions from only one thread.

  • Therefore, to return to the correct position after a thread switch, each thread needs its own program counter. The counters of different threads do not affect each other and are stored independently; we call this kind of memory area thread-private.

  • If the thread is executing a Java method, this counter records the address of the virtual machine bytecode instruction being executed; if it is executing a native method, the counter is empty (undefined). This is the only memory area for which the Java virtual machine specification does not define any OutOfMemoryError condition.

Function: saves the address of the instruction currently being executed; once that instruction completes, the program counter is updated to the next instruction

Features

  • Thread-private
  • No memory overflow

2.2. Virtual machine stack

Insert image description here

2.2.1. Definition

Java Virtual Machine Stacks

  • The memory required by each thread when running is called the virtual machine stack

  • Each stack is composed of multiple stack frames (Frame), corresponding to the memory occupied by each method call.

  • Each thread can only have one active stack frame, corresponding to the method currently being executed.

  • Consistent with the program counter, the Java virtual machine stack is also thread-private and has the same life cycle as the thread.

  • The virtual machine stack describes the execution memory model of the method. Each method creates a stack frame when executed, which is used to store local variable tables, operand stacks, method exits and other information. The process from execution to end of each method corresponds to the process of a stack frame from pushing to popping.

  • The local variable table stores the eight basic data types known at compile time and object references (reference); a reference is not equivalent to the object itself, and may be a pointer to the object's starting address.

  • The memory for the local variable table is allocated at compile time. A 64-bit long or double occupies two local variable slots; all other data types occupy one. When entering a method, the amount of stack space the method needs can be fully determined, and the size of the local variable table does not change while the method runs.

  • If the stack depth requested by the thread is greater than the depth the virtual machine allows, a StackOverflowError is thrown; if the virtual machine stack can be dynamically expanded (most current virtual machines support dynamic expansion, though fixed-length stacks are also allowed) but enough memory cannot be obtained during expansion, an OutOfMemoryError is thrown.
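The one-frame-per-call rule can be observed from within Java. A small sketch (method names are ours), relying on the fact that Thread.getStackTrace() returns one element per frame on the current virtual machine stack:

```java
public class FrameDemo {
    // Returns the number of frames currently on this thread's stack.
    static int depthHere() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Calling depthHere() through one extra method adds exactly one frame.
    static int oneCallDeeper() {
        return depthHere();
    }

    public static void main(String[] args) {
        int direct = depthHere();
        int nested = oneCallDeeper();
        System.out.println(nested - direct);   // prints "1": one extra stack frame
    }
}
```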

Problem analysis

  • Are local variables within methods thread-safe?
    • If local variables within a method are not accessible outside the scope of the method, it is thread-safe
    • If a local variable refers to an object and escapes the scope of the method, thread safety needs to be considered
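A sketch of both cases (the class and method names are ours): the first method is thread-safe because its StringBuilder never leaves the stack frame; the second returns the mutable object, so it escapes the method and callers must consider thread safety.

```java
public class EscapeDemo {
    // Thread-safe: sb lives only in this frame's local variable table;
    // every calling thread gets its own instance.
    static String safeConcat(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append(b);
        return sb.toString();          // only an immutable String escapes
    }

    // Not automatically safe: the mutable StringBuilder escapes the method
    // and could be appended to by several threads at once.
    static StringBuilder leakyConcat(String a, String b) {
        return new StringBuilder().append(a).append(b);
    }

    public static void main(String[] args) {
        System.out.println(safeConcat("foo", "bar"));            // prints "foobar"
        System.out.println(leakyConcat("foo", "bar").length());  // prints "6"
    }
}
```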

2.2.2. Stack memory overflow

  • Too many stack frames cause stack memory to overflow
  • The stack frame is too large, causing stack memory to overflow.
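A sketch of the first cause (too many frames, names are ours); the exact depth reached before the overflow depends on the -Xss stack size and the frame size:

```java
public class StackOverflowDemo {
    static long depth = 0;

    static void recurse() {
        depth++;       // one more frame pushed onto the virtual machine stack
        recurse();     // never returns normally; the stack eventually fills up
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // by now the stack has unwound, so printing is safe
            System.out.println("overflow after " + depth + " frames");
        }
    }
}
```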

2.2.3. Thread running diagnosis

Case 1: Excessive CPU usage

Locating the problem

  • Use top to find which process is using too much CPU.
  • ps H -eo pid,tid,%cpu | grep <process id> (use the ps command to further locate which thread is causing the high CPU usage)
  • jstack <process id>
    • Find the problematic thread by its thread id in the jstack output and locate the source line of the offending code.

2.2.4. Stack frame

Composition: local variable table, operand stack, dynamic link, method return address

Local variable table:

Stores the method's local variables;

One local variable slot can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress; long and double values occupy two slots;

Local variables are accessed by index, and the index of the first local variable is zero;

Operand stack:

Also known as the operation stack, it is a last-in-first-out (LIFO) stack;

When a method just starts executing, its operand stack is empty. As the method's bytecode instructions execute, constants or variables are copied from the local variable table or from object fields and pushed onto the operand stack; as the computation proceeds, stack elements are popped back into the local variable table or returned to the method's caller.
A complete method execution typically includes many such push/pop cycles;

Put simply, the operand stack is the thread's actual workbench;

Dynamic link:

Simply understood as a reference to the runtime constant pool;

In the class file, a method's calls to other methods and accesses to member variables are represented by symbolic references. The job of dynamic linking is to convert these symbolic references into direct references to the actual methods;
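A sketch of what dynamic linking buys us (class names are ours): the bytecode of main only contains a symbolic reference to Animal.speak(), yet the call is resolved at run time to the direct reference of the subclass's method.

```java
public class DispatchDemo {
    static class Animal {
        String speak() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String speak() { return "woof"; }
    }

    public static void main(String[] args) {
        Animal a = new Dog();
        // invokevirtual carries the symbolic reference Animal.speak();
        // dynamic linking resolves it to Dog.speak() for this instance.
        System.out.println(a.speak());   // prints "woof"
    }
}
```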

2.3. Native method stack

Insert image description here

The native method stack is similar in function and characteristics to the virtual machine stack, and it is likewise thread-private.

The difference is that the native method stack serves the native methods executed by the JVM, while the virtual machine stack serves the Java methods executed by the JVM.

How are native methods served? What language is used to implement them? How are stack-frame-like data structures organized for them?
The virtual machine specification makes no mandatory provisions, so different virtual machines are free to implement this as they see fit.

2.4. Heap

Insert image description here

2.4.1. Definition

  • For most applications, the Java heap is the largest area of memory managed by the JVM. It is shared by all threads and is created when the virtual machine starts.

  • The only purpose of this area is to store object instances, and almost all object instances are allocated here. The JVM specification describes it this way: all object instances and arrays must be allocated on the heap.

  • The Java heap is the main area managed by the garbage collector, so it is often called the GC heap. From the perspective of memory allocation, since current garbage collection is generational, the heap can be divided into the old generation and the new generation; the new generation is further divided into the Eden area and the Survivor space, which is in turn split into a From Survivor area and a To Survivor area. According to the JVM specification, the Java heap may occupy physically discontinuous memory as long as it is logically contiguous.

  • Like a disk, the heap can be either fixed-size or expandable, and mainstream virtual machines adopt the expandable strategy (controlled with -Xmx and -Xms). If an allocation cannot be satisfied and the heap can no longer expand, an OutOfMemoryError is thrown.

Heap

  • Objects created with the new keyword are allocated in heap memory

Features

  • It is shared by threads, and all objects in the heap need to consider thread safety issues.
  • Has garbage collection mechanism

2.4.2. Heap memory overflow
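A minimal sketch of a heap overflow (the class name is ours): a single allocation request far larger than the available heap fails with OutOfMemoryError. With a small heap such as -Xmx16m, even modest allocations fail the same way; the ~16 GB request below makes the failure reproducible on typical default heaps without any flags.

```java
public class HeapOomDemo {
    public static void main(String[] args) {
        try {
            // ~16 GB in one request: far beyond a typical -Xmx setting
            long[] tooBig = new long[Integer.MAX_VALUE - 2];
            System.out.println(tooBig.length);   // not reached on normal heaps
        } catch (OutOfMemoryError e) {
            System.out.println("caught " + e.getClass().getSimpleName());
        }
    }
}
```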

2.4.3. Heap memory diagnosis

  1. jps tool
    lists the Java processes running on the current system
  2. jmap tool
    views heap memory usage: jmap -heap <process id>
  3. jconsole tool
    a graphical, multifunctional monitoring tool that can monitor continuously

Case:
After garbage collection, memory usage is still high

2.5. Method area

Insert image description here

2.5.1. Definition

  • Like the Java heap, the method area is a memory area shared by all threads. It stores data such as class information loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time compiler.

  • The Java virtual machine's restrictions on the method area are relatively loose: besides not requiring contiguous space (like the heap) and allowing either a fixed size or expandability, a virtual machine may even choose not to implement garbage collection here.

  • Relatively speaking, garbage collection is rare in this area, but that does not mean data lives forever once it enters the method area. Collection here mainly targets the constant pool and the unloading of types, and the results are generally unsatisfactory; the conditions for unloading a type in particular are quite harsh. According to the Java virtual machine specification, an OutOfMemoryError is thrown when the method area cannot satisfy a memory allocation.

2.5.2. Composition

Insert image description here
Insert image description here

2.5.3. Method area memory overflow

Before 1.8, it caused permanent generation memory overflow.

* Demonstrates permanent generation overflow: java.lang.OutOfMemoryError: PermGen space
* -XX:MaxPermSize=8m

After 1.8, it causes metaspace memory overflow.

* Demonstrates metaspace overflow: java.lang.OutOfMemoryError: Metaspace
* -XX:MaxMetaspaceSize=8m

2.5.4. Runtime constant pool

  • The constant pool is a table through which virtual machine instructions look up the class names, method names, parameter types, literals, and other information they need to execute.

  • The constant pool lives in the *.class file; when the class is loaded, its constant pool information is placed into the runtime constant pool, and the symbolic addresses in it are resolved to real addresses.

  • The runtime constant pool is part of the method area. Besides the class version, field, method, and interface information, the Class file also contains a constant pool used to store the various literals and symbolic references produced by the compiler. This content is stored in the runtime constant pool after the class is loaded.

  • Generally speaking, in addition to the symbolic references described in the Class file, the translated direct references are also stored in the runtime constant pool. Its biggest characteristic relative to the Class file's constant pool is dynamism: the Java language does not require constants to be produced only at compile time, which is to say the runtime constant pool is not limited to the contents preset in the Class file, and newly generated constants can also be placed into the pool at run time. The feature developers exploit most here is String's intern() method. Since the runtime constant pool is part of the method area, it is naturally subject to the method area's constraints, so an OutOfMemoryError is thrown when memory cannot be obtained.

2.5.5. StringTable characteristics

  • Strings in the constant pool are only symbols and become objects when they are used for the first time.
  • Use the string pool mechanism to avoid repeatedly creating string objects
  • The principle of string variable splicing is StringBuilder (1.8)
  • The principle of string constant splicing is compile-time optimization
  • You can use the intern method to actively put string objects that are not yet in the string pool into the string pool.
    • 1.8: tries to put this string object into the pool; if an equal string is already present it is not added, otherwise the object itself is placed in the pool; in both cases the pooled object is returned
    • 1.6: tries to put this string object into the pool; if an equal string is already present it is not added, otherwise a copy of the object is made and placed in the pool; the pooled object is returned
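These rules are observable directly (string contents are ours); all four comparisons below are guaranteed by the language specification and match the JDK 1.8 behavior described above:

```java
public class StringPoolDemo {
    public static void main(String[] args) {
        String pooled = "jvm_demo";              // literal: lives in the string pool
        String built  = new String("jvm_demo");  // a separate object on the heap

        System.out.println(built == pooled);           // false: two distinct objects
        System.out.println(built.intern() == pooled);  // true: intern returns the pooled one

        String folded = "jvm_" + "demo";         // constant splicing: folded at compile time
        System.out.println(folded == pooled);          // true: same pooled literal

        String part = "demo";
        String joined = "jvm_" + part;           // variable splicing: StringBuilder at run time
        System.out.println(joined == pooled);          // false: a fresh heap object
    }
}
```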

2.5.6. StringTable performance tuning

  • Adjust -XX:StringTableSize=<number of buckets>

  • Consider whether string objects should be interned into the pool

2.5.7. Direct memory

Direct memory is not part of the JVM runtime data area, but it is used frequently and can also raise OutOfMemoryError, so we discuss it here. A machine's direct memory is obviously not limited by the memory allocated to the Java heap, but, being memory, it is still limited by the machine's total memory. When configuring virtual machine parameters, server administrators set -Xmx and similar flags based on physical memory but often ignore direct memory; the total across all memory areas can then exceed the physical memory limit, causing an OutOfMemoryError during dynamic expansion.
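A minimal sketch of using direct memory through NIO: the buffer lives outside the Java heap and is capped by -XX:MaxDirectMemorySize rather than -Xmx.

```java
import java.nio.ByteBuffer;

public class DirectMemoryDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);  // off-heap allocation
        buf.putInt(42);
        buf.flip();                                        // switch to read mode
        System.out.println(buf.isDirect());                // prints "true"
        System.out.println(buf.getInt());                  // prints "42"
    }
}
```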

2.6. Memory overflow exceptions

  • Java heap overflow

  • -Xms20m
    -Xmx20m
    -XX:+HeapDumpOnOutOfMemoryError

  • Virtual machine stack and local method stack overflow

  • direct memory overflow

3. Garbage collection

3.1. How to determine whether an object can be recycled

3.1.1. Reference counting method

Insert image description here
The reference counting algorithm can be summarized simply as follows: attach a reference counter to each object; whenever a reference to the object is created, the counter is incremented by 1, and when a reference expires, the counter is decremented by 1. Whenever the counter reaches 0, the object can no longer be referenced. Objectively speaking, reference counting is simple to implement and efficient to evaluate, and it is a good choice in many scenarios. However, mainstream JVMs do not use reference counting, mainly because it has difficulty handling objects that reference each other cyclically.
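The cyclic-reference problem can be demonstrated on a real JVM (class and field names are ours): two objects that reference each other are still collected once they become unreachable from the GC roots, which a pure reference counter could never do. System.gc() is only a request, so this sketch retries a few times; the result reflects typical HotSpot behavior rather than a specification guarantee.

```java
import java.lang.ref.WeakReference;

public class CycleDemo {
    static class Node {
        Node other;
    }

    public static void main(String[] args) throws InterruptedException {
        Node a = new Node();
        Node b = new Node();
        a.other = b;   // mutual references: each object's "count" never drops to 0
        b.other = a;

        WeakReference<Node> watch = new WeakReference<>(a);
        a = null;      // the cycle is now unreachable from every GC root
        b = null;

        for (int i = 0; i < 10 && watch.get() != null; i++) {
            System.gc();        // a request, usually honored by HotSpot
            Thread.sleep(50);
        }
        System.out.println(watch.get() == null);   // "true": the cycle was collected
    }
}
```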

3.1.2. Reachability analysis algorithm

In the mainstream implementations of mainstream commercial programming languages (such as C# and Java), reachability analysis is used to determine whether an object is alive. The idea of this algorithm is to take a set of objects called "GC Roots" as starting points and search downward from them; the path traveled by the search is called a reference chain. When an object is not connected to the "GC Roots" by any reference chain, this proves the object is unreachable, meaning it can be recycled.

Insert image description here

As shown in the figure, although Obj5, Obj6, and Obj7 are related to each other, they do not have any reference chain to the GC root, so they are determined to be objects that need to be recycled.

The term GC Roots (garbage collection roots) refers to the garbage collector's root objects. The GC collects objects that are not GC Roots and are not reachable from any GC Root.

  • In Java, objects that can be used as GC Roots include the following:

    • Objects referenced in the virtual machine stack;
    • The object referenced by the class static property in the method area;
    • The object referenced by the constant in the method area;
    • The object referenced by JNI (i.e., native methods) in the native method stack;

Insert image description here

3.1.3. Four types of references

Whether it is the count kept by a reference counter or the reference-chain reachability determined by reachability analysis, deciding whether an object is alive always comes down to references.

Before JDK 1.2, a reference was defined as follows: if the data stored in a reference-type variable is the starting address of another piece of memory, that data is called a reference. This definition is pure but also narrow: under it an object has only two states, referenced and unreferenced, which is powerless to describe objects that are "not worth keeping but a pity to discard". We would like to describe objects that stay in memory while memory is sufficient, but can be reclaimed when memory is still tight after a garbage collection; the caching features of many systems fit this scenario.

Therefore, starting with JDK 1.2, the concept of a reference was expanded into strong references, soft references, weak references, and phantom references, in decreasing order of strength.

Strong reference:

  • Only when no GC Root reaches the object through a [strong reference] chain can the object be garbage collected.

  • Strong references are ubiquitous in code, e.g. Object obj = new Object(). As long as the strong reference exists, the garbage collector will never reclaim the referenced object.

Soft reference (SoftReference):

  • When an object is reachable only through soft references, and memory is still insufficient after a garbage collection, another collection is started that reclaims the softly referenced objects.

  • A reference queue can be used to release the soft reference object itself

  • Soft references describe objects that are useful but not essential. Before a memory overflow exception is thrown, softly referenced objects are included in a second collection; if memory is still insufficient after that collection, the OutOfMemoryError is thrown. Since JDK 1.2, the SoftReference class implements soft references.

Weak reference (WeakReference):

  • When an object is reachable only through weak references, it is reclaimed at the next garbage collection regardless of whether memory is sufficient.

  • A reference queue can be used to release the weak reference object itself

  • Weak references also describe non-essential objects, but they are weaker than soft references: a weakly referenced object survives only until the next garbage collection, at which point it is reclaimed whether or not memory is sufficient. Since JDK 1.2, the WeakReference class implements weak references.

Phantom reference (PhantomReference):

  • It must be used together with a reference queue, mainly with ByteBuffer: when the referenced object is collected, the phantom reference is enqueued, and the ReferenceHandler thread invokes the phantom-reference handling to free the associated direct memory.

  • A phantom reference is the weakest reference relationship. Whether an object has a phantom reference has no effect at all on its lifetime, and an object instance cannot be obtained through a phantom reference. The only purpose of setting a phantom reference on an object is to receive a system notification when the object is collected. Since JDK 1.2, the PhantomReference class implements phantom references.

Finalizer reference (FinalReference):

  • No manual coding is required; it is used internally with a reference queue. During garbage collection the finalizer reference is enqueued (the referenced object is not yet reclaimed); the Finalizer thread then finds the referenced object through the finalizer reference and calls its finalize() method. The referenced object can only be reclaimed at the second GC.
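The difference between soft and weak references can be seen side by side (array sizes are ours). After a requested GC, the weakly referenced object is gone while the softly referenced one survives as long as memory is plentiful. System.gc() is only a hint, so this shows typical HotSpot behavior rather than a guarantee:

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceStrengthDemo {
    public static void main(String[] args) throws InterruptedException {
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        // no strong references remain to either array

        for (int i = 0; i < 10 && weak.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println(weak.get() == null);   // "true": weak goes at the next GC
        System.out.println(soft.get() != null);   // "true": soft survives while memory is ample
    }
}
```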

Survive or die?

Even if reachability analysis finds no reference chain to the GC Roots, the object is not "necessarily dead"; at this point it is on probation. To be formally declared dead it must go through at least two marking passes.

If reachability analysis finds that the object has no reference chain to the GC Roots, it is marked for the first time and then filtered once. The filter condition is whether the object needs to execute its finalize() method. If the object has not overridden finalize(), or finalize() has already been invoked by the virtual machine, both cases are treated as "no need to execute".

If the object is determined to need finalize(), it is placed in a queue called the F-Queue, where it is later executed by a low-priority thread that the virtual machine creates itself. "Executed" here means the finalize() method is triggered, but the thread does not wait for it to finish.

The reason is that if an object's finalize() were very slow, or ran an infinite loop, the other objects in the F-Queue would be left waiting, and in the worst case the whole garbage collection system could break down. finalize() is the object's last chance to escape death: the GC later marks the objects in the F-Queue a second time, and if an object wants to save itself in finalize(), it only needs to re-associate with any object on a reference chain, for example by assigning itself (the this keyword) to a member variable of another object; it is then removed from the to-be-reclaimed set at the second marking. If no such association is made, it is essentially certain to be collected.
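The classic self-rescue sketch below is adapted from the well-known FinalizeEscapeGC example; the sleeps are a simplification to give the Finalizer thread time, and finalize() has been deprecated since JDK 9, so treat this as a demonstration rather than a recommended technique:

```java
public class EscapeFromGcDemo {
    static EscapeFromGcDemo saveHook;

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        saveHook = this;   // re-attach to a reference chain: self-rescue
    }

    public static void main(String[] args) throws InterruptedException {
        saveHook = new EscapeFromGcDemo();

        // First GC: the object is unreachable, goes to the F-Queue,
        // and the Finalizer thread runs finalize(), rescuing it.
        saveHook = null;
        System.gc();
        Thread.sleep(500);   // give the low-priority Finalizer thread time
        System.out.println(saveHook != null);   // "true": escaped once

        // Second GC: finalize() is never run twice, so no rescue this time.
        saveHook = null;
        System.gc();
        Thread.sleep(500);
        System.out.println(saveHook == null);   // "true": actually reclaimed
    }
}
```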

3.2. Garbage collection algorithm

3.2.1. Mark-Sweep

The most basic collection algorithm is mark-sweep. As its name suggests, it has two phases, mark and sweep: first the objects to be reclaimed are marked, and after marking completes all marked objects are reclaimed in one pass. (How objects are marked was covered above.) It is called the most basic garbage collection algorithm because the other algorithms build on this idea and improve on its shortcomings.

It has two main problems. The first is efficiency: neither marking nor sweeping is particularly efficient. The second is space: after sweeping, a large amount of discontinuous memory is left over. With too much fragmentation, the program may later be unable to find enough contiguous memory when it needs to allocate a large object and be forced to trigger another garbage collection early. As the figure shows, many garbage fragments are produced, resulting in low space utilization.

Faster

Causes memory fragmentation

Insert image description here
Insert image description here

3.2.2. Mark-Compact

Slower

No memory fragmentation

The copying algorithm performs more copy operations when the proportion of surviving objects is high, so its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need extra space for allocation guarantees to handle the extreme case where 100% of objects survive, so the copying algorithm is generally not used in the old generation.

Based on the characteristics of the old generation, the mark-compact algorithm was proposed. Its marking phase is identical to mark-sweep, but the next step is not to sweep the reclaimable objects directly: instead, all surviving objects are moved toward one end, and then everything beyond the boundary is cleaned up directly. The schematic diagram is as follows:

Insert image description here

Insert image description here

3.2.3. Copying

No memory fragmentation

Requires double memory space

To solve the efficiency problem, the copying algorithm appeared. It divides the available memory into two blocks of equal size and uses only one at a time. When that block is used up, the surviving objects are copied to the other block and the used space is cleaned up in one pass, so half of the area is reclaimed each time. Fragmentation never has to be considered when allocating memory: just move the top-of-heap pointer and allocate sequentially, which is simple to implement and efficient to run.

The drawback is that this approach halves the usable memory, which is a high price to pay.

Today's commercial virtual machines use this method to collect the new generation. IBM research has shown that 98% of objects in the new generation "die young", so the memory need not be divided 1:1: it is split into one larger Eden area and two smaller Survivor areas. At collection time, the surviving objects in Eden and the in-use Survivor area are copied to the other Survivor area in one pass, and then Eden and that Survivor area are cleaned up at once. HotSpot's default Eden:Survivor ratio is 8:1, so 90% of the new generation is usable and only 10% is set aside as reserved memory. Of course, although around 98% of objects die in most collections, we cannot guarantee that no more than 10% survive each time; when the Survivor area is not enough, we rely on other areas for allocation guarantees, and objects can enter the old generation directly through the memory guarantee mechanism.

Insert image description here

Insert image description here

3.3. Generational garbage collection

Insert image description here

Insert image description here

  • The object is first allocated in the Eden area

  • When the new generation space is insufficient, a minor GC is triggered: the surviving objects in Eden and From are copied to To using the copying algorithm, each survivor's age is incremented by 1, and From and To are swapped.

  • A minor GC triggers stop-the-world (STW): the other user threads are paused and resume only after the garbage collection completes.

  • When an object's age exceeds the threshold, it is promoted to the old generation. The maximum age is 15 (4 bits).

  • When the old generation space is insufficient, a minor GC is triggered first; if space is still insufficient afterwards, a full GC is triggered, with a longer STW pause.

Current commercial garbage collectors all use generational collection. This algorithm has no new ideas; it simply divides memory into several blocks according to object lifetimes. The Java heap is generally divided into a new generation and an old generation, so that the most suitable collection algorithm can be chosen based on the object characteristics of each generation. In the new generation, large numbers of objects die at every collection and only a few survive, so the copying algorithm fits: the collection costs only the copying of a small number of surviving objects. The old generation has a high survival rate and no extra space to guarantee its allocations, so mark-sweep or mark-compact must be used.

  1. The heap is divided into the young generation and the old generation; the young generation is divided into an Eden area and two Survivor areas, with a default ratio of 8:1:1, so only 10% of the space is reserved each time and the remaining 90% is available for newly created objects.
  2. After each garbage collection, each surviving object's age is incremented by 1; when an object has survived 15 collections, it enters the old generation directly.
  3. Another way into the old generation is the memory guarantee mechanism: when space in the new generation is not enough, objects go directly into the old generation.
  4. Garbage collection of the new generation is called Minor GC, and garbage collection of the old generation is called Full GC.

3.3.1. Related VM parameters

Heap initial size: -Xms
Heap max size: -Xmx or -XX:MaxHeapSize=size
Young generation size: -Xmn or (-XX:NewSize=size + -XX:MaxNewSize=size)
Survivor ratio (dynamic): -XX:InitialSurvivorRatio=ratio and -XX:+UseAdaptiveSizePolicy
Survivor ratio: -XX:SurvivorRatio=ratio
Promotion threshold: -XX:MaxTenuringThreshold=threshold
Promotion details: -XX:+PrintTenuringDistribution
GC details: -XX:+PrintGCDetails -verbose:gc
Minor GC before Full GC: -XX:+ScavengeBeforeFullGC
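To check what values the heap-size flags above actually produced in a running JVM, the standard MemoryMXBean can be queried. A minimal sketch (the class name is arbitrary):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Prints the heap limits the flags produced. Run with e.g.
//   java -Xms64m -Xmx256m HeapSizes
// and the max value printed should be close to 256 MB (exact figures vary by JVM).
public class HeapSizes {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long init = mem.getHeapMemoryUsage().getInit();
        long max  = mem.getHeapMemoryUsage().getMax();
        System.out.println("initial heap (-Xms): " + init / (1024 * 1024) + " MB");
        System.out.println("max heap (-Xmx):     " + max  / (1024 * 1024) + " MB");
    }
}
```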

3.4. Garbage collector

3.4.1. Serial

  • single thread
  • The heap memory is small, suitable for personal computers
-XX:+UseSerialGC = Serial + SerialOld

Insert image description here

3.4.2. Throughput priority

  • Multithreading
  • Large heap memory, multi-core CPU
  • Aim for the lowest total STW time per unit of time, e.g. two pauses of 0.2 s each for a total of 0.4 s; minimizing total garbage collection time is what "high throughput" means.
-XX:+UseParallelGC ~ -XX:+UseParallelOldGC
-XX:+UseAdaptiveSizePolicy
-XX:GCTimeRatio=ratio
-XX:MaxGCPauseMillis=ms
-XX:ParallelGCThreads=n

Insert image description here

3.4.3. Response time priority

  • Multithreading
  • Large heap memory, multi-core CPU
  • Keep each single STW pause as short as possible, e.g. five pauses of 0.1 s each for a total of 0.5 s: the total is higher, but every individual pause is short.
-XX:+UseConcMarkSweepGC ~ -XX:+UseParNewGC ~ SerialOld
-XX:ParallelGCThreads=n ~ -XX:ConcGCThreads=threads
-XX:CMSInitiatingOccupancyFraction=percent
-XX:+CMSScavengeBeforeRemark
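The two pause profiles (two pauses of 0.2 s for throughput priority versus five pauses of 0.1 s for response-time priority) can be compared numerically. This is only a toy calculation of the text's own figures, not a benchmark:

```java
// The trade-off using the text's own numbers: per unit of time, a throughput
// collector might pause twice for 0.2 s (lower total, but each pause is long),
// while a low-latency collector pauses five times for 0.1 s (higher total,
// but each single pause is short).
public class PauseTradeoff {
    public static void main(String[] args) {
        double throughputTotal = 0.2 + 0.2;                   // fewer, longer pauses
        double lowLatencyTotal = 0.1 + 0.1 + 0.1 + 0.1 + 0.1; // more, shorter pauses
        System.out.println("throughput collector total STW:  " + throughputTotal);
        System.out.println("low-latency collector total STW: " + lowLatencyTotal);
        // Throughput priority minimizes total GC time; response-time priority
        // minimizes the longest single pause (0.2 s vs 0.1 s).
    }
}
```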

Insert image description here

3.4.4. G1

Definition: Garbage First

  • 2004: first G1 paper published
  • 2009: experimental in JDK 6u14
  • 2012: officially supported in JDK 7u4
  • 2017: default collector in JDK 9

Applicable scene

  • Cares about both throughput and low latency; the default pause target is 200 ms.

  • Suited to very large heaps; it divides the heap into multiple regions of equal size.

  • As a whole it is a mark-compact algorithm; between two regions it uses a copying algorithm.

Related JVM parameters

-XX:+UseG1GC
-XX:G1HeapRegionSize=size
-XX:MaxGCPauseMillis=time

3.4.4.1. G1 garbage collection phases

Insert image description here

3.4.4.2. Young Collection

  • Triggers STW (stop the world)

Insert image description here
Insert image description here
Insert image description here

3.4.4.3. Young Collection + CM

  • The initial marking of GC Roots is performed during Young GC.
  • When the old generation's share of the heap reaches a threshold, concurrent marking is performed (no STW), controlled by the following JVM parameter:
-XX:InitiatingHeapOccupancyPercent=percent (default 45%)

Insert image description here

3.4.4.4. Mixed Collection

A comprehensive collection is performed on the Eden (E), Survivor (S), and Old (O) regions

  • The final mark (Remark) is STW
  • Copying of survivors (Evacuation) is STW
-XX:MaxGCPauseMillis=ms

Insert image description here

3.4.4.5. Full GC

  • SerialGC

    • Garbage collection that occurs due to insufficient memory in the young generation - minor gc
    • Garbage collection that occurs due to insufficient memory in the old generation - full gc
  • ParallelGC

    • Garbage collection that occurs due to insufficient memory in the young generation - minor gc
    • Garbage collection that occurs due to insufficient memory in the old generation - full gc
  • CMS

    • Garbage collection that occurs due to insufficient memory in the young generation - minor gc
    • Insufficient memory in the old generation first triggers a concurrent collection; only if that fails (Concurrent Mode Failure) does it degrade to a full gc
  • G1

    • Garbage collection that occurs due to insufficient memory in the young generation - minor gc
    • Insufficient memory in the old generation first triggers concurrent marking and mixed collections; only if reclamation cannot keep up does it degrade to a full gc

3.4.4.6. Young Collection cross-generational references

The problem with young-generation collection is cross-generational references (old-generation objects referring to young-generation objects)
Insert image description here

  • Card Table and Remembered Set
  • When a reference changes, a post-write barrier records the change in a dirty card queue
  • Concurrent refinement threads update the Remembered Set

Insert image description here
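The card-marking idea above can be sketched as a toy model. The 512-byte card size matches HotSpot's default, but the class and its methods are illustrative, not JVM internals:

```java
// A minimal sketch of the card-table idea: the old generation is divided into
// 512-byte "cards"; when a write stores a young-generation reference into an
// old-generation object, a post-write barrier marks that object's card dirty.
// A Minor GC then scans only the dirty cards instead of the whole old generation.
public class CardTable {
    static final int CARD_SHIFT = 9;             // 2^9 = 512-byte cards
    final byte[] cards;

    CardTable(int oldGenBytes) {
        cards = new byte[(oldGenBytes >> CARD_SHIFT) + 1];
    }

    // Called by the (simulated) post-write barrier on every reference store.
    void writeBarrier(int oldGenAddress) {
        cards[oldGenAddress >> CARD_SHIFT] = 1;  // mark the card dirty
    }

    int dirtyCardCount() {
        int n = 0;
        for (byte c : cards) if (c == 1) n++;
        return n;
    }

    public static void main(String[] args) {
        CardTable ct = new CardTable(1 << 20);   // 1 MB old generation
        ct.writeBarrier(100);                    // two writes in the same card
        ct.writeBarrier(400);
        ct.writeBarrier(5000);                   // a write in another card
        System.out.println("dirty cards: " + ct.dirtyCardCount()); // prints 2
    }
}
```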

3.4.4.7. Remark

Uses a pre-write barrier plus the satb_mark_queue (snapshot-at-the-beginning)

Insert image description here

3.4.4.8. JDK 8u20: string deduplication

  • Advantage: saves a lot of memory

  • Disadvantage: slightly more CPU time is used, and young-generation collection takes slightly longer

-XX:+UseStringDeduplication
String s1 = new String("hello"); // char[]{'h','e','l','l','o'}
String s2 = new String("hello"); // char[]{'h','e','l','l','o'}
  • Put all newly allocated strings into a queue
  • When the new generation is recycled, G1 concurrently checks whether there are duplicate strings
  • If they have the same value, let them reference the same char[]
  • Note that it is not the same as String.intern()
    • String.intern() focuses on string objects
    • The focus of string deduplication is char[]
    • Inside the JVM, a different string table is used
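The difference from String.intern() noted above can be observed directly. Deduplication itself is invisible from Java code (it only swaps the backing char[]), so only the intern() side is demonstrated here:

```java
// Illustrates why deduplication is not String.intern(): intern() canonicalizes
// the String *object*, while G1 deduplication only shares the backing char[];
// the two String objects stay distinct.
public class InternDemo {
    public static void main(String[] args) {
        String s1 = new String("hello");
        String s2 = new String("hello");
        System.out.println(s1 == s2);                   // false: two distinct objects
        System.out.println(s1.equals(s2));              // true: same value
        System.out.println(s1.intern() == s2.intern()); // true: one pooled object
    }
}
```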

3.4.4.9. JDK 8u40: class unloading with concurrent mark

After concurrent marking of all objects completes, the JVM knows which classes are no longer used. When none of the classes loaded by a class loader are in use, all of its classes are unloaded.

-XX:+ClassUnloadingWithConcurrentMark (enabled by default)

3.4.4.10. JDK 8u60: humongous object reclamation

  • An object larger than half a region is called a humongous object
  • G1 does not copy humongous objects
  • They are given priority for reclamation
  • G1 tracks all incoming references to humongous objects from the old generation, so a humongous object with zero incoming references can be freed during young-generation collection
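A quick check of the humongous-object rule as the text states it (larger than half a region). The 1 MB region size is an assumed -XX:G1HeapRegionSize value; G1 regions are powers of two between 1 MB and 32 MB:

```java
// With a 1 MB region, any allocation over 512 KB counts as humongous,
// e.g. a byte[524289].
public class Humongous {
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = 1024 * 1024;                       // assume -XX:G1HeapRegionSize=1m
        System.out.println(isHumongous(512 * 1024, region));     // false: exactly half
        System.out.println(isHumongous(512 * 1024 + 1, region)); // true
    }
}
```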

3.4.4.11. JDK 9 adjustment of the concurrent mark start time

  • Concurrent marking must finish before heap space runs out, otherwise the collection degenerates into a Full GC
  • Before JDK 9, -XX:InitiatingHeapOccupancyPercent had to be set manually
  • JDK 9 can adjust the threshold dynamically
    • -XX:InitiatingHeapOccupancyPercent sets the initial value
    • The JVM samples data and adjusts the threshold at runtime
    • An extra margin of safe free space is always kept

3.5. Garbage collection tuning

Preliminary knowledge

  • Master GC related VM parameters and know basic space adjustment
  • Master the related tools
  • Understand one thing: Tuning is related to the application and environment, and there is no one-size-fits-all rule.

3.5.1. Tuning areas

  • Memory
  • lock contention
  • CPU usage
  • io

3.5.2. Determine goals

  • [Low latency] or [High throughput], choose the appropriate collector
  • CMS,G1,ZGC
  • ParallelGC
  • Zing

3.5.3. The fastest GC

The fastest GC is the one that never happens

Check the memory usage before and after FullGC and consider the following questions:

  • Is there too much data?

    • resultSet = statement.executeQuery("select * from large_table limit n")
  • Is the data representation too bloated?

    • object graph
    • Object sizes: a bare Object is about 16 bytes; an Integer takes about 24 bytes versus 4 bytes for a primitive int
  • Is there a memory leak?

    • static Map map =
    • soft
    • weak
    • Third-party cache implementation
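The "static Map" leak from the list above can be sketched; the class name and sizes are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// The classic leak: a static Map keeps strong references forever, so entries
// can never be collected even when nothing else uses them. Mitigations include
// an explicit eviction policy, soft/weak references, or a third-party cache,
// as the list above suggests.
public class LeakDemo {
    static final Map<Integer, byte[]> CACHE = new HashMap<>(); // lives as long as the class

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            CACHE.put(i, new byte[1024]);  // ~10 MB pinned by a static GC root
        }
        // All 10,000 entries remain strongly reachable from a GC root:
        System.out.println("cached entries: " + CACHE.size());
    }
}
```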

3.5.4. New generation tuning

Characteristics of the new generation

  • Allocating memory for new objects is very cheap
    • TLAB: thread-local allocation buffers
  • Reclaiming dead objects costs nothing
  • Most objects die soon after they are used
  • Minor GC takes far less time than Full GC

Is bigger better?

-Xmn
Sets the initial and maximum size (in bytes) of the heap for the young generation (nursery). GC is
performed in this region more often than in other regions. If the size for the young generation is
too small, then a lot of minor garbage collections are performed. If the size is too large, then only
full garbage collections are performed, which can take a long time to complete. Oracle
recommends that you keep the size for the young generation greater than 25% and less than
50% of the overall heap size.
  • The new generation should be able to hold all the data of [concurrency * (data per request-response)]
  • The survivor space should be large enough to retain [currently active objects + objects about to be promoted]
  • Configure the promotion threshold properly so long-lived objects are promoted as quickly as possible
-XX:MaxTenuringThreshold=threshold
-XX:+PrintTenuringDistribution
Desired survivor size 48286924 bytes, new threshold 10 (max 10)
- age 1: 28992024 bytes, 28992024 total
- age 2: 1366864 bytes, 30358888 total
- age 3: 1425912 bytes, 31784800 total
...
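The "total" column in the sample -XX:+PrintTenuringDistribution output above is a running sum of the per-age byte counts; its arithmetic can be checked directly:

```java
// Verifies the running totals in the sample tenuring-distribution log.
public class TenuringTotals {
    public static void main(String[] args) {
        long age1 = 28_992_024L;
        long age2 = 1_366_864L;
        long age3 = 1_425_912L;
        System.out.println(age1);                 // total after age 1: 28992024
        System.out.println(age1 + age2);          // total after age 2: 30358888
        System.out.println(age1 + age2 + age3);   // total after age 3: 31784800
    }
}
```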

3.5.5. Old generation tuning

Take CMS as an example

  • For CMS, the larger the old generation, the better.
  • Don't tune at first: if no Full GC occurs, leave it alone; otherwise, try tuning the new generation first.
  • Observe old-generation usage when a Full GC occurs, then set the old generation 1/4 to 1/3 larger than that.
-XX:CMSInitiatingOccupancyFraction=percent

3.6. Garbage collector

3.6.1. Serial collector

The Serial collector is the most basic and longest-established collector; before JDK 1.3.1 it was the only choice for young-generation collection in the HotSpot virtual machine. It is single-threaded, which means more than that it uses only one CPU or one collection thread to do the work: more importantly, while it performs garbage collection, all other worker threads are suspended until the collection ends. This pause is initiated and executed automatically by the virtual machine in the background, with all worker threads stopped outside the user's control, which is intolerable for many applications. Imagine your computer freezing for five minutes after every hour of running. The designers have a fair defense, though: it is impossible to clean up garbage while garbage objects are still being generated continuously, so the user threads must stop.

From JDK 1.3 to the present, the virtual machine development team has worked continuously to reduce the thread pauses caused by garbage collection. The collectors that have emerged keep getting better, but even now the pauses have not been completely eliminated.

Put this way, the Serial collector may sound like something "tasteless to keep but a pity to throw away", but in fact it is still the default young-generation collector for the virtual machine in Client mode. It has advantages over other collectors: with no overhead from switching between threads, a single thread concentrating on collection naturally achieves the highest collection efficiency. For desktop applications, the memory allocated to the virtual machine is generally not large; collecting a young generation of tens of megabytes or one or two hundred megabytes can be done with pauses between tens of milliseconds and about a hundred milliseconds, which is acceptable as long as it does not happen frequently. So in Client mode, the Serial collector remains a good choice for the young generation.

Insert image description here

3.6.2. ParNew collector

The ParNew collector is essentially a multi-threaded version of the Serial collector. Apart from using multiple threads for garbage collection, its controllable parameters, collection algorithms, Stop The World behavior, object allocation rules, and recycling policies are identical to the Serial collector's.

Apart from multi-threaded garbage collection there is little other innovation in it, yet it is the preferred young-generation collector for virtual machines running in Server mode. One important reason is that, other than the Serial collector, it is the only one that can work together with CMS. In the JDK 1.5 era, HotSpot launched CMS, an epoch-making collector for strongly interactive applications. It was HotSpot's first truly concurrent collector, realizing for the first time the possibility of garbage collection threads and worker threads running at the same time; in other words, the program can keep producing garbage while it is being collected.

However, CMS, being an old-generation collector, cannot work with Parallel Scavenge, the new young-generation collector released in JDK 1.4; it can only be paired with Serial or ParNew. The ParNew collector can be forced with -XX:+UseParNewGC, and it is also the default young-generation collector when the -XX:+UseConcMarkSweepGC option is selected.

In a single-CPU environment, the ParNew collector will never perform better than the Serial collector; because of thread-interaction overhead, it is not even guaranteed to beat the Serial collector on two CPUs implemented via hyper-threading. Of course, as the number of CPUs grows, it becomes very beneficial for effective use of system resources during GC. When there are many CPUs, -XX:ParallelGCThreads can be used to limit the number of garbage collection threads.

Insert image description here

3.6.3. Parallel Scavenge collector

The Parallel Scavenge collector is a young-generation collector using a copying algorithm, and a parallel, multi-threaded collector. Its focus differs from the others': collectors such as CMS aim to shorten the time user threads are stopped during collection, while Parallel Scavenge aims at a controllable throughput. Throughput is the ratio of CPU time spent running user code to total CPU time: throughput = (user thread working time) / (user thread working time + garbage collection time). For example, if the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute, the throughput is 99%. Shorter pause times suit programs that interact with users, since good response speed improves the user experience; high throughput makes efficient use of CPU time and completes the program's computing tasks as soon as possible, so it mainly suits background computation that does not require much interaction.
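The throughput formula and the 100-minute example can be computed directly (the class name is arbitrary):

```java
// Throughput as defined in the text: user time / (user time + GC time).
// With the text's example of 100 minutes total and 1 minute of GC:
public class Throughput {
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    public static void main(String[] args) {
        double t = throughput(99, 1);   // 100 minutes total, 1 minute of GC
        System.out.println(t);          // 0.99, i.e. 99%
    }
}
```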

Two parameters control throughput: the maximum garbage collection pause time, -XX:MaxGCPauseMillis, and the throughput ratio itself, -XX:GCTimeRatio.

-XX:+UseAdaptiveSizePolicy

The adaptive sizing policy is another important point distinguishing the Parallel Scavenge collector from ParNew.

3.6.4. Serial Old collector

The Serial Old collector is the old-generation version of the Serial collector: also single-threaded, using a mark-compact algorithm. This collector, too, is mainly used in Client mode. In Server mode it has two uses: paired with the Parallel Scavenge collector in versions before JDK 5, and as the fallback for CMS when a Concurrent Mode Failure occurs during concurrent collection.

3.6.5. Parallel Old collector

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector, using multiple threads and a mark-compact algorithm. It only became available in JDK 6; before that, Parallel Scavenge had been in an awkward position: if the young generation used Parallel Scavenge, the old generation had no choice except Serial Old. Dragged down by Serial Old on the server side, the combination might not achieve maximum throughput; since a single-threaded old-generation collector cannot make full use of a server's multiple CPUs, in environments with a large old generation and advanced hardware this combination's throughput might not even match that of ParNew + CMS. Only with the arrival of the Parallel Old collector did the "throughput-priority" collector finally have a worthy partner. Where throughput matters and CPU resources are sensitive, the Parallel Scavenge + Parallel Old combination can be used.

Insert image description here

3.6.6. CMS collector

The CMS collector aims to obtain the shortest pause time. As its name (Concurrent Mark Sweep) indicates, it uses a mark-sweep algorithm, divided into four phases:

Only the initial mark and the remark require pausing user threads.

  1. Initial mark: marks only the objects directly reachable from GC Roots; very fast

  2. Concurrent mark: the GC Roots tracing process, run concurrently with user threads

  3. Remark: corrects the marks of objects whose marking changed because the user program kept running during concurrent marking

  4. Concurrent sweep

Insert image description here

Since the collector threads in the two longest phases of the whole process, concurrent marking and concurrent sweeping, can work together with the user threads, CMS memory reclamation is, on the whole, executed concurrently with the user threads.

Three major disadvantages of the CMS collector:

  1. The CMS collector is very sensitive to CPU resources
  2. It cannot handle floating garbage (garbage produced while the concurrent phases run)
  3. Being based on the mark-sweep algorithm, it produces large amounts of fragmentation; -XX:+UseCMSCompactAtFullCollection enables compaction on full collections

3.6.7. G1 collector

First of all, G1 is designed to make performance tuning simple and practical.

-XX:+UseG1GC -Xmx32g -XX:MaxGCPauseMillis=200

Here, -XX:+UseG1GC enables the G1 garbage collector, -Xmx32g sets the maximum heap to 32 GB, and -XX:MaxGCPauseMillis=200 sets the maximum GC pause target to 200 ms. For tuning, once the memory size is fixed, we usually only need to adjust the maximum pause time.

  1. Memory allocation

  2. Young garbage collection

  3. Mixed garbage collection

Insert image description here

Common setting parameters:
Insert image description here

4. Virtual machine performance monitoring and troubleshooting tools

4.1. JConsole tool

JConsole is a JMX-based visual monitoring and management tool. (A related command-line tool, jmap, can dump the heap for offline analysis, e.g.: jmap -dump:live,format=b,file=heap.bin 4308)

  1. The directory where JConsole is located

Insert image description here

It can be found in the bin directory of the JDK; double-click it to open the monitoring interface. The Overview tab gives an overview of the virtual machine's main runtime data in four charts: heap, threads, classes, and CPU usage. The Memory tab monitors the trend of the collector-managed virtual machine memory (the Java heap and the permanent generation).

Insert image description here

4.2. JProfiler tool

Insert image description here


Origin blog.csdn.net/shuai_h/article/details/131031243