JVM architecture explanation + detailed explanation of garbage collection mechanism (based on JDK8 version)

1. JVM memory structure


The memory structure of JVM includes five areas: program counter, virtual machine stack, native method stack, heap area, and method area.

Program Counter (PC Register):

In the JVM, threads share CPU time by taking turns, so at any given moment one CPU core executes the instructions of only one thread. For a thread to resume at the correct position after a switch, each thread needs its own independent program counter that other threads cannot interfere with; otherwise the normal execution order of the program would be disrupted. The program counter is therefore private to each thread. Because the space occupied by its data does not change as the program runs, the program counter never causes an OutOfMemoryError.

Java Virtual Machine Stacks:

The Java stack stores stack frames, one for each method that has been called. A stack frame contains the local variable table (Local Variables), the operand stack (Operand Stack), a reference to the runtime constant pool of the class the current method belongs to (Reference to runtime constant pool; the runtime constant pool is covered in the method area part below), the method's return address (Return Address), and some additional information. When a thread calls a method, it creates the corresponding stack frame and pushes it onto the stack. When the method finishes executing, the frame is popped off the stack.
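The push and pop behaviour described above can be observed from inside a program. The following is a minimal sketch, not part of the original article: the class name StackFrameSketch and its methods are made up for illustration, and it relies on Thread.getStackTrace() reporting one element per active frame, which HotSpot normally does.

```java
public class StackFrameSketch {

    // Number of frames currently on this thread's stack, as reported by the JVM.
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Calls itself `calls` more times before measuring, so calls + 1 frames
    // of this method sit on the stack when currentDepth() runs.
    static int depthAfterRecursion(int calls) {
        return calls == 0 ? currentDepth() : depthAfterRecursion(calls - 1);
    }

    // Compares a direct measurement with one taken at the bottom of the recursion.
    static int extraFrames(int calls) {
        int base = currentDepth();
        int nested = depthAfterRecursion(calls);
        return nested - base;
    }

    public static void main(String[] args) {
        System.out.println("extra frames after 5 nested calls: " + extraFrames(5));
    }
}
```

Once depthAfterRecursion returns, its frames are popped again, so a later call to currentDepth() from the same method reports the original depth.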


Native method stack:

The native method stack works like the Java stack, except that the Java stack serves Java methods while the native method stack serves native methods.

Heap:

The Java heap stores objects and arrays themselves (references to them, such as a local variable that points to an array, are stored on the Java stack). There is only one heap in the JVM, and it is shared by all threads.

Method area:

Like the heap, the method area is shared by all threads. It stores the information of each loaded class (including the class name, method information, and field information), static variables, constants, and the code produced by the compiler.

A very important part of the method area is the runtime constant pool, which is the runtime representation of the constant pool of each class or interface. It is created once the class or interface has been loaded into the JVM. The contents of the class file's constant pool are not the only thing that can enter the runtime constant pool: new constants can also be added at run time, for example through the intern method of String.
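The effect of String.intern can be checked with reference comparisons. This is a small sketch with a made-up class name (InternSketch) and a made-up string value; it only shows that an interned string resolves to the same object as the equal literal.

```java
public class InternSketch {

    // A literal: its value comes from the constant pool of this class file.
    static String literal() {
        return "jvm-demo";
    }

    // A string assembled at run time: a fresh object, equal in content to the literal.
    static String built() {
        return new String(new char[] {'j', 'v', 'm', '-', 'd', 'e', 'm', 'o'});
    }

    public static void main(String[] args) {
        String built = built();
        System.out.println(built == literal());          // false: two different objects
        System.out.println(built.equals(literal()));     // true: same characters
        System.out.println(built.intern() == literal()); // true: intern() returns the pooled instance
    }
}
```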

2. JVM heap


Heap space structure diagram:
(Note: the permanent generation was removed in JDK 1.8 and replaced by Metaspace.)


The heap space is divided into two parts:

  • Young generation: Eden plus two Survivor spaces (S0 and S1).
  • Old generation.

Eden: newly created objects are initially stored in Eden.

S0 and S1: store objects that survived a GC.

Old generation: objects that are still alive after several rounds of collection are moved to the old generation.
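To see these areas in a running JVM, the standard java.lang.management API can list the memory pools. The sketch below uses a made-up class name (HeapPoolsSketch). The pool names it prints depend on the collector in use; with the parallel collector on JDK 8 they are, for example, PS Eden Space, PS Survivor Space, and PS Old Gen.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPoolsSketch {

    // Names of the pools that belong to the heap (Eden, Survivor, old generation).
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-16s %-28s used=%d KB committed=%d KB%n",
                    pool.getType(), pool.getName(),
                    usage.getUsed() / 1024, usage.getCommitted() / 1024);
        }
    }
}
```

The non-heap pools in the output (such as Metaspace on JDK 8) correspond to the method area rather than to the heap.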

3. JVM's garbage collector (GC, Garbage Collector)

3.1 Garbage collector classification + combination


GC can stand for both garbage collection (Garbage Collection) and garbage collector (Garbage Collector).

Collectors are used in combination: one collector for the young generation paired with one for the old generation.

The areas they work on are:

  • The new generation (New Generation), the old generation (Tenured Generation), and the permanent generation (Perm Generation, removed in JDK 8).

Notes on the combination diagram:

  • Serial Old is the fallback for CMS when a "Concurrent Mode Failure" occurs.
  • (Red dashed lines) These two combinations (Serial + CMS and ParNew + Serial Old) were deprecated in JDK 8 and removed completely in JDK 9.
  • (Green dashed line) The Parallel Scavenge + Serial Old combination was deprecated in JDK 14.
  • (Green dashed line) The CMS garbage collector was removed in JDK 14.

3.2 Young generation garbage collectors


Serial GC:

  • Young generation collector; collection is single-threaded and based on the copying algorithm.
  • Parameter: -XX:+UseSerialGC.

Parallel Scavenge GC:

  • Young generation collector; a multi-threaded parallel collector that uses the copying algorithm.
  • Its parameters make throughput controllable (throughput = time the CPU spends running user code / total CPU time consumed). Parallel Scavenge gives priority to throughput in the young generation, but it cannot guarantee overall throughput.
  • Parameter: -XX:+UseParallelGC.

ParNew GC (the default young generation collector when CMS is enabled):

  • Young generation collector; the multi-threaded version of the Serial collector, also using the copying algorithm. The number of collector threads can be limited with the -XX:ParallelGCThreads parameter.
  • Parameter: -XX:+UseParNewGC.

3.3 Old generation garbage collectors


Serial Old GC:

  • The old generation version of the Serial collector; a single-threaded serial collector based on the mark-compact algorithm.
  • The -XX:+UseSerialGC parameter makes both the young and old generations use the serial collectors.

Parallel Old GC:

  • Old generation collector; a multi-threaded parallel collector based on the mark-compact algorithm.
  • Parallel Old exists so that a throughput-first collector is available for the old generation as well.
  • Parameter: -XX:+UseParallelOldGC.

CMS GC:

  • Old generation collector; a multi-threaded concurrent collector based on the mark-sweep algorithm.
  • Parameter: -XX:+UseConcMarkSweepGC.

3.4 G1 garbage collector (default since JDK 9)


G1 GC:

  • Collector for both the young and old generations; multi-threaded, both concurrent and parallel, based on the copying and mark-compact algorithms.
  • Parameter: -XX:+UseG1GC.
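Which young/old pair a JVM is actually running with can be checked at run time through the standard GarbageCollectorMXBean interface. This is a sketch with a made-up class name (CollectorsSketch). The reported names vary by collector and JDK version; with -XX:+UseParallelGC on JDK 8, for example, they are PS Scavenge and PS MarkSweep.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectorsSketch {

    // Names of the collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " pools=" + Arrays.toString(gc.getMemoryPoolNames())
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running it once with each of the parameters listed above (for example java -XX:+UseSerialGC CollectorsSketch) shows how the pair changes.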

4. JVM garbage collection algorithm

4.1 Mark-sweep algorithm (Mark-Sweep)


Mark all objects that can be reclaimed, then reclaim all marked objects in one pass once marking is complete.

Drawbacks of mark-sweep:

  • Memory fragmentation: the surviving objects are scattered unevenly, leaving gaps of different sizes between them. When a larger object has to be allocated later, there may be no single gap big enough to hold it.
  • Low efficiency: marking and sweeping both have to scan the memory space, and memory is released piece by piece.
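The fragmentation problem can be seen in a toy model. The sketch below is an illustration only and is not how HotSpot is implemented: the "heap" is an array of slots, and the names MarkSweepSketch and Obj are made up. It also marks the reachable objects and sweeps the unmarked ones, which is the usual way to implement the idea and yields the same result as marking the garbage directly.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkSweepSketch {

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag every object reachable from obj.
    static void mark(Obj obj) {
        if (obj == null || obj.marked) {
            return;
        }
        obj.marked = true;
        for (Obj ref : obj.refs) {
            mark(ref);
        }
    }

    // Sweep phase: scan every slot, free unmarked objects, reset marks. Returns the number freed.
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false;
            } else {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    // Builds a five-slot heap where A -> C -> E is reachable and B, D are garbage,
    // collects it, and returns the slot layout ('.' = free slot).
    static String demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"), d = new Obj("D"), e = new Obj("E");
        a.refs.add(c);
        c.refs.add(e);
        Obj[] heap = {a, b, c, d, e};
        mark(a); // A is the only root
        sweep(heap);
        StringBuilder layout = new StringBuilder();
        for (Obj o : heap) {
            layout.append(o == null ? "." : o.name);
        }
        return layout.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // A.C.E
    }
}
```

After the collection two slots are free, but they are not adjacent, so an object that needs two contiguous slots still cannot be placed.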

4.2 Copying algorithm


A garbage collection algorithm for the young generation of the Java heap.

The copying algorithm is short for the mark-copy algorithm. The available memory is divided into two blocks of equal size, and only one block is used at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleared in one go.

Features:

  • Solves the problem of memory fragmentation.
  • Space utilization is low: part of the memory is always kept free as the copy target, so it cannot be used for allocation.
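A toy version of the copy step, under simplifying assumptions: liveness is passed in as a boolean array instead of being traced from roots, and references between objects are not updated as a real collector would have to do. The class name CopyingSketch is made up.

```java
public class CopyingSketch {

    // Copies live objects from fromSpace to the start of toSpace and releases
    // the whole fromSpace. Returns the number of survivors.
    static int copyLive(String[] fromSpace, boolean[] live, String[] toSpace) {
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i];
            }
            fromSpace[i] = null;
        }
        return next;
    }

    static String layout(String[] space) {
        StringBuilder sb = new StringBuilder();
        for (String s : space) {
            sb.append(s == null ? "." : s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        String[] to = new String[from.length];
        int survivors = copyLive(from, live, to);
        System.out.println(survivors + " survivors, to-space " + layout(to) + ", from-space " + layout(from));
    }
}
```

The survivors end up contiguous at the start of the to-space, which is why there is no fragmentation; the price is that the to-space has to stay empty between collections.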

4.3 Mark-compact algorithm (also called mark-compress)


A garbage collection algorithm for the old generation of the Java heap.

Mark-compact algorithm: first mark the surviving objects, then move them toward one end of the memory. Everything beyond the boundary of the compacted objects can then be cleared directly.

Features:

  • No memory fragmentation, and better space utilization than the copying algorithm.
  • Lower efficiency: besides marking, there is the extra step of moving the surviving objects.
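The compaction step can be sketched in the same toy style. Again this is an illustration under assumptions, not the HotSpot implementation: the marks are given as a boolean array, references are not fixed up, and the class name MarkCompactSketch is made up.

```java
public class MarkCompactSketch {

    // Slides marked (live) objects toward index 0, then clears everything
    // from the boundary onward. Returns the boundary index.
    static int compact(String[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[boundary++] = heap[i]; // boundary <= i, so nothing unread is overwritten
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;
        }
        return boundary;
    }

    static String layout(String[] heap) {
        StringBuilder sb = new StringBuilder();
        for (String s : heap) {
            sb.append(s == null ? "." : s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] marked = {true, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println("boundary=" + boundary + ", layout " + layout(heap));
    }
}
```

Unlike the copying sketch, this works in place within a single space, so no memory has to be reserved as a copy target.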

4.4 Generational collection algorithm


Virtual machines generally use the generational collection algorithm:

  • It combines the algorithms above. The Java heap is divided into the young generation and the old generation, and each uses the garbage collection algorithm that suits it.

The young generation generally uses the copying algorithm.

  • In every collection of the young generation, a large number of objects are found dead and only a few survive. The copying algorithm therefore completes the collection at the small cost of copying the few survivors.

The old generation generally uses the mark-sweep or mark-compact algorithm.

  • In the old generation the object survival rate is high, so mark-sweep and mark-compact are used for collection.

5. JVM garbage collection mechanism process


The garbage collection process as a whole:

  • Minor GC: cleans the young generation.
  • Major GC: cleans the old generation.
  • Full GC: cleans the entire heap, including the young and old generations (and the permanent generation before JDK 8).
  • Every GC pauses application threads (stop-the-world) for at least part of its work.

Newly created objects are first placed in the Eden area of the young generation. When Eden is full, a Minor GC is triggered and the surviving objects are moved to the Survivor0 area. At the next Minor GC, the survivors from Eden and Survivor0 are moved to the Survivor1 area, and the two Survivor areas swap roles on each subsequent Minor GC, which ensures that one Survivor area is always empty. Objects that are still alive after multiple Minor GCs are moved to the old generation, which stores long-lived objects. When the old generation is full, a Major GC (treated here as equivalent to a Full GC) is triggered. During the GC, application threads are stopped and wait for it to complete. Applications with strict response-time requirements should therefore minimize Major GCs to avoid response timeouts.
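The promotion rule ("still alive after multiple Minor GCs") can be modelled with an age counter per object. The sketch below is a simplified model with made-up names (TenuringSketch, simulate) and made-up lifetimes; it ignores Survivor space capacity. HotSpot exposes a comparable age limit through -XX:MaxTenuringThreshold.

```java
public class TenuringSketch {

    // reachableFor[i] is the number of Minor GCs during which object i is still
    // reachable (hypothetical input). Returns {reclaimed, stillYoung, promoted}
    // after minorGcs young collections with the given age threshold.
    static int[] simulate(int[] reachableFor, int threshold, int minorGcs) {
        int reclaimed = 0, stillYoung = 0, promoted = 0;
        for (int life : reachableFor) {
            int age = 0;
            boolean dead = false, tenured = false;
            for (int gc = 1; gc <= minorGcs; gc++) {
                if (life < gc) {        // unreachable at this GC: reclaimed
                    dead = true;
                    break;
                }
                age++;                  // survived: copied to the other Survivor space
                if (age >= threshold) { // old enough: moved to the old generation
                    tenured = true;
                    break;
                }
            }
            if (tenured) {
                promoted++;
            } else if (dead) {
                reclaimed++;
            } else {
                stillYoung++;
            }
        }
        return new int[] {reclaimed, stillYoung, promoted};
    }

    public static void main(String[] args) {
        int[] lifetimes = {0, 0, 1, 5, 0, 7, 2};
        int[] r = simulate(lifetimes, 3, 4);
        System.out.println("reclaimed=" + r[0] + " stillYoung=" + r[1] + " promoted=" + r[2]);
    }
}
```

With these lifetimes most objects die young and only the two long-lived ones reach the old generation, which is the situation the generational design is built around.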


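JDK command: jstat -<option> <pid> [<interval in milliseconds>]

  • jstat command: monitors runtime data of a JVM such as class loading, memory, GC, and JIT compilation (including several views of the heap).
  • Options:
    -gc: heap statistics for GC, reported as space sizes
    -gcutil: heap statistics for GC, reported as percentages of space used
    -class: class loader behavior statistics
    -compiler: compilation behavior statistics
    -gccapacity: heap capacity of the different generations (young, old, permanent)
    -gccause: the events that caused GC
    -gcnew: young generation statistics
    -gcnewcapacity: young generation heap capacity
    -gcold: old generation statistics
    -gcoldcapacity: old generation heap capacity
    -gcpermcapacity: permanent generation capacity (JDK 7 and earlier; JDK 8 provides -gcmetacapacity for Metaspace instead)
  • Columns in the output of jstat -gc:
    S0C: capacity of the first Survivor space
    S1C: capacity of the second Survivor space
    S0U: used space in the first Survivor space
    S1U: used space in the second Survivor space
    EC: capacity of Eden
    EU: used space in Eden
    OC: capacity of the old generation
    OU: used space in the old generation
    MC: capacity of Metaspace (the method area)
    MU: used space in Metaspace
    CCSC: capacity of the compressed class space
    CCSU: used space in the compressed class space
    YGC: number of young generation GCs
    YGCT: time spent in young generation GCs
    FGC: number of full GCs
    FGCT: time spent in full GCs
    GCT: total time spent in GC
    Units: sizes are in KB, times are in seconds.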

6. Set the size of the heap structure


Demonstrate the effect of a heap overflow:

package com.itholmes;

import java.util.ArrayList;

public class MyGC {

	// Each MyGC instance holds a 2M array, so one object occupies about 2M
	byte[] b = new byte[1024 * 1024 * 2];

	public static void main(String[] args) throws Exception {

		ArrayList<Object> list = new ArrayList<>();

		// Sleep for 20s before allocating
		Thread.sleep(20000);

		for (int i = 0;; i++) {
			Thread.sleep(20);
			list.add(new MyGC());
			System.out.println(i);
			// Every 20 iterations, drop the old list so its objects become unreachable
			if (i % 20 == 0) {
				list = null;
				list = new ArrayList<>();
			}
		}

	}
}

Each object holds a 2M array, so one object occupies about 2M. The program keeps adding such objects to a list and periodically discards the list (setting list to null makes the objects unreachable, so the garbage collector can reclaim them).


Set the heap structure size:

In Eclipse: right-click -> Run As -> Run Configurations, then enter the JVM parameters as VM arguments.
(Outside Eclipse, the same parameters can be passed directly on the command line, for example java -Xmx36m.)

-Xmx: maximum heap size. -Xmx36m sets the maximum heap to 36M.

-Xms: initial heap size. -Xms16m sets the initial (minimum) heap to 16M.

With two different values, the heap size can grow and shrink between the minimum and the maximum, which gives it some elasticity. For the test here the size is fixed instead: both the maximum and the minimum are set to 36M to make testing easier.
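Whether the settings took effect can be checked from inside the program with the standard Runtime API. A minimal sketch with a made-up class name (HeapSizeSketch); note that maxMemory() may report slightly less than the -Xmx value with some collectors, so the numbers should be read as approximate.

```java
public class HeapSizeSketch {

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper limit the heap may grow to (roughly -Xmx).
        System.out.println("max:   " + toMb(rt.maxMemory()) + " MB");
        // Heap currently reserved by the JVM (starts near -Xms and may grow).
        System.out.println("total: " + toMb(rt.totalMemory()) + " MB");
        // Unused part of the currently reserved heap.
        System.out.println("free:  " + toMb(rt.freeMemory()) + " MB");
    }
}
```

For example, run it as java -Xms16m -Xmx36m HeapSizeSketch and compare the output with java -Xms36m -Xmx36m HeapSizeSketch.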

-Xmn: young generation size. -Xmn16m sets the young generation to 16M.

-XX:SurvivorRatio: the ratio of Eden to one Survivor space. S0 and S1 are the same size. For example, with a 16M young generation and -XX:SurvivorRatio=2, Eden:S0:S1 = 2:1:1, that is 8M:4M:4M.
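The arithmetic behind the 8:4:4 example can be written out as follows. The class and method names (SurvivorRatioSketch, split) are made up, and the result is the nominal split only; a real JVM aligns the sizes, so the actual values can differ slightly.

```java
public class SurvivorRatioSketch {

    // Returns {eden, s0, s1} in MB for a young generation of youngMb
    // and -XX:SurvivorRatio=ratio (ratio = Eden / one Survivor space).
    static double[] split(double youngMb, int ratio) {
        // young = eden + s0 + s1 = ratio * survivor + 2 * survivor
        double survivor = youngMb / (ratio + 2);
        double eden = survivor * ratio;
        return new double[] {eden, survivor, survivor};
    }

    public static void main(String[] args) {
        double[] a = split(16, 2);
        System.out.printf("SurvivorRatio=2: Eden=%.1fM S0=%.1fM S1=%.1fM%n", a[0], a[1], a[2]);
        double[] b = split(16, 8);
        System.out.printf("SurvivorRatio=8: Eden=%.1fM S0=%.1fM S1=%.1fM%n", b[0], b[1], b[2]);
    }
}
```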


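With these settings, running the code above produces a heap overflow (java.lang.OutOfMemoryError: Java heap space): the list holds up to 20 objects of about 2M each, roughly 40M, before it is discarded, which exceeds the 36M maximum heap.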

7. jconsole command


jconsole is a visual tool for viewing Java processes.

After creating a new connection and selecting the corresponding Java process, we can see the relevant information.

With the JVM parameters above configured, the values shown in jconsole can be checked against the settings one by one.

8. jps command


Many people think jps comes with the Linux system, but it is actually a JDK command that lists the current Java processes.

9. jvisualvm command


jvisualvm is similar to jconsole, but shows more detail, and it can also connect remotely.

  • By default, jvisualvm cannot monitor remote JVM processes; some configuration is required first.

To monitor Tomcat remotely, configure the catalina.sh (or catalina.bat) file in the bin directory.

10. jinfo command


jinfo [ option ] pid

View detailed information about a Java process, such as its JVM flags and system properties.
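Part of the same information is available to a program about its own JVM through the standard RuntimeMXBean. This sketch, with a made-up class name (JvmInfoSketch), is an in-process approximation and not a replacement for jinfo, which inspects other processes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

public class JvmInfoSketch {

    // The arguments passed to the JVM itself, such as -Xmx36m (not the program arguments).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        System.out.println("VM: " + runtime.getVmName() + " " + runtime.getVmVersion());
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```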

11. jmap command


jmap [options] pid

  • If no option is specified, the memory image information of the Java virtual machine process is displayed.
  • -heap: displays information about the Java heap.
  • -histo: shows statistics of the objects in the Java heap.
  • Other options are also available.

12. jstack command


jstack [ option ] vmid

  • Use jstack to view the thread stacks of a Java process.
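A rough in-process counterpart can be built with Thread.getAllStackTraces(). The sketch below uses a made-up class name (ThreadDumpSketch) and a simplified output format; it covers only the threads of its own JVM and omits details that jstack prints, such as lock information.

```java
import java.util.Map;

public class ThreadDumpSketch {

    // Builds a simple text dump of all live threads and their stack frames.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append("\" state=")
              .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```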


Origin blog.csdn.net/IT_Holmes/article/details/125433386