An in-depth look at the JVM

1. Overview of JVM architecture

1. The JVM runs on top of the operating system; it does not interact with the hardware directly.

2. JVM architecture diagram

According to what it stores, JVM memory is divided into 5 areas: the method area, the Java stack, the native method stack, the heap, and the program counter. The method area and the heap are shared by all threads, while the Java stack, the native method stack, and the program counter are private to each thread.
1. Program counter
A small memory area that can be regarded as the line number indicator of the bytecode being executed by the current thread. The bytecode interpreter selects the next bytecode instruction to execute by changing the value of this counter; basic functions such as branches, loops, jumps, exception handling, and thread resumption all rely on it. To resume at the correct position after a thread switch, each thread needs its own program counter. The counters of different threads do not affect one another and are stored independently, so this kind of memory area is called thread-private memory.
2. Java stack (virtual machine stack)
Like the program counter, it is private to each thread, and its life cycle is the same as that of the thread. The virtual machine stack describes the memory model of Java method execution: each time a method is executed, a stack frame (StackFrame) is created to store the local variable table, the operand stack, the dynamic link, the method exit, and similar information. The process from invoking a method to completing it corresponds to pushing a stack frame onto the virtual machine stack and popping it off again.
3. Native method stack
It serves the Native methods used by the virtual machine (native methods are typically written in C or C++) and is also private to each thread. Under normal circumstances we do not need to pay attention to this area.
4. Heap
The largest block of memory managed by the Java virtual machine. It is shared by all threads and is created when the virtual machine starts. Its main purpose is to store object instances. It is also the main area managed by the garbage collector, so it is often called the "GC heap". Because garbage collectors use generational collection algorithms, the heap is logically divided into a young generation and an old generation.
5. Method area
Also called non-heap. It stores class information, constants, static variables, and code compiled by the just-in-time (JIT) compiler for classes that have been loaded by the virtual machine. It is also a memory area shared by all threads.

Note: The method area contains an area called the Runtime Constant Pool, which mainly stores the literals and symbolic references generated at compile time. This content is placed in the runtime constant pool after the class is loaded.
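
The stack frame behavior described for the Java stack above can be observed from inside a program. The following is only a minimal sketch: the class name StackFrameDemo and the method framesAtDepth are made up for illustration, and the stack is inspected through Thread.getStackTrace().

```java
final class StackFrameDemo {
    /** Recurses n times, then reports how many frames the current thread's stack holds. */
    static int framesAtDepth(int n) {
        if (n == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAtDepth(n - 1);
    }

    public static void main(String[] args) {
        int base = framesAtDepth(0);
        int deeper = framesAtDepth(5);
        // Each additional call pushed exactly one more stack frame.
        System.out.println(deeper - base); // 5
    }
}
```

Each frame disappears again when its call returns, which is why both measurements can be taken from the same caller.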

3. Class loading

(1) Class loading mechanism: the virtual machine loads the data that describes a class from the .class file into memory, verifies it, converts and resolves it, and initializes it, finally producing a Java type that the virtual machine can use directly.
Note: A class is loaded only once.
(2) Class loading process:
Loading: load the bytecode of the class file into memory, convert it into the runtime data structures of the method area, and generate in memory a java.lang.Class object (the class object) that represents the class and serves as the access entry for the class data in the method area.
Verification: ensure that the byte stream of the class file conforms to the JVM specification and does not endanger the security of the JVM itself.
Preparation: formally allocate memory for class variables (static variables) and set them to their default values. The memory for static variables is allocated in the method area.
Resolution: replace the symbolic references in the virtual machine's constant pool with direct references.
For example, for String s = "aaa", the symbolic reference to "aaa" is replaced with the address where "aaa" is actually stored, so that s points to that address.
Initialization: initialize static variables and other resources as the programmer's code specifies. Static variable assignments and static code blocks are executed in this phase.
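
The difference between the default value assigned during preparation and the value assigned during initialization can be made visible with a small experiment. This is just a sketch; InitOrderDemo, Holder, and peek are hypothetical names chosen for the example.

```java
final class InitOrderDemo {
    static final class Holder {
        // Static initializers run in textual order during the initialization phase.
        static int early = peek(); // 'value' is not assigned yet, so this sees the default 0
        static int value = 42;     // assigned afterwards

        static int peek() {
            return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(Holder.early); // 0
        System.out.println(Holder.value); // 42
    }
}
```

The field early captures 0 because preparation has already given value its default, while the assignment of 42 only happens later in initialization.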

4. Class loader

(1) Class loader: it implements the loading stage of the class loading process. It loads the bytecode of the class file into memory, converts it into the runtime data structures of the method area, and generates in memory a java.lang.Class object that represents the class and serves as the access entry for the class data in the method area.
(2) Classification of class loaders
1. Class loaders built into the virtual machine:
Bootstrap class loader: implemented in C++; loads the content of %JAVA_HOME%/jre/lib/rt.jar (JDK 8 and earlier)
Extension class loader: implemented in Java; loads the content of %JAVA_HOME%/jre/lib/ext/*.jar
Application class loader: also called the system class loader; loads all classes on the user classpath. If the application does not define its own class loader, this is the loader used by default.
2. User-defined class loaders
*Users can define their own class loader by extending java.lang.ClassLoader
(3) The parent delegation model of class loaders (Parents Delegation Model)
*Working process: when a class loader receives a class loading request, it does not try to load the class itself first. Instead, it delegates the request to its parent class loader. Every level of class loader does the same, so all loading requests eventually reach the top-level bootstrap class loader. Only when the parent reports that it cannot complete the request (the class is not found within its search scope) does the child loader try to load the class itself.
*Benefit: organizing class loaders with the parent delegation model gives Java classes a priority hierarchy along with their class loaders. For example, the java.lang.Object class in rt.jar is always delegated to the top-level bootstrap class loader no matter which class loader receives the request, so Object is the same class in every class loader environment of the program.

Code example

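public class Test {
    public static void main(String[] args) {
        Object obj = new Object();
        // java.lang.Object is loaded by the bootstrap class loader, so this prints null
        System.out.println(obj.getClass().getClassLoader());

        MyClass mc = new MyClass();
        // Parent of the parent of the application class loader: the bootstrap class loader (null)
        System.out.println(mc.getClass().getClassLoader().getParent().getParent());
        // Parent of the application class loader: the extension class loader
        // (called the platform class loader on JDK 9 and later)
        System.out.println(mc.getClass().getClassLoader().getParent());
        // The application class loader, which loaded MyClass
        System.out.println(mc.getClass().getClassLoader());
    }
}

class MyClass {}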
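
The delegation order itself can also be observed with a user-defined class loader. The sketch below is an illustration only: RecordingLoader and reachesFindClass are made-up names, com.example.DoesNotExist is a deliberately nonexistent class, and the loader merely records which names reach findClass instead of defining any class.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationDemo {
    /** Hypothetical loader that only records which class names reach findClass. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> askedToFind = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Reached only after the whole parent chain failed to load the class.
            askedToFind.add(name);
            // A real loader would read the class bytes and call defineClass(...) here.
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns true if loading the given name fell through to the custom loader. */
    static boolean reachesFindClass(String name) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            loader.loadClass(name);
        } catch (ClassNotFoundException ignored) {
            // Expected for names that no loader in the chain can find.
        }
        return loader.askedToFind.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(reachesFindClass("java.lang.Object"));         // false
        System.out.println(reachesFindClass("com.example.DoesNotExist")); // true
    }
}
```

Overriding findClass rather than loadClass keeps the inherited delegation logic intact, which is the usual way to write a loader that respects the model.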

5. Garbage collection

1. Determining whether an object is dead
(1) Reference counting algorithm: add a reference counter to each object. Whenever something references the object, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used.
* Advantages: the reference counting algorithm is relatively simple to implement, and the check is efficient.
* Disadvantages: it is difficult to handle circular references between objects.
For example, two objects that reference each other but are referenced from nowhere else never have their counters drop to 0. Such objects are still reclaimed in practice, because current mainstream Java virtual machines do not use the reference counting algorithm to manage memory.
(2) Reachability analysis algorithm: starting from a set of objects called "GC Roots", search downward along references. The paths traversed are called reference chains (Reference Chain). When no reference chain connects an object to any GC Root, the object is proven to be unusable.
In the Java language, the objects that can serve as GC Roots include the following:
*Objects referenced in the virtual machine stack (the local variable table in a stack frame, that is, local variables)
*Objects referenced by static fields of classes in the method area
*Objects referenced by constants in the method area
*Objects referenced by native methods (Native methods)
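
To compare the two approaches, here is a toy model rather than anything the JVM actually runs: Node, pointTo, and reachableFrom are hypothetical names, and the "heap" is just a handful of Java objects with a hand-maintained counter. Two nodes reference each other in a cycle while only a third node is a root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ReachabilityDemo {
    /** Toy heap object with outgoing references and a reference counter. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        int refCount = 0;

        Node(String name) {
            this.name = name;
        }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    /** Reachability analysis: names of all nodes reachable from the given roots. */
    static Set<String> reachableFrom(List<Node> roots) {
        Set<String> seen = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (seen.add(node.name)) {
                pending.addAll(node.refs);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.pointTo(b);
        b.pointTo(a); // a and b form a cycle with no references from outside

        // Reference counting never sees the cycle as garbage: both counters stay at 1.
        System.out.println(a.refCount + " " + b.refCount); // 1 1
        // Reachability analysis from the only root (c) does not reach a or b.
        System.out.println(reachableFrom(Collections.singletonList(c))); // [c]
    }
}
```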

2. Classification of references
(1) Background: traditionally, an object has only two states, referenced or not referenced, which cannot describe objects that are of little use to keep but a pity to discard. We want to describe a category of objects that can stay in memory while memory is sufficient, and can be discarded if memory is still tight after garbage collection.
(2) Since JDK 1.2, object references are divided into four levels so that a program can control the life cycle of objects more flexibly:
*Strong reference (Strong Reference)
*Soft reference (Soft Reference)
*Weak reference (Weak Reference)
*Phantom reference (Phantom Reference)
(3) Reference strength from high to low: strong reference > soft reference > weak reference > phantom reference
*Strong reference: the most common kind of reference in programs, such as "Object obj = new Object()". As long as a strong reference exists, the garbage collector never reclaims the referenced object. Even when memory is insufficient, the JVM throws OutOfMemoryError rather than reclaim it.
*Soft reference: describes objects that are useful but not essential. If an object has only soft references, the garbage collector does not reclaim it while memory is sufficient, and the program can keep using it; when memory runs short, it is reclaimed. Soft references can be used to implement memory-sensitive caches. Since JDK 1.2, the SoftReference class implements soft references.
*Weak reference: also describes non-essential objects. The difference from a soft reference is that an object with only weak references has a shorter life cycle: when the garbage collector thread scans the memory under its jurisdiction and finds an object with only weak references, it reclaims the object regardless of whether memory is sufficient. Since JDK 1.2, the WeakReference class implements weak references.
*Phantom reference: as the name suggests, it exists in name only. Unlike the other reference types, a phantom reference does not affect the life cycle of an object. An object that has only phantom references is treated as if it had no references at all and may be reclaimed by the garbage collector at any time, and the object cannot be obtained through a phantom reference. The only purpose of associating a phantom reference with an object is to receive a notification when the object is reclaimed by the garbage collector. Since JDK 1.2, the PhantomReference class implements phantom references.
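
The reference classes mentioned above all live in java.lang.ref. The following sketch only shows the part of their behavior that does not depend on when the garbage collector runs; ReferenceKindsDemo and probe are hypothetical names, and the object is kept strongly reachable on purpose.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Arrays;

final class ReferenceKindsDemo {
    /** Reports what each reference type yields while the target is still strongly reachable. */
    static boolean[] probe() {
        Object target = new Object(); // strong reference keeps the object alive
        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        // A phantom reference must be registered with a queue to deliver its notification.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);

        return new boolean[] {
            soft.get() == target,  // the soft reference still yields the object
            weak.get() == target,  // the weak reference still yields the object
            phantom.get() == null  // a phantom reference never yields its referent
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe())); // [true, true, true]
    }
}
```

What happens after the strong reference is dropped depends on the collector and on memory pressure, so this sketch deliberately makes no claim about it.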

3. Garbage collection in different areas
Method area: in the HotSpot virtual machine this is the permanent generation (JDK 7 and earlier). Garbage collection in the permanent generation reclaims two kinds of content: obsolete constants and useless classes.
*Obsolete constants: suppose the string constant "abc" has entered the constant pool, but no String object in the system references it and nothing else uses the "abc" literal. If a memory reclamation occurs and it is necessary, the "abc" constant is removed from the constant pool.
*Useless classes: a class is considered useless only when all three of the following conditions hold:
*All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap
*The ClassLoader that loaded the class has been reclaimed
*The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed through reflection
Note: a useless class that meets these three conditions can be reclaimed, but is not necessarily reclaimed. Whether classes are reclaimed can be controlled with the -Xnoclassgc parameter; you can also use -XX:+TraceClassLoading to view class loading information.

Heap area: garbage collection is especially productive in the young generation, where a single collection in a typical application can generally reclaim 70% to 95% of the space.
[Figure: heap memory allocation diagram]
Brief description: the young generation is where objects are created, used, and die. It is divided into two parts: the Eden area and the survivor area. All newly created objects (new) are placed in the Eden area. The survivor area is itself divided into two: survivor area 0 and survivor area 1. When the Eden area is used up and the program needs to create another object, the JVM garbage collector runs a young generation collection (Young GC, abbreviated YGC) on the Eden area: objects that are no longer used are destroyed, and the remaining objects are moved to survivor area 0. When area 0 is full, it is collected in turn and its surviving objects are moved to survivor area 1. If area 1 is also full, its objects are moved to the old generation. If the old generation is full as well, the JVM starts a Full GC (abbreviated FGC) to clean up the old generation. If the new object still cannot be stored after the Full GC, an OOM error (OutOfMemoryError) occurs: heap memory overflow.
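
The areas that make up the heap of a running JVM can be listed through the standard java.lang.management API. This is only a sketch: HeapPoolsDemo and heapPoolNames are made-up names, and the pool names that are printed depend on the JVM and on the garbage collector in use, so no particular output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class HeapPoolsDemo {
    /** Collects the names of all memory pools that belong to the heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically shows separate pools for Eden, survivor, and old generation space.
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```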

4. Garbage collection algorithms
(1) Mark-sweep algorithm (Mark-Sweep): the most basic garbage collection algorithm; the other algorithms are improvements on its idea. It has two phases, "mark" and "sweep": first mark all objects that need to be reclaimed, then reclaim all marked objects together once marking is complete.
Disadvantages:
*Neither marking nor sweeping is very efficient.
*After sweeping, a large number of discontinuous memory fragments are left behind, so a large object allocated later may not find enough contiguous space.
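
The fragmentation problem is easy to see on a toy heap. In this sketch, which is an illustration and not how a real collector stores objects, the heap is a string in which each character is one slot: 'L' is a live object, 'D' is a dead object already marked for reclamation, and '.' is free space. MarkSweepDemo, sweep, and largestFreeRun are hypothetical names.

```java
final class MarkSweepDemo {
    /** Sweep phase: every slot marked dead ('D') becomes free ('.'); live slots stay in place. */
    static String sweep(String heap) {
        return heap.replace('D', '.');
    }

    /** Length of the largest contiguous run of free slots. */
    static int largestFreeRun(String heap) {
        int best = 0;
        int run = 0;
        for (char slot : heap.toCharArray()) {
            run = (slot == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String after = sweep("LDLDLDLD");
        System.out.println(after);                 // L.L.L.L.
        System.out.println(largestFreeRun(after)); // 1
    }
}
```

Half of the heap is free after the sweep, yet an object that needs two adjacent slots still cannot be placed.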

(2) Copying algorithm (Copying): divides the available memory into two equal blocks and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass.
Analysis: this algorithm is simple to implement, runs efficiently, and does not easily produce fragmentation, but its biggest problem is that the usable memory is reduced to half of the original. In addition, the more objects survive, the less efficient the copying algorithm becomes.
Application: current commercial virtual machines use the copying algorithm to collect the young generation. About 98% of the objects in the young generation die shortly after they are created, so the memory is divided into one larger Eden space and two smaller Survivor spaces; in the HotSpot virtual machine the default size ratio of Eden to one Survivor space is 8:1. Eden and one of the Survivor spaces are in use at any time. During collection, the surviving objects in Eden and in the Survivor space in use are copied in one pass to the other Survivor space, and then Eden and the previously used Survivor space are cleared.
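
The young generation collection described above can be sketched on plain lists. This is a simplified model under stated assumptions: CopyingDemo, Obj, and minorGc are hypothetical names, liveness is given as a flag instead of being computed, and space limits are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CopyingDemo {
    /** Toy object: a name plus a flag saying whether it is still reachable. */
    static final class Obj {
        final String name;
        final boolean live;

        Obj(String name, boolean live) {
            this.name = name;
            this.live = live;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    /** Copies the live objects of Eden and the used Survivor space into the empty Survivor space. */
    static List<Obj> minorGc(List<Obj> eden, List<Obj> fromSurvivor) {
        List<Obj> toSurvivor = new ArrayList<>();
        for (Obj o : eden) {
            if (o.live) toSurvivor.add(o);
        }
        for (Obj o : fromSurvivor) {
            if (o.live) toSurvivor.add(o);
        }
        // Both source spaces are wiped in one pass; dead objects are never touched individually.
        eden.clear();
        fromSurvivor.clear();
        return toSurvivor;
    }

    public static void main(String[] args) {
        List<Obj> eden = new ArrayList<>(Arrays.asList(new Obj("a", true), new Obj("b", false)));
        List<Obj> from = new ArrayList<>(Arrays.asList(new Obj("c", true), new Obj("d", false)));
        System.out.println(minorGc(eden, from)); // [a, c]
        System.out.println(eden.size() + " " + from.size()); // 0 0
    }
}
```

The cost of the collection is proportional to the number of survivors, which is why the algorithm suits a generation where most objects die young.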

(3) Mark-compact algorithm (Mark-Compact): the mark phase is the same as in the mark-sweep algorithm. Instead of reclaiming the dead objects in place, the next phase moves all surviving objects toward one end of the memory and then directly reclaims everything beyond the end boundary.
Analysis: no memory fragmentation is produced, but moving objects in addition to marking them still reduces efficiency.
Application: objects in the old generation have a relatively high survival rate, so this algorithm is generally used to collect the old generation.
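
Compaction can be sketched on a toy heap where each character is one slot: 'L' is a live object, 'D' is a dead object, and '.' is free space. MarkCompactDemo and compact are hypothetical names, and the sketch ignores the pointer updates a real collector has to perform when it moves objects.

```java
import java.util.Arrays;

final class MarkCompactDemo {
    /** Slides every live slot ('L') to the start of the heap; everything after becomes free ('.'). */
    static String compact(String heap) {
        char[] slots = heap.toCharArray();
        int next = 0; // next position at the compacted end
        for (char slot : slots) {
            if (slot == 'L') {
                slots[next++] = 'L';
            }
        }
        // Everything beyond the boundary is reclaimed in one step.
        Arrays.fill(slots, next, slots.length, '.');
        return new String(slots);
    }

    public static void main(String[] args) {
        System.out.println(compact("LDLDLDLD")); // LLLL....
    }
}
```

All free space ends up in one contiguous block, at the price of touching every surviving object.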

(4) Generational collection algorithm (Generational Collection): garbage collection in current commercial virtual machines uses generational collection. Its core idea is to divide memory into several areas according to how long objects live. The heap is normally divided into the old generation (Tenured Generation) and the young generation (Young Generation). In the old generation, only a small number of objects need to be reclaimed in each collection, while in the young generation a large number of objects need to be reclaimed every time, so the most suitable algorithm can be chosen for each generation. Most JVM garbage collectors use the Copying algorithm for the young generation and the Mark-Compact algorithm for the old generation, because only a few objects are reclaimed there each time.

Origin blog.csdn.net/qq_44962429/article/details/107332429