How much do you know about the JVM (Java Virtual Machine)?

JVM (Java Virtual Machine)

Hello, dear programmers, and welcome to this blog post about the JVM. This article mainly covers the memory model of the Java virtual machine, OOM, the class loading mechanism, and some common garbage collection algorithms and garbage collectors.

First of all, what is the JVM?

For example: create a Java class in IDEA, compile it, and run its main method until the result is printed. The JVM is at work throughout this process, so what exactly is it? The figure below explains:
[Figure: how a compiled Java class is run by the JVM]

Once the Java class has been compiled, running it starts a Java process. The system allocates a block of memory for the process and executes its instructions: a Java virtual machine is created, and a thread is started inside it to execute the main method. The JVM executes the method by translating the class bytecode into the machine code of the system it is running on.
So the JVM is the underlying Java tool that translates class bytecode into machine code that the computer can execute.
What is the relationship between the frequently mentioned JDK, JRE, and JVM? See the figures below.
This is how the relationship appears in the files on disk:
[Figure: JDK directory layout]
So the JDK contains the JRE, and the JRE contains the JVM.
[Figure: the JDK contains the JRE, which contains the JVM]

Java virtual machine runtime data area

Before a program can run, the Java source code must be compiled into bytecode (class files). The JVM first uses a class loader (ClassLoader) to load the bytecode into memory, into the runtime data area (Runtime Data Area). Bytecode is an instruction set defined by the JVM specification and cannot be handed directly to the underlying operating system, so an execution engine (Execution Engine), acting as a command parser, is needed to translate the bytecode into instructions for the underlying system, which are then handed to the CPU for execution. During this process the program may also need to call code written in other languages through the native interface (Native Interface). These are the responsibilities of the four main components. The JVM runtime data area is shown in the figure below, followed by a brief introduction to each part:

[Figure: JVM runtime data area]

  1. Method area :
    The method area stores data loaded by the virtual machine, such as class information, constants, static variables, and code compiled by the just-in-time compiler. The runtime constant pool is part of the method area and stores literals and symbolic references.
    Literals: strings (the string constant pool has been in the heap since JDK 7), final constants, and values of basic data types.
    Symbolic references: fully qualified names of classes and interfaces, names and descriptors of fields, names and descriptors of methods.

  2. Heap area :
    All objects created by the program are stored in the heap. The heap is divided into two areas: the young generation and the old generation. Newly created objects are placed in the young generation; an object that survives multiple GCs (garbage collections) is moved to the old generation.

  3. Virtual machine stack :
    The Java virtual machine stack has the same life cycle as its thread (each thread has its own stack). It describes the memory model of Java method execution: every time a method is called, a stack frame (Stack Frame) is created to store the local variable table, operand stack, dynamic link, method return address, and other information. When people speak of heap memory and stack memory, the stack memory refers to the virtual machine stack.
    The data stored in a stack frame is as follows:

Local variable table : Stores the basic data types (the 8 primitive types) and object references known at compile time. The size of the local variable table is determined at compile time: when a method is entered, the amount of local variable space it needs in the frame is already fixed and does not change during execution. Simply put, it stores method parameters and local variables.
Operand stack : Each method's frame has a first-in, last-out operand stack that holds intermediate values while the bytecode instructions execute.
Dynamic link : A reference to the entry in the runtime constant pool for the method that the frame belongs to.
Method return address : The position in the caller (its program counter value) at which execution resumes after the method returns.

To inspect a stack frame in IDEA, create a swap method as shown in the figure, set a breakpoint inside it, and run the program in debug mode to view the frame's data:
[Figures: the swap method, the breakpoint, and the stack frame data in the IDEA debugger]
After the program finishes, the output shows that the values in main have not changed. This is because m and n in the swap method are local variables whose values were passed in from the main method. Inside swap they hold the swapped result, but that result is never passed back to main, so the values in main differ from those in swap.
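
The original swap code survives only as a screenshot, so the following is a reconstruction under assumptions: the class name SwapDemo, the values 1 and 2, and the array returned from swap (added only so the caller can see what swap held) are all hypothetical.

```java
class SwapDemo {
    // m and n live in swap's own stack frame; they are copies of the caller's values.
    static int[] swap(int m, int n) {
        int tmp = m;
        m = n;
        n = tmp;
        // Returned only so the caller can inspect swap's local variables.
        return new int[] {m, n};
    }

    public static void main(String[] args) {
        int m = 1;
        int n = 2;
        int[] inside = swap(m, n);
        System.out.println("inside swap: m=" + inside[0] + ", n=" + inside[1]);
        // main's own local variable table was never touched by swap.
        System.out.println("in main:     m=" + m + ", n=" + n);
    }
}
```

Each call gets its own frame with its own local variable table, which is why the swap is visible only inside swap.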

  • Native method stack :
    The native method stack is similar to the virtual machine stack, except that the virtual machine stack serves the Java methods (bytecode) executed by the JVM, while the native method stack serves native methods.
  • Program counter :
    The program counter records the position that the current thread has reached. It is a relatively small memory space and can be regarded as a line number indicator for the bytecode being executed by the current thread.
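
As a rough way to see some of these areas on a running JVM, the sketch below lists the memory pools reported through the standard java.lang.management API. The class name MemoryAreasDemo is hypothetical. Pool names depend on the JVM and the collector in use, and thread stacks and program counters are not reported as pools, so treat the output as a partial view only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryAreasDemo {
    // Lists each memory pool the running JVM reports, tagged as heap or non-heap.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```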

Memory exceptions:

heap overflow

Heap overflow is usually referred to as OOM, short for OutOfMemoryError. Why does memory run out? There are two main causes, memory overflow and memory leak:
Memory overflow: The objects in memory really do need to stay alive. A runtime data area needs to create more data but does not have enough space, so memory overflows.
Memory leak: A memory area holds objects and other data that will never be used again but cannot be reclaimed by GC. For example, after a user logs in, the user's data is retained until the user logs out. If a large number of users never log out, that data accumulates and memory leaks.

Virtual machine stack overflow:

The common virtual machine stack exceptions are StackOverflowError and OutOfMemoryError (OOM):
StackOverflowError: Caused by method calls nesting too deeply. A method that keeps calling itself recursively keeps creating stack frames in the thread until this error is thrown.
OOM: If the virtual machine cannot obtain enough memory when it tries to expand the stack, an OutOfMemoryError is thrown.
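
The StackOverflowError case is easy to reproduce. The sketch below (class and method names are hypothetical) recurses without a base case, catches the error, and reports how many frames were created. The depth reached depends on the thread's stack size and the JVM, so no particular number should be expected.

```java
class StackDepthDemo {
    static int depth = 0;

    // No base case: every call pushes another stack frame.
    static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's stack is exhausted, then returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound by the time we get here, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames created before StackOverflowError: " + measureDepth());
    }
}
```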

class loading

1. Timing of class loading

1. When the main method of a Java class is executed, the class must be loaded first.
2. At runtime, when a static method is called or a static variable is accessed.
3. When an object is created with new.
4. When a class is used through reflection, for example to create an instance or to call a static method.

Class loading is executed only once: after it has run, the method area already holds the class information and the heap already holds the Class object. If several threads reach code that requires class loading (any of the timings above), the JVM performs the loading under a synchronized lock to keep it thread-safe.
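
The "only once" behaviour can be observed through a static initializer, which the JVM runs when a class is initialized on first use. This is a minimal sketch: ClassInitDemo, Config, and the counter are hypothetical names, and what the counter observes is initialization, the last step of class loading as described in this article.

```java
class ClassInitDemo {
    static int initCount = 0; // how many times Config's static initializer has run

    static class Config {
        static int timeout = 30;

        static {
            ClassInitDemo.initCount++;
        }

        static int readTimeout() {
            return timeout;
        }
    }

    // Uses Config from several threads and then once more from this thread,
    // and returns how often the static initializer ran.
    static int useFromThreads(int n) {
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> Config.readTimeout()); // static method call
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        new Config(); // instantiation does not run the initializer again
        return initCount;
    }

    public static void main(String[] args) {
        System.out.println("static initializer ran " + useFromThreads(4) + " time(s)");
    }
}
```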

2. Class loading process

The process of class loading is mainly divided into three parts: loading, linking, and initialization.
Loading :
Load the class bytecode into the method area and generate a Class object in the heap.
(Linking) Verification :
Verify that the class bytecode is safe and conforms to the Java virtual machine specification.
(Linking) Preparation :
Static variables are set to their default initial values (null for object references, the corresponding default value for basic data types). Constants (declared final) are set to their real values.
(Linking) Resolution :
The process of replacing the symbolic references in the constant pool with direct references.
Symbolic reference: in the class bytecode, a relationship such as the one between a variable and its value is represented by a "symbolic reference".
Direct reference: once class loading has loaded the class bytecode into memory, the same relationship as expressed in memory is called a direct reference.
Initialization :
Static variables receive their real initial values, and static code blocks are executed.
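
The difference between preparation and initialization can be made visible with a small sketch. Everything here (PreparationDemo and its fields) is hypothetical. It relies on static initializers running in textual order: a static field read before its own initializer has run still holds the default value from preparation, while a compile-time constant already has its real value.

```java
class PreparationDemo {
    // Initialized first, so these run before 'value' is assigned 5 below.
    static int early = peekValue();
    static int earlyConstant = peekConstant();

    static int value = 5;           // assigned during initialization
    static final int CONSTANT = 9;  // compile-time constant, available from the start

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println("early = " + early);                 // prints "early = 0"
        System.out.println("earlyConstant = " + earlyConstant); // prints "earlyConstant = 9"
        System.out.println("value = " + value);                 // prints "value = 5"
    }
}
```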



3. Class loading mechanism:

Parent delegation model (the JDK's default loading mechanism): a class loader does not load a class itself straight away. It first delegates to its parent class loader, which delegates to its own parent, and so on up to the top-level loader. Only if a parent cannot find the class does the request pass back down to the child loader.
In general: the search for a class loader goes from bottom to top, and class loading is attempted from top to bottom.
Advantages: parent delegation ensures that the classes provided by the JDK are loaded first and that a class is not loaded twice. If you create a class with the same name as a JDK class, your local class is not the one that gets loaded, which keeps the core classes safe.
Disadvantages: extensibility is relatively poor. For example, JDBC cannot work without a database driver package, but the JDK cannot identify each vendor's driver package on its own, so parent delegation alone cannot handle every class-loading need.
Solution: use the SPI mechanism. Put the fully qualified name of the class to be loaded in a location the JDK can find (for example, a file under META-INF/services), so that the JDK looks there when it performs class loading.
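
The delegation chain can be inspected with the standard ClassLoader API. In this sketch the class name LoaderChainDemo is hypothetical. The loader class names printed differ between JDK versions, and the top-level (bootstrap) loader is represented as null, so it is added to the list by hand.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the given loader up through its parents.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap loader (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        // Application classes start at the application (system) class loader.
        chain(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        // Core JDK classes such as String are loaded by the bootstrap loader.
        System.out.println("String's loader: " + String.class.getClassLoader());
    }
}
```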

garbage collection

Judgment algorithm for dead objects:

Reference counting algorithm: Add a reference counter to the object. Whenever something refers to it, the counter is incremented by 1; whenever a reference is released, it is decremented by 1. An object whose counter is 0 is "dead".
Reachability analysis algorithm:
Starting from a set of root objects (GC Roots), follow references from object to object; any object that cannot be reached this way is dead. Because this judgment depends on references, Java divides references into four types: Strong Reference, Soft Reference, Weak Reference, and Phantom Reference, in descending order of strength. Each type is introduced below:

  1. Strong references: Strong references are the references commonly found in program code, such as "Object obj = new Object()". As long as a strong reference still exists, the garbage collector will never reclaim the referenced object.
  2. Soft references: Soft references describe objects that are useful but not essential. Before the system is about to run out of memory, objects associated only with soft references are included in a second round of collection. If that collection still does not free enough memory, a memory overflow exception is thrown.
  3. Weak references: Weak references also describe non-essential objects, but they are weaker than soft references. An object associated only with weak references survives only until the next garbage collection. When the garbage collector runs, it reclaims such objects whether or not memory is currently sufficient.
  4. Phantom references: Phantom references, also called ghost references, are the weakest kind of reference. Whether an object has a phantom reference does not affect its lifetime at all, and an object instance cannot be obtained through a phantom reference. The only purpose of setting a phantom reference on an object is to receive a system notification when the object is reclaimed by the collector.
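
The four reference types map onto classes in java.lang.ref. The sketch below (ReferenceDemo is a hypothetical name) shows what can be stated with certainty: soft and weak references still resolve while a strong reference exists, and a phantom reference never returns its referent. Whether a weakly referenced object has been cleared after System.gc() is up to the JVM, since the call is only a request, so that line may print either null or an object.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceDemo {
    // Returns {soft resolves, weak resolves, phantom.get() is null}
    // while a strong reference to the object still exists.
    static boolean[] probe() {
        Object strong = new Object(); // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return new boolean[] {
            soft.get() == strong,
            weak.get() == strong,
            phantom.get() == null
        };
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("soft resolves: " + r[0]);
        System.out.println("weak resolves: " + r[1]);
        System.out.println("phantom.get() is null: " + r[2]);

        // No strong reference is kept to this object.
        WeakReference<Object> weakOnly = new WeakReference<>(new Object());
        System.gc(); // a request, not a guarantee
        System.out.println("weak referent after GC request: " + weakOnly.get());
    }
}
```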

garbage collection algorithm

1. Mark-Sweep Algorithm

It has two stages, "mark" and "sweep": first mark all the objects that need to be reclaimed, then reclaim all the marked objects together once marking is complete.
It has two disadvantages:
1. Low efficiency
2. After sweeping, a large number of discontinuous memory fragments are left behind
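
A toy version of the idea, not how any real JVM is implemented: the "heap" is a list of hypothetical Node objects, and the collector marks everything reachable from the roots and sweeps the rest. Note that this sketch marks the survivors rather than the garbage, which gives the same result and is the usual way to pair marking with reachability analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MarkSweepDemo {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    // Mark phase: flag every object reachable from n.
    static void mark(Node n) {
        if (n == null || n.marked) {
            return;
        }
        n.marked = true;
        for (Node ref : n.refs) {
            mark(ref);
        }
    }

    // Runs one collection and returns the names of the surviving objects.
    static List<String> collect(List<Node> heap, List<Node> roots) {
        for (Node n : heap) {
            n.marked = false;
        }
        for (Node root : roots) {
            mark(root);
        }
        heap.removeIf(n -> !n.marked); // sweep phase
        List<String> survivors = new ArrayList<>();
        for (Node n : heap) {
            survivors.add(n.name);
        }
        return survivors;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b); // a -> b, reachable from the root
        c.refs.add(d); // c and d refer to each other but nothing reaches them
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(heap, Collections.singletonList(a));
    }

    public static void main(String[] args) {
        System.out.println("survivors: " + demo()); // prints "survivors: [a, b]"
    }
}
```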

2. Copying Algorithm

The "copying" algorithm was introduced to solve the efficiency problem of "mark-sweep". It divides the available memory into two spaces of equal size and uses only one of them at a time. When that space fills up, the surviving objects are copied to the other space and the used space is cleaned up in one go.
Advantages: clearing a whole half at once is efficient, and there is no memory fragmentation
Disadvantages: low memory utilization (only 50%)
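
A toy sketch of the copying step, assuming the two halves are modelled as two equally sized arrays in which null means a free or dead slot. The names CopyingDemo and copyLive are hypothetical.

```java
import java.util.Arrays;

class CopyingDemo {
    // Copies live (non-null) objects from 'from' into the empty 'to' space,
    // clears 'from' completely, and returns the number of objects copied.
    static int copyLive(String[] from, String[] to) {
        int next = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null) {
                to[next++] = from[i]; // survivors end up packed together
            }
            from[i] = null; // the whole old space is cleaned in one pass
        }
        return next;
    }

    public static void main(String[] args) {
        String[] from = {"a", null, "b", null};
        String[] to = new String[from.length];
        int copied = copyLive(from, to);
        System.out.println(copied + " copied, to-space: " + Arrays.toString(to));
        System.out.println("from-space: " + Arrays.toString(from));
    }
}
```

After the copy the two spaces swap roles, which is why only half of the memory holds objects at any time.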

3. Mark-Compact Algorithm

Marking works as in the mark-sweep algorithm, but instead of reclaiming objects in place, the surviving objects are moved together into one contiguous space and the remaining space is then cleaned up.
Advantages: no memory fragmentation
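
The compaction step can be sketched on a single array, again with null standing for a dead or free slot and with hypothetical names. Unlike the copying algorithm, no second space is needed: live objects slide towards the start of the same space.

```java
import java.util.Arrays;

class CompactDemo {
    // Slides live (non-null) objects to the front of the heap in place
    // and returns the number of live objects.
    static int compact(String[] heap) {
        int free = 0; // next slot to fill
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String obj = heap[i];
                heap[i] = null;
                heap[free++] = obj;
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, null, "c"};
        int live = compact(heap);
        // prints "3 live: [a, b, c, null, null, null]"
        System.out.println(live + " live: " + Arrays.toString(heap));
    }
}
```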

4. Generational algorithm

In the heap, memory is divided into two areas according to how objects are created and reclaimed:
(1) The young generation
Objects here live and die quickly: they are created and soon become unusable garbage. The young generation is divided into Eden (E area) and 2 Survivor spaces (S areas).
The default ratio is E:S:S = 8:1:1, so the space utilization rate is 90%. By default, the E area and one S area are used to hold objects, and the other S area is left empty. During GC, the surviving objects are copied to the empty S area.
Algorithm adopted: copying algorithm
(2) The old generation
Objects here may survive for a long time.
Algorithms adopted: mark-sweep algorithm, mark-compact algorithm

common garbage collector

1.Serial collector

The Serial collector is the most basic and oldest collector.
It is a single-threaded collector, but "single-threaded" does not only mean that it uses one CPU or one collection thread to do the garbage collection work. More importantly, while it is collecting garbage it must suspend all other worker threads until the collection ends (Stop The World, meaning the whole program is stopped, STW for short).

2. ParNew collector

The ParNew collector is actually a multi-threaded version of the Serial collector.
In addition to using multiple threads for garbage collection, the rest of the behavior, including all control parameters available to the Serial collector, collection algorithms, Stop The World, object allocation rules, recycling strategies, etc., are exactly the same as the Serial collector.

3. Parallel Scavenge collector

The Parallel Scavenge collector is a young-generation collector. It uses the copying algorithm and is a parallel, multi-threaded collector.

4. Serial Old collector

Serial Old is the old-generation version of the Serial collector. It is also a single-threaded collector and uses the mark-compact algorithm.

5. G1 collector (the only collector here that manages the whole heap)

The G1 (Garbage First) garbage collector is intended for large heaps. It divides the heap into many regions of equal size, and each region is assigned a role: E (Eden), S (Survivor), or T (Tenured, the old generation). Regions are then garbage collected in parallel. After G1 reclaims the memory occupied by dead instances, it also compacts memory.
Young generation:
In G1, young-generation collection uses the copying algorithm: live objects in the Eden and Survivor regions are copied to new Survivor regions.
Old generation:
Old-generation collection in G1 is also divided into 4 phases, which are basically the same as in the CMS garbage collector, with slight differences:
Initial mark (Initial Mark) phase: As in the Initial Mark phase of CMS, G1 has to pause the application while it marks the objects directly reachable from the root objects. In G1, however, the Initial Mark phase happens together with a minor GC. That is, G1 does not need a separate pause for Initial Mark as CMS does; when G1 triggers a minor GC, the Initial Mark on the old generation is done at the same time.
Concurrent mark (Concurrent Mark) phase: In this phase G1 does the same thing as CMS, plus one more thing: if it finds during Concurrent Mark that a Tenured region has very few or no surviving objects, G1 reclaims the region in this phase instead of waiting for the later Clean up phase. This is the origin of the name Garbage First. G1 also calculates the object survival rate of each region in this phase, for use in the later Clean up phase.
Final mark (the Remark phase in CMS): In this phase G1 does the same thing as CMS but with a different algorithm. G1 uses an algorithm called SATB (snapshot-at-the-beginning) to mark reachable objects faster in the Remark phase.
Cleanup/copy (Clean up/Copy) phase: G1 has no counterpart to the Sweep phase in CMS. Instead it has a Clean up/Copy phase, in which G1 selects the regions with a low object survival rate for collection. This phase also happens together with a minor GC.
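
To check which of these collectors a given JVM is actually running, the standard java.lang.management API reports the active collectors by name. The class name GcInfoDemo is hypothetical, and the names printed depend on the JVM version and on the collector selected at startup, so no specific output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfoDemo {
    // Returns the names of the collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time (ms)=" + gc.getCollectionTime());
        }
    }
}
```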

That is all the JVM knowledge for today. Thank you for reading!

Origin blog.csdn.net/qq_53699052/article/details/126789345