Java Virtual Machine (JVM) Summary

Java Memory Model (JMM)

Concept

The Java Memory Model (JMM) is defined by the virtual machine specification to give Java programs a consistent memory model across platforms, which is part of what makes "compile once, run anywhere" possible.
It describes the relationships between the variables of a program and the low-level details of storing variables to memory and reading them back in a real computer system.

The JMM stipulates that all variables are stored in main memory. Each thread also has its own working memory, which holds copies of the main-memory variables that the thread uses. A thread can only operate on variables in its own working memory; it cannot access variables in another thread's working memory, and values are passed between threads through main memory.
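
The main-memory and working-memory rules above can be observed with a small sketch. It relies on the `volatile` keyword, which the text above does not discuss; it is used here only as one standard way to make a write by one thread visible to another. The class and field names are made up for illustration.

```java
class VisibilityDemo {
    // volatile: writes are published to main memory and reads always see the latest published value
    private static volatile boolean ready = false;
    // plain field: becomes visible to the reader through the volatile write that follows it
    private static int payload = 0;

    static int publishAndRead() {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the writer's update to 'ready' becomes visible
            }
            seen[0] = payload;
        });
        reader.start();
        payload = 42;  // ordinary write in the writer thread
        ready = true;  // volatile write, also publishes 'payload'
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

Without `volatile` on `ready`, the reader thread would be allowed to keep using a stale copy and might never leave the loop.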


The JVM's runtime memory layout differs between JDK 1.7 and JDK 1.8. The biggest change is that the method area of JDK 1.7 was moved to native memory and renamed the metaspace (the metadata area).

Analysis of each memory area

Thread-private memory area

  • Program counter: a small memory area private to each thread, which can be regarded as a line-number indicator for the bytecode the current thread is executing
  • JVM stack: each time a method is invoked, a stack frame is created to hold the local variable table, operand stack, dynamic linking information, method return address, and so on. A method's invocation and completion correspond to its stack frame being pushed onto and popped off the JVM stack
  • Native method stack: works much like the JVM stack, except that it serves the native methods used by the virtual machine, whereas the JVM stack serves the Java methods executed by the JVM
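
The push-and-pop behaviour of stack frames described above can be made visible with a small sketch that counts the frames on the current thread's stack. The class and method names are illustrative only.

```java
class StackFrameDemo {
    // Recurses 'extra' more levels, then reports how many frames are on the current thread's stack
    static int framesAt(int extra) {
        if (extra == 0) {
            return new Throwable().getStackTrace().length;
        }
        return framesAt(extra - 1); // each call pushes one more frame
    }

    public static void main(String[] args) {
        int base = framesAt(0);
        int deeper = framesAt(5);
        System.out.println("extra frames: " + (deeper - base)); // prints "extra frames: 5"
    }
}
```

Every frame is popped again when its method returns. Recursion that never returns keeps pushing frames until the thread's stack is exhausted and a StackOverflowError is thrown.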

Memory area shared by threads

  • Heap: created when the JVM starts. Almost all object instances and arrays are allocated on the heap. If the heap has no room for a new allocation and cannot be expanded any further, an OutOfMemoryError (OOM) is thrown.
  • Method area/metaspace: stores loaded class information, constants, static variables, and code compiled by the just-in-time compiler.
  • Runtime constant pool: part of the method area/metaspace. Constants generated by the compiler and constants added at run time are placed here, including primitive-type literals and String literals. (Separately, the integer wrapper classes cache boxed values from -128 to 127; the floating-point wrapper classes have no such cache.)
  • String constant pool: stored in the heap; holds string objects or references to string objects.
  • Direct memory: the NIO (New Input/Output) classes added in JDK 1.4 introduced an I/O model based on channels (Channel) and buffers (Buffer). It can use native libraries to allocate memory outside the heap directly and then operate on that memory through a DirectByteBuffer object stored in the Java heap, which acts as a reference to it.
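
A few of the areas listed above can be probed from ordinary code. The following is a minimal sketch with illustrative names; it only checks behaviour guaranteed by the Java language and the `java.nio.ByteBuffer` API.

```java
import java.nio.ByteBuffer;

class SharedAreasDemo {
    // Identical string literals resolve to one object in the string constant pool
    static boolean literalsShareOneObject() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // 'new String' always creates a separate heap object; intern() returns the pooled one
    static boolean newStringIsSeparateObject() {
        String pooled = "jvm";
        String fresh = new String("jvm");
        return fresh != pooled && fresh.intern() == pooled;
    }

    // Autoboxing goes through Integer.valueOf, which caches values from -128 to 127
    static boolean smallIntegersAreCached() {
        Integer x = 127;
        Integer y = 127;
        return x == y;
    }

    // allocateDirect reserves memory outside the Java heap
    static boolean directBufferIsOffHeap() {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // prints true
        System.out.println(newStringIsSeparateObject()); // prints true
        System.out.println(smallIntegersAreCached());    // prints true
        System.out.println(directBufferIsOffHeap());     // prints true
    }
}
```

Comparing boxed values outside the cached range with `==` is not reliable, which is why `equals` should be used for wrapper objects in real code.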

In summary, the heap and the method area/metaspace are data areas shared by all threads.
Each thread also has its own private data areas: the program counter, the JVM stack, and the native method stack.

How a Java program runs:

  1. Write the Java source code; the file extension is .java
  2. The compiler translates the source code into bytecode; the file extension becomes .class
  3. The class loader loads the bytecode file into the JVM, which then executes it

Class loading mechanism

If a Java program actively uses a class that has not yet been loaded into memory, the JVM prepares the class in three steps: loading, linking, and initialization. Together these three stages are called the class loading phase.

The virtual machine loads the data describing a class from the Class file into memory, then verifies, resolves, and initializes that data, producing a Java type that the virtual machine can use directly. This is the Java class loading mechanism

Class life cycle

The life cycle of a class begins with loading, continues with linking, initialization, and use, and ends with unloading

The five steps of class loading

The class loading phase is divided into five steps: loading, verification, preparation, resolution, and initialization

Loading: obtain the binary byte stream of the class by its fully qualified name, and convert the static storage structure of that byte stream into the runtime data structure of the method area. A java.lang.Class object representing the class is generated in memory as the access entry for the class's data in the method area.

Verification: ensures that the information in the Class byte stream meets the requirements of the current virtual machine and does not endanger the security of the virtual machine itself

Preparation: formally allocates memory for class variables (static variables) and sets their default initial values

Resolution: the virtual machine replaces symbolic references in the constant pool with direct references. A symbolic reference is a set of symbols that describes the referenced target; it can be any form of literal as long as it identifies the target unambiguously. A direct reference points directly to the target's memory address

Initialization: in the preparation phase, variables were given the default values required by the system. In the initialization phase, class variables and other resources are assigned the values the programmer specified in the code
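
The difference between the preparation and initialization steps can be observed directly. In this sketch (names are illustrative), a static variable is read before its own initializer has run, so the default value set during preparation is still visible.

```java
class PreparationDemo {
    static class Config {
        // Runs first during initialization; 'limit' still holds its preparation default of 0
        static int snapshot = readLimit();
        // Runs second; only now does 'limit' receive the value written in the source
        static int limit = 7;

        static int readLimit() {
            return limit;
        }
    }

    public static void main(String[] args) {
        System.out.println(Config.snapshot); // prints 0
        System.out.println(Config.limit);    // prints 7
    }
}
```

Static initializers run in textual order, which is why `snapshot` captures 0 rather than 7.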

When a class is loaded and initialized

  1. Creating a new instance of the class
  2. Reading a static variable of the class, or assigning a value to it
  3. Calling a static method of the class
  4. Using the class through reflection
  5. Initializing a subclass of the class (the parent class is initialized first)
  6. The main class specified when the JVM starts, that is, the class that contains the main() method
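
Triggers 2 and 5 from the list above can be seen together in a short sketch. The class names are made up; the static initializer blocks simply record the order in which the classes are initialized.

```java
import java.util.ArrayList;
import java.util.List;

class InitTriggerDemo {
    // Records the order in which static initializers run
    static final List<String> EVENTS = new ArrayList<>();

    static class Parent {
        static { EVENTS.add("Parent"); }
    }

    static class Child extends Parent {
        static int counter = 1;
        static { EVENTS.add("Child"); }
    }

    static List<String> touchChild() {
        int ignored = Child.counter; // reading a static variable is an active use of Child
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(touchChild()); // prints [Parent, Child]
    }
}
```

A class is initialized only once, so calling `touchChild()` again returns the same two entries.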

Class loader

Class loader: the code that obtains a class's binary byte stream from its fully qualified name is called a class loader (it is itself a class). It creates a java.lang.Class instance for every class it loads into memory

Class loaders are divided into the bootstrap class loader, the extension class loader, the application class loader, and custom class loaders
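
The loader hierarchy can be inspected at run time by following `getParent()`. This is a minimal sketch with illustrative names. The exact loader class names it prints depend on the JDK version; on JDK 9 and later the extension class loader appears as the platform class loader.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the loader of 'type' up to the bootstrap loader
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader is part of the JVM and is represented as null
        chain.add("bootstrap (null)");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(chainOf(LoaderChainDemo.class));
        System.out.println(chainOf(String.class)); // prints [bootstrap (null)]
    }
}
```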

Parent delegation mechanism

Working process: when a class loader receives a class loading request, it does not try to load the class itself first. Instead it delegates the request to its parent class loader. Every loader in the hierarchy does the same, so the request is passed all the way up to the top-level loader. Only when the parent loader reports that it cannot complete the request (the class is not found in its search scope) does the child loader try to load the class itself.
Advantages of the parent delegation model: Java classes acquire a priority hierarchy along with their class loaders. This hierarchy avoids loading a class more than once: when a parent class loader has already loaded a class, a child loader that receives a request for that class does not load it again. This also makes Java code more secure. The drawback is that every load request has to travel from the bottom up, which is not very flexible.
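
The delegation order can be observed with a custom loader that only records which names reach its own `findClass`. This is a sketch under stated assumptions: `RecordingLoader` and the class name `demo.missing.Widget` are hypothetical, and the loader relies on the default `ClassLoader.loadClass` behaviour, which asks the parent first.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    static class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        // Called by loadClass only after the parent chain has failed to find the class
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    static List<String> namesThatReachedFindClass(String... classNames) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        for (String name : classNames) {
            try {
                loader.loadClass(name);
            } catch (ClassNotFoundException e) {
                // no loader in the chain could find this class
            }
        }
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        // java.lang.String is served by a parent; only the unknown class reaches findClass
        System.out.println(namesThatReachedFindClass("java.lang.String", "demo.missing.Widget"));
        // prints [demo.missing.Widget]
    }
}
```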

Garbage collection mechanism

After the Java process starts, it creates garbage collection threads that automatically reclaim objects in memory that are no longer in use.

When garbage collection occurs

1. Decided by the JVM's garbage collection mechanism

  • When memory is being allocated for a new object and there is not enough space, GC is triggered automatically
  • Other reclamation mechanisms
    The Object class has a finalize() method. When the JVM detects that an object is no longer pointed to by any reference, the garbage collector calls the object's finalize() method before reclaiming it

2. An explicit call to System.gc()
Calling this method suggests that the JVM perform a Full GC. It is generally not used; garbage collection is left to the virtual machine itself

Judgment strategy for garbage collection

When the virtual machine performs garbage collection, how does it decide whether an object should be reclaimed? There are two common methods

  1. Reference counting:
    Add a reference counter to each object. Each time a reference to the object is created the counter is incremented by 1, and each time a reference is dropped it is decremented by 1. When the counter reaches 0, the object can be reclaimed. This method is simple, but it cannot easily handle objects that reference each other
  2. Reachability analysis:
    Take a set of "GC Roots" objects as starting points and search from them; the paths traversed by the search are called reference chains. When an object has no reference chain connecting it to any GC Root, that is, the object cannot be reached from the GC Roots, it can be reclaimed. This is the strategy Java uses

For example, if Object4 is reachable only through Object3 and the reference between Object2 and Object3 is removed, then neither Object3 nor Object4 can be reached from a GC Root, so both objects can be reclaimed.
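
Reachability analysis is essentially a graph search from the roots. The sketch below models objects as names and references as an adjacency map. The exact shape of the graph is an assumption made for illustration, and a mutual reference between Object3 and Object4 is added to show that cycles do not keep objects alive.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilityDemo {
    // Marks every object that can be reached from the roots by following references
    static Set<String> reachableFrom(Collection<String> roots, Map<String, List<String>> references) {
        Set<String> marked = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            if (marked.add(current)) {
                pending.addAll(references.getOrDefault(current, Collections.emptyList()));
            }
        }
        return marked;
    }

    static Set<String> garbageAfterCut() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("GCRoot", Arrays.asList("Object1"));
        refs.put("Object1", Arrays.asList("Object2"));
        refs.put("Object2", Collections.emptyList());  // the reference to Object3 has been removed
        refs.put("Object3", Arrays.asList("Object4"));
        refs.put("Object4", Arrays.asList("Object3")); // cycle between Object3 and Object4
        Set<String> garbage = new TreeSet<>(refs.keySet());
        garbage.removeAll(reachableFrom(Arrays.asList("GCRoot"), refs));
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println(garbageAfterCut()); // prints [Object3, Object4]
    }
}
```

A reference counter would never reach 0 for Object3 and Object4 here, because they still point at each other; the search from the roots has no such problem.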

Memory space that needs to be garbage collected

1. Method area/metaspace

  • In JDK 1.7, the method area is usually called the permanent generation in GC terms
  • In JDK 1.8, the metaspace lives in native memory, and GC also collects the metaspace
  • Garbage collection of the permanent generation or metaspace mainly reclaims two things: obsolete constants and classes that are no longer used

2. Heap

  • The heap is the main area managed by Java garbage collection, so it is also called the "GC heap"
  • From the collector's point of view, under generational collection the heap is further divided:
    1. New generation: divided into the Eden area, the From Survivor area, and the To Survivor area.
    Most Java objects die young, and most objects are allocated in the new generation, so new-generation collection (Minor GC) is very frequent and also fast.
    2. Old generation: an old-generation collection (Major GC) is usually accompanied by at least one Minor GC. Old-generation collection is less frequent than new-generation collection, and a Major GC is generally more than 10 times slower than a Minor GC.
    3. Full GC: the meaning depends on context. It sometimes refers to garbage collection of the old generation, sometimes to garbage collection of the whole heap (new generation + old generation), and sometimes to any collection that suspends user threads (Stop-The-World), as in GC logs.

Garbage Collection Algorithms

1. Mark-sweep algorithm

  • The most basic collection algorithm, used in the old generation; the other algorithms are refinements of it
  • The algorithm has two phases, marking and sweeping. The space is scanned once as a whole and the objects to be reclaimed are marked; then all marked objects are reclaimed together.
  • Drawbacks: neither marking nor sweeping is very efficient, and sweeping leaves many small non-contiguous free blocks. A later allocation of a large object may not find a large enough contiguous block, which lowers space utilization
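
The fragmentation problem of mark-sweep can be shown on a toy heap. This is a simplified model, not how a real collector stores objects: the heap is an array of slots, and the sketch marks the surviving objects and sweeps everything unmarked, which is equivalent to marking the garbage.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    // Sweep phase: free every slot whose object was not marked as live
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && !marked.contains(after[i])) {
                after[i] = null;
            }
        }
        return after;
    }

    // Length of the longest run of adjacent free slots
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        Set<String> marked = new HashSet<>(Arrays.asList("A", "C", "E"));
        String[] after = sweep(heap, marked);
        System.out.println(Arrays.toString(after)); // prints [A, null, C, null, E, null]
        System.out.println(largestFreeRun(after));  // prints 1
    }
}
```

Three slots are free after the sweep, yet an object that needs two adjacent slots cannot be placed.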

2. Mark-compact algorithm

  • Old-generation collection algorithm
  • The marking phase is the same as in mark-sweep, but the second phase is not a simple sweep. It first moves all surviving objects toward one end and then reclaims everything beyond the boundary
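
The compaction step can be sketched on the same kind of toy heap: an array of slots in which surviving objects slide toward the start. The names and the array model are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides marked objects toward index 0 and frees everything past the boundary.
    // Returns the boundary, which is the index of the first free slot.
    static int compact(String[] heap, Set<String> marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && marked.contains(obj)) {
                heap[free++] = obj;
            }
        }
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        int boundary = compact(heap, new HashSet<>(Arrays.asList("A", "C", "E")));
        System.out.println(Arrays.toString(heap)); // prints [A, C, E, null, null, null]
        System.out.println(boundary);              // prints 3
    }
}
```

The free space is now one contiguous block, at the cost of moving the survivors.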

3. Copying algorithm

  • New-generation collection algorithm
  • Divide the available memory into two halves and allocate from only one of them at a time. When that half is used up, copy all surviving objects to the other, unused half, and then clear the used half in one go
  • Advantages: simple to implement and efficient to run. Disadvantage: only half of the space is usable, which is a high price
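
A minimal model of the copying algorithm uses two equally sized arrays as the two halves. The names are illustrative, and the set of live objects is assumed to be known already.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies live objects from the used half into the empty half, then clears the used half.
    // Returns the number of objects copied.
    static int copySurvivors(String[] from, String[] to, Set<String> live) {
        int next = 0;
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[next++] = obj;
            }
        }
        Arrays.fill(from, null); // the whole used half is reclaimed at once
        return next;
    }

    public static void main(String[] args) {
        String[] from = {"A", "B", "C", "D"};
        String[] to = new String[4];
        int copied = copySurvivors(from, to, new HashSet<>(Arrays.asList("B", "D")));
        System.out.println(copied);                // prints 2
        System.out.println(Arrays.toString(to));   // prints [B, D, null, null]
        System.out.println(Arrays.toString(from)); // prints [null, null, null, null]
    }
}
```

The cost is proportional to the number of survivors, which is why the approach suits areas where few objects survive.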

Generational Collection Algorithm

  • Current JVMs use generational collection. It introduces no new idea; it combines the algorithms above and divides memory into several parts according to object lifetimes.
  • The Java heap is first divided into the new generation and the old generation
  • Since about 98% of the objects in the new generation die young, there is no need to split the space 1:1 as the plain copying algorithm would. Instead the new generation is divided into one large Eden area and two small Survivor areas, with a default ratio of 8:1:1. Eden and one Survivor area are in use at any time, so the usable space of the new generation is 90%.
  • The new generation is collected frequently and each collection frees a large amount of space, so the copying algorithm is used there. Objects in the old generation have a high survival rate and there is no extra space to guarantee their allocation, so the old generation uses the mark-sweep or mark-compact algorithm
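
The 8:1:1 split and the 90% figure follow from simple arithmetic, sketched below. The method names are illustrative, and the parameter `survivorRatio` is modeled on the meaning of HotSpot's `-XX:SurvivorRatio` option (Eden size divided by the size of one Survivor area).

```java
class SurvivorRatioDemo {
    // Splits a new generation into Eden and two Survivor areas with eden:survivor = survivorRatio:1
    static long[] split(long newGenSize, int survivorRatio) {
        long survivor = newGenSize / (survivorRatio + 2);
        long eden = newGenSize - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    // Eden plus one Survivor area is usable at any time; the other Survivor area stays empty
    static double usableFraction(long[] areas) {
        return (double) (areas[0] + areas[1]) / (areas[0] + areas[1] + areas[2]);
    }

    public static void main(String[] args) {
        long[] areas = split(100, 8); // for example, 100 MB of new generation
        System.out.println(areas[0] + " " + areas[1] + " " + areas[2]); // prints 80 10 10
        System.out.println(usableFraction(areas));                      // prints 0.9
    }
}
```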

Garbage collection process

  1. When the Eden area runs out of space, a Minor GC is triggered. The surviving objects in the Eden area and in one Survivor area (From) are copied to the other Survivor area (To), and then the Eden area and the From Survivor area are cleared
  2. After the cleanup, newly created objects are again placed in the Eden area until it runs out of space, and the Minor GC described above is repeated
  3. An object that is still alive after surviving many collections is moved to the old generation
  4. When the Survivor area does not have enough space for the copied surviving objects, those objects go to the old generation
  5. A Major GC is triggered when the old generation runs out of space
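
Steps 1 to 3 above can be simulated with a few maps from object name to age. This is a toy model with made-up names: which objects are alive is passed in by hand, the promotion threshold is set to 3 to keep the run short, and Survivor overflow (step 4) is not modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class MinorGcDemo {
    final Map<String, Integer> eden = new LinkedHashMap<>();
    Map<String, Integer> from = new LinkedHashMap<>();
    Map<String, Integer> to = new LinkedHashMap<>();
    final Map<String, Integer> old = new LinkedHashMap<>();
    final int threshold;

    MinorGcDemo(int threshold) {
        this.threshold = threshold;
    }

    void allocate(String name) {
        eden.put(name, 0); // new objects start in Eden
    }

    void minorGc(Set<String> live) {
        Map<String, Integer> scanned = new LinkedHashMap<>(eden);
        scanned.putAll(from);
        for (Map.Entry<String, Integer> entry : scanned.entrySet()) {
            if (!live.contains(entry.getKey())) {
                continue; // dead objects are simply not copied
            }
            int age = entry.getValue() + 1;
            if (age >= threshold) {
                old.put(entry.getKey(), age); // promoted to the old generation
            } else {
                to.put(entry.getKey(), age);  // copied to the To Survivor area
            }
        }
        eden.clear();
        from.clear();
        Map<String, Integer> swap = from; // From and To exchange roles
        from = to;
        to = swap;
    }

    static String run() {
        MinorGcDemo heap = new MinorGcDemo(3);
        heap.allocate("a");
        heap.allocate("b");
        heap.minorGc(new HashSet<>(Arrays.asList("a")));      // b dies, a survives
        heap.allocate("c");
        heap.minorGc(new HashSet<>(Arrays.asList("a", "c")));
        heap.minorGc(new HashSet<>(Arrays.asList("a", "c"))); // a reaches the threshold
        return "old=" + heap.old + " survivor=" + heap.from;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints old={a=3} survivor={c=2}
    }
}
```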

Memory allocation and recycling strategy

  1. Most new objects are first allocated in the Eden area. When the Eden area runs out of memory, a Minor GC is triggered
  2. Large objects go directly to the old generation. A large object here is a Java object that needs a large amount of contiguous memory; the most typical examples are very long strings and large arrays.
  3. Long-lived objects in the new generation move to the old generation. Most objects in the Eden area die young, which is what we expect. But if a long-lived object stayed in the new generation, every Minor GC would have to copy it again, which reduces efficiency. To avoid this, the virtual machine keeps an age counter for each object. An object that survives its first Minor GC and is moved to a Survivor area has age 1, and its age increases by 1 for every further Minor GC it survives. When the age reaches a threshold (15 by default), the object is promoted to the old generation
  4. Dynamic age determination. An object in the new generation does not always have to reach the age threshold to be promoted. When the objects of one age in the Survivor space add up to more than half of the Survivor space, objects of that age or older go directly to the old generation.
  5. Space allocation guarantee. During a Minor GC there is no guarantee that the surviving objects fit into the To Survivor area. When they do not fit, the space allocation guarantee comes into play and the virtual machine tries to place these surviving objects directly in the old generation.
    Before a Minor GC, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the new generation.
    If it is, the Minor GC is safe. If it is not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail (whether it is true).
    If it is true, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If so, it attempts a Minor GC, although this is still risky. If it is smaller, or if HandlePromotionFailure == false, a Full GC is performed instead
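
The checks in rule 5 form a small decision procedure. The sketch below models only the rules as stated above; the method, parameter, and enum names are hypothetical, and it is not taken from the virtual machine's source.

```java
class AllocationGuaranteeDemo {
    enum Decision { MINOR_GC, RISKY_MINOR_GC, FULL_GC }

    static Decision beforeMinorGc(long oldMaxContiguousFree,
                                  long newGenObjectsTotal,
                                  boolean handlePromotionFailure,
                                  long averagePromotedSize) {
        // The old generation can absorb everything: the Minor GC is safe
        if (oldMaxContiguousFree > newGenObjectsTotal) {
            return Decision.MINOR_GC;
        }
        // Guarantee allowed and past promotions would fit: try a Minor GC, accepting the risk
        if (handlePromotionFailure && oldMaxContiguousFree > averagePromotedSize) {
            return Decision.RISKY_MINOR_GC;
        }
        // Otherwise collect the whole heap first
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(beforeMinorGc(100, 60, false, 10)); // prints MINOR_GC
        System.out.println(beforeMinorGc(50, 60, true, 10));   // prints RISKY_MINOR_GC
        System.out.println(beforeMinorGc(50, 60, false, 10));  // prints FULL_GC
        System.out.println(beforeMinorGc(5, 60, true, 10));    // prints FULL_GC
    }
}
```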

The impact of garbage collection

  1. User threads are suspended
    Garbage collection runs in the garbage collection threads. In many cases the whole collection, or certain steps of it, requires suspending user threads, which is known as Stop-The-World (STW). To the user, the program appears to freeze during this time

In the context of garbage collectors, parallel and concurrent mean different things
Parallel: multiple garbage collection threads run at the same time while user threads wait
Concurrent: garbage collection threads run at the same time as user threads, for example with the collector running on another CPU

  2. Metrics for evaluating a garbage collector: throughput and user experience
    Throughput: the ratio of the CPU time spent running user code to the total CPU time consumed, that is,
    throughput = time running user code / (time running user code + garbage collection time)
    Pause time: the length of a single pause and of the total pauses that GC imposes on user threads.
    Throughput priority: the total pause time of user threads is short, even if a single pause is longer.
    User experience priority: each single pause of user threads is short, even if the total pause time is longer.
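
As a worked instance of the throughput formula above, suppose a program runs for 100 minutes in total, of which 1 minute is spent in garbage collection. The numbers and the class name are illustrative.

```java
class ThroughputDemo {
    // throughput = time running user code / (time running user code + garbage collection time)
    static double throughput(double userCodeTime, double gcTime) {
        return userCodeTime / (userCodeTime + gcTime);
    }

    public static void main(String[] args) {
        System.out.println(throughput(99, 1)); // prints 0.99
    }
}
```

A throughput of 0.99 says nothing about how the 1 minute of pauses is distributed, which is why pause time is tracked as a separate metric.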

Garbage collectors


Serial collector (new generation collector)

  • single thread
  • copy algorithm
  • Stop The World(STW)
  • Application scenario: the default new generation collector in Client mode
  • Advantages: For a single CPU, the Serial collector has no thread interaction overhead, can concentrate on garbage collection, and can obtain the highest single-thread collection efficiency

ParNew (new generation collector, parallel GC)

  • Multithreading
  • copy algorithm
  • Stop The World
  • Application scenario: used with CMS collectors in programs that give priority to user experience
  • Advantages: as the number of available CPUs increases, it makes much better use of CPU resources

Parallel Scavenge (new generation collector, parallel GC)
multi-threaded version of the serial collector

  • Multithreading
  • copy algorithm
  • Controllable throughput
  • Adaptive adjustment strategy: the virtual machine collects performance monitoring information about the running system and dynamically adjusts its collection parameters to provide the most suitable pause time or the maximum throughput
  • Application scenario: "throughput priority" collector, suitable for task-based programs with high throughput requirements

Serial Old collector (old generation, serial GC)

  • single thread
  • Mark-compact algorithm
  • Application scenario: used together with the Parallel Scavenge collector, or as the fallback for CMS when a Concurrent Mode Failure occurs during concurrent collection

Parallel Old collector (old generation collector, parallel GC)

  • Multithreading
  • Mark-compact algorithm
  • Application scenario: throughput priority

CMS collector (old generation collector, concurrent GC)

  • Concurrent collection, low pause
  • User experience first
  • The whole process is divided into four steps:
    Initial mark: marks only the objects that GC Roots reference directly. It is very fast and requires STW
    Concurrent mark: the GC Roots tracing process, which searches downward along the reference chains from the objects directly associated with GC Roots. It runs concurrently with user threads
    Remark: corrects the marks that changed during concurrent marking because user threads kept running, for example when objects that were not associated with GC Roots became associated again. This stage requires STW; its pause is a little longer than the initial mark but much shorter than the concurrent mark
    Concurrent sweep: reclaims the garbage objects. It runs concurrently with user threads and does not need STW, so taken as a whole the memory reclamation of the CMS collector runs concurrently with user threads
  • Drawbacks: CMS competes with the application for CPU resources, it cannot handle floating garbage, and because it uses the mark-sweep algorithm it leaves a lot of memory fragmentation

G1 collector (garbage collector for the whole heap)

  • When the heap is large, G1 divides it into many regions and then collects them in parallel
  • G1 tries to keep STW pauses short. It collects the regions that contain the most garbage first (overall it behaves like a mark-compact algorithm; locally, between two regions, it is a copying algorithm).
  • User experience first
  • A region may belong to the Eden, Survivor, or Tenured memory area
  • After reclaiming the space occupied by garbage, the G1 collector also compacts memory

In diagrams of the G1 heap, E marks a region that belongs to the Eden memory area, S one that belongs to the Survivor memory area, and T one that belongs to the Tenured memory area. Blank regions are unused memory. The G1 garbage collector also adds a memory area called Humongous, marked H. It is mainly used to store large objects, that is, objects whose size exceeds 50% of the size of a region.

Young generation garbage collection
In the G1 garbage collector, young generation collection uses the copying algorithm to move the surviving objects from the Eden and Survivor regions into new Survivor regions.

Old generation garbage collection
For old generation collection, G1 works in four phases, basically the same as the CMS garbage collector

  • Initial marking phase: as in CMS, marks the objects directly reachable from the root objects. The difference is that G1 does not need a separate STW pause for this; the marking is done together with a Minor GC
  • Concurrent marking phase: the same as in CMS. In addition, if G1 finds Tenured regions whose object survival rate is very low or in which basically no objects survive, it reclaims them in this phase instead of waiting for the cleanup phase. G1 also calculates the object survival rate of each region to make later cleanup easier
  • Final marking phase: re-marks all reachable objects
  • Screening and reclamation phase: G1 picks the regions with a low object survival rate and reclaims them; this happens together with a Minor GC
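
To check which of the collectors described above a running JVM is actually using, the standard `java.lang.management` API can be queried. This is a minimal sketch with an illustrative class name. The collector names it prints depend on the JVM version and on startup options such as `-XX:+UseSerialGC` or `-XX:+UseG1GC`, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectorInfoDemo {
    // Names of the collectors registered in this JVM
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName()
                    + " collections=" + bean.getCollectionCount()
                    + " timeMs=" + bean.getCollectionTime()
                    + " pools=" + Arrays.toString(bean.getMemoryPoolNames()));
        }
    }
}
```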

When garbage collection is triggered

Minor GC:
Triggered when an object is to be created in the Eden area and the Eden area does not have enough space
Major GC:
Triggered when objects need to be placed in the old generation but the old generation does not have enough space. Objects enter the old generation in the following cases:

  • Promotion from the new generation to the old generation (when an object's age reaches the threshold, or when the objects of one age occupy more than half of the Survivor area)
  • Large objects (which need a lot of contiguous space) go directly to the old generation
  • The space allocation guarantee of Minor GC. Before a Minor GC, if the largest contiguous free space in the old generation is smaller than the total size of all objects in the new generation, a Major GC is triggered in any of the following situations:
    1. The space guarantee is not allowed (HandlePromotionFailure == false)
    2. The space guarantee is allowed, but the largest contiguous free space in the old generation is smaller than the average size of the objects promoted to the old generation
    3. The space guarantee is allowed, but the old generation still does not have enough space after the Minor GC runs

The above is a summary of the key points of the Java Virtual Machine (JVM). As my study goes deeper, the content will be supplemented and revised. I would be honored if it helps fellow bloggers, and corrections are welcome.


Origin blog.csdn.net/m0_46233999/article/details/118216648