Summary of Java Interview Questions 3: JVM Module (continuously updated)

JVM

JVM memory composition model

The JVM is the Java virtual machine. It consists of four parts: the ClassLoader (class loader), the Runtime Data Area (runtime memory partitions), the Execution Engine, and the Native Interface (native library interface).

Java's memory model is the JMM. The JMM masks the memory-access differences of various hardware and operating systems, so that Java programs achieve consistent concurrency behavior across platforms.


Before a program is executed, the Java source code must be compiled into bytecode (class files). The JVM first loads the bytecode into the Runtime Data Area through the Class Loader. Bytecode is an instruction-set specification of the JVM and cannot be handed directly to the underlying operating system for execution, so a dedicated instruction parser, the Execution Engine, is needed to translate the bytecode into native system instructions that the CPU then executes. In this process the JVM may also need to call interfaces of other languages through the Native Interface to realize the functions of the whole program. These are the responsibilities of the four main components.

Shared variables between threads are stored in main memory. Each thread has a private local (working) memory, which holds copies of the main-memory variables the thread uses. All thread operations on variables are performed in local memory; threads cannot directly read or write variables in main memory.

Main memory: mainly stores Java instance objects. Instance objects created by all threads are stored in main memory, regardless of whether the instance object is assigned to a member variable or to a local variable in a method. It also includes shared class information, constants, and static variables. Because this is a shared data area, thread-safety issues can arise when multiple threads access the same variable.

Working memory: mainly stores the local variable information of the current method (the working memory holds copies of variables in main memory). Each thread can only access its own working memory, so local variables in one thread are invisible to other threads; even if two threads execute the same piece of code, each creates its own local variables in its own working memory. Working memory also includes the bytecode line-number indicator and related Native method information. Since working memory is private to each thread and threads cannot access each other's working memory, the data stored there has no thread-safety issues.

What is defined by Java's memory model?

The entire Java memory model is actually built around three characteristics. They are: atomicity, visibility, and orderliness . These three characteristics can be said to be the foundation of the entire Java concurrency.

1 Atomicity: an operation is uninterruptible; it either executes completely or fails completely.
2 Visibility: when one thread modifies a shared variable, other threads can immediately see the latest value.
3 Ordering: the program appears to execute in the order of the code.
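Atomicity is the easiest of the three to demonstrate. The sketch below (class and field names are my own, for illustration) shows that a plain `i++` is a read-modify-write that can lose updates under contention, while `AtomicInteger` performs the same increment as one indivisible operation:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicityDemo {
    static int plain = 0;                            // plain ++ is NOT atomic
    static final AtomicInteger atomic = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    plain++;                         // read-modify-write: interleavings lose updates
                    atomic.incrementAndGet();        // one indivisible operation
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println("atomic = " + atomic.get()); // always 80000
        System.out.println("plain  = " + plain);        // often less than 80000
    }
}
```

Running this repeatedly, the atomic counter always reaches 80000, while the plain counter is usually smaller: atomicity was violated.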

Java memory distribution

During the execution of a Java program, the Java virtual machine divides the memory it manages into several different data areas. These areas have their own purposes, as well as the time of creation and destruction. Some areas always exist with the start of the virtual machine process, and some areas are created and destroyed depending on the start and end of the user thread. According to the "Java Virtual Machine Specification", the memory managed by the Java virtual machine will include the following runtime data areas.


  1. Program counter

    The Program Counter Register is a small memory space that can be regarded as the line-number indicator of the bytecode being executed by the current thread. In the conceptual model of the Java virtual machine, the bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. It is the indicator of program control flow: branches, loops, jumps, exception handling, thread recovery and other basic functions all rely on this counter.

    Since multi-threading in the Java virtual machine is implemented by switching between threads and allocating processor time, at any given moment a processor (a single core, for a multi-core processor) executes instructions from only one thread. Therefore, to return to the correct execution position after a thread switch, each thread needs its own independent program counter. The counters of different threads do not affect one another and are stored independently; we call this kind of memory area "thread-private".

    If the thread is executing a Java method, this counter records the address of the virtual machine bytecode instruction being executed; if the thread is executing a native (Native) method, the counter value should be empty (Undefined). This memory area is the only area that does not specify any OutOfMemoryError conditions in the "Java Virtual Machine Specification".

  2. Java virtual machine stack

    Like the program counter, the Java Virtual Machine Stack is thread-private, and its life cycle is the same as that of the thread. The virtual machine stack describes the memory model of Java method execution: as each method is executed, the Java virtual machine synchronously creates a stack frame (Stack Frame) to store the local variable table, operand stack, dynamic linking, method exit information and so on. The process from a method being called until its execution completes corresponds to its stack frame being pushed onto and popped off the virtual machine stack.

    The "Java Virtual Machine Specification" specifies two exceptions for this memory area: if the stack depth requested by the thread exceeds the depth allowed by the virtual machine, a StackOverflowError is thrown; if the Java virtual machine stack capacity can be dynamically expanded but the expansion cannot obtain enough memory, an OutOfMemoryError is thrown.
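The StackOverflowError case is easy to trigger: each method call pushes a new frame, so unbounded recursion eventually exceeds the allowed depth. A minimal sketch (the depth counter is my own addition; the actual depth reached depends on the frame size and the thread-stack size, tunable with -Xss):

```java
public class StackDepthDemo {
    static int depth = 0;

    // Each recursive call pushes a new stack frame onto the virtual machine
    // stack; with no base case, the allowed depth is eventually exceeded.
    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```

Catching StackOverflowError is only done here for demonstration; in real code it usually indicates a bug to fix, not an error to handle.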

  3. Native method stack

    Native Method Stacks play a role very similar to the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods (that is, bytecode), while the native method stack serves the Native methods used by the virtual machine.

    The "Java Virtual Machine Specification" imposes no mandatory requirements on the language, usage or data structure of methods in the native method stack, so specific virtual machines are free to implement it as needed; some Java virtual machines (such as the HotSpot virtual machine) even merge the native method stack and the virtual machine stack into one. Like the virtual machine stack, the native method stack throws StackOverflowError when the stack depth overflows and OutOfMemoryError when stack expansion fails.

  4. Java heap

    For Java applications, the Java Heap is the largest piece of memory managed by the virtual machine. It is shared by all threads and created when the virtual machine starts. Its only purpose is to store object instances, and "almost" all object instances in the Java world are allocated here. The "Java Virtual Machine Specification" describes the Java heap as follows: "All object instances and arrays should be allocated on the heap." The "almost" refers to the implementation point of view: as the Java language develops there are signs that value-type support may appear, and even today, thanks to advances in just-in-time compilation and especially increasingly powerful escape analysis, optimizations such as stack allocation and scalar replacement have quietly changed things, so it is no longer absolute that every Java object instance is allocated on the heap.

    According to the "Java Virtual Machine Specification", the Java heap may occupy physically discontinuous memory as long as it is logically continuous, just as we use disk space to store files without requiring each file to be stored contiguously. However, for large objects (typically arrays), most virtual machine implementations will likely require contiguous memory for simplicity of implementation and efficiency of storage.

    The Java heap can be implemented as either a fixed size or scalable, but the current mainstream Java virtual machines are all implemented as scalable (set through the parameters -Xmx and -Xms). If there is no memory in the Java heap to complete the instance allocation, and the heap cannot be expanded, the Java virtual machine will throw an OutOfMemoryError exception.
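The current heap limits are visible at runtime through the standard `Runtime` API; a small sketch (the MB formatting is my own) that relates the numbers to the -Xmx/-Xms flags mentioned above:

```java
public class HeapSizeDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // maxMemory() reflects the -Xmx ceiling; totalMemory() is the currently
        // committed heap, which starts near -Xms and can grow up to the maximum.
        System.out.println("max heap   : " + rt.maxMemory() / mb + " MB");
        System.out.println("total heap : " + rt.totalMemory() / mb + " MB");
        System.out.println("free heap  : " + rt.freeMemory() / mb + " MB");
    }
}
```

Running with, say, `java -Xms64m -Xmx256m HeapSizeDemo` should show the total heap near 64 MB and the maximum near 256 MB.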

  5. Method area

    The Method Area, like the Java heap, is a memory area shared by all threads. It stores data such as type information loaded by the virtual machine, constants, static variables, and the cache of code produced by the just-in-time compiler. Although the "Java Virtual Machine Specification" describes the method area as a logical part of the heap, it has the alias "Non-Heap" to distinguish it from the Java heap.

    According to the "Java Virtual Machine Specification", if the method area cannot meet the new memory allocation requirements, an OutOfMemoryError exception will be thrown.

  6. Runtime constant pool

    The Runtime Constant Pool is part of the method area. In addition to descriptive information about the class version, fields, methods, interfaces and so on, a Class file contains a constant pool table (Constant Pool Table), which stores the various literals and symbolic references generated at compile time. This content is stored in the runtime constant pool of the method area after the class is loaded.

    Since the runtime constant pool is part of the method area, it is naturally limited by the memory of the method area. When the constant pool can no longer apply for memory, an OutOfMemoryError exception will be thrown.

  7. Direct memory

    Direct memory (Direct Memory) is not part of the virtual machine runtime data area, nor is it a memory area defined in the "Java Virtual Machine Specification". However, this part of memory is also frequently used, and may also cause OutOfMemoryError exceptions.

    Obviously, the allocation of direct memory is not limited by the size of the Java heap. However, since it is still memory, it is bounded by the total native memory (including physical memory, the SWAP partition or paging file) and by the processor's address space. When server administrators configure virtual machine parameters, they usually set -Xmx and similar parameters based on the actual memory but often overlook direct memory, so that the sum of all memory areas exceeds the physical memory limit (including physical and operating-system-level limits), resulting in an OutOfMemoryError during dynamic expansion.

What is the program counter?

The program counter records the address of the next JVM instruction to execute, so that the bytecode position of each thread is tracked accurately.

If there is no program counter, the flow control in the Java program will not be correctly controlled, and multi-threads will not be able to rotate correctly.

What are stored in the heap, stack, and method area?

Heap: newly created instance objects, the main place of GC

Stack: local variable table, operand stack, dynamic linking, method exit information

Method area: class information, constants, static variables, etc.

Before JDK1.7, the runtime constant pool contained a string constant pool, which was stored in the method area.

JDK1.7 string constant pool is placed in the heap

JDK1.8 method area changed from permanent generation to metaspace
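The string constant pool can be observed with `String.intern()`. The sketch below (class name is my own) shows the difference between a pooled literal and a heap object built with `new`; the identity results are the behavior observed on HotSpot:

```java
public class InternDemo {
    public static void main(String[] args) {
        String literal = "jvm";             // placed in the string constant pool
        String built = new String("jvm");   // a distinct object on the heap

        System.out.println(literal == built);           // false: two different objects
        System.out.println(literal == built.intern());  // true: intern() returns the pooled instance
    }
}
```

Because `intern()` returns the canonical pooled instance, it compares identical (`==`) to the literal, while the `new String(...)` object does not.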

The difference between heap and stack

  • The Java stack is associated with each thread. When the JVM creates a thread, it allocates a certain amount of stack space to it, which mainly stores local variables during thread execution, method return values, primitive-type variables (int, short, long, byte, float, double, boolean, char) and the context of method calls. The stack space is released when the thread terminates. The advantage of the stack is that access is faster than the heap; the disadvantage is that the size and lifetime of the data stored in the stack must be determined in advance, so it lacks flexibility.
  • The heap in Java is a memory area shared by all threads, used to store Java objects such as arrays and thread objects. The heap is a runtime data area from which class instances are allocated. Objects are created through instructions such as new, newarray, anewarray and multianewarray, and do not need to be released explicitly by program code; the heap is managed by garbage collection. The advantage of the heap is that memory can be allocated dynamically at runtime and the lifetime does not need to be told to the compiler in advance: Java's garbage collector automatically reclaims data that is no longer used. The disadvantage is that, because memory must be allocated dynamically at runtime, access is slower than the stack.
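The division can be made concrete with a small sketch (names are my own): a primitive local lives in the caller's stack frame and is passed by value, while an array object lives on the heap and is shared through references.

```java
public class StackHeapDemo {
    public static void main(String[] args) {
        int n = 10;               // primitive local: its value lives in this stack frame
        int[] arr = new int[1];   // 'arr' (the reference) is on the stack; the array object is on the heap

        modify(n, arr);

        System.out.println(n);      // 10: the callee only changed its own stack copy
        System.out.println(arr[0]); // 99: both references pointed at the same heap object
    }

    static void modify(int x, int[] a) {
        x = 99;     // mutates only the callee's stack-frame copy
        a[0] = 99;  // mutates the shared heap object
    }
}
```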

class loading

From the time a type is loaded into virtual machine memory to the time it is unloaded, its life cycle goes through seven stages: loading, verification, preparation, resolution, initialization, using (Using) and unloading (Unloading). The three stages of verification, preparation and resolution are collectively called linking (Linking).


Of these seven stages, the class loading process proper consists of the first five: loading, verification, preparation, resolution and initialization.

1. Loading

The "Loading" phase is a phase in the entire "Class Loading" process. During the loading phase, the Java virtual machine needs to complete the following three things:

  1. Obtain the binary byte stream that defines the class through its fully qualified name.
  2. Convert the static storage structure represented by this byte stream into a runtime data structure in the method area.
  3. Generate a java.lang.Class object representing this class in memory, which serves as the access point for various data of this class in the method area.

After the loading phase is completed, the binary byte stream from outside the Java virtual machine is stored in the method area in whatever format the virtual machine chooses. The data storage format in the method area is entirely defined by the virtual machine implementation; the "Java Virtual Machine Specification" does not prescribe a specific data structure for this area. After the type data is placed in the method area, an object of the java.lang.Class class is instantiated in the Java heap; this object serves as the external interface through which the program accesses the type data in the method area.
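The `java.lang.Class` access point from step 3 can be observed directly. A small sketch using reflection (the choice of `java.util.ArrayList` is arbitrary); note that core JDK classes are loaded by the bootstrap class loader, which this API represents as null:

```java
public class LoadingDemo {
    public static void main(String[] args) throws ClassNotFoundException {
        // Loading by fully qualified name yields the java.lang.Class object
        // that acts as the access point to the type's data in the method area.
        Class<?> c = Class.forName("java.util.ArrayList");

        System.out.println(c.getName());                // java.util.ArrayList
        System.out.println(c.getClassLoader() == null); // true: bootstrap loader
    }
}
```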

2. Verification

Verification is the first step of the linking phase. Its purpose is to ensure that the information in the byte stream of the Class file complies with all the constraints of the "Java Virtual Machine Specification" and will not harm the security of the virtual machine itself when run as code. Verification roughly consists of four stages: file format verification, metadata verification, bytecode verification and symbolic reference verification.

  1. File format verification:

    The first stage is to verify whether the byte stream conforms to the specifications of the Class file format and can be processed by the current version of the virtual machine.

  2. Metadata verification:

    The second stage is to perform semantic analysis on the information described by the bytecode to ensure that the information described meets the requirements of the "Java Language Specification".

  3. Bytecode verification:

    The third stage is to determine that the program semantics are legal and logical through data flow analysis and control flow analysis.

  4. Symbol reference verification:

    Symbolic reference verification can be seen as a matching check on information outside the class itself (the various symbolic references in the constant pool). In plain terms, it checks whether the class is missing, or is prohibited from accessing, the external classes, methods, fields and other resources it depends on.

3. Preparation

The preparation phase formally allocates memory for class variables (i.e. static variables, those modified by static) and sets their initial zero values. Conceptually, the memory used by these variables should be allocated in the method area, but note that the method area is itself a logical area. In JDK 7 and earlier, HotSpot implemented the method area with the permanent generation, which is fully consistent with this logical concept; in JDK 8 and later, class variables are stored in the Java heap together with the Class object, so "class variables are in the method area" is purely a logical expression.

4. Resolution

The resolution phase is the process in which the Java virtual machine replaces symbolic references in the constant pool with direct references. Symbolic references appear in the Class file as constants of types such as CONSTANT_Class_info, CONSTANT_Fieldref_info and CONSTANT_Methodref_info. So what are the direct references and symbolic references mentioned here?

Symbolic References: A symbolic reference uses a set of symbols to describe the referenced target. The symbol can be any form of literal, as long as the target can be located unambiguously when used. Symbolic references have nothing to do with the memory layout implemented by the virtual machine, and the target of the reference is not necessarily the content that has been loaded into the virtual machine memory. The memory layouts implemented by various virtual machines can be different, but the symbol references they can accept must be consistent, because the literal form of symbol references is clearly defined in the Class file format of the "Java Virtual Machine Specification".

Direct References: A direct reference is a pointer, a relative offset that can directly point to the target, or a handle that can indirectly locate the target. Direct references are directly related to the memory layout implemented by the virtual machine. The direct references translated by the same symbol reference on different virtual machine instances are generally not the same. If there is a direct reference, the referenced target must already exist in the memory of the virtual machine.

5. Initialization

The initialization phase of a class is the last step of the class loading process. Of the class loading actions introduced above, only the loading phase allows the user application to participate partially (through a custom class loader); the remaining actions are fully controlled by the Java virtual machine. Not until the initialization phase does the Java virtual machine actually begin executing the Java program code written in the class and hand dominance over to the application.

During the preparation phase the variables were already assigned the initial zero values required by the system; during the initialization phase, class variables and other resources are initialized according to the plan the programmer expressed in code. Put more directly: the initialization phase is the process of executing the class constructor method <clinit>(). <clinit>() is not a method the programmer writes directly in Java code; it is generated automatically by the Javac compiler.
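The interplay between preparation (zero values) and initialization (<clinit>() in textual order) can be sketched like this (class name and fields are my own):

```java
public class ClinitDemo {
    // Preparation: counter and snapshot both receive the zero value 0.
    static int counter;

    // Initialization: the static block and the initializer below are compiled
    // together, in textual order, into the generated <clinit>() method.
    static { counter += 1; }         // runs first: counter becomes 1
    static int snapshot = counter;   // runs second, so it sees 1, not 0

    public static void main(String[] args) {
        System.out.println(counter);   // 1
        System.out.println(snapshot);  // 1
    }
}
```

If `snapshot = counter` were moved above the static block, it would capture the zero value 0 from the preparation phase instead.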

Interaction between JMM main memory and local memory

  1. **lock**: acts on a variable in main memory; a variable can be locked by only one thread at a time. This operation marks the variable as exclusively owned by that thread.
  2. **unlock**: acts on a variable in main memory; releases a variable from the locked state so that other threads can lock it.
  3. **read**: acts on a variable in main memory; transfers the value of a main-memory variable to the thread's working memory for use by a subsequent load operation.
  4. **load**: acts on a variable in the thread's working memory; puts the value obtained by the read operation into the working-memory copy of the variable (the copy is "a copy" relative to the main-memory variable).
  5. **use**: acts on a variable in the thread's working memory; passes the value of a working-memory variable to the execution engine. The virtual machine performs this operation whenever it encounters a bytecode instruction that needs the variable's value.
  6. **assign**: acts on a variable in the thread's working memory; assigns the result returned by the execution engine to the working-memory variable. The virtual machine performs this operation whenever it encounters a bytecode instruction that assigns a value to the variable.
  7. **store**: acts on a variable in the thread's working memory; transfers the value of a working-memory variable to main memory for use by a subsequent write operation.
  8. **write**: acts on a variable in main memory; puts the value obtained by the store operation into the main-memory variable.
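The eight operations are a conceptual model, not steps you can observe from Java code, but they can be annotated onto a simple statement (class and field names are my own):

```java
public class EightOpsSketch {
    static int shared = 5;   // the variable lives in main memory

    public static void main(String[] args) {
        // Conceptually, "shared = shared + 1" decomposes into the JMM operations:
        //   read(shared)   transfer the value 5 out of main memory
        //   load(shared)   place it in this thread's working-memory copy
        //   use(shared)    hand the copy's value to the execution engine for the +1
        //   assign(shared) write the engine's result 6 back to the working copy
        //   store(shared)  transfer the copy's value toward main memory
        //   write(shared)  update the main-memory variable to 6
        shared = shared + 1;
        System.out.println(shared);  // 6
    }
}
```

Because the read/load/use/assign/store/write sequence of one thread can interleave with that of another, a shared `x++` without synchronization can lose updates.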

How volatile ensures visibility

When a shared variable is modified with volatile, cache coherence (for example the MESI protocol) is enforced on the bus. When one thread writes a modified value back across the bus, bus snooping notifies other threads that are using the shared variable to invalidate the copy in their working memory; they must then fetch the new value from main memory.
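The classic demonstration is a stop flag (names are my own): with volatile, the worker reliably observes the writer's update; without it, the JIT may hoist the read and the loop can spin forever.

```java
public class VisibilityDemo {
    static volatile boolean stop = false;   // try removing volatile: the loop may never exit

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait; each volatile read observes the latest value
            }
            System.out.println("worker observed stop = true");
        });
        worker.start();

        Thread.sleep(100);
        stop = true;          // volatile write: published to main memory immediately
        worker.join(5000);
        System.out.println("worker still alive? " + worker.isAlive());  // false
    }
}
```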

How does volatile ensure orderliness?

Volatile prevents instruction reordering through memory barriers. A barrier tells the CPU and compiler that instructions before it must execute first and instructions after it must execute afterwards, and that the results of the instructions before the barrier are visible to the instructions after it.

Have you ever understood happen-before?

The results of the previous operation are visible to subsequent operations

  • Program order rule: within a thread, operations earlier in code order happen-before later ones;
  • Monitor lock rule: an unlock of a lock happens-before every subsequent lock of that same lock;
  • Volatile variable rule: a write to a volatile variable happens-before every subsequent read of that variable; put plainly, the result of a volatile write is visible to all subsequent operations on it;
  • Transitivity rule: if A happens-before B, and B happens-before C, then A happens-before C;
  • Thread start rule: the start() method of a Thread object happens-before every action of the started thread.

If two operations do not satisfy any of the above happens-before rules, then the order of these two operations is not guaranteed, and the JVM can reorder the two operations.
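The thread start rule is easy to see in code (names are my own): a plain write made before `start()` is guaranteed visible in the child thread, with no volatile or lock involved.

```java
public class StartRuleDemo {
    static int data = 0;   // deliberately NOT volatile

    public static void main(String[] args) throws InterruptedException {
        data = 42;   // this plain write happens-before t.start() below

        Thread t = new Thread(() ->
            // Thread start rule: everything done before start() is visible here.
            System.out.println("child sees data = " + data));

        t.start();
        t.join();
    }
}
```

Had `data = 42` been executed by another unrelated thread with no happens-before edge, the child could legally observe the stale 0.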

The concept of memory leaks and memory overflows

Memory leak: during its run the program allocates memory for temporary objects; after they are no longer used they are not reclaimed by the GC because they remain reachable, so the memory stays occupied and cannot be allocated to other variables. This is a memory leak.

Memory overflow: To put it simply, the memory requested by the program during operation is greater than the memory that the system can provide, resulting in the inability to apply for enough memory, so a memory overflow occurs.

How to solve the causes of memory leaks

The root cause of memory leaks is that long-lived objects hold references to short-lived objects. Although the short-lived objects are no longer needed, they cannot be recycled because the long-lived objects hold references to them.

  1. Release references to useless objects as early as possible.
  2. Avoid creating objects inside loops.
  3. For heavy string manipulation, avoid String; use StringBuilder (or StringBuffer) instead.
  4. Use static variables sparingly: a static variable lives as long as its class and basically does not participate in garbage collection.
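The "long-lived object holds references to short-lived objects" pattern can be shown in miniature (names are my own): a static collection keeps every added object strongly reachable, so the GC can never reclaim it.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakSketch {
    // A long-lived (static) collection: everything added stays strongly
    // reachable, so the GC cannot reclaim it even after it is no longer needed.
    static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        byte[] buffer = new byte[1024];
        CACHE.add(buffer);   // the buffer is never used again, but remains referenced
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) handleRequest();
        System.out.println("still referenced: " + CACHE.size());  // 100

        CACHE.clear();  // the fix: drop the references so the GC can reclaim them
        System.out.println("after clear: " + CACHE.size());       // 0
    }
}
```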

What is memory overflow and how to solve it?

This means that the memory requested during the running of the program is greater than the memory that the system can provide, resulting in the inability to apply for enough memory, so a memory overflow occurs.

There are many reasons for memory overflow, the common ones are as follows:

  1. The amount of data loaded into the memory is too large, such as retrieving too much data from the database at one time;
  2. There is a reference to the object in the collection class, which is not cleared after use, making it impossible for the JVM to recycle;
  3. There is an infinite loop in the code or the loop generates too many duplicate object entities;
  4. BUGs in the third-party software used;
  5. The startup parameter memory value is set too small.

Solution to memory overflow:

  • The first step is to modify the JVM startup parameters and directly increase the memory.
  • The second step is to check the error log to see if there are other exceptions or errors before the "OutOfMemory" error.
  • The third step is to walk through and analyze the code to find out where memory overflow may occur.
  • The fourth step is to use the memory viewing tool to dynamically view memory usage.

How to tell if an object is dead

Reachability analysis: the basic idea of this algorithm is to take a set of objects called "GC Roots" as starting points and search downward from them; the path traversed is called a reference chain. When an object has no reference chain connecting it to any GC Root, the object is proven unreachable and can be reclaimed.

For a string, if no object references the string constant, it is considered useless.

For a class to be unloadable, all of the following must hold: all instances of the class have been reclaimed, the class loader that loaded it has been reclaimed, and the corresponding Class object is not referenced anywhere, so the class cannot be accessed via reflection.

GC Roots:

  • Objects referenced in the virtual machine stack (local variable table in the stack frame)
  • Objects referenced in the native method stack (Native method)
  • The object referenced by the class static property in the method area
  • Objects referenced by constants in the method area
  • All objects held by synchronization locks

What reference types does Java have?

There are four types: strong reference, soft reference, weak reference, and virtual (phantom) reference, in order of decreasing reference strength.

1. StrongReference

Most of the references we use are strong references, the most common kind. If an object has a strong reference, it is like an indispensable daily necessity: the garbage collector will never reclaim it. When memory is insufficient, the Java virtual machine would rather throw an OutOfMemoryError and terminate the program abnormally than arbitrarily reclaim strongly referenced objects to relieve the shortage.

2. SoftReference

If an object only has soft references, it is similar to a dispensable daily necessities . If there is enough memory space, the garbage collector will not reclaim it. If there is insufficient memory space, the memory of these objects will be reclaimed. As long as the garbage collector does not collect it, the object can be used by the program. Soft references can be used to implement memory-sensitive caches.

Soft references can be used in conjunction with a reference queue (ReferenceQueue). If the object referenced by the soft reference is garbage collected, the JAVA virtual machine will add the soft reference to the associated reference queue.

3.WeakReference

If an object only has weak references, it is similar to a dispensable daily necessities . The difference between weak references and soft references is that objects with only weak references have a shorter life cycle. During the process of the garbage collector thread scanning the memory area under its jurisdiction, once an object with only weak references is found, its memory will be reclaimed regardless of whether the current memory space is sufficient. However, because the garbage collector is a very low priority thread, it may not necessarily find objects with only weak references quickly.

Weak references can be used in conjunction with a reference queue (ReferenceQueue). If the object referenced by the weak reference is garbage collected, the Java virtual machine will add the weak reference to the associated reference queue.
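The difference between a strong and a weak reference can be sketched directly (names are my own). Note that `System.gc()` is only a hint, so the post-GC clearing is typical HotSpot behavior rather than a guarantee:

```java
import java.lang.ref.WeakReference;

public class WeakRefDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        System.gc();
        System.out.println(weak.get() != null);  // true: a strong reference still exists

        strong = null;   // drop the only strong reference
        System.gc();     // only a hint, but HotSpot typically clears weak refs here
        System.out.println("cleared? " + (weak.get() == null));  // usually true
    }
}
```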

4. PhantomReference

"Virtual reference", as the name suggests, is in name only. Unlike other types of references, virtual references do not determine the life cycle of the object. If an object holds only phantom references, it is as if it had no references and may be garbage collected at any time.

Virtual references are mainly used to track the activities of objects being garbage collected .

One difference between virtual references, soft references and weak references is that virtual references must be used in conjunction with a reference queue (ReferenceQueue). When the garbage collector is preparing to recycle an object, if it finds that it still has a virtual reference, it will add the virtual reference to the reference queue associated with it before recycling the object's memory. The program can learn whether the referenced object will be garbage collected by determining whether a virtual reference has been added to the reference queue. If the program finds that a virtual reference has been added to the reference queue, it can take necessary actions before the memory of the referenced object is recycled.

Note that weak references and phantom references are rarely used in practice, while soft references are used often. This is because soft references allow caches to be dropped automatically when memory runs low, which helps keep the system running safely and prevents problems such as memory overflow (OutOfMemoryError).
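The reference types described above can be illustrated with a small sketch; the class and method names here are made up for illustration. Only fully deterministic behaviors are checked: a phantom reference's `get()` always returns `null`, and soft/weak references can still see an object while a strong reference keeps it alive. Actual collection timing depends on the GC and is deliberately not asserted.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceDemo {

    // get() on a phantom reference always returns null, by design:
    // the object can never be reached again through the phantom reference.
    public static boolean phantomGetIsNull() {
        Object obj = new Object();
        PhantomReference<Object> phantom =
                new PhantomReference<>(obj, new ReferenceQueue<>());
        return phantom.get() == null;
    }

    // While a strong reference keeps the object alive, soft and weak
    // references still return it from get().
    public static boolean softAndWeakSeeLiveObject() {
        Object obj = new Object();
        SoftReference<Object> soft = new SoftReference<>(obj);
        WeakReference<Object> weak = new WeakReference<>(obj);
        // After obj is set to null and a collection actually runs,
        // weak.get() would typically become null and an enqueued reference
        // would appear on its queue -- but System.gc() is only a hint,
        // so that part is deliberately not asserted here.
        return soft.get() == obj && weak.get() == obj;
    }
}
```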

What are the garbage collection algorithms?

(1) Mark-sweep algorithm

The algorithm is divided into a "marking" phase and a "sweeping" phase: first, all objects that do not need to be reclaimed are marked; after marking completes, all unmarked objects are reclaimed in one pass. It is the most basic collection algorithm, and later algorithms are improvements on its shortcomings. This garbage collection algorithm has two obvious problems:

  1. Efficiency issues (both marking and sweeping must traverse many objects)
  2. Space issues (a large number of non-contiguous memory fragments remain after the sweep)
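The mark and sweep phases can be sketched on a toy object graph. This is a simulation for illustration only, not the JVM's internal implementation; all names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy heap to illustrate mark-sweep; a simulation, not the JVM's internals.
public class MarkSweepSketch {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: depth-first walk from a GC root, marking live objects.
    static void mark(Obj obj) {
        if (obj == null || obj.marked) return;
        obj.marked = true;
        for (Obj r : obj.refs) mark(r);
    }

    // Sweep phase: every unmarked object is dropped ("freed"); survivors
    // keep their original, now non-contiguous slots -- the source of the
    // fragmentation problem.
    static List<Obj> sweep(List<Obj> heap) {
        List<Obj> live = new ArrayList<>();
        for (Obj o : heap) {
            if (o.marked) { o.marked = false; live.add(o); }
        }
        return live;
    }

    public static List<String> collect() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.refs.add(b);                 // a -> b are reachable; c is garbage
        List<Obj> heap = List.of(a, b, c);
        mark(a);                       // a is the only GC root
        List<String> survivors = new ArrayList<>();
        for (Obj o : sweep(heap)) survivors.add(o.name);
        return survivors;
    }
}
```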

(2) Mark-compact algorithm

A marking algorithm proposed for the characteristics of the old generation. The marking phase is the same as in the "mark-sweep" algorithm, but instead of reclaiming recyclable objects directly, all surviving objects are moved toward one end of the memory, and the memory beyond the end boundary is then cleaned up directly.

Advantages: No memory fragmentation

Disadvantages: Compaction has to move objects, which makes it slow.

(3) Copying algorithm (suited to the Survivor spaces: memory is divided into two halves that swap roles)

To solve the efficiency problem, the "mark-copy" collection algorithm appeared. It divides the available memory into two equal-sized blocks and uses only one at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass. Each collection thus only has to deal with half of the memory range.
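A minimal sketch of this semispace idea, under simplifying assumptions: the "heap" holds object names, and the survivor set is supplied externally (a real collector determines survivors via reachability). All names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy semispace sketch of the mark-copy algorithm.
public class CopyingSketch {
    static List<String> fromSpace = new ArrayList<>();
    static List<String> toSpace = new ArrayList<>();

    // Copy every survivor into to-space (compacting as we go), free the old
    // space in one step, then swap the roles of the two spaces.
    static void copyCollect(Set<String> survivors) {
        for (String obj : fromSpace) {
            if (survivors.contains(obj)) toSpace.add(obj);
        }
        fromSpace.clear();            // reclaim the whole used block at once
        List<String> tmp = fromSpace; // swap: to-space becomes the new from-space
        fromSpace = toSpace;
        toSpace = tmp;
    }

    public static List<String> demo() {
        fromSpace.clear();
        toSpace.clear();
        fromSpace.addAll(List.of("a", "b", "c", "d"));
        copyCollect(Set.of("b", "d"));  // only b and d survive this cycle
        return fromSpace;               // survivors sit contiguously: no fragments
    }
}
```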

Generational collection algorithm

Generally, the Java heap is divided into the new generation and the old generation, so that we can choose the appropriate garbage collection algorithm according to the characteristics of each generation.

For example, in the new generation, a large number of objects die in each collection, so the "mark-copy" algorithm can be chosen: each garbage collection only pays the cost of copying a small number of surviving objects. Objects in the old generation have a high survival probability, and there is no extra space to act as an allocation guarantee for them, so the "mark-sweep" or "mark-compact" algorithm must be used for garbage collection.

(1) The object is first allocated in the Eden area

(2) When there is insufficient space in the new generation, a Minor GC (new-generation garbage collection) is triggered: the surviving objects in Eden and the Survivor "from" space are copied into the Survivor "to" space, the age of each surviving object is incremented by 1, and the "from" and "to" spaces swap roles.

Minor GC triggers stop-the-world (STW): other user threads are suspended and resume only after garbage collection completes.

(3) When an object's age in the Survivor space exceeds the threshold, it is promoted to the old generation. The maximum age is 15 (the age is stored in 4 bits of the object header).

(4) When old-generation space is insufficient, a Minor GC is triggered first; if space is still insufficient afterwards, a Full GC is triggered, with a longer STW pause.

When an object is so large that the new generation cannot hold it but the old generation has enough space, it is promoted directly to the old generation without triggering a new-generation GC.
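The sizing and promotion behavior described above is controlled by JVM flags. A hedged sketch of commonly used HotSpot (JDK 8) options; the values are illustrative examples, not recommendations, and `MyApp` is a placeholder main class:

```shell
# Illustrative HotSpot (JDK 8) sizing/promotion flags; MyApp is a placeholder
java \
  -Xms512m -Xmx512m \
  -Xmn170m \
  -XX:SurvivorRatio=8 \
  -XX:MaxTenuringThreshold=15 \
  -XX:+PrintGCDetails \
  MyApp
# -Xmn: new-generation size (roughly 1/3 of the heap here)
# -XX:SurvivorRatio=8: Eden : Survivor-from : Survivor-to = 8 : 1 : 1
# -XX:MaxTenuringThreshold=15: age at which survivors are promoted (max 15,
#   because the age lives in 4 bits of the object header)
# -XX:+PrintGCDetails: log each minor/full GC (JDK 8 flag)
```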

Garbage collector

Serial Collector

The Serial collector is the most basic and oldest garbage collector. As the name suggests, it is a single-threaded collector. Its "single-threaded" nature means not only that it uses just one garbage collection thread to do the collection work, but, more importantly, that it must suspend all other working threads ("Stop The World") while collecting, until collection finishes.

The new generation uses the mark-copy algorithm, and the old generation uses the mark-compact algorithm.

ParNew Collector

The ParNew collector is actually a multi-threaded version of the Serial collector. Except for using multi-threads for garbage collection, the rest of the behavior (control parameters, collection algorithms, recycling strategies, etc.) is exactly the same as the Serial collector.

The new generation uses the mark-copy algorithm, and the old generation uses the mark-compact algorithm.

  • Parallel : refers to multiple garbage collection threads working in parallel, but at this time the user thread is still in a waiting state.
  • Concurrent : refers to the user thread and the garbage collection thread executing at the same time (but not necessarily in parallel, they may execute alternately). The user program continues to run, while the garbage collector runs on another CPU.

Parallel Scavenge Collector

The Parallel Scavenge collector is also a multi-threaded collector using the mark-copy algorithm. It looks almost the same as ParNew. So what's so special about it?

The focus of the Parallel Scavenge collector is throughput (efficient use of the CPU), whereas collectors such as CMS focus more on the pause time of user threads (to improve user experience). Throughput is the ratio of CPU time spent running user code to total CPU time consumed. The Parallel Scavenge collector provides many parameters that let users find the most suitable pause time or maximum throughput. If you do not know much about how the collector operates and find manual tuning difficult, using the Parallel Scavenge collector together with its adaptive adjustment strategy, and leaving memory-management tuning to the virtual machine, is also a good choice.

The new generation uses the mark-copy algorithm, and the old generation uses the mark-compact algorithm.

This is the default collector for JDK1.8

Serial Old Collector

The old generation version of the Serial collector , which is also a single-threaded collector. It has two main uses: one is to be used with the Parallel Scavenge collector in JDK1.5 and previous versions, and the other is to be used as a backup solution for the CMS collector.

Parallel Old Collector

The old generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm. In situations where throughput and CPU resources matter, the Parallel Scavenge and Parallel Old collectors can be given priority.

CMS Collector

The CMS (Concurrent Mark Sweep) collector is a collector that aims to obtain the shortest collection pause time. It is very suitable for use in applications that focus on user experience.

The CMS (Concurrent Mark Sweep) collector is the first truly concurrent collector of the HotSpot virtual machine. For the first time, it allows the garbage collection thread and the user thread to work (basically) at the same time.

As can be seen from the two words Mark Sweep in the name , the CMS collector is implemented by a "mark-sweep" algorithm , and its operation process is more complicated than the previous garbage collectors. The whole process is divided into four steps:

  • Initial marking: Pause all other threads and record objects directly connected to the root, which is very fast ;
  • Concurrent marking: Turn on GC and user threads at the same time, and use a closure structure to record reachable objects. But at the end of this stage, this closure structure is not guaranteed to contain all currently reachable objects. Because the user thread may continuously update the reference domain, the GC thread cannot guarantee the real-time performance of the reachability analysis. So this algorithm will track and record the places where reference updates occur.
  • Re-marking: the re-marking phase corrects the mark records of objects whose references changed because the user program kept running during concurrent marking. The pause in this phase is usually slightly longer than in the initial marking phase, but far shorter than the concurrent marking phase.
  • Concurrent cleanup: Start the user thread, and at the same time, the GC thread starts cleaning the unmarked area.

The CMS collector is far from perfect. It has at least the following three obvious shortcomings:

First, the CMS collector is very sensitive to processor resources . In the concurrent phase, although it will not cause user threads to pause, it will cause the application to slow down and reduce the overall throughput by occupying a part of the threads (or the computing power of the processor) .

Then, because the CMS collector cannot handle "floating garbage" (garbage produced while user threads keep running during the concurrent phases), a "Concurrent Mode Failure" may occur, resulting in another fully "Stop The World" Full GC.

There is one last drawback. CMS is a collector based on the "mark-sweep" algorithm, which means a large amount of space fragmentation is produced at the end of collection. When fragmentation is severe, allocating large objects becomes troublesome: it often happens that the old generation has plenty of space left in total, yet no contiguous space large enough for the current object can be found, and a Full GC has to be triggered in advance.
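On JDK 8, CMS is enabled and tuned with flags like the following. This is a sketch with illustrative values; `MyApp` is a placeholder, and note that CMS was deprecated in JDK 9 and removed in JDK 14:

```shell
# Enabling and tuning CMS on JDK 8 (deprecated in JDK 9, removed in JDK 14)
java -XX:+UseConcMarkSweepGC \
     -XX:CMSInitiatingOccupancyFraction=70 \
     -XX:+UseCMSInitiatingOccupancyOnly \
     -XX:+UseCMSCompactAtFullCollection \
     MyApp
# CMSInitiatingOccupancyFraction=70: start the concurrent cycle when the old
#   generation is 70% full, leaving headroom for floating garbage so a
#   "Concurrent Mode Failure" is less likely
# UseCMSCompactAtFullCollection: compact the old generation on a Full GC
#   to mitigate the fragmentation drawback
```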

G1 Collector

G1 (Garbage-First) is a server-oriented garbage collector, mainly targeting machines equipped with multiple processors and large-capacity memory. While meeting the GC pause time requirements with a very high probability, it also has high throughput performance characteristics.

It is regarded as an important evolutionary feature of the HotSpot virtual machine in JDK1.7. It has the following characteristics:

  • Parallelism and concurrency : G1 can make full use of the hardware advantages of CPU and multi-core environments, and use multiple CPUs (CPUs or CPU cores) to shorten Stop-The-World pause time. Some other collectors originally need to pause the GC actions executed by the Java thread, but the G1 collector can still allow the Java program to continue executing in a concurrent manner.
  • Generational collection : Although G1 can manage the entire GC heap independently without the cooperation of other collectors, it still retains the concept of generational collection.
  • Spatial integration : Different from the "mark-clean" algorithm of CMS, G1 is a collector implemented based on the "mark-organize" algorithm as a whole; locally, it is implemented based on the "mark-copy" algorithm.
  • Predictable pauses : this is another big advantage of G1 over CMS. Reducing pause time is a common goal of both, but besides pursuing low pauses, G1 can also establish a predictable pause-time model, which lets users explicitly specify that, within a time segment of length M milliseconds, no more than N milliseconds should be spent on garbage collection.

The operation of the G1 collector is roughly divided into the following steps:

  • Initial marking
  • Concurrent marking
  • Final marking
  • Screening and recycling

The G1 collector maintains a priority list in the background; each time, based on the allowed collection time, it gives priority to the Region with the greatest recycling value (this is the origin of the name Garbage-First). This approach of using Regions to divide the memory space, with prioritized Region recycling, ensures that the G1 collector collects as efficiently as possible within the limited time (splitting the whole heap into manageable pieces).

Although G1 is still designed according to generational collection theory, its heap memory layout differs markedly from other collectors: G1 no longer insists on generational areas of fixed size and fixed number, but divides the contiguous Java heap into multiple independent Regions of equal size. Each Region can serve as Eden space, Survivor space, or old-generation space as needed. The collector applies different strategies to Regions playing different roles, so good collection results are obtained both for newly created objects and for old objects that have survived multiple collections.
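G1 is selected and given a pause-time goal with flags such as these (a sketch with illustrative values; `MyApp` is a placeholder; G1 is the default collector from JDK 9 onward):

```shell
# Selecting G1 with a pause-time goal (G1 is the default from JDK 9 onward)
java -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -XX:G1HeapRegionSize=4m \
     MyApp
# MaxGCPauseMillis: a soft goal the pause-prediction model aims for,
#   not a hard guarantee
# G1HeapRegionSize: Region size, a power of two between 1m and 32m;
#   chosen automatically from the heap size if not set
```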

CMS recycling process

The CMS (Concurrent Mark Sweep) collector aims for the shortest collection pause time (it pursues low pauses). It lets user threads and GC threads execute concurrently during garbage collection, so users do not feel any obvious lag while collection is running.

Initial mark, concurrent mark, remark, and concurrent cleanup

Initial marking: marking starts from the GC Roots. This causes a short pause, but there are not many objects directly reachable from the GC Roots, so it is very fast.

Concurrent marking: Other related objects will continue to be marked according to the marking result of the previous step. This process will not block other threads.

Remark: because the previous step did not block other threads, new garbage may have been produced in the meantime; this phase pauses briefly to correct those marks.

Concurrent clearing: clearing objects marked as dead. This process is also performed together with other threads.

A more complete breakdown of the CMS cycle also includes two additional concurrent phases between concurrent marking and remarking:

  • Concurrent pre-cleaning
  • Abortable concurrent pre-cleaning

followed at the end by the concurrent cleanup phase described above.

Introduce the new generation and old generation

New generation: used to store newly created objects; it generally occupies 1/3 of the heap space. Because objects are created frequently, the new generation frequently triggers Minor GC for garbage collection. The new generation is divided into three areas: Eden, Survivor From, and Survivor To.

**Old generation:** Mainly stores memory objects with long life cycles in applications.

Objects in the old generation are relatively stable, so Major GC is not executed frequently. Before a Major GC, a Minor GC is usually performed first so that new-generation objects are promoted to the old generation; a Major GC is triggered only when old-generation space is insufficient.

When a large enough contiguous space cannot be found to allocate to a newly created larger object, a MajorGC will be triggered in advance to perform garbage collection to make space.

Major GC uses a mark-and-sweep algorithm: first the entire old generation is scanned and surviving objects are marked, then the unmarked objects are reclaimed. Major GC takes a long time because it must scan and then reclaim, and it produces memory fragments; to reduce memory waste, fragments are generally merged, or recorded so they can be allocated to directly next time. When the old generation is full and can hold no more, an OOM (Out of Memory) error is thrown.

Permanent generation:

Refers to the permanent storage area of memory, which mainly stores Class and Meta (metadata) information. A Class is placed in the permanent generation when it is loaded; this area is different from the one where instances are stored, and GC does not clean the permanent generation while the main program runs. As a result, the permanent generation fills up as the number of loaded Classes grows, eventually throwing an OOM exception.

JAVA8 and metadata:

In Java 8, the permanent generation has been removed and replaced by an area called the " metadata area " ( metaspace ).

The nature of the metaspace is similar to that of the permanent generation. The biggest difference is that the metaspace is not part of the virtual machine's memory; it uses native memory instead. Therefore, by default, the size of the metaspace is limited only by available native memory.

The metadata of classes is placed in native memory, while the string pool and class static variables are placed in the Java heap. In this way, the amount of class metadata that can be loaded is no longer controlled by MaxPermSize, but by the actual available space of the system.
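Metaspace sizing is controlled with its own flags; a hedged example with illustrative values (`MyApp` is a placeholder):

```shell
# Metaspace sizing on JDK 8+ (by default limited only by native memory)
java -XX:MetaspaceSize=128m \
     -XX:MaxMetaspaceSize=256m \
     MyApp
# MetaspaceSize: initial high-water mark that triggers a metaspace GC
# MaxMetaspaceSize: hard cap; exceeding it raises OutOfMemoryError: Metaspace
# The old -XX:MaxPermSize flag is ignored on JDK 8+ (it prints a warning)
```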

CyclicBarrier

It allows a group of threads to wait for each other: the barrier opens only when all threads reach a common barrier point, and only then can the threads continue to execute. A simple example: a tour guide takes a group to visit attractions and stipulates that everyone gathers at the next attraction A, so the guide waits at A for everyone. The guide is this gathering point, or barrier. Only when all tourists have gathered will the guide take them on to the next attraction B.

CyclicBarrier has two constructors:

  • CyclicBarrier(int parties); The int type parameter indicates how many threads participate in this barrier interception (take the above example, that is, how many people are traveling in a group);
  • CyclicBarrier(int parties, Runnable barrierAction); when all threads reach the barrier point, barrierAction is executed first (by the last thread to arrive), before the waiting threads are released.

The most important method:

await(); each thread calls await() to signal that it has reached the barrier point, and the current thread then blocks. (In the example above: tourist A signals that he has arrived at attraction A, and then waits there for everyone else to arrive.)
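A runnable sketch of the usage above (class and method names are made up for illustration): `parties` threads all block in `await()`, the barrier action runs exactly once when the last one arrives, and only then do all threads proceed.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierDemo {

    // Starts `parties` threads that all wait at one barrier; the barrier
    // action runs exactly once, when the last thread arrives. Returns the
    // number of threads that made it past the barrier.
    public static int run(int parties) throws InterruptedException {
        AtomicInteger passed = new AtomicInteger();
        // The barrier action is the "tour guide" callback from the analogy:
        // it is executed by the last arriving thread before anyone proceeds.
        CyclicBarrier barrier = new CyclicBarrier(parties,
                () -> System.out.println("Everyone has arrived, moving on"));

        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await();          // block until all parties arrive
                    passed.incrementAndGet(); // only runs after the barrier opens
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return passed.get();
    }
}
```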

Origin blog.csdn.net/qq_43167873/article/details/130169872