JVM memory management, GC algorithms, and collection strategies that Alibaba interviewers love to ask about

JVM memory structure

JVM memory consists of the heap, the virtual machine stack, the native method stack, the method area, and so on. The structure diagram is as follows:

JVM memory reclamation

Sun's JVM uses generational collection for garbage collection. Based on an analysis of object life cycles, objects are divided into a young generation (Young), an old generation (Tenured), and a permanent generation (Perm), and a different algorithm is used for objects with different life cycles.

1. Young (young generation)

The young generation is divided into three areas: one Eden area and two Survivor areas. Most objects are created in Eden. When Eden is full, the surviving objects are copied to one of the two Survivor areas. When that Survivor area is full, the objects still alive in it are copied to the other Survivor area. When that one is full as well, the objects that came from the first Survivor area and are still alive are copied to the Tenured area. Note that the two Survivor areas are symmetrical and have no fixed order, so the same Survivor area may hold objects copied from Eden and objects copied from the other Survivor area at the same time, while only the objects that came from the other Survivor area are copied to the old generation. In addition, one of the two Survivor areas is always empty.

2. Tenured (old generation)

The old generation holds objects that survive from the young generation. Generally speaking, the old generation stores objects with long lifespans.

3. Perm (permanent generation)

The permanent generation stores static data such as Java classes and methods. It has no significant impact on garbage collection, but some applications, such as those using Hibernate, dynamically generate or load classes at run time. In that case a relatively large permanent generation is needed to hold the classes added at run time. Its maximum size is set with -XX:MaxPermSize=<size>.

For example, when a program creates an object, an ordinary object is allocated in the young generation, while a very large object may be allocated directly in the old generation. (It has been observed that a program which allocates a buffer of about ten megabytes on every run to send and receive messages has that memory allocated directly in the old generation.) When the young generation runs out of space, a collection is triggered: most of the memory is reclaimed, and the surviving objects are copied to the from area of the Survivor space. When the from area fills up, another collection occurs and the remaining objects are copied to the to area. When the to area is also full, a collection occurs again and the surviving objects are copied to the old generation.

When we talk about JVM memory reclamation, we usually mean reclamation of heap memory, because only the contents of the heap are allocated dynamically. The young and old generations described above therefore belong to the JVM's heap, whereas the permanent generation is the method area mentioned earlier and is not part of the heap.
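
The split between heap and non-heap areas can be observed on a running JVM. The following is a minimal sketch that uses the standard java.lang.management API to list each memory pool with its type; the class name MemoryPools and its helper methods are made up for illustration, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // One line per pool: the pool type ("HEAP" or "NON_HEAP"), a tab, then the pool name.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType().name() + "\t" + pool.getName());
        }
        return lines;
    }

    // Counts the pools whose type name matches exactly, e.g. "HEAP".
    static long count(String typeName) {
        return describe().stream()
                .filter(line -> line.startsWith(typeName + "\t"))
                .count();
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Eden, Survivor, and old generation pools are reported as HEAP, while the area that holds class data is reported as NON_HEAP.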

Some advice on JVM memory management

  1. Manually set useless objects and intermediate objects to null to speed up memory reclamation.

  2. Object pooling. If the objects being created are reusable and differ only in their properties, consider using an object pool to create fewer objects. If the pool has a free object, it is taken from the pool and no new object is created, which greatly improves object reuse.

  3. JVM tuning. Garbage collection can be made faster by configuring JVM parameters. If there is no memory leak and the two methods above still cannot keep up with memory reclamation, consider JVM tuning, but any change must go through long-term testing on a physical machine, because different parameters can have very different effects. An example is the -Xnoclassgc parameter.
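
The object pooling advice in item 2 can be sketched as follows. This is a minimal, single-threaded illustration rather than a production pool: the class SimplePool and its methods are hypothetical names, and a real pool would also need thread safety and a size limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SimplePool<T> {
    private final Deque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Hand out a free instance if one exists; otherwise create a new one.
    T acquire() {
        T obj = free.poll();
        if (obj == null) {
            obj = factory.get();
            created++;
        }
        return obj;
    }

    // Return an instance to the pool. The caller resets its state first
    // and must not keep using it afterwards.
    void release(T obj) {
        free.push(obj);
    }

    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new);
        StringBuilder first = pool.acquire();
        first.append("first message");
        first.setLength(0);                        // reset properties before release
        pool.release(first);
        StringBuilder second = pool.acquire();     // reuses the released instance
        System.out.println(first == second);       // true
        System.out.println(pool.createdCount());   // 1
    }
}
```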

Judgment of garbage objects

Almost all object instances are stored in the Java heap. Before the garbage collector reclaims objects in the heap, it must first determine whether those objects are still in use. The following algorithms are used to determine whether an object is garbage:

reference counting algorithm

Add a reference counter to each object. Whenever a reference to it is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used.

Reference counting is simple to implement and efficient to evaluate, and in most cases it is a good choice. However, Java does not use it for garbage collection, mainly because it has difficulty handling circular references between objects.
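
The circular reference problem can be shown with a toy model. The sketch below keeps reference counts by hand (the JVM does not do this; the Node class and its refCount field are purely illustrative) and shows that two objects which point at each other never reach a count of 0, even after the program has dropped every external reference to them.

```java
class RefCountDemo {
    // A toy object that tracks its own reference count by hand.
    static class Node {
        int refCount = 0;
        Node next;  // outgoing reference to another Node

        // Redirect the outgoing reference and keep both counters up to date.
        void pointTo(Node target) {
            if (next != null) next.refCount--;
            next = target;
            if (target != null) target.refCount++;
        }
    }

    // Builds a <-> b, drops both external references, and returns the final counts.
    static int[] cycleCounts() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;   // external reference held by the variable a
        b.refCount++;   // external reference held by the variable b
        a.pointTo(b);   // a -> b
        b.pointTo(a);   // b -> a
        a.refCount--;   // the program "forgets" a
        b.refCount--;   // the program "forgets" b
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        System.out.println(counts[0] + " " + counts[1]);  // prints "1 1"
    }
}
```

Both counters stay at 1, so a pure reference counting collector would never reclaim either object.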

root search algorithm

**Java and C#** both use the root search algorithm to determine whether an object is alive. The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path traversed by the search is called a reference chain. When no reference chain connects an object to the GC Roots, the object is considered unusable. In Java, the objects that can serve as GC Roots include the following:

  • Objects referenced in the virtual machine stack (the local variable table in a stack frame).
  • Objects referenced by static class fields in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (native methods) in the native method stack.
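
The search from GC Roots is a graph traversal. Here is a minimal sketch over a hand-built object graph; the Obj class, the reachable method, and the demo graph are all illustrative assumptions, not how a real JVM represents objects. Note that c and d reference each other but are still found to be unreachable, which is exactly the case reference counting cannot handle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Reachability {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns every object reachable from the roots by following references.
    static Set<Obj> reachable(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {        // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // root -> a -> b is a reference chain; c <-> d is an isolated cycle.
    static List<String> demo() {
        Obj root = new Obj("root"), a = new Obj("a"), b = new Obj("b");
        Obj c = new Obj("c"), d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<String> names = new ArrayList<>();
        for (Obj o : reachable(Arrays.asList(root))) names.add(o.name);
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // [a, b, root]
    }
}
```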

In fact, declaring an object dead under the root search algorithm takes at least two marking passes. If the root search finds no reference chain connecting the object to the GC Roots, the object is marked for the first time and then screened. The screening condition is whether the object's finalize() method needs to be executed. If the object does not override finalize(), or the virtual machine has already called its finalize() method, the virtual machine treats finalization as unnecessary. If finalize() does need to be executed, the object is placed in a queue called the F-Queue, and finalize() is later run by a low-priority Finalizer thread that the virtual machine creates automatically. The finalize() method is the object's last chance to escape death, because the system calls an object's finalize() method automatically at most once. Later, the GC performs a second, smaller marking pass over the objects in the F-Queue. To save itself, the object only needs to re-establish a link to any object on a reference chain inside finalize(), for example by assigning this to a field of a reachable object. If the object is still not linked to any reference chain at that point, it is reclaimed.

Garbage Collection Algorithms

Once garbage objects have been identified, garbage collection can be performed. The following sections introduce several garbage collection algorithms. Because implementing these algorithms involves a great deal of detail, the aim here is to explain the idea behind each algorithm rather than its specific implementation.

mark-sweep algorithm

The mark-sweep algorithm is the most basic collection algorithm. It has two phases, "mark" and "sweep": first mark the objects to be reclaimed, then, once marking is complete, reclaim all marked objects together. The marking phase is the same marking process used to identify garbage objects in the root search algorithm described earlier. The execution of the mark-sweep algorithm is shown in the following figure:

This algorithm has the following disadvantages:

  • Neither the marking phase nor the sweeping phase is very efficient.
  • Sweeping leaves a large number of non-contiguous memory fragments. With too much fragmentation, a later attempt to allocate a large object may fail to find enough contiguous memory and trigger another garbage collection early.
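
The fragmentation problem can be made concrete with a toy heap. In this sketch the heap is a string of one-slot objects and '.' stands for a free slot; the class and method names are invented for illustration, and the live set is supplied directly instead of being computed by a root search.

```java
class MarkSweepDemo {
    // Sweep phase: every slot whose object is not live becomes free ('.').
    static String sweep(String heap, boolean[] live) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < heap.length(); i++) {
            out.append(live[i] ? heap.charAt(i) : '.');
        }
        return out.toString();
    }

    // Longest run of free slots, i.e. the largest object that still fits.
    static int largestFreeRun(String heap) {
        int best = 0;
        int run = 0;
        for (char slot : heap.toCharArray()) {
            run = (slot == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String heap = "ABCDEFGH";  // eight one-slot objects
        boolean[] live = { true, false, true, false, true, false, true, false };
        String after = sweep(heap, live);
        System.out.println(after);                  // A.C.E.G.
        System.out.println(largestFreeRun(after));  // 1
    }
}
```

Four slots are free after the sweep, but no two of them are adjacent, so a two-slot object cannot be allocated without another collection.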

copying algorithm

The copying algorithm is better suited to the young generation and was designed to address the shortcomings of mark-sweep. Available memory is divided into two blocks of equal capacity, and only one of them is used at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass. The copying algorithm has the following advantages:

  • Only one block of memory is reclaimed at a time, which is efficient.
  • Allocation only needs to move a top-of-heap pointer and hand out memory in order, which is simple to implement.
  • Memory fragmentation does not need to be considered when memory is reclaimed.

Its disadvantage is that the memory available for allocation at any one time is halved. The execution of the copying algorithm is shown in the following figure:

However, memory generally does not need to be split 1:1; it can instead be divided into one large Eden area and two small Survivor areas.
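
The copy step described above can be sketched with two equal-sized blocks. This is a toy model under the same assumptions as a string-based heap ('.' is a free slot, the live set is given rather than computed); CopyingDemo and its method are hypothetical names.

```java
import java.util.Arrays;

class CopyingDemo {
    // Copies the live objects of fromSpace into an empty block of the same size.
    static String copy(String fromSpace, boolean[] live) {
        char[] toSpace = new char[fromSpace.length()];
        Arrays.fill(toSpace, '.');
        int top = 0;  // allocation pointer: the next free slot in toSpace
        for (int i = 0; i < fromSpace.length(); i++) {
            if (live[i]) {
                toSpace[top++] = fromSpace.charAt(i);
            }
        }
        return new String(toSpace);
    }

    public static void main(String[] args) {
        boolean[] live = { true, false, true, false, true, false, true, false };
        System.out.println(copy("ABCDEFGH", live));  // ACEG....
    }
}
```

The survivors end up packed together, so the free space is one contiguous run and the next allocation only has to advance the pointer. The cost is the second block, which sits unused between collections.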

mark-compact algorithm

In the old generation, the object survival rate is relatively high. Performing many copy operations there would be inefficient, so other algorithms, such as mark-compact, are generally used for the old generation. The marking phase is the same as in mark-sweep, but the handling of garbage objects after marking differs: instead of cleaning up reclaimable objects in place, it moves all surviving objects toward one end and then clears the memory beyond the end boundary. The mark-compact collection process is shown below:
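
The sliding step described above (usually called mark-compact) can be sketched in place on a single block, with no second block required. As before this is a toy model: the heap is a char array, '.' is a free slot, the marks are supplied directly, and the names are illustrative.

```java
class MarkCompactDemo {
    // Slides marked objects toward index 0 in place, clears everything past
    // the last survivor, and returns the boundary (index of the first free slot).
    static int compact(char[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) {
                heap[boundary++] = heap[i];
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = '.';
        }
        return boundary;
    }

    public static void main(String[] args) {
        char[] heap = "ABCDEFGH".toCharArray();
        boolean[] marked = { false, true, true, false, false, true, false, true };
        int boundary = compact(heap, marked);
        System.out.println(new String(heap));  // BCFH....
        System.out.println(boundary);          // 4
    }
}
```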

Generational collection

Current commercial virtual machines use generational collection to manage memory. Memory is divided into several blocks according to object life cycles; the Java heap is generally divided into a young generation and an old generation. In the young generation, each garbage collection finds that a large number of objects have died and only a few survive, so the copying algorithm can be used. In the old generation, the object survival rate is high and there is no extra space to guarantee allocation for it, so a mark-sweep or mark-compact algorithm must be used.

Each object has an age counter. If an object is born in Eden and survives its first Minor GC, it is moved to a Survivor area and its age is set to 1. Each further Minor GC it survives in the Survivor area increases its age by 1, and when the age reaches a certain threshold (15 by default), the object is moved to the old generation.
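
The age counter can be modeled in a few lines. The sketch below follows the description in the text rather than the exact HotSpot implementation: each object is given an assumed lifetime (the number of Minor GCs for which it stays reachable), its age grows by 1 per survived Minor GC, and it is promoted once the age reaches the threshold. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AgingDemo {
    static class Obj {
        final int lifetime;  // number of Minor GCs this object stays reachable for
        int age = 0;
        Obj(int lifetime) { this.lifetime = lifetime; }
    }

    // Runs `rounds` Minor GCs and returns how many objects were promoted.
    static int simulate(int[] lifetimes, int rounds, int threshold) {
        List<Obj> young = new ArrayList<>();
        for (int lifetime : lifetimes) {
            young.add(new Obj(lifetime));
        }
        int promoted = 0;
        for (int gc = 1; gc <= rounds; gc++) {
            List<Obj> survivors = new ArrayList<>();
            for (Obj o : young) {
                if (o.lifetime < gc) continue;  // unreachable by now: reclaimed
                o.age++;                        // survived one more Minor GC
                if (o.age >= threshold) {
                    promoted++;                 // old enough: moves to the old generation
                } else {
                    survivors.add(o);           // stays in a Survivor area
                }
            }
            young = survivors;
        }
        return promoted;
    }

    public static void main(String[] args) {
        int[] lifetimes = { 0, 1, 3, 20, 100 };
        System.out.println(simulate(lifetimes, 15, 15));  // 2
    }
}
```

Only the two long-lived objects reach age 15 and are promoted; the short-lived ones are reclaimed in the young generation without ever touching the old generation.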

garbage collector

A garbage collector is a concrete implementation of a memory reclamation algorithm. The Java Virtual Machine Specification does not prescribe how the garbage collector should be implemented, so the collectors provided by different vendors and different versions of the virtual machine may differ considerably. The Sun HotSpot virtual machine, version 1.6, includes the following collectors: Serial, ParNew, Parallel Scavenge, CMS, Serial Old, and Parallel Old. These collectors work together in different combinations to collect the different generations.
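
Which collectors a particular JVM is actually running can be checked at run time. The following sketch uses the standard java.lang.management API to list each active collector together with the memory pools it manages; the class name CollectorNames is made up, and the names printed vary with the JVM version and the command-line flags used to select a collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorNames {
    // One entry per active collector: "<collector name> -> <pools it collects>".
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName() + " -> " + String.join(", ", gc.getMemoryPoolNames()));
        }
        return result;
    }

    public static void main(String[] args) {
        names().forEach(System.out::println);
    }
}
```

A generational setup typically reports one collector for the young generation pools and another that covers the old generation as well.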

Garbage Collection Analysis

Before analyzing with code, we clarify the following three points about the memory allocation strategy:

  • Objects are allocated in Eden first. When Eden does not have enough space for an allocation, a Minor GC is triggered.
  • Large objects (Java objects that need a lot of contiguous space, such as long strings and arrays) go directly into the old generation. Because the young generation reclaims memory with the copying algorithm, this avoids copying large amounts of memory between Eden and the two Survivor areas.
  • Long-lived objects enter the old generation.

Two points about the garbage collection strategy also need explaining:

  • Young generation GC (Minor GC): a garbage collection that takes place in the young generation. Because most Java objects die young, Minor GCs are very frequent and are generally fast.
  • Old generation GC (Major GC/Full GC): a garbage collection that takes place in the old generation. A Major GC is often accompanied by at least one Minor GC. Because objects in the old generation live relatively long, Major GCs are infrequent. A Full GC is generally performed when the old generation is full, and it is generally more than 10 times slower than a Minor GC. In addition, if Direct Memory has been allocated, a Full GC of the old generation also cleans up discarded objects in Direct Memory along the way.

The Dalvik virtual machine uses the Mark-Sweep algorithm for garbage collection. As the name suggests, Mark-Sweep collects garbage in two phases, Mark and Sweep. The Mark phase starts from the root set and recursively marks all currently referenced objects, while the Sweep phase reclaims the objects that are not referenced. Before analyzing the Mark-Sweep algorithm used by the Dalvik virtual machine, it helps to understand the circumstances under which a GC is triggered.
