Understanding and solving Java memory leaks

I previously looked at the difference between running out of memory and leaking memory, and at what causes an OutOfMemoryError. This article turns to memory leaks themselves: why they happen and what to watch for.

Memory management based on reference reachability

Access to memory objects in Java is done by reference.
In Java code we hold a reference variable to an object, and through the value of that variable we reach the object stored at the corresponding memory address. A reference variable can itself live either on the heap (as a field of another object) or on a thread's stack (as a local variable, just like a primitive value). The garbage collector (GC) starts tracing from the reference variables on the stacks, together with other roots such as static fields, to determine which memory is still in use. If a piece of heap memory cannot be reached this way, the GC concludes that the code can no longer access it and that it will never be used again.

Java manages memory as a directed graph of references, which eliminates the problem of reference cycles. For example, if three objects refer to each other but none of them is reachable from a root, the GC can still reclaim all three.

The advantage of this approach is precision; the cost is that tracing the graph takes more work. Another common memory management technique is reference counting; the COM model, for example, manages components this way. Compared with graph tracing it is less precise (circular references are hard to handle), but each individual operation is cheap.
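To make the circular-reference weakness of counters concrete, here is a toy sketch of reference counting. It is not how the JVM or COM is implemented; `CountedNode`, `RefCountDemo`, and the hand-maintained `refCount` field are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy reference-counted object (illustration only, not a JVM mechanism). */
final class CountedNode {
    final String name;
    int refCount;                                   // number of incoming references
    final List<CountedNode> outgoing = new ArrayList<>();

    CountedNode(String name) {
        this.name = name;
    }

    /** Adds an edge this -> target and increments the target's counter. */
    void pointTo(CountedNode target) {
        outgoing.add(target);
        target.refCount++;
    }
}

final class RefCountDemo {
    /** Builds the cycle a -> b -> c -> a, drops the only external reference, returns the counts. */
    static int[] countsAfterDroppingRoot() {
        CountedNode a = new CountedNode("a");
        CountedNode b = new CountedNode("b");
        CountedNode c = new CountedNode("c");

        a.refCount++;      // one external (root) reference to a
        a.pointTo(b);
        b.pointTo(c);
        c.pointTo(a);      // closes the cycle

        a.refCount--;      // the root lets go of a
        return new int[] { a.refCount, b.refCount, c.refCount };
    }

    public static void main(String[] args) {
        // No counter ever reaches zero, so a pure counting scheme never frees the cycle.
        System.out.println(Arrays.toString(countsAfterDroppingRoot())); // prints [1, 1, 1]
    }
}
```

Every counter stays at 1 even though nothing outside the cycle refers to it any more, which is exactly the case a tracing collector handles and a plain counter does not.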

Under this graph-based scheme, the GC can reclaim an object once no chain of references leads to it any more. Conversely, as long as a reference to the object remains, the GC will not reclaim it, even if that means the Java virtual machine has to throw an OutOfMemoryError.

Two scenarios of memory leaks

Generally speaking, there are two kinds of memory leaks. In the first, familiar from C/C++, memory allocated on the heap has not been released, yet every way of reaching it has been lost (for example, because a pointer was reassigned); in other words, the memory is unreachable. In the second, the object is no longer needed, but the memory and a way to reach it (a reference) are both still retained; the object is still reachable in the directed graph even though nothing will ever use it.

Java's GC handles the first case very well, but the second case requires attention when we write code.

Objects can be thought of as the vertices of a directed graph, and references as its directed edges, each pointing from the referrer to the referenced object. In addition, each thread can serve as a starting vertex of the graph. Most programs, for example, start executing in the main thread, so the graph is a rooted tree starting at the vertex for the main thread. In this directed graph, every object reachable from a root vertex is a live object, and the GC will not reclaim it. If an object (or a connected subgraph of objects) cannot be reached from any root vertex (remember that the edges are directed), those objects are considered no longer referenced and can be reclaimed by the GC.

Put another way, C++ programmers have to manage both the edges and the vertices themselves, while Java programmers only need to manage the edges; releasing the vertices is left to the GC.
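The graph model above can be sketched as a small simulation. This is only an illustration of reachability from a root, not the JVM's actual collector; `Vertex`, `ReachabilityDemo`, and the vertex names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** A heap object in the toy model: a name plus its outgoing reference edges. */
final class Vertex {
    final String name;
    final List<Vertex> references = new ArrayList<>();

    Vertex(String name) {
        this.name = name;
    }
}

final class ReachabilityDemo {
    /** Simplified "mark" phase: collects the names of every vertex reachable from the roots. */
    static Set<String> reachableFrom(List<Vertex> roots) {
        Set<Vertex> marked = new HashSet<>();
        Deque<Vertex> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Vertex current = pending.pop();
            if (marked.add(current)) {              // first visit: follow its edges
                pending.addAll(current.references);
            }
        }
        Set<String> names = new TreeSet<>();        // sorted for stable output
        for (Vertex v : marked) {
            names.add(v.name);
        }
        return names;
    }

    static Set<String> demo() {
        Vertex root = new Vertex("root");
        Vertex live = new Vertex("live");
        Vertex x = new Vertex("x");
        Vertex y = new Vertex("y");
        Vertex z = new Vertex("z");

        root.references.add(live);   // root -> live
        x.references.add(y);         // x -> y -> z -> x is a cycle with no edge from root
        y.references.add(z);
        z.references.add(x);
        x.references.add(live);      // an edge out of the cycle does not make the cycle reachable

        return reachableFrom(Collections.singletonList(root));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [live, root]
    }
}
```

The cycle x, y, z is never marked, so a tracing collector would treat it as garbage. A leak in the Java sense is the opposite situation: a vertex that is still marked although the program will never use it again.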

Memory leaks in Java

In Java, a memory leak means that certain allocated objects exist with two characteristics. First, the objects are reachable: in the directed graph there is a path that leads to them. Second, the objects are useless: the program will never use them again. Objects that satisfy both conditions are memory leaks in Java. The GC will not collect them, yet they continue to occupy memory.

Generally speaking, a memory leak is likely whenever a long-lived object holds a reference to a short-lived object. Although the short-lived object is no longer needed, it cannot be reclaimed because the long-lived object still references it.

A typical memory leak example:

 
    Vector<Object> v = new Vector<>(10);
    for (int i = 1; i < 100; i++) {
        Object o = new Object();
        v.add(o);
        // At this point none of the Object instances can be released,
        // because the variable v still references them.
        o = null;
    }

The loop repeatedly allocates an Object and adds it to a Vector. Setting only the local reference o to null releases nothing: the Vector still references the object, so the GC cannot reclaim it.

Therefore, an object that has been added to the Vector must also be removed from the Vector before it can be reclaimed. The simplest way to release all of the objects at once is to set the Vector reference itself to null.
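As a hedged sketch of the alternatives to nulling the whole Vector, the snippet below removes a single element and then clears the collection, using the standard `Vector.remove(Object)` and `Vector.clear()` methods. The class name `VectorCleanup` is hypothetical.

```java
import java.util.Arrays;
import java.util.Vector;

final class VectorCleanup {
    /** Returns the Vector's size after filling it, after removing one element, and after clearing it. */
    static int[] sizes() {
        Vector<Object> v = new Vector<>(10);
        for (int i = 1; i < 100; i++) {
            v.add(new Object());
        }
        int filled = v.size();

        Object done = v.get(0);
        v.remove(done);              // drop the Vector's reference to one finished object
        int afterRemove = v.size();

        v.clear();                   // drop the Vector's references to all remaining objects
        int afterClear = v.size();

        return new int[] { filled, afterRemove, afterClear };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizes())); // prints [99, 98, 0]
    }
}
```

Once the Vector no longer references an object and no other reference exists, the object becomes eligible for collection; when the GC actually reclaims it is up to the JVM.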

Scenarios that are prone to memory leaks in Java programs:

1. Collection classes. A collection that only ever has elements added, with no corresponding removal mechanism, keeps its memory occupied. If the collection is only a local variable, this causes no leak at all: once the method returns, no reference to it remains and the JVM reclaims it normally. If the collection is effectively global (a static field of a class, a global map, or anything else that a static or final reference always points to) and has no removal mechanism, the memory it occupies can only grow and never shrink, so a removal mechanism or a periodic clearing strategy is needed.

2. Singletons. Incorrect use of the singleton pattern is a common cause of memory leaks. Once a singleton has been initialized, it exists for the entire lifetime of the JVM (as a static variable). If the singleton holds a reference to an external object, the JVM cannot reclaim that external object normally, and memory leaks. Consider the following example:

 
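    class A {
        public A() {
            // Every new A registers itself with the singleton.
            B.getInstance().setA(this);
        }
        // …
    }

    // Class B uses the singleton pattern
    class B {
        private A a;
        private static final B instance = new B();

        // Private, so that no second instance can be created.
        private B() {}

        public static B getInstance() {
            return instance;
        }

        public void setA(A a) {
            this.a = a;
        }
        // getter …
    }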

B is clearly a singleton. It holds a reference to an A object, so that A instance can never be reclaimed. Imagine what happens if A is a fairly large object or a collection.
Therefore, during Java development and code review, we should pay particular attention to long-lived objects: global collections, uses of the singleton pattern, static variables of classes, and so on.
When an object is no longer in use, explicitly set the reference to it to null, and follow the principle that whoever creates an object also releases it. Both habits reduce the chance of memory leaks.
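One way to apply the "whoever creates it releases it" principle to a long-lived singleton is to pair every registration with a removal. The sketch below is an assumption-laden illustration, not code from the example above: `SessionRegistry`, `RegistryDemo`, and `handleRequest` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical long-lived singleton that offers removal as well as registration. */
final class SessionRegistry {
    private static final SessionRegistry INSTANCE = new SessionRegistry();

    private final Map<String, Object> sessions = new ConcurrentHashMap<>();

    private SessionRegistry() {}

    static SessionRegistry getInstance() {
        return INSTANCE;
    }

    void register(String id, Object session) {
        sessions.put(id, session);
    }

    /** The removal mechanism: without it the map could only grow. */
    void unregister(String id) {
        sessions.remove(id);
    }

    int size() {
        return sessions.size();
    }
}

final class RegistryDemo {
    /** Registers a short-lived object, uses it, and always releases it again. */
    static int handleRequest(String id) {
        SessionRegistry registry = SessionRegistry.getInstance();
        Object session = new Object();
        registry.register(id, session);
        try {
            return registry.size();      // stand-in for real work with the session
        } finally {
            registry.unregister(id);     // the creator releases what it registered
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("req-1"));               // prints 1
        System.out.println(SessionRegistry.getInstance().size()); // prints 0
    }
}
```

The `try`/`finally` keeps the release on the same code path as the registration, so an exception during the work does not leave the short-lived object pinned by the singleton.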

Reference types in Java

Strong references
All the references used so far in this article are strong references, which are the most common kind. An object with a strong reference is like an essential household item: the garbage collector will never reclaim it. When memory runs short, the Java virtual machine would rather throw an OutOfMemoryError and let the program terminate abnormally than solve the shortage by arbitrarily reclaiming strongly referenced objects.

Soft references (SoftReference)
A typical use of the SoftReference class is a memory-sensitive cache. A SoftReference keeps a reference to an object while guaranteeing that all soft references to softly reachable objects are cleared before the JVM reports an out-of-memory condition. The key point is that the garbage collector may or may not free softly reachable objects when it runs. Whether an object is freed depends on the collector's algorithm and on how much memory is available at that moment.
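A memory-sensitive cache along these lines might look like the following minimal sketch. `SoftCache` and its `loader` parameter are hypothetical; only `java.lang.ref.SoftReference` and the standard collection classes are real APIs.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache whose values the GC may drop when memory gets tight. */
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;     // recreates a value after it has been cleared

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();   // null if never cached or already cleared
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}

final class SoftCacheDemo {
    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<>(key -> new byte[1024]);
        byte[] first = cache.get("report");
        byte[] second = cache.get("report");
        // first is still strongly referenced here, so its soft reference cannot have been cleared.
        System.out.println(first == second); // prints true
    }
}
```

Note that this sketch never removes the map entries whose referents have been cleared, so the keys and the empty SoftReference objects accumulate. A fuller version would presumably purge them, for example with a ReferenceQueue.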

Weak references (WeakReference)
A typical use of the WeakReference class is canonicalized mapping. Weak references are also useful for objects that are relatively long-lived and cheap to recreate. The key point is that when the garbage collector encounters a weakly reachable object, it releases the object referenced by the WeakReference. Note, however, that the collector may need to run several times before it finds and frees a weakly reachable object.
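A canonicalized mapping can be sketched with `WeakHashMap` and `WeakReference`: equal values are mapped to one shared instance, and the pool does not keep that instance alive once nothing else uses it. `Canonicalizer` and `canonical` are hypothetical names chosen for this illustration.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical pool that returns one shared instance per distinct value. */
final class Canonicalizer<T> {
    // Keys are held weakly by WeakHashMap; the value is wrapped in a WeakReference as well,
    // because a strong value referring to its own key would keep the entry alive forever.
    private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

    synchronized T canonical(T candidate) {
        WeakReference<T> ref = pool.get(candidate);
        T existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;                         // reuse the instance already in the pool
        }
        pool.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }
}

final class CanonicalizerDemo {
    public static void main(String[] args) {
        Canonicalizer<String> names = new Canonicalizer<>();
        String first = names.canonical(new String("leak"));
        String second = names.canonical(new String("leak"));
        // first is strongly referenced here, so the pooled instance is still available.
        System.out.println(first == second); // prints true
    }
}
```

When the last strong reference to a canonical instance disappears, the entry becomes weakly reachable and the GC is free to clear it on some later run.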

Phantom references (PhantomReference)
The PhantomReference class can only be used to track the impending collection of the referenced object. It can also be used to perform cleanup operations tied to that collection. A PhantomReference must be used together with a ReferenceQueue, which serves as the notification mechanism.
When the garbage collector determines that an object is phantom reachable, it places the PhantomReference on its ReferenceQueue. That enqueueing is the notification that the referenced object has reached the end of its life and is ready for collection.
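The notification mechanism can be sketched as follows. `PhantomDemo` and `track` are hypothetical names; the snippet relies on `System.gc()`, which is only a request, so whether the reference is enqueued within the timeout is not guaranteed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomDemo {
    /** Registers a phantom reference for the referent on the given queue. */
    static PhantomReference<Object> track(Object referent, ReferenceQueue<Object> queue) {
        return new PhantomReference<>(referent, queue);
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> ref = track(resource, queue);

        // A phantom reference never hands out its referent.
        System.out.println(ref.get()); // prints null

        resource = null;   // drop the only strong reference
        System.gc();       // a hint only; the JVM may ignore it

        // Wait up to one second for the collector's notification.
        Reference<?> enqueued = queue.remove(1000);
        System.out.println(enqueued == ref
                ? "referent is being reclaimed; run cleanup now"
                : "no notification within the timeout");
    }
}
```

Because `get()` always returns null, cleanup code driven by the queue has to keep whatever state it needs (a file handle, a native pointer, and so on) somewhere other than in the referent itself.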
