How Weak References Work in WeakHashMap and ThreadLocal Memory Leaks

This article is original. If you quote it, please cite the source.

The WeakHashMap memory leak problem

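Netty's DefaultChannelPipeline uses a WeakHashMap as a cache. In fact, the design of WeakHashMap is very similar to that of ThreadLocal. ThreadLocal, however, ships its own implementation (ThreadLocalMap) instead of using WeakHashMap directly, and ThreadLocal is known to have a memory leak problem, whereas memory leaks in WeakHashMap are rarely discussed online. Below is a code sample about a WeakHashMap memory leak, taken from the article "对WeakHashMap的使用不慎导致内存溢出分析" (an analysis of an out-of-memory error caused by careless use of WeakHashMap).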

import java.util.WeakHashMap;


public class WeekTest {

    public static class Locker {
        private static WeakHashMap<String, Locker> lockerMap = new WeakHashMap<String, Locker>();
        private final String id;
        private Locker(String id) {
            this.id= id;
        }
        public synchronized static Locker acquire(String id) {
            Locker locker = lockerMap.get(id);
            if (locker == null) {
                locker = new Locker(id);
                //lockerMap.put(id, locker);  // problematic code: it makes entry.key == entry.value.id
                lockerMap.put(new String(id), locker);  // one possible fix: the key in the WeakHashMap is no longer referenced, directly or indirectly, by the value
            }
            return locker;
        }
        public String getId() {
            return this.id;
        }
        public static int getSize() {
            return lockerMap.size();
        }
    }


    public static void main(String[] args) {
        for (int i = 0; i < 10000000; i++) {
            Locker.acquire("abc" + i);
            if (i % 10000 == 0) {
                System.gc();
                System.out.println(Locker.getSize());    // print the size of the map after garbage collection
            }
        }

    }
}
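
The difference between the two put calls in Locker.acquire can be isolated in a smaller sketch. The class KeyHeldByValueDemo, its Holder type, and the survivors method are hypothetical names introduced only for this illustration. System.gc() is merely a request to the JVM, so the sketch retries a few times and the exact numbers may vary between JVMs.

```java
import java.util.WeakHashMap;

public class KeyHeldByValueDemo {

    /** Hypothetical value type that keeps a strong reference to its id string. */
    static final class Holder {
        final String id;
        Holder(String id) { this.id = id; }
    }

    /**
     * Puts count entries into a WeakHashMap, requests garbage collection,
     * and returns how many entries are still in the map.
     * When shareKey is true the key is the same String object the value holds.
     */
    static int survivors(int count, boolean shareKey) {
        WeakHashMap<String, Holder> map = new WeakHashMap<>();
        for (int i = 0; i < count; i++) {
            String id = "abc" + i; // built at run time, so a fresh String object
            map.put(shareKey ? id : new String(id), new Holder(id));
        }
        // System.gc() is only a request and enqueueing is asynchronous, so retry briefly.
        // size() also expunges entries whose keys have been cleared.
        for (int attempt = 0; attempt < 50 && map.size() > 0; attempt++) {
            System.gc();
            pause(20);
        }
        return map.size();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("key shared with value: " + survivors(1000, true));
        System.out.println("key copied:            " + survivors(1000, false));
    }
}
```

When the key is shared, every entry's value keeps its own key strongly reachable, so no entry can ever be expunged. When the key is a separate copy, nothing but the map's weak reference points at it.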

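Readers can run the code above themselves, once with each of the two put calls. Comparing the printed sizes makes the leak easy to see: with the problematic line the map keeps growing, because every value (Locker) strongly references its own key through id, so no key ever becomes weakly reachable; with new String(id) the size drops back after garbage collection. This raises several questions. An object that is reachable only through weak references becomes eligible for collection. ThreadLocalMap and WeakHashMap are both built on this property, and their principles and internal data structures are very similar. The Entry class that ThreadLocalMap uses for its elements is defined as follows: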

/**
 * The entries in this hash map extend WeakReference, using
 * its main ref field as the key (which is always a
 * ThreadLocal object).  Note that null keys (i.e. entry.get()
 * == null) mean that the key is no longer referenced, so the
 * entry can be expunged from table.  Such entries are referred to
 * as "stale entries" in the code that follows.
 */
static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;
    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}

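Below is the corresponding source from WeakHashMap. Comparing only the data structures, the two inner classes look very much alike. Yet ThreadLocal is prone to memory leaks, while WeakHashMap rarely leaks unless it is misused, that is, unless a key is referenced directly or indirectly by its own value. Used correctly, once a WeakHashMap key is no longer strongly referenced, the collector can reclaim the memory of the whole Entry. With ThreadLocal, the collector reclaims only the key (the ThreadLocal); the value stays reachable until the owning thread ends, remove() is called, or a later operation on the same ThreadLocalMap happens to expunge the stale entry.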

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
    V value;
    final int hash;
    Entry<K,V> next;
    /**
     * Creates new entry.
     */
    Entry(Object key, V value,
          ReferenceQueue<Object> queue,
          int hash, Entry<K,V> next) {
        super(key, queue);
        this.value = value;
        this.hash  = hash;
        this.next  = next;
    }
   
    // remaining members omitted
}

How weak references work

To understand why, we need to look at three classes: WeakReference, Reference, and ReferenceQueue. WeakReference extends Reference:

public class WeakReference<T> extends Reference<T> {

    /**
     * Creates a new weak reference that refers to the given object.  The new
     * reference is not registered with any queue.
     *
     * @param referent object the new weak reference will refer to
     */
    public WeakReference(T referent) {
        super(referent);
    }

    /**
     * Creates a new weak reference that refers to the given object and is
     * registered with the given queue.
     *
     * @param referent object the new weak reference will refer to
     * @param q the queue with which the reference is to be registered,
     *          or <tt>null</tt> if registration is not required
     */
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }

}

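The WeakReference code is very simple. It only provides two constructors; the main logic lives in Reference. The implementation of Reference is somewhat obscure, but the principle is simple, and we will explain it with an example. To hold a string through a weak reference, we can write: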

String String = new String("Random"); // strong reference
ReferenceQueue<String> StringQ = new ReferenceQueue<String>();// reference queue
WeakReference<String> StringWRefernce = new WeakReference<String>(String, StringQ);// weak reference

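The constructor takes two important parameters. The first is the referent, here new String("Random"). The garbage collector tracks the reachability of this object and uses it to decide when the Reference should be put on a ReferenceQueue; that queue is the second parameter. The Reference class defines four internal states for itself: Active, Pending, Enqueued, and Inactive. A newly created Reference is Active. When the collector detects that the referent's reachability has changed, it moves the Reference to Pending (this part is implemented inside the JVM) and adds it to a pending list (implemented as a linked list in the source).

The Reference class also starts a daemon thread, ReferenceHandler, which moves Reference objects from the pending list to the ReferenceQueue they were registered with. A Reference found in a ReferenceQueue therefore signals that its referent is no longer reachable and is reclaimed automatically; the Reference object itself stays alive for as long as the program holds on to it. A ReferenceQueue can be viewed as a linked list of Reference objects. The sample code below is adapted from "Reference Queues":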


import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.lang.reflect.Field;

public class WeekTest {

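    // Reads a private field by reflection and reports whether it is non-null.
    // Note: on recent JDKs with strong encapsulation, setAccessible may be rejected for
    // java.lang.ref.ReferenceQueue unless the package is opened, e.g. with
    // --add-opens java.base/java.lang.ref=ALL-UNNAMED.
    public static <T>boolean checkNotNull(T t,String fieldName){
        try {
            Field field=t.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            Object result=field.get(t);
            return result!=null;
        } catch (NoSuchFieldException e) {
            System.out.println("No such field");
            e.printStackTrace();
        } catch (IllegalAccessException e) {
            System.out.println("Cannot access the field");
            e.printStackTrace();
        }
        return false;
    }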


    public static void main(String[] args) throws InterruptedException {


        String String = new String("Random"); // a strong object

        ReferenceQueue<String> StringQ = new ReferenceQueue<String>();// the ReferenceQueue
        WeakReference<String> StringWRefernce = new WeakReference<String>(String, StringQ);

        System.out.println("String created as a weak ref " + StringWRefernce);
        System.out.println("Is refQ head not null?"+checkNotNull(StringQ,"head"));
        Runtime.getRuntime().gc();
        System.out.println("Any weak references in Q ? " + (StringQ.poll() != null));
        String = null; // the only strong reference has been removed. The heap
        // object is now only weakly reachable

        System.out.println("Now to call gc...");
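        Runtime.getRuntime().gc(); // requests a collection; the weak reference is normally cleared here
        System.out.println("Is refQ head not null?"+checkNotNull(StringQ,"head"));
        // remove() is used here instead of poll() because poll() does not block. Marking the object unreachable is done by the GC,
        // while enqueueing is done by the ReferenceHandler thread, so there is a delay between the two. remove() blocks until a reference is available.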
        System.out.println("Any weak references in Q ? " + (StringQ.remove() != null));
        System.out.println("Does the weak reference still hold the heap object ? " + (StringWRefernce.get() != null));
        System.out.println("Is the weak reference added to the ReferenceQ  ? " + (StringWRefernce.isEnqueued()));

    }
}
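
The listing above peeks at the private head field of ReferenceQueue, which newer JDKs may refuse. As an alternative, here is a minimal sketch that observes the same sequence using only public API. The class name QueueWithoutReflection and its Result type are hypothetical. The timed remove(long) is used so that the program cannot block forever if the JVM ignores the GC request.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class QueueWithoutReflection {

    // A field (rather than a local variable) keeps the referent unambiguously reachable.
    private static Object strong;

    /** Observations collected by run(). */
    static final class Result {
        boolean enqueuedWhileStronglyReachable;
        boolean sameReferenceDequeued;
        boolean referentCleared;
    }

    static Result run() {
        Result result = new Result();
        strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(strong, queue);

        // While a strong reference exists, GC must not clear or enqueue the weak reference.
        System.gc();
        result.enqueuedWhileStronglyReachable = queue.poll() != null;

        strong = null; // the referent is now only weakly reachable
        Reference<?> dequeued = null;
        try {
            // Clearing (GC) and enqueueing (ReferenceHandler) are separate steps,
            // so wait on the queue with a timeout and retry a few times.
            for (int attempt = 0; attempt < 50 && dequeued == null; attempt++) {
                System.gc();
                dequeued = queue.remove(100);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        result.sameReferenceDequeued = dequeued == ref;
        result.referentCleared = ref.get() == null;
        return result;
    }

    public static void main(String[] args) {
        Result r = run();
        System.out.println("Enqueued while strongly reachable? " + r.enqueuedWhileStronglyReachable);
        System.out.println("Same reference came out of the queue? " + r.sameReferenceDequeued);
        System.out.println("Referent cleared? " + r.referentCleared);
    }
}
```

Note that what comes out of the queue is the WeakReference object itself, not the referent: by the time it is dequeued, get() already returns null.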

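Now back to ThreadLocal. The inner class Entry of ThreadLocalMap extends WeakReference, and the object handed to Reference for tracking is the ThreadLocal; value is an ordinary strong reference inside the Entry. When the last external strong reference to the ThreadLocal is dropped, the collector clears the Entry's referent and the ThreadLocal is reclaimed. Note that this Entry calls super(k) without a ReferenceQueue, so nothing notifies the map that the key is gone. From beginning to end, the only object we asked the JVM to track and reclaim is the ThreadLocal; the value in the Entry is never touched. So as long as the thread keeps running, the chain Thread -> ThreadLocalMap -> Entry(null, value) remains, and the value leaks.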
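
The retained value can be observed with a small sketch. ThreadLocalValueDemo and its methods are hypothetical names for illustration; the value is watched through a separate WeakReference so that the sketch itself does not keep it alive. The result depends on the JVM honoring System.gc(), so treat the output as indicative rather than guaranteed.

```java
import java.lang.ref.WeakReference;

public class ThreadLocalValueDemo {

    /**
     * Runs on a dedicated thread: sets a value, optionally calls remove(), drops the
     * ThreadLocal, and reports whether the value was collected while the thread is alive.
     */
    static boolean valueCollected(boolean callRemove) {
        boolean[] collected = new boolean[1];
        Thread worker = new Thread(() -> {
            ThreadLocal<byte[]> local = new ThreadLocal<>();
            local.set(new byte[1024 * 1024]);
            // Watch the value through a weak reference; no local variable holds it strongly.
            WeakReference<byte[]> watcher = new WeakReference<>(local.get());
            if (callRemove) {
                local.remove(); // removes the Entry, and with it the strong reference to the value
            }
            local = null; // the ThreadLocal key is now at most weakly reachable
            collected[0] = clearedAfterGc(watcher);
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return collected[0];
    }

    /** Requests GC a few times and reports whether the watched object was cleared. */
    private static boolean clearedAfterGc(WeakReference<?> watcher) {
        for (int attempt = 0; attempt < 20; attempt++) {
            System.gc();
            if (watcher.get() == null) {
                return true;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("value collected without remove(): " + valueCollected(false));
        System.out.println("value collected with remove():    " + valueCollected(true));
    }
}
```

This is why the usual advice is to call remove() in a finally block, especially on pooled threads that never terminate.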

Why can WeakHashMap reclaim the memory of the value? The answer is simple: WeakHashMap implements a dedicated method for it.

private void expungeStaleEntries() {
    for (Object x; (x = queue.poll()) != null; ) {
        synchronized (queue) {
            @SuppressWarnings("unchecked")
                Entry<K,V> e = (Entry<K,V>) x;
            int i = indexFor(e.hash, table.length);
            Entry<K,V> prev = table[i];
            Entry<K,V> p = prev;
            while (p != null) {
                Entry<K,V> next = p.next;
                if (p == e) {
                    if (prev == e)
                        table[i] = next;
                    else
                        prev.next = next;
                    // Must not null out e.next;
                    // stale entries may be in use by a HashIterator
                    e.value = null; // Help GC
                    size--;
                    break;
                }
                prev = p;
                p = next;
            }
        }
    }
}

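WeakHashMap actively polls its ReferenceQueue. Every Entry that comes out of the queue is one whose key has been cleared by the collector. The map unlinks that Entry from its bucket and sets e.value = null, so the value can be reclaimed as well. The method runs as part of ordinary map operations such as get(), put(), and size().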
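
One consequence is that the value is released only when the map is touched again. The following sketch illustrates this under the assumption that the JVM honors System.gc(); ExpungeOnAccessDemo and its Result type are hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

public class ExpungeOnAccessDemo {

    /** Observations collected by run(). */
    static final class Result {
        boolean valueHeldBeforeAccess;
        int sizeAfterAccess;
        boolean valueReleasedAfterAccess;
    }

    static Result run() {
        Result result = new Result();
        WeakHashMap<Object, byte[]> map = new WeakHashMap<>();
        byte[] value = new byte[1024 * 1024];
        WeakReference<byte[]> watcher = new WeakReference<>(value);
        map.put(new Object(), value); // the key has no strong reference anywhere
        value = null;                 // only the map's Entry still holds the value strongly

        // Phase 1: GC may clear the key, but nobody has touched the map yet,
        // so the stale Entry still references the value.
        result.valueHeldBeforeAccess = !clearedAfterGc(watcher, 10);

        // Phase 2: size() triggers expungeStaleEntries(), which drops the Entry.
        for (int attempt = 0; attempt < 50; attempt++) {
            System.gc();
            result.sizeAfterAccess = map.size();
            if (result.sizeAfterAccess == 0) {
                break;
            }
            pause(20);
        }
        result.valueReleasedAfterAccess = clearedAfterGc(watcher, 50);
        return result;
    }

    private static boolean clearedAfterGc(WeakReference<?> watcher, int attempts) {
        for (int attempt = 0; attempt < attempts; attempt++) {
            System.gc();
            if (watcher.get() == null) {
                return true;
            }
            pause(20);
        }
        return false;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Result r = run();
        System.out.println("value held before map access:    " + r.valueHeldBeforeAccess);
        System.out.println("map size after access:           " + r.sizeAfterAccess);
        System.out.println("value released after map access: " + r.valueReleasedAfterAccess);
    }
}
```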

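That is how weak references work. If you are interested, read the JDK source alongside this article. If you find a mistake here, corrections are welcome.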

Supplement

Finally, here is an example of using ReferenceQueue:

ReferenceQueue<Foo> fooQueue = new ReferenceQueue<Foo>();

class ReferenceWithCleanup extends WeakReference<Foo> {
  Bar bar;
  ReferenceWithCleanup(Foo foo, Bar bar) {
    super(foo, fooQueue);
    this.bar = bar;
  }
  public void cleanUp() {
    bar.cleanUp();
  }
}

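public Thread cleanupThread = new Thread() {
  public void run() {
    while(true) {
      try {
        // remove() blocks until a reference has been enqueued
        ReferenceWithCleanup ref = (ReferenceWithCleanup)fooQueue.remove();
        ref.cleanUp();
      } catch (InterruptedException e) {
        return; // stop the cleanup loop when interrupted
      }
    }
  }
};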

public void doStuff() {
  cleanupThread.start();
  Foo foo = new Foo();
  Bar bar = new Bar();
  ReferenceWithCleanup ref = new ReferenceWithCleanup(foo, bar);
  ... // From now on, once you release all non-weak references to foo,
      // then at some indeterminate point in the future, bar.cleanUp() will
      // be run. You can force it by calling ref.enqueue().
}
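
The fragment above leaves Foo and Bar undefined, so here is a self-contained sketch of the same pattern that can be run as is. All names in it (CleanupQueueDemo, CleanupRef, LIVE, cleanupRunsAfterRelease) are hypothetical. It also keeps the reference objects in a set, on the assumption that the caller may not hold them: a Reference that itself becomes unreachable is never enqueued.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CleanupQueueDemo {

    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Keeps the reference objects strongly reachable until their cleanup has run.
    static final Set<CleanupRef> LIVE = ConcurrentHashMap.newKeySet();

    /** A weak reference that carries a cleanup action; the action must not reference the referent. */
    static final class CleanupRef extends WeakReference<Object> {
        private final Runnable action;

        CleanupRef(Object referent, Runnable action) {
            super(referent, QUEUE);
            this.action = action;
            LIVE.add(this);
        }

        void cleanUp() {
            LIVE.remove(this);
            action.run();
        }
    }

    /** Registers an object with no strong reference and waits for its cleanup action to run. */
    static boolean cleanupRunsAfterRelease() {
        CountDownLatch cleaned = new CountDownLatch(1);
        Thread cleaner = new Thread(() -> {
            try {
                while (true) {
                    // remove() blocks until the ReferenceHandler enqueues a reference
                    ((CleanupRef) QUEUE.remove()).cleanUp();
                }
            } catch (InterruptedException e) {
                // interrupted: leave the loop and let the thread end
            }
        });
        cleaner.setDaemon(true);
        cleaner.start();

        new CleanupRef(new Object(), cleaned::countDown);
        try {
            for (int attempt = 0; attempt < 50; attempt++) {
                System.gc(); // only a request, hence the retry loop
                if (cleaned.await(100, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            cleaner.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup ran: " + cleanupRunsAfterRelease());
    }
}
```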