In-depth analysis of ThreadLocal implementation principle and memory leak problem

I. Overview

In the 2017 JD.com (Jingdong) campus recruitment written exam, I was asked to describe how ThreadLocal is implemented and how it can lead to memory leaks. I had read about the implementation of ThreadLocal before, but many articles on the Internet are confusing, and quite a few of them conflate ThreadLocal with thread synchronization mechanisms. Note in particular that ThreadLocal has nothing to do with thread synchronization and is not meant to solve the problem of sharing variables between threads!
The official Javadoc for ThreadLocal explains:

  This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its {@code get} or {@code set} method) has its own, independently initialized copy of the variable. {@code ThreadLocal} instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).

In other words: the ThreadLocal class provides variables that are local to a thread. When such a variable is accessed from multiple threads (through its get or set method), each thread works with its own copy, independent of the copies held by other threads. ThreadLocal instances are usually declared as private static fields.
Summary: ThreadLocal is not designed to coordinate multi-threaded access to shared variables. Instead, it gives each thread a separate copy of the variable, which provides a way to hold per-thread state and avoids the complexity of passing it around as a parameter.

The main application scenario for ThreadLocal is an object that needs one instance per thread and is used in many places. For example, when users log in to the same website, the server typically handles each user's request on its own thread, and a ThreadLocal can hold that user's basic information for the thread. As the user moves between pages, user information is frequently displayed or looked up. With ThreadLocal, the current thread can obtain the data it needs right away, and the threads do not interfere with one another.
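
The per-thread independence described above can be illustrated with a small sketch. The class name PerThreadCopyDemo, the field CURRENT_USER, and the user names are hypothetical and chosen only for illustration. Each worker thread stores a value and reads back only its own, while the calling thread, which never calls set(), sees the default null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; all names here are illustrative only.
class PerThreadCopyDemo {
    // Typical usage: a private static ThreadLocal field.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Starts one thread per user name. Each thread stores its own value and
    // reads it back. Returns what each thread observed, keyed by thread name.
    static Map<String, String> run(String... users) {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[users.length];
        for (int i = 0; i < users.length; i++) {
            final String user = users[i];
            threads[i] = new Thread(() -> {
                CURRENT_USER.set(user); // visible to this thread only
                seen.put(Thread.currentThread().getName(), CURRENT_USER.get());
                CURRENT_USER.remove();
            }, "worker-" + i);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // The calling thread never called set(), so get() yields null here.
        seen.put("caller", String.valueOf(CURRENT_USER.get()));
        CURRENT_USER.remove();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run("alice", "bob"));
    }
}
```

No locking is needed around CURRENT_USER, because no value is ever shared between threads. The ConcurrentHashMap is used only to collect the observations.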

II. Implementation principle

ThreadLocal can be seen as a container that stores variables belonging to the current thread. The ThreadLocal<T> class provides four basic methods, three public and one protected, through which users operate on it:
(1) public void set(T value) sets the current thread's value of the thread-local variable.
(2) public T get() returns the current thread's value of the thread-local variable.
(3) public void remove() deletes the current thread's value of the thread-local variable in order to reduce memory usage. This method was added in JDK 5.0. When a thread terminates, its thread-local values become eligible for garbage collection, so for short-lived threads an explicit call is not strictly required, although it speeds up memory reclamation. For long-lived threads, such as those in a thread pool, calling remove() matters, as the memory leak section below explains.
(4) protected T initialValue() returns the initial value of the thread-local variable for the current thread. It is protected because it is designed to be overridden by subclasses. It is called lazily: it runs the first time a thread calls get(), unless that thread has already called set(T), in which case it is not called at all. It normally runs at most once per thread, but it runs again if the thread calls remove() and then get(). The default implementation in ThreadLocal simply returns null.
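
The lazy behavior of initialValue() can be checked with a short sketch. The class InitialValueDemo and its helper counting(...) are hypothetical names introduced for illustration. The counter records how often initialValue() actually runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative only.
class InitialValueDemo {
    // Builds a ThreadLocal whose initialValue() bumps the given counter.
    static ThreadLocal<String> counting(AtomicInteger calls) {
        return new ThreadLocal<String>() {
            @Override
            protected String initialValue() {
                calls.incrementAndGet();
                return "default";
            }
        };
    }

    // Case 1: get() comes first. Each log line is "value/callCount".
    static List<String> getFirst() {
        AtomicInteger calls = new AtomicInteger();
        ThreadLocal<String> local = counting(calls);
        List<String> log = new ArrayList<>();
        log.add("created/" + calls.get());        // nothing has run yet
        log.add(local.get() + "/" + calls.get()); // first get() initializes
        log.add(local.get() + "/" + calls.get()); // second get() reuses the value
        local.remove();
        log.add(local.get() + "/" + calls.get()); // get() after remove() initializes again
        local.remove();
        return log;
    }

    // Case 2: set() comes before any get(), so initialValue() never runs.
    static String setFirst() {
        AtomicInteger calls = new AtomicInteger();
        ThreadLocal<String> local = counting(calls);
        local.set("custom");
        String result = local.get() + "/" + calls.get();
        local.remove();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(getFirst()); // [created/0, default/1, default/1, default/2]
        System.out.println(setFirst()); // custom/0
    }
}
```

Since Java 8, ThreadLocal.withInitial(supplier) offers the same lazy initialization without writing a subclass.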

With these methods you can read, set, initialize, and delete thread-local values. But how does ThreadLocal maintain a separate copy of the variable for each thread?

In fact, the ThreadLocal class contains a static nested class, ThreadLocalMap (similar to a Map), and each Thread object holds its own ThreadLocalMap instance in its threadLocals field. This map stores the thread's variable copies as key-value pairs: the key of each entry is a ThreadLocal object, and the value is the thread's copy of the corresponding variable. A thread may use several ThreadLocals, so its map may contain several entries.
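
To make this ownership structure concrete, here is a toy model. It is not the JDK implementation: ToyThread, ToyThreadLocal, and ToyDemo are hypothetical classes, and a plain HashMap stands in for ThreadLocalMap, which in the JDK uses weakly referenced keys and its own Entry array. The sketch only shows that the map lives in the thread and that the thread-local object serves as the key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only, NOT the JDK code: it mirrors the ownership structure
// (thread -> map -> value, keyed by the thread-local object).
class ToyThread extends Thread {
    // Plays the role of Thread.threadLocals; created lazily on first set().
    Map<ToyThreadLocal<?>, Object> locals;

    ToyThread(Runnable task) {
        super(task);
    }
}

// Works only on ToyThread instances (the cast fails on other threads).
class ToyThreadLocal<T> {
    @SuppressWarnings("unchecked")
    T get() {
        ToyThread t = (ToyThread) Thread.currentThread();
        return t.locals == null ? null : (T) t.locals.get(this);
    }

    void set(T value) {
        ToyThread t = (ToyThread) Thread.currentThread();
        if (t.locals == null) {
            t.locals = new HashMap<>(); // first use on this thread
        }
        t.locals.put(this, value);      // the key is this ToyThreadLocal
    }
}

class ToyDemo {
    // Thread a sets both variables; thread b sets nothing and sees nulls.
    static String[] run() {
        ToyThreadLocal<String> name = new ToyThreadLocal<>();
        ToyThreadLocal<Integer> count = new ToyThreadLocal<>();
        String[] result = new String[2];
        ToyThread a = new ToyThread(() -> {
            name.set("a");
            count.set(1);
            result[0] = name.get() + ":" + count.get();
        });
        ToyThread b = new ToyThread(() ->
            result[1] = name.get() + ":" + count.get());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " " + r[1]); // a:1 null:null
    }
}
```

Because each map is only ever touched by the thread that owns it, the toy needs no synchronization, which is the same reason the real lookup needs none.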

Source code:

/**
 * Returns the value in the current thread's copy of this
 * thread-local variable.  If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    Thread t = Thread.currentThread(); // the current thread
    ThreadLocalMap map = getMap(t); // the ThreadLocalMap that belongs to the current thread
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this); // the entry stored for this ThreadLocal
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    // The current thread has no ThreadLocalMap (or no entry) yet: setInitialValue()
    // stores the initial value, calling createMap if necessary, and returns it.
    return setInitialValue();
}

// Sets the value of the variable for the current thread
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}

/**
 * Creates the ThreadLocalMap (the threadLocals field) for the current
 * thread and stores the first value in that map.
 *
 * @param t the current thread
 * @param firstValue value for the initial entry of the map
 */
void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

// Removes this ThreadLocal's entry from the current thread's ThreadLocalMap
public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        m.remove(this);
}

These are the main methods of the ThreadLocal class. At their core, they all operate on its nested class ThreadLocalMap, so let's look at the source code of that class:

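static class ThreadLocalMap {
    // Each node in the map is an Entry whose key is a ThreadLocal held through a
    // weak reference; this is what leads to the memory leak problem discussed later.
    static class Entry extends WeakReference<ThreadLocal<?>> {
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

    /**
     * The initial capacity is 16; it must be a power of two, as must any expanded capacity.
     */
    private static final int INITIAL_CAPACITY = 16;

    /**
     * The array that actually stores the thread's ThreadLocals; each ThreadLocal
     * and its corresponding value are wrapped in an Entry.
     */
    private Entry[] table;

    // ... the other methods and operations are similar to those of a Map
}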
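
The consequence of the weak key combined with the strong value field can be sketched as follows. WeakEntryDemo and its Entry class are hypothetical and merely mimic the shape of ThreadLocalMap.Entry. Since garbage collection timing is not deterministic, the sketch calls Reference.clear() to stand in for what the collector does once the key is only weakly reachable; this is an assumption made to keep the example reproducible.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; Entry only mimics the shape of ThreadLocalMap.Entry.
class WeakEntryDemo {
    // Weakly referenced key, strongly referenced value.
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object v) {
            super(key);
            value = v;
        }
    }

    // Returns {key visible before clearing, key gone after, value still held}.
    static boolean[] run() {
        Object key = new Object();
        Entry e = new Entry(key, new byte[1024]);
        boolean keyPresentBefore = e.get() == key;

        // Stand-in for the GC: a real collector clears the referent only after
        // no strong reference to the key remains.
        e.clear();

        boolean keyGone = e.get() == null;
        boolean valueStillHeld = e.value != null; // the value field is a strong reference
        return new boolean[] {keyPresentBefore, keyGone, valueStillHeld};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true true true
    }
}
```

After the key is cleared, the entry can no longer be looked up through its ThreadLocal, yet its value remains strongly reachable for as long as the entry itself is.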

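In short, a separate ThreadLocalMap is created for each thread, with the thread itself as the distinguishing point, so the threads are not connected to one another at all. Although we speak of storing a "copy" of the variable, it can really be understood as creating a separate object with new for each thread.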

III. Memory leak problem (with reference to other blog posts)

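  As mentioned above, every thread holds a map of type ThreadLocal.ThreadLocalMap, and the keys in the map are ThreadLocal instances. The map does use weak references, but only for the keys: each key weakly references its ThreadLocal. Once the ThreadLocal reference is set to null and no strong reference to the ThreadLocal instance remains, the instance will be reclaimed by the GC. The value, however, cannot be reclaimed, because a chain of strong references still leads to it from the current thread (Thread -> ThreadLocalMap -> Entry -> value). Only after the current thread terminates is this chain broken, and then the current thread, the map, and the value are all reclaimed by the GC.
  The conclusion is that as long as the thread object is eventually reclaimed by the GC, no permanent memory leak occurs. The value is not reclaimed, however, between the moment the ThreadLocal is set to null and the moment the thread ends, and that interval is what we regard as a memory leak. This is really a difference in how the concept is understood and is not worth arguing about. The worst case is when the thread object is never reclaimed, which is a memory leak in the true sense. With a thread pool, for example, threads are not destroyed when their tasks finish but are reused, so a memory leak may occur.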
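
The thread pool case can be reproduced with a small sketch. PoolLeakDemo, CONTEXT, and the "request-N" values are hypothetical names used for illustration. A single-thread executor is assumed so that every task is guaranteed to run on the same reused worker thread. A value that is not removed survives into the next task, whereas calling remove() in a finally block clears it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; all names here are illustrative only.
class PoolLeakDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs four tasks on one pooled thread and returns what each one saw.
    static String[] run() {
        // One worker thread, so all tasks share the same ThreadLocalMap.
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Sets a value and forgets to remove it.
        Callable<String> leaky = () -> {
            CONTEXT.set("request-1");
            return CONTEXT.get();
        };
        // Only reads whatever the worker thread currently holds.
        Callable<String> reader = () -> CONTEXT.get();
        // Sets a value and always cleans up afterwards.
        Callable<String> careful = () -> {
            CONTEXT.set("request-2");
            try {
                return CONTEXT.get();
            } finally {
                CONTEXT.remove();
            }
        };

        try {
            String[] seen = new String[4];
            seen[0] = pool.submit(leaky).get();   // "request-1"
            seen[1] = pool.submit(reader).get();  // stale "request-1" from the previous task
            seen[2] = pool.submit(careful).get(); // "request-2"
            seen[3] = pool.submit(reader).get();  // null, because remove() was called
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", java.util.Arrays.toString(run())));
    }
}
```

A common precaution, therefore, is to pair every set() with a remove() in a finally block whenever the code may run on pooled threads. This avoids both the retained memory and the stale data seen by the second task.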
