[Great ThreadLocal] 1. Source code analysis

I. Introduction

The JDK contains some unremarkable-looking classes that turn out to be remarkably powerful, and ThreadLocal is one of them. It has been part of the JDK since version 1.2, which makes it something of a veteran. Starting with this article, I plan to cover ThreadLocal in three parts: source code analysis, how open source projects use ThreadLocal, and its most valuable application, distributed link tracing. I am announcing the series up front as a commitment and will fill in the remaining parts later.


II. ThreadLocal data model

Any analysis of ThreadLocal revolves around the iron triangle of Thread, ThreadLocalMap, and ThreadLocalMap.Entry. Sorting out how these three relate to each other is the key to understanding ThreadLocal. A picture is worth a thousand words, so see the figure below.

  • Each Thread object holds a threadLocals field of type ThreadLocalMap, which is a static nested class of ThreadLocal;
  • ThreadLocalMap internally maintains a ThreadLocalMap.Entry[] array, where each Entry's key is a ThreadLocal and its value is the object set by the client;
  • An Entry's key refers to a ThreadLocal object through a special kind of reference: a weak reference (WeakReference).

Weak reference (WeakReference): when a GC occurs, an object that is reachable only through weak references is reclaimed, regardless of whether memory is running low.
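To make the third point concrete, here is a minimal sketch that mirrors the shape of ThreadLocalMap.Entry as described above: the key is held through a WeakReference, while the value is an ordinary strong field. The class name EntrySketch is made up for illustration and is not the JDK's own class.

```java
import java.lang.ref.WeakReference;

// Hypothetical stand-in for ThreadLocalMap.Entry (illustration only).
class EntrySketch extends WeakReference<ThreadLocal<?>> {
    // The value is held through an ordinary (strong) field.
    Object value;

    EntrySketch(ThreadLocal<?> key, Object value) {
        super(key);          // the entry references its key only weakly
        this.value = value;
    }

    public static void main(String[] args) {
        ThreadLocal<String> key = new ThreadLocal<>();
        EntrySketch entry = new EntrySketch(key, "abc");
        // While 'key' is still strongly reachable here, get() returns it.
        System.out.println(entry.get() == key);   // true
        System.out.println(entry.value);          // abc
    }
}
```

The asymmetry between the weak key and the strong value is what the memory leak discussion later in the article hinges on.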

[Figure: the relationship between Thread, ThreadLocalMap, and ThreadLocalMap.Entry]

Q: Why is there a map structure such as ThreadLocalMap in Thread?

A: This design lets a single thread store values for multiple ThreadLocal objects, as in the following scenario:

private static ThreadLocal<Integer> threadLocal_int = new ThreadLocal<>();

private static ThreadLocal<String> threadLocal_string = new ThreadLocal<>();

// threadLocal_int and threadLocal_string have different threadLocalHashCode values,
// so 123 and "abc" are stored in two separate Entry objects of the same thread's ThreadLocalMap
public void test() {
    threadLocal_int.set(123);
    threadLocal_string.set("abc");
}
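The following sketch extends the scenario above under a few assumptions: the class name TwoLocalsDemo and its helper methods are hypothetical, and the second thread exists only to show that each thread has its own map. Two ThreadLocal objects store independent values in the calling thread, and a fresh thread sees neither of them.

```java
class TwoLocalsDemo {
    private static final ThreadLocal<Integer> INT_LOCAL = new ThreadLocal<>();
    private static final ThreadLocal<String> STRING_LOCAL = new ThreadLocal<>();

    // Stores two values in the calling thread's map, one Entry per ThreadLocal.
    static void setBoth() {
        INT_LOCAL.set(123);
        STRING_LOCAL.set("abc");
    }

    static void clearBoth() {
        INT_LOCAL.remove();
        STRING_LOCAL.remove();
    }

    // Reads both values as seen by whichever thread calls this method.
    static String current() {
        return INT_LOCAL.get() + "/" + STRING_LOCAL.get();
    }

    // Runs current() in a new thread, which has its own (empty) map.
    static String inFreshThread() {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = current());
        t.start();
        try {
            t.join();   // join() also makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        setBoth();
        System.out.println(current());        // 123/abc
        System.out.println(inFreshThread());  // null/null
        clearBoth();
    }
}
```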

III. Memory leaks

Two questions are a good check of whether you have really mastered ThreadLocal:

  • Why is Entry's reference to ThreadLocal designed as a weak reference?
  • Does ThreadLocal have a memory leak problem?

I will not answer them right away. Let's keep both questions in mind and derive the answers from the analysis, so that they leave a deeper impression.

3.1 Would strong references cause memory leaks?

Look at it from the other direction: what would go wrong if the reference were strong? A ThreadLocal object is referenced from two places: (A) ThreadLocal_ref, the reference held by the client, which the client can reassign; and (B) the key of a ThreadLocalMap.Entry held inside the thread, which the client is completely unaware of. Suppose the client releases its reference to the ThreadLocal (for example, by assigning ThreadLocal_ref = null;). Because source B would still hold a strong reference, the GC could not reclaim the ThreadLocal object, so it would be at risk of leaking.

Conclusion: strong references have a risk of memory leaks

3.2 Are weak references free of memory leaks?

The analysis above shows that a strong reference can leak memory. Does a weak reference carry the same risk? Suppose the client releases its strong reference to the ThreadLocal object (by assigning ThreadLocal_ref = null;). When a GC occurs, the ThreadLocal object is reclaimed because Entry.key refers to it only weakly. The Entry's value, however, still holds a strong reference to the stored object. Since Entry.key is now null, the client has no way to locate the Entry, so the value object of the Entry is at risk of leaking.

[Figure: an Entry whose key has been reclaimed while its value is still strongly referenced]

Conclusion: Weak references also have the risk of memory leaks
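The stale-entry situation can be reproduced deterministically with a small sketch. It assumes a hypothetical Entry class shaped like ThreadLocalMap.Entry and calls Reference.clear() by hand to stand in for what the collector would do; it does not rely on an actual GC run.

```java
import java.lang.ref.WeakReference;

class StaleEntryDemo {
    // Hypothetical stand-in for ThreadLocalMap.Entry (illustration only).
    static final class Entry extends WeakReference<ThreadLocal<?>> {
        Object value;

        Entry(ThreadLocal<?> key, Object value) {
            super(key);
            this.value = value;
        }
    }

    // Builds an entry, then clears its weak key manually. This simulates the
    // GC reclaiming the ThreadLocal once no strong reference to it remains.
    static Entry makeStaleEntry() {
        Entry e = new Entry(new ThreadLocal<String>(), new byte[1024]);
        e.clear();
        return e;
    }

    public static void main(String[] args) {
        Entry e = makeStaleEntry();
        System.out.println(e.get() == null);   // true: the key is gone
        System.out.println(e.value != null);   // true: the value is still strongly held
    }
}
```

As long as something keeps the Entry alive (in the real map, the table array of a live thread), the 1 KB value stays on the heap even though no client code can look it up any more.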

PS:
1. In real-world use, the reference to a ThreadLocal is almost never dropped deliberately. The JDK documentation describes ThreadLocal instances as typically being static fields, in which case a strong reference to the ThreadLocal is always maintained;
2. The get, set, and remove methods of ThreadLocalMap all handle the Entry.key == null case: when one of these operations comes across an Entry whose key is null, it clears that Entry's value (this is visible in the source code), which greatly reduces the incidence of memory leaks;
3. If both strong and weak references carry a leak risk, why did the designers choose weak references? The reason is simple: weak references make a leak less likely than strong references do. After all, with a weak key the ThreadLocal object itself can be reclaimed, and the resulting null key lets the map recognize and clean up the stale Entry.

3.3 How to completely avoid memory leaks?

The analysis so far covers the client dropping its reference to the ThreadLocal. What if the client instead drops its strong reference to the Thread object (by assigning Thread_ref = null;)? Once the thread has finished running and nothing references it any longer, the GC reclaims the Thread object. Since the ThreadLocalMap is a field of the Thread, the ThreadLocalMap and its ThreadLocalMap.Entry objects (and any ThreadLocal that is no longer referenced elsewhere) are reclaimed with it, leaving the heap clean. However, threads are a scarce resource, and real applications usually reuse them through thread pools. So avoiding leaks by letting the Thread object be collected is feasible in theory but does not match practice.

Recall the two leak scenarios above. With a strong reference, the leak arises because Entry.key holds a strong reference to the ThreadLocal, so the GC cannot reclaim it. With a weak reference, the leak arises when Entry.key has become null and the Entry's value can no longer be released. In both cases the problem lies in the Entry object. If the Entry itself is removed, does the leak go away? Yes. The JDK designers naturally thought of this and provided the ThreadLocal.remove() method for exactly that purpose.

In summary: the best practice for avoiding memory leaks is to call ThreadLocal.remove() once you are done with the ThreadLocal, which clears the whole Entry, using the following boilerplate code.

threadLocal.set(value);
try {
    // business logic goes here
} finally {
    threadLocal.remove();
}
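To see why the finally block matters when threads are pooled, here is a small sketch. The names PooledThreadDemo, USER, and secondTaskSees are hypothetical. It runs two tasks on the same pooled thread and reports what the second task reads, once without cleanup and once with the try/finally pattern above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Runs two tasks on one pooled thread and returns what the second task sees.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable first = () -> {
            USER.set("alice");
            try {
                // business logic would go here
            } finally {
                if (cleanUp) {
                    USER.remove();   // drop this thread's Entry for USER
                }
            }
        };
        Callable<String> second = USER::get;
        try {
            pool.submit(first).get();           // wait for the first task
            return pool.submit(second).get();   // same worker thread, same map
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // alice (value left over from task 1)
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Without remove(), the value set by the first task is still there when the worker thread picks up the next task, which is both a leak and a correctness hazard.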

IV. Source code analysis

4.1 ThreadLocal source code

With the first figure in mind, reading ThreadLocal should be straightforward. Its commonly used methods are as follows:

  • initialValue()
// Meant to be overridden by the client to supply a custom initial value
protected T initialValue() {
    return null;
}
  • set(T value)
public void set(T value) {
    // Get the current thread
    Thread t = Thread.currentThread();
    // Get the ThreadLocal.ThreadLocalMap held by the current thread
    ThreadLocalMap map = getMap(t);
    if (map != null)
        // Call ThreadLocalMap.set; 'this' is the ThreadLocal object on which set() was invoked
        map.set(this, value);
    else
        // Create a new ThreadLocalMap(this, value) and assign it to t.threadLocals
        createMap(t, value);
}


/**
 * Get the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param  t the current thread
 * @return the map
 */
ThreadLocalMap getMap(Thread t) {
    // threadLocals is the thread's field of type ThreadLocal.ThreadLocalMap
    return t.threadLocals;
}
  • T get()
public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // Call ThreadLocalMap.getEntry
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    // Invoke initialValue(), run the same logic as set(), and return the initial value
    return setInitialValue();
}

/**
 * Variant of set() to establish initialValue. Used instead
 * of set() in case user has overridden the set() method.
 *
 * @return the initial value
 */
private T setInitialValue() {
    // The initial value
    T value = initialValue();
    // The rest is identical to set(), which is exactly what the name setInitialValue conveys
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    // Return the initial value
    return value;
}
  • remove()
public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        // Call ThreadLocalMap.remove
        m.remove(this);
}

Taken on their own, the four ThreadLocal methods are quite clear. ThreadLocal can be seen as a facade class: it contains little logic of its own and delegates the heavy lifting to ThreadLocalMap.
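The interplay of these four methods can be observed from the outside with a short sketch. The class name InitialValueDemo and the trace format are made up for illustration. It counts how often initialValue() runs across a sequence of get(), set(), and remove() calls on the current thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

class InitialValueDemo {
    // Returns "value:initCalls" after each get(), separated by spaces.
    static String trace() {
        AtomicInteger calls = new AtomicInteger();
        ThreadLocal<String> local = new ThreadLocal<String>() {
            @Override
            protected String initialValue() {
                calls.incrementAndGet();
                return "default";
            }
        };
        StringBuilder out = new StringBuilder();
        // First get(): no entry yet, so setInitialValue() calls initialValue().
        out.append(local.get()).append(':').append(calls.get()).append(' ');
        // Second get(): the entry exists, so initialValue() is not called again.
        out.append(local.get()).append(':').append(calls.get()).append(' ');
        local.set("x");
        out.append(local.get()).append(':').append(calls.get()).append(' ');
        // After remove() the entry is gone, so get() initializes once more.
        local.remove();
        out.append(local.get()).append(':').append(calls.get());
        local.remove();   // leave the current thread's map clean
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // default:1 default:1 x:1 default:2
    }
}
```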

4.2 ThreadLocalMap source code

ThreadLocalMap is where the real work happens. For each ThreadLocal method there is a corresponding ThreadLocalMap method: set() -> ThreadLocalMap.set(), get() -> getEntry(ThreadLocal<?> key), remove() -> remove(ThreadLocal<?> key). The source code is as follows:

  • ThreadLocalMap.set(ThreadLocal<?> key, Object value)
private void set(ThreadLocal<?> key, Object value) {

    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.

    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);

    // If an entry for this ThreadLocal key is found, overwrite its value
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            // The entry's key is null (a stale entry), so reuse the slot and clean up
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    // No entry was found for this ThreadLocal key, so create a new Entry
    tab[i] = new Entry(key, value);
    int sz = ++size;
    // Opportunistically clean up some stale entries; this calls expungeStaleEntry internally
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}

// Remove entries whose key is null, explicitly setting Entry.value = null
private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;

    // expunge entry at staleSlot
    tab[staleSlot].value = null;
    tab[staleSlot] = null;
    size--;

    // Rehash until we encounter null
    Entry e;
    int i;
    for (i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        if (k == null) {
            // Clear the value of an entry whose key is null
            e.value = null;
            tab[i] = null;
            size--;
        } else {
            int h = k.threadLocalHashCode & (len - 1);
            if (h != i) {
                tab[i] = null;

                // Unlike Knuth 6.4 Algorithm R, we must scan until
                // null because multiple entries could have been stale.
                while (tab[h] != null)
                    h = nextIndex(h, len);
                tab[h] = e;
            }
        }
    }
    return i;
}
  • Entry getEntry(ThreadLocal<?> key)
/**
 * Get the entry associated with key.  This method
 * itself handles only the fast path: a direct hit of existing
 * key. It otherwise relays to getEntryAfterMiss.  This is
 * designed to maximize performance for direct hits, in part
 * by making this method readily inlinable.
 *
 * @param  key the thread local object
 * @return the entry associated with key, or null if no such
 */
private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
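        // Reached whenever the direct slot misses: the slot is empty, it holds a different key
        // because of a hash collision, or it holds a stale entry whose ThreadLocal was reclaimed
        // by the GC (weak reference) while its value has not been cleaned up yet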
        return getEntryAfterMiss(key, i, e);
}

/**
 * Version of getEntry method for use when key is not found in
 * its direct hash slot.
 *
 * @param  key the thread local object
 * @param  i the table index for key's hash code
 * @param  e the entry at table[i]
 * @return the entry associated with key, or null if no such
 */
private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        // Get the ThreadLocal object the key refers to
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            // The key is null, meaning the ThreadLocal was garbage collected, so the Entry's value must be cleaned up
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}
  • remove(ThreadLocal<?> key)
private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
            e.clear();
            // Clean up stale data
            expungeStaleEntry(i);
            return;
        }
    }
}
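The slot arithmetic used throughout the methods above, key.threadLocalHashCode & (len - 1) followed by nextIndex probing, can be tried in isolation. The sketch below rests on assumptions the article does not state: that successive threadLocalHashCode values are produced by adding the constant 0x61c88647 to a counter, as in the OpenJDK source as far as I recall, and that nextIndex simply wraps around at the end of the table. The class name SlotIndexDemo and the method homeSlots are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

class SlotIndexDemo {
    // Assumption: the increment between successive threadLocalHashCode values.
    private static final int HASH_INCREMENT = 0x61c88647;

    // Wrap-around step used for linear probing (assumed to match nextIndex).
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Home slots of the first 'count' hash codes in a table whose length
    // 'len' is a power of two, computed as hash & (len - 1).
    static int[] homeSlots(int count, int len) {
        AtomicInteger nextHashCode = new AtomicInteger();
        int[] slots = new int[count];
        for (int n = 0; n < count; n++) {
            int hash = nextHashCode.getAndAdd(HASH_INCREMENT);
            slots[n] = hash & (len - 1);
        }
        return slots;
    }

    public static void main(String[] args) {
        // [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
        System.out.println(Arrays.toString(homeSlots(16, 16)));
        System.out.println(nextIndex(15, 16)); // 0: probing wraps to the start
    }
}
```

Because the increment is odd, 16 consecutive hash codes land in 16 different slots of a 16-slot table, so collisions (and with them the probing loops in set, getEntryAfterMiss, and remove) stay rare when only a few ThreadLocal objects are in use.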

4.3 Summary

Finally, one last figure summarizes the objects and call chains involved in ThreadLocal's set, get, and remove methods.

[Figure: objects and call chains of ThreadLocal's set, get, and remove methods]

In conclusion:

  • Whether Entry holds its key through a strong or a weak reference, there is a risk of memory leaks. With the weak reference the JDK actually uses, the risk is small in practice, because the set, get, and remove paths also clean up the stale entries they come across.
  • The best practice is to call remove() explicitly once you are done with a ThreadLocal, which eliminates the leak entirely.

End of article.

Origin blog.csdn.net/caiguoxiong0101/article/details/105355115