An in-depth analysis of the ThreadLocal source code

1. What is ThreadLocal

ThreadLocal provides thread-local variables: each thread that accesses one gets its own private copy. Copies in different threads are isolated from each other and cannot be shared, as if the variable had been duplicated once per thread.

Its purpose is to keep data thread-safe in a multi-threaded environment without locking.

2. The use of ThreadLocal

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal
    static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        // 2. Assign a value to the ThreadLocal
        threadLocal.set("Follow the official account: 一灯架构");
        // 3. Read the value back from the ThreadLocal
        String result = threadLocal.get();
        System.out.println(result); // prints: Follow the official account: 一灯架构

        // 4. Delete the data held by the ThreadLocal
        threadLocal.remove();
        System.out.println(threadLocal.get()); // prints: null
    }

}

Using ThreadLocal is simple: specify the generic type when creating it, then set, get, and remove values.

ThreadLocal data is isolated between threads. A quick test:

import java.util.stream.IntStream;

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal
    static ThreadLocal<Integer> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        IntStream.range(0, 5).forEach(i -> {
            // Create 5 threads; each one sets and then reads the ThreadLocal
            new Thread(() -> {
                // 2. Assign a value to the ThreadLocal
                threadLocal.set(i);
                // 3. Read the value back from the ThreadLocal
                System.out.println(Thread.currentThread().getName()
                        + "," + threadLocal.get());
            }).start();
        });
    }

}

Output result:

Thread-2,2
Thread-4,4
Thread-1,1
Thread-0,0
Thread-3,3

As the output shows, each thread sees only the value it set itself; the order of the lines may differ from run to run because thread scheduling is not deterministic. What are the application scenarios for this behavior?

3. ThreadLocal application scenarios

ThreadLocal has two main kinds of application scenarios:

Avoiding passing objects layer by layer between methods, which removes the coupling between layers.
For example, user information is needed in many places, and passing it down through every layer is tedious. Putting the user information in a ThreadLocal makes it available wherever it is needed.
Giving each thread its own copy of an object, which reduces initialization work and keeps data thread-safe.
For example, database connections, Spring transaction management, and date formatting with SimpleDateFormat all use ThreadLocal, so that each thread creates its object once and reuses it instead of sharing one unsafe instance or creating a new one on every call.
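
The first scenario can be sketched as follows. This is only a minimal illustration: UserContext, handleRequest, serviceLayer, and dataLayer are hypothetical names invented for this example and do not come from any framework. The point is that the deepest layer reads the user without any intermediate method carrying it as a parameter.

```java
final class UserContextDemo {

    // Hypothetical holder class; the name and methods are illustrative only.
    static final class UserContext {
        private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

        static void set(String userName) { CURRENT_USER.set(userName); }
        static String get() { return CURRENT_USER.get(); }
        static void clear() { CURRENT_USER.remove(); }
    }

    // Entry point of a request: stores the user once and always cleans up.
    static String handleRequest(String userName) {
        UserContext.set(userName);
        try {
            return serviceLayer();
        } finally {
            UserContext.clear();
        }
    }

    // Intermediate layer: no user parameter is passed through.
    static String serviceLayer() {
        return dataLayer();
    }

    // Deep layer: reads the user directly from the thread-local holder.
    static String dataLayer() {
        return "Hello, " + UserContext.get();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("alice")); // prints: Hello, alice
        System.out.println(UserContext.get());      // prints: null (already cleared)
    }
}
```

The try/finally in handleRequest matters as much as the set call, because the value must not outlive the request on a reused thread.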

The following code uses ThreadLocal to make date parsing with SimpleDateFormat thread-safe:

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.stream.IntStream;

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal; each thread lazily gets its own SimpleDateFormat
    static ThreadLocal<SimpleDateFormat> threadLocal =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));


    public static void main(String[] args) {
        IntStream.range(0, 5).forEach(i -> {
            // Create 5 threads; each takes its own SimpleDateFormat from threadLocal and parses a date string
            new Thread(() -> {
                try {
                    System.out.println(threadLocal.get().parse("2022-11-11 00:00:00"));
                } catch (ParseException e) {
                    throw new RuntimeException(e);
                }
            }).start();
        });
    }

}

4. ThreadLocal implementation principle

Under the hood, ThreadLocal stores its data in a ThreadLocalMap, and each thread holds its own ThreadLocalMap. Inside the map is an array of Entry objects. Each Entry is a key-value pair: the key is the ThreadLocal instance itself, and the value is the object stored through that ThreadLocal.

4.1 ThreadLocalMap source code

static class ThreadLocalMap {
    // Entry holds its key through a WeakReference, so the key can be
    // collected by GC once no strong reference to the ThreadLocal remains
    static class Entry extends WeakReference<ThreadLocal<?>> {
        // The value stored through the ThreadLocal
        Object value;
        // Constructor taking the key-value pair:
        // the key is the ThreadLocal instance, the value is the stored object
        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

    // Entry array that stores the ThreadLocal data
    private Entry[] table;
    // Default capacity of the array
    private static final int INITIAL_CAPACITY = 16;
    // Resize threshold, two thirds of the array length by default
    private int threshold;

    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }
}
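
To make the two formulas concrete, here is a small sketch that reuses the threshold calculation from setThreshold and the bit-mask that the map's set and getEntry methods use to pick a slot. MaskAndThresholdDemo and its method names are made up for this example, and the sample hash value is arbitrary. It also shows that the mask is equivalent to a modulo operation as long as the array length is a power of two.

```java
final class MaskAndThresholdDemo {

    // Same formula as ThreadLocalMap.setThreshold.
    static int threshold(int len) {
        return len * 2 / 3;
    }

    // Same formula the map uses to pick a slot; it relies on len being a power of two.
    static int indexFor(int hash, int len) {
        return hash & (len - 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16)); // prints: 10
        System.out.println(threshold(32)); // prints: 21

        int hash = 0x61c88647; // arbitrary sample hash value
        System.out.println(indexFor(hash, 16)); // prints: 7
        // For a power-of-two length, masking gives the same result as floorMod.
        System.out.println(indexFor(hash, 16) == Math.floorMod(hash, 16)); // prints: true
    }
}
```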

4.2 set method source code

// Set a value on the ThreadLocal
public void set(T value) {
    // Get the current thread
    Thread t = Thread.currentThread();
    // Get the ThreadLocalMap held by this thread
    ThreadLocalMap map = getMap(t);
    // If the thread already has a map, store the value in it; otherwise create the map
    if (map != null)
        // The key is the current ThreadLocal instance, the value is the object to store
        map.set(this, value);
    else
        // Initialize the ThreadLocalMap
        createMap(t, value);
}

Now look at the set method of ThreadLocalMap, which does the actual work:

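// The key is the current ThreadLocal instance, the value is the object to store
private void set(ThreadLocal<?> key, Object value) {
    // Get the Entry array of the ThreadLocalMap
    Entry[] tab = table;
    int len = tab.length;
    // Compute the slot for the key: the ThreadLocal's hash code masked with (array length - 1)
    int i = key.threadLocalHashCode & (len - 1);

    // Probe from slot i: if the slot is occupied, check whether its key is the current
    // ThreadLocal instance. If so, overwrite; otherwise keep moving forward until an empty slot is found
    for (Entry e = tab[i];
         e != null;
         // nextIndex advances the index by 1 and wraps to 0 at the end of the array
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
        // Same ThreadLocal instance: overwrite the value
        if (k == key) {
            e.value = value;
            return;
        }
        // Key is null: the ThreadLocal instance has been collected by GC, so reuse the stale slot
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }
    // Empty slot found: create a new Entry
    tab[i] = new Entry(key, value);
    int sz = ++size;
    // If no stale slots were cleaned and the size has reached the threshold
    // (two thirds of the array length), rehash and possibly resize
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}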

The source code shows that ThreadLocal resolves hash collisions with linear probing. The assignment process of linear probing is as follows:

Compute the array index from the hash code of the key
If the slot at that index is empty, or its key is the current ThreadLocal object, write the value there and finish
If the slot is occupied by another key, keep probing forward, wrapping around to the start after reaching the end of the array, until an empty slot (or a stale one whose key has been collected) is found, and write the value there
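
The steps above can be sketched with a toy open-addressing table. This is a simplified illustration, not the JDK implementation: LinearProbingSketch is a made-up class, the hash is passed in explicitly so that collisions can be forced, and stale-entry handling and resizing are left out (the table is assumed never to fill up). Keys are compared by identity, as ThreadLocalMap does.

```java
final class LinearProbingSketch {
    private final Object[] keys;
    private final Object[] values;

    // The capacity must be a power of two so that the bit-mask works.
    LinearProbingSketch(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    // Advance by one slot and wrap around at the end of the array.
    private int nextIndex(int i) {
        return (i + 1 < keys.length) ? i + 1 : 0;
    }

    // Stores the pair and returns the slot that was finally used.
    int put(Object key, int hash, Object value) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null && keys[i] != key) {
            i = nextIndex(i); // occupied by another key: probe the next slot
        }
        keys[i] = key;
        values[i] = value;
        return i;
    }

    // Follows the same probe sequence until the key or an empty slot is found.
    Object get(Object key, int hash) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null) {
            if (keys[i] == key) {
                return values[i];
            }
            i = nextIndex(i);
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(8);
        Object a = new Object(), b = new Object(), c = new Object();
        // All three keys get hash 7 on purpose to force collisions.
        System.out.println(table.put(a, 7, "A")); // prints: 7
        System.out.println(table.put(b, 7, "B")); // prints: 0 (wrapped around)
        System.out.println(table.put(c, 7, "C")); // prints: 1
        System.out.println(table.get(c, 7));      // prints: C
    }
}
```

Note how the lookup for c has to visit slots 7, 0, and 1 in turn, which is why long runs of occupied slots make both set and get slower.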

Because values are placed by linear probing, a lookup has to follow the same probe sequence.

4.3 get method source code

// Read the value from the ThreadLocal
public T get() {
    // Get the current thread
    Thread t = Thread.currentThread();
    // Get the ThreadLocalMap held by this thread
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // Look up the Entry array using this ThreadLocal instance as the key
        ThreadLocalMap.Entry e = map.getEntry(this);
        // A non-null entry means the value was found; return it
        if (e != null) {
            T result = (T)e.value;
            return result;
        }
    }
    // The map is null or holds no entry for this key: set and return the initial value
    return setInitialValue();
}

Now the logic that looks up the Entry array:

// Looks up the Entry for the given key
private Entry getEntry(ThreadLocal<?> key) {
    // Compute the array index from the hash code
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    // If the slot is occupied and its key is the current ThreadLocal instance, return it
    if (e != null && e.get() == key)
        return e;
    else
        // Otherwise keep probing the Entry array
        return getEntryAfterMiss(key, i, e);
}

On a miss, the lookup continues with linear probing:

// Called when the first slot did not match: keep probing the Entry array
private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;
    // Loop until the ThreadLocal is found or an empty slot is reached
    while (e != null) {
        ThreadLocal<?> k = e.get();
        // Same ThreadLocal instance: found, return it
        if (k == key)
            return e;
        // Key is null: the ThreadLocal instance has been collected by GC, so clear the stale entry
        if (k == null)
            expungeStaleEntry(i);
        else
            // Advance the index by 1 to keep probing
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}

// Advance the index by 1; after the end of the array, wrap around to the start
private static int nextIndex(int i, int len) {
    return ((i + 1 < len) ? i + 1 : 0);
}

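In short, get first checks the slot computed from the hash code. On a miss it probes forward, expunging stale entries on the way, until it either finds the key or reaches an empty slot; if nothing is found, setInitialValue supplies the initial value.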

4.4 remove method source code

The remove method works much like set and get: it probes the array for the entry whose key is the current ThreadLocal instance, clears the key, and then expunges the entry, which also clears the value.

public void remove() {
    // Get the ThreadLocalMap of the current thread
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        m.remove(this);
}

// The actual removal logic in ThreadLocalMap
private void remove(ThreadLocal<?> key) {
    ThreadLocal.ThreadLocalMap.Entry[] tab = table;
    int len = tab.length;
    // Compute the array index
    int i = key.threadLocalHashCode & (len - 1);
    // Probe the array until an empty slot is reached
    // or the entry for the current ThreadLocal is found
    for (ThreadLocal.ThreadLocalMap.Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        // Found: clear the key, then expunge the entry (which clears the value and frees the slot)
        if (e.get() == key) {
            e.clear();
            expungeStaleEntry(i);
            return;
        }
    }
}

5. Precautions for using ThreadLocal

After using a ThreadLocal, always call remove to clean up its data. The usual pattern looks like this:

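/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal
    static ThreadLocal<User> threadLocal = new ThreadLocal<>();

    public void method() {
        try {
            User user = getUser();
            // 2. Assign a value to the ThreadLocal
            threadLocal.set(user);
            // 3. Run the rest of the business logic
            doSomething();
        } finally {
            // 4. Clean up the ThreadLocal data
            threadLocal.remove();
        }
    }

    // Placeholder type and helpers (stand-ins for real business code) so the example compiles
    static class User {
    }

    private User getUser() {
        return new User();
    }

    private void doSomething() {
    }
}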

If you forget to call remove, two serious problems can follow:

Memory overflow
If the thread lives for a long time and data keeps being put into ThreadLocals without being removed, an OOM can eventually occur.
Data leaking between tasks
If a thread pool is used, a thread is not destroyed after it finishes a task but goes on to run the next one, so the next task can read the data left behind by the previous task.
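
The second problem is easy to reproduce. The sketch below is illustrative only: StaleValueDemo, CONTEXT, and secondTaskSees are names invented for this example. It uses a single-thread executor so that both tasks are guaranteed to run on the same pooled thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StaleValueDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the second task reads.
    static String secondTaskSees(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstTask = () -> {
                CONTEXT.set("user-from-task-1");
                if (cleanUp) {
                    CONTEXT.remove(); // what a finally block should always do
                }
            };
            Callable<String> secondTask = CONTEXT::get;

            pool.submit(firstTask).get();         // wait for the first task to finish
            return pool.submit(secondTask).get(); // same worker thread, next task
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondTaskSees(false)); // prints: user-from-task-1
        System.out.println(secondTaskSees(true));  // prints: null
    }
}
```

Without the remove call, the second task reads a value it never set, which in a real service could mean one request seeing another user's data.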

6. Analysis of common interview questions

Now that we have read the ThreadLocal source code, let's answer a few interview questions to check what we have learned.

6.1 How does ThreadLocal ensure data security?

ThreadLocal stores its data in a ThreadLocalMap, and the ThreadLocalMap is a field of the Thread object, so each thread has its own map and the data is isolated between threads. That is why the set, get, and remove methods of ThreadLocal are thread-safe even though they take no locks.

6.2 Why does ThreadLocal use an array internally instead of a single object?

Because one thread can use multiple ThreadLocal instances, the map needs an array to hold one entry per instance rather than a single object.

6.3 How does ThreadLocal resolve hash conflicts?

ThreadLocal resolves hash collisions with linear probing. The assignment process of linear probing is as follows:

Compute the array index from the hash code of the key
If the slot at that index is empty, or its key is the current ThreadLocal object, write the value there and finish
If the slot is occupied by another key, keep probing forward, wrapping around to the start after reaching the end of the array, until an empty slot (or a stale one whose key has been collected) is found, and write the value there

6.4 Why does ThreadLocal use linear probing to resolve hash conflicts?

We all know that HashMap resolves hash collisions with separate chaining (also called the zipper method). Why does ThreadLocal use linear probing instead?

My guess is that the authors wanted to keep the implementation simple, or that a thread typically uses few ThreadLocals, so hash collisions are unlikely and a more elaborate scheme is not worth the trouble.

Separate chaining as implemented in HashMap needs two extra data structures, a linked list and a red-black tree, which makes the implementation more complicated. Linear probing needs no extra data structure; it simply keeps walking the array.

The cost is that if a thread uses many ThreadLocals and collisions do occur, get and set have to probe through runs of occupied slots, and their performance drops sharply.

Compared with separate chaining, linear probing has clear advantages and disadvantages:

Advantage: simple implementation, with no need for additional data structures.

Disadvantage: after hash collisions occur, the performance of ThreadLocal's get and set drops sharply.
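
One detail that supports the guess that collisions are rare: in the OpenJDK source, threadLocalHashCode is not an ordinary hashCode. Each new ThreadLocal takes the next value of a shared counter that is advanced by the constant 0x61c88647 (HASH_INCREMENT). The sketch below reimplements that counter outside the JDK, so treat the constant as an assumption if your JDK differs; HashSpreadDemo and slots are made-up names. It shows that 16 consecutively created ThreadLocals would land in 16 different slots of a 16-slot table.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class HashSpreadDemo {
    // Value of ThreadLocal.HASH_INCREMENT in the OpenJDK source (assumed here, not read from the JDK).
    private static final int HASH_INCREMENT = 0x61c88647;

    // Returns the slots that `count` consecutively created ThreadLocals would map to.
    static int[] slots(int count, int tableLength) {
        AtomicInteger nextHashCode = new AtomicInteger(); // starts at 0
        int[] result = new int[count];
        for (int n = 0; n < count; n++) {
            int hash = nextHashCode.getAndAdd(HASH_INCREMENT);
            result[n] = hash & (tableLength - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints: [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
        System.out.println(Arrays.toString(slots(16, 16)));
    }
}
```

In a real thread the picture is less tidy, because a thread rarely uses exactly the ThreadLocals that were created consecutively, but the increment still spreads keys evenly across a power-of-two table.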

6.5 Why is the key of ThreadLocalMap designed as a weak reference?

First, the characteristics of weak references:

An object that is only weakly referenced has a shorter life cycle. When the garbage collector scans the memory it manages and finds an object reachable only through weak references, it reclaims that object whether or not memory is running low. However, because garbage collection runs at times the program does not control, such an object may not be reclaimed immediately.
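
This behavior can be observed with a plain WeakReference, independent of ThreadLocal. The sketch below is a minimal illustration with made-up names (WeakReferenceDemo, survivesWhileStronglyHeld). Because System.gc() is only a hint to the JVM, the last line usually prints null but is not guaranteed to.

```java
import java.lang.ref.WeakReference;

final class WeakReferenceDemo {

    // While a strong reference exists, the weakly referenced object must survive GC.
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileStronglyHeld()); // prints: true

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // drop the only strong reference
        System.gc();   // only a hint; collection is likely but not guaranteed
        System.out.println(weak.get()); // usually null once a GC cycle has run
    }
}
```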

If the key of ThreadLocalMap is a weak reference, can it be collected by GC while we are still using it?

No, because we hold a strong reference to the ThreadLocal instance the whole time.

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal; the static field is a strong reference to it
    static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        // 2. Assign a value to the ThreadLocal
        threadLocal.set("Follow the official account: 一灯架构");
        // Trigger GC manually
        System.gc();
        // 3. Read the value after GC; the entry is still there
        String result = threadLocal.get();
        System.out.println(result); // prints: Follow the official account: 一灯架构

    }

}

As the code above shows, as long as we still hold a strong reference to threadLocal, triggering GC does not reclaim the ThreadLocal instance.

The purpose of making the key of ThreadLocalMap a weak reference is:

To limit the damage when we forget to call remove after using a ThreadLocal. Once the ThreadLocal instance itself is no longer strongly referenced, its key can be collected, and the stale entry left in the array can then be detected and cleaned up instead of staying there forever.

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create the ThreadLocal
    static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        // 2. Assign a value to the ThreadLocal
        threadLocal.set("Follow the official account: 一灯架构");
        // 3. Done with threadLocal: set it to null to simulate the end of its life cycle
        threadLocal = null;
        // Trigger GC. The key in ThreadLocalMap can now be collected, but the value is not.
        // The stale value is only cleared when a later get, set, or remove call probes the array and reaches this slot
        System.gc();
    }

}

6.6 Why does ThreadLocal have a memory leak?

ThreadLocal leaks memory when we do not call remove to delete the data after using it.

What exactly is the data that leaks?

The first is the Entry object in the array. Its key and value are the ThreadLocal instance and the stored value, respectively.

We usually declare a ThreadLocal as a static field of a class, so the ThreadLocal instance stays strongly reachable and the entry's data is not reclaimed until the thread ends.

The second is the value of an Entry whose key has already been collected. Although the key of ThreadLocalMap is a weak reference and can be reclaimed by GC, the value is strongly referenced by the Entry and is not reclaimed with it. The stale value is only cleared when a later get, set, or remove call probes the array and reaches that slot. This is another cause of memory leaks.

6.7 How to share ThreadLocal data between parent and child threads?

InheritableThreadLocal is all that is needed. When a child thread is created, the values of the parent thread's InheritableThreadLocals are copied to it.

/**
 * @author 一灯架构
 * @apiNote ThreadLocal example
 **/
public class ThreadLocalDemo {
    // 1. Create a ThreadLocal whose data can be inherited by child threads
    static ThreadLocal<String> threadLocal = new InheritableThreadLocal<>();

    public static void main(String[] args) {
        // 2. Assign a value to the ThreadLocal
        threadLocal.set("Follow the official account: 一灯架构");

        // 3. Start a child thread and check whether it can read the main thread's data
        new Thread(() -> {
            System.out.println(threadLocal.get()); // prints: Follow the official account: 一灯架构
        }).start();

    }

}
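
One consequence of copying at creation time is worth seeing in code: the child receives a snapshot, so values the parent sets after the child Thread object has been constructed are not visible to the child. The sketch below is illustrative, with made-up names (InheritableSnapshotDemo, PLAIN, INHERITABLE, childView), and it relies on the copy happening in the Thread constructor, which is how the JDK versions I am aware of behave.

```java
final class InheritableSnapshotDemo {
    private static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    private static final ThreadLocal<String> INHERITABLE = new InheritableThreadLocal<>();

    // Returns what a child thread reads: [0] from the plain ThreadLocal, [1] from the inheritable one.
    static String[] childView() throws InterruptedException {
        PLAIN.set("plain-parent");
        INHERITABLE.set("before-create");
        String[] seen = new String[2];

        // The inheritable values are copied here, when the Thread object is constructed.
        Thread child = new Thread(() -> {
            seen[0] = PLAIN.get();
            seen[1] = INHERITABLE.get();
        });

        INHERITABLE.set("after-create"); // too late: the child already has its copy
        child.start();
        child.join(); // makes the child's writes to `seen` visible here

        PLAIN.remove();
        INHERITABLE.remove();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] seen = childView();
        System.out.println(seen[0]); // prints: null
        System.out.println(seen[1]); // prints: before-create
    }
}
```

The same snapshot behavior is why InheritableThreadLocal does not combine well with thread pools: pooled threads are created once, so they keep whatever was copied at that moment.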