ThreadLocal for interviews: this article is all you need

Note: the source code in this article is based on JDK 1.8.

What is ThreadLocal

  ThreadLocal provides thread-local variables. When a variable is maintained through ThreadLocal, each thread has its own copy of it and threads do not interfere with one another, which isolates data between threads.
  A variable maintained by ThreadLocal lives for the lifetime of its thread, which reduces the complexity of passing shared values between multiple functions or components running in the same thread.

Let's start with a simple example:

public class ThreadLocalTest01 {

    private static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {

        // Child thread t1 stores a value with threadLocal.set()
        Thread t1 = new Thread(() -> {
            threadLocal.set("abc");
            System.out.println("t1 assignment completed");

            System.out.println("get in thread t1: " + threadLocal.get());
        });

        // Child thread t2 reads with threadLocal.get()
        Thread t2 = new Thread(() ->
            System.out.println("get in thread t2: " + threadLocal.get())
        );

        t1.start();
        t1.join();
        t2.start();

        // The main thread reads with threadLocal.get()
        System.out.println("get in main thread: " + threadLocal.get());

    }

}

Output result:

t1 assignment completed
get in thread t1: abc
get in main thread: null
get in thread t2: null

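The output shows that the value thread t1 stored through the set method cannot be read by any other thread: both t2 and the main thread get null. (The order of the last two lines may vary between runs, because t2 and the main thread run concurrently.)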

ThreadLocal data structure

  • Each thread corresponds to a Thread object, and each Thread object has a member variable of type ThreadLocal.ThreadLocalMap.
  • ThreadLocalMap is similar to HashMap in that it maintains key-value pairs. The difference is that HashMap is built on an array plus linked lists/red-black trees, while ThreadLocalMap uses only an array.
  • The ThreadLocalMap array stores objects of the static nested class Entry(ThreadLocal<?> k, Object v). Put simply, the ThreadLocal object is the key and the content passed to set is the value. (Strictly speaking, Entry extends WeakReference<ThreadLocal<?>>, so the key is held through a weak reference.)
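The layout described above can be sketched with a deliberately simplified model. ToyThread and ToyLocal are hypothetical names invented for this illustration, and a plain HashMap stands in for the real ThreadLocalMap (which is a hand-written array with weak keys). The point is only that the data lives in the thread, while the ThreadLocal acts as the key.

```java
import java.util.HashMap;
import java.util.Map;

class ToyLayoutDemo {
    public static void main(String[] args) {
        ToyThread t1 = new ToyThread();
        ToyThread t2 = new ToyThread();
        ToyLocal<String> local = new ToyLocal<>();

        local.set(t1, "abc");
        System.out.println(local.get(t1)); // abc
        System.out.println(local.get(t2)); // null
    }
}

// Hypothetical stand-in for Thread: every "thread" owns its own map,
// in the same way Thread owns its threadLocals field.
class ToyThread {
    final Map<ToyLocal<?>, Object> locals = new HashMap<>();
}

// Hypothetical stand-in for ThreadLocal: it stores nothing itself
// and is only used as the key into the owning thread's map.
class ToyLocal<T> {
    void set(ToyThread t, T value) {
        t.locals.put(this, value);
    }

    @SuppressWarnings("unchecked")
    T get(ToyThread t) {
        return (T) t.locals.get(this);
    }
}
```

Because the value is looked up in the map of whichever thread asks, two threads using the same key never see each other's data.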

Java's four reference types

When introducing the ThreadLocal data structure earlier, it was mentioned that the key of the ThreadLocalMap is a weak reference. What is the use of weak references? Why use weak references here? In order to clarify these issues, let's take a look at the four reference types of Java.

  • Strong reference: This is the type we use most. For example, with Object obj = new Object(), the variable obj lives in stack memory and points to the Object allocated in heap memory. This kind of reference is a strong reference. As long as a strong reference exists, the garbage collector will not reclaim the object, even if the JVM runs out of memory and throws an OutOfMemoryError.
  • Soft reference: A soft reference is created with SoftReference, for example SoftReference<Object> sr = new SoftReference<>(new Object()). The variable sr in stack memory points to the SoftReference object in heap memory through a strong reference, and the SoftReference object in turn holds a soft reference to the Object. An object that is only softly reachable is reclaimed by the garbage collector when JVM heap memory runs low. Soft references describe objects that are useful but not essential, such as entries in an image cache or a web page cache.
  • Weak reference: A weak reference is created with WeakReference, for example WeakReference<Object> wr = new WeakReference<>(new Object()). If an object is reachable only through weak references, it is reclaimed at the next garbage collection. A typical application of weak references is ThreadLocalMap.
  • Phantom reference: A phantom reference is created with PhantomReference, for example PhantomReference<Object> pr = new PhantomReference<>(new Object(), QUEUE). Unlike soft and weak references, it does not affect the object's life cycle, and its get() always returns null. Its only purpose is to deliver, through the reference queue, a notification that the object is being collected, which makes it possible to manage off-heap memory. A typical application is releasing the off-heap memory behind direct buffers, which frameworks such as Netty rely on for zero copy.
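The difference between strong and weak reachability can be observed with a small experiment. This is a minimal sketch, not a guaranteed test: System.gc() is only a request to the JVM, so the weak-only case usually (but not always) reports that the referent was cleared. The class and method names are made up for the example.

```java
import java.lang.ref.WeakReference;

class ReferenceTypesDemo {

    // True if an object that is still strongly referenced survives a GC request.
    static boolean stronglyHeldReferentSurvives() {
        Object strong = new Object();                       // strong reference
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        return weak.get() == strong;                        // 'strong' is still in use here
    }

    // True if a referent reachable only through a WeakReference was cleared
    // within a few GC requests. System.gc() is a hint, so this is not guaranteed.
    static boolean weakOnlyReferentCleared() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        for (int i = 0; i < 10 && weak.get() != null; i++) {
            System.gc();
        }
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("strongly held survives: " + stronglyHeldReferentSurvives());
        System.out.println("weak-only cleared: " + weakOnlyReferentCleared());
    }
}
```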

Why do keys in ThreadLocalMap use weak references?

Before exploring this question, let's take a look at what weak references look like in ThreadLocalMap.

import java.lang.reflect.Field;

/**
 * Checks whether the weakly referenced key in an Entry is collected
 * once no strong reference to the ThreadLocal object remains.
 */
public class ThreadLocalTest02_GC {

    public static void main(String[] args) throws Exception {

        // A strong reference points to this ThreadLocal object
        ThreadLocal<String> threadLocal = new ThreadLocal<>();
        threadLocal.set("abc");

        // No strong reference points to this ThreadLocal object
        new ThreadLocal<>().set("def");

        // Request a GC so that weakly reachable keys get cleared (a hint, not a guarantee)
        System.gc();

        // Thread.threadLocals has default (package-private) access, so we read it through reflection.
        Thread t = Thread.currentThread();
        Field field = Thread.class.getDeclaredField("threadLocals");
        field.setAccessible(true);
        Object threadLocalMap = field.get(t);
        Class<?> tlmClass = threadLocalMap.getClass();
        Field tableField = tlmClass.getDeclaredField("table");
        tableField.setAccessible(true);
        Object[] arr = (Object[]) tableField.get(threadLocalMap);
        for (Object o : arr) {
            if (o != null) {
                Class<?> entryClass = o.getClass();
                Field valueField = entryClass.getDeclaredField("value");
                // Entry -> WeakReference -> Reference, which declares "referent" (the key)
                Field referenceField = entryClass.getSuperclass().getSuperclass().getDeclaredField("referent");
                valueField.setAccessible(true);
                referenceField.setAccessible(true);
                System.out.println(String.format("key: %s, value: %s", referenceField.get(o), valueField.get(o)));
            }
        }

    }
}

The output is as follows:

key: java.lang.ThreadLocal@4d7e1886, value: abc
key: java.lang.ThreadLocal@3cd1a2f1, value: [Ljava.lang.Object;@2f0e140b
key: java.lang.ThreadLocal@7440e464, value: java.lang.ref.SoftReference@49476842
key: null, value: def
key: java.lang.ThreadLocal@78308db1, value: java.lang.ref.SoftReference@27c170f0

From the output results, we can see that the record key with the value "abc" exists, and the record with the value "def" has a corresponding key of null. This means that when no strong reference exists, the object pointed to by the weak reference will be reclaimed by the garbage collector.

The key of ThreadLocalMap is defined as a weak reference so that, once the ThreadLocal object is no longer strongly referenced, it can be reclaimed by GC, which keeps the key from leaking. However, although the key is reclaimed, the value is still strongly referenced by the Entry and therefore still leaks. The value can be reclaimed by GC only when the thread's life cycle ends or when one of the map's cleanup routines is triggered.

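Note: besides the two entries we set, the output contains three other entries, such as data used by StringCoding for encoding and decoding. We can ignore them.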

Detailed explanation of ThreadLocal set method


    ThreadLocal<String> threadLocal = new ThreadLocal<>();
    threadLocal.set("abc");
    ---------------------------------------------------------
    public void set(T value) {
        // Get the current thread object
        Thread t = Thread.currentThread();
        // Get the ThreadLocalMap object from thread object t
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }
    ---------------------------------------------------------
    // Get the ThreadLocalMap object from thread object t
    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }
    ---------------------------------------------------------
    // Create the ThreadLocalMap member variable of thread object t and
    // initialize it with one entry: this (the threadLocal object) as key, firstValue as value
    void createMap(Thread t, T firstValue) {
        t.threadLocals = new ThreadLocalMap(this, firstValue);
    }

As the code above shows, when we call the set method to save "abc", it first obtains the current thread object t through Thread.currentThread() and then reads the ThreadLocalMap variable map from t. If map is not null, a key/value pair is inserted directly (threadLocal is the key, "abc" is the value). If map is null, a ThreadLocalMap is created, the thread's field is pointed at the new object, and the map is initialized with that same key/value pair.

Let's first look at what new ThreadLocalMap(this, firstValue) does when the map is null.

    ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
        // Create an Entry array with an initial capacity of INITIAL_CAPACITY (16)
        table = new Entry[INITIAL_CAPACITY];
        // Compute the slot index
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        // Set the threshold
        setThreshold(INITIAL_CAPACITY);
    }
    ---------------------------------------------------------
    private final int threadLocalHashCode = nextHashCode();

    private static AtomicInteger nextHashCode =
        new AtomicInteger();

    private static final int HASH_INCREMENT = 0x61c88647;

    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }
    ---------------------------------------------------------
    // Set the threshold; when the number of entries in the array reaches it,
    // rehash() is triggered to expand the table
    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }

Now let's look at the case where the map is not null:

    private void set(ThreadLocal<?> key, Object value) {

        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);

        // Open addressing: probe for a usable slot (this resolves hash collisions)
        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            // The slot is occupied and the key is the same: replace the value
            if (k == key) {
                e.value = value;
                return;
            }

            // The slot is occupied but its key has been collected by GC:
            // reuse the slot and clean up stale entries along the way
            if (k == null) {
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        // An empty slot was found: insert the key and value here
        tab[i] = new Entry(key, value);
        int sz = ++size;
        // Run a cleanup pass; if nothing was removed and size has reached the threshold, rehash
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }

    ---------------------------------------------------------
    // Linear probing: move to the next index, wrapping around to 0 at the end
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

Isn't the structure very similar to HashMap? The slot in the array is found through a hash operation and the entry is inserted there. The difference is that HashMap resolves hash collisions with linked lists/red-black trees, while ThreadLocalMap uses open addressing. (The algorithm that cleans up entries whose keys have been collected is not covered here; interested readers can consult the source code.)
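Open addressing with linear probing can be illustrated on its own. The following is a minimal sketch, not the JDK code: it has no resizing and no stale-entry cleanup, it assumes the table never becomes full, and the hash is passed in explicitly so that collisions are easy to provoke. Only nextIndex mirrors the method shown above.

```java
class LinearProbingDemo {
    private final Object[] keys;
    private final Object[] values;

    // capacity must be a power of two so that (hash & (capacity - 1)) is a valid index
    LinearProbingDemo(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Stores the pair and returns the slot that was finally used.
    // Assumes at least one slot is free, otherwise the loop would not terminate.
    int put(Object key, int hash, Object value) {
        int len = keys.length;
        int i = hash & (len - 1);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = nextIndex(i, len); // collision: try the next slot
        }
        keys[i] = key;
        values[i] = value;
        return i;
    }

    // Probes from the hashed slot until the key or an empty slot is found.
    Object get(Object key, int hash) {
        int len = keys.length;
        for (int i = hash & (len - 1); keys[i] != null; i = nextIndex(i, len)) {
            if (keys[i].equals(key)) {
                return values[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbingDemo table = new LinearProbingDemo(8);
        System.out.println(table.put("a", 5, "A"));  // 5
        System.out.println(table.put("b", 13, "B")); // 6 (13 & 7 == 5 is taken)
        System.out.println(table.put("c", 7, "C"));  // 7
        System.out.println(table.put("d", 15, "D")); // 0 (7 is taken, wraps around)
        System.out.println(table.get("d", 15));      // D
    }
}
```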

Magical 0x61c88647

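The code above contains the value 0x61c88647. Each time a ThreadLocal object is created, the shared hash counter advances by this amount. The value is special because it is derived from the golden ratio, which is closely related to the Fibonacci sequence: 2^32 × 0.618... ≈ 2654435769, which read as a signed 32-bit int is -1640531527, and 0x61c88647 is 1640531527. The advantage of using this number as the hash increment is that the resulting slots are distributed very evenly over a table whose size is a power of two.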

Use code to demonstrate:

public class ThreadLocalHashTest {

    public static void main(String[] args) {
        threadLocalHashTest(16);
        threadLocalHashTest(32);
    }

    public static void threadLocalHashTest(int n) {
        int HASH_INCREMENT = 0x61c88647;
        int nextHashCode = HASH_INCREMENT;
        for (int i = 0; i < n; i++) {
            // Slot index in a table of size n (n must be a power of two)
            System.out.print(nextHashCode & (n - 1));
            System.out.print(" ");
            nextHashCode += HASH_INCREMENT;
        }
        // End the line so that each run is printed on its own line
        System.out.println();
    }
}

The output is:

7 14 5 12 3 10 1 8 15 6 13 4 11 2 9 0
7 14 21 28 3 10 17 24 31 6 13 20 27 2 9 16 23 30 5 12 19 26 1 8 15 22 29 4 11 18 25 0
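The relationship between 0x61c88647 and the golden ratio can be checked with a few lines of arithmetic. This is a small sketch (the class and method names are made up for the example): it scales the inverse golden ratio to 32 bits and reinterprets the result as a signed int.

```java
class GoldenRatioHashDemo {

    // Scales 0.618... to 32 bits and narrows the result to a signed int.
    static int goldenRatioAsInt() {
        double phiInverse = (Math.sqrt(5) - 1) / 2;      // about 0.6180339887
        long scaled = (long) ((1L << 32) * phiInverse);  // 2654435769
        return (int) scaled;                             // wraps to a negative int
    }

    public static void main(String[] args) {
        int g = goldenRatioAsInt();
        System.out.println(g);                       // -1640531527
        System.out.println(Integer.toHexString(-g)); // 61c88647
    }
}
```

Adding 0x61c88647 repeatedly therefore steps through the int range in golden-ratio-sized strides, which is why the low bits used as slot indices do not repeat until every slot has been visited once.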

ThreadLocalMap expansion mechanism

At the end of the ThreadLocalMap.set() method, if the cleanup pass (cleanSomeSlots) removed nothing and the number of entries in the table has reached the expansion threshold (sz >= threshold), the rehash logic is executed. The rehash method itself starts with another cleanup that scans the whole table and clears out the entries whose key is null. After that cleanup, it checks whether the number of entries has reached 3/4 of the threshold (size >= threshold - threshold / 4). If so, it executes the resize method, which performs the real expansion: the capacity is doubled and every entry's position is recomputed.

The check that decides whether to call rehash uses the threshold itself, while the check inside rehash that decides whether to call resize uses 3/4 of the threshold. Why? The explanation given in the source code is: Use lower threshold for doubling to avoid hysteresis. Presumably, because the full cleanup has just reduced size, a slightly lower bar keeps a table that hovers just below the threshold from going through the full scan again and again.

    // If the number of entries in the array has reached the threshold, run rehash()
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();

    ---------------------------------------------------------
    // Threshold rule
    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }

    ---------------------------------------------------------
    private void rehash() {
        // Clean up stale entries, that is, entries whose key has been collected by GC
        expungeStaleEntries();

        // Use lower threshold for doubling to avoid hysteresis
        if (size >= threshold - threshold / 4)
            resize();
    }

    ---------------------------------------------------------
    private void expungeStaleEntries() {
        Entry[] tab = table;
        int len = tab.length;
        for (int j = 0; j < len; j++) {
            Entry e = tab[j];
            // e.get() returns the key; if the key is gone, clean up the entry
            if (e != null && e.get() == null)
                expungeStaleEntry(j);
        }
    }

    ---------------------------------------------------------
    private void resize() {
        Entry[] oldTab = table;
        int oldLen = oldTab.length;
        // The new capacity is twice the old one
        int newLen = oldLen * 2;
        Entry[] newTab = new Entry[newLen];
        int count = 0;

        for (int j = 0; j < oldLen; ++j) {
            Entry e = oldTab[j];
            if (e != null) {
                ThreadLocal<?> k = e.get();
                if (k == null) {
                    e.value = null; // Help the GC
                } else {
                    int h = k.threadLocalHashCode & (newLen - 1);
                    while (newTab[h] != null)
                        h = nextIndex(h, newLen);
                    newTab[h] = e;
                    count++;
                }
            }
        }

        setThreshold(newLen);
        size = count;
        table = newTab;
    }
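To make the two thresholds concrete, the arithmetic from setThreshold and rehash can be evaluated for a few table sizes. This is a small sketch whose class and method names are chosen for the example; only the two formulas come from the source code above.

```java
class ResizeThresholdDemo {

    // Same formula as setThreshold: 2/3 of the table length
    static int threshold(int len) {
        return len * 2 / 3;
    }

    // Same condition as in rehash: resize once size >= threshold - threshold / 4
    static int resizeTrigger(int len) {
        int t = threshold(len);
        return t - t / 4;
    }

    public static void main(String[] args) {
        for (int len = 16; len <= 64; len *= 2) {
            System.out.println("len=" + len
                    + " threshold=" + threshold(len)
                    + " resizeTrigger=" + resizeTrigger(len));
        }
        // len=16 threshold=10 resizeTrigger=8
        // len=32 threshold=21 resizeTrigger=16
        // len=64 threshold=42 resizeTrigger=32
    }
}
```

For these sizes the resize trigger works out to half the table length, so after a full cleanup the table is doubled as soon as it is at least half full.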

Detailed explanation of ThreadLocal get method


    public T get() {
        // Get the current thread object
        Thread t = Thread.currentThread();
        // Get the ThreadLocalMap object from thread object t
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            // Look up the value by key (the current ThreadLocal object)
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        // No value was found, so initialize one
        return setInitialValue();
    }

    ---------------------------------------------------------
    // Look up the value by key (the current ThreadLocal object)
    private Entry getEntry(ThreadLocal<?> key) {
        int i = key.threadLocalHashCode & (table.length - 1);
        Entry e = table[i];
        if (e != null && e.get() == key)
            return e;
        else
            return getEntryAfterMiss(key, i, e);
    }

    // The key is not at the hashed slot, so keep probing according to open addressing
    private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
        Entry[] tab = table;
        int len = tab.length;

        while (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == key)
                return e;
            if (k == null)
                expungeStaleEntry(i);
            else
                i = nextIndex(i, len);
            e = tab[i];
        }
        return null;
    }

    ---------------------------------------------------------
    // The map really has no entry for this key, so initialize a value
    private T setInitialValue() {
        T value = initialValue();
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
        return value;
    }

    // By default initialValue() returns null; override it to supply a custom value
    protected T initialValue() {
        return null;
    }

As the code above shows, when we call the get method, it first obtains the current thread object t through Thread.currentThread() and then reads the ThreadLocalMap variable map from t. If map is not null, the current threadLocal object is used as the key for the lookup. The lookup follows open addressing: if the key is not at the slot computed from the hash, the search continues with the following slots. If the map still yields no result, a value is created by the initialValue method, which can be overridden.

The initialization sample code is as follows:

public class ThreadLocalTest04_init {

    public static void main(String[] args) {

        // Anonymous subclass that overrides initialValue()
        ThreadLocal<String> tl = new ThreadLocal<String>() {
            @Override
            protected String initialValue() {
                return "default value";
            }
        };

        System.out.println(tl.get());
    }
}

The output is:

default value
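Since JDK 1.8, the same initialization can also be written with the static factory ThreadLocal.withInitial, which takes a Supplier instead of requiring a subclass. A minimal sketch (the class name is made up for the example):

```java
class WithInitialDemo {

    // The supplier is called the first time each thread calls get()
    static final ThreadLocal<String> TL = ThreadLocal.withInitial(() -> "default value");

    public static void main(String[] args) {
        System.out.println(TL.get()); // default value

        TL.set("changed");
        System.out.println(TL.get()); // changed

        // After remove(), the next get() runs the supplier again
        TL.remove();
        System.out.println(TL.get()); // default value
    }
}
```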

InheritableThreadLocal

A brief introduction to InheritableThreadLocal

With ThreadLocal, a child thread cannot read the data its parent thread saved through the set method. If the child thread needs access to that data, you can use the InheritableThreadLocal class.

public class ThreadLocalTest03 {

    public static void main(String[] args) {
        ThreadLocal<String> threadLocal = new ThreadLocal<>();
        ThreadLocal<String> inheritableThreadLocal = new InheritableThreadLocal<>();
        threadLocal.set("threadLocal value");
        inheritableThreadLocal.set("inheritableThreadLocal value");

        new Thread(() -> {
            System.out.println("Child thread gets parent thread threadLocal data: " + threadLocal.get());
            System.out.println("Child thread gets parent thread inheritableThreadLocal data: " + inheritableThreadLocal.get());
        }).start();
    }

}

The output is as follows:

Child thread gets parent thread threadLocal data: null
Child thread gets parent thread inheritableThreadLocal data: inheritableThreadLocal value

Through the example, we can see that the content set in the parent thread using InheritableThreadLocal can be obtained by the get method in the child thread.

Principle of InheritableThreadLocal

So, how to realize that the child thread can get the content saved by the parent thread? Let's analyze the principle.

First of all, in the thread class Thread, there are two ThreadLocalMap member variables, one is used to store common ThreadLocal related information, and the other is used to store InheritableThreadLocal related information.

    // Holds the data of ordinary ThreadLocal variables
    ThreadLocal.ThreadLocalMap threadLocals = null;

    // Holds the data of InheritableThreadLocal variables
    ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;

When set is called on an InheritableThreadLocal object, the set method inherited from its parent class ThreadLocal is executed:

    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }

Here, the getMap and createMap methods are overridden by the InheritableThreadLocal class:

public class InheritableThreadLocal<T> extends ThreadLocal<T> {

    protected T childValue(T parentValue) {
        return parentValue;
    }

    ThreadLocalMap getMap(Thread t) {
       return t.inheritableThreadLocals;
    }

    void createMap(Thread t, T firstValue) {
        t.inheritableThreadLocals = new ThreadLocalMap(this, firstValue);
    }
}

At this point we know that InheritableThreadLocal data is stored in the Thread member variable ThreadLocal.ThreadLocalMap inheritableThreadLocals. What has not been explained yet is how the child thread gets hold of it.

The key lies in new Thread():

    public Thread() {
        init(null, null, "Thread-" + nextThreadNum(), 0);
    }

    ---------------------------------------------------------
    private void init(ThreadGroup g, Runnable target, String name,
                      long stackSize) {
        init(g, target, name, stackSize, null);
    }

    ---------------------------------------------------------
    // For readability, the rest of the logic is replaced with ellipses here
    private void init(ThreadGroup g, Runnable target, String name,
                      long stackSize, AccessControlContext acc) {
        ......

        Thread parent = currentThread();

        ......

        if (parent.inheritableThreadLocals != null)
            this.inheritableThreadLocals =
                ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);

        ......
    }

Here, when a child thread is created, the current thread object (that is, the parent thread) is obtained, and the data in its inheritableThreadLocals member variable is copied into the child thread being created. That is why the child thread can read the content.

Notes on InheritableThreadLocal

  • Asynchronous work is usually run on a thread pool. InheritableThreadLocal values are copied only by the init() method called from new Thread, whereas a thread pool reuses existing threads. A task may therefore see the values captured when its worker thread was created instead of the values of the thread that submitted it.
  • To pass ThreadLocal values to child threads when using components that cache threads, such as thread pools, you can use Alibaba's open-source component TransmittableThreadLocal.
  • The data in the child thread is copied from the parent thread, so a value that is set again in the child thread is not visible to the parent thread.
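The thread pool problem from the first note can be reproduced with a pool of exactly one worker. In this sketch (class and field names are made up for the example), the worker thread is created during the first submit and copies the value "A". The second task runs on the same worker, so it still sees "A" even though the submitting thread has changed the value to "B" in the meantime.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InheritableWithPoolDemo {

    static final InheritableThreadLocal<String> CTX = new InheritableThreadLocal<>();

    // Returns what the first and the second pooled task read from CTX.
    static String[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<String> read = () -> CTX.get();
        try {
            CTX.set("A");
            String first = pool.submit(read).get();   // worker is created here and inherits "A"

            CTX.set("B");
            String second = pool.submit(read).get();  // same worker is reused, no new copy

            return new String[] { first, second };
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            CTX.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0]); // A
        System.out.println(seen[1]); // A (not B)
    }
}
```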

ThreadLocal application case

Manage database connections.

  Suppose method a of class A calls method b of class B and method c of class C. Method a starts the transaction, and methods b and c operate on the database. For the transaction to work, methods b and c must use the same database connection. How can we make sure of that? The answer is to manage the connection through ThreadLocal.
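The idea can be sketched as follows. This is a simplified illustration, not how any particular framework implements it: FakeConnection is a hypothetical stand-in for java.sql.Connection, and the holder creates the connection lazily on first use within a thread.

```java
class ConnectionHolderDemo {

    // Hypothetical stand-in for java.sql.Connection
    static class FakeConnection { }

    private static final ThreadLocal<FakeConnection> CURRENT = new ThreadLocal<>();

    // Returns the connection bound to the current thread, creating it on first use.
    static FakeConnection getConnection() {
        FakeConnection c = CURRENT.get();
        if (c == null) {
            c = new FakeConnection();
            CURRENT.set(c);
        }
        return c;
    }

    // Unbinds the connection once the transaction is finished.
    static void release() {
        CURRENT.remove();
    }

    // Methods b and c do not receive the connection as a parameter
    static FakeConnection methodB() { return getConnection(); }
    static FakeConnection methodC() { return getConnection(); }

    // Method a: b and c run on the same thread and therefore share one connection
    static boolean methodA() {
        try {
            return methodB() == methodC();
        } finally {
            release();
        }
    }

    public static void main(String[] args) {
        System.out.println(methodA()); // true
    }
}
```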

MDC log link tracking.

  MDC (Mapped Diagnostic Context) is mainly used to save the context parameters of each request. In the log output format, %X{key} can then be used directly to write a context parameter into every log line. The context information is stored mainly through ThreadLocal.
  Suppose you want to print a global serial number transId in the log at every step of a transaction. The flow may involve multiple systems, multiple threads, and multiple methods, and in some steps the serial number cannot be passed as a parameter. How can those steps get the transId? This is where ThreadLocal helps. When a system or thread receives a request, it stores the transId in a ThreadLocal, and when it writes a log line, it reads the transId and prints it. The log of the whole call chain can then be found in the log files by searching for the transId.
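The mechanism can be sketched without a logging framework. TraceContextDemo and its methods are hypothetical names for this illustration; a real MDC keeps a map of keys per thread, while this sketch keeps a single transId.

```java
class TraceContextDemo {

    private static final ThreadLocal<String> TRANS_ID = new ThreadLocal<>();

    // Called when a request arrives
    static void begin(String transId) {
        TRANS_ID.set(transId);
    }

    // Called when the request is finished
    static void end() {
        TRANS_ID.remove();
    }

    // Any method on the same thread can prefix its log line without receiving transId as a parameter
    static String format(String message) {
        return "[" + TRANS_ID.get() + "] " + message;
    }

    static String handleRequest(String transId) {
        begin(transId);
        try {
            return format("order created");
        } finally {
            end();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("T1001")); // [T1001] order created
    }
}
```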

Notes on using ThreadLocal

Memory leaks and dirty data. In most cases threads are managed by a thread pool, so some threads are not destroyed after use. If remove is never called on our ThreadLocal, the saved data stays around for as long as the thread lives, which causes a memory leak. If the ThreadLocal object is also a static constant, the next task that runs on the same thread is likely to read the previously saved data, which results in dirty data. Therefore, when using ThreadLocal, be sure to call the remove method when you are done.
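The dirty-data effect and its fix can be shown with a pool of one thread, so that two tasks are guaranteed to run on the same worker. This is a minimal sketch with made-up names; the important part is calling remove() in a finally block.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RemoveInFinallyDemo {

    static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the second one reads.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Runnable first = () -> {
                USER.set("alice");
                try {
                    // ... business logic that reads USER ...
                } finally {
                    if (cleanUp) {
                        USER.remove(); // always clean up before the thread goes back to the pool
                    }
                }
            };
            pool.submit(first).get();

            Callable<String> second = USER::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // alice (dirty data from the first task)
        System.out.println(secondTaskSees(true));  // null
    }
}
```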


END

Origin blog.csdn.net/daidaineteasy/article/details/106202323