Chapter 18: ThreadLocal Source Code Analysis (Java Threads)

As the ancients said:

Without accumulating small steps, you cannot travel a thousand miles.
Without gathering small streams, you cannot form rivers and seas.
Life today may feel harder and harder, and the world may be full of voices asking, "Do you still believe that hard work leads to success?"
Seen against the span of history, though, our era is only a drop in the ocean. The wisdom accumulated and handed down over thousands of years still points in the right direction.
Hello everyone. I am one of the millions of programmers in China, a non-computer-science major who entered the field through an IT training program. Whenever I felt confused and lost, I held firmly to the wisdom handed down by our ancestors:
If you carve and then give up, even rotten wood will not break.
If you carve without giving up, even metal and stone can be engraved.

Preface

Before writing an analysis I always do a lot of homework, and this article is no exception. When I first came across ThreadLocal I did not understand what it was good for. Many online articles draw conclusions that look right but are not, so I decided to work it out myself.
As I understand it, a ThreadLocal acts as a key. A thread can serve as a storage container: when we want to store something in a thread object, the ThreadLocal is the key and the data to store is the value. As with HashMap, the resulting key-value pair is stored inside the thread.
For now, the main advantage can be summed up as thread isolation, with the emphasis on "isolation". Each thread stores its own copy of the target data, protected from other threads, and the data can be passed between methods anywhere within that thread.
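The isolation described above can be seen in a minimal sketch. The class name IsolationDemo, the field NAME, and the method demo() are made up for this illustration; only ThreadLocal itself comes from the JDK.

```java
import java.util.concurrent.atomic.AtomicReference;

class IsolationDemo {
    // The ThreadLocal object is the key; each thread stores its own value under it.
    private static final ThreadLocal<String> NAME = new ThreadLocal<>();

    /** Returns { value seen by a worker thread, value seen by the calling thread }. */
    static String[] demo() {
        NAME.set("caller-value");                 // stored in the calling thread only
        AtomicReference<String> workerSaw = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            NAME.set("worker-value");             // stored in the worker thread only
            workerSaw.set(NAME.get());
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String callerSaw = NAME.get();            // unaffected by the worker's set()
        NAME.remove();
        return new String[] { workerSaw.get(), callerSaw };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0]); // worker-value
        System.out.println(seen[1]); // caller-value
    }
}
```

Both threads use the same key, yet neither overwrites the other's value.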

Structure

This chapter starts from the origin of the design: why it was designed, how it was designed, what the advantages and disadvantages of the design are, and how the disadvantages are dealt with.
The general thread is: why do it (purpose), how to do it (practice), and how to do it best (method).

Container in thread

My current understanding is that if a thread has its own container, its data can be safely isolated, and in a multi-threaded environment other threads cannot interfere with it.
For a thread to store data, it must contain some storage container, whether a List, a Map, or something else. With a List, associating different variables with objects of different types is clearly awkward to manage. A Map meets this need well through its key-value structure. Once the thread has such a container, then throughout the thread's life cycle, whichever method it is executing can obtain the Map and take values out of it.

[Figure: a thread's stack frames, one per method call, alongside the thread's Map]
As the figure shows, each method call made by a thread pushes a stack frame onto that thread's stack. The Map itself is an object on the heap referenced by the thread object, and every one of those frames runs on the same thread. So whichever method the thread is executing, it can reach the Map. This solves data transfer between methods, and because each thread has its own Map, it also sidesteps concurrency problems in a multi-threaded environment.

Anyone who has used HashMap knows that it is made up of key-value pairs. Now that the thread holds a Map and we want to store a target value, what should the key be? The ThreadLocal object solves exactly this problem: within a given thread, one ThreadLocal object corresponds to one value.
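The idea of "a Map inside the thread, keyed by an object" can be modelled in a few lines. This is only a toy under stated assumptions: ContainerThread and ToyLocal are hypothetical classes invented here, they use a plain HashMap, and they are not how the JDK implements ThreadLocal.

```java
import java.util.HashMap;
import java.util.Map;

class ToyLocalDemo {
    static final ToyLocal<String> USER = new ToyLocal<>();

    /** Runs two container threads and returns what each one read back. */
    static String[] run() {
        String[] seen = new String[2];
        ContainerThread a = new ContainerThread(() -> { USER.set("alice"); seen[0] = USER.get(); });
        ContainerThread b = new ContainerThread(() -> { seen[1] = USER.get(); }); // never set on b
        try {
            a.start(); a.join();
            b.start(); b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0] + ", " + seen[1]); // alice, null
    }
}

// Hypothetical thread type that carries its own map, mirroring the idea in the text.
class ContainerThread extends Thread {
    final Map<ToyLocal<?>, Object> locals = new HashMap<>();
    ContainerThread(Runnable task) { super(task); }
}

// Hypothetical stand-in for ThreadLocal: the object itself is the key.
class ToyLocal<T> {
    private static Map<ToyLocal<?>, Object> currentMap() {
        // Only works on a ContainerThread; any other thread would fail this cast.
        return ((ContainerThread) Thread.currentThread()).locals;
    }
    void set(T value) { currentMap().put(this, value); }
    @SuppressWarnings("unchecked")
    T get() { return (T) currentMap().get(this); }
}
```

The key object is shared, but the map it indexes into belongs to whichever thread is currently running, which is why thread b sees nothing.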
So how is it used, and what advantages does it have over traditional approaches?

Usage

Passing a variable between methods comes in two cases:
1. Variable transfer between methods in the same class
2. Variable transfer between methods in different classes

For the first case, the traditional solution is a member variable.

To share a variable between method 1 and method 2, we define a member variable, assign it in method 1, and read it in method 2.

The same can be done with ThreadLocal.
[Figure: method 1 and method 2 sharing a value through a ThreadLocal]
Here too the value is assigned in method 1 and read in method 2.
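A rough sketch of both approaches side by side, assuming a hypothetical OrderHandler class with methods named method1 and method2 after the text. It also shows what differs once a second thread touches the same object.

```java
class SameClassDemo {
    static class OrderHandler {
        // Traditional approach: one field, shared by every thread that uses this object.
        private String orderIdField;
        // ThreadLocal approach: each thread has its own value under this key.
        private static final ThreadLocal<String> ORDER_ID = new ThreadLocal<>();

        void method1(String orderId) {            // assign in method 1
            orderIdField = orderId;
            ORDER_ID.set(orderId);
        }
        String method2ViaField()       { return orderIdField; }   // read in method 2
        String method2ViaThreadLocal() { return ORDER_ID.get(); }
        void done()                    { ORDER_ID.remove(); }
    }

    /** Returns { value read through the field, value read through the ThreadLocal }. */
    static String[] demo() {
        OrderHandler handler = new OrderHandler();
        handler.method1("main-order");
        Thread other = new Thread(() -> handler.method1("other-order"));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        String[] seen = { handler.method2ViaField(), handler.method2ViaThreadLocal() };
        handler.done();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0]); // other-order (the field was overwritten by the other thread)
        System.out.println(seen[1]); // main-order  (the calling thread's own value)
    }
}
```

Within a single thread the two approaches behave the same; the difference only appears when several threads share the object.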

For the second case, passing data between methods in different classes, how would the traditional approach work?

[Figure: a call chain passing a parameter down layer by layer]
With an existing call chain, we may have to pass the parameter down layer by layer until it reaches the target method. A static variable would also work, but it would certainly introduce thread-safety problems.
How is this done with ThreadLocal?

Here you only need to get hold of the ThreadLocal key; with it, code anywhere else can fetch the data stored under that key in the current thread.
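As a sketch of the cross-class case, assume a hypothetical call chain UserController -> UserService -> UserDao and a hypothetical RequestContext holder. None of these names come from the original article.

```java
class CallChainDemo {
    public static void main(String[] args) {
        System.out.println(new UserController().handle("alice")); // rows for alice
        System.out.println(RequestContext.getUser());              // null (cleared after the call)
    }
}

// Holds the key and exposes accessors, so other classes never touch the ThreadLocal directly.
class RequestContext {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();
    static void setUser(String user) { USER.set(user); }
    static String getUser()          { return USER.get(); }
    static void clear()              { USER.remove(); }
}

class UserController {
    private final UserService service = new UserService();
    String handle(String user) {
        RequestContext.setUser(user);
        try {
            return service.process();   // no user parameter is passed down the chain
        } finally {
            RequestContext.clear();     // always clean up once the call chain finishes
        }
    }
}

class UserService {
    private final UserDao dao = new UserDao();
    String process() { return dao.load(); }
}

class UserDao {
    // Two layers down, the value is read straight from the current thread.
    String load() { return "rows for " + RequestContext.getUser(); }
}
```

The intermediate layer (UserService) never sees the user at all, which is the point: the method signatures stay unchanged.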

Why does the official recommendation call for declaring a ThreadLocal as private?
If it is private, then within this thread's life cycle the value can only be passed around inside that one class; other classes cannot reach the value through this key. In that case, how is it any different from a member variable? I think the greater benefit of ThreadLocal is that the value can be obtained from any class, as long as the code runs within the call stack of the same thread.
So I think private is there to protect the variable. To obtain the ThreadLocal object we can write a get method, and other methods anywhere on the thread's call stack can then get the value through it.

The above covers the usage and benefits of ThreadLocal: a value stored in the thread can be obtained anywhere on that thread, variables can be passed between methods, and threads are isolated from one another.
Next, let's analyze how it is built. What does this map look like?

ThreadLocal.ThreadLocalMap

ThreadLocalMap is a nested class of the ThreadLocal class. It is a map-like container designed specifically for threads. ThreadLocalMap can be compared with HashMap:

  1. Both are backed by an array
  2. Both locate entries with a hash
  3. Both share the same general strengths and weaknesses

The Thread class holds a member variable of type ThreadLocal.ThreadLocalMap, which is how the thread gains its container function.
So what are the implementation details of this container?

1. ThreadLocalMap

This class is a nested class of ThreadLocal, but ThreadLocal holds no field of this type. In other words, ThreadLocal defines the class without holding an instance of it.
First, look at the fields of ThreadLocalMap.

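    // An array of Entry objects
    private Entry[] table;
    // The initial capacity of the array is 16; the capacity must always be a power of two
    private static final int INITIAL_CAPACITY = 16;
    // The number of entries actually stored in the table
    private int size = 0;
    // Resize threshold: the size at which the table is next resized (derived from the load factor); defaults to 0
    private int threshold;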

In terms of its fields it resembles HashMap: both maintain an Entry array, although the two Entry types are not the same.

static class Entry extends WeakReference<ThreadLocal<?>> {
       
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

The definition of Entry shows that the ThreadLocal object is the key and is held through a weak reference, while the value field holds the actual object being stored.

Weak references and memory leaks

If the key were not a weak reference, the Entry in each thread's ThreadLocalMap would strongly reference the ThreadLocal object.
[Figure: Entry objects in each thread's ThreadLocalMap strongly referencing the ThreadLocal object]
As the figure shows, the ThreadLocal object would then live as long as the longest-lived thread that references it. In practice, core threads in a thread pool often live as long as the application itself, so the Map in such a thread would keep referencing the ThreadLocal, which could never be released: a memory leak.
With a weak reference, no matter how long the thread lives, it does not keep the ThreadLocal object alive. The ThreadLocal object is collected when its own life cycle ends.
Once the ThreadLocal object is collected, the key references held by those threads become null, but the value that was stored under that key is still in the thread. If the thread stays alive, the value can leak just as easily. (Imagine many threads each carrying a useless value; it would be strange if memory did not leak!)
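The "key gone, value still there" state can be reproduced with a small sketch. ToyEntry is a hypothetical stand-in for ThreadLocalMap.Entry, and clear() is called by hand to simulate what the garbage collector does once the key is only weakly reachable, because real collection timing is not predictable.

```java
import java.lang.ref.WeakReference;

class WeakEntryDemo {
    // Simplified stand-in for ThreadLocalMap.Entry: weak key, strong value.
    static class ToyEntry extends WeakReference<Object> {
        Object value;
        ToyEntry(Object key, Object value) {
            super(key);          // the key is only weakly referenced
            this.value = value;  // the value is strongly referenced
        }
    }

    /** Builds an entry and then simulates collection of its key. */
    static ToyEntry staleEntry() {
        Object key = new Object();
        ToyEntry entry = new ToyEntry(key, new byte[1024]);
        entry.clear();           // stands in for the GC clearing the weak reference
        return entry;
    }

    public static void main(String[] args) {
        ToyEntry entry = staleEntry();
        System.out.println(entry.get());          // null: the key is gone
        System.out.println(entry.value != null);  // true: the value is still held
    }
}
```

An entry in this state is what the source code calls a stale entry: nothing can look it up any more, yet its value stays reachable through the table.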

So what should we do?

remove() method

Since the situation above can leak memory, this method exists to prevent it. Let's see what it does.

 public void remove() {
     ThreadLocalMap m = getMap(Thread.currentThread());  // get the map attached to the current thread
     if (m != null)
         m.remove(this);                                 // delegate to ThreadLocalMap.remove
 }

// The method invoked by m.remove(this)
private void remove(ThreadLocal<?> key) {
        Entry[] tab = table;        // the hash table inside the map
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);  // compute the slot for this ThreadLocal in the map

        // The loop below uses linear probing to resolve hash collisions, so the slot computed above
        // may not hold the Entry for this ThreadLocal. Only an entry whose key is the very same
        // object as the key passed in (reference equality) identifies the right slot.
        for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
            if (e.get() == key) {
                e.clear();                  // clear the weak reference to the key
                expungeStaleEntry(i);       // once found, this method does the cleanup
                return;
            }
        }
    }

// Reading on
private int expungeStaleEntry(int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;

        // expunge entry at staleSlot
        tab[staleSlot].value = null;        // release the value; the key is only weakly referenced, so it needs no manual release
        tab[staleSlot] = null;              // then release the entry itself
        size--;

        // It could stop here, but it keeps cleaning up.
        // Rehash until we encounter null
        Entry e;
        int i;
        // Walk the following slots until an empty one is reached, checking each key for null
        for (i = nextIndex(staleSlot, len); (e = tab[i]) != null; i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            if (k == null) {        // this entry's ThreadLocal has been collected, so its value should go too
                e.value = null;
                tab[i] = null;
                size--;
            } else {
                // The entry is live, so check whether its position needs correcting.
                int h = k.threadLocalHashCode & (len - 1);
                // If the home slot h differs from the current slot i, this entry was pushed
                // along by a hash collision, and a slot closer to h may now be free.
                if (h != i) {
                    tab[i] = null;      // vacate the current slot; where does the entry go?

                    // Unlike Knuth 6.4 Algorithm R, we must scan until
                    // null because multiple entries could have been stale.
                    // Probe again from the home slot h. The entry lands in the first free slot,
                    // which may be an earlier slot that was just freed, or slot i again if none is.
                    while (tab[h] != null)
                        h = nextIndex(h, len);
                    tab[h] = e;
                }
            }
        }
        return i;
    }

To summarize, the remove method does the following.

  1. It releases the value bound to the current ThreadLocal object and the corresponding Entry in the map.
  2. It checks the entries that follow this position for keys whose ThreadLocal has become null, and releases those too.
  3. If a following entry was displaced by a hash collision, it relocates that entry.
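From the caller's side, the practical consequence is to call remove() in a finally block when the thread is pooled. The sketch below uses a hypothetical PoolReuseDemo class and a single-thread executor so that both tasks are guaranteed to run on the same worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolReuseDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... the real work of the task would go here ...
                } finally {
                    if (cleanUp) CURRENT_USER.remove();   // release the value before the thread is reused
                }
            };
            pool.submit(first).get();                     // wait for the first task to finish
            Callable<String> second = CURRENT_USER::get;  // runs later on the same worker thread
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // alice (stale value left on the pooled thread)
        System.out.println(secondTaskSees(true));  // null  (cleaned up by remove())
    }
}
```

Without remove(), the value outlives the task that set it, which is both the leak described above and a source of stale data for the next task.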

Why private static is recommended

The reason lies in the static modifier on the ThreadLocal object. The official recommendation is that ThreadLocal instances be private static, which saves memory and avoids creating ThreadLocal objects repeatedly. If the ThreadLocal is a non-static field, every instance of the enclosing class creates its own ThreadLocal. A thread that calls into different instances then accesses different ThreadLocal objects, unless the enclosing class is a singleton. This does not cause any exception, but it does increase memory overhead.
So if we declare it static, only one ThreadLocal object is created, which is enough.
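The difference between a per-instance key and a static key can be sketched as follows. PerInstanceHolder and SharedHolder are hypothetical classes made up for this comparison.

```java
class StaticKeyDemo {
    static class PerInstanceHolder {
        private final ThreadLocal<String> local = new ThreadLocal<>();   // one key per object
        void put(String v) { local.set(v); }
        String read()      { return local.get(); }
    }

    static class SharedHolder {
        private static final ThreadLocal<String> LOCAL = new ThreadLocal<>(); // one key for the class
        void put(String v) { LOCAL.set(v); }
        String read()      { return LOCAL.get(); }
        static void clear() { LOCAL.remove(); }
    }

    /** Writes through one instance, reads through another; returns { perInstance, shared }. */
    static String[] demo() {
        new PerInstanceHolder().put("x");
        String perInstance = new PerInstanceHolder().read();  // a different key, so nothing is found

        new SharedHolder().put("x");
        String shared = new SharedHolder().read();            // the same key, so the value is found
        SharedHolder.clear();
        return new String[] { perInstance, shared };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0]); // null
        System.out.println(seen[1]); // x
    }
}
```

With the non-static field, every new holder adds another key (and potentially another entry) to the thread's map, even though the same thread is doing all the work.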
But this has a drawback too. The static variable holds a strong reference to the ThreadLocal object, and a static variable lives from class initialization for as long as the class stays loaded. The strong reference therefore never goes away, which can cause a memory leak even though the Entry objects in each thread's Map only reference the ThreadLocal weakly. What does that mean?

[Figure: a static variable strongly referencing the ThreadLocal while the Entry objects reference it weakly]
Since private static can cause a memory leak, why is it officially recommended? And if it is used this way, what is the point of the weak reference?
My personal view is that this does leak, but only a single object is retained, which still costs less memory than creating many ThreadLocal objects. Also, static is not the only way to get that effect: when the enclosing object is a singleton, for example, there is only one ThreadLocal object anyway. So designing the key as a weak reference still has its uses.

get() method

 public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);     // get the map held by the current thread
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);    // the map exists, so look up the entry for this ThreadLocal
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;      // return the stored value
        }
    }
    return setInitialValue();  // otherwise create the map if needed and store the initial value, which is null by default
}
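A short sketch of what get() returns before anything has been set. InitialValueDemo and its fields are names chosen for this example; ThreadLocal.withInitial is the JDK factory method (Java 8 and later) that supplies a non-null initial value.

```java
class InitialValueDemo {
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    /** Returns { first get() on PLAIN, first get() on COUNTER, COUNTER after one increment }. */
    static Object[] demo() {
        Object plainFirst = PLAIN.get();        // nothing set, default initial value
        Object counterFirst = COUNTER.get();    // nothing set, the supplier provides the value
        COUNTER.set(COUNTER.get() + 1);
        Object counterNext = COUNTER.get();
        PLAIN.remove();
        COUNTER.remove();                       // the next get() would start from 0 again
        return new Object[] { plainFirst, counterFirst, counterNext };
    }

    public static void main(String[] args) {
        Object[] seen = demo();
        System.out.println(seen[0]); // null
        System.out.println(seen[1]); // 0
        System.out.println(seen[2]); // 1
    }
}
```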

set() method

 public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);     // get the map held by the current thread
    if (map != null)
        map.set(this, value);       // add (or update) the entry in the existing map
    else
        createMap(t, value);        // otherwise create the map with this first entry
}

ThreadLocalMap features

(1) Array structure only
ThreadLocalMap maintains an array, table, so it has an array structure, and entries are located with a hash. Unlike HashMap, an Entry in ThreadLocalMap holds only a key and a value, with no reference to a previous or next node, so there are no chains. How, then, does it handle hash collisions?

(2) Open addressing with linear probing
ThreadLocalMap resolves collisions with open addressing and linear probing. If the slot computed by the hash is already occupied, the entry moves on to the next slot; if that one is occupied too, it moves on again, until a free slot is found.
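Linear probing can be sketched with a toy table. ProbingTable is a hypothetical class that stores the hash itself in each slot; its nextIndex helper mirrors the wrap-around step used in the source code, and the table is assumed never to fill up.

```java
class ProbingTable {
    private final Integer[] slots;      // toy table: each slot stores the hash that landed there

    ProbingTable(int len) {             // len is assumed to be a power of two
        slots = new Integer[len];
    }

    // Step to the next slot, wrapping around to 0 at the end of the array.
    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    /** Inserts a hash and returns the slot it ended up in. Assumes the table is never full. */
    int insert(int hash) {
        int len = slots.length;
        int i = hash & (len - 1);       // home slot
        while (slots[i] != null)        // occupied: keep moving to the next slot
            i = nextIndex(i, len);
        slots[i] = hash;
        return i;
    }

    public static void main(String[] args) {
        ProbingTable table = new ProbingTable(8);
        System.out.println(table.insert(3));   // 3 (home slot is free)
        System.out.println(table.insert(11));  // 4 (home slot 3 is taken)
        System.out.println(table.insert(19));  // 5 (slots 3 and 4 are taken)
        System.out.println(table.insert(7));   // 7
        System.out.println(table.insert(15));  // 0 (home slot 7 is taken, wraps around)
    }
}
```

Because entries can sit away from their home slot like this, a lookup has to compare keys while walking forward, and a removal has to re-place the entries that follow it.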

Could the search fail to find a free slot?
No, because ThreadLocalMap also has a load factor. The resize threshold is len * 2 / 3, which corresponds to a load factor of 2/3, so the table is expanded before it fills up.

private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }
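As a quick worked example of the integer arithmetic in setThreshold, the sketch below evaluates the same expression for a few power-of-two lengths. ThresholdDemo is a hypothetical helper, not part of the JDK.

```java
class ThresholdDemo {
    // Same expression as setThreshold: integer division, so the result is rounded down.
    static int threshold(int len) {
        return len * 2 / 3;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16)); // 10
        System.out.println(threshold(32)); // 21
        System.out.println(threshold(64)); // 42
    }
}
```

So a table with the initial capacity of 16 has a threshold of 10 entries, leaving free slots for probing to terminate on.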

After the table is expanded, how are the entries of the old table rehashed?

private void resize() {
        Entry[] oldTab = table;
        int oldLen = oldTab.length;
        int newLen = oldLen * 2;        // the capacity doubles
        Entry[] newTab = new Entry[newLen];
        int count = 0;

        for (int j = 0; j < oldLen; ++j) {
            Entry e = oldTab[j];
            if (e != null) {
                ThreadLocal<?> k = e.get();
                if (k == null) {
                    e.value = null; // Help the GC: stale entries are dropped before rehashing
                } else {
                    int h = k.threadLocalHashCode & (newLen - 1);   // re-locate each entry using the new length
                    while (newTab[h] != null)
                        h = nextIndex(h, newLen);
                    newTab[h] = e;
                    count++;
                }
            }
        }

        setThreshold(newLen);
        size = count;
        table = newTab;
    }
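To see why every entry has to be re-located, consider how the masked index changes when the length doubles. ResizeIndexDemo and the hash values 3 and 19 are chosen purely for illustration.

```java
class ResizeIndexDemo {
    // Same masking expression as in resize(): valid because len is a power of two.
    static int index(int hash, int len) {
        return hash & (len - 1);
    }

    public static void main(String[] args) {
        // With 16 slots, both hashes share home slot 3 and would collide.
        System.out.println(index(3, 16));   // 3
        System.out.println(index(19, 16));  // 3
        // With 32 slots, one more bit of the hash is used and they separate.
        System.out.println(index(3, 32));   // 3
        System.out.println(index(19, 32));  // 19
    }
}
```

Doubling the length brings one extra bit of the hash into the mask, so some entries keep their slot and others move, which is why resize() recomputes the index for every live entry.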

Summary

That is roughly the whole of ThreadLocal. If anything in the explanation is wrong, please point it out, and if you have questions, feel free to discuss them.

Origin blog.csdn.net/weixin_43901067/article/details/106504597