ThreadLocal source code analysis: data shared within a thread, isolated between threads

Originally I wanted to talk about Thread in Java. After all, I have already written about join and the states of a Thread, so continuing with Thread itself seemed natural. But then I ran into ThreadLocal. The two are closely connected, and to understand Thread thoroughly I have to cover ThreadLocal first, so here we are.

Everyone should be familiar with ThreadLocal. When I was first learning Java, we all had to open database connections by hand to write data to the database. I wonder whether you ever ran into the problem of a connection being occupied \(^o^)/~. After looking it up, I found out that ThreadLocal was the answer. Of course, nowadays everyone just uses a ready-made tool such as C3P0 or Druid.

ThreadLocal does not seem to come up much in day-to-day work. If it were not for interviews, many people would probably never look at it. In fact, ThreadLocal is very useful, and its design is a classic piece of encapsulation.

As for where it is used: besides the database connections I just mentioned, there is the paging plug-in PageHelper, if you use it. Internally it also uses ThreadLocal to pass information such as the page number, and before the SQL is executed these parameters are woven into the SQL statement to implement paging. Another example is the Spring framework, which everyone should know. The beans we inject are singletons by default: with this scope, the IOC container creates only one instance of the bean and returns that same instance every time. So think about it: with multiple threads, is a single bean instance really okay? Will there be a thread-safety problem?

If you look at the source code, you will find that Spring keeps its per-thread state in ThreadLocal (for example, RequestContextHolder for the current request and TransactionSynchronizationManager for transaction resources). Data is shared within the same thread and isolated between threads, so each request sees only its own data. A singleton bean that keeps mutable state in its own fields still needs care, though.

     How does ThreadLocal implement this mechanism? Let's take a look.

public class LocalTest {
    private static final ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        Thread threadA = new Thread(new Runnable() {
            @Override
            public void run() {
                threadLocal.set("thread A");
                // to do sth
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                // each thread reads back only the value it stored itself
                System.out.println("threadLocal in thread A: " + threadLocal.get());
            }

        }, "thread A");
        /* The same thing written as a lambda instead of an anonymous Runnable */
        Thread threadB = new Thread(() -> {
            threadLocal.set("thread B");
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // to do sth
            System.out.println("threadLocal in thread B: " + threadLocal.get());
        }, "thread B");
        threadA.start();
        threadB.start();

       // threadLocal.set("666");
        // main never called set(), so get() returns null here
        String s = threadLocal.get();
        System.out.println("value read in main: " + s);
    }
}

This code is very simple. It creates a ThreadLocal, starts two threads that each store a value in it, and then reads the ThreadLocal in the main method. main gets nothing back (null), because a value stored in a ThreadLocal can only be read by the thread that stored it.

Also, a ThreadLocal is just a key. It holds no map of its own and stores no data. The map is ThreadLocalMap, a static nested class of ThreadLocal whose entries hold their keys through weak references. The map instance itself lives in the thread, in the Thread field threadLocals, so calling get() really means reading a property of the current thread object, with the ThreadLocal as the key. If a thread's threadLocals contains several entries, it means several ThreadLocal objects have stored values in that thread.
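To make the "ThreadLocal is only a key" idea concrete, here is a minimal sketch using the real ThreadLocal API. The class name TwoLocalsDemo and the USER and PAGE keys are made up for illustration. Two keys store two values in the caller's thread, and a second thread, which has its own empty map, sees neither.

```java
import java.util.ArrayList;
import java.util.List;

class TwoLocalsDemo {
    // Two independent keys; each thread's own map can hold one value per key.
    private static final ThreadLocal<String> USER = new ThreadLocal<>();
    private static final ThreadLocal<Integer> PAGE = new ThreadLocal<>();

    /** Runs the demo and returns what each thread observed. */
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        USER.set("alice");
        PAGE.set(3);

        Thread other = new Thread(() ->
            // This thread never called set(), so both reads return null.
            seen.add("other: " + USER.get() + ", " + PAGE.get()));
        other.start();
        try {
            other.join(); // join() also makes the other thread's write to 'seen' visible
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        seen.add("caller: " + USER.get() + ", " + PAGE.get());
        USER.remove();
        PAGE.remove();
        return seen;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        // prints "other: null, null" then "caller: alice, 3"
    }
}
```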

Next we will analyze the source code of ThreadLocal to see the mechanism. But before reading the source, I still want to ask: why use ThreadLocal at all? The simplest alternative: can't I just pass the value in as a parameter? The logic would work just as well. And can't I create a static ConcurrentHashMap and pass the parameters through that map?

       Yes, of course it’s no problem! ThreadLocal only plays a role of parameter passing.

But:

1. Passing parameters directly is fine for a single method, but what if many methods need the value? What about calls between different services? Every method along the way has to add the parameter. First, the parameter lists become bloated. Second, the business modules become too tightly coupled, which goes against our design guidelines and makes changes cumbersome. I think this is the main reason for using ThreadLocal.

2. Using a ConcurrentHashMap also works, but then you have to maintain that map yourself, and the map is not bound to a thread. What if another thread modifies this thread's data? You have to think about data safety, and so on.

      In short, the purpose of using ThreadLocal is to decouple, and the function is to share data in the same thread and isolate data in different threads.
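The decoupling argument can be sketched as follows. All names here (UserContext, OrderService, AuditService, UserContextDemo) are hypothetical and not taken from any framework. The point is only that the current user reaches AuditService without appearing in any method signature along the way.

```java
/** Hypothetical per-thread context holder. */
class UserContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static void set(String user) { CURRENT_USER.set(user); }
    static String get() { return CURRENT_USER.get(); }
    static void clear() { CURRENT_USER.remove(); }
}

class OrderService {
    // No 'user' parameter: this layer does not need to know about it.
    static String createOrder(String item) {
        return AuditService.record("order:" + item);
    }
}

class AuditService {
    // Reads the user from the current thread instead of from a parameter.
    static String record(String action) {
        return UserContext.get() + " -> " + action;
    }
}

class UserContextDemo {
    static String handleRequest(String user, String item) {
        UserContext.set(user);
        try {
            return OrderService.createOrder(item);
        } finally {
            UserContext.clear(); // always clean up, the thread may be reused
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("alice", "book")); // prints "alice -> order:book"
    }
}
```

The cost of this style is that the dependency on the context becomes implicit, so it is probably best kept for cross-cutting data such as the current user or a trace id.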

I used to be confused by ThreadLocal: how is it any different from passing parameters? I could never quite figure it out. Once I understood what it is for and read the source code, it all felt much clearer, O(∩_∩)O haha~.

     Okay, let's get to the point.

     

// ThreadLocal set method
public void set(T value) {
        // get the current thread
        Thread t = Thread.currentThread();
        // get the current thread's ThreadLocalMap
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }

This code is very simple and everyone can follow it, so let's go through it bit by bit. There is not much to say about getting the current thread.

 ThreadLocalMap map = getMap(t);

This line is worth a few words. When I first saw it, I thought it was also a Map get method, and I kept thinking that Thread had two maps. In fact, once you step into the method, you find that it just reads a field of the current thread, threadLocals.

/*------------------threadLocal--------------------------------*/  
ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }

/*--------------------thread------------------------------*/
   
 ThreadLocal.ThreadLocalMap threadLocals = null;

 


So this code first reads the thread's ThreadLocalMap field and then works with the values inside it. get() and set() start the same way: both fetch the ThreadLocalMap, much like calling a getter on the thread object. If the map is not null, the operation continues on it; otherwise the map is created first.
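The relationship between the thread, the map, and the key can be modeled in a few lines. This is a toy sketch, not the JDK implementation: ToyThread and ToyLocal are hypothetical stand-ins, a plain HashMap replaces ThreadLocalMap, and weak references are ignored. It only works on threads that are ToyThread instances.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Thread: the per-thread map is just a field. */
class ToyThread extends Thread {
    final Map<ToyLocal<?>, Object> locals = new HashMap<>();

    ToyThread(Runnable body) { super(body); }
}

/** Hypothetical stand-in for ThreadLocal: only a key into the current thread's map. */
class ToyLocal<T> {
    private static Map<ToyLocal<?>, Object> mapOfCurrentThread() {
        // Mirrors getMap(t): read a field of the current thread.
        return ((ToyThread) Thread.currentThread()).locals;
    }

    void set(T value) { mapOfCurrentThread().put(this, value); }

    @SuppressWarnings("unchecked")
    T get() { return (T) mapOfCurrentThread().get(this); }
}

class ToyLocalDemo {
    /** Thread a stores "A"; thread b reads the same key and finds nothing. */
    static String[] run() {
        ToyLocal<String> name = new ToyLocal<>();
        String[] seen = new String[2];
        ToyThread a = new ToyThread(() -> { name.set("A"); seen[0] = name.get(); });
        ToyThread b = new ToyThread(() -> seen[1] = name.get());
        try {
            a.start(); a.join();
            b.start(); b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0] + ", " + seen[1]); // prints "A, null"
    }
}
```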

Let's continue with the set method. The first time set is called in a thread, the ThreadLocalMap is null and a new one has to be created. When debugging, I recommend creating a new thread and using ThreadLocal there, because the main thread may already have had entries put into its ThreadLocalMap during startup, so by the time our code runs the map is no longer null.

// ThreadLocalMap constructor
ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
            // the initial capacity is 16
            table = new Entry[INITIAL_CAPACITY];
            // & with (capacity - 1) works like taking the remainder, but is much faster
            int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
            // create a new Entry
            table[i] = new Entry(firstKey, firstValue);
            size = 1;
            // set the resize threshold
            setThreshold(INITIAL_CAPACITY);
        }

//------------------------------------------------------------
 // the load factor is 2/3, not HashMap's 0.75
   private void setThreshold(int len) {
            threshold = len * 2 / 3;
        }

This is a constructor with parameters, nothing surprising: it initializes a few fields, wraps the key and value in an Entry, and puts the entry in the table, so the first value is stored at the same time the map is created.
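The index and threshold arithmetic in the constructor can be checked with a small sketch. The class SlotIndexDemo and its helper names are made up. The constant 0x61c88647 is the HASH_INCREMENT that the JDK's ThreadLocal adds for each new instance's threadLocalHashCode. Starting the counter at 0 is an assumption for the demo, since in a running JVM other ThreadLocal objects have usually consumed some codes already.

```java
import java.util.HashSet;
import java.util.Set;

class SlotIndexDemo {
    // Step between consecutive threadLocalHashCode values in the JDK source.
    static final int HASH_INCREMENT = 0x61c88647;

    /** Index for a power-of-two capacity: the bit mask replaces the modulo. */
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    /** Same formula as setThreshold(len). */
    static int threshold(int len) {
        return len * 2 / 3;
    }

    /** Counts how many distinct slots the first 'count' hash codes land in. */
    static int distinctSlots(int count, int capacity) {
        Set<Integer> slots = new HashSet<>();
        int hash = 0; // assumption: the hash counter starts at 0
        for (int k = 0; k < count; k++) {
            slots.add(indexFor(hash, capacity));
            hash += HASH_INCREMENT; // int overflow simply wraps around
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println(threshold(16));                  // prints 10
        System.out.println(indexFor(HASH_INCREMENT, 16));   // prints 7
        System.out.println(distinctSlots(16, 16));          // prints 16
    }
}
```

Because the increment is odd, sixteen consecutive hash codes land in sixteen different slots of a 16-slot table, which keeps collisions rare in the common case.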

       // set method
        private void set(ThreadLocal<?> key, Object value) {
            
            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);
            /* Linear probing: while the slot found by the hash is occupied, move on to the next slot */
            for (Entry e = tab[i];e != null;e = tab[i = nextIndex(i, len)]) {
                // key stored in the current slot
                ThreadLocal<?> k = e.get();
                // same key: replace the value and return
                if (k == key) {
                    e.value = value;
                    return;
                }
                // the key is null but the entry is not: the ThreadLocal has been garbage
                // collected, so reuse this stale slot for the new key/value and return
                if (k == null) {
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }
            // reached an empty slot: store a new entry there
            tab[i] = new Entry(key, value);
            int sz = ++size;
            // if no stale slot was cleaned and the size reached the threshold, rehash
            if (!cleanSomeSlots(i, sz) && sz >= threshold)
                rehash();
        }

This is the set method of ThreadLocalMap. The key is held through a weak reference. Why a weak reference? If an ordinary strong key-value structure were used here, the life cycle of each entry would be tied to the thread: as long as the thread is alive, the entry stays reachable during GC analysis and cannot be collected, and the program itself has no way to tell whether the entry can be cleaned up. Weak references are the third of Java's four reference types (strong, soft, weak, phantom) and are weaker than soft references: an object that is reachable only through weak references will generally not survive the next GC. Once a ThreadLocal is no longer strongly reachable and has been garbage collected, the key of its Entry in the ThreadLocalMap reads as null, which marks the entry as stale and lets ThreadLocalMap clean it up. Note that the value is still strongly referenced by the Entry until that cleanup happens.
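What a stale entry looks like can be shown deterministically. The Entry class below has the same shape as ThreadLocalMap.Entry (a WeakReference to the key plus a strongly held value), but the surrounding StaleEntryDemo class is a made-up sketch. It calls clear() to stand in for the garbage collector, because real collection timing cannot be relied on in an example.

```java
import java.lang.ref.WeakReference;

class StaleEntryDemo {
    /** Same shape as ThreadLocalMap.Entry: weak key, strong value. */
    static class Entry extends WeakReference<ThreadLocal<?>> {
        Object value;

        Entry(ThreadLocal<?> key, Object value) {
            super(key);
            this.value = value;
        }
    }

    /** Builds an entry and then simulates the key being collected. */
    static Entry makeStale() {
        Entry e = new Entry(new ThreadLocal<String>(), "payload");
        e.clear(); // stands in for the GC clearing the weak reference
        return e;
    }

    public static void main(String[] args) {
        Entry e = makeStale();
        System.out.println(e.get());  // prints null: the key is gone
        System.out.println(e.value);  // prints payload: the value is still held
    }
}
```

This is the state the `k == null` branch in set detects: the key is gone, but the value lingers until some cleanup pass sets it to null.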

The net effect of calling replaceStaleEntry(key, value, i) is that the entry for (key, value) ends up in slot i: either the existing entry for the key is moved there, or a new entry is created there. Look at the code below:

 private void replaceStaleEntry(ThreadLocal<?> key, Object value, int staleSlot) {
            Entry[] tab = table;
            int len = tab.length;
            Entry e;

            // by default, cleanup starts at the stale slot itself
            int slotToExpunge = staleSlot;
            // Scan backward until an empty (null) slot, remembering the earliest stale entry
            // (key == null) in this run; that position becomes slotToExpunge (the slot to expunge).
            // Why? Cleanup works run by run, with a null slot marking the end of a run, so this
            // makes sure stale entries between the previous null slot and staleSlot are not missed.
            for (int i = prevIndex(staleSlot, len);(e = tab[i]) != null;i = prevIndex(i, len))
                if (e.get() == null)
                    slotToExpunge = i;

            // scan forward, looking for the key
            for (int i = nextIndex(staleSlot, len);(e = tab[i]) != null;i = nextIndex(i, len)) {
                ThreadLocal<?> k = e.get();
                // found the key; with linear probing it is not necessarily at its hash position
                if (k == key) {
                    // update the value, then swap this entry with the one at staleSlot
                    e.value = value;

                    tab[i] = tab[staleSlot];
                    tab[staleSlot] = e;
                    // Start cleaning. If a stale slot was found earlier in the scan (the backward
                    // scan, or the forward scan before i), cleanup starts from that position;
                    // otherwise it starts from the current i
                    if (slotToExpunge == staleSlot)
                        slotToExpunge = i;
                    // expungeStaleEntry cleans the run, cleanSomeSlots then probes a few more slots
                    cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                    return;
                }

                // If this slot is stale and the backward scan found no stale slot,
                // move slotToExpunge to the current position.
                // If the key is never found, the value at staleSlot is cleared below; if it is found,
                // the entries are swapped (code above) and the stale entry moves further back (*^▽^*)
                if (k == null && slotToExpunge == staleSlot)
                    slotToExpunge = i;
            }

            // key not found: release the stale value and put a new entry in staleSlot
            tab[staleSlot].value = null;
            tab[staleSlot] = new Entry(key, value);

            // other stale entries were seen in this run, so clean up here as well
            if (slotToExpunge != staleSlot)
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
        }

I have to say, whoever wrote this source code is seriously good! It covers every case, and each step leads into the next. ThreadLocalMap cleans up in runs delimited by null slots (there is also a full-table cleanup, discussed below). So replaceStaleEntry first scans backward from the stale slot to the nearest null slot looking for other stale entries, then scans forward to check whether some slot already holds the key that was passed in. After that, the stale data is cleaned up.

That one took a long time to understand (╥╯^╰╥). Now let's look at deleting stale entries:

  // delete a stale entry
  private int expungeStaleEntry(int staleSlot) {
            Entry[] tab = table;
            int len = tab.length;

            // deletion starts at a slot whose key is null; the caller guarantees this
            // staleSlot -> the stale slot
            tab[staleSlot].value = null;
            tab[staleSlot] = null;
            size--;

            // rehash the following entries until an empty (null) slot is reached
            Entry e;
            int i;
            for (i = nextIndex(staleSlot, len);(e = tab[i]) != null;i = nextIndex(i, len)) {
                ThreadLocal<?> k = e.get();
                // key == null means the ThreadLocal was collected, so clear this entry too
                if (k == null) {
                    e.value = null;
                    tab[i] = null;
                    size--;
                } else {
                    // Re-place the live entry: try its hash position first and fall back to linear
                    // probing, so the table always stays consistent with linear probing.
                    // Some keys have just been removed, which may have left gaps before this entry.
                    int h = k.threadLocalHashCode & (len - 1);
                    if (h != i) {
                        tab[i] = null;
                        // find the next empty slot starting from h, then move e (the old tab[i]) there
                        while (tab[h] != null)
                            h = nextIndex(h, len);
                        tab[h] = e;
                    }
                }
            }
            return i;
        }

As we can see, the loop condition is (e = tab[i]) != null, meaning that one call stops at the first null slot, but the method is called many times. The method itself is fairly simple: if an entry's key is null, the entry is cleared; otherwise the entry is re-placed by linear probing so that the table stays consistent.
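Why the surviving entries have to be re-placed can be seen in a toy table. ProbeTable below is a hypothetical, heavily simplified model (int keys, no values, no weak references, and it assumes the table never fills up). Its expunge method follows the same idea as the live-entry branch of expungeStaleEntry.

```java
class ProbeTable {
    final Integer[] slots; // null means "empty slot"

    ProbeTable(int capacity) { slots = new Integer[capacity]; } // capacity: a power of two

    int home(int key) { return key & (slots.length - 1); }
    int next(int i) { return (i + 1) & (slots.length - 1); }

    void put(int key) {
        int i = home(key);
        while (slots[i] != null) i = next(i); // linear probing
        slots[i] = key;
    }

    /** Probes from the home slot until the key or an empty slot is found. */
    int indexOf(int key) {
        for (int i = home(key); slots[i] != null; i = next(i))
            if (slots[i] == key) return i;
        return -1;
    }

    /** Empties one slot and leaves the followers where they are. */
    void naiveRemove(int slot) { slots[slot] = null; }

    /** Empties one slot, then re-places every entry up to the next empty slot. */
    void expunge(int slot) {
        slots[slot] = null;
        for (int i = next(slot); slots[i] != null; i = next(i)) {
            int h = home(slots[i]);
            if (h != i) {
                Integer e = slots[i];
                slots[i] = null;
                while (slots[h] != null) h = next(h);
                slots[h] = e;
            }
        }
    }
}

class ProbeTableDemo {
    private static ProbeTable threeCollidingKeys() {
        ProbeTable t = new ProbeTable(8);
        t.put(1); t.put(9); t.put(17); // all hash to slot 1, so they fill slots 1, 2, 3
        return t;
    }

    static int lookupAfterNaiveRemove() {
        ProbeTable t = threeCollidingKeys();
        t.naiveRemove(1);
        return t.indexOf(17);
    }

    static int lookupAfterExpunge() {
        ProbeTable t = threeCollidingKeys();
        t.expunge(1);
        return t.indexOf(17);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterNaiveRemove()); // prints -1: the gap hides key 17
        System.out.println(lookupAfterExpunge());     // prints 2: key 17 moved up and is found
    }
}
```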

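// clean some slots
 private boolean cleanSomeSlots(int i, int n) {
            boolean removed = false;
            Entry[] tab = table;
            int len = tab.length;
            do {
                i = nextIndex(i, len);
                Entry e = tab[i];
                // stale entry found: clean it
                if (e != null && e.get() == null) {
                    n = len;
                    removed = true;
                    // expunge this run and continue after it
                    i = expungeStaleEntry(i);
                }
            } 
            // This is a trade-off between scan cost and cleanup. n is halved on every pass, so when
            // nothing stale is found only about log2(n) slots are probed and the loop ends. Whenever
            // a stale entry is found, n = len resets the budget and scanning continues.
            // (If anyone knows more, pointers are welcome; I feel my understanding here is incomplete.)
            while ( (n >>>= 1) != 0); // n >>>= 1: unsigned right shift by one bit, until n is 0
            return removed;
        }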
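How many slots the do/while loop probes when it finds nothing stale can be counted directly. ScanCountDemo is a made-up helper that copies only the loop control from cleanSomeSlots and leaves out the table access. It assumes no stale entry is found, so n is never reset to len.

```java
class ScanCountDemo {
    /** Counts the loop bodies executed by: do { ... } while ((n >>>= 1) != 0); */
    static int scans(int n) {
        int count = 0;
        do {
            count++; // one slot would be probed here
        } while ((n >>>= 1) != 0);
        return count;
    }

    public static void main(String[] args) {
        System.out.println(scans(1));  // prints 1
        System.out.println(scans(10)); // prints 4
        System.out.println(scans(16)); // prints 5
    }
}
```

So the number of probes is floor(log2(n)) + 1 for n >= 1. When set calls the method with n = size the scan is very short, and when replaceStaleEntry calls it with n = len it is slightly longer.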

Then there is the rehash() method. Note that it is not the resize itself; there are more steps before the table is actually expanded.

 private void rehash() {
            // clean up once more, this time over the whole table
            expungeStaleEntries();

            // check whether a resize is still needed after the cleanup
            if (size >= threshold - threshold / 4)
                resize();
        }
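The resize condition in rehash() can be worked through with numbers. ResizeTriggerDemo is a hypothetical helper that only repeats the two formulas from the source: threshold = len * 2 / 3, and the check size >= threshold - threshold / 4.

```java
class ResizeTriggerDemo {
    /** Same formula as setThreshold(len). */
    static int threshold(int len) {
        return len * 2 / 3;
    }

    /** Smallest size that makes rehash() call resize() after the cleanup. */
    static int resizeTrigger(int len) {
        int t = threshold(len);
        return t - t / 4;
    }

    public static void main(String[] args) {
        for (int len = 16; len <= 64; len *= 2) {
            System.out.println(len + ": threshold=" + threshold(len)
                    + ", resize at size>=" + resizeTrigger(len));
        }
        // prints:
        // 16: threshold=10, resize at size>=8
        // 32: threshold=21, resize at size>=16
        // 64: threshold=42, resize at size>=32
    }
}
```

For these capacities the table is doubled once it is still at least half full after the full cleanup, while rehash() itself is only reached when the size has hit the 2/3 threshold.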

 

// this is the full-table cleanup: a plain traversal of every slot
private void expungeStaleEntries() {
            Entry[] tab = table;
            int len = tab.length;
            for (int j = 0; j < len; j++) {
                Entry e = tab[j];
                if (e != null && e.get() == null)
                    expungeStaleEntry(j);
            }
        }

 

 // resize: double the table
 private void resize() {
            Entry[] oldTab = table;
            int oldLen = oldTab.length;
            int newLen = oldLen * 2;
            Entry[] newTab = new Entry[newLen];
            int count = 0;
            // move every live entry into the new table
            for (int j = 0; j < oldLen; ++j) {
                Entry e = oldTab[j];
                if (e != null) {
                    ThreadLocal<?> k = e.get();
                    if (k == null) {
                        e.value = null; // Help the GC
                    } else {
                        int h = k.threadLocalHashCode & (newLen - 1);
                        while (newTab[h] != null)
                            h = nextIndex(h, newLen);
                        newTab[h] = e;
                        count++;
                    }
                }
            }
            // update the threshold, size and table
            setThreshold(newLen);
            size = count;
            table = newTab;
        }

Here is the get method; there is not much to say about it.

public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }
// --------------------------------------------------------------------
private T setInitialValue() {
        // compute the initial value; the default initialValue() returns null
        T value = initialValue();
        // read the thread's map field and store the value, as described above
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
        return value;
    }
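The effect of setInitialValue() is visible from the public API. In this sketch (the class InitialValueDemo and its field names are made up), a plain ThreadLocal returns null on the first get(), while ThreadLocal.withInitial and an overridden initialValue() supply a per-thread default that is created once and then stored in the map.

```java
class InitialValueDemo {
    // Default initialValue(): the first get() returns null.
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();

    // Supplier-based default, available since Java 8.
    static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    // Subclass-based default, the older style.
    static final ThreadLocal<Integer> COUNTER = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0;
        }
    };

    public static void main(String[] args) {
        System.out.println(PLAIN.get());                 // prints null
        System.out.println(COUNTER.get());               // prints 0
        // The initial value is stored on first access, so the same object comes back.
        System.out.println(BUFFER.get() == BUFFER.get()); // prints true
    }
}
```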

Then we look at the remove method.

 // remove method
 private void remove(ThreadLocal<?> key) {
            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);
            for (Entry e = tab[i];e != null;e = tab[i = nextIndex(i, len)]) {
                if (e.get() == key) {
                    e.clear();
                    expungeStaleEntry(i);
                    return;
                }
            }
        }

Here you can see that the loop does not traverse the whole table; it stops at the first null slot. ThreadLocalMap uses linear probing and preserves that layout whenever it clears entries, so a lookup never needs to go past the next null slot. The experts really do write rigorous code!

Let me explain one more issue at the end. Suppose we create the ThreadLocal like this:

private static final ThreadLocal<String> threadLocal = new ThreadLocal<>();

Then the ThreadLocal is always strongly referenced from the static field, so its key is never collected. The threads in a framework usually come from a thread pool and are not necessarily destroyed after a request, so the value stored in the ThreadLocal may not be cleared. You must therefore call remove() yourself, otherwise a later task on the same thread may see data that does not match expectations. As for the case where the ThreadLocal key becomes null, that should be when the ThreadLocal is created as a local variable inside a method, so that nothing references it strongly once the method finishes.
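The thread-pool problem is easy to reproduce with a single pooled thread. This is a minimal sketch: the class PoolLeakDemo, the CONTEXT key, and the "request-1" value are made up. Two tasks run one after the other on the same worker thread, and the second task reports what it finds in the ThreadLocal.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolLeakDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CONTEXT.set("request-1");
                try {
                    // ... handle the request ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove(); // the pattern to follow
                    }
                }
            };
            pool.submit(first).get(); // wait until the first task is done

            Callable<String> second = CONTEXT::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // prints request-1: stale data leaked
        System.out.println(secondTaskSees(true));  // prints null: cleaned up in finally
    }
}
```

Putting remove() in a finally block means the cleanup also runs when the request handling throws.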

Okay, the above is my understanding of ThreadLocal. If there are mistakes, please correct me~

No sacrifice,no victory~

 


Origin blog.csdn.net/zsah2011/article/details/109648019