Android: Simple understanding and use of ThreadLocal

1. Background

ThreadLocal generally has two usage scenarios in daily development:

  • Each thread needs its own exclusive object: for example, Looper in Android, or utility classes commonly used on the backend (such as SimpleDateFormat).
  • Each thread needs to hold a "global" variable: in a typical Java server, the Controller is the entry point that responds to requests, the Service handles business logic, and the Repository provides the database CRUD interface. Shared data such as the user information obtained in an interceptor can be placed in a ThreadLocal, so it does not have to be passed as a parameter through every layer.

1.1. Background and problems

Generally speaking, when some data is scoped to a thread and different threads hold different copies of it, ThreadLocal is worth considering.

  • For example, a Handler needs to obtain the Looper of the current thread. The scope of a Looper is clearly the thread, and different threads have different Loopers, so ThreadLocal makes it easy to store a Looper per thread.
    • Without ThreadLocal, the system would have to provide a global hash table so that a Handler could look up the Looper of a given thread, which would require a class along the lines of a LooperManager. The system does not do this and uses ThreadLocal instead, which is the benefit of ThreadLocal.
  • Another usage scenario is passing objects through complex logic, such as passing a listener. Sometimes the work done in a thread is complicated, which may show up as a deep function call stack and many different code entry points, and the listener needs to be available throughout the thread's execution. ThreadLocal can be used here: it lets the listener exist as a global object within the thread, and code running in that thread can obtain it simply by calling get.
    • Without ThreadLocal, two approaches come to mind:
      • The first is to pass the listener down the function call stack as a parameter.
      • The second is to make the listener a static variable that the thread accesses.

Both of the above methods have limitations.

  • The problem with the first approach is that when the call stack is deep, threading the listener through every function's parameters is hardly acceptable and makes the program design look bad.
  • The second approach is acceptable, but it does not scale. If two threads are executing at the same time, you need two static listener objects. What if 10 threads are executing concurrently? Provide 10 static listener objects? That is clearly unworkable. With ThreadLocal, each listener object is stored in its own thread, so the problem with the second approach does not arise.
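The listener scenario can be sketched roughly as follows. This is only an illustration: ListenerHolder, DeepTask and ListenerDemo are hypothetical names that do not come from the article or from Android, and the listener is modelled as a plain Consumer<String>.

```java
import java.util.function.Consumer;

// Hypothetical holder: one listener slot per thread.
class ListenerHolder {
    private static final ThreadLocal<Consumer<String>> LISTENER = new ThreadLocal<>();

    static void set(Consumer<String> listener) { LISTENER.set(listener); }
    static Consumer<String> get() { return LISTENER.get(); }
    static void clear() { LISTENER.remove(); }
}

// Hypothetical task with a (shallow) call stack that never passes the listener along.
class DeepTask {
    static void run(Consumer<String> listener) {
        ListenerHolder.set(listener);      // install the listener for this thread
        try {
            levelOne();
        } finally {
            ListenerHolder.clear();        // remove it once the task is finished
        }
    }

    private static void levelOne() { levelTwo(); }

    private static void levelTwo() {
        // Deep in the stack, the listener is fetched from the current thread.
        ListenerHolder.get().accept("done on " + Thread.currentThread().getName());
    }
}

class ListenerDemo {
    // Runs the task on two threads and returns what each thread's listener received.
    static String[] runOnTwoThreads() {
        String[] received = new String[2];
        Thread a = new Thread(() -> DeepTask.run(msg -> received[0] = msg), "worker-A");
        Thread b = new Thread(() -> DeepTask.run(msg -> received[1] = msg), "worker-B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

Each thread installs and reads only its own listener, so neither extra parameters nor one static field per thread are needed.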

1.2. Each thread needs an exclusive object

Given a timestamp, we usually use the SimpleDateFormat class to convert it into the desired date format. Suppose we have the following utility class:

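import java.text.SimpleDateFormat;
import java.util.Date;

public class DateUtils {

    public static String format(long milliSeconds) {
        // A new formatter is created on every call
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        return dateFormat.format(new Date(milliSeconds));
    }
}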

Now we simulate a multi-threaded environment through the thread pool:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalTest2 {

    private static ExecutorService threadPool = Executors.newFixedThreadPool(5);

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            int finalI = i;
            threadPool.submit(() -> {
                // Format the instant i seconds after the epoch on a pool thread
                String result = DateUtils.format(finalI * 1000);
                System.out.println(result);
            });
        }

        threadPool.shutdown();
    }
}

The output after running is as follows:

1970-01-01 08:00:03
1970-01-01 08:00:00
1970-01-01 08:00:02
1970-01-01 08:00:04
1970-01-01 08:00:01
1970-01-01 08:00:05
1970-01-01 08:00:08
1970-01-01 08:00:06
1970-01-01 08:00:09
1970-01-01 08:00:07

Everything works, but every call to format creates a new SimpleDateFormat object, which is unnecessary. We can modify the class as follows:

public class DateUtils {

    // One formatter shared by every thread (SimpleDateFormat is not thread-safe)
    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");

    public static String format(long milliSeconds) {
        return dateFormat.format(new Date(milliSeconds));
    }
}

Now run the code again:

1970-01-01 08:00:02
1970-01-01 08:00:02
1970-01-01 08:00:02
1970-01-01 08:00:02
1970-01-01 08:00:07
1970-01-01 08:00:02
1970-01-01 08:00:09
1970-01-01 08:00:09
1970-01-01 08:00:07
1970-01-01 08:00:07

Judging from the results, this version clearly has a problem: the output contains duplicated values, because the shared SimpleDateFormat is not thread-safe and the threads interfere with one another. So how do we solve this? This is where today's protagonist, ThreadLocal, comes on stage:

class DateUtils {

    // Each thread lazily creates its own formatter on its first call to get()
    private static ThreadLocal<SimpleDateFormat> threadLocal = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd hh:mm:ss"));

    public static String format(long milliSeconds) {
        return threadLocal.get().format(new Date(milliSeconds));
    }
}

Now run again:

1970-01-01 08:00:00
1970-01-01 08:00:01
1970-01-01 08:00:03
1970-01-01 08:00:05
1970-01-01 08:00:06
1970-01-01 08:00:04
1970-01-01 08:00:09
1970-01-01 08:00:02
1970-01-01 08:00:07
1970-01-01 08:00:08

This way the threads no longer interfere with each other, because the SimpleDateFormat object used inside format() belongs exclusively to the calling thread.
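One way to check the "one formatter per thread" claim is to count how many distinct SimpleDateFormat instances a group of threads ends up with. The sketch below is an assumption-laden illustration: PerThreadFormatterDemo and countDistinctInstances are hypothetical names, and plain threads are used instead of a pool so that every worker is certainly a different thread.

```java
import java.text.SimpleDateFormat;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class PerThreadFormatterDemo {
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd hh:mm:ss"));

    // Starts the given number of threads and returns how many distinct
    // formatter instances they saw in total.
    static int countDistinctInstances(int threads) {
        // Identity-based set: SimpleDateFormat.equals() compares patterns, not instances.
        Set<SimpleDateFormat> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<SimpleDateFormat, Boolean>()));
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                seen.add(FORMAT.get()); // first call creates this thread's instance
                seen.add(FORMAT.get()); // second call returns the same instance
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }
}
```

With four threads the method returns 4: each thread creates exactly one formatter and keeps reusing it.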

1.3. Each thread needs to store a global variable

Suppose we have a UserInfo class to represent user information:

class UserInfo {
    int id;

    public UserInfo(int id) {
        this.id = id;
    }
}

There is also a UserInfoHolder class that holds the ThreadLocal object:

class UserInfoHolder {

    // One UserInfo slot per thread
    static final ThreadLocal<UserInfo> holder = new ThreadLocal<>();
}

Construct three services, respectively representing the processing logic:

class Service1 {

    public void process() {
        UserInfo userInfo = new UserInfo(1);
        UserInfoHolder.holder.set(userInfo);
        new Service2().process();
    }
}

class Service2 {

    public void process() {
        System.out.println("in Service2 : " + UserInfoHolder.holder.get().id);
        new Service3().process();
    }
}

class Service3 {

    public void process() {
        System.out.println("in Service3 : " + UserInfoHolder.holder.get().id);
    }
}

In Service1, we set the value for ThreadLocal in UserInfoHolder; in Service2 and Service3, we can directly obtain the set UserInfo object through ThreadLocal in UserInfoHolder, so as to achieve sharing.

Finally write the main test method:

public class ThreadLocalTest3 {

    public static void main(String[] args) {
        new Service1().process();
    }
}

The result of the operation is as follows:

in Service2 : 1
in Service3 : 1

2. Principle of ThreadLocal

We can use the following figure to represent the relationship between Thread, ThreadLocal and ThreadLocalMap:
[Figure: relationship between Thread, ThreadLocal and ThreadLocalMap]
From the figure above, we can see that every use of ThreadLocal involves the ThreadLocalMap held by the thread. Although externally we call ThreadLocal.set(value), what actually happens is that set(key, value) is called on the thread's ThreadLocalMap, so we can reasonably guess that get also goes through the ThreadLocalMap. Let's look at how ThreadLocal's set and get methods are implemented and at the structure of ThreadLocalMap.

2.1. How to use ThreadLocal

ThreadLocal provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread, such as a user ID or transaction ID. If the main purpose is still unclear after this introduction, a simple picture may help.
[Figure: each thread holding its own copy of a ThreadLocal variable]
As the figure shows, through ThreadLocal each thread obtains a variable that is private to that thread. Let's go through a concrete example, starting with the following code.

class ThreadLocalTest {

    // This can lead to a memory leak, as described later
    private static ThreadLocal<String> mThreadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        mThreadLocal.set("thread main");
        new Thread(new A()).start();
        new Thread(new B()).start();
        System.out.println(mThreadLocal.get());
    }

    static class A implements Runnable {

        @Override
        public void run() {
            mThreadLocal.set("thread A");
            System.out.println(mThreadLocal.get());
        }
    }

    static class B implements Runnable {

        @Override
        public void run() {
            mThreadLocal.set("thread B");
            System.out.println(mThreadLocal.get());
        }
    }
}

In the code above, we set the value of mThreadLocal to "thread main" in the main thread, to "thread A" in thread A, and to "thread B" in thread B. Running the program prints the following (the order of the lines may vary between runs, since the threads are scheduled independently):

thread main
thread A
thread B

The results show that although the different threads access the same variable, mThreadLocal, the value each of them gets through ThreadLocal is different. This confirms the picture drawn above. Now that we know how ThreadLocal is used, let's look at how it works internally.

2.2. ThreadLocal's set method

public void set(T value) {
    Thread t = Thread.currentThread();  // get the current thread
    ThreadLocalMap map = getMap(t);     // get the thread's ThreadLocalMap
    if (map != null)
        map.set(this, value);           // key: this ThreadLocal object; value: the value being set
    else
        createMap(t, value);            // create a new ThreadLocalMap and store the value
}

When set(T value) is called, the method first gets the ThreadLocalMap of the current thread and then checks it:

  • If the map is not null, it calls the map's set method (the key is the current ThreadLocal object and the value is the value being assigned).
  • Otherwise, it creates a new ThreadLocalMap for the current thread and stores the value in it. The getMap() and createMap() methods look like this:
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

All data operations in ThreadLocal go through the ThreadLocalMap of the thread. Next, let's look at the code of ThreadLocalMap.
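The division of labour between the thread and the ThreadLocal can be imitated with a toy model. This is a sketch under stated assumptions, not the JDK implementation: MapOwningThread and ToyThreadLocal are hypothetical classes, a HashMap stands in for ThreadLocalMap, and the map is created eagerly rather than on the first set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical thread subclass that owns its map, mirroring the Thread.threadLocals field.
class MapOwningThread extends Thread {
    final Map<ToyThreadLocal<?>, Object> locals = new HashMap<>();

    MapOwningThread(Runnable task) {
        super(task);
    }
}

// Toy stand-in for ThreadLocal: the key is `this`, the map lives in the thread.
class ToyThreadLocal<T> {
    void set(T value) {
        currentMap().put(this, value);
    }

    @SuppressWarnings("unchecked")
    T get() {
        return (T) currentMap().get(this);
    }

    private static Map<ToyThreadLocal<?>, Object> currentMap() {
        // Only works when called on a MapOwningThread.
        return ((MapOwningThread) Thread.currentThread()).locals;
    }
}

class ToyThreadLocalDemo {
    // Thread A sets and reads a value; thread B only reads. Returns what each saw.
    static String[] run() {
        ToyThreadLocal<String> name = new ToyThreadLocal<>();
        String[] seen = new String[2];
        MapOwningThread a = new MapOwningThread(() -> {
            name.set("A");
            seen[0] = name.get();
        });
        MapOwningThread b = new MapOwningThread(() -> seen[1] = name.get());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Thread A reads back "A", while thread B sees null because its own map never received an entry for that key. The ToyThreadLocal object itself stores nothing.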

2.3. ThreadLocalMap internal structure

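[Figure: internal structure of ThreadLocalMap]
ThreadLocalMap is a static nested class of ThreadLocal. The official comment is thorough; roughly translated: ThreadLocalMap is a customized hash map suited only to maintaining thread-local values, and to help deal with very large and long-lived usages, its entries use weak references for keys.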

The specific code of ThreadLocalMap is as follows:

static class ThreadLocalMap {

    // Entries are stored in the table; the key is held through a weak reference
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

    // Initial capacity of the table
    private static final int INITIAL_CAPACITY = 16;

    // The table that stores the entries
    private Entry[] table;

    // Number of entries in the table
    private int size = 0;

    // Size at which the table is resized
    private int threshold; // Default to 0

    // Set the resize threshold to 2/3 of the given table length
    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }

    // Called when the first entry is stored: allocates the table and sets the resize threshold
    ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
        table = new Entry[INITIAL_CAPACITY]; // initial table length is 16
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        setThreshold(INITIAL_CAPACITY); // threshold is 2/3 of the initial length
    }

    // ... remaining methods omitted ...
}

As the code shows, although ThreadLocalMap is officially described as a hash table, its internal structure differs from hash tables we are used to, such as HashMap.

Internally, ThreadLocalMap maintains only an array, Entry[] table. The key of each Entry is a weak reference (why weak references are used is explained below). When the first value is stored, the array is allocated with a length of 16 and the resize threshold is set to 2/3 of that length.
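The slot computation in the constructor, threadLocalHashCode & (INITIAL_CAPACITY - 1), can be tried out in isolation. The sketch below rests on two assumptions about OpenJDK's ThreadLocal that the listing above does not show: hash codes are handed out from a counter that starts at 0 and grows by 0x61c88647 for each new instance. SlotMath and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

class SlotMath {
    // Assumed increment between the hash codes of successive ThreadLocal instances
    static final int HASH_INCREMENT = 0x61c88647;

    // Slot of the n-th ThreadLocal (n = 0, 1, 2, ...) in a table of the given length,
    // assuming hash codes are 0, HASH_INCREMENT, 2 * HASH_INCREMENT, ...
    static int slotFor(int n, int tableLength) {
        int hash = n * HASH_INCREMENT;      // int overflow simply wraps around
        return hash & (tableLength - 1);    // same masking as in the constructor
    }

    // Counts how many different slots the first `count` instances land in
    static int distinctSlots(int count, int tableLength) {
        Set<Integer> slots = new HashSet<>();
        for (int n = 0; n < count; n++) {
            slots.add(slotFor(n, tableLength));
        }
        return slots.size();
    }
}
```

Under these assumptions the first four instances land in slots 0, 7, 14 and 5 of a 16-slot table, and the first 16 instances fill all 16 slots without colliding. The mask only works because the table length is a power of two.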

2.3.1. The set() method of ThreadLocalMap

private void set(ThreadLocal<?> key, Object value) {
    // Compute the slot from the hash code
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);

    // Probe from that slot: overwrite if the key matches, otherwise keep looking for a free slot
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) { // move on to the next slot
        ThreadLocal<?> k = e.get();
        // Case 1: same key, overwrite the value
        if (k == key) {
            e.value = value;
            return;
        }
        // Case 2: the entry is stale (its key is null), so reuse its slot and clean up stale entries
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }
    // Case 3: neither of the above, store a new entry in the empty slot
    tab[i] = new Entry(key, value);
    int sz = ++size;
    // If no stale entries were cleaned and the size has reached the threshold, rehash
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}

From the code, the set method of ThreadLocalMap handles three cases:

  • 1. Compute the slot of the current ThreadLocal in the table array and probe forward from there until an empty slot (a null Entry) is reached. If an Entry whose key is the current ThreadLocal instance is found along the way, replace its value directly.
  • 2. If an Entry is found to be stale (its key is null), call replaceStaleEntry to replace it.
  • 3. If the probe ends without hitting case 1 or case 2, create a new Entry and store it in the empty slot where the probe stopped.
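The three cases can be reproduced with a much smaller linear-probing table. The sketch below is a deliberate simplification and not the JDK code: ProbingTable and Slot are hypothetical names, a null key field stands in for a garbage-collected ThreadLocal, the hash is passed in by the caller, and there is no resizing or cleanup (so the table must never be allowed to fill up). Unlike replaceStaleEntry, case 2 here does not scan ahead for an existing entry with the same key.

```java
class ProbingTable {
    static final class Slot {
        Object key;    // null models a stale entry whose ThreadLocal was collected
        Object value;

        Slot(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    final Slot[] table = new Slot[16];

    // Next index, wrapping around at the end of the array
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Stores the value and returns which case fired:
    // 1 = same key overwritten, 2 = stale slot reused, 3 = new entry in an empty slot.
    int set(Object key, int hash, Object value) {
        int len = table.length;
        int i = hash & (len - 1);
        for (Slot s = table[i]; s != null; s = table[i = nextIndex(i, len)]) {
            if (s.key == key) {       // case 1
                s.value = value;
                return 1;
            }
            if (s.key == null) {      // case 2 (simplified)
                table[i] = new Slot(key, value);
                return 2;
            }
        }
        table[i] = new Slot(key, value);  // case 3
        return 3;
    }
}
```

Storing two different keys with the same hash places the second one in the next slot, which is the probing behaviour described in case 1 and case 3.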

2.3.1.1. The first case: the keys are the same

If the key of the Entry at the probed position is the same as the key being stored, the value is overwritten directly, as shown in the figure below.
[Figure: overwriting the value of an Entry with the same key]
In other words, when an Entry with the same key already exists in the array, ThreadLocalMap simply overwrites its value.

2.3.1.2. The second case: the key of the Entry at the probed position is null

[Figure: replacing a stale Entry whose key is null]
As the figure shows, when we add a new Entry (key = 19, value = 200, index = 3), an old Entry (key = null, value = 19) already occupies that position in the array.

When this happens, the method stores the new Entry's key and value in the old Entry's slot and, at the same time, clears the stale entries (key == null) it encounters by setting them to null (the yellow entries in the figure).

In the source code, when the new Entry's position already holds data whose key is null, the replaceStaleEntry method is called. Its code is as follows:

private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                               int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    Entry e;

    // Slot from which cleanup will start
    int slotToExpunge = staleSlot;

    // Scan backwards until an empty slot, remembering the earliest stale entry (null key) in this run
    for (int i = prevIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = prevIndex(i, len))
        if (e.get() == null) // stale entry: start cleanup from here instead
            slotToExpunge = i;

    // Scan forwards until an empty slot, looking for the key or further stale entries
    for (int i = nextIndex(staleSlot, len); // note: the index can wrap around to 0
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        // Found the same key further along: update its value
        if (k == key) {
            e.value = value;
            // Swap it into staleSlot so it sits closer to its hash position
            tab[i] = tab[staleSlot];
            tab[staleSlot] = e;

            // If the backward scan found no stale entry, start cleanup at i,
            // which now holds the stale entry that was swapped out
            if (slotToExpunge == staleSlot)
                slotToExpunge = i;

            // Clean up stale entries starting from slotToExpunge
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
            return;
        }

        // If the backward scan found no stale entry and this one is stale,
        // start cleanup from here
        if (k == null && slotToExpunge == staleSlot)
            slotToExpunge = i;
    }

    // Key not found: release the stale entry's value and put a new Entry in its slot
    tab[staleSlot].value = null;
    tab[staleSlot] = new Entry(key, value);

    // If any other stale entry was seen during the scans, clean up starting from it
    if (slotToExpunge != staleSlot)
        cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
}

The replaceStaleEntry function performs two traversals, using the stale Entry as the dividing line: one backward and one forward.

It distinguishes four main situations, shown in the following chart:
[Figure: the four situations handled by replaceStaleEntry]
The actual clearing of entries whose key is null in replaceStaleEntry is done by the expungeStaleEntry() and cleanSomeSlots() methods.

2.3.1.3. The third case: the probed position is null

[Figure: storing a new Entry in an empty slot]
To make the cleanup of the neighbouring entries easier to follow in the picture, I did not recalculate their positions (please keep this in mind!).

To save you from scrolling back to the full listing, here is the relevant code again:

tab[i] = new Entry(key, value);
int sz = ++size;
if (!cleanSomeSlots(i, sz) && sz >= threshold)
    rehash();

After the new Entry is stored, cleanSomeSlots tries to clear entries whose key is null. If nothing was cleared and the current size has reached the threshold (before any expansion the default is INITIAL_CAPACITY * 2 / 3, with INITIAL_CAPACITY = 16), rehash() is called to recompute the positions of the data. The code of rehash() is as follows:

private void rehash() {
    expungeStaleEntries();

    // Use lower threshold for doubling to avoid hysteresis
    if (size >= threshold - threshold / 4)
        resize();
}

// Clear every entry whose key is null
private void expungeStaleEntries() {
    Entry[] tab = table;
    int len = tab.length;
    for (int j = 0; j < len; j++) {
        Entry e = tab[j];
        if (e != null && e.get() == null)
            expungeStaleEntry(j);
    }
}

// Re-place every entry whose key is not null into a new array twice the old length
private void resize() {
    // Allocate a new table with double the capacity
    Entry[] oldTab = table;
    int oldLen = oldTab.length;
    int newLen = oldLen * 2;
    Entry[] newTab = new Entry[newLen];
    int count = 0;
    // Recompute the position of each entry
    for (int j = 0; j < oldLen; ++j) {
        Entry e = oldTab[j];
        if (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                e.value = null; // Help the GC
            } else {
                int h = k.threadLocalHashCode & (newLen - 1);
                while (newTab[h] != null)
                    h = nextIndex(h, newLen);
                newTab[h] = e;
                count++;
            }
        }
    }
    // Recompute the threshold as 2/3 of the new array length
    setThreshold(newLen);
    size = count;
    table = newTab;
}

As we can see, when data is added the map checks whether it needs to expand. Before expanding, it first clears every entry whose key is null (by calling expungeStaleEntries()), and if expansion is still needed it recomputes the position of each remaining entry in the new array.
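The two thresholds involved are easy to mix up, so here is the arithmetic on its own. The formulas are copied from setThreshold() and rehash() in the listings above; the class ResizeMath and its method names are hypothetical.

```java
class ResizeMath {
    // Same formula as setThreshold(len): rehash() is considered once size reaches this
    static int threshold(int tableLength) {
        return tableLength * 2 / 3;
    }

    // rehash() doubles the table only if at least this many entries
    // remain after the stale entries have been expunged
    static int resizeFloor(int tableLength) {
        int t = threshold(tableLength);
        return t - t / 4;
    }

    // Hypothetical helper: would rehash() go on to call resize()?
    static boolean wouldResize(int tableLength, int entriesAfterSweep) {
        return entriesAfterSweep >= resizeFloor(tableLength);
    }
}
```

For the default table of 16 slots, rehash() is triggered at 10 entries, and the table is doubled to 32 only if at least 8 entries are still present after the sweep. For 32 slots the corresponding values are 21 and 16.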

2.4. ThreadLocal's get() method

public T get() {
    Thread t = Thread.currentThread();  // get the current thread
    ThreadLocalMap map = getMap(t);     // get the thread's map
    if (map != null) {
        // Look up the entry stored under this ThreadLocal object
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    // No map, or no entry for this ThreadLocal: store and return the initial value
    return setInitialValue();
}

The get method of ThreadLocal is quite simple: it obtains the ThreadLocalMap of the current thread. If the map exists and holds an entry for the current key (the current ThreadLocal object), the stored value is returned; otherwise setInitialValue() stores and returns the initial value, creating the map if necessary.
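The fallback to the initial value is visible from the outside. In the small sketch below (InitialValueDemo and its fields are hypothetical names), a plain ThreadLocal yields null on the first get(), while one created with ThreadLocal.withInitial yields the value from its supplier.

```java
class InitialValueDemo {
    // No initial value configured: the default initial value is null
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();

    // Initial value supplied by a lambda, evaluated once per thread on first get()
    static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    // Reads the plain variable without ever setting it
    static String readPlain() {
        return PLAIN.get();
    }

    // Increments the calling thread's counter and returns the new value
    static int bump() {
        int next = COUNTER.get() + 1;
        COUNTER.set(next);
        return next;
    }
}
```

Called twice on the same thread, bump() returns 1 and then 2. A different thread would start again from 0.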

Internally, get calls ThreadLocalMap's getEntry() to obtain the data, so let's look at getEntry() next.

private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}

getEntry() is also simple: it uses the hash of the current key to compute a position and checks whether the Entry at that position in the array has the same key. If so, it returns that Entry directly; if not, it calls getEntryAfterMiss(), which we look at next.

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key) // same key: return the entry
            return e;
        if (k == null) // stale entry: clean up starting from this slot
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null; // not found
}

From the code above, if an Entry whose key is null is encountered during the lookup, get also calls expungeStaleEntry() internally to clear the stale entries starting at that position.

In other words, both the set() and get() methods of ThreadLocal clear entries whose key is null as they encounter them.

3. ThreadLocal memory leak problem

Whether an object can be reclaimed in Java depends on how it is referenced. Java has four kinds of references:

  • 1. Strong reference: as long as a strong reference exists, the garbage collector will never reclaim the object. In Object obj = new Object();, obj is a strong reference to the new Object, and the object can be reclaimed only after obj no longer refers to it.
  • 2. Soft reference (SoftReference): describes objects that are still useful but not essential. Objects that are only softly reachable are reclaimed before the system would otherwise throw an OutOfMemoryError.
  • 3. Weak reference (WeakReference): also describes non-essential objects, but is weaker than a soft reference. An object that is only weakly reachable survives only until the next garbage collection; when the collector runs, it is reclaimed regardless of whether memory is sufficient.
  • 4. Phantom reference (PhantomReference): also called a ghost reference, the weakest kind. Whether an object has a phantom reference has no effect on its lifetime, and the object cannot be obtained through a phantom reference.
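The reference classes themselves can be exercised without relying on the timing of the garbage collector. In the sketch below (ReferenceKindsDemo is a hypothetical name), the target stays strongly reachable while it is inspected, and weak.clear() is called by hand to imitate what the collector does once an object is only weakly reachable.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Returns { softSees, weakSees, phantomSees, weakSeesAfterClear }
    static boolean[] run() {
        Object target = new Object(); // strong reference keeps the object alive

        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        PhantomReference<Object> phantom =
                new PhantomReference<>(target, new ReferenceQueue<>());

        boolean softSees = soft.get() == target;     // referent still available
        boolean weakSees = weak.get() == target;     // referent still available
        boolean phantomSees = phantom.get() != null; // get() on a phantom reference is always null

        weak.clear(); // manual stand-in for the collector clearing the reference
        boolean weakSeesAfterClear = weak.get() != null;

        return new boolean[] {softSees, weakSees, phantomSees, weakSeesAfterClear};
    }
}
```

The result is { true, true, false, false }. The last value corresponds to the state of a ThreadLocalMap Entry whose key has been collected: e.get() returns null while the value field is still set.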

3.1. Why use weak references

If the key were a strong reference, then even after the object that referenced the ThreadLocal had been reclaimed, the ThreadLocalMap would still hold a strong reference to the ThreadLocal. Unless the entry were removed manually, the ThreadLocal would never be reclaimed, resulting in a memory leak.

3.2. Problems caused by weak references

As we saw above, ThreadLocalMap uses a weak reference to the ThreadLocal as the key. If no strong reference to a ThreadLocal remains outside the map, the ThreadLocal will be reclaimed at the next GC. The ThreadLocalMap then contains an Entry whose key is null, and there is no way to access the value of such an Entry.

If the current thread does not end for a long time, the value of an Entry whose key is null remains reachable through a strong reference chain: Thread Ref (reference to the current thread) -> Thread -> ThreadLocalMap -> Entry -> value. These values are never reclaimed, resulting in a memory leak.

The designers have taken this into account, however. When get(), set(), or remove() is called, the thread's ThreadLocalMap clears the entries whose key is null that it encounters: the value is set to null and the Entry itself is set to null, so that both the Entry and the value can be reclaimed at the next collection.
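Long-lived threads are most commonly met in thread pools, where a value left behind by one task is still there when the next task runs on the same thread. The sketch below shows the effect and the usual remedy of calling remove() in a finally block. PooledCleanupDemo, CURRENT_USER and the value "alice" are made up for the illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledCleanupDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two tasks one after the other on a single pooled thread and
    // returns the value the second task finds in the ThreadLocal.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable first = () -> {
            CURRENT_USER.set("alice");
            try {
                CURRENT_USER.get(); // stand-in for the real work of the task
            } finally {
                if (cleanUp) {
                    CURRENT_USER.remove(); // drop the entry before the thread is reused
                }
            }
        };
        Callable<String> second = CURRENT_USER::get;
        try {
            pool.submit(first).get();          // wait for the first task to finish
            return pool.submit(second).get();  // same thread, next task
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without cleanup the second task sees "alice"; with cleanup it sees null. Calling remove() explicitly also means the value does not have to wait for a later get() or set() to sweep it away.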

4. Summary

  • 1. In essence, ThreadLocal stores thread-local variables by operating on the ThreadLocalMap inside the thread.
  • 2. ThreadLocalMap stores its data in an array; the key (a weak reference) points to the ThreadLocal object and the value is the value that was set.
  • 3. ThreadLocal takes measures against memory leaks: when get(), set(), or remove() is called, entries in the thread's ThreadLocalMap whose key is null are cleared.
  • 4. When using ThreadLocal, be careful with static ThreadLocal fields. After you have finished using a ThreadLocal, decide, based on the lifecycle of the current thread, whether you need to clean up manually by calling remove() so that stale entries do not linger in the ThreadLocalMap.


Origin blog.csdn.net/JMW1407/article/details/129114943