Understand the principle and usage scenarios of ThreadLocal in one article

Foreword

Whether at work or in interviews, we run into ThreadLocal all the time. Let's talk about ThreadLocal today.

  1. What is ThreadLocal? Why use ThreadLocal
  2. A ThreadLocal use case
  3. The principle of ThreadLocal
  4. Why not use thread id directly as the key of ThreadLocalMap
  5. Why does it cause a memory leak? Is it because of weak references?
  6. Why is the key designed as a weak reference? Why not a strong reference?
  7. InheritableThreadLocal guarantees shared data between parent and child threads
  8. Application Scenarios and Notes for Using ThreadLocal

1. What is ThreadLocal? Why use ThreadLocal?

What is ThreadLocal?

ThreadLocal is a thread-local variable. If you create a ThreadLocal variable, each thread that accesses it has its own local copy of the variable. When multiple threads operate on this variable, each one is actually operating on the copy held by its own thread, which achieves thread isolation and avoids thread-safety issues in concurrent scenarios.

// Create a ThreadLocal variable
static ThreadLocal<String> localVariable = new ThreadLocal<>();
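
To make the isolation concrete, here is a minimal sketch. The class name ThreadIsolationDemo, the thread names, and the stored strings are made up for illustration. Two threads write to the same ThreadLocal variable, and each one reads back only its own value, while a thread that never called set() gets null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ThreadIsolationDemo {
    // One shared ThreadLocal variable; every thread gets its own value behind it.
    static final ThreadLocal<String> localVariable = new ThreadLocal<>();

    // Starts two threads that each set and then read the variable.
    // Returns what each thread observed, keyed by a label.
    static Map<String, String> run() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Runnable task = () -> {
            String name = Thread.currentThread().getName();
            localVariable.set("value-of-" + name);
            seen.put(name, localVariable.get());
        };
        Thread a = new Thread(task, "thread-A");
        Thread b = new Thread(task, "thread-B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // The calling thread never called set(), so get() returns null here.
        seen.put("caller", String.valueOf(localVariable.get()));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Neither thread overwrites the other's value, even though both go through the same static field.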

Why use ThreadLocal

In a concurrent scenario, multiple threads may modify a shared variable at the same time. This can cause thread-safety problems.

To solve thread-safety problems, you can use locking, for example synchronized or Lock. But locking may slow the system down. The schematic diagram of locking is as follows:

[Figure: schematic diagram of locking]

There is another solution, which is to trade space for time, that is, to use ThreadLocal. When a variable is accessed through ThreadLocal, each thread keeps its own copy of it. When multiple threads modify the variable, each actually operates on its own copy, which ensures thread safety.

 

2. A ThreadLocal use case

In daily development, ThreadLocal often appears in date-conversion utility classes. Let's look at a counterexample first:

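import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Date utility class
 */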
public class DateUtil {

    private static final SimpleDateFormat simpleDateFormat =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    public static Date parse(String dateString) {
        Date date = null;
        try {
            date = simpleDateFormat.parse(dateString);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return date;
    }
}

We run this DateUtil utility class in a multi-threaded environment:

public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(10);

        for (int i = 0; i < 10; i++) {
            executorService.execute(()->{
                System.out.println(DateUtil.parse("2022-07-24 16:34:30"));
            });
        }
        executorService.shutdown();
    }

Running it reports an error.

If we use ThreadLocal in the DateUtil utility class and run it again, the problem goes away:

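import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Date utility class
 */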
public class DateUtil {

    private static ThreadLocal<SimpleDateFormat> dateFormatThreadLocal =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    public static Date parse(String dateString) {
        Date date = null;
        try {
            date = dateFormatThreadLocal.get().parse(dateString);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return date;
    }

    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(10);

        for (int i = 0; i < 10; i++) {
            executorService.execute(()->{
                System.out.println(DateUtil.parse("2022-07-24 16:34:30"));
            });
        }
        executorService.shutdown();
    }
}

Output:

Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022
Sun Jul 24 16:34:30 GMT+08:00 2022

Why did the counterexample report an error? Because SimpleDateFormat is not thread-safe. When it is used as a shared variable, errors occur in concurrent multi-threaded scenarios.
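
For comparison, here is a sketch of the two fixes discussed so far, side by side: guarding one shared SimpleDateFormat with a lock, and giving each thread its own instance through ThreadLocal. The class name SafeDateParsing, the pool size, and the task count are arbitrary choices for illustration. Both variants parse the same string from many pooled threads and should produce exactly one distinct result.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SafeDateParsing {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Option 1: one shared formatter, guarded by a lock.
    private static final SimpleDateFormat SHARED = new SimpleDateFormat(PATTERN);

    static Date parseWithLock(String s) throws ParseException {
        synchronized (SHARED) {
            return SHARED.parse(s);
        }
    }

    // Option 2: one formatter per thread, no lock needed.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN));

    static Date parseWithThreadLocal(String s) throws ParseException {
        return PER_THREAD.get().parse(s);
    }

    // Parses the same string from many pooled threads and collects the distinct results.
    static Set<Long> distinctResults(boolean useLock) {
        Set<Long> results = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> {
                try {
                    Date d = useLock ? parseWithLock("2022-07-24 16:34:30")
                                     : parseWithThreadLocal("2022-07-24 16:34:30");
                    results.add(d.getTime());
                } catch (ParseException | RuntimeException e) {
                    results.add(-1L); // record any failure as a distinct marker
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println("lock:        " + distinctResults(true).size() + " distinct result(s)");
        System.out.println("ThreadLocal: " + distinctResults(false).size() + " distinct result(s)");
    }
}
```

The lock version serializes every call to parse, while the ThreadLocal version pays with one SimpleDateFormat instance per thread. That is the space-for-time trade mentioned in section 1.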

Why is there no problem once ThreadLocal is added? How does ThreadLocal guarantee safety in a concurrent scenario? Let's look at the core principle of ThreadLocal next.

3. The principle of ThreadLocal

3.1 Memory structure diagram of ThreadLocal

To get the big picture, let's first look at the memory structure diagram of ThreadLocal below.

From the memory structure diagram, we can see:

  • The Thread class has a member variable of type ThreadLocal.ThreadLocalMap.
  • ThreadLocalMap maintains an Entry array internally. Each Entry represents a complete object: its key is the ThreadLocal itself, and its value is the value of the ThreadLocal's generic type.

3.2 Key source code analysis

It is easier to understand alongside a few key pieces of source code. Going back to the source of the Thread class, you can see that the initial value of the ThreadLocalMap member variable is null:

public class Thread implements Runnable {
   // ThreadLocal.ThreadLocalMap is a field of Thread
   ThreadLocal.ThreadLocalMap threadLocals = null;
}

The key source code of ThreadLocalMap is as follows:

static class ThreadLocalMap {
    
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    // The Entry array
    private Entry[] table;
    
    // ThreadLocalMap constructor; the ThreadLocal is used as the key
    ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
        table = new Entry[INITIAL_CAPACITY];
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        setThreshold(INITIAL_CAPACITY);
    }
}

The key set() method in the ThreadLocal class:

    public void set(T value) {
        Thread t = Thread.currentThread(); // get the current thread t
        ThreadLocalMap map = getMap(t);  // get the ThreadLocalMap of the current thread
        if (map != null)  // the thread already has a ThreadLocalMap
            map.set(this, value); // store the key/value pair in the ThreadLocalMap
        else
            createMap(t, value); // create a new ThreadLocalMap
    }

    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals; // return the ThreadLocalMap field of the Thread object
    }

    void createMap(Thread t, T firstValue) { // calls the ThreadLocalMap constructor
        t.threadLocals = new ThreadLocalMap(this, firstValue); // this is the current ThreadLocal
    }

The key get() method in the ThreadLocal class:

    public T get() {
        Thread t = Thread.currentThread(); // get the current thread t
        ThreadLocalMap map = getMap(t); // get the ThreadLocalMap of the current thread
        if (map != null) { // the thread already has a ThreadLocalMap
            // look up the Entry by this (the ThreadLocal object) to get its value, typed by the ThreadLocal's generic parameter
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue(); // no map or no entry yet: store and return the initial value
    }

    private T setInitialValue() {
        T value = initialValue(); // compute the initial value
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t); // get the current thread's threadLocals field, which is a ThreadLocalMap
        if (map != null)
            map.set(this, value);  // store the key/value pair in the ThreadLocalMap
        else
            createMap(t, value); // instantiate the threadLocals field
        return value;
    }
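
The get() path above can be observed from the outside. The following sketch uses a hypothetical InitialValueDemo class with an invented counter to count how often the initializer passed to ThreadLocal.withInitial runs: once for the first get() on each thread, and not again while that thread's entry exists.

```java
import java.util.concurrent.atomic.AtomicInteger;

class InitialValueDemo {
    // Counts how many times the initializer runs across all threads.
    static final AtomicInteger initCalls = new AtomicInteger();

    static final ThreadLocal<StringBuilder> buffer = ThreadLocal.withInitial(() -> {
        initCalls.incrementAndGet();
        return new StringBuilder();
    });

    // Calls get() twice on the current thread and once on an extra thread.
    // Returns the number of initializer invocations that resulted.
    static int run() {
        initCalls.set(0);
        buffer.remove();              // start from a clean slate on this thread
        buffer.get().append("a");     // first get(): no entry yet, so the initial value is created
        buffer.get().append("b");     // second get(): entry found, initializer not called
        Thread other = new Thread(() -> buffer.get().append("x")); // another thread, another entry
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return initCalls.get();
    }

    // What the current thread has accumulated in its own StringBuilder.
    static String currentThreadContent() {
        return buffer.get().toString();
    }

    public static void main(String[] args) {
        System.out.println(run());                  // 2
        System.out.println(currentThreadContent()); // ab
    }
}
```

The "x" appended by the other thread never shows up in the current thread's buffer, because it went into a different StringBuilder stored in the other thread's map.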

So how should you explain the implementation principle of ThreadLocal? As follows, ideally together with the structure diagram above:

  • The Thread class has an instance variable threadLocals of type ThreadLocal.ThreadLocalMap, that is, each thread has its own ThreadLocalMap.
  • ThreadLocalMap maintains an Entry array internally. Each Entry represents a complete object: its key is the ThreadLocal itself, and its value is the value of the ThreadLocal's generic type.
  • In a concurrent multi-threaded scenario, when a thread sets a value on a ThreadLocal, the value is stored in that thread's own ThreadLocalMap; when reading, each thread uses the ThreadLocal reference as the key to look up the value in its own map. This is how thread isolation is achieved.
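
The three points above can be condensed into a toy model. This is only a sketch and not how the JDK implements it: since we cannot add a field to java.lang.Thread, the hypothetical MiniThreadLocal class below keeps "each thread's own map" in a side table, and it leaves out the weak references and the hand-written hash table entirely.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MiniThreadLocal<T> {
    // Stand-in for the Thread.threadLocals field (assumption for this toy only).
    // Unlike the real design, this side table keeps Thread objects alive.
    private static final Map<Thread, Map<MiniThreadLocal<?>, Object>> THREAD_LOCALS =
            new ConcurrentHashMap<>();

    // Plays the role of getMap(t) plus createMap(t, ...): fetch the current thread's own map.
    private static Map<MiniThreadLocal<?>, Object> mapOfCurrentThread() {
        return THREAD_LOCALS.computeIfAbsent(Thread.currentThread(), t -> new IdentityHashMap<>());
    }

    public void set(T value) {
        mapOfCurrentThread().put(this, value); // the key is this MiniThreadLocal, not the thread id
    }

    @SuppressWarnings("unchecked")
    public T get() {
        return (T) mapOfCurrentThread().get(this);
    }

    public void remove() {
        mapOfCurrentThread().remove(this);
    }

    // Two variables, two threads. Returns "<caller's view> | <child's view>".
    static String demo() {
        MiniThreadLocal<String> user = new MiniThreadLocal<>();
        MiniThreadLocal<Integer> count = new MiniThreadLocal<>();
        user.set("alice");
        count.set(1);
        StringBuilder child = new StringBuilder();
        Thread t = new Thread(() -> {
            user.set("bob"); // goes into the child thread's own map
            child.append(user.get()).append(',').append(count.get());
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        String result = user.get() + "," + count.get() + " | " + child;
        user.remove();
        count.remove();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // alice,1 | bob,null
    }
}
```

The child thread sees its own "bob" and no count at all, while the calling thread still sees "alice" and 1: same keys, different maps.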

After understanding these core methods, some readers may wonder: why does ThreadLocalMap use the ThreadLocal as the key? Wouldn't using the thread id directly work just as well?

4. Why not use the thread id directly as the key of ThreadLocalMap?

For example, the code is as follows:

public class TianLuoThreadLocalTest {

    private static final ThreadLocal<String> threadLocal1 = new ThreadLocal<>();
    private static final ThreadLocal<String> threadLocal2 = new ThreadLocal<>();
 
}

In this scenario, a class has two shared variables, that is, it uses two ThreadLocal member variables. If the thread id were used as the key of ThreadLocalMap, how would we tell which ThreadLocal member variable a value belongs to? Therefore the ThreadLocal itself still has to be the key. Each ThreadLocal object can be distinguished by its threadLocalHashCode attribute, which selects the slot in the table (entries are then matched by comparing the ThreadLocal reference itself), and in your own code each ThreadLocal object is told apart by its variable name (example below). Look at the code:

public class ThreadLocal<T> {
  private final int threadLocalHashCode = nextHashCode();
  
  private static int nextHashCode() {
    return nextHashCode.getAndAdd(HASH_INCREMENT);
  }
}
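
To see what this hash code is used for, here is a small sketch that repeats the same arithmetic outside the JDK. Note the assumptions: the excerpt above does not show the value of HASH_INCREMENT, and 0x61c88647 below is the constant from the JDK's ThreadLocal source; the class HashSpreadDemo and its method are invented for this illustration. It computes which slot of a 16-slot table each of 16 consecutively created keys would start at.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

class HashSpreadDemo {
    // Assumption: the HASH_INCREMENT constant as defined in the JDK's ThreadLocal source.
    static final int HASH_INCREMENT = 0x61c88647;

    // Returns the table slots chosen for `count` consecutively created keys
    // in a table of `capacity` slots (capacity must be a power of two).
    static int[] slots(int count, int capacity) {
        AtomicInteger nextHashCode = new AtomicInteger();
        int[] result = new int[count];
        for (int k = 0; k < count; k++) {
            int hash = nextHashCode.getAndAdd(HASH_INCREMENT); // same step as nextHashCode()
            result[k] = hash & (capacity - 1);                  // same masking as in ThreadLocalMap
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
        System.out.println(Arrays.toString(slots(16, 16)));
    }
}
```

Because the increment is odd, consecutive hash codes land on 16 different slots of a 16-slot table, so ThreadLocal objects created one after another start out without collisions.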

Then let's look at the next code example:

public class TianLuoThreadLocalTest {

    public static void main(String[] args) {
        Thread t = new Thread(new Runnable(){
            public void run(){
                ThreadLocal<TianLuoDTO> threadLocal1 = new ThreadLocal<>();
                threadLocal1.set(new TianLuoDTO("Hello 七度"));
                System.out.println(threadLocal1.get());
                ThreadLocal<TianLuoDTO> threadLocal2 = new ThreadLocal<>();
                threadLocal2.set(new TianLuoDTO("Hello: ccc"));
                System.out.println(threadLocal2.get());
            }});
        t.start();
    }

}
// Output
TianLuoDTO{name='Hello 七度'}
TianLuoDTO{name='Hello: ccc'}

It might be clearer if you compare it with this picture:

5. Why does ThreadLocal cause memory leaks?

5.1 Are memory leaks caused by weak references?

Let's first take a look at the reference diagram of ThreadLocal:

Regarding ThreadLocal memory leaks, the popular explanation on the Internet goes like this:

ThreadLocalMap uses a weak reference to the ThreadLocal as its key. When the ThreadLocal variable is manually set to null, that is, when no external strong reference to the ThreadLocal remains, the ThreadLocal will be reclaimed at the next GC. ThreadLocalMap then contains Entry objects whose key is null, and there is no way to access the value of such an Entry. If the current thread does not end for a long time (for example, a core thread of a thread pool), a strong reference chain to the value of each such Entry persists: Thread variable -> Thread object -> ThreadLocalMap -> Entry -> value -> Object. The value can never be reclaimed, causing a memory leak.

When the ThreadLocal variable is manually set to null, the reference chain looks like this:

In fact, this situation was taken into account in the design of ThreadLocalMap, and some protective measures were added: the get, set, and remove methods of ThreadLocal clear the values of entries whose key is null when they come across them in the thread's ThreadLocalMap.

This is reflected in the source code, for example in the set method of ThreadLocalMap:

  private void set(ThreadLocal<?> key, Object value) {

      Entry[] tab = table;
      int len = tab.length;
      int i = key.threadLocalHashCode & (len-1);

      for (Entry e = tab[i];
            e != null;
            e = tab[i = nextIndex(i, len)]) {
          ThreadLocal<?> k = e.get();

          if (k == key) {
              e.value = value;
              return;
          }

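           // If k is null, the key (the ThreadLocal object) previously stored at this slot has been reclaimed, usually because outside code set the ThreadLocal variable to null
           // and, since the entry only holds a weak reference to the ThreadLocal, a round of GC reclaimed the object.
           // In that case user code has already dropped the ThreadLocal and does not intend to use it as a key to read the old value again, so ThreadLocalMap simply replaces this stale entry.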
          if (k == null) {
              replaceStaleEntry(key, value, i);
              return;
          }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        // Trigger a scan of log2(N) cost to clear stale entries
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
          rehash();
    }

And the get method of ThreadLocal:

  public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // Look up the Entry in the ThreadLocalMap; the lookup contains clean-up logic for key == null
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

private Entry getEntry(ThreadLocal<?> key) {
        int i = key.threadLocalHashCode & (table.length - 1);
        Entry e = table[i];
        if (e != null && e.get() == key)
             return e;
        else
          // contains the clean-up logic for key == null
          return getEntryAfterMiss(key, i, e);
    }
        
private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
        Entry[] tab = table;
        int len = tab.length;

        while (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == key)
                return e;
            // A null key means the ThreadLocal had no external reference and was reclaimed by GC: this is a stale Entry
            if (k == null)
                expungeStaleEntry(i); // remove the stale Entry
            else
                i = nextIndex(i, len);
            e = tab[i];
        }
        return null;
    }

5.2 The key is a weak reference, will GC recycling affect the normal work of ThreadLocal?

At this point, some readers may wonder: since the key of ThreadLocalMap is a weak reference, will GC reclaim the key prematurely and break the normal use of ThreadLocal?

  • Weak references: objects that are only weakly referenced have a shorter lifetime. If an object has only weak references, the next GC will reclaim it (regardless of whether the current memory space is sufficient).

In fact, no. As long as the ThreadLocal variable is still strongly referenced, it will not be reclaimed by GC, unless the ThreadLocal variable is manually set to null. We can run a demo to verify this:

import java.lang.ref.WeakReference;

public class WeakReferenceTest {
    public static void main(String[] args) {
        Object object = new Object();
        WeakReference<Object> testWeakReference = new WeakReference<>(object);
        System.out.println("Before GC, weak reference: " + testWeakReference.get());
        // Request a garbage collection (System.gc() is only a hint to the JVM)
        System.gc();
        System.out.println("After GC, weak reference: " + testWeakReference.get());
        // Manually set object to null
        object = null;
        System.gc();
        System.out.println("object set to null, after GC, weak reference: " + testWeakReference.get());
    }
}
Output:
Before GC, weak reference: java.lang.Object@7b23ec81
After GC, weak reference: java.lang.Object@7b23ec81
object set to null, after GC, weak reference: null

So the conclusion is: you can put this worry to rest.

5.3 Demo of ThreadLocal memory leak

Next, let me show you an example of a memory leak. It simply uses a thread pool and keeps putting objects into a ThreadLocal.

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadLocalTestDemo {

    private static ThreadLocal<TianLuoClass> tianLuoThreadLocal = new ThreadLocal<>();


    public static void main(String[] args) throws InterruptedException {

        ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(5, 5, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<>());

        for (int i = 0; i < 10; ++i) {
            threadPoolExecutor.execute(new Runnable() {
                @Override
                public void run() {
                    System.out.println("Creating object:");
                    TianLuoClass tianLuoClass = new TianLuoClass();
                    tianLuoThreadLocal.set(tianLuoClass);
                    tianLuoClass = null; // set the variable to null to indicate the object is no longer used
                   // tianLuoThreadLocal.remove();
                }
            });
            Thread.sleep(1000);
        }
    }

    static class TianLuoClass {
        // 100M
        private byte[] bytes = new byte[100 * 1024 * 1024];
    }
}

Output:

Creating object:
Creating object:
Creating object:
Creating object:
Exception in thread "pool-1-thread-4" java.lang.OutOfMemoryError: Java heap space
	at com.example.dto.ThreadLocalTestDemo$TianLuoClass.<init>(ThreadLocalTestDemo.java:33)
	at com.example.dto.ThreadLocalTestDemo$1.run(ThreadLocalTestDemo.java:21)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

The run ends with an OOM. After the commented-out tianLuoThreadLocal.remove(); line is enabled, it no longer runs out of memory:

Creating object:
Creating object:
Creating object:
Creating object:
Creating object:
Creating object:
Creating object:
Creating object:
......

Here we never manually set the tianLuoThreadLocal variable to null, yet a memory leak still occurs. Because we use a thread pool, and pool threads have a long life cycle, each pool thread keeps holding the TianLuoClass object as the value in its ThreadLocalMap. Even after tianLuoClass = null; the reference from the map still exists. It is like putting an object into a list and then setting the object variable to null: the object in the list still exists.

    public static void main(String[] args) {
        List<Object> list = new ArrayList<>();
        Object object = new Object();
        list.add(object);
        object = null;
        System.out.println(list.size());
    }
    // Output
    1

That is how the memory leak happens; since memory is limited, an OOM is eventually thrown. If we add threadLocal.remove();, there is no memory leak. Why? Because threadLocal.remove(); clears the Entry. The source code is as follows:

    private void remove(ThreadLocal<?> key) {
        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);
        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            if (e.get() == key) {
                // clear the entry
                e.clear();
                expungeStaleEntry(i);
                return;
            }
        }
    }
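
The effect of remove() on a pooled thread can be shown without allocating 100M objects. In this sketch the class PooledThreadLeakDemo and its values are invented for illustration. Two tasks run back to back on the single thread of a pool: the first sets a value, the second only reads. Without remove(), the second task still sees the first task's data, because it lives in the thread's map rather than in the task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadLeakDemo {
    static final ThreadLocal<String> context = new ThreadLocal<>();

    // Runs two tasks one after the other on a single pooled thread.
    // Returns what the second task reads from the ThreadLocal.
    static String secondTaskSees(boolean callRemove) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                context.set("data-of-task-1");
                if (callRemove) {
                    context.remove(); // deletes this thread's Entry for `context`
                }
            };
            Callable<String> second = () -> context.get();
            pool.submit(first).get();          // wait until the first task is done
            return pool.submit(second).get();  // same worker thread, same ThreadLocalMap
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // data-of-task-1
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Besides holding memory, the leftover value is also a correctness risk: the second task reads data that belongs to someone else's request.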

Some readers may ask: since memory leaks are not necessarily caused by weak references, why does the key need to be a weak reference at all? Let's explore:

6. Why should the Key of Entry be designed as a weak reference?

From the source code, we can see that the key of Entry is designed as a weak reference (ThreadLocalMap uses a weak reference to the ThreadLocal as the key). Why is it designed this way?

 

Let's first recall the four kinds of references:

  • Strong references: the objects we usually create with new are strongly referenced, for example Object obj = new Object();. Even when memory is insufficient, the JVM would rather throw an OutOfMemoryError than reclaim such an object.
  • Soft references: if an object has only soft references, the garbage collector will not reclaim it while memory is sufficient; if memory is insufficient, the memory of these objects will be reclaimed.
  • Weak references: objects that are only weakly referenced have a shorter lifetime. If an object has only weak references, the next GC will reclaim it (regardless of whether the current memory space is sufficient).
  • Phantom references: if an object is held only by phantom references, it is as if it had no references at all and may be reclaimed by the garbage collector at any time. Phantom references are mainly used to track objects being reclaimed by the garbage collector.

Let's first look at what the official documentation (the comment on ThreadLocalMap in the JDK source) says about why the key is a weak reference:

To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys.

Here is the reference diagram of ThreadLocal again:

 

Let's consider the two cases:

  • If the key were a strong reference: when the ThreadLocal object ought to be reclaimed (no application code references it any more), ThreadLocalMap would still hold a strong reference to it. Unless the entry is deleted manually, the ThreadLocal will not be reclaimed, and the Entry leaks.
  • If the key is a weak reference: when the ThreadLocal object ought to be reclaimed, ThreadLocalMap holds only a weak reference to it, so the ThreadLocal is reclaimed even if the entry is not deleted manually. The value can then be cleared on a later call to set, get, or remove on the ThreadLocalMap.

Therefore, using a weak reference for the key of Entry provides an extra layer of protection: a weakly referenced ThreadLocal does not leak memory as easily, and the corresponding value can be cleared on a later call to set, get, or remove on the ThreadLocalMap.

In fact, the root cause of the memory leak is that Entry objects that are no longer used are not removed from the thread's ThreadLocalMap. Generally, there are two ways to delete an Entry that is no longer used:

  • One is to call remove() manually after you are done with the ThreadLocal, which deletes the Entry from the ThreadLocalMap.
  • The other is ThreadLocalMap's automatic clean-up mechanism for stale entries (calls to get() and set() on the ThreadLocalMap can trigger the clearing of stale Entry objects).

7. InheritableThreadLocal guarantees shared data between parent and child threads

We know that ThreadLocal is isolated per thread. If we want parent and child threads to share data, what can we do? We can use InheritableThreadLocal. Let's look at a demo first:

public class InheritableThreadLocalTest {

   public static void main(String[] args) {
       ThreadLocal<String> threadLocal = new ThreadLocal<>();
       InheritableThreadLocal<String> inheritableThreadLocal = new InheritableThreadLocal<>();

       threadLocal.set("Hello, 七度");
       inheritableThreadLocal.set("Hello, 七度");

       Thread thread = new Thread(()->{
           System.out.println("ThreadLocal value " + threadLocal.get());
           System.out.println("InheritableThreadLocal value " + inheritableThreadLocal.get());
       });
       thread.start();
       
   }
}
// Output
ThreadLocal value null
InheritableThreadLocal value Hello, 七度

We can see that in the child thread, the value of the InheritableThreadLocal variable can be obtained, but the value of the ThreadLocal variable cannot.

It is easy to understand why the ThreadLocal value cannot be obtained: it is thread-isolated. But how does InheritableThreadLocal do it? What is the principle?

In the Thread class, in addition to the member variable threadLocals, there is another member variable: inheritableThreadLocals. The two have the same type:

public class Thread implements Runnable {
   ThreadLocalMap threadLocals = null;
   ThreadLocalMap inheritableThreadLocals = null;
 }

In the init method of the Thread class, there is a piece of initialization logic:

 private void init(ThreadGroup g, Runnable target, String name,
                      long stackSize, AccessControlContext acc,
                      boolean inheritThreadLocals) {
      
        ......
        if (inheritThreadLocals && parent.inheritableThreadLocals != null)
            this.inheritableThreadLocals =
                ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);
        /* Stash the specified stack size in case the VM cares */
        this.stackSize = stackSize;

        /* Set thread ID */
        tid = nextThreadID();
    }

    // In the ThreadLocal class:
    static ThreadLocalMap createInheritedMap(ThreadLocalMap parentMap) {
        return new ThreadLocalMap(parentMap);
    }

We can see that when the parent's inheritableThreadLocals is not null, it is copied into the inheritableThreadLocals of the thread being created. Put plainly, if the parent thread's inheritableThreadLocals is not null, the new thread gets its own copy of it. Otherwise it works like ThreadLocal, except that the initial data comes from the parent thread. Interested readers can study the source code.
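
One consequence of this copy-on-creation behavior is worth seeing in code. In the sketch below (the class InheritSnapshotDemo and the values "A" and "B" are invented for illustration), the child receives the value the parent had when the child Thread object was constructed. Later changes on either side do not propagate to the other.

```java
class InheritSnapshotDemo {
    static final InheritableThreadLocal<String> inherited = new InheritableThreadLocal<>();

    // Waits for a thread to finish, preserving the interrupt flag if interrupted.
    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static String run() {
        StringBuilder log = new StringBuilder();
        // Everything runs in a fresh "parent" thread so the demo does not depend on the caller's state.
        Thread parent = new Thread(() -> {
            inherited.set("A");
            Thread child = new Thread(() -> {
                log.append("child sees ").append(inherited.get());
                inherited.set("changed-by-child"); // only changes the child's own copy
            });
            inherited.set("B"); // after the child object was created, so the child does not see it
            child.start();
            join(child);
            log.append("; parent sees ").append(inherited.get());
        });
        parent.start();
        join(parent);
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // child sees A; parent sees B
    }
}
```

This is also why InheritableThreadLocal needs care with thread pools: a pooled thread is created once, so it carries the values of whichever thread created it, not of the thread that later submits a task.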

8. Application Scenarios and Notes for Using ThreadLocal

A very important point when using ThreadLocal: call remove() manually once you are done with it.

The main application scenarios of ThreadLocal include:

  • In date utility classes that use SimpleDateFormat, use ThreadLocal to ensure thread safety.
  • Store user information globally (user information is kept in a ThreadLocal, so the current thread can read it wherever it is needed).
  • Guarantee that the database Connection obtained within the same thread is the same one, using ThreadLocal to avoid thread-safety problems.
  • Use MDC to store logging information.
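
As a sketch of the user-information scenario combined with the remove() rule, here is a small holder class. UserContext, runAs, and greet are hypothetical names invented for this example, not an existing API. The value is bound for the duration of one unit of work and always removed in a finally block, even if the work throws.

```java
import java.util.function.Supplier;

class UserContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Readable from anywhere on the current thread, without passing the user as a parameter.
    static String currentUser() {
        return CURRENT_USER.get();
    }

    // Binds `user` to the current thread while `work` runs, then always cleans up.
    static <T> T runAs(String user, Supplier<T> work) {
        CURRENT_USER.set(user);
        try {
            return work.get();
        } finally {
            CURRENT_USER.remove(); // runs on success and on exception alike
        }
    }

    // Hypothetical business code somewhere deep in the call stack.
    static String greet() {
        return "hello, " + currentUser();
    }

    public static void main(String[] args) {
        System.out.println(runAs("alice", UserContext::greet)); // hello, alice
        System.out.println(currentUser());                      // null
    }
}
```

In a web application, the same set/try/finally/remove shape would typically live in a filter or interceptor around each request, so that pooled worker threads never carry one request's user into the next.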

Origin blog.csdn.net/gongzi_9/article/details/126754648