A multi-diagram, in-depth analysis of the ThreadLocal principle

Previous articles:

Java multithreaded programming - (1) - thread safety and lock Synchronized concept

Java multithreaded programming - (2) - reentrant locks and other basic features of Synchronized

Java multi-threaded programming - (3) - Introduction and use of thread local ThreadLocal

Java multi-thread programming - (4) - introduction and use of inter-thread communication mechanism

Java multi-threaded programming - (5) - using Lock objects to achieve synchronization and inter-thread communication

Java multithreaded programming - (6) - two commonly used thread counters CountDownLatch and cycle barrier CyclicBarrier

Java multithreaded programming - (7) - use thread pool to realize thread reuse and avoid some pits


ThreadLocal comes up in almost every written test and interview: candidates are routinely asked how it works and how improper use can lead to an out-of-memory (OOM) error. It is worth taking the time to study it carefully. This article covers how ThreadLocal works; the next article looks more closely at how the OOM problem arises and at best practices for avoiding it.


It is easy to assume from the name that ThreadLocal is a "local thread". In fact, ThreadLocal is not a Thread but a local variable of a Thread; naming it ThreadLocalVariable might have made this easier to understand.

When using ThreadLocal to maintain variables, ThreadLocal provides an independent copy of the variable for each thread that uses the variable, so each thread can change its own copy independently without affecting the corresponding copies of other threads.

The role of ThreadLocal is to provide local variables in the thread, which work within the life cycle of the thread, reducing the complexity of passing some public variables between multiple functions or components in the same thread.

From the thread's point of view, the target variable is like a thread's local variable, which is what "Local" in the class name means.
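To make the idea of a per-thread copy concrete, here is a minimal sketch. The class name PerThreadCopyDemo, the field CONTEXT and the helper methods are hypothetical names chosen for this illustration; only ThreadLocal itself comes from the JDK.

```java
public class PerThreadCopyDemo {

    // One ThreadLocal object shared by all threads; each thread sees only its own value.
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Sets a value inside a new thread and returns what that thread reads back. */
    public static String setAndGetInNewThread(String value) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            CONTEXT.set(value);
            seen[0] = CONTEXT.get();
        });
        worker.start();
        try {
            worker.join(); // after join() the worker's write to seen[0] is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    /** Returns {value read by the worker, value read by the calling thread}. */
    public static String[] demo() {
        CONTEXT.set("main-value");
        String workerSaw = setAndGetInNewThread("worker-value");
        return new String[] {workerSaw, CONTEXT.get()};
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("worker saw: " + result[0]); // worker saw: worker-value
        System.out.println("main sees:  " + result[1]); // main sees:  main-value
    }
}
```

The worker's call to set does not disturb the value held by the calling thread, even though both go through the same ThreadLocal object.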

All ThreadLocal methods and inner classes

ThreadLocal has four methods that matter in everyday use: get, set, remove and initialValue. The first three are public; initialValue is protected and is meant to be overridden to supply a default value.

In other words, these four methods are usually all we need to care about when using ThreadLocal. The class also contains the static nested class ThreadLocalMap, which is examined in the source code analysis below.
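As a rough illustration of how the four methods interact, the following sketch calls them in sequence on a single thread. FourMethodsDemo, COUNTER and exercise are hypothetical names used only for this example.

```java
public class FourMethodsDemo {

    // Overriding initialValue() supplies the value returned by get()
    // on a thread that has no entry for this ThreadLocal.
    private static final ThreadLocal<Integer> COUNTER = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0;
        }
    };

    /** Runs get, set, get, remove, get on the calling thread and returns the three reads. */
    public static int[] exercise() {
        int first = COUNTER.get();   // no entry yet, so initialValue() supplies 0
        COUNTER.set(first + 5);
        int second = COUNTER.get();  // reads back the value just set: 5
        COUNTER.remove();            // drops this thread's entry
        int third = COUNTER.get();   // initialValue() runs again, giving 0
        return new int[] {first, second, third};
    }

    public static void main(String[] args) {
        int[] reads = exercise();
        System.out.println(reads[0] + ", " + reads[1] + ", " + reads[2]); // 0, 5, 0
    }
}
```

Note that after remove the next get does not return null here; it goes through initialValue again.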

How does ThreadLocal maintain a copy of variables for each thread?

A simple way to picture it: imagine a Map inside the ThreadLocal class that stores each thread's copy of the variable. The key of each element is the thread object, and the value is that thread's copy. This is not how the JDK lays out the data (the source code analysis below shows the real structure), but it is an easy model to start from. We can write a simple version ourselves:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleThreadLocal<T> {

    /**
     * Key is the thread object; value is that thread's copy of the variable.
     * This must be an instance field: a static field cannot use the type
     * parameter T, and a static map would be shared by every instance.
     */
    private final Map<Thread, T> valueMap = Collections.synchronizedMap(new HashMap<Thread, T>());

    /**
     * Sets the value for the current thread.
     * @param value the value of the Map key-value pair
     */
    public void set(T value) {
        valueMap.put(Thread.currentThread(), value);
    }

    /**
     * Gets the value for the current thread.
     * @return the current thread's copy of the variable
     */
    public T get() {
        Thread currentThread = Thread.currentThread();
        // Look up the variable that belongs to the current thread
        T t = valueMap.get(currentThread);
        // If the current thread has no entry yet, store the initial value for it
        if (t == null && !valueMap.containsKey(currentThread)) {
            t = initialValue();
            valueMap.put(currentThread, t);
        }
        return t;
    }

    public void remove() {
        valueMap.remove(Thread.currentThread());
    }

    public T initialValue() {
        return null;
    }

    public static void main(String[] args) {

        SimpleThreadLocal<List<String>> threadLocal = new SimpleThreadLocal<>();

        new Thread(() -> {
            List<String> params = new ArrayList<>(3);
            params.add("张三");
            params.add("李四");
            params.add("王五");
            threadLocal.set(params);
            System.out.println(Thread.currentThread().getName());
            threadLocal.get().forEach(param -> System.out.println(param));
        }).start();

        new Thread(() -> {
            try {
                Thread.sleep(1000);
                List<String> params = new ArrayList<>(2);
                params.add("Chinese");
                params.add("English");
                threadLocal.set(params);
                System.out.println(Thread.currentThread().getName());
                threadLocal.get().forEach(param -> System.out.println(param));
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).start();
    }
}

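Running it prints each thread's name followed by only the values that the same thread stored: the first thread prints the three names, and the second prints Chinese and English.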

Although the version in the listing above is simple and crude, it conveys the basic idea behind ThreadLocal: each thread sees only its own value. As the source code analysis below shows, the JDK arranges the data the other way round, keeping a map inside each Thread that is keyed by ThreadLocal.

ThreadLocal source code analysis

1. The location of thread local variables in Thread

Since it is a thread-local variable, it is naturally stored in the thread object itself. The Thread source code shows where:

public class Thread implements Runnable {
    /* Make sure registerNatives is the first thing <clinit> does. */
    private static native void registerNatives();
    static {
        registerNatives();
    }

    // other code omitted

    /* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
    ThreadLocal.ThreadLocalMap threadLocals = null;

    /*
     * InheritableThreadLocal values pertaining to this thread. This map is
     * maintained by the InheritableThreadLocal class.
     */
    ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;
}

We can see that thread-local variables are stored in the threadLocals field of the Thread object, and that this field is a ThreadLocal.ThreadLocalMap object.

ThreadLocalMap is a static nested class of ThreadLocal.

2. The relationship between Thread and ThreadLocalMap

To see how Thread and ThreadLocalMap relate, start with the simplest view: the threadLocals field of a Thread is an instance of ThreadLocalMap, the class nested inside ThreadLocal.

With that in mind, the relationship between Thread, ThreadLocal and ThreadLocalMap can be described in more detail.

Each Thread instance has its own ThreadLocalMap. A ThreadLocalMap starts with a table of 16 slots, and each Entry in the table stores one ThreadLocal variable together with its value; a thread that uses three ThreadLocal variables therefore has three entries in its map.

To put it simply: a Thread has exactly one ThreadLocalMap, a ThreadLocalMap can hold many ThreadLocal objects, and each ThreadLocal object corresponds to one Entry in that map. In other words, one Thread can be associated with many ThreadLocal objects.

Put another way, each thread carries its own Map, which is the ThreadLocalMap: its keys are ThreadLocal instances and its values are that thread's copies of the variables.

3. ThreadLocalMap and WeakReference

As the name suggests, ThreadLocalMap is a map that stores ThreadLocal objects (in fact, they are used as keys), but each ThreadLocal object is wrapped in two layers:

(1) The first layer uses WeakReference<ThreadLocal<?>> to turn the ThreadLocal object into a weakly referenced object;

(2) The second layer defines a dedicated class, Entry, that extends WeakReference<ThreadLocal<?>>.

Entry is the class that holds one key-value pair of the map: the ThreadLocal<?> is the key, and the field value holds the thread-local value to be stored. Its constructor calls super(k), the WeakReference constructor, which makes the ThreadLocal<?> object weakly referenced and uses it as the key. The value, by contrast, is held through an ordinary strong reference.
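The shape of such an entry can be sketched as follows. This is an illustration under stated assumptions, not the JDK source: WeakEntrySketch and Key are hypothetical names, Key stands in for ThreadLocal, and clear() is called by hand to simulate the garbage collector, because relying on System.gc() would not be deterministic.

```java
import java.lang.ref.WeakReference;

public class WeakEntrySketch {

    /** Stand-in for a ThreadLocal key; any object would do for this illustration. */
    static final class Key { }

    /** Same shape as the Entry described above: weak key, strong value. */
    static final class Entry extends WeakReference<Key> {
        Object value;

        Entry(Key k, Object v) {
            super(k);   // the key becomes the weakly held referent
            value = v;  // the value is an ordinary strong reference
        }
    }

    /** Returns {key reachable before clear, key reachable after clear, value still held}. */
    public static boolean[] demo() {
        Key key = new Key();
        Entry e = new Entry(key, "payload");
        boolean before = e.get() == key;
        // Simulates the collector clearing the weak reference once the key
        // is no longer strongly reachable anywhere else.
        e.clear();
        boolean after = e.get() != null;
        boolean valueKept = e.value != null;
        return new boolean[] {before, after, valueKept};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("key before clear: " + r[0]); // true
        System.out.println("key after clear:  " + r[1]); // false
        System.out.println("value retained:   " + r[2]); // true
    }
}
```

The last line is the important one: once the key is gone, the value is still strongly referenced by the entry until something removes the entry itself.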

4. The constructor of ThreadLocalMap

The ThreadLocalMap implementation stores its key-value entries in an array, private Entry[] table, with an initial size of 16. The mapping from key to slot is computed as follows:

int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);

Each ThreadLocal takes its threadLocalHashCode from a static AtomicInteger, nextHashCode, which is advanced by HASH_INCREMENT = 0x61c88647 every time a ThreadLocal is created. Masking the hash with & (INITIAL_CAPACITY - 1) then gives the index into the private Entry[] table array.

public class ThreadLocal<T> {
    /**
     * ThreadLocals rely on per-thread linear-probe hash maps attached
     * to each thread (Thread.threadLocals and
     * inheritableThreadLocals).  The ThreadLocal objects act as keys,
     * searched via threadLocalHashCode.  This is a custom hash code
     * (useful only within ThreadLocalMaps) that eliminates collisions
     * in the common case where consecutively constructed ThreadLocals
     * are used by the same threads, while remaining well-behaved in
     * less common cases.
     */
    private final int threadLocalHashCode = nextHashCode();

    /**
     * The next hash code to be given out. Updated atomically. Starts at
     * zero.
     */
    private static AtomicInteger nextHashCode =
        new AtomicInteger();

    /**
     * The difference between successively generated hash codes - turns
     * implicit sequential thread-local IDs into near-optimally spread
     * multiplicative hash values for power-of-two-sized tables.
     */
    private static final int HASH_INCREMENT = 0x61c88647;

    /**
     * Returns the next hash code.
     */
    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }
    // other code omitted
}
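The effect of HASH_INCREMENT can be checked with a few lines of arithmetic. The sketch below recomputes the first hash codes by hand and masks them the same way; HashSpreadDemo and slots are hypothetical names. In a running JVM the counter is shared by every ThreadLocal ever created, including those created by libraries, so the slots of your own variables will generally start at a different offset.

```java
import java.util.ArrayList;
import java.util.List;

public class HashSpreadDemo {

    private static final int HASH_INCREMENT = 0x61c88647; // same constant as in ThreadLocal

    /** Slot indexes of the first `count` hash codes in a table of `capacity` slots (a power of two). */
    public static List<Integer> slots(int count, int capacity) {
        List<Integer> result = new ArrayList<>();
        int hash = 0; // the counter starts at zero
        for (int n = 0; n < count; n++) {
            result.add(hash & (capacity - 1));
            hash += HASH_INCREMENT; // int overflow is expected and harmless here
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(slots(16, 16));
        // [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
    }
}
```

Sixteen consecutive hash codes land in sixteen different slots of a 16-slot table, which is the collision-free common case the source comment describes.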

In general, ThreadLocalMap is a collection similar to HashMap, except that it does its own addressing (linear probing rather than per-bucket chains) and stores entries through a set method instead of HashMap's put method, among other differences.

The set method of ThreadLocal

Since each thread instance has its own ThreadLocalMap, set first obtains the current thread with Thread.currentThread() and then calls getMap(t) to get the ThreadLocalMap of that thread t. If the map already exists, set calls map.set(this, value). The first time a value is set on a thread, the map is still null, so it has to be initialized by calling createMap(t, value).
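The flow of set can be sketched in ordinary code. Because Thread.threadLocals is not accessible from application code, this sketch uses a hypothetical Thread subclass, CarrierThread, that owns a plain HashMap, and a hypothetical MiniThreadLocal in place of ThreadLocal. It shows only the ownership structure (the map belongs to the thread, the key is the variable), not the JDK's actual map implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class PerThreadMapSketch {

    /** Hypothetical Thread subclass that owns a map, standing in for Thread.threadLocals. */
    static class CarrierThread extends Thread {
        Map<MiniThreadLocal<?>, Object> locals; // stays null until the first set()

        CarrierThread(Runnable task) {
            super(task);
        }
    }

    /** Hypothetical, heavily simplified ThreadLocal: the thread owns the map, `this` is the key. */
    static class MiniThreadLocal<T> {

        public void set(T value) {
            CarrierThread t = (CarrierThread) Thread.currentThread();
            if (t.locals == null) {
                t.locals = new HashMap<>(); // the step that createMap(t, value) performs
            }
            t.locals.put(this, value);      // the step that map.set(this, value) performs
        }

        @SuppressWarnings("unchecked")
        public T get() {
            CarrierThread t = (CarrierThread) Thread.currentThread();
            return t.locals == null ? null : (T) t.locals.get(this);
        }
    }

    /** Runs one CarrierThread that sets two variables; returns how many entries its map holds. */
    public static int entriesAfterTwoSets() {
        MiniThreadLocal<String> name = new MiniThreadLocal<>();
        MiniThreadLocal<Integer> age = new MiniThreadLocal<>();
        CarrierThread worker = new CarrierThread(() -> {
            name.set("alice");
            age.set(30);
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.locals.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesAfterTwoSets()); // 2
    }
}
```

Compared with the SimpleThreadLocal listing earlier, the direction is inverted: no map needs synchronization, because each map is only ever touched by the thread that owns it.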

createMap calls the ThreadLocalMap constructor described above. At that point only a reference array of 16 elements is allocated; 16 Entry objects are not created. Entry objects are created only as needed, one for each thread-local object the thread actually stores.
At this point, we can think about why it is implemented in this way.

1. Why use ThreadLocalMap to save thread local objects?

The reason is that a thread may have many thread-local objects. However many it has, they are all stored in the same ThreadLocalMap. The table of a ThreadLocalMap, private Entry[] table, starts with 16 slots; once the number of entries reaches 2/3 of the capacity, the map cleans out stale entries and, if it is still nearly full, expands the table.

Now back to the case where the map is not null, in which map.set(this, value) is called. The current thread is used to look up its ThreadLocalMap, and map.set(this, value) then stores the value in private Entry[] table, with the ThreadLocal itself (this) as the key:

    private void set(ThreadLocal<?> key, Object value) {

        // We don't use a fast path as with get() because it is at
        // least as common to use set() to create new entries as
        // it is to replace existing ones, in which case, a fast
        // path would fail more often than not.

        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len - 1);

        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            if (k == key) {
                e.value = value;
                return;
            }

            if (k == null) {
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }

To sum up, the set(T value) method makes sure the current Thread object has a ThreadLocalMap, creating one if necessary, and puts the value into it; the ThreadLocalMap itself is kept as a member variable of the Thread object.

2. Now that the general principle of the set method is clear, consider the following program:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreeThreadLocalsDemo {

    /**
     * Three ThreadLocal variables
     */
    private static ThreadLocal<String> threadLocal1 = new ThreadLocal<>();
    private static ThreadLocal<String> threadLocal2 = new ThreadLocal<>();
    private static ThreadLocal<String> threadLocal3 = new ThreadLocal<>();

    public static void main(String[] args) {
        // A thread pool with a single worker thread
        ExecutorService executorService = Executors.newFixedThreadPool(1);

        executorService.execute(() -> {
            threadLocal1.set("123");
            threadLocal2.set("234");
            threadLocal3.set("345");
            // Set a breakpoint here to inspect t.threadLocals in a debugger
            Thread t = Thread.currentThread();
            System.out.println(t.getName());
        });

        executorService.shutdown();
    }
}

In this case a single thread is associated with three ThreadLocal objects. Pausing in a debugger after the last set call shows that the table (the Entry array) of this thread's ThreadLocalMap holds three entries, whose values are the three strings set above.

3. Now change the code to use two threads:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TwoThreadsDemo {

    private static final int THREAD_LOOP_SIZE = 2;

    private static ThreadLocal<String> threadLocal1 = new ThreadLocal<>();
    private static ThreadLocal<String> threadLocal2 = new ThreadLocal<>();
    private static ThreadLocal<String> threadLocal3 = new ThreadLocal<>();

    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(THREAD_LOOP_SIZE);

        for (int i = 0; i < THREAD_LOOP_SIZE; i++) {
            // Each task sets all three variables on the pool thread that runs it
            executorService.execute(() -> {
                threadLocal1.set("123");
                threadLocal2.set("234");
                threadLocal3.set("345");
            });
        }

        executorService.shutdown();
    }
}

Inspecting the two pool threads in a debugger shows the same picture for each of them: every thread has its own table of entries. Because the threads run concurrently, they may be at different points in the task when inspected, but it is clear that each one has a different ThreadLocalMap.

If there are multiple threads and only one ThreadLocal variable is set, the result follows the same pattern (one entry in each thread's map), so it is not repeated here.

One more point deserves attention. Consider the following code:

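import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OverwriteDemo {
    private static final int THREAD_LOOP_SIZE = 1;
    private static final int MOCK_DIB_DATA_LOOP_SIZE = 1000;
    private static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(THREAD_LOOP_SIZE);
        for (int i = 0; i < THREAD_LOOP_SIZE; i++) {
            for (int j = 0; j < MOCK_DIB_DATA_LOOP_SIZE; j++) {
                final int index = j; // a lambda can only capture effectively final variables
                executorService.execute(() -> threadLocal.set("123" + index));
            }
        }
        executorService.shutdown();
    }
}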

Inspecting the thread afterwards shows that it holds only one value for the ThreadLocal variable: each call to set overwrites the previous value. This is because the Entry is keyed by the ThreadLocal instance, so setting the same ThreadLocal again on the same thread replaces the value stored earlier. Keep this in mind!

At this point, you should be able to clearly understand the relationship between Thread, ThreadLocal and ThreadLocalMap!

The get method of ThreadLocal

After the analysis of the set method above, the get method is much easier to follow. It first obtains the ThreadLocalMap of the current thread. Since ThreadLocalMap uses the current ThreadLocal as the key, this is passed to getEntry(), which computes the index from the key's hash and looks at that position in the table (the Entry array). If the Entry e found there is not null and its key is the same object as the key passed in, it is returned directly. Otherwise getEntryAfterMiss() is called. A call to getEntryAfterMiss means the key was not at its directly hashed position, so the search continues from i, moving forward one slot at a time (wrapping around at the end of the table) until the key is found or an empty slot is reached. If the map does not exist yet, or holds no entry for this ThreadLocal, get falls back to setInitialValue(), which stores and returns the result of initialValue().
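The lookup described above is ordinary linear probing. The sketch below is a simplified stand-in rather than the JDK code: LinearProbeSketch is a hypothetical class, the hash is passed in explicitly, and weak references, stale-entry cleanup and resizing are all left out, so the table must never be filled completely.

```java
public class LinearProbeSketch {

    private final Object[] keys;
    private final Object[] values;

    /** The capacity must be a power of two so that masking works as an index. */
    public LinearProbeSketch(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private int nextIndex(int i) {
        return (i + 1) & (keys.length - 1); // wraps around to 0 at the end of the table
    }

    /** Stores the pair at the key's home slot, probing forward on a collision. */
    public void set(Object key, int hash, Object value) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null && keys[i] != key) {
            i = nextIndex(i);
        }
        keys[i] = key;
        values[i] = value;
    }

    /** Checks the home slot first, then walks forward until the key or an empty slot is found. */
    public Object get(Object key, int hash) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null) {
            if (keys[i] == key) {
                return values[i];
            }
            i = nextIndex(i); // the part that getEntryAfterMiss takes care of in the JDK
        }
        return null; // reached an empty slot, so the key is absent
    }

    public static void main(String[] args) {
        LinearProbeSketch table = new LinearProbeSketch(4);
        Object a = new Object();
        Object b = new Object();
        table.set(a, 3, "A"); // home slot 3
        table.set(b, 3, "B"); // slot 3 is taken, so B wraps around to slot 0
        System.out.println(table.get(a, 3));            // A
        System.out.println(table.get(b, 3));            // B
        System.out.println(table.get(new Object(), 3)); // null
    }
}
```

Keys are compared by identity (==), as in the set listing earlier, which is why a fresh object with the same hash is reported as absent.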

ThreadLocal memory recovery

Two levels of automatic memory recovery involved in ThreadLocal:

1) Memory recovery at the ThreadLocal level:

When a thread dies, every thread-local variable it holds becomes eligible for collection. Concretely, the ThreadLocal.ThreadLocalMap threadLocals field of the thread's Thread object is released along with the thread.

2) Memory recovery at the ThreadLocalMap level:

If a thread lives for a long time and stores many thread-local variables (that is, many Entry objects), the question is how to reclaim memory from its ThreadLocalMap while the thread is still alive. Otherwise, the more Entry objects accumulate, the larger the ThreadLocalMap grows and the more memory it occupies. Entry objects for thread-local variables that are no longer needed should therefore be cleaned up.

The mechanism relies on the key of each Entry being wrapped in a WeakReference. Each time set() adds a new Entry, it calls cleanSomeSlots to look for stale entries. If none was removed and private Entry[] table has reached its threshold of 2/3 of the capacity (10 entries in the initial 16-slot table), rehash() is called, which removes all stale entries and expands the table if it is still nearly full. This is visible in the following code from the ThreadLocalMap.set() method:

if (!cleanSomeSlots(i, sz) && sz >= threshold)
    rehash();

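cleanSomeSlots is what reclaims memory: it scans some of the slots after position i and removes stale entries, that is, entries whose ThreadLocal key has already been garbage collected.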

A brief analysis of the OOM memory overflow problem that may be caused by ThreadLocal

We know that ThreadLocal values are maintained inside the Thread, so as long as the thread does not exit, the references to those objects remain. When the thread exits, the Thread class does some cleanup, which includes the ThreadLocalMap: in its exit method, Thread sets its threadLocals and inheritableThreadLocals fields to null.

However, when a thread pool is used, the current thread may never exit (in a fixed-size thread pool, for example, the threads stay alive). If large objects are put into a ThreadLocal in that situation (they are actually stored in the threadLocals field of the Thread), memory can be exhausted.

Consider a thread pool with a fixed number of threads. One thread stores a large object in its ThreadLocalMap while handling one request, and another thread stores another large object in its ThreadLocalMap while handling a different request. Because the threads were created by the pool, they stay alive and are not destroyed. The objects stored in their ThreadLocalMaps may never be used again once the request is finished, but since the threads never terminate, the ThreadLocalMap inside each Thread cannot be released, and the result can be a memory overflow.

In other words, without a thread pool ThreadLocal does not normally leak memory. With a thread pool, it depends on how the pool is implemented: if the pool never destroys its threads, memory can leak. So be careful whenever ThreadLocal is used together with a thread pool!
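A common precaution is to call remove() in a finally block before the task hands the thread back to the pool. The sketch below shows the difference on a single pooled thread; PoolCleanupDemo, BUFFER and secondTaskSeesLeftover are hypothetical names, and the 1 KB array merely stands in for a large object.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolCleanupDemo {

    private static final ThreadLocal<byte[]> BUFFER = new ThreadLocal<>();

    /**
     * Runs two tasks one after the other on the same pooled thread.
     * The first stores a value; the second reports whether it can still see it.
     */
    public static boolean secondTaskSeesLeftover(boolean removeInFinally) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            pool.submit(() -> {
                BUFFER.set(new byte[1024]); // stands in for a large object
                try {
                    // ... business logic would use BUFFER.get() here ...
                } finally {
                    if (removeInFinally) {
                        BUFFER.remove(); // drop the entry before the thread returns to the pool
                    }
                }
            }).get(); // wait for the first task to finish
            Future<Boolean> second = pool.submit(() -> BUFFER.get() != null);
            return second.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSeesLeftover(false)); // true: the old value is still attached
        System.out.println(secondTaskSeesLeftover(true));  // false: remove() cleaned it up
    }
}
```

Besides keeping memory from piling up, the cleanup also prevents a later task from accidentally reading data left behind by an earlier one.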

Summary

From the source code we can see that each thread modifies only its own copy, without affecting the others, so threads are isolated from one another. This avoids the thread-safety problems that arise when threads share instance variables. We can also draw the following conclusions:

(1) ThreadLocal itself stores nothing; it only operates on the ThreadLocalMap held by the Thread;

(2) The ThreadLocalMap is a field of the thread, and different threads have completely separate ThreadLocalMap objects;

(3) A thread's ThreadLocalMap is created when a ThreadLocal object first performs a set or get operation on that thread;

(4) When using the ThreadLocalMap of the current thread, the current ThreadLocal instance serves as the key under which the value is stored;

(5) The ThreadLocal pattern isolates data access in at least two ways: vertical isolation (each thread has a different ThreadLocalMap) and horizontal isolation (different ThreadLocal instances are isolated from one another);

(6) All thread-local variables of a thread are actually stored in the same map field of the thread itself;

(7) When a thread dies, the memory of its thread-local variables is reclaimed automatically;

(8) Each thread-local variable is saved in the map through an Entry whose key is a ThreadLocal wrapped in a WeakReference and whose value is the thread-local value. The mapping from key to slot is computed as ThreadLocal.threadLocalHashCode & (INITIAL_CAPACITY - 1);

(9) When the number of thread-local variables owned by a thread reaches 2/3 of the capacity (10 before the table is expanded), recycling of Entry objects in the ThreadLocalMap comes into play.

For the problem of sharing resources between threads, synchronization trades time for space, while ThreadLocal trades space for time. The former provides a single variable that threads must queue up to access; the latter gives each thread its own variable, so all threads can access theirs at the same time without affecting one another.




Article source:
Java multithreaded programming - (8) - multi-graph in-depth analysis of ThreadLocal principle

Origin blog.csdn.net/weixin_43811057/article/details/131735655