ThreadLocal explained in detail

Reprinted from: https://www.cnblogs.com/zhangjk1993/archive/2017/03/29/6641745.html

For more, you can also read this: http://blog.csdn.net/jueshengtianya/article/details/15340311

Foreword

ThreadLocal provides thread-local variables, that is, variables that are visible only to the current thread. This article records my understanding of ThreadLocal.

Thread-local variables

In a multi-threaded environment, concurrency problems arise when different threads access the same shared variable at the same time, as in the following example:

public class MultiThreadDemo {

    public static class Number {
        private int value = 0;

        public void increase() throws InterruptedException {
            value = 10;
            Thread.sleep(10);
            System.out.println("increase value: " + value);
        }

        public void decrease() throws InterruptedException {
            value = -10;
            Thread.sleep(10);
            System.out.println("decrease value: " + value);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final Number number = new Number();
        Thread increaseThread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    number.increase();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        Thread decreaseThread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    number.decrease();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        increaseThread.start();
        decreaseThread.start();
    }
}

In the code above, the increase thread and the decrease thread both operate on the value field of the same Number object, so the output is unpredictable: after one thread modifies the variable but before it prints it, the other thread may modify the variable again. The following is one possible output:

increase value: 10
decrease value: 10

One solution is to add the synchronized keyword to the increase() and decrease() methods. This wraps the assignment and the printing of value into a single critical section: while one thread is inside either method, the other thread cannot run its method on the same object, so no other modification can slip in between the two steps. Now consider the problem from another angle. If value belonged only to the increase thread or only to the decrease thread, instead of being shared by both, there would be no contention at all. The most familiar form of this is a local variable (excluding the case where a local reference points to a shared object), as follows:

public void increase() throws InterruptedException {
    int value = 10;
    Thread.sleep(10);
    System.out.println("increase value: " + value);
}

No matter how value is changed, other threads are not affected, because every call to increase creates a new value variable that is visible only to the thread making the call. Following this idea, we can create a copy of a shared variable for each thread. The copy is visible only to its own thread (it can be regarded as a thread-private variable), so modifying it does not affect other threads. A simple approach is to store the copies in a Map, using the id of the current thread as the key and the copy as the value. Here is an implementation:

import java.util.HashMap;
import java.util.Map;

public class SimpleImpl {

    public static class CustomThreadLocal {
        private Map<Long, Integer> cacheMap = new HashMap<>();

        private int defaultValue ;

        public CustomThreadLocal(int value) {
            defaultValue = value;
        }

        public Integer get() {
            long id = Thread.currentThread().getId();
            if (cacheMap.containsKey(id)) {
                return cacheMap.get(id);
            }
            return defaultValue;
        }

        public void set(int value) {
            long id = Thread.currentThread().getId();
            cacheMap.put(id, value);
        }
    }

    public static class Number {
        private CustomThreadLocal value = new CustomThreadLocal(0);

        public void increase() throws InterruptedException {
            value.set(10);
            Thread.sleep(10);
            System.out.println("increase value: " + value.get());
        }

        public void decrease() throws InterruptedException {
            value.set(-10);
            Thread.sleep(10);
            System.out.println("decrease value: " + value.get());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final Number number = new Number();
        Thread increaseThread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    number.increase();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        Thread decreaseThread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    number.decrease();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        increaseThread.start();
        decreaseThread.start();
    }
}

However, this implementation has the following problems:

  • The life cycle of each thread's copy variable is determined not by the thread but by the shared variable. In the example above, even after a thread has finished executing, its copy variable continues to exist as long as the number variable exists (it is stored in the cacheMap of number). As a copy belonging to a specific thread, it should live and die with that thread and be reclaimed once the thread dies.
  • Multiple threads may operate on the cacheMap at the same time, so access to the cacheMap needs to be synchronized.
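The first problem can be observed with a small sketch. The names SharedMapLeakDemo and SharedMapLocal are hypothetical, and a ConcurrentHashMap is swapped in for the HashMap so that the second problem does not interfere with the demonstration. Every worker thread has already terminated when the map is inspected, yet its entry is still there.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the shared-map approach; all names here are illustrative.
class SharedMapLeakDemo {

    static class SharedMapLocal {
        // A concurrent map avoids the synchronization problem, but not the life-cycle problem.
        final Map<Long, Integer> cacheMap = new ConcurrentHashMap<>();

        void set(int value) {
            cacheMap.put(Thread.currentThread().getId(), value);
        }
    }

    // Runs short-lived threads one after another and returns how many copies remain afterwards.
    static int copiesLeftAfter(int threads) {
        SharedMapLocal local = new SharedMapLocal();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> local.set(42));
            t.start();
            try {
                t.join(); // the thread has terminated once join() returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return local.cacheMap.size();
    }

    public static void main(String[] args) {
        // Non-zero: the entries outlive the threads that created them.
        System.out.println("copies left: " + copiesLeftAfter(3));
    }
}
```

Nothing in this design removes an entry when its thread ends, so the map can only grow unless callers clean up explicitly.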

To solve these problems, we change the approach: each thread holds its own Map that stores the copy variables of that thread, and each CustomThreadLocal instance (represented here by its hashCode) serves as the key. The following is an example:

import java.util.HashMap;
import java.util.Map;

public class SimpleImpl2 {

    public static class CommonThread extends Thread {
        Map<Integer, Integer> cacheMap = new HashMap<>();
    }

    public static class CustomThreadLocal {
        private int defaultValue;

        public CustomThreadLocal(int value) {
            defaultValue = value;
        }

        public Integer get() {
            Integer id = this.hashCode();
            Map<Integer, Integer> cacheMap = getMap();
            if (cacheMap.containsKey(id)) {
                return cacheMap.get(id);
            }
            return defaultValue;
        }

        public void set(int value) {
            Integer id = this.hashCode();
            Map<Integer, Integer> cacheMap = getMap();
            cacheMap.put(id, value);
        }

        public Map<Integer, Integer> getMap() {
            CommonThread thread = (CommonThread) Thread.currentThread();
            return thread.cacheMap;
        }
    }

    public static class Number {
        private CustomThreadLocal value = new CustomThreadLocal(0);

        public void increase() throws InterruptedException {
            value.set(10);
            Thread.sleep(10);
            System.out.println("increase value: " + value.get());
        }

        public void decrease() throws InterruptedException {
            value.set(-10);
            Thread.sleep(10);
            System.out.println("decrease value: " + value.get());
        }
    }


    public static void main(String[] args) throws InterruptedException {
        final Number number = new Number();
        Thread increaseThread = new CommonThread() {
            @Override
            public void run() {
                try {
                    number.increase();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }

            }
        };

        Thread decreaseThread = new CommonThread() {
            @Override
            public void run() {
                try {
                    number.decrease();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        };
        increaseThread.start();
        decreaseThread.start();
    }
}

In this implementation, when a thread dies, its cacheMap is reclaimed along with the copy variables stored in it. The cacheMap is also private to the thread, so multiple threads never access the same cacheMap at the same time. The ThreadLocal class in Java is built on this idea. Note that this is only the idea; the actual implementation differs from the code above.

Example of use

Java implements the thread-local variable pattern with the ThreadLocal class. ThreadLocal provides set and get methods to write and read the variable. The following are the method signatures:

public void set(T value);
public T get();

Here is a complete example using ThreadLocal:

public class ThreadLocalDemo {
    private static ThreadLocal<Integer> threadLocal = new ThreadLocal<>();
    private static int value = 0;

    public static class ThreadLocalThread implements Runnable {
        @Override
        public void run() {
            threadLocal.set((int)(Math.random() * 100));
            value = (int) (Math.random() * 100);
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.printf(Thread.currentThread().getName() + ": threadLocal=%d, value=%d\n", threadLocal.get(), value);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread thread = new Thread(new ThreadLocalThread());
        Thread thread2 = new Thread(new ThreadLocalThread());
        thread.start();
        thread2.start();
        thread.join();
        thread2.join();
    }
}

Here is one possible output:

Thread-0: threadLocal=87, value=15
Thread-1: threadLocal=69, value=15

We can see that although threadLocal is a static variable, each thread has its own value for it, unaffected by other threads, whereas the ordinary static field value is shared by both threads.
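The same isolation can be checked deterministically with a smaller sketch. The class and field names below are made up for illustration. A worker thread sets and reads its own copy, while the copy held by the calling thread stays untouched.

```java
// Sketch: a value set in one thread is not visible through the same ThreadLocal in another.
class ThreadLocalIsolationDemo {
    static final ThreadLocal<String> NAME = new ThreadLocal<>();

    // Sets a value inside a worker thread and returns what that worker read back.
    static String readInWorker(String value) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            NAME.set(value);
            seen[0] = NAME.get();
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        NAME.set("main");
        System.out.println("worker saw: " + readInWorker("worker")); // worker saw: worker
        System.out.println("main sees:  " + NAME.get());             // main sees:  main
    }
}
```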

Implementation

As described above, each thread maintains a ThreadLocalMap, a mapping table whose keys are ThreadLocal instances and whose values are the copy variables to be stored. A ThreadLocal instance does not store a value itself; it only serves as the key used to look up its copy of the value in the current thread. As shown below:

Picture from  http://blog.xiaohansong.com/2016/08/06/ThreadLocal-memory-leak/

Let's look at the implementation of ThreadLocal from the following three aspects:

  • Data structure to store thread copy variables
  • How to access thread copy variables
  • How to Hash an instance of ThreadLocal

ThreadLocalMap

Each thread uses a ThreadLocalMap to store its copy variables. ThreadLocalMap is a static nested class of ThreadLocal. It is also a hash table, but it is implemented differently from HashMap. Let's first review some background on hash tables:

Hash table

Ideally, a hash table is a fixed-size array of keys, where a hash function maps each key to a position in the array. The following is a schematic diagram of an ideal hash table:

Image from Data Structure and Algorithm Analysis: C Syntax Description

In the ideal case, the hash function distributes the keys evenly over the positions of the array, and no two keys have the same hash value (assuming there are fewer keys than array slots). In practice, however, several keys often have the same hash value and are mapped to the same position in the array. This situation is called a hash collision. Two methods are mainly used to resolve hash collisions:

  • separate chaining
  • open addressing

Separate chaining
Separate chaining resolves collisions with linked lists: elements with the same hash value are stored in the same list. A lookup first finds the list the element belongs to and then traverses that list to find the element. Here is a schematic diagram:

Image via  http://faculty.cs.niu.edu/~freedman/340/340notes/340hash.htm

Open addressing
Open addressing does not create linked lists. When the array slot a key hashes to is already occupied by another key, it probes other slots in the array until an empty one is found. There are many ways to probe for an empty slot. The simplest is linear probing: starting from the slot where the collision occurred, examine the following slots one by one, and wrap around to the beginning of the array when the end is reached (a circular search). As shown below:

Image from  http://alexyyek.github.io/2014/12/14/hashCollapse/
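Linear probing as described above can be sketched in a few lines. This is a minimal illustration with int keys and a fixed capacity, not the ThreadLocalMap code. The class name and the put and find methods are assumptions made for the example, and resizing and removal are left out.

```java
// Minimal open-addressing table with linear probing; fixed capacity, no resize or removal.
class LinearProbingSketch {
    private final int[] keys;
    private final boolean[] used;

    LinearProbingSketch(int capacity) {
        keys = new int[capacity];
        used = new boolean[capacity];
    }

    // Inserts key and returns the slot it landed in, or -1 if the table is full.
    int put(int key) {
        int i = Math.floorMod(key, keys.length); // home slot
        for (int n = 0; n < keys.length; n++, i = (i + 1) % keys.length) {
            if (!used[i] || keys[i] == key) {
                used[i] = true;
                keys[i] = key;
                return i;
            }
        }
        return -1;
    }

    // Returns the slot holding key, or -1 if an empty slot is reached first.
    int find(int key) {
        int i = Math.floorMod(key, keys.length);
        for (int n = 0; n < keys.length; n++, i = (i + 1) % keys.length) {
            if (!used[i]) return -1;
            if (keys[i] == key) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingSketch table = new LinearProbingSketch(8);
        System.out.println(table.put(7));   // 7: the home slot is free
        System.out.println(table.put(15));  // 0: slot 7 is taken, probing wraps around
        System.out.println(table.put(23));  // 1: slots 7 and 0 are taken
        System.out.println(table.find(23)); // 1
        System.out.println(table.find(31)); // -1: probing stops at empty slot 2
    }
}
```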

ThreadLocalMap uses open addressing to handle hash collisions, while HashMap uses separate chaining. The main reason for the different approach is that hash values in a ThreadLocalMap are spread very evenly, so collisions are rare. In addition, ThreadLocalMap frequently has to clear stale entries, which is more convenient with a plain array.

Implementation

A Map is a key-value data structure, so the elements stored in the hash array are key-value pairs as well. ThreadLocalMap uses the Entry class to store them. The following is the definition of the class:

static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}

An Entry stores the ThreadLocal instance as the key and the copy variable as the value. Note that the Entry holds the ThreadLocal instance through a weak reference: the referent is stored in the Reference class (the parent class of WeakReference). The following is the code that super(k) eventually calls:

Reference(T referent) {
    this(referent, null);
}

Reference(T referent, ReferenceQueue<? super T> queue) {
    this.referent = referent;
    this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
}
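What "the key is weak, the value is strong" means for an Entry can be illustrated with a simplified stand-in. The class below is a sketch with Object keys, not the JDK class. It uses clear() to drop the referent explicitly, which has the same visible effect on the entry as the garbage collector clearing the reference once no strong reference to the key remains.

```java
import java.lang.ref.WeakReference;

// Simplified stand-in for ThreadLocalMap.Entry; Object is used as the key type for brevity.
class WeakEntrySketch {
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);         // the key is held weakly through the Reference
            this.value = value; // the value is held by an ordinary strong field
        }
    }

    public static void main(String[] args) {
        Object key = new Object();
        Entry e = new Entry(key, "copy");

        System.out.println(e.get() == key); // true while the key is strongly reachable

        e.clear(); // drop the referent, as the collector would for an unreachable key
        System.out.println(e.get());   // null: the entry is now stale
        System.out.println(e.value);   // copy: the value stays until the entry is cleaned up
    }
}
```

The last line is the reason ThreadLocalMap needs explicit cleanup of stale entries: a cleared key does not release the value by itself.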

For weak references and why they are used here, you can refer to Java Theory and Practice: Plugging Memory Leaks with Weak References and In-depth Analysis of ThreadLocal Memory Leaks. Next, look at the set method of ThreadLocalMap:

private void set(ThreadLocal<?> key, Object value) {

    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.

    Entry[] tab = table;
    int len = tab.length;
    // Use the hash value of the ThreadLocal to find the slot for this key
    int i = key.threadLocalHashCode & (len - 1);

    // Look for the key using linear probing
    for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
        // The key already exists: overwrite the previous value
        if (k == key) {
            e.value = value;
            return;
        }
        // The key is null but the value is not: the ThreadLocal object has been
        // garbage collected and this Entry is stale
        if (k == null) {
            // Replace the stale entry with the new one. This method also performs a good
            // deal of cleanup to prevent memory leaks; see the source code for details
            replaceStaleEntry(key, value, i);
            return;
        }
    }
    // The key does not exist and no stale entry was found: create a new Entry in the empty slot
    tab[i] = new Entry(key, value);
    int sz = ++size;
    // cleanSomeSlots removes stale entries (key == null); see the source code for details.
    // If nothing was removed and the number of entries has reached the threshold, rehash.
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}

A few points about the set method are worth noting:

  • int i = key.threadLocalHashCode & (len - 1); is in effect a remainder operation modulo len. This works because len is always a power of 2: after subtracting 1, all the low-order bits are 1 and all the high-order bits are 0. For example, 16 - 1 in binary is 00001111, so the AND clears every bit worth 16 or more and keeps the bits worth less than 16, which is equivalent to taking the remainder modulo 16.
  • The replaceStaleEntry and cleanSomeSlots methods clean up some stale entries to prevent memory leaks.
  • The threshold is computed as threshold = len * 2 / 3;
  • The rehash method first removes all stale entries. If the number of entries is still at least 3/4 of the threshold after the cleanup, the table is resized (the array size is doubled).

private void rehash() {
    expungeStaleEntries();
    // Use lower threshold for doubling to avoid hysteresis
    if (size >= threshold - threshold / 4)
        resize();
}
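The index computation and the two thresholds from the notes above can be checked with a small sketch. The helper names are hypothetical, and the table length of 16 is only an example value. The formulas themselves are the ones quoted from the source.

```java
// Sketch: the bit mask equals the mathematical remainder when len is a power of two.
class MaskAndThresholdSketch {
    // Slot index as computed in ThreadLocalMap; valid only when len is a power of two.
    static int index(int hash, int len) {
        return hash & (len - 1);
    }

    static int threshold(int len) {
        return len * 2 / 3;
    }

    // Size at which rehash() goes on to resize(), after stale entries have been removed.
    static int resizeTrigger(int len) {
        int t = threshold(len);
        return t - t / 4;
    }

    public static void main(String[] args) {
        int len = 16;
        int hash = 0x61c88647;
        System.out.println(index(hash, len));                             // 7
        System.out.println(index(hash, len) == Math.floorMod(hash, len)); // true
        System.out.println(index(-1, len) == Math.floorMod(-1, len));     // true, also for negative hashes
        System.out.println(threshold(len));                               // 10
        System.out.println(resizeTrigger(len));                           // 8
    }
}
```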

Next, look at the getEntry method (ThreadLocalMap has no get method, only getEntry):

private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}

Because ThreadLocalMap uses open addressing, the slot computed from a key's hash value is not necessarily the slot where its entry is actually stored. So a lookup first checks whether the entry at the computed slot is the one being looked for; if not, it calls getEntryAfterMiss to probe the following slots.

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}

Finally, look at the remove operation. Removal clears the key reference of the Entry (setting it to null), which turns it into a stale Entry, and then calls expungeStaleEntry to clean it up.

private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len - 1);
    for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
            e.clear();
            expungeStaleEntry(i);
            return;
        }
    }
}
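Why removal has to go through a cleanup step rather than just emptying the slot follows from open addressing: an empty slot ends a probe sequence, so an entry stored further along would become unreachable. The sketch below shows this with int keys. All names are illustrative, the table is assumed never to be full, and removeAndRehash only imitates the spirit of expungeStaleEntry by re-inserting the entries that follow the removed slot.

```java
import java.util.Arrays;

// Sketch: why an open-addressed table cannot simply empty a removed slot.
// Keys are non-negative ints; EMPTY marks a free slot; the table is assumed never to be full.
class ProbeRemovalSketch {
    static final int EMPTY = -1;
    final int[] slots;

    ProbeRemovalSketch(int capacity) {
        slots = new int[capacity];
        Arrays.fill(slots, EMPTY);
    }

    int home(int key) { return key % slots.length; }
    int next(int i) { return (i + 1) % slots.length; }

    void put(int key) {
        int i = home(key);
        while (slots[i] != EMPTY && slots[i] != key) i = next(i);
        slots[i] = key;
    }

    boolean contains(int key) {
        for (int i = home(key); slots[i] != EMPTY; i = next(i)) {
            if (slots[i] == key) return true;
        }
        return false;
    }

    // Naive removal: only empties the slot, which can cut a probe sequence in two.
    void removeNaive(int key) {
        for (int i = home(key); slots[i] != EMPTY; i = next(i)) {
            if (slots[i] == key) { slots[i] = EMPTY; return; }
        }
    }

    // Removal that re-inserts the rest of the run so later entries stay reachable.
    void removeAndRehash(int key) {
        for (int i = home(key); slots[i] != EMPTY; i = next(i)) {
            if (slots[i] == key) {
                slots[i] = EMPTY;
                for (int j = next(i); slots[j] != EMPTY; j = next(j)) {
                    int moved = slots[j];
                    slots[j] = EMPTY;
                    put(moved);
                }
                return;
            }
        }
    }

    public static void main(String[] args) {
        ProbeRemovalSketch naive = new ProbeRemovalSketch(16);
        naive.put(1);
        naive.put(17); // collides with 1 and lands in slot 2
        naive.removeNaive(1);
        System.out.println(naive.contains(17)); // false: probing stops at empty slot 1

        ProbeRemovalSketch fixed = new ProbeRemovalSketch(16);
        fixed.put(1);
        fixed.put(17);
        fixed.removeAndRehash(1);
        System.out.println(fixed.contains(17)); // true: 17 was moved back to slot 1
    }
}
```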

Accessing copy variables

With ThreadLocalMap covered, access to the copy variables is easy to understand. Below is the implementation of the set and get methods in ThreadLocal:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T) e.value;
            return result;
        }
    }
    return setInitialValue();
}

The basic flow is to first obtain the ThreadLocalMap of the current thread, then use the ThreadLocal instance as the key to read or write the copy variable in that map. The ThreadLocalMap of a thread is created lazily: createMap is called only when a value is first stored for that thread. The following is the implementation of createMap:

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

To give the copy variable of a ThreadLocal an initial value, override the initialValue method, as follows:

ThreadLocal<Integer> threadLocal = new ThreadLocal<Integer>() {
    @Override
    protected Integer initialValue() {
        return 0;
    }
};
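Since Java 8 the same effect is available without subclassing, through the static factory ThreadLocal.withInitial. The sketch below uses a made-up class name and field name. It also shows that after remove() the next get() computes the initial value again.

```java
// Sketch: supplying an initial value with ThreadLocal.withInitial (Java 8 and later).
class InitialValueSketch {
    static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) {
        System.out.println(COUNTER.get()); // 0: the first get() in this thread runs the supplier
        COUNTER.set(5);
        System.out.println(COUNTER.get()); // 5
        COUNTER.remove();
        System.out.println(COUNTER.get()); // 0: after remove(), the initial value is computed again
    }
}
```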

ThreadLocal hash value

The hash value of a ThreadLocal instance is fixed when the instance is created. The following is the implementation in ThreadLocal:

/**
 * ThreadLocals rely on per-thread linear-probe hash maps attached
 * to each thread (Thread.threadLocals and
 * inheritableThreadLocals).  The ThreadLocal objects act as keys,
 * searched via threadLocalHashCode.  This is a custom hash code
 * (useful only within ThreadLocalMaps) that eliminates collisions
 * in the common case where consecutively constructed ThreadLocals
 * are used by the same threads, while remaining well-behaved in
 * less common cases.
 */
private final int threadLocalHashCode = nextHashCode();

/**
 * The next hash code to be given out. Updated atomically. Starts at
 * zero.
 */
private static AtomicInteger nextHashCode =
    new AtomicInteger();

/**
 * The difference between successively generated hash codes - turns
 * implicit sequential thread-local IDs into near-optimally spread
 * multiplicative hash values for power-of-two-sized tables.
 */
private static final int HASH_INCREMENT = 0x61c88647;

/**
 * Returns the next hash code.
 */
private static int nextHashCode() {
    return nextHashCode.getAndAdd(HASH_INCREMENT);
}

threadLocalHashCode is a final field whose value is generated by the nextHashCode() method. nextHashCode() simply adds 0x61c88647 to a static AtomicInteger (initial value 0) on each call; using AtomicInteger ensures that each addition is atomic. The constant 0x61c88647 has a remarkable property: it spreads the hash codes evenly over an array whose size is a power of 2. Here is a program to test it:

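import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// The class name is chosen for illustration only.
public class HashIncrementTest {
    public static void main(String[] args) {
        AtomicInteger hashCode = new AtomicInteger();
        int hash_increment = 0x61c88647;
        int size = 16;
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(hashCode.getAndAdd(hash_increment) & (size - 1));
        }
        System.out.println("original:" + list);
        Collections.sort(list);
        System.out.println("sort:    " + list);
    }
}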

Let's set the size to 16, 32 and 64 to test it out:

// size=16
original:[0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
sort:    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

// size=32
original:[0, 7, 14, 21, 28, 3, 10, 17, 24, 31, 6, 13, 20, 27, 2, 9, 16, 23, 30, 5, 12, 19, 26, 1, 8, 15, 22, 29, 4, 11, 18, 25]
sort:    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]

// size=64
original:[0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 6, 13, 20, 27, 34, 41, 48, 55, 62, 5, 12, 19, 26, 33, 40, 47, 54, 61, 4, 11, 18, 25, 32, 39, 46, 53, 60, 3, 10, 17, 24, 31, 38, 45, 52, 59, 2, 9, 16, 23, 30, 37, 44, 51, 58, 1, 8, 15, 22, 29, 36, 43, 50, 57]
sort:    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]

As the size changes, the hash codes are always evenly distributed. This is in fact Fibonacci hashing. So although the hash code of a ThreadLocal is fixed, when the hash table in a ThreadLocalMap is resized (to double its original size) and its entries are rehashed, the hash codes are still evenly distributed over the table.
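Two properties of the constant can be verified with a short sketch. First, 0x61c88647 equals 2^32 minus floor(2^32 / φ), where φ is the golden ratio. That is a numeric identity, and it is presumably how the constant was chosen. Second, because the increment is odd, the first 2^n hash codes fall into 2^n distinct slots of a table with 2^n slots. The class and method names are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: relating HASH_INCREMENT to the golden ratio and checking the slot distribution.
class FibonacciHashSketch {
    static final int HASH_INCREMENT = 0x61c88647;

    // Computes 2^32 - floor(2^32 / phi) with phi = (1 + sqrt(5)) / 2.
    static int deriveIncrement() {
        double phi = (1 + Math.sqrt(5)) / 2;
        long golden = (long) ((1L << 32) / phi); // 2654435769 = 0x9e3779b9
        return (int) ((1L << 32) - golden);
    }

    // Counts the distinct slots hit by the first `size` hash codes in a table of `size` slots.
    static int distinctSlots(int size) {
        Set<Integer> slots = new HashSet<>();
        int hash = 0;
        for (int i = 0; i < size; i++) {
            slots.add(hash & (size - 1));
            hash += HASH_INCREMENT; // int overflow wraps, as with AtomicInteger.getAndAdd
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println(deriveIncrement() == HASH_INCREMENT); // true
        for (int size = 16; size <= 1024; size *= 2) {
            // The two numbers on each line are equal: every slot is hit exactly once.
            System.out.println(size + " -> " + distinctSlots(size));
        }
    }
}
```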

Application scenarios

Excerpted from  Java Concurrent Programming: An in-depth look at ThreadLocal

The most common application scenarios for ThreadLocal are database connection management, Session management, and the like. For example:

private static ThreadLocal<Connection> connectionHolder = new ThreadLocal<Connection>() {
    @Override
    public Connection initialValue() {
        try {
            return DriverManager.getConnection(DB_URL);
        } catch (SQLException e) {
            // initialValue() cannot throw checked exceptions, so wrap it
            throw new IllegalStateException(e);
        }
    }
};

public static Connection getConnection() {
    return connectionHolder.get();
}

private static final ThreadLocal threadSession = new ThreadLocal();

public static Session getSession() throws InfrastructureException {
    Session s = (Session) threadSession.get();
    try {
        if (s == null) {
            s = getSessionFactory().openSession();
            threadSession.set(s);
        }
    } catch (HibernateException ex) {
        throw new InfrastructureException(ex);
    }
    return s;
}
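Holders like these often run on pooled threads, which live much longer than a single task. A value left in a ThreadLocal then stays attached to the thread and is seen by the next task. The sketch below uses hypothetical names and a single-thread executor so that both tasks are guaranteed to share one thread. It shows the leftover value and the usual remedy of calling remove() in a finally block.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: a ThreadLocal value outlives the task that set it when the thread is reused.
class PooledThreadCleanupSketch {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread; returns what the second task finds in CONTEXT.
    static String seenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable first = () -> {
            CONTEXT.set("request-1");
            try {
                // ... work that reads CONTEXT would go here ...
            } finally {
                if (cleanUp) {
                    CONTEXT.remove();
                }
            }
        };
        Callable<String> second = CONTEXT::get;
        try {
            pool.submit(first).get();         // wait for the first task to finish
            return pool.submit(second).get(); // runs on the same worker thread
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(seenBySecondTask(false)); // request-1: the value leaked into the next task
        System.out.println(seenBySecondTask(true));  // null: remove() cleared it
    }
}
```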

 
