HashMap thread-unsafety in a multithreaded environment (JDK 1.7)

This article analyzes the HashMap implementation in JDK 1.7.

HashMap problems caused by multithreading

  1. Concurrent put operations can cause a get operation to fall into an infinite loop.
  2. Concurrent put operations can cause elements to be lost.

Reproducing the infinite loop

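import java.util.HashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class HashMapTest extends Thread {

    // A small initial capacity forces frequent resizes while the threads are putting.
    private static HashMap<Integer, Integer> map = new HashMap<>(2);
    private static AtomicInteger at = new AtomicInteger();

    @Override
    public void run() {
        // Deliberately unsynchronized: several threads put into the same HashMap.
        while (at.get() < 1000000) {
            map.put(at.get(), at.get());
            at.incrementAndGet();
        }
    }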


    public static void main(String[] args) {
        HashMapTest t0 = new HashMapTest();
        HashMapTest t1 = new HashMapTest();
        HashMapTest t2 = new HashMapTest();
        HashMapTest t3 = new HashMapTest();
        HashMapTest t4 = new HashMapTest();
        HashMapTest t5 = new HashMapTest();
        t0.start();
        t1.start();
        t2.start();
        t3.start();
        t4.start();
        t5.start();

        for (int i = 0; i < 1000000; i++) {
            Integer integer = map.get(i);
            System.out.println(integer);
        }
    }
}

Run the program several times. When the following appears, the infinite loop has been triggered:

From the above, Thread-7 is stuck in an infinite loop caused by the HashMap resize.

HashMap analysis

Key source code for resizing

1    void transfer(Entry[] newTable, boolean rehash) {
2        int newCapacity = newTable.length;
3        for (Entry<K,V> e : table) {
4            while(null != e) {
5                Entry<K,V> next = e.next;
6                if (rehash) {
7                    e.hash = null == e.key ? 0 : hash(e.key);
8                }
9                int i = indexFor(e.hash, newCapacity);
10               e.next = newTable[i];
11               newTable[i] = e;
12               e = next;
13           }
14       }
15   }

Normal resizing process

Let's first look at the normal rehash process in the single-threaded case:

  1. Assume the hash algorithm is simply the key mod the size of the table (i.e. the length of the array).
  2. At the top is the old hash table, with size = 2. The keys 3, 5 and 7 all collide at table[1] after mod 2.
  3. Next the hash table is resized to 4, and all <key, value> entries are rehashed and redistributed as follows:
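As a minimal sketch of the rehash just described, the following replays the head-insertion loop of transfer on a hand-built bucket. The Node class, the key % length index function, and the chain order 3 -> 7 -> 5 are simplifying assumptions for illustration, not the JDK's actual code.

```java
import java.util.ArrayList;
import java.util.List;

class SingleThreadResize {
    // Hypothetical stand-in for HashMap.Entry: only a key and a next pointer.
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Simplified hash from the text: index = key mod table length.
    static int indexFor(int key, int length) { return key % length; }

    // Same head-insertion loop as transfer(), without the rehash flag.
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.key, newCapacity);
                e.next = newTable[i];   // link in front of the current bucket head
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    // Collects the keys of one bucket in traversal order.
    static List<Integer> keys(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.key);
        return out;
    }

    public static void main(String[] args) {
        Node[] old = new Node[2];
        old[1] = new Node(3, new Node(7, new Node(5, null))); // assumed order 3 -> 7 -> 5
        Node[] resized = transfer(old, 4);
        for (int i = 0; i < resized.length; i++) {
            System.out.println("table[" + i + "] = " + keys(resized[i]));
        }
    }
}
```

Running it prints table[1] = [5] and table[3] = [7, 3]: the two keys that stay in the same bucket come out in reversed order, which is what the concurrent case later trips over.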

In the single-threaded case everything looks fine, and the resize goes smoothly. Now let's look at resizing under concurrency.

Resizing under concurrency

  1. There are two threads, marked in red and blue respectively.

  2. Thread 1 is suspended by the CPU scheduler right after executing line 5 (so it has already obtained next = 7). Thread 2 then runs and completes all of the code above. Let's look at the state at this point:

  3. The CPU then switches back to thread 1, which executes lines 4-12 (line 5 has already been executed) and first places the Entry with key 3:

Note: thread 2 has already run to completion, so every Entry in the table is in its latest state: the next of 7 is 3, and the next of 3 is null. The first loop iteration has now ended, and 3 has been placed.

  4. Let's see what happens next:
    • e = next = 7;
    • e != null, so the loop continues
    • next = e.next = 3
    • e.next = newTable[3], so the next of 7 still points to 3
    • The Entry with key 7 is placed; the state is now as shown in the figure:

  5. After 7 is placed, the code runs again:
    • e = next = 3;
    • e is not null, so the loop continues
    • next = e.next; here the next of 3 is null
    • e.next = 7, so the next of 3 now points to 7
    • The Entry with key 3 is placed; the state is now as shown in the figure:

In fact, a cycle already exists at this point: 3 has moved to the head of the bucket and points to the Entry with key 7, while the next of 7, set earlier, still points to the Entry with key 3.

  6. The code continues: e = next = null, so the loop condition fails and the loop terminates. The resize is over. But any subsequent lookup (whether an iterating query or another resize) will hang at table[3]. Looking back at the demo at the beginning of the article, it hangs in exactly this transfer method during the resize phase.

The key cause of the problem: if two adjacent Entry objects are still assigned to the same table position after the resize, the infinite-loop bug can occur. In a complex production environment this situation is not common, but it can happen.
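The interleaving walked through above can be replayed deterministically on a single thread. The sketch below is an illustration under stated assumptions: Node, indexFor (key % length), resumeAfterLine5 and replay are hypothetical helpers, and the old bucket is assumed to be 3 -> 7 -> 5. It is not the JDK source, only the same loop applied in the same order.

```java
class ResizeCycleDemo {
    // Hypothetical stand-in for HashMap.Entry.
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    static int indexFor(int key, int length) { return key % length; }

    // The inner while loop of transfer() for one bucket chain.
    static void transfer(Node e, Node[] newTable) {
        while (e != null) {
            Node next = e.next;                       // line 5
            int i = indexFor(e.key, newTable.length); // line 9
            e.next = newTable[i];                     // line 10
            newTable[i] = e;                          // line 11
            e = next;                                 // line 12
        }
    }

    // Thread 1 waking up: line 5 already ran, so it reuses its stale e and next.
    static void resumeAfterLine5(Node e, Node next, Node[] newTable) {
        int i = indexFor(e.key, newTable.length);
        e.next = newTable[i];
        newTable[i] = e;
        transfer(next, newTable);
    }

    // Floyd's tortoise-and-hare cycle check on one bucket.
    static boolean hasCycle(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;
        }
        return false;
    }

    static Node[] replay() {
        Node n5 = new Node(5, null);
        Node n7 = new Node(7, n5);
        Node n3 = new Node(3, n7);          // assumed old bucket: 3 -> 7 -> 5
        Node e = n3, next = n3.next;        // thread 1 executes line 5, then is suspended
        transfer(n3, new Node[4]);          // thread 2 completes its whole transfer
        Node[] t1Table = new Node[4];       // thread 1 has its own new table
        resumeAfterLine5(e, next, t1Table); // thread 1 resumes with stale locals
        return t1Table;
    }

    public static void main(String[] args) {
        Node[] table = replay();
        System.out.println("cycle at table[3]: " + hasCycle(table[3]));
    }
}
```

In this replay table[3] ends up as 3 -> 7 -> 3, so any traversal of that bucket that does not find its key never reaches null.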

Multithreaded put operations resulting in lost elements

Now let's look at the problem of lost elements. This time we use the order 3, 5, 7 to demonstrate:

  1. Thread 1 is suspended by the CPU scheduler right after executing line 5:

  2. Thread 2 runs to completion:

  3. Thread 1 then resumes and first places the Entry with key 7:

  4. It then places the Entry with key 5:

  5. Because the next of 5 is null, the resize finishes at this point, and the Entry with key 3 is lost.
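The steps above can also be replayed on a single thread. This is a sketch under assumptions, not the JDK code: Node, indexFor (key % length) and replay are hypothetical, and the old bucket is assumed to be linked as 7 -> 5 -> 3, which is the order consistent with thread 1 placing 7 first and 5 second.

```java
import java.util.ArrayList;
import java.util.List;

class LostEntryDemo {
    // Hypothetical stand-in for HashMap.Entry.
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    static int indexFor(int key, int length) { return key % length; }

    // The inner while loop of transfer() for one bucket chain.
    static void transfer(Node e, Node[] newTable) {
        while (e != null) {
            Node next = e.next;
            int i = indexFor(e.key, newTable.length);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }

    // Collects the keys of one bucket in traversal order.
    static List<Integer> keys(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.key);
        return out;
    }

    static Node[] replay() {
        Node n3 = new Node(3, null);
        Node n5 = new Node(5, n3);
        Node n7 = new Node(7, n5);      // assumed old bucket: 7 -> 5 -> 3
        Node e = n7, next = n7.next;    // thread 1 executes line 5, then is suspended
        transfer(n7, new Node[4]);      // thread 2 completes; 5.next is now null
        Node[] t1Table = new Node[4];   // thread 1 has its own new table
        int i = indexFor(e.key, t1Table.length);
        e.next = t1Table[i];            // thread 1 places 7
        t1Table[i] = e;
        transfer(next, t1Table);        // places 5, then stops because 5.next == null
        return t1Table;
    }

    public static void main(String[] args) {
        Node[] table = replay();
        System.out.println("table[1] = " + keys(table[1]));
        System.out.println("table[3] = " + keys(table[3]));
    }
}
```

Thread 1's table contains only 5 and 7. If that table is the one finally assigned to the map, the Entry with key 3 is unreachable.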

Improvements in JDK 8

JDK 8 implements each bucket as a linked list that can become a red-black tree: when the length of a bucket's list exceeds 8, the list is converted into a red-black tree (provided the table capacity is at least 64; otherwise the table is resized instead).

In JDK 8, concurrent put operations no longer cause this infinite loop (JDK 8 uses head and tail pointers so that list order is preserved during a resize, whereas the JDK 7 rehash reverses the list). Data loss and the other defects inherent to unsynchronized concurrent access remain, however, so ConcurrentHashMap is still recommended for multithreaded use.
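As a rough illustration of that recommendation, the sketch below has several threads put disjoint key ranges into a ConcurrentHashMap and checks the final size. The class name ConcurrentPutDemo, the fill helper, and the thread and key counts are arbitrary choices for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentPutDemo {
    // Fills the map from several threads with disjoint key ranges
    // and returns the final number of entries.
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int k = base; k < base + perThread; k++) {
                    map.put(k, k);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Small initial capacity, as in the demo above, so many resizes happen.
        int size = fill(new ConcurrentHashMap<>(2), 6, 10_000);
        System.out.println("entries: " + size);
    }
}
```

With ConcurrentHashMap the count is always threads * perThread. Passing a plain HashMap to the same helper gives no such guarantee.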

Why HashMap is not thread-safe

The problems HashMap may exhibit under concurrency fall mainly into two categories:

  1. If multiple threads add elements with put at the same time and two of the keys collide (their hash values map to the same bucket), then, given how HashMap is implemented, both keys are added at the same array position, and the data put by one of the threads ends up being overwritten.

  2. If multiple threads simultaneously detect that the number of elements exceeds array size * loadFactor, they all resize the Node array at the same time, each recalculating element positions and copying the data. In the end only one thread's resized array is assigned to table, so the work of the other threads is lost, along with the data those threads put.
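The first case is a read-then-write race on the bucket head. A minimal single-threaded replay of that interleaving, assuming a simplified Node class and head insertion into bucket 3 of a 4-slot table (all names here are hypothetical, not JDK code), might look like this:

```java
class LostPutDemo {
    // Hypothetical stand-in for HashMap.Entry.
    static final class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    static Node[] replay() {
        Node[] table = new Node[4];
        int bucket = 3; // keys 3 and 7 both map to index 3 under key mod 4

        // Both threads read the bucket head before either of them writes.
        Node headSeenByT1 = table[bucket];
        Node headSeenByT2 = table[bucket];

        // Thread 1 links its new node in front of the head it saw.
        table[bucket] = new Node(3, headSeenByT1);
        // Thread 2 does the same with its stale head and overwrites thread 1's node.
        table[bucket] = new Node(7, headSeenByT2);
        return table;
    }

    // Counts the nodes reachable from a bucket head.
    static int count(Node head) {
        int n = 0;
        for (Node e = head; e != null; e = e.next) n++;
        return n;
    }

    public static void main(String[] args) {
        Node[] table = replay();
        System.out.println("entries in bucket 3: " + count(table[3]));
    }
}
```

Two puts were issued but only key 7 is reachable afterwards, because the second write never saw the node added by the first.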

Origin juejin.im/post/5d8c261de51d4577ea077e94