Foreword
In the previous article I walked through the source of HashMap, and we saw that it is not thread-safe: in a concurrent environment, a resize can produce a circular linked list, so a subsequent get spins in an infinite loop and times out. This article introduces the map to use in concurrent code instead: ConcurrentHashMap. Its class diagram follows.
JDK 1.7 Implementation
Segment is a reentrant lock and plays the role of the lock inside ConcurrentHashMap; HashEntry stores the key-value data.
A ConcurrentHashMap contains an array of Segments. A Segment is structured like a HashMap: an array of buckets, each bucket holding a linked list. Concretely, a Segment contains a HashEntry array, and each element of that array is the head of a linked list of HashEntry nodes. Each Segment guards its own HashEntry array: before modifying data in that array, a thread must first acquire the corresponding Segment's lock.
By using this segmented-lock technique, ConcurrentHashMap partitions the stored data into segments and gives each segment its own lock. While one thread holds the lock to access data in one segment, data in the other segments can still be accessed by other threads, achieving genuine concurrent access.
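The segment-locking idea can be illustrated with a minimal, hypothetical sketch (StripedMap below is invented for illustration and is far simpler than the real JDK 1.7 implementation): keys hash to one of several stripes, each guarded by its own lock, so threads working on different stripes never block each other.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// A toy striped map: each stripe guards its own HashMap with its own lock.
// This only illustrates lock striping; the real Segment is far richer.
class StripedMap<K, V> {
    private static final int STRIPES = 4;          // must be a power of 2
    private final ReentrantLock[] locks = new ReentrantLock[STRIPES];
    private final Map<K, V>[] tables;

    @SuppressWarnings("unchecked")
    StripedMap() {
        tables = new Map[STRIPES];
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new ReentrantLock();
            tables[i] = new HashMap<>();
        }
    }

    private int stripeFor(Object key) {
        return key.hashCode() & (STRIPES - 1);     // pick a stripe
    }

    public V put(K key, V value) {
        int s = stripeFor(key);
        locks[s].lock();                           // lock only this stripe
        try {
            return tables[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    public V get(K key) {
        int s = stripeFor(key);
        locks[s].lock();
        try {
            return tables[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }
}
```

Two threads writing keys that land on different stripes proceed fully in parallel; only same-stripe writers contend.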
1. Storage structure
static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;
    volatile HashEntry<K,V> next;
}
static final class Segment<K,V> extends ReentrantLock implements Serializable {
    private static final long serialVersionUID = 2249069246763182397L;
    static final int MAX_SCAN_RETRIES =
        Runtime.getRuntime().availableProcessors() > 1 ? 64 : 1;
    transient volatile HashEntry<K,V>[] table;
    transient int count;
    transient int modCount;
    transient int threshold;
    final float loadFactor;
}

final Segment<K,V>[] segments;
public ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel)
- initialCapacity: the initial capacity across all segments combined, i.e. how many HashEntry slots the map starts with in total. If initialCapacity is not a power of 2, it is rounded up to the next power of 2.
- loadFactor: defaults to 0.75.
- concurrencyLevel: how many threads may update the map concurrently. There are as many segments as the concurrencyLevel requires; the actual segment count is the smallest power of 2 greater than or equal to concurrencyLevel.
static final int DEFAULT_CONCURRENCY_LEVEL = 16;
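The power-of-two rounding of concurrencyLevel can be sketched with a small helper; nextPowerOfTwo is written here for illustration and mirrors the loop the JDK 1.7 constructor uses to derive the segment count (the real code additionally caps the result at MAX_SEGMENTS).

```java
public class SegmentSizing {
    // Smallest power of two >= n, as the JDK 1.7 constructor computes
    // when turning concurrencyLevel into a segment count.
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    public static void main(String[] args) {
        // concurrencyLevel 10 yields 16 segments; 16 yields 16; 17 yields 32.
        System.out.println(nextPowerOfTwo(10)); // 16
        System.out.println(nextPowerOfTwo(16)); // 16
        System.out.println(nextPowerOfTwo(17)); // 32
    }
}
```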
Next, let's look at a few key methods of ConcurrentHashMap: get, put, rehash (resizing), and size, and see how it achieves concurrency.
2. get operation
- Step 1: compute the key's hashCode.
- Step 2: use the hashCode from step 1 to locate the segment; if the segment is not null and segment.table is not null, proceed to step 3; otherwise return null, meaning the key has no mapping.
- Step 3: use the hashCode to locate the HashEntry bucket in the table and traverse its linked list; if the key is found, return the corresponding value.
- Step 4: if the end of the list is reached without finding the key, return null, meaning the key has no mapping.
3. put operation
- Validate the arguments: if value is null, throw a NullPointerException.
- Compute the key's hashCode.
- Locate the segment; if the segment does not exist, create it.
- Delegate to the segment's own put method to perform the insertion within that segment.
The segment's put method works as follows:
1. Acquire the segment lock, via tryLock with a fallback to scanAndLockForPut, which spins and may pre-create the node before blocking.
2. Locate the specific bucket in the HashEntry array.
3. Traverse the bucket's linked list. If the key to insert already exists:
- if !onlyIfAbsent, record oldValue and update the entry's value to the new value, then go to step 5;
- otherwise, go directly to step 5.
4. If the key is not found, create a new HashEntry and link it in at the head of the bucket; if the count would exceed the threshold, rehash first. The rehash itself has two parts:
- first, double the HashEntry array;
- second, recompute each element's position and place it into the new array.
5. Release the lock and return oldValue.
4. size operation
/**
 * The number of elements. Accessed only either within locks
 * or among other volatile reads that maintain visibility.
 */
transient int count;
static final int RETRIES_BEFORE_LOCK = 2;

public int size() {
    // Try a few times to get accurate count. On failure due to
    // continuous async changes in table, resort to locking.
    final Segment<K,V>[] segments = this.segments;
    int size;
    boolean overflow; // true if size overflows 32 bits
    long sum;         // sum of modCounts
    long last = 0L;   // previous sum
    int retries = -1; // first iteration isn't retry
    try {
        for (;;) {
            // Once the retry limit is exceeded, lock every Segment
            if (retries++ == RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    ensureSegment(j).lock(); // force creation
            }
            sum = 0L;
            size = 0;
            overflow = false;
            for (int j = 0; j < segments.length; ++j) {
                Segment<K,V> seg = segmentAt(segments, j);
                if (seg != null) {
                    sum += seg.modCount;
                    int c = seg.count;
                    if (c < 0 || (size += c) < 0)
                        overflow = true;
                }
            }
            // If two consecutive passes produce the same sum,
            // the result is considered correct
            if (sum == last)
                break;
            last = sum;
        }
    } finally {
        if (retries > RETRIES_BEFORE_LOCK) {
            for (int j = 0; j < segments.length; ++j)
                segmentAt(segments, j).unlock();
        }
    }
    return overflow ? Integer.MAX_VALUE : size;
}
How does ConcurrentHashMap detect that a Segment changed while size() was gathering its statistics? It compares the sum of the segments' modCounts across consecutive passes: every structural modification increments a modCount, so matching sums indicate a stable snapshot.
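That retry-and-compare loop can be reduced to a simplified, single-threaded sketch; Seg here is a hypothetical stand-in for the real Segment, and trySize returns -1 where the real size() would retry or, after enough failed attempts, fall back to locking every segment.

```java
public class SizeRetry {
    // Hypothetical stand-in for Segment: just a count and a modCount.
    static class Seg {
        int count;
        int modCount;
    }

    // Sum the counts; return -1 if the modCount sums of two passes
    // disagree, signalling the caller to retry (or lock).
    static int trySize(Seg[] segs) {
        long sum1 = 0, sum2 = 0;
        int size = 0;
        for (Seg s : segs) {        // first pass: counts + modCounts
            sum1 += s.modCount;
            size += s.count;
        }
        for (Seg s : segs) {        // second pass: modCounts again
            sum2 += s.modCount;
        }
        return (sum1 == sum2) ? size : -1;
    }

    public static void main(String[] args) {
        Seg a = new Seg(); a.count = 3; a.modCount = 7;
        Seg b = new Seg(); b.count = 2; b.modCount = 4;
        System.out.println(trySize(new Seg[] { a, b })); // 5: stable snapshot
    }
}
```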
JDK 1.8 changes
JDK 1.8 implements the map with an array + linked list + red-black tree structure. When the number of nodes in a bucket exceeds 8, the list is converted into a red-black tree; the purpose of this design is to improve read efficiency for buckets with many hash collisions.
Java 8 mainly made the following optimizations:
- Segment is discarded; Node (which implements Map.Entry) is used directly as the table element.
- When modifying, ReentrantLock is no longer used; the built-in synchronized lock is taken directly. The built-in lock has been heavily optimized since earlier versions, and its performance is no longer inferior to ReentrantLock's.
- The size method is optimized by adding the CounterCell inner class, which counts the elements in each bucket in parallel.

The sizeCtl field encodes the table's state:
- A negative value means initialization or resizing is in progress: -1 means the table is being initialized, and -N indicates that N - 1 threads are resizing.
- 0 means the table has not been initialized yet; a positive value is the next resize threshold.
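The sizeCtl states can be made concrete with a small decoder. This follows the article's simplified reading of the negative values; the real JDK 1.8 encoding packs a resize stamp into the high bits during a resize, so the low-bit arithmetic is more involved than shown here.

```java
public class SizeCtlDemo {
    // Interpret sizeCtl under the simplified reading described above.
    static String describe(int sizeCtl) {
        if (sizeCtl == -1) return "initializing";
        if (sizeCtl < -1)  return (-sizeCtl - 1) + " thread(s) resizing";
        if (sizeCtl == 0)  return "not yet initialized";
        return "next resize threshold = " + sizeCtl;
    }

    public static void main(String[] args) {
        System.out.println(describe(-1)); // initializing
        System.out.println(describe(-3)); // 2 thread(s) resizing
        System.out.println(describe(0));  // not yet initialized
        System.out.println(describe(12)); // next resize threshold = 12
    }
}
```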
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
}
CAS operations
ConcurrentHashMap has three core table operations built on CAS and volatile access:
- tabAt: volatile-read the node at position i of the array;
- casTabAt: CAS-set the node at position i of the array;
- setTabAt: volatile-write the node at position i (used only inside locked regions).
// Volatile read of the Node at index i
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}

// CAS the Node at index i (compare c against table[i]; if equal, install v)
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}

// Volatile write of the Node at index i; called only inside locked regions
static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
    U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
}
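Outside the JDK (Unsafe is not meant for application code), the same three primitives are available on `java.util.concurrent.atomic.AtomicReferenceArray`, which offers volatile reads, CAS, and volatile writes on array slots:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class TabOps {
    public static void main(String[] args) {
        AtomicReferenceArray<String> tab = new AtomicReferenceArray<>(16);

        // casTabAt analogue: install a node only if the slot is still null.
        System.out.println(tab.compareAndSet(5, null, "first"));  // true
        // A second CAS against null fails: someone got there first.
        System.out.println(tab.compareAndSet(5, null, "second")); // false

        // tabAt analogue: volatile read of slot 5.
        System.out.println(tab.get(5));                           // first

        // setTabAt analogue: volatile write (the JDK does this under
        // the bin lock).
        tab.set(5, "updated");
        System.out.println(tab.get(5));                           // updated
    }
}
```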
initTable() method
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // If a thread sees sizeCtl < 0, another thread's CAS has already
        // succeeded; this thread only needs to yield its CPU time slice
        if ((sc = sizeCtl) < 0)
            Thread.yield();
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            // CAS sizeCtl to -1: this thread is performing the initialization
            try {
                if ((tab = table) == null || tab.length == 0) {
                    // DEFAULT_CAPACITY, the default initial capacity, is 16
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    // Allocate the array: length 16, or the length given
                    // at construction
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    // Publish it through the volatile table field
                    table = tab = nt;
                    // For n = 16 this makes sc = 12, i.e. 0.75 * n
                    sc = n - (n >>> 2);
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
initTable examines the value of sizeCtl: if it is -1, another thread is initializing, so the current thread calls yield() and waits.
If it is non-negative, the thread first CASes sizeCtl to -1 and then performs the initialization.
So the first thread to execute put calls Unsafe.compareAndSwapInt to set sizeCtl to -1; only one thread can win that CAS, and the other threads give up their CPU time slice via Thread.yield() until the table initialization completes.
In summary, initialization is a single-threaded operation.
put() method
public V put(K key, V value) {
    return putVal(key, value, false);
}

/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    // Neither key nor value may be null
    if (key == null || value == null) throw new NullPointerException();
    // Returns (h ^ (h >>> 16)) & HASH_BITS
    int hash = spread(key.hashCode());
    int binCount = 0;
    // Loop until the insertion succeeds
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            // The table is empty: initialize it
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // The bucket is empty
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED) // MOVED = -1
            // A resize is in progress: help with the transfer
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Lock the head node of the bucket's list
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        // Traverse the linked list
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            // Same hash and key: update the value
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                // onlyIfAbsent is true only in putIfAbsent():
                                // if the key is present it returns the
                                // existing value without replacing it
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            // Reached the tail: append the new node
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) { // tree bin
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                        value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                // Check whether to convert the list into a red-black tree;
                // the threshold is 8, the same as HashMap's
                if (binCount >= TREEIFY_THRESHOLD)
                    // If table length < 64, this just calls tryPresize to
                    // double table.length instead of building a tree
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

int index = (n - 1) & hash;
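Because the table length n is always a power of two, `(n - 1) & hash` computes the same bucket index as `hash % n` for the non-negative hashes that spread() produces, while being cheaper than a division. A quick check:

```java
public class IndexDemo {
    // Same bit mixing as ConcurrentHashMap.spread (HASH_BITS = 0x7fffffff),
    // reproduced here for illustration.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;
    }

    public static void main(String[] args) {
        int n = 16; // table length, always a power of two
        int hash = spread("hello".hashCode());

        int byMask = (n - 1) & hash;
        int byMod  = hash % n;
        System.out.println(byMask == byMod);           // true: mask == mod
        System.out.println(byMask >= 0 && byMask < n); // true: a valid index
    }
}
```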
The overall flow of putVal is:
1. Reject null keys and values, then spread the key's hashCode.
2. If the table is empty, initialize it via initTable().
3. Compute the bucket index as (n - 1) & hash and read the head node f with tabAt.
4. If f is null, this is the first element inserted at that position: insert the Node using Unsafe.compareAndSwapObject.
- If the CAS succeeds, the node is inserted; break out of the loop, after which addCount(1L, binCount) checks whether the current size requires a resize.
- If the CAS fails, another thread inserted a node first; spin around the loop and retry the insertion at this position.
5. If f's hash is MOVED, a resize is in progress: help it along via helpTransfer.
6. In the remaining cases the new node is inserted into the appropriate place in a linked list or a red-black tree, with the built-in lock providing synchronization, as in the code above.
The thread synchronizes on node f, and before inserting it re-checks tabAt(tab, i) == f to guard against modification by other threads:
- If f.hash >= 0, f is the head of a linked-list bucket: traverse the list; if a node with the same key is found, update its value, otherwise append a new node at the tail.
- If f is of type TreeBin, f wraps the root of a red-black tree: traverse the tree structure to update or add the node.
- If the list's node count binCount >= TREEIFY_THRESHOLD (default 8), convert the list into a red-black tree.
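The onlyIfAbsent flag above is exactly what distinguishes put from putIfAbsent at the API level; a quick usage check:

```java
import java.util.concurrent.ConcurrentHashMap;

public class PutDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        System.out.println(map.put("k", 1));         // null: no previous value
        System.out.println(map.put("k", 2));         // 1: old value, replaced
        System.out.println(map.putIfAbsent("k", 9)); // 2: present, NOT replaced
        System.out.println(map.get("k"));            // 2
    }
}
```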
Converting a list to a red-black tree: treeifyBin()
private final void treeifyBin(Node<K,V>[] tab, int index) {
    Node<K,V> b; int n, sc;
    if (tab != null) {
        // MIN_TREEIFY_CAPACITY is 64, so while the array length is below
        // 64 (i.e. 32, 16, or smaller), the array is resized instead
        if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
            // Analyzed in detail below
            tryPresize(n << 1);
        // b is the head node of the bucket
        else if ((b = tabAt(tab, index)) != null && b.hash >= 0) {
            // Lock the head node
            synchronized (b) {
                if (tabAt(tab, index) == b) {
                    // Walk the linked list and build a red-black tree
                    TreeNode<K,V> hd = null, tl = null;
                    for (Node<K,V> e = b; e != null; e = e.next) {
                        TreeNode<K,V> p =
                            new TreeNode<K,V>(e.hash, e.key, e.val,
                                              null, null);
                        if ((p.prev = tl) == null)
                            hd = p;
                        else
                            tl.next = p;
                        tl = p;
                    }
                    // Install the tree at the bucket's slot in the array
                    setTabAt(tab, index, new TreeBin<K,V>(hd));
                }
            }
        }
    }
}
Resizing: tryPresize()
The resize here doubles the capacity: the array grows by a factor of two.
// Note first that by the time size is passed in, it has already been doubled
private final void tryPresize(int size) {
    // c: 1.5 * size, plus 1, rounded up to the nearest power of 2
    int c = (size >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY :
        tableSizeFor(size + (size >>> 1) + 1);
    int sc;
    while ((sc = sizeCtl) >= 0) {
        Node<K,V>[] tab = table; int n;
        // This branch is essentially the same as the
        // initialization code shown earlier
        if (tab == null || (n = tab.length) == 0) {
            n = (sc > c) ? sc : c;
            if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    if (table == tab) {
                        @SuppressWarnings("unchecked")
                        Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                        table = nt;
                        sc = n - (n >>> 2); // 0.75 * n
                    }
                } finally {
                    sizeCtl = sc;
                }
            }
        }
        else if (c <= sc || n >= MAXIMUM_CAPACITY)
            break;
        else if (tab == table) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                Node<K,V>[] nt;
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                // 2. CAS sizeCtl up by 1, then run transfer;
                //    nextTab is non-null at this point
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            // 1. Set sizeCtl to (rs << RESIZE_STAMP_SHIFT) + 2 and call
            //    transfer with a null nextTab
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
        }
    }
}
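The capacity computation at the top of tryPresize, c = tableSizeFor(size + (size >>> 1) + 1), is 1.5x the requested size plus one, rounded up to a power of two. tableSizeForLike below is an illustrative copy of the JDK's tableSizeFor bit trick, written without the MAXIMUM_CAPACITY clamp:

```java
public class PresizeDemo {
    // Mirrors java.util.concurrent.ConcurrentHashMap.tableSizeFor
    // (the MAXIMUM_CAPACITY clamp is omitted in this sketch).
    static int tableSizeForLike(int cap) {
        int n = cap - 1;          // smear the highest set bit downward...
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n + 1;             // ...then round up to a power of two
    }

    public static void main(String[] args) {
        int size = 32; // already doubled by the caller, per the comment above
        int c = tableSizeForLike(size + (size >>> 1) + 1); // 49 -> 64
        System.out.println(c); // 64
    }
}
```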
I won't analyze the source of the transfer() method here; roughly speaking, its job is to migrate the elements of the original tab array into the new nextTable array.
get() method
public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) { // tabAt(i): Node at index i
        // Check whether the head node is the one we need
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        // If the head node's hash < 0, a resize is in progress
        // or the bucket holds a red-black tree
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        // Traverse the linked list
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
Node<K,V> find(int h, Object k) {
    Node<K,V> e = this;
    if (k != null) {
        do {
            K ek;
            if (e.hash == h &&
                ((ek = e.key) == k || (ek != null && k.equals(ek))))
                return e;
        } while ((e = e.next) != null);
    }
    return null;
}
- If the bucket is null, simply return null.
- If the node at that position is exactly the one we need, return its value.
- If the node's hash is negative, a resize is in progress or the bucket is a red-black tree; delegate to the node's find method.
- If none of the above applies, the bucket is a linked list: traverse it and compare keys.
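A consequence worth noting, following from the null checks in putVal: since ConcurrentHashMap rejects null keys and values, a null return from get unambiguously means the key is absent (unlike HashMap, where a null value is legal and get cannot distinguish "absent" from "mapped to null"):

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("k", "v");
        System.out.println(map.get("missing")); // null always means absent

        try {
            map.put("bad", null);               // null values are rejected...
        } catch (NullPointerException e) {
            System.out.println("null value rejected");
        }
        try {
            map.put(null, "v");                 // ...and so are null keys
        } catch (NullPointerException e) {
            System.out.println("null key rejected");
        }
    }
}
```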
Summary
Here I have basically walked through the implementation of ConcurrentHashMap in both JDK 1.7 and JDK 1.8, with a detailed analysis of several important methods: initialization, put, and get. ConcurrentHashMap changed greatly in JDK 1.8: CAS + synchronized replaces the original JDK 1.7 Segment locking mechanism, supporting higher concurrency.
This is only my second pass at learning ConcurrentHashMap. To truly understand and master its subtle implementation, I think it needs a few more readings; I believe each one brings a new harvest.