Java source code analysis and performance optimization: the HashMap data structure

Start with the storage structure. HashMap is built on a hash table. It holds an internal array; when an element is stored, HashMap first computes a hash of the key and derives from it an index into the array. If that slot is empty, the new element is placed there directly. If the slot already holds an element (call it A), the new element is linked in front of A and takes A's place in the array. So the array in HashMap actually stores the head node of each linked list. Below is a diagram from Baidu Encyclopedia:

 

As shown above, each element is an Entry object that stores the key, the value, and a pointer to the next Entry. All keys whose hash values map to the same slot (i.e., collisions) are strung together in a linked list. This is the zipper method, also known as separate chaining.
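
The chaining idea can be tried out in a few lines. What follows is a minimal sketch, not the JDK source: the names ChainSketch and Node are hypothetical, the table is fixed at 16 slots, and there is no resizing. It uses the strings "Aa" and "BB", which have the same hashCode() in Java, to force a collision.

```java
// A minimal sketch of separate chaining (the "zipper method"), not the JDK source.
// ChainSketch and Node are hypothetical names used only for illustration.
class ChainSketch {
    // One link in a bucket's chain: key, value, and a pointer to the next node.
    static final class Node {
        final String key;
        int value;
        Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // The array stores only the head node of each chain.
    final Node[] table = new Node[16];

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Returns the previous value, or null if the key was not present.
    Integer put(String key, int value) {
        int i = indexFor(key.hashCode(), table.length);
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                int old = n.value;
                n.value = value;
                return old;
            }
        }
        // Head insertion: the new node points at the old head and takes its slot.
        table[i] = new Node(key, value, table[i]);
        return null;
    }

    Integer get(String key) {
        int i = indexFor(key.hashCode(), table.length);
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainSketch m = new ChainSketch();
        m.put("Aa", 1);
        m.put("BB", 2); // same hashCode as "Aa", so it lands in the same bucket
        System.out.println(m.table[0].key + " -> " + m.table[0].next.key); // BB -> Aa
    }
}
```

The later key ends up at the head of the chain, which matches the description above: the array slot always holds the most recently inserted node of its bucket.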

Internal variables

// default capacity
static final int DEFAULT_INITIAL_CAPACITY = 16;
// maximum capacity
static final int MAXIMUM_CAPACITY = 1 << 30;
// default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// hash table
transient Entry<K,V>[] table;
// number of key-value mappings
transient int size;
// resize threshold
int threshold;
// load factor of the hash table
final float loadFactor;

Among these variables, the capacity is the length of the hash table, i.e. the size of table; the default is 16. The load factor loadFactor describes how full the hash table is allowed to get. The JDK documentation says:

The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
Roughly speaking: the load factor measures how full the hash table may get before it is expanded. When the number of key-value pairs exceeds the product of the current capacity and the load factor, the table is rehashed (its internal data structures are rebuilt) and its capacity becomes about double the original.

As the variable definitions above show, the default load factor DEFAULT_LOAD_FACTOR is 0.75. The higher the value, the better the space utilization, but the slower lookups become (this affects both get and put). With the load factor understood, threshold is easy to understand as well: it is simply capacity * load factor.
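
As a worked example of that product, the following sketch reuses the threshold expression from the JDK 7 constructor. The class name ThresholdSketch and the helper method threshold() are made up for this example.

```java
// A small sketch of the threshold arithmetic; ThresholdSketch is a hypothetical name.
class ThresholdSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same expression as in the constructor: capacity * loadFactor,
    // capped at MAXIMUM_CAPACITY + 1 and truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
    }
}
```

With the defaults, a table of 16 slots therefore becomes a candidate for resizing once it holds 12 entries.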

Constructor

public HashMap(int initialCapacity, float loadFactor) {
  if (initialCapacity < 0)
    throw new IllegalArgumentException("Illegal initial capacity: " +
                      initialCapacity);
  if (initialCapacity > MAXIMUM_CAPACITY)
    initialCapacity = MAXIMUM_CAPACITY;
  if (loadFactor <= 0 || Float.isNaN(loadFactor))
    throw new IllegalArgumentException("Illegal load factor: " +
                      loadFactor);
 
  // Find a power of 2 >= initialCapacity
  int capacity = 1;
  while (capacity < initialCapacity) // smallest power of 2 that is not less than the requested capacity
    capacity <<= 1;
 
  this.loadFactor = loadFactor;
  threshold = (int)Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
  table = new Entry[capacity];   // allocate the hash table
  useAltHashing = sun.misc.VM.isBooted() &&
      (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
  init();
}
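
The capacity loop in this constructor can be run in isolation to see which table sizes result. This is only a sketch: the class name CapacitySketch and the method name roundUpToPowerOfTwo are invented here, and the argument validation of the real constructor is left out.

```java
// Sketch of the constructor's capacity rounding; names are hypothetical.
class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of 2 that is >= initialCapacity,
    // after clamping the request to MAXIMUM_CAPACITY as the constructor does.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
    }
}
```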

 

There are several constructors, and all of them eventually call the one above. It takes two parameters: the initial capacity and the load factor. It first checks whether the arguments are legal and throws an exception if they are not. The important part is the capacity calculation, which finds the smallest power of 2 that is greater than or equal to initialCapacity. The goal is a capacity that is no smaller than the requested initial capacity and is also a power of 2, and this is the key point. The main reason is the way hash values are mapped to table indices. First look at the hash-related methods in HashMap:

final int hash(Object k) {
  int h = 0;
  if (useAltHashing) {
    if (k instanceof String) {
      return sun.misc.Hashing.stringHash32((String) k);
    }
    h = hashSeed;
  }
  h ^= k.hashCode();
  // This function ensures that hashCodes that differ only by
  // constant multiples at each bit position have a bounded
  // number of collisions (approximately 8 at default load factor).
  h ^= (h >>> 20) ^ (h >>> 12);
  return h ^ (h >>> 7) ^ (h >>> 4);
}
static int indexFor(int h, int length) {
  return h & (length-1);
}

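The hash() method recomputes the hash of the key with a fairly involved series of bit operations. I have not worked through the exact logic, but its purpose is to reduce collisions: the shifts and XORs fold the higher bits of hashCode() into the lower bits, which are the only bits indexFor() looks at when the table is small.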
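
To see what the bit mixing buys, the sketch below copies only the two shift-and-XOR lines from hash() (the useAltHashing branch is left out) and applies them to two hash codes that differ only in their high bits. The class name HashSpreadSketch, the method name spread(), and the sample values 0x10000 and 0x20000 are chosen for this example.

```java
// Sketch of the bit-mixing step of the JDK 7 hash() method; names are hypothetical.
class HashSpreadSketch {
    // The two mixing lines from hash(), applied to a raw hashCode.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000; // the two values differ only above the low 4 bits
        int b = 0x20000;
        // Without mixing, both fall into bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16)); // 0 0
        // After mixing, the high bits influence the index.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```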

The indexFor() method below it computes an element's index in the hash table from its hash. The usual approach is to take the hash modulo the table length. When the length (i.e. the capacity) is a power of 2, h & (length-1) has the same effect. Moreover, a power of 2 (other than 1) is even, so length-1 is odd and its lowest binary digit is 1. The lowest bit of h & (length-1) can therefore be either 1 or 0, and hashes spread evenly. If the length were odd, length-1 would be even with a lowest bit of 0; the lowest bit of h & (length-1) would then always be 0, every resulting index would be even, and half of the hash table would be wasted. That is why the capacity of a HashMap must be a power of 2. Both the default DEFAULT_INITIAL_CAPACITY = 16 and MAXIMUM_CAPACITY = 1 << 30 follow this rule.
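
Both claims are easy to check by experiment. The sketch below compares the mask with Math.floorMod for a power-of-two length and counts how many distinct slots are ever hit for lengths 16 and 15. The class name IndexMaskSketch and the helper distinctIndices() are hypothetical.

```java
// Sketch comparing h & (length-1) with modulo; names are hypothetical.
class IndexMaskSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts how many different slots are hit by the hashes 0..samples-1.
    static int distinctIndices(int length, int samples) {
        boolean[] seen = new boolean[length];
        int count = 0;
        for (int h = 0; h < samples; h++) {
            int i = indexFor(h, length);
            if (!seen[i]) {
                seen[i] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // For a power-of-two length the mask agrees with the non-negative modulo.
        System.out.println(indexFor(37, 16) == Math.floorMod(37, 16));   // true
        System.out.println(indexFor(-37, 16) == Math.floorMod(-37, 16)); // true
        // Length 16 uses every slot; length 15 (mask 1110 in binary) uses only even ones.
        System.out.println(distinctIndices(16, 100)); // 16
        System.out.println(distinctIndices(15, 100)); // 8
    }
}
```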

Entry object
Key-value pairs in HashMap are wrapped in Entry objects. Entry is a nested class of HashMap; here is its implementation:

static class Entry<K,V> implements Map.Entry<K,V> {
  final K key;
  V value;
  Entry<K,V> next;
  int hash;
 
  Entry(int h, K k, V v, Entry<K,V> n) {
    value = v;
    next = n;
    key = k;
    hash = h;
  }
 
  public final K getKey() {
    return key;
  }
 
  public final V getValue() {
    return value;
  }
 
  public final V setValue(V newValue) {
    V oldValue = value;
    value = newValue;
    return oldValue;
  }
 
  public final boolean equals(Object o) {
    if (!(o instanceof Map.Entry))
      return false;
    Map.Entry e = (Map.Entry)o;
    Object k1 = getKey();
    Object k2 = e.getKey();
    if (k1 == k2 || (k1 != null && k1.equals(k2))) {
      Object v1 = getValue();
      Object v2 = e.getValue();
      if (v1 == v2 || (v1 != null && v1.equals(v2)))
        return true;
    }
    return false;
  }
 
  public final int hashCode() {
    return (key==null  ? 0 : key.hashCode()) ^
        (value==null ? 0 : value.hashCode());
  }
 
  public final String toString() {
    return getKey() + "=" + getValue();
  }
  void recordAccess(HashMap<K,V> m) {
  }
  void recordRemoval(HashMap<K,V> m) {
  }
}

The implementation of this class is simple and straightforward. It provides getKey(), getValue(), and similar methods; two entries are considered equal only when both their keys and their values are equal.
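
HashMap.Entry is not accessible from outside the java.util package, so the sketch below illustrates the same equals(), hashCode(), and toString() behavior with java.util.AbstractMap.SimpleEntry, a public class that implements the same Map.Entry contract. It stands in for the class above and is not the class itself; EntryContractSketch is a made-up name.

```java
import java.util.AbstractMap;
import java.util.Map;

// Sketch of the Map.Entry contract using the public SimpleEntry class.
class EntryContractSketch {
    public static void main(String[] args) {
        Map.Entry<String, Integer> a = new AbstractMap.SimpleEntry<>("k", 1);
        Map.Entry<String, Integer> b = new AbstractMap.SimpleEntry<>("k", 1);
        Map.Entry<String, Integer> c = new AbstractMap.SimpleEntry<>("k", 2);

        System.out.println(a.equals(b)); // true: same key and same value
        System.out.println(a.equals(c)); // false: same key, different value
        // hashCode is the key's hash XOR the value's hash.
        System.out.println(a.hashCode() == ("k".hashCode() ^ Integer.valueOf(1).hashCode())); // true
        System.out.println(a); // k=1
    }
}
```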

put operation
Something has to be put before it can be fetched with get, so look at the put() method first:

public V put(K key, V value) {
  if (key == null)
    return putForNullKey(value);
  int hash = hash(key);
  int i = indexFor(hash, table.length);
  for (Entry<K,V> e = table[i]; e != null; e = e.next) {
    Object k;
    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
      V oldValue = e.value;
      e.value = value;
      e.recordAccess(this);
      return oldValue;
    }
  }
 
  modCount++;
  addEntry(hash, key, value, i);
  return null;
}

This method first checks whether key is null; if so, it calls putForNullKey(), which shows that HashMap allows a null key (a null value is allowed too). If the key is not null, it computes the hash and derives the index in table. It then searches the linked list at that index for the same key; if the key already exists, the value is simply updated and the old value is returned. Otherwise addEntry() is called to insert a new entry.
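
From the caller's side, the two paths through put() look like this. The snippet uses the public java.util.HashMap API; only the class name PutUsageSketch is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Usage sketch: what put() returns on insert versus update.
class PutUsageSketch {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        System.out.println(m.put("a", 1)); // null: no previous mapping, a new entry is added
        System.out.println(m.put("a", 2)); // 1: the key exists, the old value is returned
        System.out.println(m.size());      // 1: updating does not add an entry
        System.out.println(m.get("a"));    // 2
    }
}
```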

Look at the putForNullKey() method:

private V putForNullKey(V value) {
  for (Entry<K,V> e = table[0]; e != null; e = e.next) {
    if (e.key == null) {
      V oldValue = e.value;
      e.value = value;
      e.recordAccess(this);
      return oldValue;
    }
  }
  modCount++;
  addEntry(0, null, value, 0);
  return null;
}

As you can see, a null key is stored at index 0. If an entry with a null key already exists there, its value is updated directly; otherwise addEntry() is called to insert one.
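
A short usage sketch of the null handling follows, again against the public java.util.HashMap API (the class name NullKeySketch is invented). Because a null value is legal, get() returning null does not by itself say whether a key is present, so containsKey() is used to tell the two cases apart.

```java
import java.util.HashMap;
import java.util.Map;

// Usage sketch: null keys and null values in HashMap.
class NullKeySketch {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put(null, "first");
        System.out.println(m.put(null, "second")); // first: the null-key entry is updated
        System.out.println(m.get(null));           // second
        System.out.println(m.size());              // 1

        m.put("x", null);                          // a null value is allowed too
        System.out.println(m.get("x"));            // null
        System.out.println(m.containsKey("x"));    // true
        System.out.println(m.containsKey("y"));    // false
    }
}
```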

 

 

The following is the implementation of the addEntry() method:

void addEntry(int hash, K key, V value, int bucketIndex) {
  if ((size >= threshold) && (null != table[bucketIndex])) {
    resize(2 * table.length);
    hash = (null != key) ? hash(key) : 0;
    bucketIndex = indexFor(hash, table.length);
  }
 
  createEntry(hash, key, value, bucketIndex);
}
void createEntry(int hash, K key, V value, int bucketIndex) {
  Entry<K,V> e = table[bucketIndex];
  table[bucketIndex] = new Entry<>(hash, key, value, e);
  size++;
}


Origin www.cnblogs.com/lemonrel/p/11705922.html