HashMap source code in-depth

  HashMap is a container most Java developers know well, and one I use often to store attribute values. Because HashMap is not thread-safe, it is usually used as a local variable or in code with little concurrency. After working with Java for a long time, I knew how to use it but not how it works internally. Understanding the implementation is what makes it possible to write efficient code, so let's study the HashMap source code. The listings below correspond to the JDK 7 implementation.

  1. Constructor

  The HashMap<K,V> class extends the AbstractMap<K,V> abstract class and implements the Map<K,V> interface. The constructor below is the one that every other constructor eventually calls. Internally, HashMap is backed by an array, and the initial capacity determines the length of that array. In this version the array is not allocated in the constructor: the requested capacity is stored in threshold, and the array is created on the first put.

/**
 * @param initialCapacity the initial capacity
 * @param loadFactor      the load factor (fill factor)
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init();
}

  We rarely call the constructor above directly, because we usually do not specify both the initial capacity and the load factor. The following constructors delegate to it with default values and create an empty HashMap.

/**
 * Construct an empty HashMap with the given initial capacity and the default load factor (0.75)
 */
public HashMap(int initialCapacity) {
    // Delegate to the two-argument constructor
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/**
 * Construct an empty HashMap with the default initial capacity (16)
 * and the default load factor (0.75)
 */
public HashMap() {
    // Delegate to the two-argument constructor
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
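
  As a usage sketch, the three constructors can be called as shown below. The class name ConstructorDemo and the helper capacityFor are hypothetical names introduced only for this illustration; the helper simply divides the expected entry count by the load factor so that the map should not need to grow while it is being filled.

```java
import java.util.HashMap;
import java.util.Map;

class ConstructorDemo {
    // Hypothetical helper (not part of the JDK): picks an initial capacity
    // large enough that `expected` entries stay within capacity * loadFactor.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    public static void main(String[] args) {
        // Default capacity (16) and default load factor (0.75)
        Map<String, Integer> defaults = new HashMap<>();

        // Explicit capacity, default load factor
        Map<String, Integer> sized = new HashMap<>(64);

        // Explicit capacity and load factor
        int capacity = capacityFor(100, 0.75f);
        Map<String, Integer> tuned = new HashMap<>(capacity, 0.75f);

        tuned.put("answer", 42);
        System.out.println(capacity + " " + tuned);
        System.out.println(defaults.isEmpty() + " " + sized.isEmpty());
    }
}
```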

  2. Initialization of HashMap

  inflateTable allocates the internal array of HashMap. The array length is always a power of 2, and the threshold is the smaller of capacity * loadFactor and MAXIMUM_CAPACITY + 1.

/**
 * Initialize the internal array of HashMap
 */
private void inflateTable(int toSize) {
    // Find the smallest power of 2 that is greater than or equal to toSize
    int capacity = roundUpToPowerOf2(toSize);
    // The threshold is the smaller of capacity * loadFactor and MAXIMUM_CAPACITY + 1
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}
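
  The listing above calls roundUpToPowerOf2 without showing it. The following is a minimal sketch of how such a helper can work, built on Integer.highestOneBit. The class name PowerOfTwoDemo is illustrative, and the helper body is my own reconstruction rather than a quotation of the JDK source, so treat it as an assumption.

```java
class PowerOfTwoDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Assumed implementation: smallest power of 2 that is >= number,
    // capped at MAXIMUM_CAPACITY and never smaller than 1.
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return number > 1 ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    // Same threshold formula as in inflateTable above.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 16, 17, 100}) {
            int capacity = roundUpToPowerOf2(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + thresholdFor(capacity, 0.75f));
        }
    }
}
```

  With the default load factor, a requested size of 17 is rounded up to a capacity of 32, which gives a threshold of 24.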

  3. Resize and transfer of HashMap

  A HashMap has a capacity (the length of the array) and a load factor (loadFactor). When the number of stored entries becomes greater than capacity * loadFactor, the array is expanded: a new array twice the length of the original is created, and the entries of the original array are mapped into the new array, rehashing them if necessary. The resize(int newCapacity) function performs the expansion:

/**
 * Expand the array: create a new array of length newCapacity (twice the original length when growing) and transfer the entries of the original array into it
 */
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
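
  To make the doubling schedule concrete, here is a simplified model of how capacity and threshold evolve as entries are added. It is only a sketch: the class ResizeScheduleDemo and the method capacityAfter are hypothetical, and the model doubles as soon as the entry count exceeds the threshold, whereas the real insertion path (addEntry, not shown in this article) may apply additional conditions.

```java
class ResizeScheduleDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same threshold formula as in resize above.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    // Simplified model: capacity after adding `entries` distinct keys,
    // doubling whenever the entry count exceeds the current threshold.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = thresholdFor(capacity, loadFactor);
        for (int size = 1; size <= entries; size++) {
            if (size > threshold && capacity < MAXIMUM_CAPACITY) {
                capacity *= 2;
                threshold = thresholdFor(capacity, loadFactor);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int entries : new int[] {12, 13, 25, 100}) {
            System.out.println(entries + " entries -> capacity "
                    + capacityAfter(entries, 16, 0.75f));
        }
    }
}
```

  In this model, 12 entries still fit in the default array of 16, the 13th entry triggers the first doubling to 32, and 100 entries end up in an array of 256.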

  The transfer(Entry[] newTable, boolean rehash) function moves the entries into the new array. It is called whenever a resize occurs.

/**
 * Hash all the data in the current array into a new array
 */
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    // Traverse the entries in the current array and put them in the new array
    for (Entry<K,V> e : table) {
        while (null != e) {
            // Remember the rest of the chain before relinking e
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Insert e at the head of the linked list in bucket i
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
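
  The transfer function relies on indexFor, which this article does not show. A minimal sketch follows, assuming the index is computed by masking the hash with length - 1, which is the usual approach when the array length is a power of 2. The class name IndexForDemo and the sample hash values are illustrative.

```java
class IndexForDemo {
    // Assumed implementation: keep the low bits of the hash.
    // This only works as a modulo when length is a power of 2.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // These four hashes all share bucket 5 in an array of 16.
        int[] hashes = {5, 21, 37, 53};
        for (int h : hashes) {
            System.out.println("hash " + h
                    + ": index " + indexFor(h, 16) + " in 16"
                    + ", index " + indexFor(h, 32) + " in 32");
        }
    }
}
```

  After doubling from 16 to 32, the entries that shared bucket 5 are split between bucket 5 and bucket 21 (5 + 16), depending on one extra bit of the hash. This is why doubling tends to shorten the linked lists.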

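  Elements are added with put. If the key is already present, its value is replaced and the old value is returned; otherwise a new entry is added to the array.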

/**
 * Add an entry; if the internal array has not been created yet, create it first
 */
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    // Walk the linked list in bucket i; if an entry with the same hash and an
    // equal key is found, replace its value and return the old value
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    // No entry with this key exists: record the modification and add a new entry
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
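
  The return value of put follows directly from the listing above: null when a new entry is added, and the previous value when an existing key is overwritten. A short usage sketch, in which the class PutDemo and the helper putTwice are hypothetical names used only for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    // Hypothetical helper: puts two values under the same key and
    // returns what each put call returned.
    static Integer[] putTwice(Map<String, Integer> map, String key, int first, int second) {
        Integer r1 = map.put(key, first);   // key is new
        Integer r2 = map.put(key, second);  // key already present
        return new Integer[] {r1, r2};
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Integer[] results = putTwice(map, "a", 1, 2);
        System.out.println("first put returned " + results[0]);
        System.out.println("second put returned " + results[1]);
        System.out.println("stored value is " + map.get("a"));

        // A null key is allowed; put handles it through putForNullKey.
        map.put(null, 7);
        System.out.println("value for null key is " + map.get(null));
    }
}
```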

  Next comes a particularly important function, hash(). It matters because a well-designed hash function spreads objects more evenly across the array, which reduces collisions.

/**
 * hash function, used to calculate the hash value for hashing
 */
final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
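
  To see why the extra shifting and XOR-ing helps, consider hash codes that differ only in their high bits. The sketch below copies the two bit-mixing lines from hash() and assumes hashSeed is 0. The class name HashSpreadDemo, the indexFor helper (assumed to mask with length - 1), and the sample values are illustrative.

```java
class HashSpreadDemo {
    // The bit-mixing step from hash() above, with hashSeed assumed to be 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumed bucket computation: keep the low bits of the hash.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // These hash codes have identical low 16 bits (all zero).
        int[] hashCodes = {0x10000, 0x20000, 0x30000};
        for (int hc : hashCodes) {
            System.out.println(Integer.toHexString(hc)
                    + ": raw index " + indexFor(hc, 16)
                    + ", mixed index " + indexFor(spread(hc), 16));
        }
    }
}
```

  Without mixing, all three values fall into bucket 0 of a 16-slot array, because only the low 4 bits are used. After mixing, they land in buckets 1, 2 and 3.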

  Summary:

  Studying the HashMap source code has given me a better understanding of its internal implementation. Knowing that something works is not enough; it is also necessary to know why, and that was the starting point of this study. With patience, careful reading, and some thought, the code becomes much easier to understand. Learning how to use a tool is the first step, learning how it is implemented is the next, and the ultimate goal is to absorb good design ideas and apply them to your own designs.
