Talking about my understanding of HashMap

1. Implemented interface

HashMap, Hashtable, and TreeMap are all implementation classes of the Map interface. They are container types that store and manipulate data as key-value pairs <key, value>.
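
A minimal illustration (my own example, not taken from any particular codebase) of the three classes being used through the shared Map interface:

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

public class MapImplementations {
    public static void main(String[] args) {
        // All three implement the same Map interface, so they share the key-value API.
        Map<String, Integer> hashMap = new HashMap<>();     // hash table, unordered, allows one null key
        Map<String, Integer> hashtable = new Hashtable<>(); // legacy, synchronized, no null keys/values
        Map<String, Integer> treeMap = new TreeMap<>();     // red-black tree, keys kept sorted

        hashMap.put("count", 1);
        hashtable.put("count", 1);
        treeMap.put("count", 1);

        System.out.println(hashMap.get("count"));   // 1
        System.out.println(hashtable.get("count")); // 1
        System.out.println(treeMap.get("count"));   // 1
    }
}
```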

2. The underlying storage structure

HashMap is based on a hash table. In JDK 1.7 and earlier it was composed of an array plus linked lists. Starting with JDK 1.8 there was a big change: hash conflicts are resolved with an array plus linked lists plus red-black trees. When the length of a linked list is greater than the threshold (8 by default) and the length of the array is greater than 64, the linked list is converted into a red-black tree to reduce search time.
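
A simplified sketch of the bucket node and the two treeification thresholds described above; this is an illustrative model, not the actual JDK source (the real classes are HashMap.Node<K,V> and HashMap.TreeNode<K,V>):

```java
// A simplified model of one HashMap bucket entry; in JDK 1.8 the real class is HashMap.Node<K,V>,
// and HashMap.TreeNode<K,V> is used once a bucket has been turned into a red-black tree.
class BucketNode<K, V> {
    // Thresholds described above; the names mirror the JDK constants.
    static final int TREEIFY_THRESHOLD = 8;     // list length at which treeification is considered
    static final int MIN_TREEIFY_CAPACITY = 64; // the array must be at least this long, otherwise resize instead

    final int hash;          // cached hash of the key
    final K key;
    V value;
    BucketNode<K, V> next;   // next node in the same bucket's linked list

    BucketNode(int hash, K key, V value, BucketNode<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
```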

3. Storage method

  • Each element (a <k, v> key-value pair) in HashMap is encapsulated into an object of the internal class Node (named Entry before JDK 1.8). The object stores the key, the value, next (the next element in the same bucket), and hash (the hash value);
  • These node objects are saved in the array;
  • Each time an element is stored, the hash method is first applied to the key's hashcode, and the element's position index in the array is obtained through (length - 1) & hash, as sketched below. If that array position already holds an element (a hash conflict), the keys are compared: if they are equal, the original value is overwritten; if not, the type of the existing node is checked. If it is a TreeNode<k,v> tree node, a tree node is created and inserted into the red-black tree; if not, an ordinary Node<k,v> is created and appended to the end of the linked list. If the length of the linked list is then greater than the threshold (8 by default) and the length of the array is greater than 64, the linked list is converted into a red-black tree; if not, the array is expanded.
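
Below is a hedged sketch of how the position index is derived. The spreading step mirrors the JDK 1.8 idea of XOR-ing the high bits of the hashcode into the low bits, but these are standalone illustrative methods, not the library source:

```java
public class BucketIndex {
    /** Spread the key's hashCode so that its high bits also influence the index (JDK 1.8 style). */
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Index into a table whose length is a power of 2: equivalent to hash % length, but faster. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int h = hash("example-key");
        System.out.println("bucket = " + indexFor(h, 16)); // default table length is 16
    }
}
```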

4. Expansion method

  • When the number of elements in the HashMap exceeds array length × load factor, the array is expanded. (Array length: the default is 16, but it can also be customized through the parameterized constructor; it is generally a power of 2, and if it is not, HashMap converts it to the next power of 2 as the initial array length. Load factor: the default is 0.75; it can also be customized as the second parameter of the parameterized constructor, and it is recommended not to exceed 0.75.)
  • If the length of a linked list is greater than the threshold (8 by default), the array length is checked: if it is greater than 64, the linked list is converted into a red-black tree; if not, the array is expanded.
  • When expanding, the resize() method is called to double the size of the original array (that is, expand it to twice its original length), and the storage positions of the existing entries are recalculated from the hashes of the original keys; see the sketch after this list.
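
A small sketch of the two sizing rules just described: the resize threshold is capacity × load factor, and a requested capacity is rounded up to the next power of 2. The rounding here uses a simple loop for clarity rather than the JDK's bit-twiddling tableSizeFor:

```java
public class ResizeRules {
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Round a requested capacity up to the next power of 2 (simplified, not the JDK's tableSizeFor). */
    static int roundUpToPowerOfTwo(int requested) {
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(20);                  // 20 -> 32
        int threshold = (int) (capacity * DEFAULT_LOAD_FACTOR);  // 32 * 0.75 = 24
        System.out.println("capacity=" + capacity + ", resize once size exceeds " + threshold);
        // On resize the capacity doubles (32 -> 64) and entries are redistributed by their hashes.
    }
}
```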

5. Features

  • Non-thread-safe: since HashMap's internal methods are not decorated with synchronized locks, it is not thread-safe;
  • High execution efficiency: because it does not pay the cost of synchronization, its execution efficiency is naturally higher than Hashtable's;
  • Unordered: when storing data, the position index in the array is computed from the key's hash value and the array length ((length - 1) & hash, which is equivalent to a modulo operation when the length is a power of 2), so iteration order does not follow insertion order;
  • The key is unique and values can repeat: a null key and null values are allowed, but only one null key, while there can be multiple null values; see the example after this list.
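
A short demo (my own example values) of the last two points: iteration order does not follow insertion order, a single null key is allowed, and null values may repeat:

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapFeatures {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("banana", "yellow");
        map.put("apple", "red");
        map.put(null, "a null key is accepted");       // only one null key can exist
        map.put("grape", null);                        // null values are allowed...
        map.put("plum", null);                         // ...and may repeat
        map.put(null, "a second put on the null key overwrites the first value");

        // Iteration order depends on the keys' hashes, not on insertion order.
        for (Map.Entry<String, String> e : map.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```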

6. Influencing factors

  • When building a HashMap instance, two important parameters affect its performance: the initial capacity and the load factor.
  • Initial capacity: specifies the length of the hash table array; the default is 16. Because 16 is an integer power of 2, it helps reduce hash collisions and improves performance when the amount of data is small. When storing a large amount of data, it is best to estimate the data volume first and preset the initial capacity to an appropriate power of 2.
  • Load factor: indicates how full the hash table is allowed to become; the default is 0.75. The larger it is, the more elements can be filled in and the higher the space utilization of the hash table, but the chance of hash collisions increases; conversely, the smaller it is, the lower the chance of collisions, but space is wasted.
  • Therefore, the initial capacity should be estimated together with the load factor, so as to minimize the number of rehash operations that rebuild the internal data structure and reduce expansion operations; a pre-sizing sketch follows this list.
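
One common way to apply this advice is to derive the initial capacity from the expected number of entries and the load factor, so that no resize happens while the map is being filled. This is a sketch, and the helper name is mine:

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedHashMap {
    /** Pick an initial capacity large enough that expectedEntries fit without triggering a resize. */
    static <K, V> Map<K, V> withExpectedSize(int expectedEntries) {
        float loadFactor = 0.75f;
        // capacity * loadFactor must be >= expectedEntries; HashMap rounds the capacity up to a power of 2.
        int initialCapacity = (int) (expectedEntries / loadFactor) + 1;
        return new HashMap<>(initialCapacity, loadFactor);
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = withExpectedSize(1000); // no rehash while adding ~1000 entries
        scores.put("alice", 95);
        System.out.println(scores);
    }
}
```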

7. Common methods

① put(k, v) Function: add an element (key-value pair)
② get(k) Function: get the value according to the key (returns null when the key is not present; auto-unboxing such a null return is what can then cause a NullPointerException)
③ containsKey(specified key) Function: determine whether the specified key exists in the current collection; the return value is boolean
④ containsValue(specified value) Function: determine whether the specified value exists in the current collection; the return value is boolean
⑤ remove(specified key) Function: delete the k-v key-value pair according to the specified key and return the deleted value
⑥ replace(specified key, new value) Function: modify the k-v key-value pair according to the specified key and return the value before the modification
⑦ keySet() Function: get all the keys in the collection, returned as a Set collection
⑧ values() Function: get all the values in the collection, returned as a Collection; a usage example follows
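
A quick walkthrough of these methods with my own example values; the circled numbers in the comments match the list above:

```java
import java.util.HashMap;
import java.util.Map;

public class CommonMethods {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();

        ages.put("Tom", 20);                          // ① add (or overwrite) an entry
        System.out.println(ages.get("Tom"));          // ② 20; get("Nobody") would return null
        System.out.println(ages.containsKey("Tom"));  // ③ true
        System.out.println(ages.containsValue(20));   // ④ true
        System.out.println(ages.replace("Tom", 21));  // ⑥ returns the previous value, 20
        System.out.println(ages.remove("Tom"));       // ⑤ returns the removed value, 21
        System.out.println(ages.keySet());            // ⑦ Set of keys (now empty)
        System.out.println(ages.values());            // ⑧ Collection of values (now empty)
    }
}
```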

Origin blog.csdn.net/weixin_51529267/article/details/113742127