[Java collection source code] Several questions about HashMap

1. What is the underlying data structure of HashMap?

Answer: HashMap is built on an array + linked list + red-black tree.

  • The array provides fast lookup: locating a bucket is O(1). The default capacity is 16, the array index is computed from the hashCode of the key, and each array element is called a Node.
  • When several different keys map to the same array index (for example, because their hashCodes are equal), the Node in that bucket becomes the head of a linked list. Lookup in a linked list is O(n).
  • When a linked list reaches a length of 8 and the array capacity is at least 64, the linked list is converted into a red-black tree. Lookup in a red-black tree is O(log n); put simply, the worst-case number of comparisons is bounded by the maximum depth of the tree.
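To make the index calculation concrete, here is a minimal sketch that mirrors the hash-spreading and masking steps used by JDK 8's HashMap. The class and method names (BucketIndexDemo, spread, indexFor) are made up for this illustration; they are not part of the HashMap API.

```java
class BucketIndexDemo {
    // Mirrors the spreading step in JDK 8's HashMap: XOR the high 16 bits
    // of hashCode() into the low 16 bits. A null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // The table length is always a power of two, so masking with
    // (length - 1) keeps only the low bits of the spread hash.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different keys with the same hashCode,
        // so they always land in the same bucket.
        for (String key : new String[] {"1", "2", "Aa", "BB"}) {
            System.out.println(key + " -> hashCode " + key.hashCode()
                    + ", bucket " + indexFor(key, 16));
        }
    }
}
```

With a table of length 16, "Aa" and "BB" share a bucket, which is the situation in which a single Node grows into a linked list.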

2. What methods can be used to resolve hash conflicts?

Answer:

  1. A good hash algorithm;
  2. Automatic expansion: when the number of entries exceeds the threshold (array capacity * load factor), the array is expanded, which spreads the keys over more buckets and reduces hash conflicts;
  3. When a hash conflict occurs, a linked list is used to hold the conflicting entries;
  4. When hash conflicts are severe, the linked list is automatically converted into a red-black tree, which speeds up lookup.

3. How does HashMap expand?

  • When expansion happens:
    1. Initialization: when put finds that the array is empty, the array is initialized. The default initial capacity is 16;
    2. Expansion: after a put succeeds, if the number of entries in the map is greater than the expansion threshold, the array is expanded to twice the size of the old array;
  • The expansion threshold is the field threshold, and it is recalculated on every expansion. The threshold equals the array capacity * load factor (0.75 by default).
  • After the new array is created, the entries of the old array are copied into it. Linked lists and red-black trees each have their own copy method.
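The threshold arithmetic can be traced with a small simulation. This is only a sketch under the defaults described above (initial capacity 16, load factor 0.75); it does not read HashMap's internal fields, and the class and method names are invented for the example.

```java
class ResizeThresholdSketch {
    static final int DEFAULT_CAPACITY = 16;  // default initial capacity
    static final float LOAD_FACTOR = 0.75f;  // default load factor

    // Simulates inserting `entries` distinct keys one at a time and returns
    // the array capacity afterwards. The array doubles whenever the entry
    // count becomes greater than the current threshold.
    static int capacityAfter(int entries) {
        int capacity = DEFAULT_CAPACITY;
        int threshold = (int) (capacity * LOAD_FACTOR); // 16 * 0.75 = 12
        for (int size = 1; size <= entries; size++) {
            if (size > threshold) {
                capacity *= 2;                               // twice the old array
                threshold = (int) (capacity * LOAD_FACTOR);  // recalculated each time
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int n : new int[] {12, 13, 24, 25}) {
            System.out.println(n + " entries -> capacity " + capacityAfter(n));
        }
    }
}
```

Under these assumptions the 13th entry triggers the first expansion (16 to 32) and the 25th triggers the second (32 to 64).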

4. What happens when a hash conflict occurs?

Answer: A hash conflict occurs when different keys map to the same bucket, typically because they produce the same hashCode.

  • If the bucket holds a single element or a linked list that is still short, the new element is appended to the end of the linked list;
  • If the bucket holds a linked list whose length has reached 8, there are two cases:
    1. If the array capacity is less than 64, the array is expanded and the linked list is not converted into a red-black tree;
    2. If the array capacity is at least 64, the linked list is converted into a red-black tree.

Note that the check covers not only whether the list length has reached 8 but also the array capacity: when the capacity is below 64, the list is not converted immediately. Presumably this is because a red-black tree takes considerably more space than a linked list and the conversion itself costs time, so when a small array suffers heavy conflicts, HashMap first tries to resolve them by expanding the array.
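The two-part check can be written as a small decision function. This is a hedged sketch, not the JDK's code: the constants carry the same values as HashMap's TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY, but the class, the enum and the method are hypothetical. The boundary follows the JDK comment that bins are converted "when adding an element to a bin with at least this many nodes"; treat the exact off-by-one as something to confirm against the source of your JDK version.

```java
class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;      // same value as the JDK 8 constant
    static final int MIN_TREEIFY_CAPACITY = 64;  // same value as the JDK 8 constant

    enum Action { KEEP_LIST, RESIZE_TABLE, TREEIFY_BIN }

    // nodesBeforeInsert: length of the bucket's linked list before the new
    // node is appended; tableCapacity: current length of the array.
    static Action afterAppend(int nodesBeforeInsert, int tableCapacity) {
        if (nodesBeforeInsert < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;      // short list: just append
        }
        if (tableCapacity < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE_TABLE;   // small array: expand instead of treeifying
        }
        return Action.TREEIFY_BIN;        // long list in a large array
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(3, 16));  // KEEP_LIST
        System.out.println(afterAppend(8, 32));  // RESIZE_TABLE
        System.out.println(afterAppend(8, 64));  // TREEIFY_BIN
    }
}
```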

5. Why is the linked list converted into a red-black tree when its length is greater than or equal to 8?

Answer: This is actually two questions.

  1. Why convert to a red-black tree?
    When a linked list grows too long, traversing it becomes slow. Converting it into a red-black tree reduces the lookup time complexity, although the conversion itself has a cost in both space and time.
  2. Why is the threshold 8 nodes?
    According to the Poisson distribution, under normal circumstances the probability that a linked list reaches length 8 is less than one in ten million, so linked lists are normally never converted into red-black trees. The design exists to guard against abnormal cases, such as a poor hash algorithm that easily produces linked lists of length 8 or more, so that lookup stays fast even then.
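The "less than one in ten million" figure can be reproduced from the Poisson formula P(k) = exp(-λ) * λ^k / k!. The sketch below assumes λ = 0.5, the average bucket occupancy that the implementation notes in the JDK source give for the default load factor of 0.75; the class and method names are made up for the example.

```java
class PoissonBinSizes {
    // P(k) = exp(-lambda) * lambda^k / k!, built up term by term
    // so that neither lambda^k nor k! is computed separately.
    static double poisson(double lambda, int k) {
        double p = Math.exp(-lambda);
        for (int i = 1; i <= k; i++) {
            p *= lambda / i;
        }
        return p;
    }

    public static void main(String[] args) {
        // Probability that a bucket holds exactly k nodes, assuming lambda = 0.5.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(0.5, k));
        }
    }
}
```

For k = 8 the value is roughly 6 in 100 million, which is below one in ten million.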

6. When does the red-black tree become a linked list again?

Answer: When the number of nodes drops to 6 or fewer (UNTREEIFY_THRESHOLD = 6), the red-black tree is converted back into a linked list. The main consideration is the space cost of the red-black tree: with 6 or fewer nodes, traversing a linked list is also fast, so the tree is no longer worth keeping.

7. When calling put on a HashMap, what should I do if the key already exists and I don't want to overwrite the value? And what should I do if I want a default value returned when get finds nothing?

Answer:

  • If the key already exists and you don't want to overwrite its value, use the putIfAbsent method. Internally it passes true for the onlyIfAbsent parameter, so an existing non-null value is not overwritten. The put method we usually use passes onlyIfAbsent as false, which allows overwriting.

  • If you want a default value when the key is missing, use the getOrDefault method. The first parameter is the key and the second is the default value to return. For example, with map.getOrDefault("2","0"), when the map has no mapping for the key "2", it returns "0" instead of null.
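A short example of both methods, using only the standard java.util API. The wrapper class PutIfAbsentDemo and its build method are just scaffolding for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutIfAbsentDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("1", "a");
        map.putIfAbsent("1", "b"); // key "1" already has a value: "a" is kept
        map.putIfAbsent("2", "x"); // key "2" is absent: "x" is inserted
        map.put("2", "y");         // plain put overwrites "x" with "y"
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get("1"));               // a
        System.out.println(map.get("2"));               // y
        System.out.println(map.get("3"));               // null
        System.out.println(map.getOrDefault("3", "0")); // 0
    }
}
```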

8. Is it feasible to delete with the following code?

HashMap<String, String> map = Maps.newHashMap(); // Maps is Guava's com.google.common.collect.Maps
map.put("1","1");
map.put("2","2");
map.forEach((s, s2) -> map.remove("1"));

Answer: No. It throws ConcurrentModificationException, because remove changes modCount during the traversal and forEach checks modCount once the loop finishes. The source code of forEach is as follows:

public void forEach(BiConsumer<? super K, ? super V> action) {
    Node<K,V>[] tab;
    if (action == null)
        throw new NullPointerException();
    if (size > 0 && (tab = table) != null) {
        int mc = modCount; // modCount is saved in mc before the loop starts
        for (int i = 0; i < tab.length; ++i) {
            for (Node<K,V> e = tab[i]; e != null; e = e.next)
                action.accept(e.key, e.value);
        }
        if (modCount != mc) // remove() changes modCount, but mc stays the same
            throw new ConcurrentModificationException();
    }
}

It is recommended to delete through an iterator instead. The principle is the same as for the ArrayList iterator: the iterator's own remove method keeps its expected modCount in sync with the map.
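A minimal sketch of iterator-based deletion with the standard java.util API. The class SafeRemovalDemo and its removeKey helper are hypothetical names introduced for the example; the helper works on a copy so that the caller's map is left untouched.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class SafeRemovalDemo {
    // Returns a copy of `source` with `key` removed while iterating.
    static Map<String, String> removeKey(Map<String, String> source, String key) {
        Map<String, String> map = new HashMap<>(source);
        Iterator<Map.Entry<String, String>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getKey().equals(key)) {
                it.remove(); // removal goes through the iterator, so no exception
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("1", "1");
        map.put("2", "2");
        System.out.println(removeKey(map, "1")); // {2=2}
    }
}
```

On Java 8 and later, map.keySet().removeIf(k -> k.equals("1")) is a shorter way to do the same thing.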

9. Describe the process of HashMap get and put

Answer: For details, see [Java Container Source Code] HashMap's most detailed 4D source code analysis (JDK8). If any step is unclear, drawing a diagram helps.

Origin blog.csdn.net/weixin_43935927/article/details/108733304