Java study notes - containers (part 1)

 

 References:

    "Core Java"

    https://mp.weixin.qq.com/s/SHJzWpZ0MscuJhPLRwWQxg

    https://github.com/LRH1993/android_interview/blob/master/java/basis/hashmap.md

 

-   HashSet and TreeSet

  In Java, classes whose names begin with Abstract are base classes that programmers can extend to build their own containers. For example, HashSet inherits from AbstractSet.

    AbstractSet itself extends AbstractCollection and implements the Set interface; it adds very little of its own and exists mainly to be extended.

 

  -   HashSet

    HashSet is backed internally by a HashMap. Adding elements, iterator, size, and other common operations simply call the corresponding methods of that map.

    

  - TreeSet differs from HashSet in that it keeps its elements in sorted order. Internally it is implemented with a red-black tree. Lookup time is O(log2 N), slightly slower than a hash lookup.

  PS: For a TreeSet to sort its elements, they must be comparable: either the element class implements the Comparable interface, or a Comparator is supplied when the set is constructed.
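  The two ways of ordering a TreeSet can be sketched as follows. The class and method names here are made up for illustration; only TreeSet, Comparator, and the String methods come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Illustrative class name, not from the original notes.
class TreeSetOrderingDemo {
    // Natural ordering: String already implements Comparable<String>.
    static List<String> naturalOrder(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    // Custom ordering: shorter strings first, ties broken alphabetically.
    static List<String> byLength(List<String> words) {
        Comparator<String> byLen = Comparator.comparingInt(String::length);
        TreeSet<String> set = new TreeSet<>(byLen.thenComparing(Comparator.naturalOrder()));
        set.addAll(words);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana", "kiwi");
        System.out.println(naturalOrder(words)); // [banana, fig, kiwi, pear]
        System.out.println(byLength(words));     // [fig, kiwi, pear, banana]
    }
}
```

  Note that a Comparator which treats two different elements as equal makes the TreeSet keep only one of them, which is why the sketch adds a tie-breaker.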

 

 

 

 

 -  HashMap 

  Stores key/value pairs in a hash table: a hash is computed from the key, and the pair is stored in the bucket that the hash selects.

  As the number of stored elements grows, the array is expanded, always to a power of two. When too many elements land in the same bucket, the bucket's linked list is converted to a red-black tree, which greatly improves lookup efficiency.

   The constructor has four overloads; the tunable parameters are the initial capacity initialCapacity and the load factor loadFactor.

   A threshold is computed from the two: threshold = capacity * loadFactor. Once the number of entries exceeds the threshold (i.e., a certain fraction of the current capacity), the array is expanded.

   If no parameters are given, the default initial capacity is 16 and the load factor is 0.75.

   If a size is given, the array size is set to the smallest power of 2 that is greater than or equal to the given value.

   The HashMap array (called table in the source code) is not allocated in the constructor but during the first put, which avoids wasting memory on maps that are never used.
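   The sizing rules above can be illustrated with a small sketch. This is not the JDK's own code (the JDK does the rounding with bit tricks); the class and method names are hypothetical and the loop is only meant to show the arithmetic.

```java
// Hypothetical helper class; it mimics the sizing rules, not the JDK implementation.
class CapacitySketch {
    // Smallest power of two that is >= cap (assumes 1 <= cap <= 2^30).
    static int roundUpToPowerOfTwo(int cap) {
        int n = 1;
        while (n < cap) {
            n <<= 1;
        }
        return n;
    }

    // Number of entries beyond which the table would be expanded.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(10)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
        System.out.println(threshold(16, 0.75f));    // 12
    }
}
```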

   

  - hash() function

  

 

   To compute the hash, the key's hashCode() is called first, and the result is XORed with itself shifted right (unsigned) by 16 bits. This mixes the high 16 bits into the low 16 bits so that both take part in choosing a bucket, which spreads the hashes more evenly.

  Once the hash is known, it has to be reduced modulo the capacity. Bit operations are cheaper than division, and because the capacity n is always a power of 2, (n-1) & hash gives the same result as the modulo more efficiently.
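  A minimal sketch of the two steps. The spread expression follows the form used by HashMap.hash() in JDK 8; the class and method names are illustrative only.

```java
// Illustrative names; the arithmetic follows the description above.
class HashSpreadSketch {
    // XOR the high 16 bits into the low 16 bits; a null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n, where n must be a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        Integer key = 0x12340000;            // only the high bits are set
        int raw = key.hashCode();            // 0x12340000
        int mixed = spread(key);             // 0x12341234
        System.out.println(indexFor(raw, 16));   // 0: every such key would collide
        System.out.println(indexFor(mixed, 16)); // 4: high bits now influence the index
    }
}
```

  For a power-of-two n, (n - 1) & hash agrees with Math.floorMod(hash, n), including for negative hashes.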

 - common methods

  get(Object key) returns the value mapped to the key; if the map has no such key, null is returned. The key may be null.

  getOrDefault(Object key, V defaultValue) returns the value mapped to the key, or defaultValue if there is none.

  put(K key, V value) inserts a key-value pair; if the key already exists, the previous value is overwritten.

      The method returns the previous value, or null if there was none. Both the key and the value may be null.

  putAll(Map<? extends K, ? extends V> entries) adds all entries of the given map.

  containsKey(Object key) and containsValue(Object value) report whether the map contains the given key or value.

  remove(Object key) removes the entry for the given key.
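  A short usage sketch of these methods. The class name and sample data are made up; the Map calls are the standard java.util API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative class; the keys and values are arbitrary sample data.
class MapBasicsDemo {
    static Map<String, Integer> stock() {
        Map<String, Integer> m = new HashMap<>();
        m.put("apple", 3);
        m.put("pear", 5);
        m.put(null, 0); // HashMap accepts a null key
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = stock();
        System.out.println(m.get("apple"));             // 3
        System.out.println(m.get("kiwi"));              // null: no such key
        System.out.println(m.getOrDefault("kiwi", 0));  // 0
        System.out.println(m.put("apple", 4));          // 3: the previous value
        System.out.println(m.containsKey(null));        // true
        System.out.println(m.containsValue(5));         // true
        System.out.println(m.remove("pear"));           // 5: the removed value
        System.out.println(m.containsKey("pear"));      // false
    }
}
```

  Because null is also a legal value, a null result from get() does not by itself prove the key is absent; containsKey() distinguishes the two cases.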

 

 - aside: word counter

  Each word is used as a key, and its value is incremented by 1 on every occurrence. There is a catch: the first time a word appears, its value is null, so the increment cannot be performed. There are three solutions:

  (1) map.put(word, map.getOrDefault(word, 0) + 1); getOrDefault supplies 0 as the default value, so the null pointer exception is avoided.

  (2) {map.putIfAbsent(word, 0); map.put(word, map.get(word) + 1);} first sets a missing value to 0 and then increments it, but this repeats the lookup on every call and is the least efficient of the three.

  (3) map.merge(word, 1, Integer::sum); if the key is absent, it is mapped to the second argument; otherwise the existing value and the second argument are combined by the function in the third argument. Here that means adding 1 to the word's current count.
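  The three variants side by side, as a runnable sketch. The class and method names are illustrative; all three should produce the same counts.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative class comparing the three counting idioms described above.
class WordCountDemo {
    static Map<String, Integer> withGetOrDefault(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.put(w, counts.getOrDefault(w, 0) + 1);
        }
        return counts;
    }

    static Map<String, Integer> withPutIfAbsent(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.putIfAbsent(w, 0);       // make sure a value exists
            counts.put(w, counts.get(w) + 1);
        }
        return counts;
    }

    static Map<String, Integer> withMerge(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // absent: 1, present: old + 1
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not", "to", "be"};
        System.out.println(withMerge(words).get("to"));  // 2
        System.out.println(withMerge(words).get("not")); // 1
        System.out.println(withGetOrDefault(words).equals(withMerge(words))); // true
    }
}
```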

 

 - Selected HashMap source code

   - put() method

    put() calls putVal() internally. When the onlyIfAbsent parameter is true, an existing value for the same key is not overwritten; put() passes false.

 

   The logic of putVal() is as follows:

    If the table has not been initialized, it is initialized by resize().

    If the table has been initialized and the bucket for the hash is empty, the new node is inserted there.

    If the table has been initialized and the bucket already holds an element, there is a hash collision:

      If the first element has the same key, its value will be overwritten, but at this point the existing node is only recorded.

       Otherwise, if the bucket is a tree, putTreeVal() performs the insertion.

      Otherwise the bucket is a linked list: walk it in a loop and append the new node at the tail, or stop early if a node with the same key is found.

    At this point, if the recorded "node to be overwritten" is not null, its value is replaced (unless onlyIfAbsent is true and the old value is non-null) and the old value is returned.

  At the end of put, if the number of entries exceeds the threshold, resize() expands the table.

  While inserting into a linked list, the nodes are counted; once the list grows beyond TREEIFY_THRESHOLD (8), treeifyBin() tries to convert it to a red-black tree.

  In treeifyBin(), if the array length is less than MIN_TREEIFY_CAPACITY (64), the array is resized instead. Otherwise the bucket is converted to a red-black tree.

  When a tree bucket shrinks to UNTREEIFY_THRESHOLD (6) nodes or fewer, for example when it is split during resize(), it is converted back to a linked list.
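  The putVal() flow can be sketched with a deliberately simplified chained table. This is a teaching sketch, not the JDK code: TinyChainedMap and its Node type are hypothetical, the table is fixed at 16 buckets, and resizing and tree buckets are left out.

```java
import java.util.Objects;

// Hypothetical, simplified map: fixed 16 buckets, linked lists only, no resize, no trees.
class TinyChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];
    private int size;

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value, or null if the key was not present.
    V putVal(K key, V value, boolean onlyIfAbsent) {
        int hash = hash(key);
        int i = (table.length - 1) & hash;
        Node<K, V> p = table[i];
        if (p == null) {                      // empty bucket: insert directly
            table[i] = new Node<>(hash, key, value);
            size++;
            return null;
        }
        while (true) {
            if (p.hash == hash && Objects.equals(p.key, key)) {
                V old = p.value;              // same key: this node gets overwritten
                if (!onlyIfAbsent || old == null) {
                    p.value = value;
                }
                return old;
            }
            if (p.next == null) {             // reached the tail: append a new node
                p.next = new Node<>(hash, key, value);
                size++;
                return null;
            }
            p = p.next;
        }
    }

    V put(K key, V value) { return putVal(key, value, false); }

    V putIfAbsent(K key, V value) { return putVal(key, value, true); }

    int size() { return size; }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> m = new TinyChainedMap<>();
        // "Aa" and "BB" have the same hashCode, so they share a bucket.
        System.out.println(m.put("Aa", 1));          // null
        System.out.println(m.put("BB", 2));          // null
        System.out.println(m.put("Aa", 10));         // 1
        System.out.println(m.putIfAbsent("BB", 20)); // 2, value left unchanged
        System.out.println(m.size());                // 2
    }
}
```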

 

 

 - resize() method

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;

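        // 1. the table is already initialized and its capacity is > 0
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                // the old capacity has already reached the maximum: stop growing
                // and set the threshold to the largest possible value
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                // the capacity has just been doubled; if the old capacity is at least
                // the default initial capacity, double the threshold as well
                newThr = oldThr << 1; // double threshold
        }
        // 2. threshold > 0: the threshold field was temporarily holding the initialCapacity argument
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        // 3. neither threshold nor table is initialized: this is the first initialization,
        // which is why the no-arg constructor assigns neither threshold nor capacity
        else {               // zero initial threshold signifies using defaults
            // a zero threshold means the default values are used;
            // this happens after the no-arg constructor, so use the default capacity and threshold
            newCap = DEFAULT_INITIAL_CAPACITY;
            // here the threshold is simply capacity * loadFactor
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // if newThr is still 0, compute it with the threshold formula: capacity * load factor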
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }

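        // update the threshold
        threshold = newThr;

        // allocate the new bucket array
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;

        // if the old array held data, the table capacity has changed, so every
        // element's bucket index must be recomputed (the hash itself is unchanged)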
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
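                // if this bucket holds data
                if ((e = oldTab[j]) != null) {
                    // 1. clear the old bucket
                    oldTab[j] = null;
                    // 2. the bucket holds a single node
                    if (e.next == null)
                        // store it directly at its index in the new table
                        newTab[e.hash & (newCap - 1)] = e;
                    // 3. the bucket is a TreeNode structure
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    // 4. the bucket is a linked list
                    else { // preserve order
                        // split the list into a "lo" list (index unchanged) and a
                        // "hi" list (index + oldCap) according to the bit hash & oldCap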
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

 

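  The main logic of resize() is as follows:

    If the array has already been initialized (old capacity > 0):

      If the capacity has already reached the maximum, it cannot grow any further. threshold is set to Integer.MAX_VALUE and no more resizing takes place.

      Otherwise the capacity is doubled; if the doubled capacity is still below the maximum and the old capacity is at least the default initial capacity (16), the threshold is doubled as well.

    If the old capacity is 0 but the threshold is greater than 0, the threshold (which was holding the requested initial capacity) becomes the new capacity.

    If neither holds, the map has not been initialized yet, and the defaults of 16 and 0.75 are used.

    

    After a resize the array length has changed, so an element's bucket index can change (its hash value does not). All elements are traversed and moved to their new positions, with different handling for linked-list buckets and red-black tree buckets.

     PS: because the capacity is a power of 2 and a resize doubles it, an element's new index is either the same as its old index or the old index plus oldCap.

    

 

         (Figure from https://github.com/LRH1993/android_interview/blob/master/java/basis/hashmap.md)

    So there is no need to recompute anything with hash(): testing hash & oldCap checks just the newly added high bit.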
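    The index rule can be checked with a few lines of arithmetic. The class and method names below are illustrative; the test on hash & oldCap is the one used in the resize() listing.

```java
// Illustrative check of the "same index or index + oldCap" rule.
class ResizeSplitSketch {
    // New bucket index after doubling, derived only from the old index and one bit.
    static int newIndex(int hash, int oldIndex, int oldCap) {
        return ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53};       // all land in bucket 5 when the capacity is 16
        for (int h : hashes) {
            int oldIndex = h & (oldCap - 1);
            int viaBit = newIndex(h, oldIndex, oldCap);
            int viaMask = h & (2 * oldCap - 1); // full recomputation, for comparison
            System.out.println(h + ": " + oldIndex + " -> " + viaBit + " (mask gives " + viaMask + ")");
        }
        // 5 and 37 stay in bucket 5; 21 and 53 move to bucket 21 (5 + 16).
    }
}
```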

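 -   get() method

  get() calls getNode() internally.

  

 

  The logic of getNode() is as follows:

    Find the bucket that corresponds to the key's hash; if the array has no element at that position, return null.

      If the first element is the one being looked for, return it.

      If the first element is not the one being looked for, check the bucket's structure:

        If it is a tree, search the red-black tree with getTreeNode().

        If it is a linked list, walk it node by node until the key is found or the list ends.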
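  For the linked-list case, the lookup can be sketched as below. The Node type, the class name, and the demo table are hypothetical, the raw hashCode() is used without the extra bit mixing, and tree buckets are omitted.

```java
import java.util.Objects;

// Hypothetical sketch of a chained-bucket lookup (linked lists only).
class BucketLookupSketch {
    static final class Node<K, V> {
        final int hash;
        final K key;
        final V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    // Returns the value for key, or null if the bucket is empty or the key is not in the chain.
    static <K, V> V find(Node<K, V>[] table, Object key) {
        if (table == null || table.length == 0) {
            return null;
        }
        int hash = (key == null) ? 0 : key.hashCode();
        // Starts at the first node of the bucket, then follows next pointers.
        for (Node<K, V> e = table[(table.length - 1) & hash]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    // Builds a 16-slot table whose only bucket chains "Aa" -> "BB" (equal hash codes).
    @SuppressWarnings("unchecked")
    static Node<String, Integer>[] demoTable() {
        Node<String, Integer>[] table = (Node<String, Integer>[]) new Node[16];
        Node<String, Integer> aa = new Node<>("Aa".hashCode(), "Aa", 1);
        aa.next = new Node<>("BB".hashCode(), "BB", 2);
        table[(table.length - 1) & aa.hash] = aa;
        return table;
    }

    public static void main(String[] args) {
        Node<String, Integer>[] table = demoTable();
        System.out.println(find(table, "Aa")); // 1: first node in the bucket
        System.out.println(find(table, "BB")); // 2: found by walking the chain
        System.out.println(find(table, "zz")); // null: not present
    }
}
```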

 

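 - remove() method

  Locating the element by its key works the same way as in get().

   The removal itself is handled differently depending on whether the bucket is a tree or a linked list.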

  

public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }

    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;

        // locate the element using the key and the key's hash
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }

            // if the node was found, remove it
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                // a TreeNode is removed through the tree
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                // the node is the first in the bucket: point the bucket at the second node
                else if (node == p)
                    tab[index] = node.next;
                else
                    // not the head of the list: p holds the node just before the one to delete;
                    // pointing p.next at node.next unlinks node
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

 

 

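-  Map views

  The collections framework does not consider a map itself to be a collection.

  Three methods of Map:

    Set<Map.Entry<K,V>> entrySet()  returns a set view of the key-value pairs in the map

    Set<K> keySet()  returns a set view of all keys

    Collection<V> values()  returns a collection view of all values

    Elements can be removed from the views returned by these three methods, and the corresponding entries are removed from the map as well. Elements cannot be added, however; that throws UnsupportedOperationException.

   

    Besides the restrictions on these three views, the array obtained by converting a Collection with toArray() is restricted as well:

    The array type can only be Object[]. Even if the concrete element type is known, the array cannot be cast to the corresponding array type; doing so throws ClassCastException.

    To get an array of a concrete type, use the variant toArray(new String[0]), which is given an array of the desired type; its length may be 0 or any specific length.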
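    A small sketch of the view behavior described above, using a HashMap. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative demo: views write through on removal but reject additions.
class MapViewsDemo {
    // Removing a key through the keySet view also removes the entry from the map.
    static Map<String, Integer> removeViaKeySet() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.keySet().remove("a");
        return m;
    }

    // Adding through the view is not supported.
    static boolean addIsRejected() {
        Map<String, Integer> m = new HashMap<>();
        try {
            m.keySet().add("x");
            return false;
        } catch (UnsupportedOperationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeViaKeySet()); // {b=2}
        System.out.println(addIsRejected());   // true
    }
}
```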
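    Both toArray() variants can be compared in a short sketch. The class and method names are illustrative, and the failing cast is shown with an ArrayList as the source collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Illustrative demo of toArray() versus toArray(T[]).
class ToArrayDemo {
    // The typed variant returns a real String[].
    static String[] typed(Collection<String> c) {
        return c.toArray(new String[0]);
    }

    // The no-arg variant returns Object[]; casting it to String[] fails at run time.
    static boolean castFails(Collection<String> c) {
        try {
            String[] s = (String[]) c.toArray();
            return s == null; // not reached when toArray() yields an Object[]
        } catch (ClassCastException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Collection<String> names = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println(Arrays.toString(typed(names))); // [x, y]
        System.out.println(castFails(names));              // true
    }
}
```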


Origin www.cnblogs.com/xfdmyghdx/p/10513244.html