An In-Depth Study of HashMap


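I. Brief introduction

    HashMap is a commonly used container whose structure is an array of linked lists: the outer structure is an array, and each array element (bucket) holds a linked list. The linked lists resolve hash collisions. This design combines the fast addressing of an array with the cheap insertion and deletion of a linked list.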

II. Structure of HashMap

    1. Main fields:

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient volatile int modCount;
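
| Category | Field | Details |
|---|---|---|
| Class constant (package-private, static final) | DEFAULT_INITIAL_CAPACITY | Default initial capacity (length of the table array). Used when no initial capacity is given; the value is 16 and must be a power of two. |
| Class constant (package-private, static final) | MAXIMUM_CAPACITY | Maximum capacity of the table array, 1 << 30. |
| Class constant (package-private, static final) | DEFAULT_LOAD_FACTOR | Default load factor, 0.75, used when none is passed to the constructor. (How this value was chosen remains to be studied.) |
| Instance field (package-private, transient Entry[]) | table | The array that holds the map's entries; each slot holds a linked list. Its length is always a power of two, and it is resized when it becomes too small. |
| Instance field (package-private, transient int) | size | Total number of key-value mappings in the HashMap. |
| Instance field (package-private, int) | threshold | The size at which the table is resized: capacity * load factor. |
| Instance field (final float) | loadFactor | Load factor: the ratio size/capacity above which the table is resized; 0.75 by default. |
| Instance field (transient volatile int) | modCount | Number of structural modifications: incremented when an entry is added and when one is removed. Iterators use it to fail fast with ConcurrentModificationException; it detects concurrent modification but is not real concurrency control. |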

     

     2. Implementation of the main methods

          1. Initialization

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }
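
The rounding loop in the constructor above can be run on its own to see which capacity and threshold a requested value produces. The sketch below copies that loop into a hypothetical helper, roundUpToPowerOfTwo (the name is mine, not part of HashMap), and assumes the default load factor of 0.75.

```java
final class CapacityRounding {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper mirroring the constructor's loop:
    // returns the smallest power of two that is >= initialCapacity.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // assumed: the default load factor
        for (int requested : new int[] {0, 1, 10, 16, 17, 1000}) {
            int capacity = roundUpToPowerOfTwo(requested);
            int threshold = (int) (capacity * loadFactor);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + threshold);
        }
    }
}
```

A request for exactly 16 stays at 16, while 17 is rounded up to 32, so the capacity is "greater than or equal to" the request, not strictly greater.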
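            Parameters:
              (1) initialCapacity: the initial capacity, i.e. the initial length of the outer array.
              (2) loadFactor: the load factor. When size/capacity exceeds loadFactor, the table has to be resized to a larger array. The default is 0.75.
            Note: the actual array capacity is the smallest power of two that is greater than or equal to the requested initial capacity.

          2. The hash method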
    /**
     * Applies a supplemental hash function to a given hashCode, which
     * defends against poor quality hash functions.  This is critical
     * because HashMap uses power-of-two length hash tables, that
     * otherwise encounter collisions for hashCodes that do not differ
     * in lower bits. Note: Null keys always map to hash 0, thus index 0.
     */
    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
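         The argument is the key's hashCode. The purpose is to spread entries more evenly over the table; the exact reasoning behind the shift constants remains to be studied.
         Because the table length is a power of two, indexFor effectively keeps only the low-order bits of the hash code. The hash method mixes the higher bits into those low-order bits, so keys whose hash codes differ only in the higher bits are less likely to land in the same bucket.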
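
To make the effect concrete, here is a small sketch that copies hash and indexFor from the listings in this article and feeds them four hash codes that differ only above the low 4 bits. The sample values and the class name HashSpreadDemo are my own choices for illustration.

```java
final class HashSpreadDemo {
    // Copied from the HashMap source quoted in this article.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Copied from the HashMap source quoted in this article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // default table length
        // Sample hash codes whose low 4 bits are all zero.
        int[] codes = {0x10000, 0x20000, 0x30000, 0x40000};
        for (int c : codes) {
            System.out.println(Integer.toHexString(c)
                    + ": raw index " + indexFor(c, length)
                    + ", mixed index " + indexFor(hash(c), length));
        }
    }
}
```

Without the supplemental hash all four codes fall into bucket 0; after mixing they land in buckets 1, 2, 3 and 4.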

    

     3. The indexFor method

    /**
     * Returns index for hash code h.
     */
    static int indexFor(int h, int length) {
        return h & (length-1);
    }

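        Because the table length is a power of two, length-1 is a mask of low-order 1 bits, so indexFor simply keeps the low-order bits of the hash. For a power-of-two length this gives the same result as the non-negative remainder of h divided by length, without a division.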
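
A quick way to convince yourself of this is to compare the mask with Math.floorMod (available since Java 8) for a handful of values, including negative ones. This is only a sketch; the sample hashes are arbitrary.

```java
final class IndexForDemo {
    // Copied from the HashMap source quoted in this article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask to work
        int[] hashes = {0, 5, 16, 21, 1000003, -1, -17, Integer.MIN_VALUE};
        for (int h : hashes) {
            System.out.println(h + " -> mask " + indexFor(h, length)
                    + ", floorMod " + Math.floorMod(h, length));
        }
    }
}
```

The equivalence breaks as soon as the length is not a power of two: with a length of 10, indexFor(5, 10) is 5 & 9 = 1, not 5, which is why the source insists that the table length "MUST" be a power of two.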

     

   4. The get method

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

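       HashMap allows a null key; the entry for the null key is kept in the first bucket, table[0].

      The lookup first computes the array index from the key, takes the linked list at that index, then walks the list until it finds an entry with an equal key and returns that entry's value.

      During the walk the stored hash values are compared first. As I understand it, this is a performance optimization: most entries visited will not match, and most of those mismatches can be rejected by the hash comparison alone, without calling equals. The hash is already computed and stored in the entry, whereas equals has to be evaluated each time and can be expensive for complex keys.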
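
The saving can be observed with a counter. The following is a minimal sketch, not HashMap itself: Key, Node and find are hypothetical stand-ins, find repeats the loop condition from get, and the three keys are given hashes 1, 17 and 33, which would share bucket 1 of a 16-slot table under indexFor.

```java
final class HashFirstLookupDemo {
    static int equalsCalls = 0; // counts Key.equals invocations

    // Hypothetical key type with a fixed hash code and a counting equals.
    static final class Key {
        final String name;
        final int hash;
        Key(String name, int hash) { this.name = name; this.hash = hash; }
        @Override public int hashCode() { return hash; }
        @Override public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).name.equals(name);
        }
    }

    // Hypothetical stand-in for HashMap.Entry: caches the key's hash.
    static final class Node {
        final int hash;
        final Key key;
        final String value;
        final Node next;
        Node(Key key, String value, Node next) {
            this.hash = key.hashCode();
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Same condition as in get: compare hashes first, then identity, then equals.
    static String find(Node head, Key key) {
        int hash = key.hashCode();
        for (Node e = head; e != null; e = e.next) {
            Key k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

    // Chain c -> b -> a, as if all three had collided in one bucket.
    static Node sampleChain() {
        Node a = new Node(new Key("a", 1), "A", null);
        Node b = new Node(new Key("b", 17), "B", a);
        return new Node(new Key("c", 33), "C", b);
    }

    public static void main(String[] args) {
        String value = find(sampleChain(), new Key("a", 1));
        System.out.println(value + " found with " + equalsCalls + " equals() call(s)");
    }
}
```

The target sits at the end of a three-entry chain, yet equals runs only once: the first two entries are rejected by the integer comparison.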

   

     5. The put method

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }
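
The return-value contract in the Javadoc above is easy to check against java.util.HashMap. The sketch below uses only the public API; the keys and values are arbitrary, and describe is a helper name of my own.

```java
import java.util.HashMap;
import java.util.Map;

final class PutReturnDemo {
    // Runs a few puts and reports what each call returned.
    static String describe() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        Integer first = map.put("a", 1);   // no previous mapping
        Integer second = map.put("a", 2);  // replaces the value 1
        Integer third = map.put(null, 3);  // the null key is allowed
        map.put("b", null);                // null values are allowed too
        Integer fourth = map.put("b", 4);  // previous value was null
        return first + " " + second + " " + third + " " + fourth
                + " size=" + map.size();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "null 1 null null size=3"
    }
}
```

Note that the last call returns null even though "b" was already present, which is exactly the ambiguity the Javadoc warns about; containsKey distinguishes the two cases.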
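    The table index is computed from the key and the linked list at that index is traversed. If an entry with an equal key is found, its value is updated and the old value is returned; otherwise a new entry is added and null is returned.

     6. The addEntry method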
    /**
     * Adds a new entry with the specified key, value and hash code to
     * the specified bucket.  It is the responsibility of this
     * method to resize the table if appropriate.
     *
     * Subclass overrides this to alter the behavior of put method.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }
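   Adds a new entry for the given hash, key, value and bucketIndex (the table index) at the head of that bucket's list. If the size before the insertion has already reached the threshold (capacity * loadFactor), the table is expanded to twice its current length.

     7. The resize method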
    /**
     * Rehashes the contents of this map into a new array with a
     * larger capacity.  This method is called automatically when the
     * number of keys in this map reaches its threshold.
     *
     * If current capacity is MAXIMUM_CAPACITY, this method does not
     * resize the map, but sets threshold to Integer.MAX_VALUE.
     * This has the effect of preventing future calls.
     *
     * @param newCapacity the new capacity, MUST be a power of two;
     *        must be greater than current capacity unless current
     *        capacity is MAXIMUM_CAPACITY (in which case value
     *        is irrelevant).
     */
    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }
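    If the old table's capacity has already reached MAXIMUM_CAPACITY, threshold is set to Integer.MAX_VALUE and the table is not enlarged.
    Otherwise a new table is created with the given new capacity, the entries of the old table are migrated into it, and threshold is recalculated.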
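
To get a feeling for how often this happens, the growth policy can be simulated without building a real table. This is a sketch under simplifying assumptions: it tracks only capacity, threshold and size, it repeats the `size++ >= threshold` test from addEntry, and it ignores the MAXIMUM_CAPACITY case. The class and method names are hypothetical.

```java
final class ResizeSimulation {
    // Counts how many times the table would double while `entries`
    // distinct keys are added, starting from `initialCapacity`
    // (assumed to already be a power of two).
    static int resizesFor(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        int resizes = 0;
        for (int n = 0; n < entries; n++) {
            if (size++ >= threshold) {           // same test as in addEntry
                capacity *= 2;                   // resize(2 * table.length)
                threshold = (int) (capacity * loadFactor);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println("1000 entries from 16:   " + resizesFor(1000, 16, 0.75f));
        System.out.println("1000 entries from 2048: " + resizesFor(1000, 2048, 0.75f));
    }
}
```

Starting from the default capacity of 16, inserting 1000 entries doubles the table 7 times (ending at 2048 slots), and every doubling runs transfer over all existing entries. Starting at 2048 avoids all of them.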

   

   8. The transfer method, which migrates the entries of the old table into newTable

  

    /**
     * Transfers all entries from current table to newTable.
     */
    void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K,V> e = src[j];
            if (e != null) {
                // Clear the slot in the old table; as I understand it, this lets the old table be garbage-collected sooner.
                src[j] = null;
                do {
                    Entry<K,V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

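   Walks every entry of the old table, recomputes its index for the new capacity, and stores it at the head of the corresponding list in the new table.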
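
Because the capacity always doubles and indexFor is a bit mask, the recomputed index can only take two values: the old index, or the old index plus the old capacity. The sketch below checks this for a few sample hashes; staysOrShifts is a hypothetical helper, and the hash values are arbitrary.

```java
final class TransferSplitDemo {
    // Copied from the HashMap source quoted in this article.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if, after doubling, the entry keeps its old index
    // or moves up by exactly oldCapacity.
    static boolean staysOrShifts(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, 2 * oldCapacity);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        for (int hash : new int[] {1, 17, 33, 49, -7}) {
            System.out.println(hash + ": index " + indexFor(hash, oldCapacity)
                    + " -> " + indexFor(hash, 2 * oldCapacity)
                    + " (" + staysOrShifts(hash, oldCapacity) + ")");
        }
    }
}
```

Hashes 1, 17, 33 and 49 all share bucket 1 in a 16-slot table; after doubling, 1 and 33 stay in bucket 1 while 17 and 49 move to bucket 17, so a long chain is typically split in two.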

 9. The HashIterator inner class, the base class of HashMap's iterators.

  

private abstract class HashIterator<E> implements Iterator<E> {
        Entry<K,V> next;	// next entry to return
        int expectedModCount;	// For fast-fail
        int index;		// current slot
        Entry<K,V> current;	// current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Entry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
            current = e;
            return e;
        }
        
        // This method makes it possible to remove entries while iterating.
        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }

    }

    private final class ValueIterator extends HashIterator<V> {
        public V next() {
            return nextEntry().value;
        }
    }

    private final class KeyIterator extends HashIterator<K> {
        public K next() {
            return nextEntry().getKey();
        }
    }

    private final class EntryIterator extends HashIterator<Map.Entry<K,V>> {
        public Map.Entry<K,V> next() {
            return nextEntry();
        }
    }

    // Subclass overrides these to alter behavior of views' iterator() method
    Iterator<K> newKeyIterator()   {
        return new KeyIterator();
    }
    Iterator<V> newValueIterator()   {
        return new ValueIterator();
    }
    Iterator<Map.Entry<K,V>> newEntryIterator()   {
        return new EntryIterator();
    }
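
The difference between removing through the iterator and removing through the map shows up directly in the expectedModCount check above. The following sketch uses only the public java.util API; the class name, helper names and sample data are my own.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class IteratorRemoveDemo {
    static Map<Integer, String> sample() {
        Map<Integer, String> map = new HashMap<Integer, String>();
        for (int i = 1; i <= 4; i++)
            map.put(i, "v" + i);
        return map;
    }

    // Safe: Iterator.remove() resynchronises expectedModCount with modCount.
    static Map<Integer, String> removeEvenKeys(Map<Integer, String> map) {
        for (Iterator<Integer> it = map.keySet().iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0)
                it.remove();
        }
        return map;
    }

    // Unsafe: the map is modified directly between two next() calls,
    // so the iterator sees modCount != expectedModCount.
    static boolean directRemoveFailsFast(Map<Integer, String> map) {
        Iterator<Integer> it = map.keySet().iterator();
        Integer first = it.next();
        map.remove(first);
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("after iterator removal: " + removeEvenKeys(sample()).keySet());
        System.out.println("direct removal failed fast: " + directRemoveFailsFast(sample()));
    }
}
```

This check happens in a single thread here; across threads it is only a best-effort detector and should not be relied on for correctness.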

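   III. Advantages

        1. Functionality:

         (1) Looks up a value by its key, which makes it applicable in a very wide range of situations.

         2. Performance

         (1) Combines the fast addressing of an array with the fast insertion and deletion of a linked list, so lookups and removals are fast.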

  

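  IV. Disadvantages

       1. Functionality

        (1) Unordered: iteration order is not defined.

       2. Performance

       3. Reliability

       (1) Not thread-safe. The modCount field only lets iterators detect a concurrent structural modification and fail fast; it does not make the map safe to share between threads.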

      

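  V. Usage scenarios and caveats

       (1) When the number of entries is known in advance, pass an initial capacity to the HashMap constructor. Without it, the map has to resize each time the number of entries reaches the threshold, and every resize is relatively expensive.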
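
Since the resize threshold is capacity * loadFactor, passing the expected entry count itself is not always enough to avoid every resize. A common idiom, sketched here with a hypothetical helper capacityFor and the default load factor of 0.75 assumed, is to divide the expected count by the load factor first.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMapDemo {
    // Hypothetical helper: an initial capacity large enough that
    // `expected` entries stay within capacity * 0.75.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    // Builds a map of i -> i*i without triggering a resize along the way
    // (assuming the default load factor of 0.75).
    static Map<Integer, Integer> squares(int n) {
        Map<Integer, Integer> map = new HashMap<Integer, Integer>(capacityFor(n));
        for (int i = 0; i < n; i++)
            map.put(i, i * i);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("capacity requested for 1000 entries: " + capacityFor(1000));
        System.out.println("entries stored: " + squares(1000).size());
    }
}
```

For 1000 entries the helper requests 1334, which the constructor rounds up to 2048; the threshold is then 1536, comfortably above 1000.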

Reposted from dsnstudy.iteye.com/blog/2304357