[Source code analysis] The difference between HashMap and Hashtable (source code analysis and interpretation)



Foreword: 
It's another great weekend, but unfortunately I got up a bit late today. Let's take a look at HashMap and HashTable to see what the difference between them is.

Let's start with two rather dense definitions from the class documentation:

A Hashtable instance has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. Note that the hash table is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. The initial capacity and load factor parameters are merely hints to the implementation. The exact details as to when and whether the rehash method is invoked are implementation-dependent.

HashMap is a hash-table-based implementation of the Map interface. This implementation provides all of the optional map operations and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time. This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over the collection views takes time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). So, if iteration performance is important, don't set the initial capacity too high (or the load factor too low).

 

1. A demonstration

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

// The class declaration was lost from the original listing; "Demo" is a placeholder name.
public class Demo {
    public static void main(String[] args) {
        // Fill a HashMap and print its values in key-iteration order.
        Map<String, String> map = new HashMap<String, String>();
        map.put("a", "aaa");
        map.put("b", "bbb");
        map.put("c", "ccc");
        map.put("d", "ddd");
        Iterator<String> iterator = map.keySet().iterator();
        while (iterator.hasNext()) {
            String key = iterator.next();
            System.out.println("map.get(key) is :" + map.get(key));
        }

        // Do the same with a Hashtable holding identical entries.
        Hashtable<String, String> tab = new Hashtable<String, String>();
        tab.put("a", "aaa");
        tab.put("b", "bbb");
        tab.put("c", "ccc");
        tab.put("d", "ddd");
        Iterator<String> iterator_1 = tab.keySet().iterator();
        while (iterator_1.hasNext()) {
            String key = iterator_1.next();
            System.out.println("tab.get(key) is :" + tab.get(key));
        }
    }
}

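So what does the code above print?

On a typical JDK, the HashMap values come out in the expected order (aaa, bbb, ccc, ddd), while the Hashtable values come out in an order that looks a bit strange. Neither class guarantees an iteration order; what we see is a side effect of how each one assigns keys to buckets.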
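The ordering can be traced back to bucket indices. The sketch below is only an illustration: it assumes the default capacities (16 for HashMap, 11 for Hashtable) and copies the index formulas from the JDK 8 sources quoted in this article. The class and method names are made up for the example.

```java
class BucketOrderDemo {
    // HashMap (JDK 8): fold the high bits into the low bits, then mask with (capacity - 1).
    static int hashMapIndex(String key, int capacity) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (capacity - 1);
    }

    // Hashtable: clear the sign bit, then take the remainder.
    static int hashtableIndex(String key, int capacity) {
        return (key.hashCode() & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"a", "b", "c", "d"}) {
            System.out.println(key + ": HashMap bucket " + hashMapIndex(key, 16)
                    + ", Hashtable bucket " + hashtableIndex(key, 11));
        }
    }
}
```

The four keys land in HashMap buckets 1, 2, 3, 4 and in Hashtable buckets 9, 10, 0, 1. HashMap walks its buckets in ascending order, which gives a, b, c, d. As far as I can tell from the JDK 8 source, Hashtable walks its table from the last bucket down, which would give b, a, d, c. Neither order is promised by the API.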

2. Source code analysis
Having seen the results above, let's look at the source code of HashMap and Hashtable in turn.

I'll first list the commonly cited differences (as summarized by others before me), and then check each one against the source code.

1) Hashtable is synchronized; HashMap is not
Hashtable's put and get methods:

public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}

public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}

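HashMap's put and get methods:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

The code above makes it plain that Hashtable's put and get methods are declared synchronized, while HashMap's are not. What difference does that make?
Because HashMap is not thread-safe, it avoids the locking overhead and may be faster than Hashtable. When several threads access the map, we can use Hashtable, or synchronize a HashMap by wrapping it with Collections.synchronizedMap.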
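As a rough sketch of the two thread-safe options, the following fills a map from several threads at once. The class name SyncMapDemo and the helper fill are invented for this example; only the java.util calls are real API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class SyncMapDemo {
    // Starts `threads` writers, each putting `perThread` distinct keys, and returns the final size.
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> table = new Hashtable<>();
        Map<Integer, Integer> wrapped = Collections.synchronizedMap(new HashMap<>());
        System.out.println("Hashtable size: " + fill(table, 4, 10_000));
        System.out.println("synchronizedMap size: " + fill(wrapped, 4, 10_000));
    }
}
```

Both maps should report 40000 entries because every put is serialized. Passing a bare HashMap to the same helper is not safe: entries may be lost or the map may be corrupted.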


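2) Hashtable and HashMap implement the same interfaces (Map, Cloneable and Serializable), but Hashtable extends Dictionary while HashMap extends AbstractMap.
Hashtable:

public class Hashtable<K,V> extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable

HashMap:

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable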
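The inheritance claim can also be checked at runtime with reflection. This is just a small sketch; HierarchyDemo and parentOf are names made up for the example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class HierarchyDemo {
    // Name of the direct superclass of the given class.
    static String parentOf(Class<?> type) {
        return type.getSuperclass().getName();
    }

    public static void main(String[] args) {
        System.out.println(parentOf(Hashtable.class)); // java.util.Dictionary
        System.out.println(parentOf(HashMap.class));   // java.util.AbstractMap
        // Both classes are still Maps, whatever their superclass is.
        System.out.println(Map.class.isAssignableFrom(Hashtable.class)); // true
        System.out.println(Map.class.isAssignableFrom(HashMap.class));   // true
    }
}
```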
 

 

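3) Hashtable does not allow null (neither as key nor as value); HashMap allows null (both as key and as value).

In item 1 we saw that Hashtable throws immediately when the value is null: throw new NullPointerException(); (a null key fails as well, because put calls key.hashCode()).
So what exactly does HashMap's put do with the value?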

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
}

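As we can see, HashMap performs no mandatory null check on the value. (A null key is accepted too: in JDK 8, hash(key) returns 0 when the key is null.)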
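The difference is easy to observe from the outside. In the sketch below, NullHandlingDemo and accepts are hypothetical names chosen for this illustration.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullHandlingDemo {
    // Returns true if the map stores the pair, false if it rejects it with a NullPointerException.
    static boolean accepts(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(new HashMap<>(), null, "v"));    // true
        System.out.println(accepts(new HashMap<>(), "k", null));    // true
        System.out.println(accepts(new Hashtable<>(), null, "v"));  // false
        System.out.println(accepts(new Hashtable<>(), "k", null));  // false
    }
}
```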

4) Hashtable has a contains(Object value) method that does the same thing as containsValue(Object value).
Comparing the contains-related methods of the two classes: Hashtable has contains, containsKey and containsValue, while HashMap has only containsKey and containsValue.

Hashtable's contains method was dropped in HashMap. Let's look at what Hashtable's contains actually does:

public synchronized boolean contains(Object value) {
    if (value == null) {
        throw new NullPointerException();
    }

    Entry<?,?> tab[] = table;
    for (int i = tab.length ; i-- > 0 ;) {
        for (Entry<?,?> e = tab[i] ; e != null ; e = e.next) {
            if (e.value.equals(value)) {
                return true;
            }
        }
    }
    return false;
}

Now look at Hashtable's containsValue method:

public boolean containsValue(Object value) {
    return contains(value);
}

It is now clear that contains does exactly what containsValue does: it compares each stored value with equals. That is why HashMap dropped the contains method and offers containsValue instead.
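A short check makes the point concrete: contains looks at values, not keys. ContainsDemo and sample are names invented for this sketch.

```java
import java.util.Hashtable;

class ContainsDemo {
    // A one-entry table: key "a" maps to value "aaa".
    static Hashtable<String, String> sample() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("a", "aaa");
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = sample();
        System.out.println(table.contains("aaa"));      // true: "aaa" is a value
        System.out.println(table.containsValue("aaa")); // true: same check
        System.out.println(table.contains("a"));        // false: "a" is only a key
        System.out.println(table.containsKey("a"));     // true
    }
}
```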

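5) Hashtable is traversed with Enumeration; HashMap is traversed with Iterator.

Hashtable exposes keys() and elements(), both of which return an Enumeration. (Being a Map, it also supports Iterator through keySet(), values() and entrySet().)

HashMap offers only the Iterator-based collection views keySet(), values() and entrySet().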

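A legacy interface: Enumeration
The Enumeration interface was introduced in JDK 1.0 and is the earliest iteration interface. Vector (for which ArrayList is now the recommended replacement) originally used Enumeration for output. Although Enumeration is old, it was extended in JDK 1.5 to support generics.

The commonly used methods of Enumeration are hasMoreElements() (checks whether there is another element) and nextElement() (returns the next element). They work much like the methods of Iterator, except that Iterator has a method for removing elements and Enumeration has none.

Why keep using the Enumeration interface?
Enumeration and Iterator do similar jobs, and Iterator can do more, so why is Enumeration still used? Java has been evolving for a long time, and some older systems and libraries still expose methods based on Enumeration. It is kept for compatibility.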

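Hashtable and HashMap can each be traversed in several ways, for example over entrySet() or keySet() with an Iterator or a for-each loop, and, for Hashtable only, over keys() or elements() with an Enumeration.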
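Two of these traversal styles might look like the following sketch. TraversalDemo and its method names are made up for this illustration; only the java.util calls are real API.

```java
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

class TraversalDemo {
    // Counts keys through the legacy Enumeration API (read-only traversal).
    static int countWithEnumeration(Hashtable<String, String> table) {
        int count = 0;
        Enumeration<String> keys = table.keys();
        while (keys.hasMoreElements()) {
            keys.nextElement();
            count++;
        }
        return count;
    }

    // Removes every entry whose value equals `unwanted`, using Iterator.remove().
    static void removeByValue(Map<String, String> map, String unwanted) {
        Iterator<Map.Entry<String, String>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().equals(unwanted)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("a", "aaa");
        table.put("b", "bbb");
        System.out.println(countWithEnumeration(table)); // 2

        Map<String, String> map = new HashMap<>(table);
        removeByValue(map, "aaa");
        System.out.println(map); // {b=bbb}
    }
}
```

Removing during traversal is only possible through the Iterator; Enumeration has no equivalent of remove().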

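6) Hashtable's default table size is 11, and it grows to old*2+1. HashMap's default table size is 16, and its size is always a power of two.

HashMap:

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

Hashtable: Generally, the default load factor (0.75) offers a good trade-off between time and space costs. A higher load factor decreases the space overhead but increases the time needed to look up an entry (which is reflected in most Hashtable operations, including get and put).

// Default constructor.
public Hashtable() {
    // The default capacity is 11 and the default load factor is 0.75
    this(11, 0.75f);
}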
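The two growth rules can be sketched in a few lines. This is not the JDK code itself: CapacityDemo and its methods are hypothetical helpers that only mirror the arithmetic, and the rounding helper assumes a small positive argument (it ignores overflow and the maximum capacity).

```java
class CapacityDemo {
    // Hashtable grows its table to old * 2 + 1 when it rehashes.
    static int nextHashtableCapacity(int old) {
        return old * 2 + 1;
    }

    // Smallest power of two >= requested, which is how HashMap treats a requested capacity.
    static int roundUpToPowerOfTwo(int requested) {
        int highest = Integer.highestOneBit(requested);
        return (highest == requested) ? requested : highest << 1;
    }

    public static void main(String[] args) {
        int capacity = 11;
        for (int i = 0; i < 3; i++) {
            capacity = nextHashtableCapacity(capacity);
            System.out.println("Hashtable grows to " + capacity); // 23, 47, 95
        }
        System.out.println(roundUpToPowerOfTwo(11)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
    }
}
```

Hashtable's capacities stay odd (11, 23, 47, 95), while HashMap's stay powers of two, which is what lets it use a bit mask instead of a remainder.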

 

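7) The hash value is used differently
Hashtable uses the object's hashCode directly:

int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;

HashMap recomputes the hash and uses a bitwise AND instead of the modulo operation. (The listing below follows an older, pre-JDK 8 HashMap. JDK 8 spreads the bits with (h = key.hashCode()) ^ (h >>> 16) and computes the index inline as (n - 1) & hash, as seen in putVal above.)

int hash = hash(k);
int i = indexFor(hash, table.length);

static int hash(Object x) {
    int h = x.hashCode(); // this declaration was missing from the original listing
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

static int indexFor(int h, int length) {
    return h & (length-1);
}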
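The bit mask is only a valid substitute for the remainder when the table length is a power of two. The sketch below compares the two formulas; IndexDemo and its methods are names invented for this example.

```java
class IndexDemo {
    // HashMap style: keep the low bits of the hash.
    static int maskIndex(int hash, int length) {
        return hash & (length - 1);
    }

    // Hashtable style: clear the sign bit, then take the remainder.
    static int moduloIndex(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    // True if both formulas give the same index for every hash in [-limit, limit].
    static boolean agree(int length, int limit) {
        for (int h = -limit; h <= limit; h++) {
            if (maskIndex(h, length) != moduloIndex(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree(16, 100_000)); // true: 16 is a power of two
        System.out.println(agree(11, 100_000)); // false: the mask 10 (binary 1010) skips most buckets
    }
}
```

With length 11 the mask can only ever produce the indices 0, 2, 8 and 10, which is why HashMap insists on power-of-two capacities and Hashtable, with its odd capacities, has to use the remainder.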


 

3. Related classes
3.1 The relationship between HashMap and HashSet

a. HashSet is implemented on top of HashMap:

public HashSet() {
    map = new HashMap<E,Object>();
}

b. When HashSet's add method is called, it actually adds an entry (key-value pair) to the underlying HashMap. The key of the entry is the object being added to the HashSet, and the value is a shared constant of type Object.

private static final Object PRESENT = new Object();

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

3.2 The relationship between HashMap and ConcurrentHashMap

For this part I suggest reading the source code yourself. ConcurrentHashMap is also a thread-safe collection class, but it differs from Hashtable, mainly in the granularity of locking and in how the locking is done. ConcurrentHashMap locks at a finer granularity than Hashtable. In JDK 7 and earlier it divides the data into segments and assigns a lock to each segment, so while one thread holds the lock of one segment, the data in the other segments can still be accessed by other threads. (JDK 8 replaced the segments with even finer-grained locking on individual bins.)
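From the caller's side the finer locking is invisible; what matters is that compound updates such as merge are atomic. The following is a minimal sketch, with ConcurrentCountDemo and countHits as made-up names.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Increments a single counter from several threads and returns the final count.
    static int countHits(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 10_000)); // 40000
    }
}
```

Doing the same with a separate get followed by put would lose updates on a Hashtable or a synchronizedMap too, because the lock is released between the two calls.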

For more information, please refer to: http://www.hollischuang.com/archives/82


4. Hashtable source code

The full Hashtable source ships with the JDK (java/util/Hashtable.java inside the JDK's src.zip).

 

 

 

Category:  Source code reading


Is a flower considered romantic


Origin blog.51cto.com/7592962/2543739