hash table
Hash tables, also known as hash maps, are a data structure that supports insertion, deletion, and lookup in average constant time; the trade-off is that an ordering relationship between elements is usually not supported.
In practice, hashing strings is the most common case. In Java, hashCode is a method declared on java.lang.Object, so it is available to every object:
public native int hashCode();
java.lang.System also declares a related native method, identityHashCode:
public static native int identityHashCode(Object x);
Running the following snippet prints true, provided the class of ss does not override hashCode (the default Object.hashCode and System.identityHashCode then return the same value):
// ss is an instance of a class that does not override hashCode
int i = ss.hashCode();
int ii = System.identityHashCode(ss);
System.out.println(i == ii);
Commonly used String hash source code:
// The hashCode method of the String class
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
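The loop above evaluates the polynomial s[0]·31^(n-1) + s[1]·31^(n-2) + … + s[n-1]. As a quick sanity check, here is a small sketch (polyHash is an illustrative name, not a JDK method) that recomputes the polynomial and compares it with String.hashCode:

```java
public class StringHashDemo {
    // Recomputes the documented polynomial: s[0]*31^(n-1) + ... + s[n-1]
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // For "ab": h = 31 * 'a' + 'b' = 31 * 97 + 98 = 3105
        System.out.println(polyHash("ab"));  // 3105
        System.out.println("ab".hashCode()); // 3105, same value
    }
}
```

The multiplier 31 is an odd prime, and 31 * h can be computed cheaply as (h << 5) - h, which is part of why it was chosen.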
The hash method applied to keys in HashMap (and inherited by LinkedHashMap) — note this is distinct from hashCode:
// static hash method in HashMap (inherited by LinkedHashMap)
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
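The XOR with h >>> 16 folds the high 16 bits of hashCode into the low bits, because HashMap derives the bucket index from the low bits only. A small demo of the effect (class and variable names are illustrative):

```java
public class HashSpreadDemo {
    // Same bit-spreading step as java.util.HashMap
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        String key = "example";
        int raw = key.hashCode();
        int spread = hash(key);
        // With a table of 16 buckets, the index uses only the low 4 bits,
        // so without spreading the high bits would never influence the bucket.
        System.out.println("raw index:    " + (raw & 15));
        System.out.println("spread index: " + (spread & 15));
    }
}
```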
In a hash table, a good hash method makes full use of the allocated memory and keeps collisions rare.
Ways to resolve collisions:
1. Separate chaining
Elements with the same hash value are linked into a list, and the lists are stored in an array; the whole hash table is thus an array of linked lists
2. Open addressing
An element whose hash slot is occupied is allowed to take another unoccupied array position. There are three main strategies for choosing that position:
2.1. Linear probing: try the positions immediately following, one by one, until a free slot is found
2.2. Quadratic probing: if the position i assigned to an element is occupied, try positions i + 1², i + 2², i + 3², and so on
2.3. Double hashing: apply a second hash function (generally different from the first) to the key to derive the probe step, yielding a new position
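As a concrete sketch of open addressing, here is a minimal table using linear probing (method 2.1). All names are illustrative; it assumes the table never fills completely, since it has no resizing:

```java
// Minimal open-addressing table with linear probing (a sketch, not
// production code): on a collision, the insert walks to the next free slot.
public class LinearProbingTable {
    private final Object[] keys;
    private final Object[] values;

    public LinearProbingTable(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    private int indexFor(Object key) {
        // Mask off the sign bit so the index is non-negative
        return (key.hashCode() & 0x7fffffff) % keys.length;
    }

    public void put(Object key, Object value) {
        int i = indexFor(key);
        // Probe linearly until we find the key or an empty slot
        // (assumes the table is never completely full)
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length; // wrap around the array
        }
        keys[i] = key;
        values[i] = value;
    }

    public Object get(Object key) {
        int i = indexFor(key);
        while (keys[i] != null) {
            if (keys[i].equals(key)) return values[i];
            i = (i + 1) % keys.length;
        }
        return null; // hit an empty slot: key is absent
    }
}
```

A real implementation also needs deletion with tombstones and resizing once the load factor grows, since probe chains lengthen quickly as the table fills.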
In addition, there are improved algorithms such as perfect hashing, cuckoo hashing, and hopscotch hashing.
Cuckoo hashing: the basic idea is to keep N tables with N hash functions. Hashing an element with each function gives it one candidate slot per table, so each element has N possible positions. Insertion starts at the first table; if the candidate slot there is already occupied, either (1) the resident element is evicted and moved to one of its own alternative positions, or (2) insertion continues with the next table. Cuckoo hashing can also be parallelized, improving throughput with multiple threads.
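A minimal sketch of the cuckoo idea with N = 2 tables of integers (all names are illustrative; a real implementation would rehash with new functions when the eviction loop gives up):

```java
// Two-table cuckoo hash sketch. Each key has one candidate slot per table;
// an insert that finds its slot occupied evicts the resident, which is then
// re-inserted into its slot in the other table.
public class CuckooHashSet {
    private final Integer[] t1, t2;
    private static final int MAX_KICKS = 32; // give up after this many evictions

    public CuckooHashSet(int capacity) {
        t1 = new Integer[capacity];
        t2 = new Integer[capacity];
    }

    // Two deliberately different (toy) hash functions
    private int h1(int key) { return Math.floorMod(key, t1.length); }
    private int h2(int key) { return Math.floorMod(key / t1.length, t2.length); }

    public boolean contains(int key) {
        return Integer.valueOf(key).equals(t1[h1(key)])
            || Integer.valueOf(key).equals(t2[h2(key)]);
    }

    public boolean insert(int key) {
        if (contains(key)) return true;
        Integer k = key;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            // Try table 1, evicting whatever is there
            Integer evicted = t1[h1(k)];
            t1[h1(k)] = k;
            if (evicted == null) return true;
            // The evicted key moves to its slot in table 2
            k = evicted;
            evicted = t2[h2(k)];
            t2[h2(k)] = k;
            if (evicted == null) return true;
            k = evicted; // still homeless: loop and kick again
        }
        return false; // a real implementation would rehash here
    }
}
```

Lookup touches at most two slots, so it is worst-case constant time, which is the main attraction of the scheme.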
If separate chaining is implemented with a singly linked list (a simple implementation of my own is given at the end of this article), all the basic operations tend to slow down as the amount of data grows. This can be mitigated in three main ways:
1. At initialization, choose an appropriate initial array length based on the expected amount of data
2. Expand and rehash when the amount of data gradually grows, recomputing every element's position (for unpredictable growth in data volume)
3. Store elements with the same hash value in a doubly linked list (as LinkedList is) instead of a singly linked one
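Method 2 above, load-factor-driven expansion, can be sketched as follows (illustrative names; every entry must be re-indexed, because the bucket index depends on the capacity):

```java
import java.util.LinkedList;

// Chained hash map that doubles its bucket array when the load factor
// exceeds 0.75, then re-inserts every entry under the new capacity.
public class ResizingChainedMap<K, V> {
    private LinkedList<Object[]>[] buckets; // each entry is {key, value}
    private int size;
    private static final double LOAD_FACTOR = 0.75;

    @SuppressWarnings("unchecked")
    public ResizingChainedMap() {
        buckets = new LinkedList[8];
    }

    private int indexFor(Object key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    public void put(K key, V value) {
        if (size + 1 > buckets.length * LOAD_FACTOR) rehash();
        int i = indexFor(key, buckets.length);
        if (buckets[i] == null) buckets[i] = new LinkedList<>();
        for (Object[] e : buckets[i]) {
            if (e[0].equals(key)) { e[1] = value; return; } // update in place
        }
        buckets[i].add(new Object[]{key, value});
        size++;
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        LinkedList<Object[]> bucket = buckets[indexFor(key, buckets.length)];
        if (bucket == null) return null;
        for (Object[] e : bucket) {
            if (e[0].equals(key)) return (V) e[1];
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    private void rehash() {
        LinkedList<Object[]>[] old = buckets;
        buckets = new LinkedList[old.length * 2];
        for (LinkedList<Object[]> bucket : old) {
            if (bucket == null) continue;
            for (Object[] e : bucket) {
                // Recompute each index against the new capacity
                int i = indexFor(e[0], buckets.length);
                if (buckets[i] == null) buckets[i] = new LinkedList<>();
                buckets[i].add(e);
            }
        }
    }
}
```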
The main differences between HashMap, HashSet, LinkedHashMap, Hashtable and ConcurrentHashMap in Java
HashMap:
Implemented with separate chaining; not thread-safe (concurrent rehashing can corrupt the internal structure), and both keys and values may be null
HashSet:
First, look at some source code:
// HashSet source code
private transient HashMap<E,Object> map;

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}
HashSet is in fact backed by a HashMap; it offers fast access, is not thread-safe, and permits a null element
LinkedHashMap:
Maintains insertion order by threading a doubly linked list through the entries, on top of the HashMap structure. It is not thread-safe and permits null. An ordering comparison example:
// HashMap vs. LinkedHashMap iteration order
HashMap<String,Integer> s = new HashMap<String,Integer>();
for (int i = 0; i < 10; i++) {
    s.put("aaa" + i, i);
}
System.out.println(s);

LinkedHashMap<String,Integer> ss = new LinkedHashMap<String,Integer>();
for (int i = 0; i < 10; i++) {
    ss.put("aaa" + i, i);
}
System.out.println(ss);
Result: the LinkedHashMap prints its entries in insertion order (aaa0 through aaa9), while the HashMap prints them in an order determined by the keys' hashes.
HashTable:
The thread-safe counterpart of HashMap; it locks the entire table for every operation, and neither keys nor values may be null
ConcurrentHashMap:
The thread-safe map designed for concurrent use; instead of locking the whole table, it uses finer-grained synchronization (lock-striped segments up to Java 7, CAS plus per-bucket locking from Java 8), and neither keys nor values may be null
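ConcurrentHashMap supports atomic per-key updates such as merge, so multiple threads can update the map safely without external locking. A short usage sketch (class name illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

// Two threads increment the same counter key; merge is atomic per key in
// ConcurrentHashMap, so no updates are lost.
public class ConcurrentCounter {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                counts.merge("hits", 1, Integer::sum); // atomic per-key update
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counts.get("hits")); // 2000
    }
}
```

The same loop on a plain HashMap would lose updates (or worse), since get-then-put is not atomic there.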
Finally, the simple separate-chaining implementation promised above:

package com.ryo.structure.hash;

/**
 * <p>Hash table using the separate chaining model
 * <p>Elements hashed to the same position form a singly linked list.<br>
 * The default length is 103, but a larger table can be created by passing an
 * integer greater than 103 to the constructor; passing an integer less than
 * 103 creates a table of the default size 103.<br>
 * Entry is not implemented.<br>
 * Load-factor-based expansion is not implemented; choosing a suitable initial
 * length is used to work around this.
 * @author shiin
 * @param <K> key
 * @param <V> value
 */
@SuppressWarnings("unchecked")
public class SCHashMap<K, V> implements HashMap<K, V> {

    private static final int DEFAULT_CAPACITY = 103;
    private HashNode<K, V>[] map;
    private int size;
    private int currentCapacity;

    public SCHashMap() {
        this(DEFAULT_CAPACITY);
    }

    public SCHashMap(int capacity) {
        if (capacity < DEFAULT_CAPACITY)
            capacity = DEFAULT_CAPACITY;
        map = new HashNode[capacity];
        size = 0;
        currentCapacity = capacity;
    }

    @Override
    public boolean contains(K key) {
        HashNode<K, V> node = map[hash(key)];
        while (node != null) {
            if (node.key.equals(key)) {
                return true;
            }
            node = node.next;
        }
        return false;
    }

    @Override
    public V get(K key) {
        HashNode<K, V> node = map[hash(key)];
        while (node != null) {
            if (node.key.equals(key)) {
                return node.value;
            }
            node = node.next;
        }
        return null;
    }

    @Override
    public int put(K key, V value) {
        int index = hash(key);
        if (map[index] == null) {
            map[index] = new HashNode<K, V>(key, value);
        } else {
            HashNode<K, V> node = map[index];
            HashNode<K, V> prev = null;
            while (node != null) {
                if (node.key.equals(key)) {
                    node.value = value;
                    return 1;
                }
                prev = node;
                node = node.next;
            }
            node = new HashNode<K, V>(key, value);
            prev.next = node;
        }
        size++;
        return 1;
    }

    @Override
    public int remove(K key) {
        HashNode<K, V> node = map[hash(key)];
        HashNode<K, V> prev = null;
        while (node != null) {
            if (node.key.equals(key)) {
                if (prev == null)
                    map[hash(key)] = node.next;
                else
                    prev.next = node.next;
                node = null; // GC
                size--;
                return 0;
            }
            prev = node;
            node = node.next;
        }
        return 0;
    }

    @Override
    public int size() {
        return this.size;
    }

    @Override
    public void clearAll() {
        map = new HashNode[currentCapacity];
        size = 0;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }

    /**
     * Hash method for keys
     * @param key key value
     * @return hashed storage location
     */
    private int hash(K key) {
        if (key == null)
            return 0;
        int hash = key.hashCode() % currentCapacity;
        if (hash < 0)
            hash += currentCapacity;
        return hash;
    }

    private static class HashNode<K, V> {
        K key;
        V value;
        HashNode<K, V> next;

        HashNode(K key, V value, HashNode<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }

        HashNode(K key, V value) {
            this(key, value, null);
        }
    }
}