1 unordered_map
In C++98, the STL provides a series of associative containers built on red-black trees. Their query efficiency is O(log N): in the worst case, a lookup needs as many comparisons as the tree is tall.
But when the tree is large, O(log N) lookups are still a bit slow. Therefore
**C++11 provides a series of unordered containers** whose average query efficiency can reach O(1).
The four unordered containers:
unordered_map
unordered_set
unordered_multimap
unordered_multiset
Introduction to unordered_map
1 unordered_map is an associative container that stores <key, value> pairs; a value can be looked up quickly through its key (similar to indexing an array by subscript).
2 In an unordered_map the key uniquely identifies an element; the key and the value may be of different types.
3 Internally, unordered_map keeps its elements in no particular order.
4 unordered_map finds a single element by key faster than map does, but because it is unordered, iterating over a range of its elements is generally less efficient.
5 unordered_map overloads operator[], so a value can be accessed directly by using its key as the argument.
6 Its iterators are at least forward iterators.
2 unordered_map Interface Description
unordered_map<key_type, value_type> mp;
Constructs an unordered_map object for the given key and value types.
bool empty() const // const member function (it does not modify the container); returns true if the container is empty
size_t size() const // returns the number of key-value pairs stored
3 Iterators (there are no reverse iterators; the STL implements the container with hash buckets + linked lists, i.e., open hashing)
begin
end
cbegin
cend
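4 operator[]
If the key exists, it returns a reference to the corresponding value; if the key does not exist, it first inserts a default value.
It constructs a key-value pair <key, V()> and tries to insert it.
If the insertion succeeds, the default value V() is returned; if it fails, the key already exists and the existing value is returned.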
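A minimal sketch of this operator[] behaviour, assuming a map from std::string to int. The helper name lookup_missing is made up for illustration; it only shows that reading a missing key through operator[] inserts a default-constructed value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: reads m[key] and returns the value that operator[] yields.
int lookup_missing(std::unordered_map<std::string, int>& m, const std::string& key) {
    const std::size_t before = m.size();
    int value = m[key];           // inserts <key, int()> when key is absent
    assert(m.size() >= before);   // the map may have grown by one element
    return value;
}
```

Because operator[] can insert, it is not available on a const unordered_map; find or at is the usual choice when a lookup must not modify the container.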
5 iterator find(const K& key) // returns an iterator to the element with this key, or end() if it does not exist
size_t count(const K& key) // returns the number of key-value pairs in the hash buckets whose key equals key (0 or 1 for unordered_map)
insert
erase
void clear()
void swap(unordered_map<,>& mp) // exchanges the contents of two unordered_maps
size_t bucket_count() const // number of hash buckets
size_t bucket_size(size_t n) const // number of elements (nodes) in the n-th bucket
size_t bucket(const K& key) // index of the bucket in which the element with this key is located
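The lookup and bucket functions listed above can be tried out with a short sketch. The helper names get_or and count_via_buckets are assumptions made for this example; only the std::unordered_map member functions come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Looks up a key without inserting it; returns fallback when the key is absent.
int get_or(const std::unordered_map<std::string, int>& m,
           const std::string& key, int fallback) {
    auto it = m.find(key);
    if (it == m.end()) return fallback;  // find signals "not found" with end()
    return it->second;
}

// Adds up bucket_size(n) over every bucket; the total equals size().
std::size_t count_via_buckets(const std::unordered_map<std::string, int>& m) {
    std::size_t total = 0;
    for (std::size_t n = 0; n < m.bucket_count(); ++n) total += m.bucket_size(n);
    assert(total == m.size());  // every element lives in exactly one bucket
    return total;
}
```

Unlike operator[], find and count never add an element, so they are safe to call on a const map.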
Why is unordered_map faster to query than map? Because its underlying structure is a hash table.
Hash
In a red-black tree or an AVL tree, there is no direct mapping between a key and its storage location, so a lookup needs multiple comparisons.
The idea behind the hash structure: can we locate a value directly, without comparing?
That question is what led to the hash structure.
The hash idea: through a function, each key yields one corresponding storage location. That function, which establishes the mapping, is the hash function.
Common hash functions
1 Direct addressing method
2 Division-remainder method
Hash function design principles:
1 It should be simple to compute
2 Its results should be evenly distributed
3 Its domain must include every key that needs to be stored
The mapping function is called a hash function, and the structure built on it is called a hash table.
Division-remainder hash function: hash(key) = key % capacity
Hash collision:
Different keys may be mapped to the same address by the hash function.
For example: 1, 2, 3, 12, 13, 18
The capacity is 10.
2 and 12 both compute to address 2, so they conflict.
3 and 13 both compute to address 3, so they conflict.
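The collision example can be checked with a few lines of code. This is only a sketch: the function names hash_addr and count_collisions are made up here, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Division-remainder hash from the text: hash(key) = key % capacity.
std::size_t hash_addr(int key, std::size_t capacity) {
    assert(capacity > 0 && key >= 0);
    return static_cast<std::size_t>(key) % capacity;
}

// Counts the keys that land on an address already used by an earlier key.
std::size_t count_collisions(const std::vector<int>& keys, std::size_t capacity) {
    std::vector<bool> used(capacity, false);
    std::size_t collisions = 0;
    for (int key : keys) {
        const std::size_t addr = hash_addr(key, capacity);
        if (used[addr]) {
            ++collisions;
        } else {
            used[addr] = true;
        }
    }
    return collisions;
}
```

For the keys 1, 2, 3, 12, 13, 18 with capacity 10, the addresses are 1, 2, 3, 2, 3, 8, so two keys collide with earlier ones.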
Hash conflict resolution
Approaches fall into two groups: closed hashing and open hashing
1 Closed hashing (open addressing)
Description: when a hash collision occurs and the hash table is not full, use a probing method to find the next empty position and store the conflicting element there.
Linear probing:
Preconditions: 1 a collision has occurred, 2 the hash table is not full
Starting from the computed address, check the following positions one by one (wrapping around to the head of the table after the end) until an empty position is found, then insert the new element there.
Deletion in closed hashing
When closed hashing is used to handle collisions, an element cannot simply be physically deleted, because later elements may have been placed by probing past it.
If the element is really deleted, other elements that were inserted by probing past its position may no longer be found.
We therefore keep a state flag for each slot to record whether an element is present.
enum state { EMPTY, EXIST, DELETE };
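Linear probing and state-flag deletion can be put together in a small sketch. The class name ClosedHashTable, its members, and the restriction to non-negative int keys with a fixed capacity are all assumptions made for illustration; expansion is left out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum state { EMPTY, EXIST, DELETE };

struct Slot {
    int key = 0;
    state flag = EMPTY;
};

// Hypothetical fixed-capacity table of non-negative int keys.
class ClosedHashTable {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit ClosedHashTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Returns the slot index of key, or npos if it is absent.
    std::size_t find(int key) const {
        std::size_t i = home(key);
        for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
            if (slots_[i].flag == EMPTY) return npos;  // probe chain ends here
            if (slots_[i].flag == EXIST && slots_[i].key == key) return i;
            i = (i + 1) % slots_.size();               // DELETE slots are skipped
        }
        return npos;
    }

    // Inserts key; returns false if it is already present or the table is full.
    bool insert(int key) {
        if (find(key) != npos || size_ == slots_.size()) return false;
        std::size_t i = home(key);
        while (slots_[i].flag == EXIST) i = (i + 1) % slots_.size();  // wrap around
        slots_[i].key = key;
        slots_[i].flag = EXIST;
        ++size_;
        return true;
    }

    // Marks the slot DELETE instead of resetting it to EMPTY.
    bool erase(int key) {
        const std::size_t i = find(key);
        if (i == npos) return false;
        slots_[i].flag = DELETE;
        --size_;
        return true;
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    std::vector<Slot> slots_;
    std::size_t size_ = 0;
};
```

With capacity 10, the keys 2, 12, and 22 end up in slots 2, 3, and 4. Erasing 12 marks slot 3 as DELETE, so a later search for 22 keeps probing past it instead of stopping early.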
Expansion mechanism for closed hashing
Why expand
During probing we find that once the number of elements in the hash table reaches a certain level, hash collisions become very noticeable, and the conflicts pile up more and more. So at a certain point the hash table must be expanded.
When to expand
The load factor A of a hash table:
A = number of stored elements / capacity of the hash table
For closed hashing (open addressing) it has been found that once the load factor reaches 0.7, linear probing performs poorly, so expand as soon as A >= 0.7.
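The load-factor test is a one-liner in code. The sketch below uses a made-up function name, needs_expansion, and compares in integers so that the 0.7 threshold is not affected by floating-point rounding.

```cpp
#include <cassert>
#include <cstddef>

// True when size / capacity >= 0.7, rewritten as size * 10 >= capacity * 7.
bool needs_expansion(std::size_t size, std::size_t capacity) {
    assert(capacity > 0);
    return size * 10 >= capacity * 7;
}
```

A table would typically call such a check before each insertion and expand first when it returns true.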
How to expand:
Studies have also found that collisions are less likely when the size of the hash table is a prime number.
Conventional expansion: capacity generally grows 2x on Linux and typically 1.5x on VS. We use 2x throughout.
With a table of primes in which each entry is roughly twice the previous one, we can quickly look up a prime capacity to expand to.
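One way to pick the next prime capacity is a lookup in a precomputed table. The table below is an illustrative assumption (a short list of primes, each roughly double the previous one), and next_capacity is a made-up name; a real implementation would carry a much longer list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Illustrative prime table (an assumption for this sketch): each entry is
// roughly twice the previous one.
const std::size_t kPrimes[] = {53, 97, 193, 389, 769, 1543, 3079};

// Returns the first prime in the table that is larger than current, or
// current itself once the table is exhausted.
std::size_t next_capacity(std::size_t current) {
    const std::size_t* first = std::begin(kPrimes);
    const std::size_t* last = std::end(kPrimes);
    assert(std::is_sorted(first, last));  // upper_bound needs a sorted range
    const std::size_t* it = std::upper_bound(first, last, current);
    return it == last ? current : *it;
}
```

Because the table is sorted, the lookup is a binary search and costs almost nothing compared with the expansion itself.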
Our hash table is, plainly put, a vector,
so we can borrow vector's expansion approach:
1 Allocate new space
2 Transfer the elements
3 Release the old space
Note: expansion changes the capacity of the hash table, so elements cannot be copied over to the same positions;
each element's position must be recalculated with the hash function as it is re-inserted into the new table.
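The three expansion steps, together with the re-hashing note, might look like the sketch below. It assumes a simplified table: a std::vector<int> of non-negative keys in which -1 marks an empty slot. The names kEmpty, probe_insert, and expand are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmpty = -1;  // assumption: keys are non-negative, -1 marks an empty slot

// Linear-probing insert; the caller guarantees the table has a free slot.
void probe_insert(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    std::size_t i = static_cast<std::size_t>(key) % table.size();
    while (table[i] != kEmpty) i = (i + 1) % table.size();
    table[i] = key;
}

// Builds a larger table and re-inserts every key with the new capacity.
std::vector<int> expand(const std::vector<int>& old_table, std::size_t new_capacity) {
    assert(new_capacity >= old_table.size());
    std::vector<int> bigger(new_capacity, kEmpty);  // 1 allocate new space
    for (int key : old_table) {                     // 2 transfer the elements
        if (key != kEmpty) probe_insert(bigger, key);  // position is recomputed
    }
    return bigger;  // 3 the old space is released when old_table is destroyed
}
```

With capacity 10, the keys 2 and 12 sit in slots 2 and 3. After expanding to 23, key 12 moves to slot 12, which is why a plain copy of the slots would leave it unfindable.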
Advantages of linear probing:
Simple idea
Disadvantages:
1 Low space utilization
2 Once conflicts occur they tend to pile up, which hurts search efficiency.
Quadratic probing
The purpose of quadratic probing is to solve the piling-up problem that linear probing suffers from when hash collisions occur.
Idea: widen the probe spacing
When the hash function yields an address H0
and that address is in conflict, the i-th probe (i = 1, 2, 3, ...) is calculated as
H = H0 + i^2
or H = H0 - i^2
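A minimal sketch of the H = H0 + i^2 variant follows, with the result wrapped into the table by a modulo. It again assumes a std::vector<int> of non-negative keys where -1 marks a free slot; kFree, quadratic_probe, and quadratic_insert are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kFree = -1;  // assumption: keys are non-negative, -1 marks a free slot

// Address of the i-th probe: (H0 + i^2) wrapped into the table.
std::size_t quadratic_probe(std::size_t h0, std::size_t i, std::size_t capacity) {
    assert(capacity > 0);
    return (h0 + i * i) % capacity;
}

// Inserts key and returns its slot index, or table.size() if no free slot
// was found within table.size() probes.
std::size_t quadratic_insert(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    const std::size_t capacity = table.size();
    const std::size_t h0 = static_cast<std::size_t>(key) % capacity;
    for (std::size_t i = 0; i < capacity; ++i) {
        const std::size_t pos = quadratic_probe(h0, i, capacity);
        if (table[pos] == kFree) {
            table[pos] = key;
            return pos;
        }
    }
    return capacity;
}
```

With capacity 11, the keys 2, 13, and 24 all start at address 2 and end up in slots 2, 3, and 6, so the third key jumps over the neighbouring slots instead of extending a contiguous run.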
Objectively speaking, quadratic probing is no cure-all either: once the load factor reaches a certain level, conflicts still increase.
Studies show that when the table size is prime and the load factor does not exceed 0.5, a new element can always be inserted and no position is probed twice, so it is best to expand once the load factor exceeds 0.5.
So clearly, with quadratic probing the efficiency of closed hashing is guaranteed only when space utilization stays below 0.5.
2 Open hashing
Open hashing is also known as the chaining method (separate chaining)
Idea: hash buckets + linked lists
Elements whose keys hash to the same address go into the same bucket, and each bucket is a linked list.
Inserting, finding, and removing an element in open hashing just turns into a linked-list operation, which is much easier.
To check whether an element exists, open hashing only needs to traverse one linked list. In general, although one bucket's list has to be traversed, there are many buckets and each list is short, so the lookup time is generally considered O(1).
So when is open hashing most efficient?
When every bucket holds at most one element, of course, because then no traversal is needed.
In other words, efficiency is highest when the number of buckets satisfies bucket_count >= _size.
If we expand as soon as the number of elements exceeds the number of buckets, this efficiency is largely maintained, so lookups really are O(1) on average.
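Open hashing can be sketched with a vector of singly linked lists. The class below is an assumption-laden illustration, not the STL's implementation: ChainedSet and its members are invented names, keys are non-negative ints, std::forward_list stands in for hand-written nodes, and the bucket count is fixed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <vector>

// Hypothetical chained hash set of non-negative ints with a fixed bucket count.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Traverses only the one bucket the key hashes to.
    bool contains(int key) const {
        const std::forward_list<int>& chain = buckets_[index(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    bool insert(int key) {
        if (contains(key)) return false;
        buckets_[index(key)].push_front(key);  // head insertion is O(1)
        ++size_;
        return true;
    }

    bool erase(int key) {
        std::forward_list<int>& chain = buckets_[index(key)];
        auto prev = chain.before_begin();
        for (auto cur = chain.begin(); cur != chain.end(); prev = cur, ++cur) {
            if (*cur == key) {
                chain.erase_after(prev);  // unlink the node that follows prev
                --size_;
                return true;
            }
        }
        return false;
    }

    // The condition from the text for best efficiency: bucket_count >= _size.
    bool at_best_efficiency() const { return buckets_.size() >= size_; }
    std::size_t size() const { return size_; }

private:
    std::size_t index(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    std::vector<std::forward_list<int>> buckets_;
    std::size_t size_ = 0;
};
```

Erasing needs no state flags here: removing a node from one chain cannot affect how any other key is found.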
So how does open hashing expand? Can it also use the vector approach?
It can, but done naively it is inefficient, and two points need attention.
1 The vector stores the addresses of linked lists, so copying the vector copies only those addresses, which is a shallow copy.
After the old vector is released, the original data can no longer be found,
so we still have to insert the elements one by one.
2 When inserting elements into the new table, avoid creating a new node and then releasing the old one; reuse the existing nodes wherever possible.
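Node reuse during expansion can be sketched with std::forward_list, whose splice_after relinks an existing node into another list without allocating or freeing anything. This swaps hand-written nodes for a standard container; the alias Buckets and the function rehash are hypothetical names, and keys are assumed to be non-negative ints.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <vector>

using Buckets = std::vector<std::forward_list<int>>;

// Moves every node from old_buckets into a new bucket array of new_count
// buckets, recomputing each key's bucket; old_buckets is left empty.
Buckets rehash(Buckets& old_buckets, std::size_t new_count) {
    assert(new_count > 0);
    Buckets bigger(new_count);
    for (std::forward_list<int>& bucket : old_buckets) {
        while (!bucket.empty()) {
            const std::size_t idx =
                static_cast<std::size_t>(bucket.front()) % new_count;
            // Relink the first node of the old bucket to the front of its
            // new bucket; no node is created or destroyed.
            bigger[idx].splice_after(bigger[idx].before_begin(),
                                     bucket, bucket.before_begin());
        }
    }
    return bigger;
}
```

Since the nodes themselves are moved, a pointer to an element taken before the rehash still points to the same element afterwards.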