Unordered associative containers series

1 unordered_map

In C++98, the STL provides a series of associative containers built on red-black trees underneath. Query efficiency is O(log N); in the worst case a lookup needs as many comparisons as the height of the red-black tree.

But if the red-black tree is large, O(log N) query efficiency is still a bit slow. Thus
**C++11 provides a series of unordered containers** whose query efficiency can reach O(1) on average.

There are 4 containers in the unordered series:

unordered_map
unordered_set
unordered_multimap
unordered_multiset

Introduction to unordered_map

1 unordered_map is an associative container that stores <key, value> pairs; the value to be queried can be reached quickly through its key (similar to an array subscript).

2 In unordered_map, the key uniquely identifies the element; the key and the mapped value can be of different types.

3 Internally, unordered_map does not keep its elements in any particular order.

4 unordered_map finds an individual element by its key faster than map does, but it is generally less efficient than map when iterating over a range of elements.

5 unordered_map overloads operator[], which allows direct access to a value using its key as the argument.

6 Its iterators are at least forward iterators.

2 unordered_map Interface Description

unordered_map<key_type, value_type> mp;

Constructs unordered_map objects with different key and value types.

bool empty() const // const member function; returns true if the container holds no elements

size_t size() const // returns the number of key-value pairs stored in the container

3 Iterators (there are no reverse iterators, because the STL implementation uses hash buckets plus linked lists, i.e., open hashing)
begin
end
cbegin
cend

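4 operator[]
If the key exists, it returns the corresponding value; if the key does not exist, it creates a default value.

Internally it constructs a key-value pair <key, V()> and tries to insert it.

If the insertion succeeds, it returns the new V(); if the insertion fails, the key already exists and the existing value is returned.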
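The behavior described above can be checked with a short sketch. The function name size_after_subscript_read is made up for this illustration; only std::unordered_map itself comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Returns the size of the map after reading a key that was never inserted.
// Reading through operator[] inserts <key, V()> as a side effect.
std::size_t size_after_subscript_read() {
    std::unordered_map<std::string, int> mp;
    mp["apple"] = 3;           // key absent: inserts <"apple", int()>, then assigns 3
    int missing = mp["pear"];  // key absent: inserts <"pear", int()> and returns it
    assert(missing == 0);      // int() is value-initialized to 0
    return mp.size();          // "pear" now exists because of the read
}
```

Because a plain read through operator[] can grow the container, find() is usually the safer choice when a missing key should stay missing.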

5 iterator find(const K& key) // returns an iterator to the element with this key; returns end() (not nullptr) if it does not exist

size_t count(const K& key) // returns the number of elements whose key equals key; because keys are unique in unordered_map, the result is 0 or 1

insert
erase

void clear()
void swap(unordered_map<K, V>& mp) // exchanges the contents of two unordered_map objects

size_t bucket_count() const // number of hash buckets

size_t bucket_size(size_t n) const // number of elements (nodes) in the n-th bucket

size_t bucket(const K& key) // returns the number of the bucket where the element with this key is located
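A minimal sketch of how these lookup and bucket functions fit together. The helper names value_or and bucket_queries_consistent are invented for the example; every call on the map is a standard std::unordered_map member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Looks up a key with find(); returns fallback when the key is absent.
int value_or(const std::unordered_map<std::string, int>& mp,
             const std::string& key, int fallback) {
    auto it = mp.find(key);
    return it == mp.end() ? fallback : it->second;  // absent keys yield end()
}

// Checks that the bucket queries agree with each other for a stored key.
bool bucket_queries_consistent(const std::unordered_map<std::string, int>& mp,
                               const std::string& key) {
    assert(mp.count(key) == 1);      // precondition: the key is stored
    std::size_t b = mp.bucket(key);  // bucket number holding this key
    return b < mp.bucket_count() && mp.bucket_size(b) >= 1;
}
```

The exact bucket numbers and bucket counts depend on the standard library implementation, so the sketch only checks relationships between the calls, not concrete values.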

Why is the query speed of unordered_map higher than that of map? Because its underlying structure is a hash table.

Hash

In a red-black tree or an AVL tree, there is no mapping relationship between an element's key and its storage position, so a lookup needs multiple key comparisons.

The idea behind hash structures: can the value be found directly, without any comparisons?

That question is what led to the hash structure.

The hash idea: use a function to map each key to one corresponding storage position, that is, establish a mapping relationship. This function is the hash function.

Common hash functions

1 Direct addressing method

2 Division-remainder method

Hash function design principles:
1 It should be simple
2 Its results should be distributed evenly over the table
3 Its domain must include every key that needs to be stored

The hash function is also called a hashing function, and the hash structure is called a hash table.

Hash function: hash(key) = key % capacity

Hash collision:

Different keys may be mapped to the same address by the hash function.

For example: 1, 2, 3, 12, 13, 18
Capacity is 10
2 and 12 compute the same address, so they conflict
3 conflicts with 13
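The example above can be reproduced with a small sketch of the division-remainder method. The names hash_mod and count_collisions are chosen for this illustration, and keys are assumed to be non-negative.

```cpp
#include <cstddef>
#include <vector>

// Division-remainder hash from the text: hash(key) = key % capacity.
// Assumes key >= 0 and capacity > 0.
std::size_t hash_mod(int key, std::size_t capacity) {
    return static_cast<std::size_t>(key) % capacity;
}

// Counts keys that land on an address already taken by an earlier key.
std::size_t count_collisions(const std::vector<int>& keys, std::size_t capacity) {
    std::vector<bool> used(capacity, false);
    std::size_t collisions = 0;
    for (int key : keys) {
        std::size_t addr = hash_mod(key, capacity);
        if (used[addr]) ++collisions;  // another key already hashed here
        used[addr] = true;
    }
    return collisions;
}
```

With the keys 1, 2, 3, 12, 13, 18 and capacity 10, the addresses are 1, 2, 3, 2, 3, 8, so 12 and 13 are the two keys that collide.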

Hash conflict resolution

Broadly, there are two approaches: closed hashing and open hashing.

1 Closed hashing (open addressing method)

Description: when a hash collision occurs and the hash table is not full, use linear probing to find the next "empty position" and store the conflicting element there.

Linear probing:
Premise: 1 a collision has occurred, 2 the hash table is not full

Starting from the current address, probe the following positions one by one until an empty position is found (after reaching the end of the table, continue from the head), then insert the new element there.

Deletion in closed hashing

When closed hashing is used to handle collisions, an element cannot simply be physically deleted, because a collision is resolved by probing onward from the occupied position, so later elements depend on the earlier ones.

If an element is really deleted, other elements that were placed by probing past it may no longer be found.

We therefore keep a piece of status information for each position to describe whether an element is present.
enum state {EMPTY, EXIST, DELETE};
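A minimal sketch of linear probing with the three states above, assuming non-negative int keys and a fixed capacity greater than 0 (no expansion yet). The class name ClosedHashSet, the Slot struct, and the npos constant are invented for this illustration; the enum is spelled State here to avoid clashing with the member name.

```cpp
#include <cstddef>
#include <vector>

enum State { EMPTY, EXIST, DELETE };

struct Slot {
    int key = 0;
    State state = EMPTY;
};

class ClosedHashSet {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit ClosedHashSet(std::size_t capacity) : slots_(capacity) {}

    // Returns the slot index of key, or npos if the key is absent.
    std::size_t find(int key) const {
        std::size_t i = home(key);
        for (std::size_t n = 0; n < slots_.size(); ++n) {
            if (slots_[i].state == EMPTY) return npos;  // probe chain ends here
            if (slots_[i].state == EXIST && slots_[i].key == key) return i;
            i = (i + 1) % slots_.size();  // DELETE slots are stepped over
        }
        return npos;
    }

    bool insert(int key) {
        if (size_ == slots_.size() || find(key) != npos) return false;  // full or duplicate
        std::size_t i = home(key);
        while (slots_[i].state == EXIST) i = (i + 1) % slots_.size();  // linear probing, wraps around
        slots_[i].key = key;
        slots_[i].state = EXIST;
        ++size_;
        return true;
    }

    bool erase(int key) {
        std::size_t i = find(key);
        if (i == npos) return false;
        slots_[i].state = DELETE;  // mark only, so later elements stay reachable
        --size_;
        return true;
    }

private:
    std::size_t home(int key) const { return static_cast<std::size_t>(key) % slots_.size(); }

    std::vector<Slot> slots_;
    std::size_t size_ = 0;
};
```

For example, with capacity 10 the keys 2, 12, 22 occupy positions 2, 3, 4. After erasing 12, position 3 is marked DELETE rather than EMPTY, so a search for 22 still probes past it and succeeds.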

Expansion mechanism of closed hashing

Why expand
During probing we find that once the number of elements in the hash table reaches a certain level, hash collisions become very noticeable, and the more collisions there are, the more they pile up. Therefore, at a certain point the hash table needs to be expanded.

When to expand

Load factor α of a hash table
α = number of elements stored / capacity of the hash table

In closed hashing (open addressing method), it has been found that once the load factor reaches 0.7 the hit rate of linear probing becomes poor, so when α >= 0.7, expand right away!

How to expand:

Research has also found that when the size of the hash table is a prime number, the probability of collisions in linear probing is relatively low.

With conventional expansion, capacity generally grows by 2 times on Linux and typically by 1.5 times in VS. Here we uniformly expand by 2 times.

Using a table of prime numbers in which each entry is roughly 2 times the previous one, the target prime capacity for an expansion can be obtained quickly.

Our hash table is, plainly speaking, a vector,
so we can borrow the idea of vector expansion.

1 Allocate new space

2 Transfer the elements

3 Release the old space

Note: after expansion the capacity of the hash table has changed, so the elements cannot be copied over position by position;
each element must be re-inserted, with its position recalculated by the hash function for the new capacity.
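The three expansion steps can be sketched as follows. This is only an illustration under several assumptions: keys are non-negative ints, -1 marks an empty slot (deletion is left out), the prime list kPrimes is a hypothetical table where each entry is roughly twice the previous one, and the names ProbingTable, next_prime, place, and grow are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical prime table: each entry is roughly twice the previous one.
const std::size_t kPrimes[] = {11, 23, 47, 97, 193, 389, 769, 1543, 3079, 6151};
const int kEmpty = -1;  // assumption: keys are non-negative, -1 marks an empty slot

// Returns the first prime in kPrimes larger than current (or the last entry).
std::size_t next_prime(std::size_t current) {
    for (std::size_t p : kPrimes)
        if (p > current) return p;
    return kPrimes[sizeof(kPrimes) / sizeof(kPrimes[0]) - 1];
}

struct ProbingTable {
    std::vector<int> slots = std::vector<int>(11, kEmpty);
    std::size_t size = 0;

    // Linear probing into the given table; returns false for a duplicate key.
    static bool place(std::vector<int>& table, int key) {
        std::size_t i = static_cast<std::size_t>(key) % table.size();
        while (table[i] != kEmpty) {
            if (table[i] == key) return false;
            i = (i + 1) % table.size();
        }
        table[i] = key;
        return true;
    }

    void grow() {
        std::vector<int> bigger(next_prime(slots.size()), kEmpty);  // 1 allocate new space
        for (int key : slots)                                       // 2 transfer the elements,
            if (key != kEmpty) place(bigger, key);                  //   re-hashed for the new capacity
        slots.swap(bigger);  // 3 the old space is released when bigger goes out of scope
    }

    void insert(int key) {
        if (size * 10 >= slots.size() * 7) grow();  // load factor >= 0.7: expand first
        if (place(slots, key)) ++size;
    }

    bool contains(int key) const {
        std::size_t i = static_cast<std::size_t>(key) % slots.size();
        while (slots[i] != kEmpty) {  // terminates because the table is never full
            if (slots[i] == key) return true;
            i = (i + 1) % slots.size();
        }
        return false;
    }
};
```

Note that grow() calls place() for every old element instead of copying slot by slot: a key such as 22 sits at address 0 when the capacity is 11 but at address 22 when the capacity is 23.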

Advantages of linear probing:
The idea is simple

Disadvantages:
1 Low space utilization
2 Once collisions occur they tend to pile up, which hurts search efficiency.

Quadratic probing

The purpose of quadratic probing is to solve the pile-up problem that linear probing suffers from when hash collisions occur.

Idea: widen the spacing between probes

When the hash function computes an address H0
and there is a conflict, the next position is calculated as

H = (H0 + i^2) % capacity
or H = (H0 - i^2) % capacity, for i = 1, 2, 3, ...

Objectively speaking, quadratic probing is no cure-all either: once the load factor reaches a certain level, collisions will still increase.

Research shows that when the table size is a prime number and the load factor does not exceed 0.5, a new element can always be inserted and no position is probed twice, so it is best to expand before the load factor exceeds 0.5.

So clearly, with quadratic probing the space utilization has to stay at or below 0.5 in order to guarantee the efficiency of closed hashing.
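A small sketch of the H0 + i^2 variant only, assuming non-negative int keys and -1 as the empty marker. The function names quadratic_probe and insert_quadratic are invented for this example, and expansion is left out.

```cpp
#include <cstddef>
#include <vector>

// Address of the i-th probe: H = (H0 + i^2) % capacity.
std::size_t quadratic_probe(std::size_t h0, std::size_t i, std::size_t capacity) {
    return (h0 + i * i) % capacity;
}

// Inserts key into a table where -1 marks an empty slot.
// Returns the slot used, or table.size() if no free slot is found
// within table.size() probes.
std::size_t insert_quadratic(std::vector<int>& table, int key) {
    std::size_t capacity = table.size();
    std::size_t h0 = static_cast<std::size_t>(key) % capacity;
    for (std::size_t i = 0; i < capacity; ++i) {
        std::size_t pos = quadratic_probe(h0, i, capacity);
        if (table[pos] == -1) {
            table[pos] = key;
            return pos;
        }
    }
    return capacity;  // gave up: the probe sequence found no empty slot
}
```

With capacity 11, the keys 0, 11, 22, 33, 44 all hash to address 0 and end up at positions 0, 1, 4, 9, 5, so conflicting keys are spread out instead of forming one contiguous run as in linear probing.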

Open hashing

Open hashing is also known as the chained address method (separate chaining)

Idea: hash buckets + linked lists

[Figure: open hashing, hash buckets with linked lists]

The figure makes the structure clear at a glance.

In open hashing, inserting, querying, and removing an element simply turn into operations on one linked list, which is much easier.

To check whether an element exists, open hashing only needs to traverse one linked list. Although a bucket's list has to be traversed, there are many buckets and each list is short, so the lookup time is generally considered O(1).

Then, when is open hashing at its most efficient?

Of course, when every bucket holds at most one element, so that no traversal is needed.

In other words, efficiency is highest when the number of buckets bucket_count >= _size.

If we expand as soon as the number of elements exceeds the number of buckets, efficiency stays close to its maximum, so lookups really are O(1) on average.

So how does open hashing expand? Can the vector idea be used again?

In fact it can, but done naively the efficiency is not great.

1 The vector stores the addresses of the linked lists. If it is copied, only the addresses are copied, which is similar to a shallow copy.

After the old vector is released, the original data can no longer be found through it,
so we still have to insert the elements one by one, recalculating the bucket of each.

2 When inserting into the new table, avoid creating a new node and then releasing the old one every time; reuse the existing nodes by relinking them into the new buckets.
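The two points above can be sketched as follows. This is an illustration only: keys are assumed to be non-negative ints, the table starts with 11 buckets and doubles on expansion, and the names ChainedHashSet, Node, index, and grow are invented for the example.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int key;
    Node* next;
};

class ChainedHashSet {
public:
    ChainedHashSet() : buckets_(11, nullptr) {}
    ~ChainedHashSet() {
        for (Node* head : buckets_)
            while (head) { Node* next = head->next; delete head; head = next; }
    }
    ChainedHashSet(const ChainedHashSet&) = delete;
    ChainedHashSet& operator=(const ChainedHashSet&) = delete;

    bool insert(int key) {
        if (contains(key)) return false;
        if (size_ == buckets_.size()) grow();  // elements would outnumber buckets
        std::size_t b = index(key, buckets_.size());
        buckets_[b] = new Node{key, buckets_[b]};  // push onto the front of the list
        ++size_;
        return true;
    }

    bool contains(int key) const {
        for (Node* cur = buckets_[index(key, buckets_.size())]; cur; cur = cur->next)
            if (cur->key == key) return true;
        return false;
    }

    std::size_t bucket_count() const { return buckets_.size(); }
    std::size_t size() const { return size_; }

private:
    static std::size_t index(int key, std::size_t n) {
        return static_cast<std::size_t>(key) % n;
    }

    // Moves every existing node into a larger bucket array without new/delete.
    void grow() {
        std::vector<Node*> bigger(buckets_.size() * 2, nullptr);
        for (Node*& head : buckets_) {
            while (head) {
                Node* node = head;
                head = node->next;  // detach from the old bucket
                std::size_t b = index(node->key, bigger.size());
                node->next = bigger[b];  // relink into its new bucket
                bigger[b] = node;
            }
        }
        buckets_.swap(bigger);  // the old array now holds only null pointers
    }

    std::vector<Node*> buckets_;
    std::size_t size_ = 0;
};
```

During grow() each node is detached and relinked, so the only allocation is the new bucket array itself. Doubling is used here for brevity; a prime-sized bucket count could be substituted.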

Origin blog.csdn.net/weixin_44030580/article/details/104774677