[C++] The use of unordered_map and unordered_set

Foreword

Unordered associative containers:
In C++98, the STL provided a family of associative containers (map, set, multimap, multiset) that are typically implemented on top of a red-black tree. Lookup takes O(logN): in the worst case the number of comparisons equals the height of the tree, so queries slow down as the number of nodes grows. Ideally an element should be found with only a few comparisons, so C++11 added four associative containers in the unordered series: unordered_map, unordered_set, unordered_multimap and unordered_multiset. They are used in much the same way as the tree-based containers; the main difference is the underlying structure.
1. unordered_map

Let's compare unordered_map with map:

Their interfaces are nearly identical; what differs is the underlying structure. Note: map and set provide bidirectional iterators, while the unordered series only guarantees forward iterators, so operations that depend on ordering or backward traversal (such as rbegin() or lower_bound()) exist only on map and set.
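The iterator difference noted above can be checked at compile time. The following is a minimal sketch, not part of the original article; the alias names and the helper largest_key are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <type_traits>
#include <unordered_map>

// Iterator category tags reported by each container's iterator type.
using MapCategory =
    std::iterator_traits<std::map<std::string, int>::iterator>::iterator_category;
using UnorderedCategory =
    std::iterator_traits<std::unordered_map<std::string, int>::iterator>::iterator_category;

// map iterators are bidirectional: both ++it and --it are valid.
static_assert(std::is_base_of<std::bidirectional_iterator_tag, MapCategory>::value,
              "map iterators are at least bidirectional");

// The standard only promises forward iterators for unordered_map, so portable
// code should rely on ++it alone (an implementation may happen to offer more).
static_assert(std::is_base_of<std::forward_iterator_tag, UnorderedCategory>::value,
              "unordered_map iterators are at least forward");

// Hypothetical helper: returns the largest key by stepping back from end().
// This relies on bidirectional iterators and on sorted order, so it is
// written for the ordered container only.
std::string largest_key(const std::map<std::string, int>& m)
{
    assert(!m.empty());  // precondition: there must be a last element
    auto it = m.end();
    --it;
    return it->first;
}
```

Calling largest_key on a map holding "left", "right" and "list" returns "right", because map keeps its keys sorted.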

map / set: the bottom layer is a red-black tree

unordered_map / unordered_set: the bottom layer is a hash table

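Why provide both? Because a hash table finds an element in O(1) time on average (O(N) in the worst case, when many keys collide), compared with O(logN) for a red-black tree.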
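The hash table behind the unordered containers is partly visible through their bucket interface. The sketch below is an illustration added here, with a hypothetical helper name; it uses only the standard members reserve(), load_factor() and max_load_factor().

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical helper: fills a set with n distinct integers and reports
// whether the table stayed within its configured maximum load factor.
bool load_factor_within_limit(std::size_t n)
{
    std::unordered_set<int> us;
    us.reserve(n);  // request enough buckets up front to avoid rehashing
    for (std::size_t i = 0; i < n; ++i)
    {
        us.insert(static_cast<int>(i));
    }
    assert(us.size() == n);  // all inserted values were distinct

    // load_factor() is size() / bucket_count(). The container adds buckets
    // (rehashes) rather than let it exceed max_load_factor().
    return us.load_factor() <= us.max_load_factor();
}
```

Keeping the load factor bounded is what keeps the average bucket short, and therefore what makes the average lookup constant time.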

Let's move on to usage:


1. The use of unordered_map

#include <cstdlib>
#include <ctime>
#include <iostream>
#include <set>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

void unordered_map_test()
{
	unordered_map<string, int> ump;
	ump.insert(make_pair("left", 1));
	ump.insert(make_pair("right", 2));
	ump.insert(make_pair("string", 3));
	ump.insert(make_pair("list", 4));
	ump.insert(make_pair("list", 5)); // key "list" already exists, so this insert is ignored
	for (auto& e : ump)
	{
		cout << e.first << ":" << e.second << endl;
	}
	cout << endl;
	unordered_map<string, int>::iterator it = ump.begin();
	while (it != ump.end())
	{
		cout << it->first << ":" << it->second << endl;
		++it;
	}
}
int main()
{
	unordered_map_test();
}
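The last insert in the test above reuses the key "list". A minimal sketch of how that case can be detected, and of how unordered_multimap treats it, follows; the two function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Returns true if a second insert with an existing key was rejected.
bool duplicate_key_rejected()
{
    std::unordered_map<std::string, int> ump;
    auto first = ump.insert(std::make_pair("list", 4));
    assert(first.second);  // the key was new, so the insert succeeded

    // insert() returns pair<iterator, bool>; the bool is false when the key
    // already exists, and the iterator points at the existing element.
    auto second = ump.insert(std::make_pair("list", 5));
    return !second.second && second.first->second == 4 && ump.size() == 1;
}

// Returns how many entries share the key "list" in an unordered_multimap.
std::size_t multimap_duplicate_count()
{
    std::unordered_multimap<std::string, int> umm;
    umm.insert(std::make_pair("list", 4));
    umm.insert(std::make_pair("list", 5));  // kept as a second entry
    return umm.count("list");
}
```

To overwrite the stored value instead of keeping the old one, operator[] can be used, for example ump["list"] = 5.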

As the output shows, unordered_map is used in the same way as map, and an element whose key already exists is not inserted. There is also an unordered_multimap series that does accept duplicate keys. So what is the main difference from map?
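One way to see the difference is to collect the keys of both containers in iteration order. This is a sketch under the assumption that both containers hold the same four entries; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collects the keys of any map-like container in the order iteration visits them.
template <class Map>
std::vector<std::string> keys_in_iteration_order(const Map& m)
{
    std::vector<std::string> keys;
    for (const auto& kv : m)
    {
        keys.push_back(kv.first);
    }
    return keys;
}

// Hypothetical demo: the same entries stored in a map and an unordered_map.
bool same_keys_sorted_only_in_map()
{
    std::map<std::string, int> m = {{"left", 1}, {"right", 2}, {"string", 3}, {"list", 4}};
    std::unordered_map<std::string, int> um(m.begin(), m.end());

    std::vector<std::string> ordered = keys_in_iteration_order(m);
    std::vector<std::string> hashed = keys_in_iteration_order(um);
    assert(ordered.size() == hashed.size());

    // map always iterates in ascending key order.
    bool map_sorted = std::is_sorted(ordered.begin(), ordered.end());

    // unordered_map holds the same keys, but their order is unspecified,
    // so sort a copy before comparing the contents.
    std::sort(hashed.begin(), hashed.end());
    return map_sorted && ordered == hashed;
}
```

The order produced by unordered_map depends on the hash function and the bucket count, so it can differ between standard library implementations and should never be relied on.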

The main visible difference is that the unordered series does not keep its elements sorted: iteration order is unspecified. Below is a performance test comparing set with unordered_set:

void unordered_map_test2()
{
	const size_t N = 1000000;

	unordered_set<int> us;
	set<int> s;

	vector<int> v;
	v.reserve(N);
	srand(time(0));
	for (size_t i = 0; i < N; ++i)
	{
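		v.push_back(rand());      // random values; rand() can produce many duplicates
		//v.push_back(rand()+i);  // alternative: far fewer duplicates
		//v.push_back(i);         // alternative: already sorted input, no duplicates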
	}

	size_t begin1 = clock();
	for (auto e : v)
	{
		s.insert(e);
	}
	size_t end1 = clock();
	cout << "set insert:" << end1 - begin1 << endl;

	size_t begin2 = clock();
	for (auto e : v)
	{
		us.insert(e);
	}
	size_t end2 = clock();
	cout << "unordered_set insert:" << end2 - begin2 << endl;


	size_t begin3 = clock();
	for (auto e : v)
	{
		s.find(e);
	}
	size_t end3 = clock();
	cout << "set find:" << end3 - begin3 << endl;

	size_t begin4 = clock();
	for (auto e : v)
	{
		us.find(e);
	}
	size_t end4 = clock();
	cout << "unordered_set find:" << end4 - begin4 << endl << endl;

	cout << s.size() << endl;
	cout << us.size() << endl << endl;

	size_t begin5 = clock();
	for (auto e : v)
	{
		s.erase(e);
	}
	size_t end5 = clock();
	cout << "set erase:" << end5 - begin5 << endl;

	size_t begin6 = clock();
	for (auto e : v)
	{
		us.erase(e);
	}
	size_t end6 = clock();
	cout << "unordered_set erase:" << end6 - begin6 << endl << endl;
}

 

Let's start with N set to 10,000 random numbers:

In this run, insert, find and erase were all faster on the unordered series than on the tree-based containers. Next, 100,000 random numbers:

The unordered series was still faster. And with 1,000,000:

These results show why C++ added the hash-based series. The following data was measured on another machine:

Summary: across these scenarios the unordered series performed better overall, with find showing the largest advantage.
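To reproduce the find comparison with a finer-grained clock, one possible variant uses std::chrono and a seeded std::mt19937 instead of clock() and rand(). This is a sketch, not the measurement code used above; all names in it are hypothetical and it makes no claim about what the timings will be on a given machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical helper: n pseudo-random values from a fixed seed, so runs are repeatable.
std::vector<int> make_values(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 1000000000);
    std::vector<int> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(dist(gen));
    }
    return v;
}

struct FindResult
{
    std::size_t hits;        // number of successful lookups
    long long microseconds;  // elapsed wall time
};

// Times find() over all queries. Returning the hit count gives the lookups
// an observable result, so they are less likely to be optimized away.
template <class Set>
FindResult time_find(const Set& s, const std::vector<int>& queries)
{
    auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (int q : queries)
    {
        if (s.find(q) != s.end())
        {
            ++hits;
        }
    }
    auto stop = std::chrono::steady_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return FindResult{hits, static_cast<long long>(elapsed.count())};
}

// Hypothetical driver: returns true if both containers found every query.
bool both_find_everything(std::size_t n)
{
    std::vector<int> v = make_values(n, 12345u);
    std::set<int> s(v.begin(), v.end());
    std::unordered_set<int> us(v.begin(), v.end());
    assert(s.size() == us.size());  // both containers drop the same duplicates

    FindResult tree = time_find(s, v);
    FindResult hash = time_find(us, v);
    // tree.microseconds and hash.microseconds are the figures to print and compare.
    return tree.hits == v.size() && hash.hits == v.size();
}
```

Results depend on the compiler, optimization level and input distribution, so the test is best run in a release build and repeated several times.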

2. The use of unordered_set

Similarly, let's see whether set and the hash-based set differ in functionality:

As noted earlier, set provides bidirectional iterators while the hash series provides forward iterators. Apart from the iterator difference, the interfaces are almost the same. Let's demonstrate the basic operations:

void unordered_set_test()
{
	unordered_set<int> ust;
	ust.insert(1);
	ust.insert(7);
	ust.insert(4);
	ust.insert(9);
	ust.insert(3);
	for (auto& e : ust)
	{
		cout << e << " ";
	}
	cout << endl;
	unordered_set<int>::iterator it = ust.begin();
	while (it != ust.end())
	{
		cout << *it << " ";
		++it;
	}
}
int main()
{
	unordered_set_test();
}
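Beyond insertion and traversal, lookup and erasure work as they do on set. A minimal sketch with a hypothetical function name:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Returns the number of elements left after the lookups and erasures below.
std::size_t unordered_set_membership_demo()
{
    std::unordered_set<int> ust = {1, 7, 4, 9, 3};

    bool has_seven = ust.count(7) == 1;         // count() is 0 or 1 for a set
    bool has_five = ust.find(5) != ust.end();   // find() returns end() when absent
    assert(has_seven);
    assert(!has_five);

    // erase(key) returns how many elements were removed.
    std::size_t removed_first = ust.erase(4);
    std::size_t removed_again = ust.erase(4);
    assert(removed_first == 1);
    assert(removed_again == 0);  // 4 was already gone

    return ust.size();
}
```

Starting from five elements and removing one leaves four, so the function returns 4.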

As with map, the only visible difference from set is that the elements are not in sorted order. That covers the basic use of the hash series. Next, let's use it to practice on a problem:

Exercise:

Intersection of Two Arrays:

Problem link: LeetCode

The intersection consists of the elements that appear in both arrays, and the problem requires every element of the result to be unique, so the output must contain no duplicates.

class Solution {
public:
    vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
       unordered_set<int> us1(nums1.begin(),nums1.end());
       unordered_set<int> us2(nums2.begin(),nums2.end());
       vector<int> v;
       for (auto& e:us1)
       {
           if (us2.find(e)!=us2.end())
           {
                v.push_back(e);
           }
       }
       return v;
    }
};

First we put the numbers of both arrays into two hash sets, which also removes duplicates. Then we traverse the first hash set (traversing the second works just as well) and look up each value in the second hash set. If the value is found, it belongs to the intersection, so we append it to the result array. Finally we return the array.
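The same result can be obtained with a single hash set. The following variant is not the solution above but an alternative sketch; the function names are hypothetical and the sample inputs in check_examples are chosen here for illustration.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Alternative sketch: build one set from nums1, then scan nums2 directly.
std::vector<int> intersection_one_set(const std::vector<int>& nums1,
                                      const std::vector<int>& nums2)
{
    std::unordered_set<int> seen(nums1.begin(), nums1.end());
    std::vector<int> result;
    for (int x : nums2)
    {
        // erase() returns 1 only the first time x is matched, so each common
        // value is reported once even if nums2 repeats it.
        if (seen.erase(x) == 1)
        {
            result.push_back(x);
        }
    }
    return result;
}

// Usage example with two small inputs.
void check_examples()
{
    std::vector<int> a = intersection_one_set({1, 2, 2, 1}, {2, 2});
    assert(a.size() == 1);
    assert(a[0] == 2);

    // Results follow the order in which matches appear in nums2.
    std::vector<int> b = intersection_one_set({4, 9, 5}, {9, 4, 9, 8, 4});
    assert(b.size() == 2);
    assert(b[0] == 9);
    assert(b[1] == 4);
}
```

This trades the second set for a destructive erase on the first, which saves memory when nums2 is large.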


Summary

The hash-series containers are used in almost the same way as the containers covered earlier, and STL containers become easier the more often you use them. In the next article we will explain the underlying principle of the hash table and implement it ourselves. To implement the underlying layer, you first need to know how the container is used, so that you know which functions to implement.

Origin blog.csdn.net/Sxy_wspsby/article/details/130786386