[C++] Hashing: Open Hashing | Encapsulating the unordered series of containers

Foreword: In the previous blog we implemented a hash table with closed hashing (open addressing). In the STL, however, the bottom layer of unordered_set and unordered_map implements hashing with open hashing. So in this blog we will implement a hash table with open hashing and then use it to encapsulate unordered_set and unordered_map.

1. Open hashing

1. The concept of open hashing

The open hashing method is also called the chained address method (the zipper method, or hash buckets). First, a hash function is used to compute a hash address for each key in the key set. Keys with the same address belong to the same sub-set, and each sub-set is called a bucket. The elements in each bucket are linked together with a singly linked list, and the head node of each linked list is stored in the hash table.

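For intuition, here is a minimal standalone sketch (not the implementation below — it just borrows std::forward_list for the chains, and the table size 10 and keys are arbitrary) showing how keys that collide under hash(key) % table_size end up chained in the same bucket:

#include <iostream>
#include <vector>
#include <forward_list>
using namespace std;

int main()
{
	// A toy table of 10 buckets, each bucket a singly linked list.
	vector<forward_list<int>> table(10);
	for (int key : { 5, 15, 25, 7, 17 })
	{
		size_t hashi = key % table.size();   // 5, 15 and 25 all map to bucket 5
		table[hashi].push_front(key);        // head insert, as the hash table below also does
	}
	for (size_t i = 0; i < table.size(); ++i)
	{
		cout << "bucket " << i << ":";
		for (int key : table[i]) cout << " " << key;
		cout << endl;
	}
	return 0;
}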

2. Open hash structure

First, we use a vector to store the head node of each linked list; each node has a data field and a next pointer field. Then we write the constructor of HashNode, which builds a node from a pair.

template <class K, class V>
struct HashNode
{
	HashNode(const pair<K, V>& kv)
		:_kv(kv), _next(nullptr)
	{}

	pair<K, V> _kv;
	HashNode<K, V>* _next;
};
 
template <class K, class V, class Hash = HashFunc<K>>
class HashTable
{
public:
	typedef HashNode<K, V> Node;
private:
	vector<Node*> _table;
	size_t _size = 0;
};

3. Insert

First we implement the main logic of insertion, and then optimize it step by step.

We compute the mapped position by taking the key through the hash functor modulo the table size, create a node from kv, and head-insert it into that bucket's linked list.

bool Insert(const pair<K, V>& kv)
{
	Hash hash;
	size_t hashi = hash(kv.first) % _table.size();
	// head insert into the bucket
	Node* newNode = new Node(kv);
	newNode->_next = _table[hashi];
	_table[hashi] = newNode;
	++_size;
	return true;
}

Inserting data into a hash table must first ensure the uniqueness of the keys, so we need to perform deduplication first; while we are at it, we implement the Find function.

4. Find

Find the mapped position from the key value. If that position is not empty, traverse the linked list; if the key is found, return the cur node, otherwise keep walking backward until cur becomes empty and return nullptr.

Node* Find(const K& key)
{
	if (_table.size() == 0) return nullptr;
	Hash hash;
	size_t hashi = hash(key) % _table.size();
	// search within the bucket
	Node* cur = _table[hashi];
	while (cur)
	{
		if (cur->_kv.first == key)
		{
			return cur;
		}
		cur = cur->_next;
	}
	return nullptr;
}

5. Insert expansion

The main logic of insertion and the deduplication check are now in place; the next step is expanding the table.

If the size of the hash table is 0, or the load-factor threshold is reached, the table needs to be expanded.

Let's take a look at how the load factor is controlled in the STL library.


The load factor used in the STL library is 1: when the number of elements inserted into the table exceeds the number of buckets, the capacity is expanded, and the new table size is computed by next_size.

Pay attention to how the data is moved during expansion. Because the number of nodes hanging on each bucket differs, and after expansion each element may map to a different position in the new table, we cannot simply reuse Insert the way closed hashing did; we have to relink the existing nodes into the new table ourselves.

When moving, relink the nodes of the old table into the new table one by one:

// expansion --- when the number of inserted elements equals the table size
if (_size == _table.size())
{
	size_t newSize = _table.size() == 0 ? 10 : _table.size() * 2;
	vector<Node*> newTable;
	newTable.resize(newSize, nullptr);
	// move the nodes of the old table into the new table
	Hash hash;
	for (size_t i = 0; i < _table.size(); i++)
	{
		Node* cur = _table[i];
		while (cur)
		{
			Node* next = cur->_next;
			size_t hashi = hash(cur->_kv.first) % newSize;  // recompute against the new size
			cur->_next = newTable[hashi];
			newTable[hashi] = cur;
			cur = next;
		}
		// clear slot i of the old table
		_table[i] = nullptr;
	}
	_table.swap(newTable);
}


In the SGI STL source, expansion calls a function named next_size to compute the new table size. Wouldn't simply doubling the size be enough? Why compute it specially?

Because the size of a hash table is best kept a prime number. With a prime table size, the results of the modulo mapping are less likely to collide; a composite size has many factors, so keys sharing those factors all map to the same positions and pile up. Designing the table size as a prime therefore keeps individual buckets from growing too long.

See this article for details: Algorithm Analysis: Why is the size of the hash table a prime number
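A tiny illustration of the idea (sizes and keys chosen arbitrarily for this sketch): keys that share a factor with a composite table size pile up in a few buckets, while a prime table size spreads them out.

#include <iostream>
#include <set>
#include <vector>
using namespace std;

// Count how many distinct buckets a group of keys occupies for a given table size.
size_t distinct_buckets(const vector<int>& keys, size_t table_size)
{
	set<size_t> buckets;
	for (int key : keys)
		buckets.insert(key % table_size);
	return buckets.size();
}

int main()
{
	vector<int> keys{ 6, 12, 18, 24, 30, 36, 42, 48 };  // all multiples of 6
	cout << "size 12: " << distinct_buckets(keys, 12) << " buckets used" << endl;  // 2 (only buckets 0 and 6)
	cout << "size 13: " << distinct_buckets(keys, 13) << " buckets used" << endl;  // 8, fully spread out
	return 0;
}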

Now we also add this functionality:

The library uses lower_bound (returns the first element greater than or equal to n) / upper_bound (returns the first element greater than n) to pick the next prime. In fact, a plain for loop over the array works just as well; a binary-search-style sketch is shown after the code below.

inline size_t __stl_next_prime(size_t n)
{
	static const size_t __stl_num_primes = 28;
	static const size_t  __stl_prime_list[__stl_num_primes] =
	{
	  53,         97,         193,       389,       769,
	  1543,       3079,       6151,      12289,     24593,
	  49157,      98317,      196613,    393241,    786433,
	  1572869,    3145739,    6291469,   12582917,  25165843,
	  50331653,   100663319,  201326611, 402653189, 805306457,
	  1610612741, 3221225473, 4294967291
	};
	// pick the size of the next expansion
	for (size_t i = 0; i < __stl_num_primes; i++)
	{
		if (__stl_prime_list[i] > n)
			return __stl_prime_list[i];
	}
	return (size_t)-1;
}
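For reference, here is a sketch of the binary-search style the library uses, written as a free-standing function over the same prime list (this is a rough rewrite, not the STL's exact code; std::upper_bound returns the first element strictly greater than n, which matches the loop version above):

#include <algorithm>
#include <cstddef>

inline size_t next_prime_upper_bound(size_t n)
{
	static const size_t prime_list[] =
	{
	  53,         97,         193,       389,       769,
	  1543,       3079,       6151,      12289,     24593,
	  49157,      98317,      196613,    393241,    786433,
	  1572869,    3145739,    6291469,   12582917,  25165843,
	  50331653,   100663319,  201326611, 402653189, 805306457,
	  1610612741, 3221225473, 4294967291
	};
	const size_t* first = prime_list;
	const size_t* last = prime_list + sizeof(prime_list) / sizeof(prime_list[0]);
	// first prime strictly greater than n; fall back to the largest prime if n is already past the end
	const size_t* pos = std::upper_bound(first, last, n);
	return pos == last ? *(last - 1) : *pos;
}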

6. Erase

Although we have implemented the Find function, deletion cannot be completed with Find alone.

For example, when the node to delete sits in the middle of a singly linked list, we need to know its prev node in order to unlink it.


bool Erase(const K& key)
{
	if (_table.size() == 0) return false;
	Hash hash;
	size_t hashi = hash(key) % _table.size();
	Node* pre = nullptr;
	Node* cur = _table[hashi];
	while (cur)
	{
		if (cur->_kv.first == key)
		{
			// 1. the node to delete is the first one in the chain --- head delete
			if (pre == nullptr)
			{
				_table[hashi] = cur->_next;
			}
			// 2. delete from the middle of the chain
			else
			{
				pre->_next = cur->_next;
			}
			delete cur;
			--_size;
			return true;
		}
		pre = cur;
		cur = cur->_next;
	}
	return false;
}

7. Destructor

Note that the destructor is called when the hash table's lifetime ends. The vector we use releases the table itself automatically, but the vector only stores pointers to the linked lists, so when the table is destroyed we must also release every bucket (linked list) ourselves. Therefore we have to write the destructor by hand.

~HashTable()
{
	for (size_t i = 0; i < _table.size(); ++i)
	{
		Node* cur = _table[i];
		while (cur)
		{
			Node* next = cur->_next;
			delete cur;
			cur = next;
		}
		_table[i] = nullptr;
	}
}

8. Other function interfaces

// length of the table
size_t BucketSize()
{
	return _table.size();
}
// number of elements
size_t Size()
{
	return _size;
}
// number of non-empty buckets
size_t BucketNum()
{
	size_t Num = 0;
	for (size_t i = 0; i < BucketSize(); i++)
	{
		if (_table[i]) Num++;
	}
	return Num;
}
// length of the longest bucket
size_t MaxBucketLenth()
{
	size_t Max_len = 0;
	size_t temp = 0;
	for (size_t i = 0; i < BucketSize(); i++)
	{
		if (_table[i])
		{
			size_t len = 1;
			Node* cur = _table[i]->_next;
			while (cur)
			{
				len++;
				cur = cur->_next;
			}
			if (len > Max_len)
			{
				Max_len = len;
				temp = i;
			}
		}
	}
	printf("Max_len_i:[%zu]\n", temp);
	return Max_len;
}

9. Performance testing

void TestHT()
{
	int n = 18000000;
	vector<int> v;
	v.reserve(n);
	srand((unsigned int)time(0));
	for (int i = 0; i < n; ++i)
	{
		v.push_back(rand() + i);  // few duplicates
		//v.push_back(rand());    // many duplicates
	}
	size_t begin1 = clock();
	HashTable<int, int> ht;
	for (auto e : v)
	{
		ht.Insert(make_pair(e, e));
	}
	size_t end1 = clock();
 
	cout << "number of elements: " << ht.Size() << endl;
	cout << "table length: " << ht.BucketSize() << endl;
	cout << "number of buckets: " << ht.BucketNum() << endl;
	cout << "average bucket length: " << (double)ht.Size() / (double)ht.BucketNum() << endl;
	cout << "longest bucket length: " << ht.MaxBucketLenth() << endl;
	cout << "load factor: " << (double)ht.Size() / (double)ht.BucketSize() << endl;
}


We find that after setting the table size to a prime, even with the load factor around 0.9 the longest bucket holds only 2 elements, so lookups in the hash table are effectively O(1).

Next, we compare the search efficiency of the red-black tree and the hash table (searching 10 million elements).


The insertion efficiency of the hash table is lower here because a lot of time is spent expanding the table and moving data.

Next, we use set and unordered_set (backed by the red-black tree and the hash table respectively), insert 10 million random numbers into each, and compare their performance; we also compare unordered_set inserting directly against unordered_set that reserves capacity in advance.


The test code is as follows:

void test_op()
{
	int n = 10000000;   // 10 million elements
	vector<int> v;
	v.reserve(n);
	srand((unsigned int)time(0));
	for (int i = 0; i < n; ++i)
	{
		//v.push_back(i);
		v.push_back(rand() ^ 1311 * 144 + i);
	}
 
	size_t begin1 = clock();
	set<int> s;
	for (auto e : v)
	{
		s.insert(e);
	}
	size_t end1 = clock();
 
	size_t begin2 = clock();
 
	unordered_set<int> us;
	us.reserve(n);
 
	for (auto e : v)
	{
		us.insert(e);
	}
	size_t end2 = clock();
 
	cout << "number of valid elements: " << s.size() << endl;
	cout << "\nInsert:" << endl;
	cout << "set : " << end1 - begin1 << endl;
	cout << "unordered_set : " << end2 - begin2 << endl;
 
	size_t begin3 = clock();
	for (auto e : v)
	{
		s.find(e);
	}
	size_t end3 = clock();
 
	size_t begin4 = clock();
	for (auto e : v)
	{
		us.find(e);
	}
	size_t end4 = clock();
 
	cout << "\nFind:" << endl;
	cout << "set :" << end3 - begin3 << endl;
	cout << "unordered_set :" << end4 - begin4 << endl;
 
 
	size_t begin5 = clock();
	for (auto e : v)
	{
		s.erase(e);
	}
	size_t end5 = clock();
	size_t begin6 = clock();
	for (auto e : v)
	{
		us.erase(e);
	}
	size_t end6 = clock();
	cout << "\nErase:" << endl;
 
	cout << "set erase:" << end5 - begin5 << endl;
	cout << "unordered_set erase:" << end6 - begin6 << endl;
}

The above is the basic implementation of our open-hash table. With these functions in place, we can encapsulate unordered_map/unordered_set.

2. Encapsulation

1. The internal structure

The first thing to change is the data type stored in each node of HashTable. unordered_set stores only the key, while unordered_map stores a pair, so we change the type stored in the node to a template type T: for set, T is the key; for map, T is the pair.

template <class T>
struct HashNode
{
	HashNode(const T& data)
		:_data(data), _next(nullptr)
	{}
 
	T _data;
	HashNode<T>* _next;
};

Accordingly, the parameter type of Insert is also changed to the template type T, and whenever the key inside the stored value is needed, a functor is used to extract the data to compare.

Next we write the unordered_set (and unordered_map) classes.

The bottom layer of the unordered containers is the HashTable we wrote, so we simply define a HashTable member variable with the right template arguments. (Below, "set" and "map" are short for the unordered_set and unordered_map implemented on top of our hash table.)

Note that set is a Key model, so it has only one template parameter, while map is a KV model and needs two template parameters for the two types inside the pair. At the bottom layer we therefore pass HashTable two template parameters and let the second one determine the stored type: for set the functor extracts the key itself, and for map the functor extracts pair.first.

So before passing in these parameters we first write the functors setKeyOfT / mapKeyOfT, so that the underlying table can retrieve the key from the stored data.

//****   set   *********
template<class K, class Hash = HashFunc<K>>
class unordered_set
{
public:
private:
	struct setKeyOfT
	{
		const K& operator()(const K& key)
		{
			return key;
		}
	};
	// pass K for both template parameters
	HashTable<K, K, Hash, setKeyOfT> _ht;
};
//****   map   *********
template<class K, class V, class Hash = HashFunc<K>>
class unordered_map
{
public:
private:
	// let HashTable extract the K from the pair --- inner class
	struct mapKeyOfT
	{
		const K& operator()(const pair<K, V>& kv)
		{
			return kv.first;
		}
	};
	HashTable<K, pair<K, V>, Hash, mapKeyOfT> _ht;
};

2. Implement the interface

Next we design the member functions of our encapsulated map and set. We only wrap a single layer: each call essentially forwards to Insert, Erase, and the other functions of HashTable.

// ******  set  ********
bool insert(const K& kv)
{
	return _ht.Insert(kv);
}
 
bool erase(const K& kv)
{
	return _ht.Erase(kv);
}
 
// ******  map  ********
bool insert(const pair<K, V>& kv)
{
	return _ht.Insert(kv);
}
bool erase(const K& k)
{
	return _ht.Erase(k);
}

Note that in the bottom layer of Insert and Erase, wherever the key value is involved, we need two layers of functors to get a usable value: KeyOfT extracts the key from T, and Hash converts the key into an integer.
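A minimal self-contained sketch of how the two functors compose (the functor names and values here are stand-ins chosen for the example, not the real class members):

#include <iostream>
#include <utility>
using namespace std;

struct HashFuncInt { size_t operator()(int key) { return (size_t)key; } };                      // key -> integer
struct mapKeyOfTDemo { const int& operator()(const pair<int, int>& kv) { return kv.first; } };  // T -> key

int main()
{
	pair<int, int> kv{ 54, 540 };
	HashFuncInt hash;
	mapKeyOfTDemo koft;
	size_t bucket_count = 53;
	// layer 1: koft extracts the key from T; layer 2: hash turns the key into an integer for the modulus
	size_t hashi = hash(koft(kv)) % bucket_count;
	cout << "key " << kv.first << " -> bucket " << hashi << endl;  // 54 % 53 = 1
	return 0;
}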

3. Iterators

1. Definition of iterator

HashTable has iterator interfaces (begin(), end()), and the iterator in turn uses the HashTable type, so we need a forward declaration of HashTable before implementing the iterator (note: the forward declaration of a class template must carry its template parameter list).

Let's take a look at how the iterator is defined in the SGI STL source: it stores a pointer to the current node and a pointer back to the hash table it belongs to.


Next is our definition:

// forward declaration
template <class K, class T, class Hash, class keyOfT>
class HashTable;
 
template<class K, class T, class Hash, class keyOfT>
class __Hash_Iteartor
{
public:
	typedef HashNode<T> Node;
	typedef HashTable<K, T, Hash, keyOfT> HT;
	typedef __Hash_Iteartor<K, T, Hash, keyOfT> Self;
	__Hash_Iteartor(Node* node, HT* pht)
		:_node(node), _pht(pht)
	{}
	__Hash_Iteartor()
	{}
private:
	// member variables
	Node* _node;   // points to the current node
	HT* _pht;	   // points to the table it belongs to
};

2. Common interfaces

Next, implement some commonly used interfaces:

T& operator*()
{
	return _node->_data;
}
 
T* operator->()
{
	return &_node->_data;
}
bool operator!=(const Self& self)
{
	return _node != self._node;
}
bool operator==(const Self& self)
{
	return _node == self._node;
}

3. Iterator++

The implementation of operator++ in the STL source follows the same approach as ours below.


The idea is as follows:

  1. Check whether _node->_next holds a node; if it does, simply set _node = _node->_next.
  2. If it does not, the current bucket has been fully traversed, and we need to find the next bucket that holds data.
  3. Compute the mapped position from the data field of _node, then walk the hash table forward from that position until _table[i] holds data, and break out of the loop.
  4. If i reaches the size of the hash table, there is no more data, so set _node to nullptr.
  5. Return *this, the current object.

Self& operator++()
{
	// ++ within the current bucket
	if (_node->_next)
	{
		_node = _node->_next;
	}
	else // find the next non-empty bucket
	{
		Hash hash;
		keyOfT kft;
		size_t i = hash(kft(_node->_data)) % _pht->_table.size();
		for (i += 1; i < _pht->_table.size(); i++)
		{
			if (_pht->_table[i])
			{
				_node = _pht->_table[i];
				break;
			}
		}
		// no bucket with data remains
		if (i == _pht->_table.size())
			_node = nullptr;
	}
	return *this;
}

Note that at this point the iterator uses the hash table and directly accesses its private members, so we need to declare the iterator as a friend class of HashTable (the friend declaration, too, must carry its template parameter list).

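The relevant lines inside HashTable look like this (they also appear in the full listing in section 4; note that some compilers reject reusing the outer template parameter names in a friend declaration, in which case give the friend declaration different names):

template <class K, class T, class Hash, class keyOfT>
class HashTable
{
public:
	typedef HashNode<T> Node;
	// make the iterator a friend so it can access _table
	template<class K, class T, class Hash, class keyOfT>
	friend class __Hash_Iteartor;

	typedef __Hash_Iteartor<K, T, Hash, keyOfT> iterator;
	// ... rest of the class unchanged ...
};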

4. begin(), end()

begin returns an iterator to the first node of the first bucket that stores data in the HashTable. If the table stores no data at all, end() is returned; the _node inside the end() iterator is constructed from nullptr.

typedef __Hash_Iteartor<K, T, Hash, keyOfT> iterator;
 
iterator begin()
{
	for (size_t i = 0; i < _table.size(); i++)
	{
		if (_table[i])
			return iterator(_table[i], this);
	}
	return end();
}
iterator end()
{
	return iterator(nullptr, this);
}

5. Changes to Find

In Find we now return an iterator directly, constructing an anonymous iterator object at each return statement.

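Here is the modified Find as it appears in the full listing in section 4 — the only changes are the return type and the anonymous iterator objects at the return statements:

iterator Find(const K& key)
{
	if (_table.size() == 0) return end();
	Hash hash;
	keyOfT koft;
	size_t hashi = hash(key) % _table.size();
	// search within the bucket
	Node* cur = _table[hashi];
	while (cur)
	{
		if (koft(cur->_data) == key)
		{
			return iterator(cur, this);   // wrap the found node in an anonymous iterator
		}
		cur = cur->_next;
	}
	return end();
}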

6. Overloading subscript access [ ]

If we want to overload the subscript access operator in map, we need to modify Insert so that it returns a pair whose first is an iterator and whose second is a bool indicating whether the insertion succeeded (although it could also be done without this change).

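The key changes to Insert, excerpted from the full listing in section 4 (the expansion and head-insert logic in the middle is unchanged and elided here):

pair<iterator, bool> Insert(const T& data)
{
	Hash hash;
	keyOfT koft;
	// if the key already exists, return its iterator and false
	iterator ret = Find(koft(data));
	if (ret != end())
	{
		return make_pair(ret, false);
	}

	// ... expansion and head insert exactly as before ...

	// on success, return an iterator to the newly inserted node and true
	return make_pair(iterator(newNode, this), true);
}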

After the modification of Insert is completed, the [] subscript access operator can be overloaded in the map.

V& operator[](const K& key)
{
	pair<iterator, bool> ret = _ht.Insert(make_pair(key, V()));
	return ret.first->second;
}

Interview question:
What are the requirements for a type K to be used as the template parameter of set and unordered_set?

  1. set:
    K must support comparison with the less-than operator, or a comparison functor must be provided explicitly.

  2. unordered_set:

    • An object of type K must be convertible to an integer that can be taken modulo the table size, or a functor that performs this conversion must be provided.
    • Objects of type K must support equality comparison, or an equality functor must be provided. (set only needs less-than, because it can search by going left for smaller and right for larger; unordered_set has collisions, so the key only locates the mapped bucket, and an equality comparison is needed while traversing the bucket.) A minimal sketch follows.
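A minimal sketch of these requirements using the standard containers (the Date type and its three functors are made up for this example):

#include <iostream>
#include <set>
#include <unordered_set>
using namespace std;

struct Date
{
	int _year, _month, _day;
};

// For set: a strict less-than comparison is enough.
struct DateLess
{
	bool operator()(const Date& a, const Date& b) const
	{
		if (a._year != b._year) return a._year < b._year;
		if (a._month != b._month) return a._month < b._month;
		return a._day < b._day;
	}
};

// For unordered_set: a functor that turns a Date into an integer for the modulus...
struct DateHash
{
	size_t operator()(const Date& d) const
	{
		return ((size_t)d._year * 131 + d._month) * 131 + d._day;
	}
};
// ...and a functor for equality comparison, used while walking a bucket.
struct DateEqual
{
	bool operator()(const Date& a, const Date& b) const
	{
		return a._year == b._year && a._month == b._month && a._day == b._day;
	}
};

int main()
{
	set<Date, DateLess> s;
	s.insert({ 2023, 8, 1 });
	s.insert({ 2023, 8, 1 });   // duplicate, rejected via the less-than ordering

	unordered_set<Date, DateHash, DateEqual> us;
	us.insert({ 2023, 8, 1 });
	us.insert({ 2023, 8, 1 });  // duplicate, rejected via hash + equality

	cout << s.size() << " " << us.size() << endl;  // 1 1
	return 0;
}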

4. Source code and test cases

1. The underlying HashTable

template<class K>
struct HashFunc
{
	size_t operator()(const K& key)
	{
		return (size_t)key;
	}
};
 
template<>
struct HashFunc<string>
{
	size_t operator()(const string& key)
	{
		size_t val = 0;
		for (auto ch : key)
			val = val * 131 + ch;
		return val;
	}
};
 
template <class T>
struct HashNode
{
	HashNode(const T& data)
		:_data(data), _next(nullptr)
	{}
 
	T _data;
	HashNode<T>* _next;
};
 
 
// forward declaration of the hash table
template <class K, class T, class Hash, class keyOfT>
class HashTable;
 
template<class K, class T, class Hash, class keyOfT>
class __Hash_Iteartor
{
public:
	typedef HashNode<T> Node;
	typedef HashTable<K, T, Hash, keyOfT> HT;
	typedef __Hash_Iteartor<K, T, Hash, keyOfT> Self;
	__Hash_Iteartor(Node* node, HT* pht)
		:_node(node), _pht(pht)
	{}
	__Hash_Iteartor()
		:_node(nullptr), _pht(nullptr)
	{}
 
	T& operator*()
	{
		return _node->_data;
	}
 
	T* operator->()
	{
		return &_node->_data;
	}
	Self& operator++()
	{
		// ++ within the current bucket
		if (_node->_next)
		{
			_node = _node->_next;
		}
		else // find the next non-empty bucket
		{
			Hash hash;
			keyOfT kft;
			size_t i = hash(kft(_node->_data)) % _pht->_table.size();
			for (i += 1; i < _pht->_table.size(); i++)
			{
				if (_pht->_table[i])
				{
					_node = _pht->_table[i];
					break;
				}
			}
			// no bucket with data remains
			if (i == _pht->_table.size())
				_node = nullptr;
		}
		return *this;
	}
 
	bool operator!=(const Self& self)
	{
		return _node != self._node;
	}
	bool operator==(const Self& self)
	{
		return _node == self._node;
	}
 
private:
	// members
	Node* _node;   // points to the current node
	HT* _pht;	   // points to the table it belongs to
 
};
 
template <class K, class T, class Hash, class keyOfT>
class HashTable
{
public:
	typedef HashNode<T> Node;
	// make the iterator a friend of the table
	template<class K, class T, class Hash, class keyOfT>
	friend class  __Hash_Iteartor;
 
	typedef __Hash_Iteartor<K, T, Hash, keyOfT> iterator;
 
	iterator begin()
	{
		for (size_t i = 0; i < _table.size(); i++)
		{
			if (_table[i])
				return iterator(_table[i], this);
		}
		return end();
	}
	iterator end()
	{
		return iterator(nullptr, this);
	}
 
	// the destructor needs special handling: walk the whole table and delete the nodes in every bucket
	~HashTable()
	{
		for (size_t i = 0; i < _table.size(); ++i)
		{
			Node* cur = _table[i];
			while (cur)
			{
				Node* next = cur->_next;
				delete cur;
				cur = next;
			}
			_table[i] = nullptr;
		}
	}
 
	inline size_t __stl_next_prime(size_t n)
	{
		static const size_t __stl_num_primes = 28;
		static const size_t  __stl_prime_list[__stl_num_primes] =
		{
		  53,         97,         193,       389,       769,
		  1543,       3079,       6151,      12289,     24593,
		  49157,      98317,      196613,    393241,    786433,
		  1572869,    3145739,    6291469,   12582917,  25165843,
		  50331653,   100663319,  201326611, 402653189, 805306457,
		  1610612741, 3221225473, 4294967291
		};
		// pick the size of the next expansion
		for (size_t i = 0; i < __stl_num_primes; i++)
		{
			if (__stl_prime_list[i] > n)
				return __stl_prime_list[i];
		}
		return (size_t)-1;
	}
	pair<iterator, bool> Insert(const T& data)
	{
		Hash hash;
		keyOfT koft;
		// reject duplicate keys
		iterator ret = Find(koft(data));
		if (ret != end())
		{
			return make_pair(ret, false);
		}
		// expansion --- when the number of inserted elements equals the table size
		if (_size == _table.size())
		{
			//size_t newSize = _table.size() == 0 ? 10 : _table.size() * 2;
			vector<Node*> newTable;
			size_t newSize = __stl_next_prime(_table.size());
			newTable.resize(newSize, nullptr);
			// move the nodes of the old table into the new table
			for (size_t i = 0; i < _table.size(); i++)
			{
				Node* cur = _table[i];
				while (cur)
				{
					Node* next = cur->_next;
					size_t hashi = hash(koft(cur->_data)) % newSize;
					cur->_next = newTable[hashi];
					newTable[hashi] = cur;
					cur = next;
				}
				// clear slot i of the old table
				_table[i] = nullptr;
			}
			_table.swap(newTable);
		}
		size_t hashi = hash(koft(data)) % _table.size();
		// head insert into the bucket
		Node* newNode = new Node(data);
		newNode->_next = _table[hashi];
		_table[hashi] = newNode;
		++_size;
 
		return make_pair(iterator(newNode, this), true);
	}
 
	iterator Find(const K& key)
	{
		if (_table.size() == 0) return end();
		Hash hash;
		keyOfT koft;
		size_t hashi = hash(key) % _table.size();
		// search within the bucket
		Node* cur = _table[hashi];
		while (cur)
		{
			if (koft(cur->_data) == key)
			{
				return iterator(cur, this);
			}
			cur = cur->_next;
		}
		return end();
	}
 
	// a singly linked list cannot delete a node without knowing its predecessor
	bool Erase(const K& key)
	{
		if (_table.size() == 0) return false;
		Hash hash;
		keyOfT koft;
		size_t hashi = hash(key) % _table.size();
		Node* pre = nullptr;
		Node* cur = _table[hashi];
		while (cur)
		{
			if (koft(cur->_data) == key)
			{
				// 1. the node to delete is the first one in the chain --- head delete
				if (pre == nullptr)
				{
					_table[hashi] = cur->_next;
				}
				// 2. delete from the middle of the chain
				else
				{
					pre->_next = cur->_next;
				}
				delete cur;
				--_size;
				return true;
			}
			pre = cur;
			cur = cur->_next;
		}
		return false;
	}
 
 
	// length of the table
	size_t BucketSize()
	{
		return _table.size();
	}
	// number of elements
	size_t Size()
	{
		return _size;
	}
	// number of non-empty buckets
	size_t BucketNum()
	{
		size_t Num = 0;
		for (size_t i = 0; i < BucketSize(); i++)
		{
			if (_table[i]) Num++;
		}
		return Num;
	}
	// length of the longest bucket
	size_t MaxBucketLenth()
	{
		size_t Max_len = 0;
		size_t temp = 0;
		for (size_t i = 0; i < BucketSize(); i++)
		{
			if (_table[i])
			{
				size_t len = 1;
				Node* cur = _table[i]->_next;
				while (cur)
				{
					len++;
					cur = cur->_next;
				}
				if (len > Max_len)
				{
					Max_len = len;
					temp = i;
				}
			}
		}
		printf("Max_len_i:[%zu]\n", temp);
		return Max_len;
	}
	void Print_map()
	{
		cout << "Print_map:" << endl;
		for (size_t i = 0; i < _table.size(); i++)
		{
			Node* cur = _table[i];
			while (cur)
			{
				cout << "i:" << i << " [" << cur->_data.first << " " << cur->_data.second << "] " << endl;
				cur = cur->_next;
			}
		}
	}
	void Print_set()
	{
		cout << "Print_set:" << endl;
		for (size_t i = 0; i < _table.size(); i++)
		{
			Node* cur = _table[i];
			while (cur)
			{
				cout << "i:" << i << " [" << cur->_data << "] " << endl;
				cur = cur->_next;
			}
		}
	}
 
private:
	vector<Node*> _table;
	size_t _size = 0;
};

2. unordered_set/map

unordered_set:

template<class K, class Hash = HashFunc<K>>
class unordered_set
{
public:
	struct setKeyOfT;
	typedef typename dianxia::HashTable<K, K, Hash, setKeyOfT>::iterator iterator;
 
	iterator begin()
	{
		return _ht.begin();
	}
	iterator end()
	{
		return _ht.end();
	}
 
	pair<iterator, bool> insert(const K& kv)
	{
		return _ht.Insert(kv);
	}
 
	bool erase(const K& kv)
	{
		return _ht.Erase(kv);
	}
	void print()
	{
		_ht.Print_set();
	}
 
private:
	struct setKeyOfT
	{
		const K& operator()(const K& key)
		{
			return key;
		}
	};
	HashTable<K, K, Hash, setKeyOfT> _ht;
};

unordered_map:

template<class K, class V, class Hash = HashFunc<K>>
class unordered_map
{
public:
	struct mapKeyOfT;
	typedef typename dianxia::HashTable<K, pair<K, V>, Hash, mapKeyOfT>::iterator iterator;
 
	iterator begin()
	{
		return _ht.begin();
	}
	iterator end()
	{
		return _ht.end();
	}
 
	pair<iterator, bool> insert(const pair<K, V>& kv)
	{
		return _ht.Insert(kv);
	}
	bool erase(const K& k)
	{
		return _ht.Erase(k);
	}
 
	V& operator[](const K& key)
	{
		pair<iterator, bool> ret = _ht.Insert(make_pair(key, V()));
		return ret.first->second;
	}
	void print()
	{
		_ht.Print_map();
	}
 
private:
	// extract the K value from the pair --- inner class
	struct mapKeyOfT
	{
		const K& operator()(const pair<K, V>& kv)
		{
			return kv.first;
		}
	};
	HashTable<K, pair<K, V>, Hash, mapKeyOfT> _ht;
};

3. Test cases

Encapsulation test:

void test_unordered01()
{
	Brant::unordered_map<int, int> mp1;
	mp1.insert({ 1,1 });
	mp1.insert({ 54,54 });
	mp1.insert({ 2,2 });
	mp1.insert({ 3,3 });
	mp1.insert({ 4,4 });
	mp1.insert({ 6,6 });
	mp1.insert({ 6,6 });
	mp1.print();
	cout << "Erase:---------------" << endl;
	mp1.erase(1);
	mp1.erase(54);
	mp1.print();
 
	cout << endl << "--------------------------------------" << endl;
	Brant::unordered_set<int> st1;
	st1.insert(1);
	st1.insert(54);
	st1.insert(2);
	st1.insert(3);
	st1.insert(4);
	st1.insert(6);
	st1.insert(6);
	st1.print();
	cout << "Erase:---------------" << endl;
	st1.erase(1);
	st1.erase(54);
	st1.print();
}

Iterator test:

void test_iterator01()
{
	Brant::unordered_map<string, string> dict;
	dict.insert({ "sort","排序" });
	dict.insert({ "left","左边" });
	dict.insert({ "right","右边" });
	dict.insert({ "string","字符串" });
	Brant::unordered_map<string, string>::iterator it = dict.begin();
	while (it != dict.end())
	{
		cout << it->first << " : " << it->second << endl;
		++it;
	}
	cout << endl;
}
 
void test_iterator02()
{
	Brant::unordered_map<string, int> countMap;
	string arr[] = { "苹果","西瓜","菠萝","草莓","菠萝","草莓" ,"菠萝","草莓"
			, "西瓜", "菠萝", "草莓", "西瓜", "菠萝", "草莓","苹果" };
	for (auto e : arr)
	{
		countMap[e]++;
	}
	for (auto kv : countMap)
	{
		cout << kv.first << " " << kv.second << endl;
	}
}

This is the end of this article. Writing the code and the text was not easy, so please give it plenty of support!!!

Origin blog.csdn.net/weixin_67401157/article/details/132120949