Foreword : I read Mr. Hou's "STL Source Code Analysis" many years ago. These days I sometimes need to understand the STL implementation actually used at work in order to investigate bugs and crashes, so I plan to go through the STL source code again.
Abstract : This article describes the implementation of map in LLVM's libcxx.
Keywords : map, rbtree, pair, set, multiset
Others : Reference code LLVM-libcxx
Note : The code referenced here is the LLVM implementation, which differs from the GNU and MSVC implementations. The article assumes that you are already very familiar with the data structures involved and does not delve into their details.
map is an associative container in the STL. Unlike sequence containers such as vector, its memory layout is not linear, either logically or physically. map stores elements as key-value pairs, and each key is unique. Because map is implemented with a red-black tree, search, insertion, and deletion all run in O(log n) time.
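The properties above can be observed through the public std::map interface alone. The following is a minimal sketch; the helper names make_sample and sorted_keys are made up for this illustration and are not part of libcxx.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: build a small map from out-of-order inserts,
// including one duplicate key.
std::map<int, std::string> make_sample() {
    std::map<int, std::string> m;
    m.insert({3, "three"});
    m.insert({1, "one"});
    m.insert({2, "two"});
    m.insert({1, "uno"});  // key 1 already exists: the map is left unchanged
    return m;
}

// Hypothetical helper: collect the keys in iteration order.
std::vector<int> sorted_keys(const std::map<int, std::string>& m) {
    std::vector<int> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}
```

Iterating the result of make_sample() visits the keys in ascending order, and the second insert for key 1 does not replace the stored value.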
map is implemented on top of a red-black tree, which in the STL is __tree. Therefore, once we understand the implementation of __tree, we basically understand 90% of the details of map.
typedef __tree<__value_type, __vc, __allocator_type> __base;
1 tree
1.1 Node
Like other STL components, the tree uses __tree_node_types to describe the various types associated with its nodes; these are not described in detail here.
Tree nodes are ordinary binary-tree nodes: each node contains two pointers to child nodes, a pointer to the parent node, and a flag indicating the color of the node. The value field and the link pointers are kept in separate classes rather than in a single class. In addition, the destructor of __tree_node is declared as delete, and node destruction is handled by __tree_node_destructor. The implementation of __tree_node_destructor is relatively simple: it destroys the stored value only if the value has been constructed, and then releases the node's memory.
template <class _Pointer>
class __tree_end_node{
public:
typedef _Pointer pointer;
pointer __left_;
};
template <class _VoidPtr>
class _LIBCPP_STANDALONE_DEBUG __tree_node_base : public __tree_node_base_types<_VoidPtr>::__end_node_type{
typedef __tree_node_base_types<_VoidPtr> _NodeBaseTypes;
pointer __right_;
__parent_pointer __parent_;
bool __is_black_;
};
template <class _Tp, class _VoidPtr>
class _LIBCPP_STANDALONE_DEBUG __tree_node : public __tree_node_base<_VoidPtr>{
public:
typedef _Tp __node_value_type;
__node_value_type __value_;
private:
~__tree_node() = delete;
__tree_node(__tree_node const&) = delete;
__tree_node& operator=(__tree_node const&) = delete;
};
// Destructor (deleter) for tree nodes
template <class _Allocator>
class __tree_node_destructor{
public:
bool __value_constructed;
__tree_node_destructor(const __tree_node_destructor &) = default;
__tree_node_destructor& operator=(const __tree_node_destructor&) = delete;
explicit __tree_node_destructor(allocator_type& __na, bool __val = false) _NOEXCEPT : __na_(__na), __value_constructed(__val){
}
void operator()(pointer __p) _NOEXCEPT{
if (__value_constructed)
__alloc_traits::destroy(__na_, _NodeTypes::__get_ptr(__p->__value_));
if (__p)
__alloc_traits::deallocate(__na_, __p, 1);
}
};
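The role of the __value_constructed flag is easier to see in a stripped-down form. The sketch below is not libcxx code: Tracked, Node, NodeDeleter, NodeHolder and construct_node are hypothetical names, and std::allocator stands in for the node allocator. It assumes the same idea as above, namely that node memory is allocated first, the value is constructed second, and the deleter must know which of the two steps has completed.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical payload that counts live instances so destruction is observable.
struct Tracked {
    static int live;
    int v;
    explicit Tracked(int x) : v(x) { ++live; }
    Tracked(const Tracked&) = delete;
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical node: the links are ordinary members, the value is raw storage
// (an anonymous union member is neither constructed nor destroyed implicitly).
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
    union { Tracked value; };
    Node() {}
    ~Node() {}
};

using Alloc = std::allocator<Node>;
using Traits = std::allocator_traits<Alloc>;

// Deleter in the spirit of __tree_node_destructor: destroy the value only if
// it was constructed, then give the memory back to the allocator.
struct NodeDeleter {
    Alloc* alloc;
    bool value_constructed;
    void operator()(Node* p) const noexcept {
        if (value_constructed) Traits::destroy(*alloc, std::addressof(p->value));
        p->~Node();
        Traits::deallocate(*alloc, p, 1);
    }
};
using NodeHolder = std::unique_ptr<Node, NodeDeleter>;

NodeHolder construct_node(Alloc& a, int x) {
    NodeHolder h(Traits::allocate(a, 1), NodeDeleter{&a, false});
    ::new (static_cast<void*>(h.get())) Node();          // links only
    Traits::construct(a, std::addressof(h->value), x);   // may throw
    h.get_deleter().value_constructed = true;            // value is now alive
    return h;
}
```

If constructing the value throws, the holder still owns the memory and the deleter frees it without touching the unconstructed value; after construction succeeds, the same deleter destroys the value as well.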
1.2 Tree iterators
A tree iterator wraps a pointer to a node, and moving the iterator uses the basic traversal operations of the red-black tree.
// Defined in __tree_node_types
typedef __conditional_t< is_pointer<__node_pointer>::value, typename __base::__end_node_pointer, __node_pointer>
__iter_pointer;
template <class _Tp, class _NodePtr, class _DiffType>
class _LIBCPP_TEMPLATE_VIS __tree_iterator{
typedef typename _NodeTypes::__iter_pointer __iter_pointer;
__iter_pointer __ptr_;
};
Here is a brief look at the ++ operation; -- is similar. The next node is the successor of the current node in the in-order traversal of the tree:
- If the right child of the current node is not empty, the next node is the smallest node of the right subtree (its leftmost node);
- Otherwise, if the current node is the left child of its parent, the next node is the parent;
- Otherwise, climb through the ancestors until reaching a node that is the left child of its parent; the next node is that parent.
template <class _EndNodePtr, class _NodePtr>
inline _LIBCPP_INLINE_VISIBILITY _EndNodePtr __tree_next_iter(_NodePtr __x) _NOEXCEPT{
_LIBCPP_ASSERT(__x != nullptr, "node shouldn't be null");
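if (__x->__right_ != nullptr) // right subtree is not empty: the next node is its leftmost node
return static_cast<_EndNodePtr>(_VSTD::__tree_min(__x->__right_));
while (!_VSTD::__tree_is_left_child(__x)) // otherwise climb until __x is a left child; its parent is the next node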
__x = __x->__parent_unsafe();
return static_cast<_EndNodePtr>(__x->__parent_);
}
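The successor walk can be tried out on a hand-built tree. This is a simplified sketch rather than the libcxx code: Node, attach, tree_min, tree_next and walk_sample_tree are hypothetical names, and a single node type is used for both the end node and the regular nodes. It assumes, as libcxx does, that the root hangs off the left pointer of an end node, which is why the upward climb always stops.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int key = 0;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
};

Node* tree_min(Node* x) {
    while (x->left != nullptr) x = x->left;
    return x;
}

bool is_left_child(const Node* x) { return x == x->parent->left; }

// In-order successor; x must not be the end node.
Node* tree_next(Node* x) {
    if (x->right != nullptr) return tree_min(x->right);
    while (!is_left_child(x)) x = x->parent;
    return x->parent;
}

void attach(Node& parent, Node& child, bool as_left) {
    (as_left ? parent.left : parent.right) = &child;
    child.parent = &parent;
}

// Builds the tree below and walks it from the leftmost node to the end node.
//        end
//        /
//       4
//      / \
//     2   6
//    / \  /
//   1  3 5
std::vector<int> walk_sample_tree() {
    Node n[7];  // n[0] plays the end node, n[1..6] hold keys 1..6
    for (int i = 0; i < 7; ++i) n[i].key = i;
    Node& end = n[0];
    attach(end, n[4], true);
    attach(n[4], n[2], true);
    attach(n[4], n[6], false);
    attach(n[2], n[1], true);
    attach(n[2], n[3], false);
    attach(n[6], n[5], true);

    std::vector<int> keys;
    for (Node* it = tree_min(&end); it != &end; it = tree_next(it))
        keys.push_back(it->key);
    return keys;
}
```

Because the root is itself a left child of the end node, stepping past the largest key (6 here) climbs all the way up and lands on the end node, which is what end() refers to.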
1.3 tree
In the STL, the tree is implemented by __tree. Because it is an ordered container, it needs a comparator (the containers built on it use less by default). __tree holds a begin-node pointer, an end node, an allocator, the current number of nodes, and a comparator. The elements are kept in sorted order. The begin node __begin_node_ is the leftmost node of the whole tree, the root node is the left child of the end node, __pair1_.first().__left_, and the end node __pair1_.first() is a sentinel that sits above the root and marks the past-the-end position; it is not the rightmost node.
template <class _Tp, class _Compare, class _Allocator>
class __tree{
__iter_pointer __begin_node_;
__compressed_pair<__end_node_t, __node_allocator> __pair1_;
__compressed_pair<size_type, value_compare> __pair3_;
};
When the tree is empty, __begin_node_ is the end node. To support both map and multimap, separate interfaces are implemented for unique elements and for repeated elements in the tree.
iterator __insert_unique(const_iterator __p, _Vp&& __v) {
return __emplace_hint_unique(__p, _VSTD::forward<_Vp>(__v));
}
iterator __insert_multi(__container_value_type&& __v) {
return __emplace_multi(_VSTD::move(__v));
}
__emplace_hint_unique
__emplace_hint_unique is used to insert a node into the current tree. The basic steps are:
- Construct the node;
- Check whether a node with the same key already exists in the tree;
- If not, insert the node into the tree.
template <class _Tp, class _Compare, class _Allocator>
template <class... _Args>
typename __tree<_Tp, _Compare, _Allocator>::iterator
__tree<_Tp, _Compare, _Allocator>::__emplace_hint_unique_impl(const_iterator __p, _Args&&... __args){
__node_holder __h = __construct_node(_VSTD::forward<_Args>(__args)...);
__parent_pointer __parent;
__node_base_pointer __dummy;
__node_base_pointer& __child = __find_equal(__p, __parent, __dummy, __h->__value_);
__node_pointer __r = static_cast<__node_pointer>(__child);
if (__child == nullptr)
{
__insert_node_at(__parent, __child, static_cast<__node_base_pointer>(__h.get()));
__r = __h.release();
}
return iterator(__r);
}
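From the outside, the "find first, link only if absent" behaviour shows up as follows. This sketch uses only the public std::map::emplace_hint; the function name emplace_existing_key is made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Tries to emplace a key that is already present and reports what the
// returned iterator refers to, together with the resulting size.
std::string emplace_existing_key() {
    std::map<int, std::string> m{{1, "one"}, {2, "two"}};
    auto it = m.emplace_hint(m.end(), 2, "deux");  // key 2 already exists
    // `it` refers to the existing element; no new node was linked in.
    return it->second + "/" + std::to_string(m.size());
}
```

The call returns "two/2": the existing mapping is kept and the size does not change, which matches the __child == nullptr check in the code above.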
Let's look at it step by step. The node is constructed by calling __construct_node, and the search for the key in the tree is done by __find_equal. The implementation of __find_equal is relatively simple: an iterative (non-recursive) descent from the root that goes left or right at each node according to the comparison. Here value_comp is the user-supplied comparison function object, which defaults to less. The part to be aware of is:
template <class _Tp, class _Compare, class _Allocator>
template <class _Key>
typename __tree<_Tp, _Compare, _Allocator>::__node_base_pointer& __tree<_Tp, _Compare, _Allocator>::__find_equal(__parent_pointer& __parent, const _Key& __v){
__node_pointer __nd = __root();
__node_base_pointer* __nd_ptr = __root_ptr();
if (__nd != nullptr){
while (true){
if (value_comp()(__v, __nd->__value_)){
if (__nd->__left_ != nullptr) {
__nd_ptr = _VSTD::addressof(__nd->__left_);
__nd = static_cast<__node_pointer>(__nd->__left_);
} else {
__parent = static_cast<__parent_pointer>(__nd);
return __parent->__left_;
}
}
else if (value_comp()(__nd->__value_, __v)){
if (__nd->__right_ != nullptr) {
__nd_ptr = _VSTD::addressof(__nd->__right_);
__nd = static_cast<__node_pointer>(__nd->__right_);
} else {
__parent = static_cast<__parent_pointer>(__nd);
return __nd->__right_;
}
}
else{
__parent = static_cast<__parent_pointer>(__nd);
return *__nd_ptr;
}
}
}
__parent = static_cast<__parent_pointer>(__end_node());
return __parent->__left_;
}
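The trick of returning a reference to the child pointer, so that the caller can link a new node by assignment, can be shown on a small unbalanced tree. This is a sketch under simplifying assumptions, not the libcxx implementation: Node and MiniTree are hypothetical types, keys are plain int compared with <, and no rebalancing is done.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int key = 0;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
};

// Hypothetical, unbalanced stand-in for __tree: search and linking only.
struct MiniTree {
    Node end;            // sentinel: end.left is the root
    Node* begin = &end;  // leftmost node; equals &end while the tree is empty
    std::size_t size = 0;

    MiniTree() = default;
    MiniTree(const MiniTree&) = delete;  // begin points into this object

    // Returns a reference to the child pointer where `key` lives or should be
    // linked, and sets `parent` to the node that owns that pointer.
    Node*& find_equal(Node*& parent, int key) {
        Node* nd = end.left;
        Node** slot = &end.left;
        if (nd != nullptr) {
            while (true) {
                if (key < nd->key) {
                    if (nd->left != nullptr) { slot = &nd->left; nd = nd->left; }
                    else { parent = nd; return nd->left; }
                } else if (nd->key < key) {
                    if (nd->right != nullptr) { slot = &nd->right; nd = nd->right; }
                    else { parent = nd; return nd->right; }
                } else {
                    parent = nd;   // equal key found
                    return *slot;  // non-null: the pointer that leads to nd
                }
            }
        }
        parent = &end;
        return end.left;
    }

    // Links `n` if its key is absent; returns false if the key already exists.
    bool insert_unique(Node* n) {
        Node* parent = nullptr;
        Node*& child = find_equal(parent, n->key);
        if (child != nullptr) return false;
        n->left = n->right = nullptr;
        n->parent = parent;
        child = n;  // writes into the parent's left or right pointer
        if (begin->left != nullptr) begin = begin->left;  // new leftmost node
        ++size;
        return true;
    }
};
```

A null reference target means "the key is absent and this is where it goes", while a non-null one means "the key is already here", so a single return value serves both the lookup and the insertion.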
Having found the insertion position, let's see how the insertion itself is done. The basic process is to link the node at the located position first, then check whether the colors of the node and its parent are consistent, and rebalance the tree if they are not. Insertion is done via __insert_node_at.
template <class _Tp, class _Compare, class _Allocator>
void __tree<_Tp, _Compare, _Allocator>::__insert_node_at( __parent_pointer __parent, __node_base_pointer& __child, __node_base_pointer __new_node) _NOEXCEPT{
__new_node->__left_ = nullptr;
__new_node->__right_ = nullptr;
__new_node->__parent_ = __parent;
// __new_node->__is_black_ is initialized in __tree_balance_after_insert
__child = __new_node;
if (__begin_node()->__left_ != nullptr)
__begin_node() = static_cast<__iter_pointer>(__begin_node()->__left_);
_VSTD::__tree_balance_after_insert(__end_node()->__left_, __child);
++size();
}
template <class _NodePtr>
_LIBCPP_HIDE_FROM_ABI void
__tree_balance_after_insert(_NodePtr __root, _NodePtr __x) _NOEXCEPT{
_LIBCPP_ASSERT(__root != nullptr, "Root of the tree shouldn't be null");
_LIBCPP_ASSERT(__x != nullptr, "Can't attach null node to a leaf");
__x->__is_black_ = __x == __root;
while (__x != __root && !__x->__parent_unsafe()->__is_black_){
// __x->__parent_ != __root because __x->__parent_->__is_black == false
if (_VSTD::__tree_is_left_child(__x->__parent_unsafe())){
_NodePtr __y = __x->__parent_unsafe()->__parent_unsafe()->__right_;
if (__y != nullptr && !__y->__is_black_){
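// The uncle __y is red: recolor the parent and the uncle black and the grandparent red (black if it is the root), then continue from the grandparent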
__x = __x->__parent_unsafe();
__x->__is_black_ = true;
__x = __x->__parent_unsafe();
__x->__is_black_ = __x == __root;
__y->__is_black_ = true;
}else{
// If __x is the right child of a left child, rotate left first and then right; otherwise a single right rotation is enough
if (!_VSTD::__tree_is_left_child(__x)){
__x = __x->__parent_unsafe();
_VSTD::__tree_left_rotate(__x);
}
__x = __x->__parent_unsafe();
__x->__is_black_ = true;
__x = __x->__parent_unsafe();
__x->__is_black_ = false;
_VSTD::__tree_right_rotate(__x);
break;
}
}
else{
_NodePtr __y = __x->__parent_unsafe()->__parent_->__left_;
if (__y != nullptr && !__y->__is_black_){
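// The uncle __y is red: recolor the parent and the uncle black and the grandparent red (black if it is the root), then continue from the grandparent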
__x = __x->__parent_unsafe();
__x->__is_black_ = true;
__x = __x->__parent_unsafe();
__x->__is_black_ = __x == __root;
__y->__is_black_ = true;
}else{
// If __x is the left child of a right child, rotate right first and then left; otherwise a single left rotation is enough
if (_VSTD::__tree_is_left_child(__x)){
__x = __x->__parent_unsafe();
_VSTD::__tree_right_rotate(__x);
}
__x = __x->__parent_unsafe();
__x->__is_black_ = true;
__x = __x->__parent_unsafe();
__x->__is_black_ = false;
_VSTD::__tree_left_rotate(__x);
break;
}
}
}
}
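The rotation branch is the hardest part to follow in the flattened listing, so here is a simplified sketch of just that case: the parent is red, it is a left child, and the uncle is black or missing. The names Node, rotate_left, rotate_right and fix_red_parent_on_left are hypothetical; the rotations assume every rotated node has a parent, which holds when the root hangs off an end node.

```cpp
#include <cassert>

struct Node {
    int key = 0;
    bool is_black = false;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
};

bool is_left_child(const Node* x) { return x == x->parent->left; }

// Precondition: x->right != nullptr and x->parent != nullptr.
void rotate_left(Node* x) {
    Node* y = x->right;
    x->right = y->left;
    if (x->right != nullptr) x->right->parent = x;
    y->parent = x->parent;
    if (is_left_child(x)) x->parent->left = y;
    else x->parent->right = y;
    y->left = x;
    x->parent = y;
}

// Precondition: x->left != nullptr and x->parent != nullptr.
void rotate_right(Node* x) {
    Node* y = x->left;
    x->left = y->right;
    if (x->left != nullptr) x->left->parent = x;
    y->parent = x->parent;
    if (is_left_child(x)) x->parent->left = y;
    else x->parent->right = y;
    y->right = x;
    x->parent = y;
}

// x is red, its parent is a red left child, the uncle is black or missing.
void fix_red_parent_on_left(Node* x) {
    if (!is_left_child(x)) {  // zig-zag shape: straighten it first
        x = x->parent;
        rotate_left(x);
    }
    x = x->parent;
    x->is_black = true;   // parent becomes black
    x = x->parent;
    x->is_black = false;  // grandparent becomes red
    rotate_right(x);      // parent moves above the grandparent
}
```

Starting from a black node 3 with a red left child 1 that has a red right child 2, the fix leaves 2 black at the top with red children 1 and 3, so no red node has a red parent and the black height is unchanged.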
__emplace_hint_multi
The difference between __emplace_hint_multi and __emplace_hint_unique is in how the insertion position is found: it uses __find_leaf instead of __find_equal, and the node is always inserted.
template <class _Tp, class _Compare, class _Allocator>
template <class... _Args>
typename __tree<_Tp, _Compare, _Allocator>::iterator __tree<_Tp, _Compare, _Allocator>::__emplace_hint_multi(const_iterator __p, _Args&&... __args){
__node_holder __h = __construct_node(_VSTD::forward<_Args>(__args)...);
__parent_pointer __parent;
__node_base_pointer& __child = __find_leaf(__p, __parent, _NodeTypes::__get_key(__h->__value_));
__insert_node_at(__parent, __child, static_cast<__node_base_pointer>(__h.get()));
return iterator(static_cast<__node_pointer>(__h.release()));
}
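Seen through the public interface, "always inserted" means that equal keys accumulate instead of being rejected. A minimal sketch with std::multimap follows; make_multi and values_for_key are hypothetical helper names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: insert three entries, two of which share key 1.
std::multimap<int, std::string> make_multi() {
    std::multimap<int, std::string> m;
    m.emplace(1, "a");
    m.emplace(2, "b");
    m.emplace_hint(m.end(), 1, "c");  // duplicate key: still inserted
    return m;
}

// Hypothetical helper: collect every value stored under `key`.
std::vector<std::string> values_for_key(const std::multimap<int, std::string>& m,
                                        int key) {
    std::vector<std::string> out;
    auto range = m.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}
```

After make_multi() the container holds three elements and equal_range(1) spans two of them, whereas the same sequence of calls on a map would leave only two elements in total.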
2 pair
The STL uses std::pair<key, value> to represent key-value pairs. Its implementation is relatively simple, unlike that of __compressed_pair.
template <class _T1, class _T2>
struct _LIBCPP_TEMPLATE_VIS pair
#if defined(_LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR)
: private __non_trivially_copyable_base<_T1, _T2>
#endif
{
_T1 first;
_T2 second;
};
3 map and multimap
The implementations of map and multimap are basically the same; the difference is whether repeated keys are allowed.
template <class _Key, class _Tp, class _Compare = less<_Key>, class _Allocator = allocator<pair<const _Key, _Tp> > >
class _LIBCPP_TEMPLATE_VIS map{
public:
typedef _Key key_type;
typedef _Tp mapped_type;
typedef pair<const key_type, mapped_type> value_type;
private:
__base __tree_;
};
template <class _Key, class _Tp, class _Compare, class _Allocator>
_Tp& map<_Key, _Tp, _Compare, _Allocator>::operator[](const key_type& __k){
return __tree_.__emplace_unique_key_args(__k,
_VSTD::piecewise_construct,
_VSTD::forward_as_tuple(__k),
_VSTD::forward_as_tuple()).first->__get_value().second;
}
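The piecewise_construct call in operator[] can be reproduced with the public emplace interface. The sketch below is an approximation for illustration, not the libcxx code path; subscript_like is a made-up name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Behaves like m[k]: returns the mapped value for k, inserting a
// value-initialized one first if k is absent.
std::string& subscript_like(std::map<int, std::string>& m, int k) {
    return m.emplace(std::piecewise_construct,
                     std::forward_as_tuple(k),   // arguments for the key
                     std::forward_as_tuple())    // no arguments: mapped value is value-initialized
            .first->second;
}
```

The empty second tuple is what makes operator[] create a default value for a missing key, and the unique insertion underneath is why a second call with the same key returns the existing element instead of adding another.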
template <class _Key, class _Tp, class _Compare = less<_Key>, class _Allocator = allocator<pair<const _Key, _Tp> > >
class _LIBCPP_TEMPLATE_VIS multimap{
public:
typedef _Key key_type;
typedef _Tp mapped_type;
typedef pair<const key_type, mapped_type> value_type;
private:
__base __tree_;
};
template <class ..._Args>
_LIBCPP_INLINE_VISIBILITY
iterator emplace(_Args&& ...__args) {
return __tree_.__emplace_multi(_VSTD::forward<_Args>(__args)...);
}
4 set and multiset
The difference between set and map is the stored value: set stores only the key. Like map, it is implemented on top of the red-black tree; only the stored content differs. The difference between set and multiset is the same as the difference between map and multimap: whether duplicate keys are allowed.
template <class _Key, class _Compare = less<_Key>, class _Allocator = allocator<_Key> >
class _LIBCPP_TEMPLATE_VIS set{
public:
typedef _Key key_type;
typedef key_type value_type;
static_assert((is_same<typename allocator_type::value_type, value_type>::value),
"Allocator::value_type must be same type as value_type");
private:
typedef __tree<value_type, value_compare, allocator_type> __base;
__base __tree_;
public:
iterator insert(const_iterator __p, const value_type& __v)
{
return __tree_.__insert_unique(__p, __v);}
};
template <class _Key, class _Compare = less<_Key>,class _Allocator = allocator<_Key> >
class _LIBCPP_TEMPLATE_VIS multiset{
public:
typedef _Key key_type;
typedef key_type value_type;
private:
typedef __tree<value_type, value_compare, allocator_type> __base;
__base __tree_;
public:
iterator insert(const value_type& __v)
{
return __tree_.__insert_multi(__v);}
};
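The unique versus multi split can be checked with one small template that works for both containers. This is a sketch using only the public interface; size_after_inserting is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Inserts every key into a fresh Container and reports how many were kept.
template <class Container>
std::size_t size_after_inserting(const std::vector<int>& keys) {
    Container c;
    for (int k : keys) c.insert(k);
    return c.size();
}
```

For the keys 3, 1, 3, 2, 1, a std::set<int> ends up with 3 elements and a std::multiset<int> with 5, which mirrors the __insert_unique and __insert_multi calls in the two classes above.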