STL source code reading notes (2) - Vector

Foreword

I have recently wrapped up most of my other work, so let's take a look at the implementation of vector. vector is one of the most commonly used containers in the STL; many people first meet the STL through vector right after learning C++ syntax. I won't explain the underlying data structure in detail, since I assume everyone already understands arrays.
In addition, after reading the standard library, I realized that implementing the standard templates is actually very complicated. For the very common functions I only show the declaration, and some of the type traits machinery is omitted, because showing everything would be too long. The source code of vector<T> plus vector<bool> is about 4000 lines.

STL version

The STL implementation used in this article is libc++ 13.0, which is part of the LLVM project.

LLVM project Github

If you encounter unfamiliar macro definitions, you can refer to the libc++ document "Symbol Visibility Macros".

__vector_base_common

Before introducing __vector_base_common, note that vector is a derived class. Its inheritance chain is

__vector_base_common <-- __vector_base <-- vector

__vector_base_common is the base class; __vector_base derives from it, and vector derives from __vector_base. This is ordinary non-virtual inheritance rather than CRTP, because neither base takes the derived class as a template argument. The reasons for the two layers are discussed later. Let's look at the declaration first.

template <bool>
class _LIBCPP_TEMPLATE_VIS __vector_base_common
{
protected:
    _LIBCPP_INLINE_VISIBILITY __vector_base_common() {}
    _LIBCPP_NORETURN void __throw_length_error() const;
    _LIBCPP_NORETURN void __throw_out_of_range() const;
};

__vector_base_common as a whole is simple: apart from the constructor, the remaining functions only throw exceptions.
A little after the declaration there is an odd-looking line of code.

_LIBCPP_EXTERN_TEMPLATE(class _LIBCPP_EXTERN_TEMPLATE_TYPE_VIS __vector_base_common<true>)

Let's look up the definition of the _LIBCPP_EXTERN_TEMPLATE macro.

// Libc++ allows disabling extern template instantiation declarations by
// means of users defining _LIBCPP_DISABLE_EXTERN_TEMPLATE.
//
// Furthermore, when the Debug mode is enabled, we disable extern declarations
// when building user code because we don't want to use the functions compiled
// in the library, which might not have had the debug mode enabled when built.
// However, some extern declarations need to be used, because code correctness
// depends on it (several instances in <locale>). Those special declarations
// are declared with _LIBCPP_EXTERN_TEMPLATE_EVEN_IN_DEBUG_MODE, which is enabled
// even when the debug mode is enabled.
#if defined(_LIBCPP_DISABLE_EXTERN_TEMPLATE)
#   define _LIBCPP_EXTERN_TEMPLATE(...) /* nothing */
#   define _LIBCPP_EXTERN_TEMPLATE_EVEN_IN_DEBUG_MODE(...) /* nothing */
#elif _LIBCPP_DEBUG_LEVEL >= 1 && !defined(_LIBCPP_BUILDING_LIBRARY)
#   define _LIBCPP_EXTERN_TEMPLATE(...) /* nothing */
#   define _LIBCPP_EXTERN_TEMPLATE_EVEN_IN_DEBUG_MODE(...) extern template __VA_ARGS__;
#else
#   define _LIBCPP_EXTERN_TEMPLATE(...) extern template __VA_ARGS__;
#   define _LIBCPP_EXTERN_TEMPLATE_EVEN_IN_DEBUG_MODE(...) extern template __VA_ARGS__;
#endif

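In short, outside debug mode the macro takes the #else branch: it is a variadic macro that expands to extern template __VA_ARGS__;. This is an explicit instantiation declaration. It tells the compiler not to instantiate __vector_base_common<true> implicitly in user code, because the instantiation is already compiled into the library.
Regarding this design, I speculate that it effectively reduces the code bloat caused by templates but is not friendly to debugging, which is why libc++ switches it off in debug mode.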
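To make the extern template mechanism concrete, here is a minimal single-file sketch. The class error_helper and the function describe_length_error are hypothetical names invented for illustration, not part of libc++; in a real library the explicit instantiation definition would sit in one .cpp file rather than next to the declaration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for __vector_base_common: its members do not depend
// on any element type, so one instantiation can serve every user.
template <bool>
class error_helper {
public:
    std::string length_error_message() const;
};

// Defined outside the class, so it is not implicitly inline.
template <bool B>
std::string error_helper<B>::length_error_message() const {
    return "vector: length error";
}

// Explicit instantiation declaration: code that sees this line does not
// implicitly instantiate the non-inline members of error_helper<true>.
extern template class error_helper<true>;

// Explicit instantiation definition: in a real library this line would live
// in exactly one source file that is compiled into the library.
template class error_helper<true>;

// Users simply call into the single shared instantiation.
inline std::string describe_length_error() {
    error_helper<true> helper;
    assert(!helper.length_error_message().empty());
    return helper.length_error_message();
}
```

The standard requires the definition to follow the declaration when both appear in the same translation unit, which is the order used here.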

__vector_base

First, let's look at the declaration and the internal aliases.

template <class _Tp, class _Allocator>
class __vector_base
    : protected __vector_base_common<true>{
public:
    typedef _Allocator                               allocator_type;
    typedef allocator_traits<allocator_type>         __alloc_traits;
    typedef typename __alloc_traits::size_type       size_type;
protected:
    typedef _Tp                                      value_type;
    typedef value_type&                              reference;
    typedef const value_type&                        const_reference;
    typedef typename __alloc_traits::difference_type difference_type;
    typedef typename __alloc_traits::pointer         pointer;
    typedef typename __alloc_traits::const_pointer   const_pointer;
    typedef pointer                                  iterator;
    typedef const_pointer                            const_iterator;
};

__vector_base derives from __vector_base_common<true>. Its two template parameters are the element type and the allocator, and it contains a large number of typedefs.
Next, let's look at the data members of __vector_base; there are only three.

pointer                                    __begin_;
pointer                                    __end_;
__compressed_pair<pointer, allocator_type> __end_cap_;

__begin_ points to the first element and __end_ points one past the last element; size() is computed from these two. __end_cap_ packs together the pointer to the end of the allocated storage and the allocator; capacity() is computed from it. These three are the core data members of vector.
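The relationship between the three pointers can be sketched as follows. The struct three_pointer_view is a hypothetical, simplified view with no allocator and no ownership; the member names only mirror libc++ and this is not the library code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, non-owning view of the three pointers.
template <class T>
struct three_pointer_view {
    T* begin_;    // first element
    T* end_;      // one past the last constructed element
    T* end_cap_;  // one past the end of the allocated storage

    std::size_t size() const {
        return static_cast<std::size_t>(end_ - begin_);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_cap_ - begin_);
    }
};

inline void three_pointer_demo() {
    int storage[8] = {1, 2, 3};  // room for 8, pretend only 3 are in use
    three_pointer_view<int> v{storage, storage + 3, storage + 8};
    assert(v.size() == 3);
    assert(v.capacity() == 8);
}
```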

Next, let's go through the functions in __vector_base that I think are more important. By the way, for what _LIBCPP_INLINE_VISIBILITY actually means, I found an explanation on StackOverflow.
Its purpose is to keep the function out of the ABI. This used to be achieved through inlining, but it is now done with clang's __attribute__((internal_linkage)).

Capacity function implementation

_LIBCPP_INLINE_VISIBILITY
    void clear() _NOEXCEPT {__destruct_at_end(__begin_);}

_LIBCPP_INLINE_VISIBILITY
    size_type capacity() const _NOEXCEPT    // computes the capacity
        {return static_cast<size_type>(__end_cap() - __begin_);}

Copying and moving the allocator

These helpers copy-assign or move-assign the allocator. The true_type overloads shown here are the ones selected when the allocator asks to be propagated on container copy or move assignment.

_LIBCPP_INLINE_VISIBILITY
    void __copy_assign_alloc(const __vector_base& __c, true_type)
        {
            if (__alloc() != __c.__alloc())
            {
                clear();    // destroy our own elements
                __alloc_traits::deallocate(__alloc(), __begin_, capacity());    // release our own storage
                __begin_ = __end_ = __end_cap() = nullptr;
            }
            __alloc() = __c.__alloc();
        }

_LIBCPP_INLINE_VISIBILITY
    void __move_assign_alloc(__vector_base& __c, true_type)
        _NOEXCEPT_(is_nothrow_move_assignable<allocator_type>::value)
        {
            __alloc() = _VSTD::move(__c.__alloc());
        }

__alloc() returns a reference to the allocator stored in __end_cap_.
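The true_type parameter in these helpers is a tag-dispatch idiom. The sketch below shows the idea under simplified assumptions: tagged_alloc and alloc_holder are hypothetical types made up for the example, and only the copy-assignment case is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical allocator whose propagation policy is chosen by a tag type.
template <class T, class Propagate>
struct tagged_alloc {
    using value_type = T;
    using propagate_on_container_copy_assignment = Propagate;
    int id;
    explicit tagged_alloc(int i) : id(i) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

// Hypothetical holder that mimics the dispatch in __copy_assign_alloc.
template <class Alloc>
struct alloc_holder {
    Alloc alloc;
    explicit alloc_holder(const Alloc& a) : alloc(a) {}

    void copy_assign_alloc(const alloc_holder& other) {
        using propagate = typename std::allocator_traits<
            Alloc>::propagate_on_container_copy_assignment;
        copy_assign_alloc(other, propagate());  // overload picked at compile time
    }

private:
    // Chosen when the allocator wants to be copied along with the container.
    void copy_assign_alloc(const alloc_holder& other, std::true_type) {
        alloc = other.alloc;
    }
    // Chosen otherwise: keep our own allocator.
    void copy_assign_alloc(const alloc_holder&, std::false_type) {}
};

inline int id_after_copy_assign_propagating() {
    using A = tagged_alloc<int, std::true_type>;
    alloc_holder<A> dst{A{1}};
    alloc_holder<A> src{A{2}};
    assert(dst.alloc.id == 1);
    dst.copy_assign_alloc(src);
    return dst.alloc.id;
}

inline int id_after_copy_assign_non_propagating() {
    using A = tagged_alloc<int, std::false_type>;
    alloc_holder<A> dst{A{1}};
    alloc_holder<A> src{A{2}};
    dst.copy_assign_alloc(src);
    return dst.alloc.id;
}
```

The real helper additionally frees the existing storage when the two allocators compare unequal, as the excerpt above shows; the sketch leaves that out.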

Content destruction

template <class _Tp, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
void
__vector_base<_Tp, _Allocator>::__destruct_at_end(pointer __new_last) _NOEXCEPT
{
    pointer __soon_to_be_end = __end_;
    while (__new_last != __soon_to_be_end)
        __alloc_traits::destroy(__alloc(), _VSTD::__to_address(--__soon_to_be_end));    // destroy one element
    __end_ = __new_last;
}

That is essentially all there is to __vector_base; everything else is implemented in vector.
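As a rough illustration of the back-to-front destruction loop, the sketch below uses std::allocator_traits on raw storage. The names tracked, destruct_at_end and destruction_order are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical element type that logs the order in which objects are destroyed.
struct tracked {
    int id;
    std::vector<int>* log;
    tracked(int i, std::vector<int>* l) : id(i), log(l) {}
    ~tracked() { log->push_back(id); }
};

// Destroys the range [new_last, end) starting from the back and returns the
// new end pointer, in the spirit of __destruct_at_end.
template <class T, class Alloc>
T* destruct_at_end(Alloc& alloc, T* new_last, T* end) {
    while (new_last != end)
        std::allocator_traits<Alloc>::destroy(alloc, --end);
    return new_last;
}

inline std::vector<int> destruction_order() {
    using traits = std::allocator_traits<std::allocator<tracked>>;
    std::vector<int> log;
    std::allocator<tracked> alloc;

    tracked* first = traits::allocate(alloc, 3);
    for (int i = 0; i < 3; ++i)
        traits::construct(alloc, first + i, i, &log);  // build elements 0, 1, 2

    tracked* end = first + 3;
    end = destruct_at_end(alloc, first, end);  // what clear() boils down to
    assert(end == first);

    traits::deallocate(alloc, first, 3);
    return log;  // ids in the order their destructors ran
}
```

Calling destruction_order() yields the ids 2, 1, 0, which shows that the last element is destroyed first.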

vector

Finally we arrive at vector itself. As usual, let's look at the declaration first.

template <class _Tp, class _Alloc = allocator<_Tp> >	// declaration
class _LIBCPP_TEMPLATE_VIS vector;

template <class _Tp, class _Allocator /* = allocator<_Tp> */>	// definition
class _LIBCPP_TEMPLATE_VIS vector
    : private __vector_base<_Tp, _Allocator>	// private inheritance from the base
{
private:
    typedef __vector_base<_Tp, _Allocator>           __base;
    typedef allocator<_Tp>                           __default_allocator_type;
public:
    typedef vector                                   __self;
    typedef _Tp                                      value_type;
    typedef _Allocator                               allocator_type;
    typedef typename __base::__alloc_traits          __alloc_traits;
    typedef typename __base::reference               reference;
    typedef typename __base::const_reference         const_reference;
    typedef typename __base::size_type               size_type;
    typedef typename __base::difference_type         difference_type;
    typedef typename __base::pointer                 pointer;
    typedef typename __base::const_pointer           const_pointer;
    typedef __wrap_iter<pointer>                     iterator;
    typedef __wrap_iter<const_pointer>               const_iterator;
    typedef _VSTD::reverse_iterator<iterator>         reverse_iterator;
    typedef _VSTD::reverse_iterator<const_iterator>   const_reverse_iterator;

    static_assert((is_same<typename allocator_type::value_type, value_type>::value),	// C++11 compile-time assertion
                  "Allocator::value_type must be same type as value_type");
};
// By the way, the declaration lives in the iosfwd header. I did not find it on my first read; I was looking at the definition and wondered what the default value of the second template parameter was.

By the way, let me also explain why vector has two layers of inheritance. The first layer, __vector_base_common<true>, is declared as an extern template to reduce code bloat. The second layer, __vector_base, holds the storage members. Note that this is ordinary private inheritance and not CRTP, because __vector_base takes only the element type and the allocator as template arguments, not the derived vector.
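For comparison, here is a minimal sketch that puts CRTP next to plain inheritance from a class template. All names (area_mixin, square, value_storage, boxed_int) are hypothetical. In CRTP the base receives the derived class as a template argument; the __vector_base<_Tp, _Allocator> declaration above receives only the element type and the allocator, which corresponds to the second pattern.

```cpp
#include <cassert>

// CRTP: the base is parameterized on the derived class and calls into it
// through static_cast, so the call is resolved at compile time.
template <class Derived>
struct area_mixin {
    int doubled_area() const {
        return 2 * static_cast<const Derived*>(this)->area();
    }
};

struct square : area_mixin<square> {
    int side;
    explicit square(int s) : side(s) {}
    int area() const { return side * side; }
};

// Plain inheritance from a class template: the base knows nothing about the
// derived class and only supplies storage.
template <class T>
struct value_storage {
    T value_;
    explicit value_storage(const T& v) : value_(v) {}
};

struct boxed_int : private value_storage<int> {
    explicit boxed_int(int v) : value_storage<int>(v) {}
    int get() const { return value_; }
};

inline void inheritance_demo() {
    assert(square(3).doubled_area() == 18);
    assert(boxed_int(7).get() == 7);
}
```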

Constructor

The most basic vector constructors take an iterator range, with separate overloads for InputIterator and ForwardIterator arguments. The constructors that take other parameter types, such as initializer_list, are not listed one by one. Here we take the InputIterator version as an example.

template <class _InputIterator>
        vector(_InputIterator __first,
               typename enable_if<__is_cpp17_input_iterator  <_InputIterator>::value &&
                                 !__is_cpp17_forward_iterator<_InputIterator>::value &&
                                 is_constructible<
                                    value_type,
                                    typename iterator_traits<_InputIterator>::reference>::value,
                                 _InputIterator>::type __last);

If you are new to C++, this probably looks intimidating, but it is easy to understand once we break it down.
First, note that this constructor has only two parameters; the type of the second one is long only because it carries a type check. Let's focus on the second parameter.

typename enable_if<__is_cpp17_input_iterator  <_InputIterator>::value &&
                                 !__is_cpp17_forward_iterator<_InputIterator>::value &&
                                 is_constructible<
                                    value_type,
                                    typename iterator_traits<_InputIterator>::reference>::value,
                                 _InputIterator>::type __last

The second parameter is named __last, and its type is ultimately _InputIterator. enable_if takes two template parameters: the first is a bool and the second is a type. The member enable_if<>::type exists, and names that second type, only when the bool is true. When the condition is false the member does not exist, substitution fails, and this constructor is silently removed from overload resolution (SFINAE); a compile error appears only if no other constructor matches. Next, let's look at the first template argument of enable_if.

__is_cpp17_input_iterator  <_InputIterator>::value &&	// must be an input iterator
!__is_cpp17_forward_iterator<_InputIterator>::value &&	// but must not be a forward iterator
		 is_constructible<								// type trait: value_type must be constructible from the iterator's reference type
		value_type,
		typename iterator_traits<_InputIterator>::reference>::value
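The same enable_if pattern can be tried out in isolation. In the sketch below, is_input_iter, is_forward_iter and range_probe are hypothetical stand-ins (the first two loosely imitate libc++'s internal __is_cpp17_input_iterator and __is_cpp17_forward_iterator), and they only work for types that really are iterators.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

template <class It>
using category_of = typename std::iterator_traits<It>::iterator_category;

// Simplified traits: valid only when It has an iterator_category.
template <class It>
struct is_input_iter
    : std::is_convertible<category_of<It>, std::input_iterator_tag> {};

template <class It>
struct is_forward_iter
    : std::is_convertible<category_of<It>, std::forward_iterator_tag> {};

// Records which constructor overload survived overload resolution.
struct range_probe {
    std::string chosen;

    // Enabled only for input iterators that are not forward iterators.
    template <class InputIt>
    range_probe(InputIt,
                typename std::enable_if<is_input_iter<InputIt>::value &&
                                        !is_forward_iter<InputIt>::value,
                                        InputIt>::type)
        : chosen("input") {}

    // Enabled for forward iterators and anything stronger.
    template <class ForwardIt>
    range_probe(ForwardIt,
                typename std::enable_if<is_forward_iter<ForwardIt>::value,
                                        ForwardIt>::type)
        : chosen("forward") {}
};

inline std::string chosen_for_stream() {
    std::istringstream in("1 2 3");
    std::istream_iterator<int> first(in), last;  // single-pass input iterators
    return range_probe(first, last).chosen;
}

inline std::string chosen_for_vector() {
    std::vector<int> v{1, 2, 3};
    return range_probe(v.begin(), v.end()).chosen;  // random-access iterators
}
```

Only the first parameter takes part in template argument deduction; the enable_if type of the second parameter is a non-deduced context, which is why the check can be placed there.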

The other constructor is almost the same, so I won't go into details.

template <class _InputIterator>
        vector(_InputIterator __first, _InputIterator __last, const allocator_type& __a,
               typename enable_if<__is_cpp17_input_iterator  <_InputIterator>::value &&
                                 !__is_cpp17_forward_iterator<_InputIterator>::value &&
                                 is_constructible<
                                    value_type,
                                    typename iterator_traits<_InputIterator>::reference>::value>::type* = 0);

Insertion functions

There are two functions that append at the end: push_back and emplace_back. Let's look at emplace_back first.

emplace_back

// _LIBCPP_STD_VER > 14 selects the C++17-and-later signature: since C++17 emplace_back returns a reference to the new element, before that it returns void
template <class _Tp, class _Allocator>
template <class... _Args>
inline
#if _LIBCPP_STD_VER > 14
typename vector<_Tp, _Allocator>::reference
#else
void
#endif
vector<_Tp, _Allocator>::emplace_back(_Args&&... __args)
{
    if (this->__end_ < this->__end_cap())
    {
        __construct_one_at_end(_VSTD::forward<_Args>(__args)...);
    }
    else
        __emplace_back_slow_path(_VSTD::forward<_Args>(__args)...);
#if _LIBCPP_STD_VER > 14
    return this->back();
#endif
}

emplace_back takes a pack of forwarding references (_Args&&...). Thanks to reference collapsing, each parameter binds to lvalues and rvalues alike, and forward passes every argument on with its original value category.
When there is spare capacity it calls __construct_one_at_end(); otherwise it calls __emplace_back_slow_path().
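The effect of forwarding can be observed with a small probe type. The names probe and constructed_how below are hypothetical and only serve this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that remembers which constructor created it.
struct probe {
    std::string how;
    probe() : how("default") {}
    probe(const probe&) : how("copy") {}
    probe(probe&&) noexcept : how("move") {}
    probe(int, int) : how("two ints") {}
};

// Args&&... is a pack of forwarding references. An lvalue argument deduces
// Args as T& (T& && collapses to T&); an rvalue argument deduces Args as T.
// std::forward then restores the original value category.
template <class T, class... Args>
std::string constructed_how(Args&&... args) {
    T obj{std::forward<Args>(args)...};
    return obj.how;
}

inline void forwarding_demo() {
    probe p;
    assert(constructed_how<probe>(p) == "copy");             // lvalue stays an lvalue
    assert(constructed_how<probe>(std::move(p)) == "move");  // rvalue stays an rvalue
    assert(constructed_how<probe>(1, 2) == "two ints");      // arguments go to a constructor
    assert(constructed_how<probe>() == "default");           // empty pack
}
```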

First let's look at the __construct_one_at_end() function, which constructs in-place in the available space.

  template <class ..._Args>
  _LIBCPP_INLINE_VISIBILITY
  void __construct_one_at_end(_Args&& ...__args) {
    _ConstructTransaction __tx(*this, 1);	// simply put, this struct manages the __end_ pointer; the argument 1 means one element will be constructed
    __alloc_traits::construct(this->__alloc(), _VSTD::__to_address(__tx.__pos_),
        _VSTD::forward<_Args>(__args)...);
    ++__tx.__pos_;
  }

A _ConstructTransaction object is created to manage the vector's __end_ pointer: it records the current end in __pos_ and writes __pos_ back to __end_ when it is destroyed. Then __alloc_traits::construct builds the element in place (placement new for the default allocator). __args holds the constructor arguments; note that it is a variadic template parameter pack, and the pack expansion forwards each argument to the matching constructor.
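A minimal sketch of the transaction idea, assuming a fixed-capacity buffer: tiny_buffer and construct_transaction are hypothetical names, and the sketch omits everything else a real container needs (growth, copying, iterators).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical fixed-capacity buffer with just enough to show the pattern.
template <class T>
class tiny_buffer {
    using traits = std::allocator_traits<std::allocator<T>>;

    std::allocator<T> alloc_;
    T* begin_;
    T* end_;
    T* end_cap_;

    // Writes the tracked position back to end_ in its destructor, so end_
    // is consistent whether the element constructor returns or throws.
    struct construct_transaction {
        tiny_buffer& buf;
        T* pos;
        explicit construct_transaction(tiny_buffer& b) : buf(b), pos(b.end_) {}
        ~construct_transaction() { buf.end_ = pos; }
    };

public:
    explicit tiny_buffer(std::size_t cap)
        : begin_(traits::allocate(alloc_, cap)),
          end_(begin_),
          end_cap_(begin_ + cap) {}
    tiny_buffer(const tiny_buffer&) = delete;
    tiny_buffer& operator=(const tiny_buffer&) = delete;
    ~tiny_buffer() {
        while (end_ != begin_)
            traits::destroy(alloc_, --end_);
        traits::deallocate(alloc_, begin_,
                           static_cast<std::size_t>(end_cap_ - begin_));
    }

    template <class... Args>
    void construct_one_at_end(Args&&... args) {
        assert(end_ != end_cap_);  // the caller is expected to check capacity
        construct_transaction tx(*this);
        traits::construct(alloc_, tx.pos, std::forward<Args>(args)...);
        ++tx.pos;  // reached only if the constructor did not throw
    }

    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
    const T& back() const { return *(end_ - 1); }
};

inline std::string construct_in_place_demo() {
    tiny_buffer<std::string> buf(2);
    buf.construct_one_at_end(3, 'x');  // forwards to std::string(3, 'x')
    assert(buf.size() == 1);
    return buf.back();
}
```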
Then look at the __emplace_back_slow_path() function when there is insufficient space

template <class _Tp, class _Allocator>
template <class... _Args>
void
vector<_Tp, _Allocator>::__emplace_back_slow_path(_Args&&... __args)
{
    allocator_type& __a = this->__alloc();
    __split_buffer<value_type, allocator_type&> __v(__recommend(size() + 1), size(), __a);
    // __recommend(size() + 1) is the new capacity (twice the current capacity unless that would exceed the allocator's limit),
    // size() is the position where the new element will be constructed, and __a is the allocator
//    __v.emplace_back(_VSTD::forward<_Args>(__args)...);
    __alloc_traits::construct(__a, _VSTD::__to_address(__v.__end_), _VSTD::forward<_Args>(__args)...);	// construct in the new storage
    __v.__end_++;
    __swap_out_circular_buffer(__v);	// move the old elements over and swap buffers
}

This is the classic doubling growth strategy: __split_buffer allocates storage with twice the current capacity, __alloc_traits::construct builds the new element in it, and finally __swap_out_circular_buffer moves the existing elements into the new buffer and swaps it into the vector. As for __split_buffer, I speculate that it is a helper buffer shared by several containers to handle growth, with constructors for various iterators and pointers.
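The growth rule can be written down as a small free function. This is a sketch of the rule as described above, not the library's code: the name recommend and its three-parameter signature are assumptions for the example, and where the library would throw length_error the sketch merely asserts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical free-function version of the growth rule: double the current
// capacity, but never return less than the requested size and never more
// than max_size.
inline std::size_t recommend(std::size_t capacity,
                             std::size_t new_size,
                             std::size_t max_size) {
    assert(new_size <= max_size);  // a real container reports an error here
    if (capacity >= max_size / 2)
        return max_size;           // doubling would overshoot the limit
    return std::max(2 * capacity, new_size);
}
```

For example, with a limit of 1000, growing from capacity 4 to hold 5 elements gives 8, while growing from capacity 600 gives 1000 because doubling would exceed the limit.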

push_back

push_back is basically the same as emplace_back, except that it comes in two overloads, one for lvalues and one for rvalues.

// lvalue overload
template <class _Tp, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
void
vector<_Tp, _Allocator>::push_back(const_reference __x)
{
    if (this->__end_ != this->__end_cap())
    {
        __construct_one_at_end(__x);
    }
    else
        __push_back_slow_path(__x);
}
// rvalue overload
template <class _Tp, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
void
vector<_Tp, _Allocator>::push_back(value_type&& __x)
{
    if (this->__end_ < this->__end_cap())
    {
        __construct_one_at_end(_VSTD::move(__x));
    }
    else
        __push_back_slow_path(_VSTD::move(__x));
}

Both overloads use the same __construct_one_at_end helper as emplace_back, and __push_back_slow_path works like __emplace_back_slow_path.
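The difference between the two overloads is easy to observe from the outside with std::vector itself. The types counted and copy_move_stats below are hypothetical helpers for this sketch; reserve is called first so that reallocation does not add extra moves to the count.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct copy_move_stats {
    int copies = 0;
    int moves = 0;
};

// Hypothetical element type that counts how it gets constructed.
struct counted {
    copy_move_stats* stats;
    explicit counted(copy_move_stats* s) : stats(s) {}
    counted(const counted& o) : stats(o.stats) { ++stats->copies; }
    counted(counted&& o) noexcept : stats(o.stats) { ++stats->moves; }
};

inline copy_move_stats push_back_both_ways() {
    copy_move_stats stats;
    std::vector<counted> v;
    v.reserve(2);  // fast path only: no reallocation during the two pushes
    assert(v.capacity() >= 2);

    counted x(&stats);
    v.push_back(x);             // const_reference overload: one copy
    v.push_back(std::move(x));  // value_type&& overload: one move
    return stats;
}
```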

Origin blog.csdn.net/ninesnow_c/article/details/121947014