How C++ vector (STL vector) is implemented under the hood (easy to understand)

Among STL's many containers, vector is one of the most commonly used. Its underlying data structure is very simple: a single contiguous block of memory.
Reading the source code of the vector container, it is not difficult to see that it is represented by 3 iterators (which can be understood as pointers):

// _Alloc is the memory allocator; we almost never need to care about this parameter
template <class _Ty, class _Alloc = allocator<_Ty>>
class vector{
    ...
protected:
    pointer _Myfirst;
    pointer _Mylast;
    pointer _Myend;
};

Among them, _Myfirst points to the start of the memory block owned by the vector (that is, to the first element); _Mylast points one position past the current last element; _Myend points one position past the end of the entire memory block owned by the vector.

Figure 1 demonstrates where the three iterators point to.
[Figure 1: positions of _Myfirst, _Mylast, and _Myend in a vector container]
As shown in Figure 1, through these 3 iterators, a vector container with 2 elements and a capacity of 5 can be represented.
On this basis, combining the three iterators in pairs can also express different meanings, for example:

  • _Myfirst and _Mylast can be used to represent the currently used memory space in the vector container;
  • _Mylast and _Myend can be used to represent the free memory space of the vector container;
  • _Myfirst and _Myend can be used to represent the capacity of the vector container.
For an empty vector container that has not yet allocated space for any element, _Myfirst, _Mylast, and _Myend are all null.
By flexibly using these three iterators, the vector container can easily implement almost all of its basic operations, such as locating the first and last elements, reporting its size and capacity, and checking whether it is empty. For example:
template <class _Ty, class _Alloc = allocator<_Ty>>
class vector{
public:
    iterator begin() { return _Myfirst; }
    iterator end() { return _Mylast; }
    // const members use the pointers directly, since begin()/end() above are non-const
    size_type size() const { return size_type(_Mylast - _Myfirst); }
    size_type capacity() const { return size_type(_Myend - _Myfirst); }
    bool empty() const { return _Myfirst == _Mylast; }
    reference operator[] (size_type n) { return *(begin() + n); }
    reference front() { return *begin(); }
    reference back() { return *(end() - 1); }
    ...
};

The essence of vector expansion

One more point: when the size and capacity of the vector are equal (size==capacity), that is, when it is full, adding another element forces the vector to expand. Expansion takes the following 3 steps:

  1. Allocate a new, larger memory space (the existing space cannot simply be extended in place);
  2. Move or copy the data from the old memory space to the new memory space in the original order;
  3. Finally, release the old memory space.

This also explains why pointers, references, and iterators into the vector container become invalid after an expansion: they still refer to the old memory space, which has been released.

It can be seen that vector expansion is expensive, because every existing element has to be moved or copied. To reduce the cost of repeatedly re-allocating memory, each time the vector expands it requests more memory than is immediately needed and keeps the surplus for later use (this is the origin of vector capacity, and the reason capacity>=size).
How much extra memory is requested differs between standard library implementations. Taking VS as an example, each expansion grows the capacity by 50% of the existing capacity; GCC's libstdc++, by comparison, doubles it. The following program prints the size and capacity after each push_back:

#include <iostream>
#include <vector>
using namespace std;
int main()
{
    vector<int> a;
    cout << "a.size(): " << a.size() << "       a.capacity(): " << a.capacity() << endl;
    for (int i = 0; i < 10; i++)
    {
        a.push_back(i);
        cout << "a.size(): " << a.size() << "   a.capacity(): " << a.capacity() << endl;
    }
    return 0;
}


Origin blog.csdn.net/J_avaSmallWhite/article/details/109232653