Analysis of STL source code 03: vector

One, vector overview

  • For the syntax of vector, please refer to the article: https://blog.csdn.net/qq_41453285/article/details/105483009
  • In general: vector is a variable-size array
  • Features:
    • Supports fast random access. Inserting or deleting elements anywhere other than the tail can be slow
    • The elements are stored in contiguous memory, so access by subscript is very fast
    • Adding or deleting elements in the middle of the container is time-consuming, because the elements after that position must be moved
    • Once the vector's memory is insufficient and it reallocates, all pointers, references, and iterators into the original vector become invalid. Reallocation is also expensive, because every element is copied to the new memory
  • vector is usually the best choice: unless there is a special requirement, use vector
  • Comparison with other containers:
    • vector: variable-size array. Supports fast random access. Inserting or deleting elements anywhere other than the tail can be slow
    • deque: double-ended queue. Supports fast random access. Insertion and deletion at the head and tail are fast
    • list: doubly linked list. Supports only bidirectional sequential access. Insertion and deletion at any position are fast
    • forward_list: singly linked list. Supports only one-way sequential access. Insertion and deletion at any position are fast
    • array: fixed-size array. Supports fast random access. Elements cannot be added or deleted
    • string: a container similar to vector, but dedicated to storing characters. Random access is fast. Insertion and deletion at the tail are fast

Two, vector definition summary

  • vector is defined in the <stl_vector.h> header file
// alloc is the SGI STL allocator
template <class T, class Alloc = alloc>
class vector {
public:
    // Nested type definitions of vector
    typedef T value_type;
    typedef value_type* pointer;
    typedef value_type* iterator;
    typedef value_type& reference;
    typedef size_t size_type;
    typedef ptrdiff_t difference_type;

protected:
    // simple_alloc is the SGI STL allocator wrapper; see the earlier article on allocators
    typedef simple_alloc<value_type, Alloc> data_allocator;
    iterator start;          // start of the space currently in use
    iterator finish;         // end of the space currently in use
    iterator end_of_storage; // end of the space currently available

    void insert_aux(iterator position, const T& x);
    void deallocate() {
        if (start)
            data_allocator::deallocate(start, end_of_storage - start);
    }

    void fill_initialize(size_type n, const T& value) {
        start = allocate_and_fill(n, value);
        finish = start + n;
        end_of_storage = finish;
    }

public:
    iterator begin() { return start; }
    iterator end() { return finish; }
    size_type size() const { return size_type(end() - begin()); }
    size_type capacity() const {
        return size_type(end_of_storage - begin());
    }
    bool empty() const { return begin() == end(); }
    reference operator[](size_type n) { return *(begin() + n); }

    vector() : start(0), finish(0), end_of_storage(0) {}
    vector(size_type n, const T& value) { fill_initialize(n, value); }
    vector(int n, const T& value) { fill_initialize(n, value); }
    vector(long n, const T& value) { fill_initialize(n, value); }
    explicit vector(size_type n) { fill_initialize(n, T()); }

    ~vector() {
        destroy(start, finish); // global function; see the earlier article on destroy
        deallocate();           // a member function of vector
    }

    reference front() { return *begin(); }      // first element
    reference back() { return *(end() - 1); }   // last element
    void push_back(const T& x) { // insert an element at the tail
        if (finish != end_of_storage) {
            construct(finish, x); // global function; see the earlier article on construct
            ++finish;
        }
        else
            insert_aux(end(), x); // a member function of vector
    }

    void pop_back() { // remove the element at the tail
        --finish;
        destroy(finish); // global function; see the earlier article on destroy
    }

    iterator erase(iterator position) { // erase the element at a given position
        if (position + 1 != end())
            copy(position + 1, finish, position); // move the following elements forward
        --finish;
        destroy(finish); // global function; see the earlier article on destroy
        return position;
    }

    void resize(size_type new_size, const T& x) {
        if (new_size < size())
            erase(begin() + new_size, end());
        else
            insert(end(), new_size - size(), x);
    }
    void resize(size_type new_size) { resize(new_size, T()); }
    void clear() { erase(begin(), end()); }

protected:
    // Allocate space and fill it with content
    iterator allocate_and_fill(size_type n, const T& x) {
        iterator result = data_allocator::allocate(n);
        uninitialized_fill_n(result, n, x); // global function; see the earlier article on uninitialized_fill_n
        return result;
    }
};

Three, vector iterator

  • vector maintains a contiguous linear space, so whatever the element type, ordinary pointers can serve as vector iterators and satisfy all the necessary requirements
  • The operations a vector iterator must support (all of which ordinary pointers already provide):
    • operator*, operator->, operator++, operator--, operator+, operator-, operator+=, operator-=
  • vector supports random access, and ordinary pointers have this ability, so vector provides Random Access Iterators
  • The iterator of vector is defined as follows:
template <class T, class Alloc = alloc>
class vector {
public:
    typedef T value_type;
    typedef value_type* iterator; // the iterator of vector is a native pointer
    ...
};
  • For example:
vector<int>::iterator ivite;   // equivalent to int* ivite;
vector<Shape>::iterator svite; // equivalent to Shape* svite;

Four, the data structure of vector

  • The data structure of vector is very simple: a contiguous linear space
  • vector uses three iterators as its data members:
    • start: the start of the space currently in use
    • finish: the end of the space currently in use
    • end_of_storage: the end of the space currently available
template <class T, class Alloc = alloc>
class vector {
...
protected:
    iterator start;          // start of the space currently in use
    iterator finish;         // end of the space currently in use
    iterator end_of_storage; // end of the space currently available
    ...
};
  • Note: to reduce the time cost of allocation, a vector may actually allocate more space than the client requested, in preparation for future growth. This is the concept of capacity. In other words, the capacity of a vector is always greater than or equal to its size. Once the capacity equals the size, adding another element requires allocating a new block of space

  • Using the three iterators start, finish, and end_of_storage, vector easily provides the begin and end markers, size, capacity, the empty-container check, the subscript operator [], the front element, the back element, and so on, as follows:
template <class T, class Alloc = alloc>
class vector {
...
public:
    iterator begin() { return start; }
    iterator end() { return finish; }
    size_type size() const { return size_type(end() - begin()); }
    size_type capacity() const {
        return size_type(end_of_storage - begin());
    }
    bool empty() const { return begin() == end(); }
    reference operator[](size_type n) { return *(begin() + n); }
    reference front() { return *begin(); }
    reference back() { return *(end() - 1); }
    ...
};

Five, vector construction and memory management (constructor, push_back)

Vector memory management

  • vector uses alloc as its allocator by default and defines data_allocator on top of it, so that space can conveniently be allocated in units of the element size:
template <class T, class Alloc = alloc>
class vector {
protected:
    // simple_alloc<> is introduced in the earlier article
    typedef simple_alloc<value_type, Alloc> data_allocator;
    ...
};
  • Therefore, data_allocator::allocate(n) allocates space for n elements

Constructor

  • Vector provides many constructors, one of which allows us to specify the size and initial value of the space:
// Constructor that takes the vector size n and the initial value
vector(size_type n, const T& value) { fill_initialize(n, value); }

// Fill and initialize
void fill_initialize(size_type n, const T& value) {
    start = allocate_and_fill(n, value);
    finish = start + n;
    end_of_storage = finish;
}

// Allocate, then fill
iterator allocate_and_fill(size_type n, const T& x) {
    iterator result = data_allocator::allocate(n); // allocate space for n elements
    // Global function: depending on the type traits of its first argument, it either
    // uses the fill_n() algorithm or calls construct() repeatedly
    uninitialized_fill_n(result, n, x);
    return result;
}

 

push_back() function

  • When push_back() inserts a new element at the end of the vector, the function first checks whether there is spare space. If there is, it constructs the element directly in the spare space and advances the iterator finish, making the vector one element larger. If there is no spare space, it expands the space (allocate a new block, move the data, release the original block)
  • The implementation of push_back(), together with the insert_aux() it calls, is as follows:
void push_back(const T& x) {
    if (finish != end_of_storage) { // spare space remains
        construct(finish, x); // global function
        ++finish;             // adjust the watermark
    }
    else // no spare space left
        insert_aux(end(), x); // vector member function, see below
}

template <class T, class Alloc>
void vector<T, Alloc>::insert_aux(iterator position, const T& x) {
    if (finish != end_of_storage) { // spare space remains
        // Construct an element at the start of the spare space,
        // initialized with the value of the vector's last element
        construct(finish, *(finish - 1));
        // Adjust the watermark
        ++finish;
        T x_copy = x;
        copy_backward(position, finish - 2, finish - 1);
        *position = x_copy;
    }
    else { // no spare space left
        const size_type old_size = size();
        const size_type len = old_size != 0 ? 2 * old_size : 1;
        // Allocation rule: if the old size is 0, allocate 1 (element);
        // otherwise allocate twice the old size.
        // The first half holds the old data, the second half is for new data
        iterator new_start = data_allocator::allocate(len); // actual allocation
        iterator new_finish = new_start;
        try {
            // Copy the elements before the insertion point to the new space
            new_finish = uninitialized_copy(start, position, new_start);
            // Construct the new element with value x
            construct(new_finish, x);
            // Adjust the watermark
            ++new_finish;
            // Copy the elements from the insertion point onward as well
            new_finish = uninitialized_copy(position, finish, new_finish);
        }
        catch(...) {
            // "commit or rollback" semantics.
            destroy(new_start, new_finish);
            data_allocator::deallocate(new_start, len);
            throw;
        }
        // Destroy and release the original vector
        destroy(begin(), end());
        deallocate();

        // Adjust the iterators to point to the new vector
        start = new_start;
        finish = new_finish;
        end_of_storage = new_start + len;
    }
}

 

Six, vector memory reallocation strategy

  • Vector memory reallocation strategy:
    • A vector stores its elements as an array. When an element is added and the capacity is insufficient, the vector expands
    • Expansion does not append new space after the original space (there is no guarantee that free space exists right after it). Instead, the vector allocates a new block larger than the current one (GCC and VC use different growth rules, introduced below), copies the contents of the original block into it, and then releases the original block
    • Important: the growth rules of vector differ between the GCC and VC environments
  • Note (emphasis): whenever an operation on a vector causes reallocation, all iterators pointing into the original vector become invalid. This is an easy mistake to make, so be careful

Under Windows

  • In the VC implementation, the capacity grows regularly at each reallocation, which can be described by the following formula:
maxSize = maxSize + ((maxSize >> 1) > 1 ? (maxSize >> 1) : 1)
  • Illustration:
    • The capacity grows in the sequence 1, 2, 3, 4, 6, 9, 13, 19...
    • From 4 onward there is a pattern: each value equals the previous value plus half of the previous value, rounded down (6 = 4 + 2, 9 = 6 + 3, 13 = 9 + 4, 19 = 13 + 6)

  • The test program is as follows:
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    std::vector<int> iv;

    iv.push_back(1);
    cout << iv.capacity() << endl; //1

    iv.push_back(1);
    cout << iv.capacity() << endl; //2

    iv.push_back(1);
    cout << iv.capacity() << endl; //3

    iv.push_back(1);
    cout << iv.capacity() << endl; //4

    iv.push_back(1);
    cout << iv.capacity() << endl; //6

    iv.push_back(1);
    iv.push_back(1);
    cout << iv.capacity() << endl; //9

    return 0;
}
  • The output, one value per line and matching the comments in the code, is: 1, 2, 3, 4, 6, 9

Under Linux

  • The growth rule under Linux (GCC) is simpler: the capacity is doubled each time
maxSize = maxSize*2;
  • Illustration: the capacity grows in the sequence 1, 2, 4, 8, 16...

  • The test program is as follows:
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    std::vector<int> iv;

    iv.push_back(1);
    cout << iv.capacity() << endl; //1

    iv.push_back(1);
    cout << iv.capacity() << endl; //2

    iv.push_back(1);
    cout << iv.capacity() << endl; //4

    iv.push_back(1);
    cout << iv.capacity() << endl; //4

    iv.push_back(1);
    cout << iv.capacity() << endl; //8

    iv.push_back(1);
    iv.push_back(1);
    cout << iv.capacity() << endl; //8

    return 0;
}
  • The output, one value per line and matching the comments in the code, is: 1, 2, 4, 4, 8, 8

Seven, vector element operations (pop_back, erase, clear, insert)

  • vector provides many element operations. This section covers pop_back, erase, clear, and insert

pop_back

// Remove the element at the tail and adjust the size
void pop_back() {
    --finish;        // move the end marker back one slot, giving up the last element
    destroy(finish); // destroy is a global function
}

 

erase

// Erase all elements in [first, last)
iterator erase(iterator first, iterator last) {
    iterator i = copy(last, finish, first); // copy is a global function
    destroy(i, finish);                     // destroy is a global function
    finish = finish - (last - first);
    return first;
}
  • The function above is the range version, erase(first, last); the single-position version follows

 

// Erase the element at a given position
iterator erase(iterator position) {
    if (position + 1 != end())
        copy(position + 1, finish, position); // copy is a global function
    --finish;
    destroy(finish); // destroy is a global function
    return position;
}

 

clear

// Erase all elements in the container
void clear() { erase(begin(), end()); }

 

insert

// Starting at position, insert n elements with initial value x
template <class T, class Alloc>
void vector<T, Alloc>::insert(iterator position, size_type n, const T& x)
{
    if (n != 0) { // do all of the following only when n != 0
        if (size_type(end_of_storage - finish) >= n) {
            // Spare space >= number of new elements
            T x_copy = x;
            // Count the existing elements after the insertion point
            const size_type elems_after = finish - position;
            iterator old_finish = finish;
            if (elems_after > n) {
                // Existing elements after the insertion point > number of new elements
                uninitialized_copy(finish - n, finish, finish);
                finish += n; // move the end marker of the vector back
                copy_backward(position, old_finish - n, old_finish);
                fill(position, position + n, x_copy); // fill in the new values from the insertion point
            }
            else {
                // Existing elements after the insertion point <= number of new elements
                uninitialized_fill_n(finish, n - elems_after, x_copy);
                finish += n - elems_after;
                uninitialized_copy(position, old_finish, finish);
                finish += elems_after;
                fill(position, old_finish, x_copy);
            }
        }
        else {
            // Spare space < number of new elements (more memory must be allocated)
            // First decide the new length: twice the old length, or old length + number of new elements
            const size_type old_size = size();
            const size_type len = old_size + max(old_size, n);
            // Allocate the new vector space
            iterator new_start = data_allocator::allocate(len);
            iterator new_finish = new_start;
            __STL_TRY {
                // First copy the elements before the insertion point to the new space
                new_finish = uninitialized_copy(start, position, new_start);
                // Then fill the new elements (all with initial value x) into the new space
                new_finish = uninitialized_fill_n(new_finish, n, x);
                // Then copy the elements after the insertion point to the new space
                new_finish = uninitialized_copy(position, finish, new_finish);
            }
#ifdef __STL_USE_EXCEPTIONS
            catch(...) {
                // If an exception occurs, implement "commit or rollback" semantics.
                destroy(new_start, new_finish);
                data_allocator::deallocate(new_start, len);
                throw;
            }
#endif /* __STL_USE_EXCEPTIONS */

            // Destroy and release the old vector
            destroy(start, finish);
            deallocate();
            // Adjust the watermark markers
            start = new_start;
            finish = new_finish;
            end_of_storage = new_start + len;
        }
    }
}
  • Note that after the insertion, the new elements are located before the element pointed to by the sentinel iterator (position in the code above, which marks the insertion point). This is the STL convention for insertion operations. The cases below correspond to the figures that illustrate insert(position, n, x)

When the spare space >= the number of new elements:

  • ① Spare space (2) >= number of new elements (2)
  • ② Number of existing elements after the insertion point (3) > number of new elements (2)

  • ③ Number of existing elements after the insertion point (2) <= number of new elements (3)

When the spare space < the number of new elements:

  • For example, spare space (2) < number of new elements (n == 3)

Origin blog.csdn.net/www_dong/article/details/113825932