Create your own STL from scratch (Part 4: list)

Briefly

After walking through the implementation of vector, we should have a reasonable feel for containers. A container is a place where elements are stored; it represents a region of storage. If you want to understand a container intuitively, a very effective starting point is its space allocation strategy. Next, let's take a look at how list is implemented in the STL.

structure of list

A list is what we usually call a linked list, which I believe everyone is familiar with: the storage is non-contiguous, the small blocks are connected through pointers, insertion and deletion at a known position are O(1), access to an element by position is less efficient (O(n)), and so on.
The structure used by list is a circular doubly linked list, shown in the picture below.

[Figure: circular doubly linked list structure]

Ahem, the picture is a bit ugly, but it does show the structure clearly. Each red box in the figure represents one node, which stores three things. The first is the location of the next node, that is, where the next pointer points. The second is the location of the previous node. The third is the stored data. The special one is the node marked head. You might think this node is simply the head of the linked list. In fact, it is both the head and the tail. Take a look at the traversal loop below and you will see it at a glance.

for (auto i = head->next; i != head; i = i->next)
{
    //...
}
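
To make the loop above concrete, here is a minimal sketch of a circular list with a sentinel head node. The names Node, link_after, sum_values, and demo_sum are made up for this illustration and are not taken from the SGI source; the nodes live on the stack just to keep it short.

```cpp
#include <cassert>

// Hypothetical node type for illustration; not the SGI node layout.
struct Node {
    Node* next;
    Node* prev;
    int   value;
};

// Link node n directly after pos in a circular doubly linked list.
inline void link_after(Node* pos, Node* n) {
    n->next = pos->next;
    n->prev = pos;
    pos->next->prev = n;
    pos->next = n;
}

// Walk from head->next until we arrive back at head, which acts as
// both the "before the first" and the "past the end" position.
inline int sum_values(const Node* head) {
    int sum = 0;
    for (const Node* i = head->next; i != head; i = i->next)
        sum += i->value;
    return sum;
}

// Build head <-> a <-> b <-> head and sum the stored values.
inline int demo_sum() {
    Node head{nullptr, nullptr, 0};
    head.next = head.prev = &head;   // an empty list points at itself
    assert(sum_values(&head) == 0);  // the loop body never runs

    Node a{nullptr, nullptr, 1};
    Node b{nullptr, nullptr, 2};
    link_after(&head, &a);
    link_after(&a, &b);
    return sum_values(&head);
}
```

Because the head node is never dereferenced for data, the loop needs no null checks and no special case for the empty list.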

List space configuration

The space configuration of list is a little more involved than that of vector. A vector allocates its storage in one contiguous block; when the block is too small to hold the data, it allocates a new block of twice the size and moves the existing elements over. I mention all this only to have something to compare against. As the saying goes, without comparison there is no harm.
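
The comparison can be observed with the standard containers. The sketch below uses std::list and std::vector rather than the hand-written classes from this series, and the helper names are hypothetical. Note that the exact growth factor of std::vector is implementation-defined, so the code only counts how often the capacity changes.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// A list allocates each node separately and never relocates existing
// nodes, so the address of the first element stays the same.
inline bool list_front_stays_put() {
    std::list<int> l;
    l.push_back(0);
    const int* before = &l.front();
    for (int i = 1; i < 1000; ++i)
        l.push_back(i);
    return before == &l.front();
}

// A vector grows in steps: each capacity change means a new block was
// allocated and the old elements were moved over.
inline std::size_t vector_growth_steps() {
    std::vector<int> v;
    std::size_t steps = 0;
    std::size_t cap = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != cap) {
            ++steps;
            cap = v.capacity();
        }
    }
    assert(v.capacity() >= v.size());
    return steps;
}
```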

A list, by contrast, is made of small nodes, so every time a new node is added, space for that one node must be allocated. How big is each allocation? The structure of the list node tells us:

struct _List_node_base
{
    _List_node_base *_m_next;
    _List_node_base *_m_prev;
};

template<typename _Tp>
struct _List_node : public _List_node_base
{
    _Tp _m_data;
};

Did the structure of this node look a little strange? Why are there two structs?

In SGI's STL implementation, a list node is split into a pointer part and a data part. There is, of course, a good reason for this split.

Most of our operations on a list are about walking from node to node; the data member is only touched once the right node has been found. Code that only traverses can therefore work with the pointer part alone and never needs the data part. This matters in C++, because the data part is usually a user-defined class that may take up a lot of space, whereas a pointer is small: the next pointer is only 4 bytes on a 32-bit platform.

Once the structure of the node is clear, its space allocation is obvious: each allocation is sizeof(_List_node<_Tp>) bytes.
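
As a rough check of the sizes involved, the sketch below mirrors the two node structs under the hypothetical names NodeBase and Node (chosen to avoid the reserved-looking identifiers), plus a made-up large element type Big. Exact sizes depend on the platform and on padding, so the code only states lower bounds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names mirroring _List_node_base and _List_node<_Tp>.
struct NodeBase {
    NodeBase* next;
    NodeBase* prev;
};

template <typename T>
struct Node : NodeBase {
    T data;
};

// A hypothetical large element type.
struct Big {
    char bytes[256];
};

// Bytes requested from the allocator for one node holding a T.
template <typename T>
std::size_t node_bytes() {
    // A node always carries the two link pointers plus the payload.
    assert(sizeof(Node<T>) >= sizeof(NodeBase) + sizeof(T));
    return sizeof(Node<T>);
}
```

The pointer part stays the same size no matter how large the element type is; only the data part grows.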

template<typename _Tp, typename _Alloc>
class _List_base
{
public:
    typedef _Alloc allocator_type;

    _List_base()
    {
        _m_head = _M_get_node();
        _m_head->_m_next = _m_head;
        _m_head->_m_prev = _m_head;
    }

    ~_List_base()
    {
        clear();
        _M_put_node(_m_head);
    }

    void clear();
protected:
    typedef simple_alloc<_List_node<_Tp>, _Alloc> _Alloc_type;

    _List_node<_Tp> * _M_get_node() // allocate space for one node
    {
        return _Alloc_type::allocate(1);
    }
    void _M_put_node(_List_node<_Tp> *_p) // release the node's space
    {
        _Alloc_type::deallocate(_p);
    }

    _List_node<_Tp>* _m_head;
};

The code is fairly simple: the _List_base class is mainly responsible for initializing the head node. Note that the type passed to the allocator wrapper is _List_node<_Tp>, which is used as the _Tp parameter of simple_alloc.

[Figure: simple_alloc]
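
For readers who have not seen simple_alloc, here is a simplified sketch of the idea: a thin wrapper that lets callers think in elements while the underlying allocator thinks in bytes. MallocAlloc and SimpleAlloc are hypothetical stand-ins written for this example (backed by malloc and free); they are not the SGI definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical byte-level allocator standing in for SGI's alloc.
struct MallocAlloc {
    static void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    static void deallocate(void* p, std::size_t /*bytes*/) { std::free(p); }
};

// Simplified wrapper in the spirit of simple_alloc: converts an
// element count of type T into a byte count for Alloc.
template <typename T, typename Alloc>
struct SimpleAlloc {
    static T* allocate(std::size_t n) {
        return n == 0 ? nullptr
                      : static_cast<T*>(Alloc::allocate(n * sizeof(T)));
    }
    static void deallocate(T* p) {
        if (p != nullptr) Alloc::deallocate(p, sizeof(T));
    }
};

struct NodeBase { NodeBase* next; NodeBase* prev; };
template <typename T> struct Node : NodeBase { T data; };

// Allocate raw storage for one node, link it to itself the way an
// empty list's head is linked, then release it again.
inline bool head_node_round_trip() {
    typedef SimpleAlloc<Node<int>, MallocAlloc> NodeAlloc;
    Node<int>* head = NodeAlloc::allocate(1);
    assert(head != nullptr);
    head->next = head;
    head->prev = head;
    const bool ok = (head->next == head) && (head->prev == head);
    NodeAlloc::deallocate(head);
    return ok;
}
```

Because the wrapper is instantiated with the whole node type, allocate(1) hands back room for the two pointers and the data together.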

element access of list

Now that the memory layout of list is clear, let's look at the list iterator and see how it provides access to the elements.

Without further ado, here is the code; compare it with the original source to see the differences.

struct _List_iterator_base
{
    _List_node_base* _m_node;

    _List_iterator_base(_List_node_base *_x) :_m_node(_x){}
    _List_iterator_base():_m_node(nullptr){}

    void _M_incr() 
    {
        _m_node = _m_node->_m_next;
    }

    void _M_decr() 
    {
        _m_node = _m_node->_m_prev;
    }

    bool operator==(const _List_iterator_base &_x)const
    {
        return _m_node == _x._m_node;
    }

    bool operator!=(const _List_iterator_base &_x)const
    {
        return _m_node != _x._m_node;
    }
};

Seeing _m_node in the first line, you can probably see why the pointer part and the data part of the node are separated. The iterator base only needs the pointer part, because its job is to move between nodes rather than to touch the data.

The work done by the iterator's base class is very simple:
1. Initialize the current node
2. Provide an interface to access the adjacent nodes of the current node
3. Provide a method for node comparison

With the interface provided by the base class, the implementation of the iterator is very simple

template<typename _Tp, typename _Ref, typename _Ptr>
struct _List_iterator : public _List_iterator_base
{
public:
    typedef _List_iterator<_Tp, _Tp&, _Tp*> iterator;
    typedef _List_iterator<_Tp, const _Tp&, const _Tp*> const_iterator;
    typedef _List_iterator<_Tp, _Ref, _Ptr> _Self;

    typedef _Tp value_type;
    typedef _Ptr pointer;
    typedef _Ref reference;
    typedef _List_node<_Tp> _Node;


    _List_iterator(_Node *_x): _List_iterator_base(_x){}
    _List_iterator(){}
    _List_iterator(const iterator& _x):_List_iterator_base(_x._m_node){}

    reference operator*()const 
    {
        return ((_Node*)_m_node)->_m_data;
    }

    pointer operator->()const
    {
        return &(operator*());
    }

    _Self& operator++()
    {
        this->_M_incr();
        return *this;
    }

    _Self operator++(int)
    {
        _Self _tmp = *this;
        this->_M_incr();
        return _tmp;
    }

    _Self &operator--()
    {
        this->_M_decr();
        return *this;
    }
    _Self operator--(int)
    {
        _Self _tmp = *this;
        this->_M_decr();
        return _tmp;
    }
};

The type aliases are defined first. The interface that the iterator actually provides is to move to the next node (operator++), move to the previous node (operator--), and access the data of the current node (operator* and operator->).
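
The split between the two iterator layers can be condensed into a short sketch. NodeBase, Node, IterBase, and Iter are hypothetical, cut-down versions of the classes above, with only the operators needed for the demonstration; the nodes are linked by hand on the stack.

```cpp
#include <cassert>

struct NodeBase { NodeBase* next; NodeBase* prev; };

template <typename T>
struct Node : NodeBase { T data; };

// Type-independent part: it only stores and compares node pointers.
struct IterBase {
    NodeBase* node;
    explicit IterBase(NodeBase* n) : node(n) {}
    bool operator==(const IterBase& o) const { return node == o.node; }
    bool operator!=(const IterBase& o) const { return node != o.node; }
};

// Type-dependent part: it knows how to reach the data field.
template <typename T>
struct Iter : IterBase {
    explicit Iter(NodeBase* n) : IterBase(n) {}
    T& operator*() const { return static_cast<Node<T>*>(node)->data; }
    Iter& operator++() { node = node->next; return *this; }
    Iter& operator--() { node = node->prev; return *this; }
};

// Build head <-> a <-> b <-> head and walk it with the iterator.
inline int demo_iterate() {
    NodeBase head;
    Node<int> a, b;
    a.data = 10;
    b.data = 20;
    head.next = &a; a.prev = &head;
    a.next = &b;    b.prev = &a;
    b.next = &head; head.prev = &b;

    int sum = 0;
    for (Iter<int> it(head.next), last(&head); it != last; ++it)
        sum += *it;

    Iter<int> back(&head);  // the end position ...
    --back;                 // ... stepped back once is the last element
    assert(*back == 20);
    return sum;
}
```

Only operator* needs the cast to the full node type; moving and comparing work purely on the pointer part.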

With the groundwork done, we can look at the implementation of list itself.

The specific implementation of list

template<typename _Tp, typename _Alloc = alloc >
class list : protected _List_base<_Tp, _Alloc>
{
    typedef _List_base<_Tp, _Alloc> _Base;
public:
    typedef _Tp value_type;
    typedef _Tp* pointer;
    typedef const _Tp*  const_point;
    typedef _Tp&    reference;
    typedef const _Tp&  const_reference;
    typedef  _List_node<_Tp> _Node;
    typedef size_t  size_type;
    typedef ptrdiff_t   difference_type;

    typedef _List_iterator<_Tp, _Tp&, _Tp*> iterator;
    typedef _List_iterator<_Tp, const _Tp&, const _Tp*> const_iterator;

    typedef reverse_iterator<const_iterator> const_reverse_iterator;
    typedef reverse_iterator<iterator>     reverse_iterator;

protected:
    using _Base::_M_put_node;
    using _Base::_M_get_node;
    using _Base::_m_head;
};

These are type aliases plus using-declarations for the members and functions of the base class. The base class code is shown above; scroll back if you have forgotten it.

Constructor

list() : _Base(){}

list(size_type _n, const_reference _value):_Base()
{
    insert(begin(), _n, _value);
}

list(size_type _n):_Base()
{
    insert(begin(), _n, _Tp());
}

list(const_point _first, const_point _last):_Base()
{
    insert(begin(), _first, _last);
}

list(const_iterator _first, const_iterator _last):_Base()
{
    insert(begin(), _first, _last);
}

list(const list<_Tp, _Alloc> &_x):_Base()
{
    insert(begin(), _x.begin(), _x.end());
}

Don't see that every constructor calls insert, run off to read insert, and get intimidated by it. In fact, a constructor does only two things:
1. Initialize the head node through the base class
2. Insert the passed data into the list

It's just that we offer several ways to supply the data, which makes it look more complicated than it is. Later, we will see how the various insert overloads work.
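
The same relationship can be observed on std::list, used here as a stand-in for the hand-written class: constructing from data gives the same result as default construction followed by an insert at begin(). The function name is hypothetical.

```cpp
#include <cassert>
#include <list>

// Compare "construct from data" with "construct empty, then insert".
inline bool ctor_matches_insert() {
    // Fill constructor versus insert(pos, n, value).
    std::list<int> filled(3, 7);
    std::list<int> manual_fill;
    manual_fill.insert(manual_fill.begin(), 3, 7);

    // Pointer-range constructor versus insert(pos, first, last).
    const int raw[] = {1, 2, 3};
    std::list<int> from_range(raw, raw + 3);
    std::list<int> manual_range;
    manual_range.insert(manual_range.begin(), raw, raw + 3);

    assert(filled.size() == 3 && from_range.size() == 3);
    return filled == manual_fill && from_range == manual_range;
}
```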

member access function

iterator begin()
{
    return (_Node*)_m_head->_m_next;
}
const_iterator begin()const
{
    return (_Node*)_m_head->_m_next;
}
iterator end()
{
    return (_Node*)_m_head;
}
const_iterator end()const
{
    return (_Node*)_m_head;
}

reverse_iterator rbegin()
{
    return reverse_iterator(end());
}
const_reverse_iterator rbegin() const
{
    return const_reverse_iterator(end());
}

reverse_iterator rend()
{
    return reverse_iterator(begin());
}
const_reverse_iterator rend() const
{
    return const_reverse_iterator(begin());
}

bool empty()const
{
    return _m_head == _m_head->_m_next;
}

size_type size() const
{
    size_type _result = 0;
    distance(begin(), end(), _result);
    return _result;
}
size_type max_size()const
{
    return (size_type)(-1);
}

reference front()
{
    return *begin();
}
const_reference front()const
{
    return *begin();
}
reference back()
{
    return *(--end());
}
const_reference back()const
{
    return *(--end());
}

These interfaces should be clear at a glance, so I'll add just a few words. reverse_iterator is a reverse iterator, which means that the order of traversal is reversed: normally we visit from the beginning to the end, and it does the opposite. It will be implemented later. The distance() call in size() can be understood as the following loop:

for (auto it = begin(); it != end(); ++it)
    ++_result;

distance() exists to give every kind of iterator one uniform interface. For an iterator that does not support random access, such as list's, the range has to be walked to find its length; for a random access iterator, such as vector's, end() - begin() gives the length directly. In other words, its role is to unify the interface while choosing a different strategy for each kind of iterator.
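
One common way to choose the strategy is tag dispatch on the iterator category. The sketch below is an illustration under that assumption, not the SGI code: my_distance and distance_impl are hypothetical names, and unlike the three-argument distance() used in size(), this version returns the result the way std::distance does.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Fallback for anything weaker than random access: walk the range, O(n).
template <typename It>
std::ptrdiff_t distance_impl(It first, It last, std::input_iterator_tag) {
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}

// Random access iterators: a single subtraction, O(1).
template <typename It>
std::ptrdiff_t distance_impl(It first, It last,
                             std::random_access_iterator_tag) {
    return last - first;
}

// Uniform entry point: the iterator's category tag selects the overload.
template <typename It>
std::ptrdiff_t my_distance(It first, It last) {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return distance_impl(first, last, category());
}

// Cross-check against std::distance for both kinds of iterator.
inline bool matches_std_distance() {
    std::list<int> l(4, 0);
    std::vector<int> v(5, 0);
    assert(my_distance(l.begin(), l.end()) == 4);
    return my_distance(l.begin(), l.end()) == std::distance(l.begin(), l.end())
        && my_distance(v.begin(), v.end()) == std::distance(v.begin(), v.end());
}
```

The list iterator's bidirectional tag derives from the input tag, so it falls back to the counting loop, while the vector iterator matches the random access overload exactly.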

insert

Insertion is one of the most important parts of list; a list is born for insert and delete operations. Let's take a good look at how the various insert overloads work.

// Insert value x at pos
iterator insert(iterator _pos, const_reference _x)
{
    _Node *_tmp = _M_create_node(_x); // allocates a node and constructs _x in it (definition not shown here)
    _tmp->_m_next = _pos._m_node;
    _tmp->_m_prev = _pos._m_node->_m_prev;
    _pos._m_node->_m_prev->_m_next = _tmp;
    _pos._m_node->_m_prev = _tmp;
    return _tmp;
}

This one is very important, so let's draw a picture to understand it properly, because the insert overloads that follow do nothing more than insert several values at pos. Ahem, it's time to show off my artistic skills again, I'm really excited.

[Figure: insert1]

Well, this is the initial state. For convenience, it is not drawn as a ring; just keep in mind that it is one.
The insertion takes two steps.
1. Set the pointer fields of tmp.

[Figure: insert2]

2. Update the pointer fields around pos.

[Figure: insert3]

Red marks a link that is removed, blue marks a link that is newly added. In this way, the new node is linked in just before pos.
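
The two steps can be tried out in isolation on bare link nodes. This is a minimal sketch: NodeBase, insert_before, and demo_insert are hypothetical names, and the nodes carry no data because only the links matter here.

```cpp
#include <cassert>

struct NodeBase { NodeBase* next; NodeBase* prev; };

// Link tmp in front of pos, following the two steps from the text.
inline void insert_before(NodeBase* pos, NodeBase* tmp) {
    // Step 1: point the new node at its future neighbours.
    tmp->next = pos;
    tmp->prev = pos->prev;
    // Step 2: redirect the neighbours to the new node. The old
    // predecessor must be updated before pos->prev is overwritten.
    pos->prev->next = tmp;
    pos->prev = tmp;
}

inline bool demo_insert() {
    NodeBase head;
    head.next = head.prev = &head;  // empty ring
    NodeBase a, b;
    insert_before(&head, &a);       // ring: head <-> a
    assert(head.next == &a && head.prev == &a);
    insert_before(&a, &b);          // ring: head <-> b <-> a
    return head.next == &b && b.next == &a && a.next == &head
        && head.prev == &a && a.prev == &b && b.prev == &head;
}
```

The order inside step 2 matters: once pos->prev has been overwritten, the old predecessor can no longer be reached through pos.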

iterator insert(iterator _pos)
{
    return insert(_pos, _Tp());
}

void insert(iterator _pos, const_point _first, const_point _last)
{
    for (; _first != _last; ++_first)
        insert(_pos, *_first);
}
void insert(iterator _pos, const_iterator _first, const_iterator _last)
{
    for (; _first != _last; ++_first)
        insert(_pos, *_first);
}

void insert(iterator _pos, size_type _n, const_reference _x)
{
    for (; _n > 0; --_n)
        insert(_pos, _x);
}

As you can see, all the remaining insert overloads simply call the first one.
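
One detail worth checking is that calling the single-element insert in a loop keeps the source order: pos does not move, so each new value lands just before it and after the values inserted earlier. The sketch below shows this on std::list as a stand-in; the function name is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Insert {2, 3, 4} before the 5 in {1, 5}, one element at a time.
inline std::vector<int> demo_range_insert() {
    std::list<int> l;
    l.push_back(1);
    l.push_back(5);

    std::list<int>::iterator pos = l.begin();
    ++pos;                      // pos now refers to the 5

    const int src[] = {2, 3, 4};
    for (const int* p = src; p != src + 3; ++p)
        l.insert(pos, *p);      // pos stays valid and still refers to the 5

    assert(*pos == 5);
    return std::vector<int>(l.begin(), l.end());
}
```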

erase

iterator erase(iterator _pos)
{
    _List_node_base *_next_node = _pos._m_node->_m_next;
    _List_node_base *_prev_node = _pos._m_node->_m_prev;
    _Node *_tmp = (_Node*)_pos._m_node;
    _prev_node->_m_next = _next_node;
    _next_node->_m_prev = _prev_node;
    destroy(&_tmp->_m_data);
    _M_put_node(_tmp);
    return iterator((_Node*)_next_node);
}

The deletion operation should be well understood by looking at the code. The next pointer of the previous node of pos points to the next node of pos, and the prev pointer of the next node of pos points to the previous node of pos.
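
The relinking described above can be sketched on bare link nodes, leaving out the destroy and deallocate calls. NodeBase, unlink_node, and demo_erase are hypothetical names used only in this example.

```cpp
#include <cassert>

struct NodeBase { NodeBase* next; NodeBase* prev; };

// Unlink pos from the ring and return the node that followed it.
inline NodeBase* unlink_node(NodeBase* pos) {
    NodeBase* next_node = pos->next;
    NodeBase* prev_node = pos->prev;
    prev_node->next = next_node;  // predecessor skips over pos
    next_node->prev = prev_node;  // successor points back past pos
    return next_node;
}

inline bool demo_erase() {
    // Hand-built ring: head <-> a <-> b <-> head
    NodeBase head, a, b;
    head.next = &a; a.prev = &head;
    a.next = &b;    b.prev = &a;
    b.next = &head; head.prev = &b;

    NodeBase* after_a = unlink_node(&a);  // ring: head <-> b
    assert(after_a == &b);
    if (head.next != &b || b.prev != &head) return false;

    NodeBase* after_b = unlink_node(&b);  // ring is empty again
    return after_b == &head && head.next == &head && head.prev == &head;
}
```

Both neighbours must be updated; if only one direction is relinked, forward and backward traversal disagree about which nodes are in the list.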

push && pop

void push_front(const_reference _x)
{
    insert(begin(), _x);
}
void push_front()
{
    insert(begin());
}
void push_back(const_reference _x)
{
    insert(end(), _x);
}
void push_back()
{
    insert(end());
}
void pop_front()
{
    erase(begin());
}
void pop_back()
{
    erase(--end());
}

With insert and erase in place, push and pop simply call them.
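
The same equivalences hold for std::list, which makes them easy to check. In this sketch (the function name is hypothetical) list a uses push and pop while list b uses the matching insert and erase calls.

```cpp
#include <cassert>
#include <list>

// Drive two lists in lockstep: one with push/pop, one with insert/erase.
inline bool demo_push_pop() {
    std::list<int> a, b;

    a.push_back(2);   b.insert(b.end(), 2);
    a.push_front(1);  b.insert(b.begin(), 1);
    a.push_back(3);   b.insert(b.end(), 3);
    if (a != b) return false;            // both hold 1, 2, 3

    a.pop_front();    b.erase(b.begin());
    a.pop_back();     b.erase(--b.end());
    assert(!a.empty());
    return a == b && a.size() == 1 && a.front() == 2;
}
```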
