Implementation of list

1. What is a list

list is a container in the STL. Its underlying structure is a circular doubly linked list with a head (sentinel) node: each node holds two pointers, one to its predecessor and one to its successor. Once the container is created there is always a sentinel node, even if no data has been inserted; in an empty list both of the sentinel's pointers point to the sentinel itself.

Each node of the list has this structure:

template<class T>
struct __list_node
{
    __list_node<T>* _prev;
    __list_node<T>* _next;
    T _data;
};
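
To make the sentinel idea concrete, here is a minimal sketch of an empty circular list and a tail insertion. The names demo_node, make_sentinel, link_back and destroy are hypothetical helpers invented for this illustration; they are not part of the implementation in this article.

```cpp
#include <cassert>

// Hypothetical node type mirroring the struct above (illustrative name).
template <class T>
struct demo_node {
    demo_node<T>* _prev;
    demo_node<T>* _next;
    T _data;
};

// Creates the sentinel of an empty list: both pointers refer to the node itself.
template <class T>
demo_node<T>* make_sentinel() {
    demo_node<T>* head = new demo_node<T>{nullptr, nullptr, T()};
    head->_prev = head;
    head->_next = head;
    return head;
}

// Links a new node just before the sentinel, i.e. at the tail of the list.
template <class T>
void link_back(demo_node<T>* head, const T& value) {
    demo_node<T>* tail = head->_prev;
    demo_node<T>* n = new demo_node<T>{tail, head, value};
    tail->_next = n;
    head->_prev = n;
}

// Frees every data node and finally the sentinel.
template <class T>
void destroy(demo_node<T>* head) {
    demo_node<T>* cur = head->_next;
    while (cur != head) {
        demo_node<T>* next = cur->_next;
        delete cur;
        cur = next;
    }
    delete head;
}
```

Because the list is circular, the tail is always head->_prev, so appending needs no traversal and no special case for the empty list.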

2. Iterator

1. Functional classification of iterators

Forward iterator: supports ++ only, not --. Examples: singly linked list, hash table

Bidirectional iterator: supports both ++ and --. Example: doubly linked list

Random access iterator: supports not only ++ and --, but also + and -. Examples: string and vector

Iterators are nested types of their container (either inner classes or typedefs declared inside the container class)
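
The classification above can be checked against the standard containers. The sketch below uses std::iterator_traits from the standard library; category_of and last_element are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Convenience alias: the category tag the standard library assigns to an iterator type.
template <class It>
using category_of = typename std::iterator_traits<It>::iterator_category;

static_assert(std::is_same<category_of<std::forward_list<int>::iterator>,
                           std::forward_iterator_tag>::value,
              "forward_list: forward iterator (++ only)");
static_assert(std::is_same<category_of<std::list<int>::iterator>,
                           std::bidirectional_iterator_tag>::value,
              "list: bidirectional iterator (++ and --)");
static_assert(std::is_same<category_of<std::vector<int>::iterator>,
                           std::random_access_iterator_tag>::value,
              "vector: random access iterator (++, --, + and -)");

// Returns the last element using only what a bidirectional iterator offers (--),
// so it works for both list and vector. The container must not be empty.
template <class Container>
typename Container::value_type last_element(const Container& c) {
    typename Container::const_iterator it = c.end();
    --it;
    return *it;
}
```

Note that iterator is spelled Container::iterator in every case: it is a type nested inside the container, which is what the last point above refers to.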

2. Ordinary iterator of list

I implemented the iterators of string and vector with native pointers, because their underlying structure is an array and the storage is contiguous, so a native pointer already meets our needs (dereferencing yields the data pointed to, and ++ moves to the next element).

For list, however, a raw pointer alone is not enough. The nodes of a linked list are not contiguous in memory, and a node allocated later may even have a lower address than one allocated earlier. So ++ on a raw node pointer does not reach the next node, and dereferencing a node pointer yields the whole node rather than its data.

node->_data gives the data stored in a node, and node->_next gives the next node. Plain dereferencing and ++ cannot do this, so we use operator overloading to redefine what these operators mean (at this point you are the one making the rules).

Wrapping the raw pointer in a class and overloading the corresponding operators achieves the goal:

template<class T>
struct __list_iterator
{
    typedef __list_node<T> node;
    node* pnode;

    __list_iterator(node* p) // initialize from a pointer to a node
        :pnode(p)
    {}

    T& operator*()
    {
        return pnode->_data; // dereferencing yields the data stored in the node
    }

    __list_iterator<T>& operator++()
    {
        pnode = pnode->_next;
        return *this; // the result is still an iterator
    }

    bool operator!=(const __list_iterator<T>& it) const
    {
        return pnode != it.pnode;
    }
};
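
As a usage sketch, the wrapper can walk a small ring of nodes linked by hand. The types demo_node and demo_iterator below are hypothetical stand-ins with the same shape as the code above, repeated so the example is self-contained; sum_demo is likewise an invented name.

```cpp
#include <cassert>

template <class T>
struct demo_node {
    demo_node<T>* _prev;
    demo_node<T>* _next;
    T _data;
};

template <class T>
struct demo_iterator {
    demo_node<T>* pnode;
    explicit demo_iterator(demo_node<T>* p) : pnode(p) {}
    T& operator*() const { return pnode->_data; }                      // data, not the node
    demo_iterator& operator++() { pnode = pnode->_next; return *this; } // follow the link
    bool operator!=(const demo_iterator& it) const { return pnode != it.pnode; }
};

// Builds sentinel -> a(10) -> b(20) -> sentinel on the stack and traverses it.
inline int sum_demo() {
    demo_node<int> head{nullptr, nullptr, 0};
    demo_node<int> a{&head, nullptr, 10};
    demo_node<int> b{&a, &head, 20};
    a._next = &b;
    head._next = &a;
    head._prev = &b;

    int sum = 0;
    for (demo_iterator<int> it(head._next); it != demo_iterator<int>(&head); ++it) {
        *it += 1;    // writes through the iterator into the node
        sum += *it;
    }
    return sum;      // (10 + 1) + (20 + 1)
}
```

The loop has the same shape as a pointer loop over an array, even though the nodes are not adjacent in memory.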

Both struct and class define classes; the members of a struct are public by default, while the members of a class are private by default.

3. const iterator of list

When a native pointer serves as the iterator, a const iterator only needs a const in front of the pointee type. For the class-based iterator that does not work: putting const on the far left makes the iterator object itself const, and does not prevent the data it points to from being modified.
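
The difference can be seen with std::list from the standard library. This is only an illustration of the point above; write_through_const_object is a hypothetical function name.

```cpp
#include <cassert>
#include <list>
#include <type_traits>
#include <utility>

// A const iterator *object* can still write through operator*; it just cannot move.
inline int write_through_const_object() {
    std::list<int> l{1, 2, 3};
    const std::list<int>::iterator it = l.begin();
    *it = 42;        // compiles: the element itself is not const
    // ++it;         // would not compile: the iterator object is const
    return l.front();
}

// Dereferencing a const_iterator yields a reference to const, so writes are rejected.
static_assert(std::is_same<decltype(*std::declval<std::list<int>::const_iterator>()),
                           const int&>::value,
              "const_iterator dereferences to const int&");
```

So "const iterator" in the container sense means an iterator type whose dereference is read-only, not an iterator variable declared const.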

So a separate class can be written for const iterators:

template<class T>
struct __list_const_iterator
{
    typedef __list_node<T> node;
    node* pnode;

    __list_const_iterator(node* p)
        :pnode(p)
    {}

    const T& operator*() const
    {
        return pnode->_data;
    }

    __list_const_iterator<T>& operator++()
    {
        pnode = pnode->_next;
        return *this;
    }

    bool operator!=(const __list_const_iterator<T>& it) const
    {
        return pnode != it.pnode;
    }
};

The essential difference between a const iterator and an ordinary iterator is that the data obtained by dereferencing a const iterator cannot be modified; the iterator itself must still support ++.


Writing ordinary iterators and const iterators as two separate classes is redundant, because they are identical except for the dereference operation.

So generic programming can be applied: make the return type of the dereference function a template parameter. When different template arguments are passed in, the compiler generates different classes. The library implementation takes the same approach.

template<class T, class Ref>
struct __list_iterator
{
    typedef __list_node<T> node;
    typedef __list_iterator<T, Ref> Self;
    node* pnode;

    __list_iterator(node* p)
        :pnode(p)
    {}

    Ref operator*()
    {
        return pnode->_data;
    }

    Self& operator++()
    {
        pnode = pnode->_next;
        return *this;
    }

    bool operator!=(const Self& it) const
    {
        return pnode != it.pnode;
    }
};
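
A short sketch of how the Ref parameter produces the two iterator types from one template. The names demo_node, demo_iterator, iter, const_iter and read_and_write are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <class T>
struct demo_node {
    demo_node<T>* _prev;
    demo_node<T>* _next;
    T _data;
};

template <class T, class Ref>
struct demo_iterator {
    typedef demo_iterator<T, Ref> Self;
    demo_node<T>* pnode;
    explicit demo_iterator(demo_node<T>* p) : pnode(p) {}
    Ref operator*() const { return pnode->_data; }
    Self& operator++() { pnode = pnode->_next; return *this; }
    bool operator!=(const Self& it) const { return pnode != it.pnode; }
};

// Two instantiations of the same template give two distinct classes.
typedef demo_iterator<int, int&> iter;
typedef demo_iterator<int, const int&> const_iter;

static_assert(std::is_same<decltype(*std::declval<iter>()), int&>::value,
              "ordinary iterator dereferences to int&");
static_assert(std::is_same<decltype(*std::declval<const_iter>()), const int&>::value,
              "const iterator dereferences to const int&");

inline int read_and_write() {
    demo_node<int> n{nullptr, nullptr, 5};
    n._prev = n._next = &n;   // single node linked to itself
    iter it(&n);
    *it = 7;                  // allowed: Ref is int&
    const_iter cit(&n);
    // *cit = 9;              // would not compile: Ref is const int&
    return *cit;
}
```

The container then only has to publish the two typedefs; no code is duplicated between the const and non-const versions.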

4. Invalidation problem of list iterator

erase invalidates: once a node is erased its memory is freed, so any iterator pointing to the erased node becomes invalid.

insert does not invalidate: the new node is inserted before the given node and does not affect that node.
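
Both rules can be observed with std::list. The sketch below uses only standard library calls; erase_evens and insert_keeps_iterator are hypothetical function names.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Removes every even value, re-seating the iterator from erase's return value.
inline void erase_evens(std::list<int>& l) {
    for (std::list<int>::iterator it = l.begin(); it != l.end();) {
        if (*it % 2 == 0)
            it = l.erase(it);   // the old iterator is invalid; erase returns the next position
        else
            ++it;
    }
}

// Inserting before a position leaves an existing iterator to that position usable.
inline int insert_keeps_iterator() {
    std::list<int> l{1, 3};
    std::list<int>::iterator it = std::prev(l.end());  // refers to 3
    l.insert(it, 2);                                   // list is now 1 2 3
    return *it;                                        // still refers to 3
}
```

The usual pattern is therefore to assign the result of erase back to the loop iterator and to advance only when nothing was erased.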

5. Final version of the list iterator

Both ordinary and const iterators now exist, but the iterator is not complete yet. An iterator behaves like a pointer, and when a pointer points to a struct, the members of the struct are accessed with the -> operator, so the iterator has to overload -> as well. A const iterator must not allow modification through ->, so the return type of -> has to differ between the const and non-const versions; it is therefore made a template parameter too. The iterator now has three template parameters.

template<class T, class Ref, class Ptr>
struct __list_iterator
{
  typedef __list_node<T> node;
  typedef __list_iterator<T, Ref, Ptr> Self;
  node* pnode;

  __list_iterator(node* p)
      :pnode(p)
  {}

  Ref operator*()
  {
      return pnode->_data;
  }

  // There are two layers here: the iterator wraps a node pointer, and the node holds the data.
  // A single -> on the iterator should reach the members of the data directly,
  // so operator-> returns the address of the data.
  Ptr operator->()
  {
      return &(pnode->_data);
  }

  Self& operator++()
  {
      pnode = pnode->_next;
      return *this;
  }

  Self operator++(int) // postfix ++; the unused int parameter marks the postfix form
  {
      Self tmp(*this);
      pnode = pnode->_next;
      return tmp;
  }

  Self& operator--()
  {
      pnode = pnode->_prev;
      return *this;
  }

  Self operator--(int) // postfix --
  {
      Self tmp(*this);
      pnode = pnode->_prev;
      return tmp;
  }

  bool operator!=(const Self& it) const
  {
      return pnode != it.pnode;
  }

  bool operator==(const Self& it) const
  {
      return pnode == it.pnode;
  }
};
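
To show what the -> overload buys, here is a small sketch with a struct as the element type. Pos, demo_node, demo_iterator and arrow_demo are hypothetical names for this illustration, and the iterator is trimmed to the two operators being discussed.

```cpp
#include <cassert>

struct Pos { int row; int col; };   // hypothetical element type

template <class T>
struct demo_node {
    demo_node<T>* _prev;
    demo_node<T>* _next;
    T _data;
};

template <class T, class Ref, class Ptr>
struct demo_iterator {
    demo_node<T>* pnode;
    explicit demo_iterator(demo_node<T>* p) : pnode(p) {}
    Ref operator*() const { return pnode->_data; }
    Ptr operator->() const { return &pnode->_data; }   // address of the data, not of the node
};

inline int arrow_demo() {
    demo_node<Pos> n{nullptr, nullptr, Pos{1, 2}};
    demo_iterator<Pos, Pos&, Pos*> it(&n);
    it->row = 10;    // the compiler evaluates it.operator->()->row
    demo_iterator<Pos, const Pos&, const Pos*> cit(&n);
    // cit->col = 5; // would not compile: Ptr is const Pos*
    return cit->row + cit->col;   // 10 + 2
}
```

Without the overload the caller would have to write (*it).row; with it, the iterator reads like a pointer to the element.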

6. The value of the iterator

The iterator encapsulates the underlying implementation details while giving us a uniform way to access containers, which lowers the learning cost. Containers are implemented differently and have different structures, so the way their elements are actually reached differs from container to container. The iterator hides that difference. For example, set, which is used later, is a binary search tree, yet it can be traversed with an iterator in exactly the same way as here.
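
As an illustration of that uniform access, the single function template below works unchanged for an array-based, a node-based and a tree-based standard container. sum_all is a hypothetical name chosen for this sketch.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Adds up the elements of any container of int that exposes begin()/end().
// Only !=, ++ and * are used, which every iterator category supports.
template <class Container>
int sum_all(const Container& c) {
    int total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}
```

The caller never needs to know whether ++ advances a pointer, follows a link, or walks to the in-order successor in a tree.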

3. Some points for attention

1. The list in the standard library provides its own sort member function. A list iterator does not support random access, so a list cannot be passed to the generic sort algorithm, and techniques such as median-of-three pivot selection, and therefore quicksort, are not practical on it. The sort provided by list is in fact a merge sort. It is comparatively slow, and can even be slower than copying the data into a vector, sorting it there, and copying it back. So use it sparingly and prefer vector when sorting matters.

2. For an ordinary class the class name is the type, but for a class template the type is the class name plus the template arguments; for example, a list of int has the type list<int>.
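
The two sorting routes mentioned in point 1 can be sketched with the standard library as follows. sort_in_place and sort_via_vector are hypothetical names; which route is faster depends on the element type, the size and the implementation, so measure before relying on either.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

// Route 1: the member function. std::sort(l.begin(), l.end()) would not compile,
// because std::sort requires random access iterators.
inline void sort_in_place(std::list<int>& l) {
    l.sort();
}

// Route 2: copy into a vector, sort there, and copy the result back.
inline void sort_via_vector(std::list<int>& l) {
    std::vector<int> v(l.begin(), l.end());
    std::sort(v.begin(), v.end());
    l.assign(v.begin(), v.end());
}
```

Both functions leave the list in ascending order; the second one temporarily needs extra memory for the copy.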

4. Comparison of list and vector

vector:

Advantages of vector (structural advantages):

1. Supports random access by subscript

2. Insertion and deletion at the tail are efficient (slower when the insertion triggers an expansion)

3. The CPU cache hit rate is high (the cache loads a whole block of memory at a time, and the elements of a vector are stored contiguously, so a loaded block is likely to contain the neighboring elements of the vector as well)


Disadvantages of vector:

1. Insertion and deletion anywhere other than the tail are inefficient

2. Expansion has a cost, and some capacity is wasted


Vector iterator invalidation problem:

Both insert and erase can invalidate iterators: after insert, an expansion may leave the iterator dangling (a wild pointer); after erase, the iterator no longer refers to the element it referred to before.

In fact, insert on string has the same iterator invalidation problem, but almost all of string's interfaces work with subscripts, so iterator invalidation was not considered when implementing string.
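
A minimal sketch of how to stay safe with std::vector: always continue from the iterator that insert or erase returns. insert_before_evens and erase_evens are hypothetical function names for this example.

```cpp
#include <cassert>
#include <vector>

// Inserts value * 10 in front of every even element.
inline void insert_before_evens(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        if (*it % 2 == 0) {
            it = v.insert(it, *it * 10);  // may reallocate; the old iterator must not be reused
            ++it;                         // step from the new element back onto the even one
        }
    }
}

// Removes every even element.
inline void erase_evens(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end();) {
        if (*it % 2 == 0)
            it = v.erase(it);  // later elements shift down; continue from the returned position
        else
            ++it;
    }
}
```

Continuing with the pre-insert or pre-erase iterator instead would be undefined behavior, even if it happens to appear to work when no reallocation occurs.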

list:

Advantages of lists:

1. Space is allocated and released on demand, so no capacity is wasted

2. Insertion and deletion at any position are O(1), not counting the time needed to find the position


Disadvantages of lists:

1. Does not support random access, low search efficiency

2. CPU cache hit rate is low

3. In addition to storing data, each node also stores two additional pointers


List iterator invalidation problem:

erase invalidates the iterator to the erased node; insert invalidates nothing

5. Complete list implementation

#pragma once
#include<iostream>
#include<assert.h>
#include<algorithm>


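//list is essentially a circular doubly linked list with a sentinel head. Because the nodes are not
//contiguous in memory, the list iterator cannot be a native pointer; wrapping the pointer in a class
//and overloading operators redefines the access rules so that the iterator behaves like a pointer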

namespace wbm
{
	
	template<class T>
	struct __list_node
	{
		__list_node<T>* next;
		__list_node<T>* prev;
		T data;

		//initialize a list node
		__list_node(const T&x)
			:next(nullptr)
			,prev(nullptr)
			,data(x)
		{}
	};


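	//Both a const and a non-const dereference are needed: one allows modification, the other does not.
	//To avoid duplicated code, the reference and pointer types are passed as template parameters
	template<class T,class Ref,class Ptr>
	struct __list_iterator //a pointer-like wrapper around a node pointer, built with encapsulation and operator overloading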
	{
		typedef __list_iterator<T, Ref,Ptr>Self;
		typedef __list_node<T> node;
		node* pnode;

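		//construct from a node pointer
		__list_iterator(node *p)
			:pnode(p)
		{}

		//what each operator does depends on the member it uses: dereference returns the data, ++ moves the node pointer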

		Ref operator*()
		{
			return pnode->data;
		}

		Self& operator++()
		{
			pnode = pnode->next;
			return *this;
		}

		Self operator++(int)//postfix ++: return the old value, then advance
		{
			//node* tmp = pnode;
			Self tmp(*this);//create an iterator object as a copy of *this
			pnode = pnode->next;

			return tmp;
		}
		Self& operator--()
		{
			pnode = pnode->prev;
			return *this;
		}
		Self operator--(int)//postfix --
		{
			//node* tmp = pnode;
			Self tmp(*this);
			pnode = pnode->prev;
			return tmp;
		}

		//overload !=: a comparison between two iterators
		bool operator!=(const Self& it)
		{
			return pnode != it.pnode;
		}

		bool operator==(const Self& it)
		{
			return pnode == it.pnode;
		}

		//overload the arrow operator
		Ptr operator->()
		{
			//for direct member access
			return &(pnode->data);//the iterator wraps a node pointer (pnode); return the address of the data stored in that node
		}
	};

	template<class T>
	class list
	{
		typedef __list_node<T> node;
		

	public:
		typedef __list_iterator<T,T&,T*> iterator;
		typedef __list_iterator<T, const T&,const T*> const_iterator;

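		void emptyInit()
		{
			_head = new node(T());//initialize the sentinel with a value-initialized temporary T(), so T must be default-constructible
			_head->next = _head;
			_head->prev = _head;
			_size = 0;
		}

		//default constructor: link the sentinel to itself
		list()
		{
			emptyInit();//reusing code improves maintainability
		}

		//constructor from an iterator range
		template<class InputIterator>
		list(InputIterator first, InputIterator last)
		{
			emptyInit();
			while (first != last)
			{
				push_back(*first);//dereferencing the iterator yields the element it refers to
				++first;
			}
		}

		//copy constructor and copy assignment
		//the copy constructor is simply an overload of the constructor: lt(lt1), initializing one object from another
		list(const list<T>&lt)
		{
			//copy every node of the other list and append it after my own sentinel
			emptyInit();

			for (auto& e : lt)//take each element by reference to avoid an expensive extra copy
			{
				push_back(e);//push_back calls insert, which initializes the new node from e
			}
		}
		
		list<T>& operator=(list<T> lt)//passing by value already makes a copy; just swap with it and take over its nodes
		{
			swap(lt);
			return *this;
		}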

		void swap(list<T>&lt)
		{
			std::swap(_head,lt._head);//exchange the sentinel nodes of the two lists
			std::swap(_size, lt._size);
		}
		//iterators make traversal and printing convenient
		iterator begin()
		{
			//the node after the sentinel
			return iterator(_head->next);
		}

		iterator end()
		{
			//one past the tail node, which is the sentinel
			return iterator(_head);
		}

		const_iterator begin()const
		{
			return const_iterator(_head->next);
		}

		const_iterator end()const
		{
			return const_iterator(_head);
		}

		//append at the tail
		void push_back(const T& val)
		{
			//hand-written version: locate the tail node, then link the new node after it
			/*node* newnode = new node(val);
			node* tail = _head->prev;
			tail->next = newnode;
			newnode->prev = tail;
			newnode->next = _head;
			_head->prev = newnode;*/

			insert(end(), val);
		}

		//pop_back, push_front, pop_front, insert, erase
		void pop_back()
		{
			//node* tail = _head->prev;
			//_head->prev = tail->prev;
			//tail->prev->next = _head;

			//free the tail node
			//delete tail;

			erase(--end());
		}

		void push_front(const T&x)
		{
			//hand-written version: allocate a new node and link it after the sentinel
			/*node* newnode = new node(x);

			newnode->next = _head->next;
			_head->next->prev = newnode;
			newnode->prev = _head;
			_head->next = newnode;*/

			insert(begin(), x);

		}

		void pop_front()
		{
			/*node* next = _head->next;
			_head->next = next->next;
			next->next->prev = _head;
			delete next;*/
			
			erase(begin());
		}

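		//insert before position pos
		iterator insert(iterator pos,const T&val)
		{
			//take the raw node pointer out of pos (the iterator's -> reaches the data, not the node)
			node* cur = pos.pnode;//cur is the pointer to the node at pos
			
			node* newnode = new node(val);
			newnode->prev = cur->prev;
			cur->prev->next = newnode;
			newnode->next = cur;
			cur->prev = newnode;

			_size++;
			//after the insertion pos is still valid, because it still refers to the same node
			return iterator(newnode);
		}

		iterator erase(iterator pos)//after erase the iterator is invalid, because the node it referred to has been freed
		{
			//the sentinel must not be erased
			assert(pos != end());

			//return the node after the erased one
			node* cur = pos.pnode;
			node* next = cur->next;
			node* prev = cur->prev;

			next->prev = prev;
			prev->next = next;

			delete cur;

			_size--;
			return iterator(next);
		}

		void clear()
		{
			//remove every data node but keep the sentinel
			iterator it = begin();
			while (it != end())
			{
				it = erase(it);//erase already returns the next position, so no extra ++ is needed
			}
		}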

		bool empty()const
		{
			return _size == 0;
		}

		size_t size()const
		{
			return _size;
		}

		//destructor: call clear, then free the sentinel as well
		~list()
		{
			clear();

			delete _head;
			_head = nullptr;
		}

	private:
		node* _head;		//sentinel; pointer to a list node
		size_t _size;
	};



	void test_list1()
	{
		using std::cout;//this file never brings namespace std into scope
		using std::endl;

		list<int> lt1;
		lt1.push_back(1);
		lt1.push_back(2);
		lt1.push_back(3);
		lt1.push_back(5);
		lt1.push_back(7);
		lt1.push_back(9);
		

		list<int>::iterator it = lt1.begin();
		while (it != lt1.end())
		{
			cout << *it << " ";
			++it;
		}
		cout << endl;
		

		lt1.pop_back();
		lt1.pop_back();
		lt1.pop_back();
		lt1.pop_back();
		
		for (auto e : lt1)
		{
			cout << e << " ";
		}
		cout << endl;

		lt1.push_front(10);
		lt1.push_front(20);
		lt1.push_front(30);
		lt1.push_front(40);

		for (auto e : lt1)
		{
			cout << e << " ";
		}
		cout << endl;

		lt1.pop_front();
		lt1.pop_front();
		lt1.pop_front();
		lt1.pop_front();
		lt1.pop_front();
		for (auto e : lt1)
		{
			cout << e << " ";
		}
		cout << endl;

		it = lt1.begin();//refresh the iterator that was used earlier

		lt1.insert(it, 20);
		lt1.insert(it, 30);
		for (auto e : lt1)
		{
			cout << e << " ";
		}
		cout << endl;

		lt1.clear();


	}
}


Origin blog.csdn.net/m0_62633482/article/details/130176322