[STL] Detailed implementation of vector simulation

Table of contents

1. Preparation

Two, push_back

1. About references

2. Modification of parameter const

Supplement

Third, iterator implementation

Four, Pop_back

Five, insert

1. Supplement - iterator invalidation

Six, erase

Seven, constructor 

1. Iterator construction 

2. Other constructors

3. Copy construction 

1) Traditional way of writing

2) Modern writing (improve function reusability) 

Eight, assignment operator overloading

Nine, resize


 

1. Preparation

To prepare, we need two things we have already learned, namespaces and class templates, and it also helps to read how the STL source code implements vector before writing our own.

Before starting to implement, let's get familiar with the framework of vector:

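// Header file: my_vector.h
#include <iostream>
#include <vector>
#include <cassert>
using namespace std;

namespace my_vector  // our class is also named vector; the namespace keeps it from clashing with the STL vector
{
    template <class T>  // the allocator (memory pool) template parameter is omitted here; we will study it later
    class vector
    {
    public:
        // a vector occupies one contiguous block of memory, so a raw pointer can serve as its iterator
        typedef T* iterator;

        // default constructor: start empty, with all three pointers null
        vector()
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {}

    private:
        iterator _start;          // start of the storage
        iterator _finish;         // one past the last element; _finish - _start is the size
        iterator end_of_storage;  // one past the end of the allocated storage
    };
}

// Test file
#include "my_vector.h"

int main()
{
    my_vector::vector<int> p1;  // qualify with the namespace to use our vector; otherwise the STL vector is used
    return 0;
}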

Two, push_back

        void push_back(const T& data)  // two questions to discuss: 1. the const qualifier; 2. the reference
        {
        }

There are two points to discuss here: 1. the const qualifier on the parameter; 2. the reference.

1. About references

   If T is a small built-in type such as int or char, passing by value is cheap and a reference gains little. But when T is a class such as string or vector<T>, passing by value makes a deep copy of the argument, which wastes a lot of performance. That is why the parameter is a reference.

2. Modification of parameter const

   Consider the following scenarios:

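int main()
{
    my_vector::vector<int> p1;  // qualify with the namespace to use our vector; otherwise the STL vector is used

    // Scenario 1: the argument is a named variable
    int a = 10;
    p1.push_back(a);  // fine

    // Scenario 2: the argument is a temporary or an anonymous object
    p1.push_back(10);  // without const on the parameter this would not compile: a temporary cannot bind to a non-const reference
    my_vector::vector<string> p2;
    p2.push_back(string("xxxxx"));
    return 0;
}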

 

Conclusion: 1. A reference parameter keeps vector efficient for every element type, because the argument is not copied on the way in. 2. The const qualifier widens what the function accepts: a const reference can bind to variables, temporaries, and anonymous objects alike.
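To make the two conclusions concrete, here is a minimal sketch. The type Tracked and the helpers by_value, by_const_ref, copies_by_value and copies_by_const_ref are hypothetical names invented for this illustration; they are not part of my_vector. Tracked simply counts how often it is copied.

```cpp
#include <cassert>

// Hypothetical helper type: counts how many times it is copied.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

// Pass by value: the argument is copied into the parameter.
void by_value(Tracked t) { (void)t; }

// Pass by const reference: no copy is made, and temporaries are accepted too.
void by_const_ref(const Tracked& t) { (void)t; }

// void by_ref(Tracked& t);  // by_ref(Tracked()) would not compile:
//                           // a temporary cannot bind to a non-const reference

// Number of copies made when a named object is passed by value.
int copies_by_value() {
    Tracked::copies = 0;
    Tracked a;
    by_value(a);
    return Tracked::copies;
}

// Number of copies made when passing by const reference.
int copies_by_const_ref() {
    Tracked::copies = 0;
    Tracked a;
    by_const_ref(a);          // named object: binds directly
    by_const_ref(Tracked());  // temporary: binds as well
    return Tracked::copies;
}
```

Calling copies_by_value() reports one copy, while copies_by_const_ref() reports none even though it passes both a variable and a temporary.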

 Here is the implementation code for push_back + operator[] :

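        size_t size() const  // const, so that it can also be called on const objects
        {
            return _finish - _start;
        }

        size_t capacity() const
        {
            return end_of_storage - _start;
        }

        void reserve(size_t n)
        {
            if (n > capacity())
            {
                size_t old_size = size();  // record the size before _start changes
                T* new_start = new T[n];
                if (_start)  // if there is old storage, copy its elements over
                {
                    // memcpy(new_start, _start, sizeof(T) * old_size);  why not simply use memcpy?
                    // this is the key point, explained below
                    for (size_t i = 0; i < old_size; i++)  // copy only the existing elements, not n of them
                    {
                        new_start[i] = _start[i];  // for class types this calls T's operator=
                    }
                    delete[] _start;
                }
                _start = new_start;
                _finish = new_start + old_size;
                end_of_storage = new_start + n;
            }
        }

        T& operator[](size_t pos)
        {
            assert(pos < size());  // bounds check
            return _start[pos];
        }

        const T& operator[](size_t pos) const  // const overload, needed to index a const vector
        {
            assert(pos < size());
            return _start[pos];
        }

        void push_back(const T& data)
        {
            if (_finish == end_of_storage)
            {
                reserve(capacity() == 0 ? 4 : capacity() * 2);  // start at 4, then double
            }
            *_finish = data;
            _finish++;
        }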

Supplement

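Why not use memcpy?

If T has only ever been a built-in type such as int, the problem is hard to spot. memcpy copies raw bytes. When T is a class type that owns resources, such as string, the vector's own storage is copied to a new block (the outer layer is a deep copy), but each element's internal pointer is copied as-is (the inner layer is still a shallow copy). The old and new elements then share the same buffer, and delete[] _start destroys the old elements and frees that buffer, leaving the new elements with dangling pointers.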
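The difference between a byte-wise copy and an element-wise copy can be sketched as follows. Handle is a hypothetical stand-in for a class that holds a pointer to a buffer; it is deliberately kept trivially copyable so that calling memcpy on it is well-defined for the demonstration. The function names are also made up for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// memcpy is only safe for trivially copyable types.
static_assert(std::is_trivially_copyable<int>::value,
              "int can be copied byte by byte");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string manages its own memory, so memcpy is not safe for it");

// Hypothetical stand-in for a type that points at a buffer.
struct Handle {
    char* buf;
};

// Byte-wise copy: duplicates the pointer, not the buffer it points to.
bool memcpy_shares_buffer() {
    char text[] = "hello";
    Handle a{text};
    Handle b{nullptr};
    std::memcpy(&b, &a, sizeof(Handle));
    return a.buf == b.buf;  // both handles now refer to the same memory
}

// Element-wise assignment: std::string's operator= makes an independent copy.
bool assignment_copies_deeply() {
    std::string a = "a string long enough that it is stored on the heap";
    std::string b;
    b = a;
    b[0] = 'A';          // modifying the copy...
    return a[0] == 'a';  // ...leaves the original untouched
}
```

Both functions return true: after memcpy the two handles share one buffer, while after assignment the two strings are independent. This is why reserve copies with a loop and operator= rather than memcpy.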

For the same reason, the traditional copy constructor (shown later) copies element by element instead of using memcpy.

Third, iterator implementation

       

In the STL there are two versions of each iterator; which one is used depends on whether the object is const.

Analysis: in a const member function, the type of this is const vector<T>* const, so the object cannot be modified through it. That narrowing of permissions is why some member functions need two versions.

For example:

        
        iterator begin()
		{
			return _start;
		}

        iterator end()
		{
			return _finish;
		}

        void func(const vector<T>& v)
		{
			const_iterator it = v.begin();
			while (it != v.end())
			{
				cout << *it << endl;
				it++;
			}
		}

Without const versions of begin() and end(), this code fails to compile: v is a const object, and the non-const begin() cannot be called on it.

The fix is simple. The full implementation is as follows:

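        typedef const T* const_iterator;  // pointer to const: elements can be read but not modified

        iterator begin()
        {
            return _start;
        }

        const_iterator begin() const  // chosen for const objects
        {
            return _start;
        }

        iterator end()
        {
            return _finish;
        }

        const_iterator end() const  // chosen for const objects
        {
            return _finish;
        }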

This applies not only to begin(): other member functions also need a const version in certain situations.

Now that both versions of begin and end are implemented, range-based for also works on const objects. (Under the hood, a range-based for loop is simply rewritten into iterator access.)

For example:

    my_vector::vector<int> p1;
    const my_vector::vector<int> p2;

    for (const auto& i : p1)  // uses the ordinary iterator (const and & balance efficiency and compatibility)
    {
        cout << i << " ";  // i is the element itself, not an index
    }

    for (const auto& i : p2)  // uses the const iterator
    {
        cout << i << " ";
    }
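To show what "rewritten into iterator access" means, here is a small sketch. IntBox is a hypothetical fixed-size container written only for this example; it uses raw pointers as iterators, the same idea as our vector, and provides both const and non-const begin/end.

```cpp
#include <cassert>

// Hypothetical fixed-size container with pointer iterators.
struct IntBox {
    int data[3];

    typedef int* iterator;
    typedef const int* const_iterator;

    iterator begin() { return data; }
    iterator end() { return data + 3; }
    const_iterator begin() const { return data; }  // used for const objects
    const_iterator end() const { return data + 3; }
};

// Range-based for over a const object: needs the const begin()/end().
int sum_range_for(const IntBox& box) {
    int total = 0;
    for (const auto& x : box) {
        total += x;
    }
    return total;
}

// Roughly what the compiler turns the loop above into.
int sum_desugared(const IntBox& box) {
    int total = 0;
    for (IntBox::const_iterator it = box.begin(); it != box.end(); ++it) {
        const int& x = *it;
        total += x;
    }
    return total;
}

// On a non-const object the ordinary iterator is chosen, so elements can be modified.
void double_all(IntBox& box) {
    for (auto& x : box) {
        x *= 2;
    }
}
```

If the two const overloads were removed from IntBox, sum_range_for and sum_desugared would both stop compiling, which is the same error described earlier for func.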

Four, Pop_back

        void Pop_back()
		{
			assert(_finish > _start);
			_finish--;
		}

Five, insert

     Insertion works the same way as inserting into an array in C: shift the elements from pos onward one slot to the right, then write the new value at pos.

         void insert(iterator pos, const T& val)
		{
			assert(pos >= _start);
			if (size() + 1 >= capacity())
			{
				reserve(capacity() == 0 ? 4 : capacity() * 2);
			}
	
			if (pos < _finish)
			{
				iterator end = _finish;
				while ( end != pos)
				{
					*end = *(end - 1);
					end--;
				}
				*pos = val;
				_finish++;
			}
			else
			{
				push_back(val);
			}
		}

There is a hidden trap in this code. Can you spot it? It is iterator invalidation; we look at it next and then improve the code above.

1. Supplement - iterator invalidation

     Based on the code written earlier, let's experiment with the following code.

void func1()
{
	my_vector::vector<int> p2;
	p2.push_back(1);
	p2.push_back(2);
	p2.push_back(3);
	p2.push_back(4);
	p2.push_back(5);

	auto pos = find(p2.begin(), p2.end(), 3);
	p2.insert(pos, 100); // insert is usually paired with the find algorithm, since the caller normally does not know the element's iterator in advance
	for (const auto& n : p2)
	{
		cout << n << " ";
	}
}

This runs without problems as written. Now comment out p2.push_back(5) and run it again: the program no longer behaves correctly.

There is a bug!

Here is the reason:

        In this scenario the insertion also triggers an expansion: with four elements the vector is full (capacity 4), so insert calls reserve. The iterator pos is essentially a pointer. After the expansion the data lives in a new block, but pos was never updated and still points into the old block, which has been freed. It has become a dangling (wild) pointer, and writing through it is an error.

The fix is also simple: before expanding, record the offset pos - _start, and after reserve returns, recompute pos as _start plus that offset.
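One way to update pos after an expansion is sketched below. tiny_vec is a hypothetical, int-only, stripped-down container written just for this example so that the block is self-contained; the member names follow the article. Returning an iterator from insert is an assumption borrowed from std::vector (the article's version returns void).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical int-only container, just enough to show the fix.
class tiny_vec {
public:
    typedef int* iterator;

    tiny_vec() : _start(nullptr), _finish(nullptr), end_of_storage(nullptr) {}
    ~tiny_vec() { delete[] _start; }
    tiny_vec(const tiny_vec&) = delete;  // copying is out of scope for this sketch
    tiny_vec& operator=(const tiny_vec&) = delete;

    iterator begin() { return _start; }
    iterator end() { return _finish; }
    std::size_t size() const { return _finish - _start; }
    std::size_t capacity() const { return end_of_storage - _start; }

    void reserve(std::size_t n) {
        if (n <= capacity()) return;
        std::size_t old_size = size();
        int* new_start = new int[n];
        for (std::size_t i = 0; i < old_size; ++i) new_start[i] = _start[i];
        delete[] _start;
        _start = new_start;
        _finish = new_start + old_size;
        end_of_storage = new_start + n;
    }

    iterator insert(iterator pos, int val) {
        assert(pos >= _start && pos <= _finish);
        if (_finish == end_of_storage) {
            std::size_t offset = pos - _start;  // remember where pos was
            reserve(capacity() == 0 ? 4 : capacity() * 2);
            pos = _start + offset;              // re-point pos into the new block
        }
        for (iterator it = _finish; it != pos; --it) {
            *it = *(it - 1);                    // shift the tail right by one
        }
        *pos = val;
        ++_finish;
        return pos;                             // iterator to the inserted element
    }

private:
    int* _start;
    int* _finish;
    int* end_of_storage;
};
```

With four elements already stored (size equals capacity), inserting in the middle forces an expansion, and the element still lands in the right place because pos is recomputed from the saved offset.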

 So what is iterator invalidation trying to tell us?

 Iterator invalidation: after we insert data through pos, pos may no longer be valid, so avoid using it to access data afterwards.

A question may come up here: haven't we already updated pos?

Answer: it was updated inside insert, but pos is passed by value, so changing the formal parameter does not affect the caller's argument. (Please forgive me for asking such a basic question.)

Some will also suggest passing pos by reference. That does solve the problem in this particular case (although a non-const reference parameter would reject temporaries such as p2.begin()), but my advice is to stay consistent with the STL. Changing an interface tends to have knock-on effects throughout the design, which is more than we can manage at this stage.
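For comparison, this is how the caller's side looks with std::vector, whose insert returns an iterator to the newly inserted element. The helper name insert_before_and_bump is made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts val before the first occurrence of target, then adds 1 to the
// inserted element through the iterator that insert returned.
std::vector<int> insert_before_and_bump(std::vector<int> v, int target, int val) {
    std::vector<int>::iterator pos = std::find(v.begin(), v.end(), target);
    // The old pos may dangle once insert reallocates, so it is replaced
    // with the iterator that insert hands back.
    pos = v.insert(pos, val);
    *pos += 1;  // safe: pos refers to the newly inserted element
    return v;
}
```

The caller never touches the old pos after the insertion; it only uses the refreshed one.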

Six, erase

   erase is relatively simple: the elements after pos are each moved forward by one to overwrite it. For testing, a vector such as vector<int> p1(10, 2) is convenient: 10 elements, each initialized to 2.

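        // the STL requires erase to return an iterator to the element after the erased one
        iterator erase(iterator pos)
        {
            assert(pos >= _start);
            assert(pos < _finish);

            iterator it = pos;  // shift with a separate iterator so that pos is preserved
            while (it + 1 != _finish)
            {
                *it = *(it + 1);
                it++;
            }
            _finish--;
            return pos;  // pos now holds the element that followed the erased one
        }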

 Does erase also invalidate iterators? With the erase we implemented, pos still points into valid storage afterwards. What about the one in the STL? That depends on the implementation. (The C++ standard only specifies what the library must do; how it is done is left to each implementation. For example, an implementation might shrink the storage inside erase, in which case the iterator would fail.) The standard itself treats iterators at or after the erased position as invalidated, so portable code must not rely on them.

 To summarize iterator invalidation: after insert or erase, treat pos as invalid and update it before using it again (for example, from the iterator that erase returns). Accessing it directly may produce unexpected results.
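A typical use of the iterator returned by erase is deleting elements while walking a container. The sketch below uses std::vector; the function name erase_evens is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Removes every even number from v.
void erase_evens(std::vector<int>& v) {
    std::vector<int>::iterator it = v.begin();
    while (it != v.end()) {
        if (*it % 2 == 0) {
            it = v.erase(it);  // continue from the iterator that erase returns
        } else {
            ++it;              // only advance when nothing was erased
        }
    }
}
```

Writing v.erase(it); ++it; instead would use an invalidated iterator and would also skip the element that slid into the erased slot.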

Seven, constructor 

      At the beginning we wrote a simple default constructor. Here we add two more: construction from an iterator range, and copy construction.

1. Iterator construction 

        template <class inputIterator>  // a separate template parameter so that any iterator type is accepted
        vector(inputIterator first, inputIterator last)
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            while (first != last)
            {
                push_back(*first);
                first++;
            }
        }

Note that this is a constructor, so it must initialize the three member pointers itself before calling push_back. Don't forget this.

2. Other constructors

         This constructor initializes the vector with n copies of a value in one call.

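        vector(size_t n, const T& val = T())
            : _start(nullptr)  // the pointers must be initialized before push_back is called
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            for (size_t i = 0; i < n; i++)
            {
                push_back(val);
            }
        }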

  

But this version has a bug: when both arguments are int, for example vector<int> p(10, 2), the call is resolved to the iterator constructor instead. The compiler picks the best-matching overload: the template deduces inputIterator as int and matches both arguments exactly, while this constructor needs an int to size_t conversion. The iterator constructor then dereferences an int as if it were an iterator, which results in an error.

How to adjust?

 In fact, it is very simple to solve:

Method 1: Change size_t to int

vector(int n , const T& val = T()) 

Method 2: keep size_t, let us overload an int constructor

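        vector(size_t n, const T& val = T())
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            for (size_t i = 0; i < n; i++)
            {
                push_back(val);
            }
        }

        vector(int n, const T& val = T())  // exact match for (int, int), preferred over the template
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            for (int i = 0; i < n; i++)
            {
                push_back(val);
            }
        }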
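The overload choice described above can be checked with a small probe. The types probe and fixed_probe and the enum ctor_kind are hypothetical and exist only for this sketch: their constructors do nothing except record which overload the compiler selected.

```cpp
#include <cassert>
#include <cstddef>

// Which constructor was selected.
enum ctor_kind { iterator_ctor, count_size_t_ctor, count_int_ctor };

// Mirrors the first version: iterator-range template + size_t overload.
struct probe {
    ctor_kind chosen;

    template <class InputIterator>
    probe(InputIterator, InputIterator) : chosen(iterator_ctor) {}

    probe(std::size_t, const int&) : chosen(count_size_t_ctor) {}
};

// Mirrors Method 2: the same two constructors plus an int overload.
struct fixed_probe {
    ctor_kind chosen;

    template <class InputIterator>
    fixed_probe(InputIterator, InputIterator) : chosen(iterator_ctor) {}

    fixed_probe(std::size_t, const int&) : chosen(count_size_t_ctor) {}

    // Exact match for (int, int); a non-template is preferred over a
    // template when both match equally well.
    fixed_probe(int, const int&) : chosen(count_int_ctor) {}
};
```

With probe, the call probe(10, 2) selects the iterator template. With fixed_probe, the same arguments select the int overload, and a real iterator pair such as two int pointers still goes to the template.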

3. Copy construction 

1) Traditional way of writing

        // traditional approach
        vector(const my_vector::vector<T>& v)
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            reserve(v.size());
            // memcpy(_start, v._start, sizeof(T) * v.size());  // shallow copy, unsafe for class types
            for (size_t i = 0; i < v.size(); i++)
            {
                _start[i] = v[i];  // element-wise assignment performs a deep copy
            }
            _finish += v.size();
        }

2) Modern writing (improve function reusability)

        void swap(my_vector::vector<T>& order)
        {
            std::swap(_start, order._start);
            std::swap(_finish, order._finish);
            std::swap(end_of_storage, order.end_of_storage);
        }

        // copy constructor
        vector(const my_vector::vector<T>& v)
            : _start(nullptr)
            , _finish(nullptr)
            , end_of_storage(nullptr)
        {
            my_vector::vector<T> tmp(v.begin(), v.end());
            // reuse the iterator constructor to build a copy, then swap its data into *this
            swap(tmp);
        }

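Eight, assignment operator overloading

Here is an error-prone point: some students confuse assignment with initialization. A declaration such as vector<int> p1 = 10; creates a new object, so it calls a constructor. Only p2 = p1;, where p2 already exists, calls operator=.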

void t()
{
    vector<int> p1 = 10;  // this initializes p1 (a constructor call), equivalent to p1(10)
    vector<int> p2;
    p2 = p1;              // p2 already exists, so this is what calls the assignment operator
}

With swap in place, the assignment operator is easy to write:

        vector<T>& operator=(vector<T> v)  // passing by value (no &) is intentional
        {
            swap(v);
            return *this;
        }

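Here is the idea behind it: because v is passed by value, it is already a copy of the right-hand side, made by the copy constructor. Swapping our members with v gives *this the copied data, and v, which now holds our old data, releases it when it is destroyed at the end of the function.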
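The same copy-and-swap idea, in a self-contained sketch. int_buf is a hypothetical class written only for this example; it owns a heap array so that the effect of the swap can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a heap-allocated int array.
class int_buf {
public:
    int_buf(std::size_t n, int val) : _data(new int[n]), _size(n) {
        for (std::size_t i = 0; i < n; ++i) _data[i] = val;
    }

    int_buf(const int_buf& other) : _data(new int[other._size]), _size(other._size) {
        for (std::size_t i = 0; i < _size; ++i) _data[i] = other._data[i];
    }

    ~int_buf() { delete[] _data; }

    void swap(int_buf& other) {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    // v is taken by value, so it is already a copy of the right-hand side.
    int_buf& operator=(int_buf v) {
        swap(v);       // take the copy's data, hand our old data to v
        return *this;
    }                  // v is destroyed here and frees the old data

    std::size_t size() const { return _size; }
    int& operator[](std::size_t i) { return _data[i]; }
    const int& operator[](std::size_t i) const { return _data[i]; }

private:
    int* _data;
    std::size_t _size;
};
```

After b = a, b holds its own copy of a's elements, and changing b does not affect a. Self-assignment needs no special check either, because the swap is done with a separate copy.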

Nine, resize

   resize changes the number of elements in an existing object: it appends copies of val when growing and drops elements from the end when shrinking.

        // adjust the number of elements
        void resize(size_t n, const T& val = T())
        {
            if (n > capacity())
            {
                reserve(n);
            }

            if (n > size())
            {
                // growing: fill the new slots with val
                while (_finish < _start + n)
                {
                    *_finish = val;
                    _finish++;
                }
            }
            else
            {
                // shrinking: just move _finish back
                _finish = _start + n;
            }
        }
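The expected behavior can be compared against std::vector, which our resize imitates. The helper name resized is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a copy of v resized to n elements, padding with val when growing.
std::vector<int> resized(std::vector<int> v, std::size_t n, int val) {
    v.resize(n, val);
    return v;
}
```

Growing {1, 2, 3} to 5 with val 9 gives {1, 2, 3, 9, 9}; shrinking it to 2 gives {1, 2}, and val is ignored in that case.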


Epilogue

   That is all for this section. Thank you for reading. If you have any suggestions, feel free to leave them in the comment area, and if the article helped you, a like or a follow is what keeps the author writing.


Origin blog.csdn.net/qq_72112924/article/details/131616414