STL containers: a simulated implementation of vector (with detailed notes)

1. What is a vector container?

In C++'s STL (Standard Template Library), vector is a dynamic array container. It provides a way to store and access elements, similar to a fixed-size array, but with the ability to be dynamically resized.

A vector resizes itself at run time as needed, and adding or removing elements at the end is fast. It provides member functions and iterators for accessing and manipulating its elements: inserting, erasing, traversing, and so on.

Some characteristics and typical uses of vector:

Elements are stored contiguously in memory and can be read and modified by subscript.
The size grows automatically as elements are added; push_back and pop_back add or remove an element at the end. (Capacity grows as needed, but removing elements does not by itself release memory.)
Iterators can be used to traverse and access the elements.
Member functions such as size(), which returns the number of elements, and empty(), which checks whether the container is empty, are also provided.

vector is one of the most commonly used containers in C++ for storing and manipulating variable-sized collections of data.
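
To make the list above concrete, here is a small sketch that uses std::vector itself. The helper names (build_sample, is_contiguous, sum_with_iterators) are made up for this illustration and are not part of the STL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a small vector and returns it so the caller can inspect it.
std::vector<int> build_sample()
{
    std::vector<int> v;          // starts empty
    for (int i = 1; i <= 5; ++i)
        v.push_back(i * 10);     // grows at the end: 10 20 30 40 50
    v.pop_back();                // removes the last element: 10 20 30 40
    v[0] = 11;                   // subscript access modifies in place
    return v;
}

// Returns true when element i sits exactly i slots after the first element.
bool is_contiguous(const std::vector<int>& v)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        if (&v[i] != v.data() + i)
            return false;
    return true;
}

// Sums the elements by walking the vector with iterators.
int sum_with_iterators(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}
```

Calling build_sample() yields a vector of four elements; is_contiguous and sum_with_iterators then exercise the contiguity and iterator points from the list.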

Second, the simulation implementation of vector

2.1 Member variables of vector

vector has three pointer members: _start, _finish and _endofstorage. _start points to the first element of the underlying array, _finish points one past the last valid element, and _endofstorage points one past the end of the allocated storage. It follows that size() is _finish - _start and capacity() is _endofstorage - _start.
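
The relationship between the three pointers can be sketched over a plain int buffer. ThreePointers and view_over are hypothetical names used only for this illustration; the real class appears later in the article.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three members described above.
struct ThreePointers
{
    int* _start;         // first element
    int* _finish;        // one past the last valid element
    int* _endofstorage;  // one past the end of the allocated storage

    std::size_t size() const     { return static_cast<std::size_t>(_finish - _start); }
    std::size_t capacity() const { return static_cast<std::size_t>(_endofstorage - _start); }
    bool full() const            { return _finish == _endofstorage; }
};

// Lays the three pointers over a caller-supplied buffer that holds `used`
// valid elements out of `total` slots.
ThreePointers view_over(int* buffer, std::size_t used, std::size_t total)
{
    assert(used <= total);
    ThreePointers p = { buffer, buffer + used, buffer + total };
    return p;
}
```

For an 8-slot buffer with 3 elements in use, size() is 3, capacity() is 8, and the last valid element is at _finish - 1.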

2.2 Constructor

The constructors initialize the member variables.

2.2.1 No-argument constructor

The default constructor only needs to set the three pointers to nullptr. Storage can be allocated either in the constructor or later, when the first element is inserted; this implementation allocates on insertion.

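		// Default constructor
		// The initial values can be given here in the initializer list, or as
		// default member initializers (= nullptr) where the members are declared.
		vector()
			:_start(nullptr)
			,_finish(nullptr)
			,_endofstorage(nullptr)
		{
		}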

2.2.2 Constructor with parameters



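		// Constructor taking a count and a value
		//
		// T() is a value-initialized temporary of type T used as the default argument.
		// Because vector can hold any type, 0 cannot be the default: it is right for
		// int but wrong if T is, say, string. As before, if the members already have
		// default initializers, the initializer list here can be omitted.
		vector(size_t n, const T& value = T())
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			// Reusing resize does all the work
			resize(n, value);
		}

		// Overload taking an int count
		// Without it, a call such as vector<int> v(5, 1) is an exact match for the
		// iterator-range constructor template below and would select that one
		// instead of the size_t overload above.
		vector(int n, const T& value = T())
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			resize(n, value);
		}

		// Constructor from an iterator range [first, last)
		template<class InputIterator>
		vector(InputIterator first, InputIterator last)
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			// Append the elements of the range one by one
			while (first != last)
			{
				push_back(*first);
				++first;
			}
		}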
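
Why the extra int overload matters can be shown with free functions that mirror the three constructor signatures. This is only a sketch: the pick functions and the two namespaces are hypothetical and exist purely to expose which overload the compiler chooses for the arguments (5, 1).

```cpp
#include <cstddef>
#include <string>

namespace without_int_overload
{
    // Mirrors vector(size_t n, const T& value) with T = int
    inline std::string pick(std::size_t, const int&) { return "count-value"; }

    // Mirrors the iterator-range constructor template
    template <class InputIterator>
    std::string pick(InputIterator, InputIterator) { return "iterator-range"; }
}

namespace with_int_overload
{
    inline std::string pick(std::size_t, const int&) { return "count-value"; }

    // Mirrors vector(int n, const T& value): an exact match for (5, 1)
    inline std::string pick(int, const int&) { return "count-value"; }

    template <class InputIterator>
    std::string pick(InputIterator, InputIterator) { return "iterator-range"; }
}
```

With only the size_t overload, pick(5, 1) needs an int-to-size_t conversion, so the template (deduced as InputIterator = int) is the better match. In the real constructor that choice would then fail to compile, because *first is applied to an int. Once the int overload exists, it ties with the template on conversions and wins because non-template functions are preferred.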



2.3 Copy constructor

Because the members are raw pointers, the copy constructor must perform a deep copy rather than a shallow one.

		// Copy constructor (traditional style). It must deep-copy; otherwise the
		// same block of memory would be freed twice.
		vector(const vector<T>& v)
			:_start(nullptr)
			,_finish(nullptr)
			,_endofstorage(nullptr)
		{
			// Allocate as much space as v has
			T* tmp = new T[v.capacity()];
			// Plain new throws std::bad_alloc on failure, so this check is only defensive
			if (tmp)
			{
				// Do not use memcpy(tmp, v._start, sizeof(T) * v.size()) here.
				// memcpy copies raw bytes. If T owns resources, as string does (its
				// members are a pointer _str plus _size and _capacity), a byte copy
				// duplicates only the _str pointer. The strings in both vectors would
				// then share one buffer (a shallow copy), that buffer would be freed
				// twice, and the program would crash. Assigning element by element
				// calls T's copy assignment operator, which performs the deep copy.
				for (size_t i = 0; i < v.size(); i++)
				{
					tmp[i] = v._start[i];
				}
				_start = tmp;
				_finish = _start + v.size();
				_endofstorage = _start + v.capacity();
			}
		}
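
The effect of copying element by element can be checked in isolation. The sketch below assumes nothing from the article's class: deep_copy_array and copy_is_independent are hypothetical helpers, and std::string stands in for "a type that owns memory".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Element-wise copy in the style of the copy constructor above: each
// assignment runs T's own copy assignment, so every element ends up with
// its own resources.
template <class T>
T* deep_copy_array(const T* src, std::size_t n)
{
    T* dst = new T[n];                    // default-constructs n elements
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];                  // not memcpy: calls T::operator=
    return dst;                           // caller releases with delete[]
}

// Copies two strings, edits the copies, and reports whether the originals
// were left untouched.
bool copy_is_independent()
{
    const std::string original[2] = {
        std::string(64, 'a'),             // long string, typically heap-allocated
        std::string("short")
    };
    std::string* copy = deep_copy_array(original, 2);
    copy[0][0] = 'X';
    copy[1] += "er";
    bool ok = original[0][0] == 'a' && original[1] == "short"
           && copy[0][0] == 'X' && copy[1] == "shorter"
           && copy[0].data() != original[0].data();
    delete[] copy;
    return ok;
}
```

Had the bytes been copied with memcpy instead, both arrays would hold the same internal pointers and destroying them would free the same buffers twice, which is exactly the crash described in the comment.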

2.4 Assignment overload function

Since the class owns memory through raw pointers, the assignment operator must also deep-copy. The version below relies on the Swap member function, which appears in the full listing in section 3.

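		// Assignment operator (modern copy-and-swap style)
		// Passing v by value invokes the copy constructor, which produces the
		// temporary copy we want. Swap then exchanges that copy's members with
		// ours, which completes the assignment; the old storage is released
		// when v is destroyed at the end of the function.
		vector<T>& operator=(vector<T> v)
		{
			Swap(v);
			return *this;
		}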
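
The copy-and-swap idiom can be tried out on a much smaller class. IntArray below is a hypothetical int-only owner written just for this illustration; it is not the article's vector, but its operator= has the same shape.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal owner of an int array, used only to illustrate
// copy-and-swap.
class IntArray
{
public:
    explicit IntArray(std::size_t n, int value = 0)
        : _data(new int[n]), _size(n)
    {
        for (std::size_t i = 0; i < n; ++i) _data[i] = value;
    }

    IntArray(const IntArray& other)              // deep copy
        : _data(new int[other._size]), _size(other._size)
    {
        for (std::size_t i = 0; i < _size; ++i) _data[i] = other._data[i];
    }

    // The parameter is taken by value, so the copy constructor has already
    // produced an independent copy by the time the body runs.
    IntArray& operator=(IntArray other)
    {
        Swap(other);      // take the copy's storage, hand ours to `other`
        return *this;     // `other` frees our old storage when it is destroyed
    }

    ~IntArray() { delete[] _data; }

    void Swap(IntArray& other)
    {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    std::size_t size() const { return _size; }
    int& operator[](std::size_t i) { assert(i < _size); return _data[i]; }

private:
    int* _data;
    std::size_t _size;
};
```

One convenient property of this design is that self-assignment needs no special case: the copy is made before anything in the target is touched.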

2.5 Destructors

The destructor releases the dynamically allocated storage.

		// Destructor
		~vector()
		{
			if (_start)
			{
				delete[] _start;
				_start = nullptr;
				_finish = nullptr;
				_endofstorage = nullptr;
			}
		}

2.6 reserve function

reserve changes the capacity of the container and is generally used to grow it; a request that is not larger than the current capacity does nothing.

		void reserve(size_t n)
		{
			if (n > capacity())
			{
				// Record size() before reallocating. Once _start points at the new
				// block, size() (= _finish - _start) would mix the old _finish with
				// the new _start, so _finish = _start + size() would merely reproduce
				// the old _finish: nullptr on the first growth, a dangling pointer
				// afterwards. Saving the old size in sz and then using
				// _finish = _start + sz is correct.
				size_t sz = size();

				T* tmp = new T[n];
				if (tmp && _start)
				{
					// Do not use memcpy(tmp, _start, sizeof(T) * size()) here.
					// memcpy copies raw bytes. If T owns resources, as string does
					// (its members are a pointer _str plus _size and _capacity), the
					// new elements would share buffers with the old ones (a shallow
					// copy), and those buffers would be freed twice. Assigning
					// element by element calls T's copy assignment operator, which
					// performs the deep copy.
					for (size_t i = 0; i < sz; i++)
					{
						tmp[i] = _start[i];
					}
					delete[] _start;
				}
				_start = tmp;
				_finish = _start + sz;
				_endofstorage = _start + n;
			}
		}
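
The point about saving the size before reallocating is easier to see in a stripped-down, int-only buffer. IntBuffer is a hypothetical type for this sketch; it reuses the article's member names but leaves out everything except reserve and push_back.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical int-only buffer with the same three pointers as the article's
// vector, reduced to what is needed to show how reserve updates them.
struct IntBuffer
{
    int* _start = nullptr;
    int* _finish = nullptr;
    int* _endofstorage = nullptr;

    IntBuffer() = default;
    IntBuffer(const IntBuffer&) = delete;             // keep the sketch simple
    IntBuffer& operator=(const IntBuffer&) = delete;
    ~IntBuffer() { delete[] _start; }

    std::size_t size() const     { return static_cast<std::size_t>(_finish - _start); }
    std::size_t capacity() const { return static_cast<std::size_t>(_endofstorage - _start); }

    void reserve(std::size_t n)
    {
        if (n <= capacity()) return;
        const std::size_t sz = size();   // measured while both pointers still agree
        int* tmp = new int[n];
        for (std::size_t i = 0; i < sz; ++i) tmp[i] = _start[i];
        delete[] _start;
        _start = tmp;
        // `_start + size()` here would subtract the NEW _start from the OLD _finish
        _finish = _start + sz;
        _endofstorage = _start + n;
    }

    void push_back(int x)
    {
        if (_finish == _endofstorage)
            reserve(capacity() == 0 ? 4 : capacity() * 2);
        *_finish++ = x;
    }
};
```

Pushing ten values grows the capacity 4, 8, 16; a later reserve(100) keeps all ten elements, and reserve(5) is a no-op because it never shrinks.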

2.7 resize function

resize changes the number of elements rather than just the capacity: it can shrink the vector, or grow it and fill the new elements with a given value (T() by default).

		void resize(size_t n, const T& value = T())
		{
			if (n <= size())
			{
				// Shrinking: just move _finish to one past the new last element
				_finish = _start + n;
			}
			else
			{
				// Storage must grow first when n > capacity(). reserve performs
				// that check itself, so checking here is optional.
				reserve(n);
				// Fill the range [_finish, _start + n) with value
				while (_finish != _start + n)
				{
					*_finish = value;
					++_finish;
				}
			}
		}
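
The three cases of resize (grow with a value, shrink, grow with the default) behave the same way on std::vector, which makes a quick reference for checking the simulated version. resize_demo is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Walks one vector through the three resize cases and returns the result.
std::vector<int> resize_demo()
{
    std::vector<int> v(3, 1);   // 1 1 1
    v.resize(5, 9);             // grow with a fill value: 1 1 1 9 9
    assert(v.size() == 5 && v[4] == 9);
    v.resize(2);                // shrink: 1 1
    assert(v.size() == 2);
    v.resize(4);                // grow with the default T(), which is 0 for int
    return v;                   // 1 1 0 0
}
```

Note that the 9s written by the first resize do not reappear after shrinking and growing again: the new slots are filled with the requested value, here the default 0.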

2.8 insert function

insert places a new element in front of the given position and returns an iterator to it.

		iterator insert(iterator pos, const T& x)
		{
			assert(pos >= _start && pos <= _finish);

			// Grow when full
			if (_finish == _endofstorage)
			{
				// Growing moves the elements to a new block, which invalidates pos.
				// Remember pos as an offset from _start so it can be restored.
				size_t len = pos - _start;
				size_t newCapacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(newCapacity);

				// Re-point pos at the same offset in the new block
				pos = _start + len;
			}
			// Ordinary array insertion: shift every element from pos onward one
			// slot to the right to open a gap at pos. The loop walks from _finish
			// down to pos + 1, so no pointer before _start is ever formed.
			iterator end = _finish;
			while (end > pos)
			{
				*end = *(end - 1);
				--end;
			}
			*pos = x;
			++_finish;

			// As in the STL source, return an iterator to the inserted element
			return pos;
		}
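
From the caller's side, the same invalidation issue is why the iterator returned by insert should be used afterwards instead of the one passed in. A small sketch with std::vector; insert_before_each is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts `x` in front of every element equal to `target`.
std::vector<int> insert_before_each(std::vector<int> v, int target, int x)
{
    std::vector<int>::iterator it = v.begin();
    while (it != v.end())
    {
        if (*it == target)
        {
            // insert may reallocate, so the old `it` must not be used again;
            // the returned iterator points at the newly inserted element.
            it = v.insert(it, x);
            ++it;                 // step onto the original `target`
        }
        ++it;                     // move past the element just examined
    }
    return v;
}
```

For the input 1 2 1 with target 1 and x 0, the result is 0 1 2 0 1.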

2.9 erase function

erase removes the element at the given position and returns an iterator to the element that followed it.

		iterator erase(iterator pos)
		{
			// An equivalent version that advances pos itself and returns a saved copy:
			//assert(pos >= begin() && pos < end());
			//iterator ret = pos;
			//iterator start = pos + 1;
			//while (start < end())
			//{
			//	*pos = *start;
			//	++pos;
			//	++start;
			//}
			//--_finish;
			//return ret;

			assert(pos >= begin() && pos < end());

			// Ordinary array deletion: shift every element after pos one slot
			// to the left
			iterator start = pos + 1;
			while (start < end())
			{
				*(start - 1) = *start;
				++start;
			}
			--_finish;
			return pos;
		}
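
Because erase returns the position of the element that followed the erased one, a loop that erases while iterating should continue from the returned iterator and only increment when nothing was erased. A sketch with std::vector; erase_even is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes every even number from the vector.
std::vector<int> erase_even(std::vector<int> v)
{
    std::vector<int>::iterator it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
            it = v.erase(it);   // already refers to the next element
        else
            ++it;               // advance only when nothing was erased
    }
    return v;
}
```

Incrementing unconditionally would skip the element that slid into the erased slot, so consecutive even numbers such as 2 4 would not both be removed.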

2.10 push_back and pop_back functions

push_back and pop_back insert and remove at the end. Both can be implemented by reusing insert and erase.

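		void push_back(const T& x)
		{
			// Reuse insert: end() is one past the last element, which is exactly
			// where a new last element goes
			insert(end(), x);
		}

		void pop_back()
		{
			assert(_finish > _start);	// nothing to remove from an empty vector
			// end() is one past the last element, so the element to remove is at
			// end() - 1. (--end() does not compile here: end() returns a pointer
			// by value, and the built-in -- needs a modifiable lvalue.)
			erase(end() - 1);
		}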

2.11 capacity function

		size_t capacity() const
		{
			return _endofstorage - _start;
		}

2.12 size function

		size_t size() const
		{
			return _finish - _start;
		}

2.13 Operator overloading operator[] function

		// operator[] returns a reference to the element at index pos
		T& operator[](size_t pos)
		{
			assert(pos < size());
			return _start[pos];
		}

		const T& operator[](size_t pos) const
		{
			assert(pos < size());
			return _start[pos];
		}

2.14 Iterators

		// For vector, a plain pointer to the element type serves as the iterator
		typedef T* iterator;
		typedef const T* const_iterator;

		iterator begin()
		{
			return _start;
		}

		iterator end()
		{
			return _finish;
		}

		const_iterator begin() const
		{
			return _start;
		}

		const_iterator end() const
		{
			return _finish;
		}
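
Because the iterators are raw pointers, providing begin() and end() is enough for range-based for loops and for standard algorithms. The sketch below uses a hypothetical fixed-capacity container, SmallInts, so that it stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>

// Hypothetical fixed-capacity container whose iterators are raw pointers,
// in the same style as the typedefs above.
class SmallInts
{
public:
    typedef int* iterator;
    typedef const int* const_iterator;

    SmallInts() : _count(0) {}

    void push_back(int x) { assert(_count < 8); _data[_count++] = x; }

    iterator begin()             { return _data; }
    iterator end()               { return _data + _count; }
    const_iterator begin() const { return _data; }
    const_iterator end() const   { return _data + _count; }

private:
    int _data[8];
    std::size_t _count;
};

// Range-based for only needs begin()/end(); on a const object it picks the
// const overloads.
int sum_all(const SmallInts& s)
{
    int total = 0;
    for (int e : s) total += e;
    return total;
}

// Pointers are random-access iterators, so std::sort accepts them directly.
void sort_descending(SmallInts& s)
{
    std::sort(s.begin(), s.end(), std::greater<int>());
}
```

The const overloads of begin() and end() are what allow a const container to be traversed without exposing write access to its elements.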

3. Complete code of the simulated vector implementation

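#pragma once

#include <assert.h>
#include <iostream>
#include <utility>	// std::swap
#include <vector>
using namespace std;

namespace kb
{
	template<class T>
	class vector
	{
	public:
		// For vector, a plain pointer to the element type serves as the iterator.
		// The typedefs are public so that callers can write kb::vector<int>::iterator.
		typedef T* iterator;
		typedef const T* const_iterator;
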
		iterator begin()
		{
			return _start;
		}

		iterator end()
		{
			return _finish;
		}

		const_iterator begin() const
		{
			return _start;
		}

		const_iterator end() const
		{
			return _finish;
		}

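		// Default constructor
		// The initial values can be given here in the initializer list, or as
		// default member initializers (= nullptr) where the members are declared.
		vector()
			:_start(nullptr)
			,_finish(nullptr)
			,_endofstorage(nullptr)
		{
		}

		// Constructor taking a count and a value
		//
		// T() is a value-initialized temporary of type T used as the default argument.
		// Because vector can hold any type, 0 cannot be the default: it is right for
		// int but wrong if T is, say, string. As before, if the members already have
		// default initializers, the initializer list here can be omitted.
		vector(size_t n, const T& value = T())
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			// Reusing resize does all the work
			resize(n, value);
		}

		// Overload taking an int count
		// Without it, a call such as vector<int> v(5, 1) is an exact match for the
		// iterator-range constructor template below and would select that one
		// instead of the size_t overload above.
		vector(int n, const T& value = T())
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			resize(n, value);
		}

		// Constructor from an iterator range [first, last)
		template<class InputIterator>
		vector(InputIterator first, InputIterator last)
			:_start(nullptr)
			, _finish(nullptr)
			, _endofstorage(nullptr)
		{
			// Append the elements of the range one by one
			while (first != last)
			{
				push_back(*first);
				++first;
			}
		}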
		
		// Copy constructor (traditional style). It must deep-copy; otherwise the
		// same block of memory would be freed twice.
		vector(const vector<T>& v)
			:_start(nullptr)
			,_finish(nullptr)
			,_endofstorage(nullptr)
		{
			// Allocate as much space as v has
			T* tmp = new T[v.capacity()];
			// Plain new throws std::bad_alloc on failure, so this check is only defensive
			if (tmp)
			{
				// Do not use memcpy(tmp, v._start, sizeof(T) * v.size()) here.
				// memcpy copies raw bytes. If T owns resources, as string does (its
				// members are a pointer _str plus _size and _capacity), a byte copy
				// duplicates only the _str pointer. The strings in both vectors would
				// then share one buffer (a shallow copy), that buffer would be freed
				// twice, and the program would crash. Assigning element by element
				// calls T's copy assignment operator, which performs the deep copy.
				for (size_t i = 0; i < v.size(); i++)
				{
					tmp[i] = v._start[i];
				}
				_start = tmp;
				_finish = _start + v.size();
				_endofstorage = _start + v.capacity();
			}
		}

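		// Exchanges the three pointers with those of tmp
		void Swap(vector<T>& tmp)
		{
			std::swap(_start, tmp._start);
			std::swap(_finish, tmp._finish);
			std::swap(_endofstorage, tmp._endofstorage);
		}

		// Assignment operator (modern copy-and-swap style)
		// Passing v by value invokes the copy constructor, which produces the
		// temporary copy we want. Swap then exchanges that copy's members with
		// ours, which completes the assignment; the old storage is released
		// when v is destroyed at the end of the function.
		vector<T>& operator=(vector<T> v)
		{
			Swap(v);
			return *this;
		}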

		// Destructor
		~vector()
		{
			if (_start)
			{
				delete[] _start;
				_start = nullptr;
				_finish = nullptr;
				_endofstorage = nullptr;
			}
		}

		void reserve(size_t n)
		{
			if (n > capacity())
			{
				// Record size() before reallocating. Once _start points at the new
				// block, size() (= _finish - _start) would mix the old _finish with
				// the new _start, so _finish = _start + size() would merely reproduce
				// the old _finish: nullptr on the first growth, a dangling pointer
				// afterwards. Saving the old size in sz and then using
				// _finish = _start + sz is correct.
				size_t sz = size();

				T* tmp = new T[n];
				if (tmp && _start)
				{
					// Do not use memcpy(tmp, _start, sizeof(T) * size()) here.
					// memcpy copies raw bytes. If T owns resources, as string does
					// (its members are a pointer _str plus _size and _capacity), the
					// new elements would share buffers with the old ones (a shallow
					// copy), and those buffers would be freed twice. Assigning
					// element by element calls T's copy assignment operator, which
					// performs the deep copy.
					for (size_t i = 0; i < sz; i++)
					{
						tmp[i] = _start[i];
					}
					delete[] _start;
				}
				_start = tmp;
				_finish = _start + sz;
				_endofstorage = _start + n;
			}
		}

		void resize(size_t n, const T& value = T())
		{
			if (n <= size())
			{
				// Shrinking: just move _finish to one past the new last element
				_finish = _start + n;
			}
			else
			{
				// Storage must grow first when n > capacity(). reserve performs
				// that check itself, so checking here is optional.
				reserve(n);
				// Fill the range [_finish, _start + n) with value
				while (_finish != _start + n)
				{
					*_finish = value;
					++_finish;
				}
			}
		}

		iterator insert(iterator pos, const T& x)
		{
			assert(pos >= _start && pos <= _finish);

			// Grow when full
			if (_finish == _endofstorage)
			{
				// Growing moves the elements to a new block, which invalidates pos.
				// Remember pos as an offset from _start so it can be restored.
				size_t len = pos - _start;
				size_t newCapacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(newCapacity);

				// Re-point pos at the same offset in the new block
				pos = _start + len;
			}
			// Ordinary array insertion: shift every element from pos onward one
			// slot to the right to open a gap at pos. The loop walks from _finish
			// down to pos + 1, so no pointer before _start is ever formed.
			iterator end = _finish;
			while (end > pos)
			{
				*end = *(end - 1);
				--end;
			}
			*pos = x;
			++_finish;

			// As in the STL source, return an iterator to the inserted element
			return pos;
		}

		iterator erase(iterator pos)
		{
			// An equivalent version that advances pos itself and returns a saved copy:
			//assert(pos >= begin() && pos < end());
			//iterator ret = pos;
			//iterator start = pos + 1;
			//while (start < end())
			//{
			//	*pos = *start;
			//	++pos;
			//	++start;
			//}
			//--_finish;
			//return ret;

			assert(pos >= begin() && pos < end());

			// Ordinary array deletion: shift every element after pos one slot
			// to the left
			iterator start = pos + 1;
			while (start < end())
			{
				*(start - 1) = *start;
				++start;
			}
			--_finish;
			return pos;
		}

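		void push_back(const T& x)
		{
			// Reuse insert: end() is one past the last element, which is exactly
			// where a new last element goes
			insert(end(), x);
		}

		void pop_back()
		{
			assert(_finish > _start);	// nothing to remove from an empty vector
			// end() is one past the last element, so the element to remove is at
			// end() - 1. (--end() does not compile here: end() returns a pointer
			// by value, and the built-in -- needs a modifiable lvalue.)
			erase(end() - 1);
		}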

		size_t capacity() const
		{
			return _endofstorage - _start;
		}

		size_t size() const
		{
			return _finish - _start;
		}

		// operator[] returns a reference to the element at index pos
		T& operator[](size_t pos)
		{
			assert(pos < size());
			return _start[pos];
		}

		const T& operator[](size_t pos) const
		{
			assert(pos < size());
			return _start[pos];
		}

	private:
		iterator _start = nullptr;
		iterator _finish = nullptr;
		iterator _endofstorage = nullptr;
	};

	void test1(void)
	{
		vector<int> v;
		//v.push_back(1);
		//v.push_back(2);
		//v.push_back(3);
		//v.push_back(4);
		//v.push_back(5);
		v.insert(v.begin(), 1);
		v.insert(v.begin(), 2);
		v.insert(v.begin(), 3);
		v.insert(v.begin(), 4);
		v.insert(v.begin(), 5);

		//v.insert(v.begin() + 3, 100);

		//for (const auto& e : v)
		//{
		//	cout << e << " ";
		//}
		//cout << endl;

		//kb::vector<int>::iterator it = v.erase(v.begin());
		//cout << *it << endl;

		size_t sz = v.size();
		for (size_t i = 0; i < sz; i++)
		{
			v.erase(v.begin());
		}

		for (size_t i = 0; i < v.size(); i++)
		{
			cout << v[i] << " ";
		}
		cout << endl;
	}

	void test2(void)
	{
		std::vector<int> v;
		v.push_back(1);
		v.push_back(2);
		v.push_back(3);
		v.push_back(4);
		std::vector<int>::iterator ret = v.erase(v.begin());
		cout << *ret << endl;
		//for (const auto& e : v)
		//{
		//	cout << e << " ";
		//}
		//cout << endl;
	}

	void test3(void)
	{
		vector<int> v(5, 1);
		v.push_back(2);
		v.push_back(2);
		v.push_back(2);
		v.insert(v.begin(), 9);

		vector<int> v1(v);

		// Print the original and then the copy; the two lines should match
		for (const auto& e : v)
		{
			cout << e << " ";
		}
		cout << endl;

		for (const auto& e : v1)
		{
			cout << e << " ";
		}
		cout << endl;
	}

	void test4(void)
	{
		vector<int> v1;
		v1.push_back(1);
		v1.push_back(2);
		v1.push_back(3);
		v1.push_back(4);
		v1.push_back(5);
		vector<int> v2(v1.begin(), v1.end());
		//v1.resize(6, 1);
		for (auto& e : v2)
		{
			cout << e << " ";
		}
		cout << endl;
	}
}

That covers the simulated implementation of the most commonly used interfaces of the STL vector container. vector has many more interfaces; only the frequently used ones are implemented here. If this article helped you, please like and follow. More C++ topics will follow. See you next time!


Origin blog.csdn.net/weixin_70056514/article/details/131740707