[Advanced Road to C++] Basic use and simulation implementation of vector

Preface

 As one of the STL containers, vector has a name that is often confusing. Taken literally, the word means a mathematical vector, which does not make much sense for a container. I have always felt that "dynamic array" or "sequence table" would be easier to understand. So why this name?

I saw an answer on Zhihu that seems quite reasonable:

[Image: screenshot of the Zhihu answer]

  • There is a comment below it that I also find interesting. It helped me understand the difficulty the STL designers faced: it is not that they did not want the name "dynamic array", but that the name had already been taken, so they had to choose this one.

Original link: "How should we understand that vector in C++ is a dynamic array, when the word originally means a mathematical vector? Why is it called that?"

Check the documentation for the explanation of this word~

Vectors are sequence containers representing arrays that can change in size.
In other words, a vector is a sequence container that represents an array whose size can change, that is, a sequence table.

Check the definition again~

template < class T, class Alloc = allocator<T> > class vector; 
// The first parameter is the element type. The second is the allocator (also called a memory pool), which we will not cover for now.

1. Simple use of vector

①Interface

The common interfaces here are similar to those of string. I will talk about some differences from string.

1.reserve

 For vector, reserve only ever expands the capacity; a request that is not larger than the current capacity does nothing. For string, the result of such a request is less clear-cut: the buffer may or may not be shrunk as an optimization, depending on the implementation.
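
As a rough illustration of this rule, the sketch below uses std::vector from the standard library. The helper name reserve_never_shrinks is made up for this example; it only checks that a smaller request leaves the capacity alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: returns true if asking for less
// capacity than the vector already has leaves the capacity untouched.
bool reserve_never_shrinks()
{
    std::vector<int> v;
    v.reserve(100);                           // grows: capacity is now at least 100
    const std::size_t before = v.capacity();
    assert(before >= 100);
    v.reserve(10);                            // smaller request: nothing happens
    return v.capacity() == before;
}
```

Calling reserve_never_shrinks() should return true on a conforming standard library.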

②Usage

1. Built-in types

Here is an int

vector<int> v;

2. Two-dimensional

For example, to create a two-dimensional dynamic array of int:

vector<vector<int>> vv;

Memory layout:
[Image: memory layout of vector<vector<int>>]
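
To make the two-dimensional case a little more concrete, here is a minimal sketch using std::vector. The helper make_grid and its parameters are assumptions for this example, not part of the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a rows x cols grid where every cell holds `fill`.
std::vector<std::vector<int>> make_grid(std::size_t rows, std::size_t cols, int fill)
{
    // The outer vector holds `rows` copies of an inner vector of `cols` ints.
    std::vector<std::vector<int>> grid(rows, std::vector<int>(cols, fill));
    assert(grid.size() == rows);
    return grid;
}
```

Each row is its own vector with its own storage, so writing to grid[0][0] does not affect grid[1][0].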

3. Custom type

For example string

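vector<string> vv;
// Define a named string object, then push it back
string str("shun_hua");
vv.push_back(str);
// Push back an anonymous (temporary) object
vv.push_back(string("shun_hua"));
// Rely on the implicit conversion from const char* to string
vv.push_back("shun_hua");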

Note: the template can be instantiated with any suitable type, built-in or user-defined.

③ Iterator invalidation

We only need to remember two rules:

  • A change in the storage space (most commonly a reallocation on expansion) may invalidate iterators.

Under the hood, an iterator is something like a pointer. When a reallocation happens, any iterator that still points into the old storage is left dangling unless it is updated to point into the new storage.

For example: push_back, reserve and insert, whenever they reallocate. swap is a related trap: iterators obtained before the swap then refer to elements owned by the other vector.

  • Moving data may invalidate iterators.

For example: after erase, Visual Studio treats the iterator as invalid and reports an error as soon as it is used again, whereas under Linux it does not. For the sake of portability, we therefore always treat an iterator that has been passed to erase as invalid.
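
One common way to stay safe across a possible reallocation is to remember a position as an index and re-acquire the iterator afterwards. The sketch below shows this with std::vector; the helper first_after_growth is a made-up name for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends `extra` values while tracking the first
// element through an index rather than through a stored iterator.
int first_after_growth(std::vector<int> v, int extra)
{
    assert(!v.empty());
    const std::size_t index = 0;          // an index survives reallocation
    for (int i = 0; i < extra; ++i)
    {
        v.push_back(i);                   // may reallocate and move the elements
    }
    std::vector<int>::iterator it = v.begin() + index;   // re-acquire afterwards
    return *it;
}
```

Had the iterator been taken before the loop, dereferencing it after a reallocation would be undefined behavior.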

2. Simulation implementation

①Key points

  • 1. To avoid conflicting with the vector in the library, we wrap our own class in a namespace.
  • 2. The framework here follows the sequence table from data structures.
  • 3. For ease of understanding, the interfaces below are explained separately; the complete source code is given at the end.

②Basic framework

namespace my_vector
{
	template<class T>
	class vector
	{
	public:
		// Type aliases for the element and iterator types
		typedef T value_type;
		typedef const T const_value_type;
		typedef T* iterator;
		typedef const T* const_iterator;
	private:
		// Default member initializers, so constructors need no initializer list
		iterator _begin = nullptr;
		iterator _finish = nullptr;
		iterator _end_of_storage = nullptr;
	};
}

Rough framework diagram:
[Image: rough framework diagram]
Some readers have asked why we do not implement it in the usual sequence-table form (a pointer plus a size and a capacity). The main goal here is to learn the underlying principles of the library; the library is implemented this way, so we write it the same way. This form also has advantages, which are discussed below.

③Iterator

1.begin

iterator begin()
{
	return _begin;
}
const_iterator begin()const
{
	return _begin;
}

2.end

iterator end()
{
	return _finish;
}
const_iterator end()const
{
	return _finish;
}

A brief note: cbegin and cend in the library only ever return const_iterator, because they are const member functions (their this pointer is const-qualified).

④ size

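size_t size() const
{
	return _finish - _begin;
}

⑤capacity

size_t capacity()const
{
	return _end_of_storage - _begin;
}
  • Both size and capacity rely on the fact that subtracting two pointers gives the number of elements between them, and the range is half-open: closed on the left, open on the right.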
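
The pointer arithmetic behind size and capacity can be sketched on a plain array. The helper count_between is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of elements in the half-open range [first, last).
std::size_t count_between(const int* first, const int* last)
{
    assert(first <= last);
    // Pointer subtraction counts elements, not bytes.
    return static_cast<std::size_t>(last - first);
}
```

For an array int a[8], count_between(a, a + 8) gives 8 and count_between(a, a) gives 0, which matches an empty half-open range.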

⑥Constructor and destructor

Constructor

1.Default constructor

vector()
{
	// Nothing to do: the members already have default initializers.
}

2. Constructors with parameters

// resize is given here first because the constructor below relies on it.
void resize(size_t n, value_type val = value_type())
{
	if (n < size())
	{
		// Shrinking: the new end is n elements past the beginning.
		_finish = _begin + n;
	}
	else
	{
		reserve(n);
		iterator end = _begin + n;
		while (_finish != end)
		{
			*_finish = val;
			_finish++;
		}
		// _finish is advanced inside the loop, so no final adjustment is needed.
	}
}
vector(size_t n, const value_type& val = value_type())
{
	resize(n, val);
}
// Construct from an iterator range [first, last)
template<class InputIterator>
vector(InputIterator first, InputIterator last)
{
	// push_back grows the storage on demand, so nothing is allocated up front.
	while (first != last)
	{
		push_back(*first);
		++first;
	}
}

There are a few points to discuss here. The first is value_type().

  • This syntax is supported by C++ for both built-in and user-defined types. For a user-defined type it calls the default constructor; for a built-in type it produces a zero value (0, 0.0, nullptr and so on).
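
A small sketch of what T() produces for different types. The function template default_value is a made-up helper for this example; it does nothing more than return a value-initialized T.

```cpp
#include <string>

// Hypothetical helper: T() value-initializes, which zero-initializes built-in
// types and calls the default constructor of class types.
template <class T>
T default_value()
{
    return T();
}
```

With this helper, default_value<int>() is 0, default_value<int*>() is a null pointer, and default_value<std::string>() is an empty string.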

The second point is that the compiler does not pick the constructor we have in mind; overload resolution picks the best match.

// First form
my_vector::vector<int> v1(10, 1);
// Second form
my_vector::vector<int> v2(10u, 1);
  • With the first form we intend to call the (size_t, const value_type&) constructor. But both arguments are int, so the iterator-range template, which can take two arguments of the same type without any conversion, is the better match, and the compiler picks it. The build then fails, because an int cannot be dereferenced. How do we solve this? The template argument of a constructor cannot be specified explicitly, so we change the type of the first argument instead, which gives the second form.
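
The overload choice can be observed directly with a small stand-in. The struct Probe below is hypothetical: it mirrors only the two constructor signatures and records which one was selected, without storing any elements.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the two constructors: it only records which
// overload the compiler selected.
struct Probe
{
    std::string picked;

    // Mirrors vector(size_t n, const value_type& val) for value_type = int.
    Probe(std::size_t, const int&) : picked("count") {}

    // Mirrors the iterator-range constructor template.
    template <class InputIterator>
    Probe(InputIterator, InputIterator) : picked("range") {}
};

std::string pick_for_two_ints()     { return Probe(10, 1).picked; }
std::string pick_for_unsigned_int() { return Probe(10u, 1).picked; }
```

Here pick_for_two_ints() yields "range" because the template matches (int, int) exactly, while pick_for_unsigned_int() yields "count" because template deduction fails when the two argument types differ. Standard library implementations avoid the trap by constraining the range constructor to real iterator types.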

3.Copy construction

// Version 1
vector(const vector& v)
{
	_begin = new value_type[v.size()];
	// Most people would probably write the memcpy approach first:
	//memcpy(_begin, v._begin, sizeof(value_type) * v.size());
	// The second approach copies element by element:
	for (size_t i = 0; i < v.size(); i++)
	{
		_begin[i] = v[i]; // Do not underestimate this step; it is explained below.
	}
	_finish = _end_of_storage = _begin + v.size();
}
// Version 2: reuses reserve and push_back, which is also convenient
vector(const vector& v)
{
	reserve(v.capacity());
	for (const auto& e : v)
	{
		push_back(e);
	}
}

  • This involves the issue of deep copy and shallow copy.

Here's an example.

vector<string> v;
v.push_back("1111");
v.push_back("2222");
vector<string> v1(v);

If we use the first approach (memcpy):
[Image: illustration of the memcpy copy]

  • This is a typical shallow copy: the string objects in both vectors end up pointing to the same space. When the vectors are destroyed, the same space is released twice, which causes an error.

The second approach calls the copy assignment operator of the string class, which performs the deep copy. If the assignment operator of string were itself a shallow copy, that would be a problem in the string class, not in our code.
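
The effect of element-wise assignment can be checked on a plain array of std::string. Both function names below are made up for this sketch; copy_elements repeats the assignment loop from the copy constructor.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies `n` strings element by element, the same way
// the assignment loop in the copy constructor does.
void copy_elements(std::string* dst, const std::string* src, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        dst[i] = src[i];   // string's copy assignment duplicates the characters
    }
}

// Returns true if changing the copy leaves the original untouched.
bool copy_is_independent()
{
    const std::string src[2] = { "1111", "2222" };
    std::string dst[2];
    copy_elements(dst, src, 2);
    dst[0][0] = 'X';
    return src[0] == "1111" && dst[0] == "X111";
}
```

Because every element owns its own characters after the loop, destroying the source and the copy does not release the same space twice.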

Destructor

~vector()
{
	delete[] _begin;
	_begin = _finish = _end_of_storage = nullptr;
}

⑦reserve

void reserve(size_t n = 0)
{
	if (n > capacity())
	{
		size_t old_size = size();
		iterator tmp = new value_type[n];
		// memcpy would again be a shallow copy for user-defined types:
		//memcpy(tmp, _begin, sizeof(value_type) * old_size);
		for (size_t i = 0; i < old_size; i++)
		{
			tmp[i] = _begin[i];
		}
		delete[] _begin;
		_begin = tmp;
		_finish = _begin + old_size;
		_end_of_storage = _begin + n;
	}
}

⑧push_back

void push_back(const value_type& val)
{
	if (_finish == _end_of_storage)
	{
		// Expand: start at 4 elements, then double each time
		size_t new_capacity = size() == 0 ? 4 : capacity() * 2;
		reserve(new_capacity);
	}
	*(_finish++) = val;
}
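
To see how often this policy expands the storage, the sketch below replays the same rule (start at 4, then double) with plain counters. The helper capacities_for is hypothetical and does not touch a real vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: replays the growth rule used by push_back
// (start at 4, then double) and records each new capacity.
std::vector<std::size_t> capacities_for(std::size_t pushes)
{
    std::vector<std::size_t> steps;
    std::size_t size = 0;
    std::size_t capacity = 0;
    for (std::size_t i = 0; i < pushes; ++i)
    {
        if (size == capacity)             // full: an expansion is needed
        {
            capacity = (capacity == 0) ? 4 : capacity * 2;
            steps.push_back(capacity);
        }
        ++size;
    }
    return steps;
}
```

Ten pushes go through the capacities 4, 8 and 16, so only three expansions are needed. Doubling keeps the number of expansions logarithmic in the number of elements.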

⑨[]

value_type& operator[](size_t pos)
{
	assert(pos < size());
	return _begin[pos];
}

const_value_type& operator[](size_t pos)const
{
	assert(pos < size());
	return _begin[pos];
}

⑩insert

void insert(iterator pos, const value_type& val)
{
	// pos may equal _finish, which amounts to a push_back
	assert(pos <= _finish && pos >= _begin);
	if (_finish == _end_of_storage)
	{
		// Save the offset: reserve moves the data, which invalidates pos
		size_t rpos = pos - _begin;
		size_t new_capacity = size() == 0 ? 4 : capacity() * 2;
		reserve(new_capacity);
		pos = _begin + rpos;
	}
	// Shift [pos, _finish) one slot to the right, starting from the back
	iterator end = _finish;
	while (end != pos)
	{
		*(end) = *(end - 1);
		end--;
	}
	*pos = val;
	_finish++;
}
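
The shifting step on its own can be sketched on a raw int buffer. The function insert_with_shift is a made-up helper, and it assumes the caller has already made sure the buffer has room for one more element.

```cpp
#include <cassert>

// Hypothetical helper: inserts `val` before `pos` in the range [begin, finish),
// assuming the buffer has room for at least one more element.
// Returns the new one-past-the-end pointer.
int* insert_with_shift(int* begin, int* finish, int* pos, int val)
{
    assert(begin <= pos && pos <= finish);
    for (int* end = finish; end != pos; --end)
    {
        *end = *(end - 1);   // move each element one slot to the right
    }
    *pos = val;
    return finish + 1;
}
```

Shifting from the back toward pos matters: going front to back would overwrite elements before they are moved.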

⑪erase

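iterator erase(iterator pos)
{
	// Only valid elements can be erased
	assert(pos < _finish && pos >= _begin);
	iterator cur = pos;
	// Shift (pos, _finish) one slot to the left; stop before reading *_finish
	while (cur + 1 != _finish)
	{
		*(cur) = *(cur + 1);
		cur++;
	}
	_finish--;
	return pos;
}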

Example: deleting the even numbers

	my_vector::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);
	for (auto e : v)
	{
		cout << e << " ";
	}
	cout << endl;
	// The three loops below are alternatives; keep only one when compiling.
	// Version 1: portable code
	my_vector::vector<int>::iterator it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
		{
			it = v.erase(it);
		}
		else
		{
			it++;
		}
	}
	// Version 2: not portable across platforms
	my_vector::vector<int>::iterator it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
		{
			v.erase(it);
		}
		else
		{
			it++;
		}
	}
	// Version 3: wrong code. Think about why.
	my_vector::vector<int>::iterator it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
		{
			v.erase(it);
		}
		it++;
	}
	for (auto e : v)
	{
		cout << e << " ";
	}
	cout << endl;

  • Therefore: use the return value of erase to deal with iterator invalidation, which keeps the code portable across platforms.
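
The same idiom works unchanged with std::vector, whose erase also returns an iterator to the element after the erased one. The function remove_evens is a hypothetical name for this sketch.

```cpp
#include <vector>

// Hypothetical helper: removes every even value using the iterator that
// erase returns, so no invalidated iterator is ever reused.
std::vector<int> remove_evens(std::vector<int> v)
{
    std::vector<int>::iterator it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
        {
            it = v.erase(it);   // continue from the element after the erased one
        }
        else
        {
            ++it;
        }
    }
    return v;
}
```

For the input 1 2 2 3 4 5 this leaves 1 3 5, including the case of two adjacent even numbers that the wrong loop would skip.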

⑫pop_back

void pop_back()
{
	assert(size() > 0);
	// end() returns a temporary pointer, so use end() - 1 rather than --end()
	erase(end() - 1);
}

⑬swap

void swap(vector& x)
{
	std::swap(_begin, x._begin);
	std::swap(_finish, x._finish);
	std::swap(_end_of_storage, x._end_of_storage);
}

⑭ =

  • Here we again use the modern style: the parameter is passed by value, so it is already a copy, and we simply swap with it.
vector& operator =(vector tmp)
{
	swap(tmp);
	return *this;
}
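
The modern style shown here is usually called copy-and-swap. As a self-contained sketch, here is the same pattern on a tiny made-up class, IntBuffer, which is not part of the original article.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical minimal class, used only to illustrate copy-and-swap.
class IntBuffer
{
public:
    explicit IntBuffer(std::size_t n, int fill = 0)
        : _data(new int[n]), _size(n)
    {
        for (std::size_t i = 0; i < n; ++i) _data[i] = fill;
    }
    IntBuffer(const IntBuffer& other)
        : _data(new int[other._size]), _size(other._size)
    {
        for (std::size_t i = 0; i < _size; ++i) _data[i] = other._data[i];
    }
    ~IntBuffer() { delete[] _data; }

    void swap(IntBuffer& other)
    {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }
    // The parameter is taken by value, so it is already a copy of the right-hand side.
    IntBuffer& operator=(IntBuffer tmp)
    {
        swap(tmp);          // take the copy's storage; tmp's destructor frees the old one
        return *this;
    }

    std::size_t size() const { return _size; }
    int& operator[](std::size_t i) { return _data[i]; }

private:
    int* _data;
    std::size_t _size;
};
```

The copy constructor does the deep copy and the destructor of tmp releases the old storage, so the assignment operator itself needs no allocation or cleanup code.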

Source code

#include <cassert>   // assert
#include <cstddef>   // size_t
#include <utility>   // std::swap

namespace my_vector
{
	template<class T>
	class vector
	{
	public:
		typedef T value_type;
		typedef const T const_value_type;
		typedef T* iterator;
		typedef const T* const_iterator;
		// Iterators
		iterator begin()
		{
			return _begin;
		}
		const_iterator begin()const
		{
			return _begin;
		}
		iterator end()
		{
			return _finish;
		}
		const_iterator end()const
		{
			return _finish;
		}
		// Default constructor: the members already have default initializers
		vector()
		{
		}
		vector(const vector& v)
		{
			_begin = new value_type[v.size()];
			// memcpy would be a shallow copy for user-defined types:
			/*memcpy(_begin, v._begin, sizeof(value_type) * v.size());*/
			for (size_t i = 0; i < v.size(); i++)
			{
				_begin[i] = v[i];
			}
			_finish = _end_of_storage = _begin + v.size();
		}
		vector(size_t n, const value_type& val = value_type())
		{
			resize(n, val);
		}
		// Construct from an iterator range [first, last)
		template<class InputIterator>
		vector(InputIterator first, InputIterator last)
		{
			// push_back grows the storage on demand
			while (first != last)
			{
				push_back(*first);
				++first;
			}
		}
		// This version of the copy constructor is simpler
		//vector(const vector& v)
		//{
		//	reserve(v.capacity());
		//	for (const auto& e : v)
		//	{
		//		push_back(e);
		//	}
		//}
		~vector()
		{
			delete[] _begin;
			_begin = _finish = _end_of_storage = nullptr;
		}
		size_t size() const
		{
			return _finish - _begin;
		}
		size_t capacity()const
		{
			return _end_of_storage - _begin;
		}
		void reserve(size_t n = 0)
		{
			if (n > capacity())
			{
				size_t old_size = size();
				iterator tmp = new value_type[n];
				// memcpy is a big trap for user-defined types:
				//memcpy(tmp, _begin, sizeof(value_type) * old_size);
				for (size_t i = 0; i < old_size; i++)
				{
					tmp[i] = _begin[i];
				}
				delete[] _begin;
				_begin = tmp;
				_finish = _begin + old_size;
				_end_of_storage = _begin + n;
			}
		}
		void push_back(const value_type& val)
		{
			if (_finish == _end_of_storage)
			{
				// Expand: start at 4 elements, then double each time
				size_t new_capacity = size() == 0 ? 4 : capacity() * 2;
				reserve(new_capacity);
			}
			*(_finish++) = val;
		}
		value_type& operator[](size_t pos)
		{
			assert(pos < size());
			return _begin[pos];
		}
		const_value_type& operator[](size_t pos)const
		{
			assert(pos < size());
			return _begin[pos];
		}
		void insert(iterator pos, const value_type& val)
		{
			assert(pos <= _finish && pos >= _begin);
			if (_finish == _end_of_storage)
			{
				// Save the offset: reserve moves the data, which invalidates pos
				size_t rpos = pos - _begin;
				size_t new_capacity = size() == 0 ? 4 : capacity() * 2;
				reserve(new_capacity);
				pos = _begin + rpos;
			}
			iterator end = _finish;
			while (end != pos)
			{
				*(end) = *(end - 1);
				end--;
			}
			*pos = val;
			_finish++;
		}
		iterator erase(iterator pos)
		{
			assert(pos < _finish && pos >= _begin);
			iterator cur = pos;
			// Stop before reading *_finish
			while (cur + 1 != _finish)
			{
				*(cur) = *(cur + 1);
				cur++;
			}
			_finish--;
			return pos;
		}
		// Remove the last element
		void pop_back()
		{
			assert(size() > 0);
			erase(end() - 1);
		}
		void swap(vector& x)
		{
			std::swap(_begin, x._begin);
			std::swap(_finish, x._finish);
			std::swap(_end_of_storage, x._end_of_storage);
		}
		// Assignment (copy-and-swap)
		vector& operator =(vector tmp)
		{
			swap(tmp);
			return *this;
		}
		// value_type() builds an anonymous default value for the default argument:
		// 1. for built-in types it produces a zero value;
		// 2. for user-defined types it calls the default constructor.
		void resize(size_t n, value_type val = value_type())
		{
			if (n < size())
			{
				_finish = _begin + n;
			}
			else
			{
				reserve(n);
				iterator end = _begin + n;
				while (_finish != end)
				{
					*_finish = val;
					_finish++;
				}
			}
		}
	private:
		iterator _begin = nullptr;
		iterator _finish = nullptr;
		iterator _end_of_storage = nullptr;
	};
}

Summary

 That's it for today's sharing. If you found the article helpful, give it a like as encouragement! See you in the next article!


Origin blog.csdn.net/Shun_Hua/article/details/131652609