[C++] Simulated implementation of vector

Table of contents

Foreword:

1. vector structure

2. Default member functions

2.1 Constructor

No-argument construction:

Constructed with parameters:

Argument constructor overload:

2.2 Assignment operator overloading, copy construction (difficulties)

2.3 Destructor:

3. Expansion

3.1 reserve

3.2 resize

4. Insertion and deletion

5. Iterator operations


Foreword:

        The vector implemented in this article is not fully consistent with the STL source code. For example, this article allocates space directly with new, whereas the STL source code allocates through a memory pool (its allocator). This does not change how the members relate to each other or the overall logic, so the article is still useful for learning.

1. vector structure

template:

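template<class T>
class vector
{
public:
        typedef T* iterator;
        typedef const T* const_iterator;

        // member functions are added in the following sections

private:
        iterator _start;          // first element
        iterator _finish;         // one past the last stored element
        iterator _end_of_storage; // one past the end of the allocated space
};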

        Anyone who has used vector knows that every declaration must state what kind of data the container stores, for example:

vector<int> vv1;            // stores integers

vector<double> vv2;         // stores floating-point values

vector<vector<int>> vv3;    // stores vectors of integers

        Therefore the vector we implement cannot target a single data type. It has to work for every type, including custom types nested several levels deep, as in:

vector<vector<vector<vector<vector<int>>>>>  vv4;

        We are unlikely to ever meet a type like this in practice, but we cannot stop a curious user from writing it, so the class has to support it, much like a bar that has to cope with the customer who orders fried rice.

        So the first conclusion is: our class must be a class template.

Member variables:

        In the code above, the member variables have the type iterator, a typedef for T*. This is the iterator we are familiar with, implemented here as a raw pointer, so the three variables are simply three positions in the data space. _start points to the first element, _finish points one past the last stored element, and _end_of_storage points one past the end of the allocated space. In other words, _finish - _start is the size and _end_of_storage - _start is the capacity, just as in a sequence table.
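
        As a rough illustration of how three pointers encode size and capacity, here is a minimal sketch. The names ThreePointers and layout_demo are made up for this example and are not part of the class in this article; the element type is fixed to int to keep it short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three data members of the vector class.
struct ThreePointers {
    int* start;           // first element
    int* finish;          // one past the last stored element
    int* end_of_storage;  // one past the last allocated slot

    std::size_t size() const {
        return static_cast<std::size_t>(finish - start);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_of_storage - start);
    }
};

// Describes a buffer with room for 8 ints, of which the first 3 are in use.
// The caller must pass a buffer that really has at least 8 slots.
inline ThreePointers layout_demo(int* buffer) {
    assert(buffer != nullptr);
    ThreePointers p{buffer, buffer + 3, buffer + 8};
    return p;
}
```

        Both quantities are plain pointer differences, which is why the class does not need separate size or capacity members.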

2. Default member functions

        Whenever we implement a class, the default member functions are essential, especially for a class like vector that allocates space on the heap.

2.1 Constructor

No-argument construction:

        The no-argument constructor is very simple. We do not even need to allocate space; we only initialize the three iterator members through the initializer list, as follows:

// Default construction: all pointers start as null; any later insertion grows the storage and sets them
vector()
	:_start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{}

        Some readers may ask: if everything is null, how can data be inserted later? It can, because the other member functions allocate space on demand, as the following sections show.

Constructed with parameters:

        For construction with parameters we stay consistent with the library. The library does not support initializing from a single value directly, but it does support constructing from n elements of type T, so we do the same. If the library declaration is hard to read, the version below follows the same idea:

// Parameterized construction: fill the vector<T> with n copies of va
vector(size_t n, const T& va = T())
	:_start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{
	T* temp = new T[n];
	for (size_t i = 0; i < n; ++i)
	{
		// this assignment is between two objects of type T
		temp[i] = va;
	}
	_start = temp;
	_finish = _start + n;
	_end_of_storage = _finish;
}

        The structure is the same as before: the three iterator members are first set to null through the initializer list, because nothing stops a caller from passing 0 for n. In addition, the default argument is an anonymous (temporary) object, T().

        Some readers will ask: doesn't an anonymous object like T() live for only one line? The constructor body is several lines long, so is it not wrong to keep using it?

        It is not. In C++, a temporary bound to a const reference stays alive as long as that binding needs it. For a local const reference, the temporary's lifetime is extended to that of the reference. For a reference parameter such as va, the temporary lives until the end of the full expression that contains the call, which covers the entire constructor body. For example:

const T& ret = T();   // T() now lives as long as ret

T();                  // destroyed at the end of this line

        Note that the const is required. T() is a temporary (an rvalue), and a non-const lvalue reference cannot bind to an rvalue, so omitting const is an error.

        Interestingly, vs2013 and earlier versions accept the code without const. This is non-standard compiler behavior. I have not checked every later version, but it no longer compiles in vs2019.
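
        The lifetime rules above can be observed with a small counter. This is only a sketch: Tracker and the three helper functions are hypothetical names invented for the demonstration.

```cpp
#include <cassert>

// Hypothetical helper type that counts how many of its objects are alive.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    Tracker(const Tracker&) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// A plain temporary is destroyed at the end of its own statement.
inline int alive_after_plain_temporary() {
    Tracker();
    return Tracker::alive;
}

// Binding to a local const reference extends the temporary's lifetime
// to that of the reference, so it is still alive on the next line.
inline int alive_after_const_ref_binding() {
    const Tracker& ref = Tracker();
    (void)ref;
    return Tracker::alive;
}

// A default argument bound to a const reference parameter stays alive
// for the whole call, which is the situation in the constructor above.
inline int alive_during_call(const Tracker& = Tracker()) {
    return Tracker::alive;
}
```

        Calling the three functions in turn should report 0, 1 and 1 live objects, and the count should be back at 0 once each call has finished.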

Function design:

        The code here is written out in full on purpose; later it can be shortened considerably by reusing other member functions.

        First, n objects of type T are allocated on the heap with new. C's malloc cannot be used here, because this is a class template and T may be a custom type. A custom type needs its constructor to run; new does that, while malloc only returns raw memory.

T* temp = new T[n];

        Then the loop assigns the value to each element, and finally the three iterator members are set. The assignment inside the loop is by no means a simple line of code: for nested types it triggers a chain of nested assignments, which is discussed in the next section on the assignment operator overload.
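
        To make the difference between new and malloc concrete, the sketch below counts constructor calls. Counted and the two helper functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical type that records how many times its constructor has run.
struct Counted {
    static int constructed;
    Counted() { ++constructed; }
};
int Counted::constructed = 0;

// new[] allocates memory and default-constructs every element.
inline int constructors_run_by_new(std::size_t n) {
    Counted::constructed = 0;
    Counted* p = new Counted[n];
    int result = Counted::constructed;
    delete[] p;
    return result;
}

// malloc only hands back raw bytes; no constructor is called.
inline int constructors_run_by_malloc(std::size_t n) {
    Counted::constructed = 0;
    void* raw = std::malloc(n * sizeof(Counted));
    int result = Counted::constructed;
    std::free(raw);
    return result;
}
```

        With n = 5 the first function reports 5 constructor calls and the second reports none, which is why the article allocates with new.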

Argument constructor overload:

vector(int n, const T& va = T())
	:_start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{
	while (n-- > 0)
	{
		push_back(va);
	}
}

        The body of this overload is what I call the fully optimized version: it implements the constructor simply by reusing push_back. But that is not the main point. The real reason for this overload is to coexist with the iterator range constructor.

        Why? Look at the iterator range constructor first.

template<class InputIterator>
vector(InputIterator start,InputIterator end)
	:_start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{
	while (start != end)
	{
		push_back(*start);
		++start;
	}
}

        First, recall that C++ supports function overloading: the compiler picks the best-matching function according to the types of the arguments passed in.

        For example, suppose we construct a vector like this:

vector<int> vv1(5,5);

        The compiler sees the argument types as int, int rather than size_t, int. The size_t constructor would need a conversion, while the iterator range template matches exactly with InputIterator deduced as int, so the compiler selects the range constructor. That is not what we want: its body then tries to dereference an int and fails to compile. To avoid this, we add the overload whose first parameter is int; when a non-template function and a template match equally well, the non-template one is preferred.
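
        The overload choice can be checked with a small probe that only records which constructor was selected. The types WithoutIntOverload and WithIntOverload and the enum Picked are hypothetical; they mirror the constructor signatures of this article for T = int but store no data.

```cpp
#include <cassert>
#include <cstddef>

enum class Picked { SizeT, Int, Range };

// Same constructor set as the article, but without the int overload.
struct WithoutIntOverload {
    Picked picked;
    WithoutIntOverload(std::size_t, const int& = int()) : picked(Picked::SizeT) {}
    template <class InputIterator>
    WithoutIntOverload(InputIterator, InputIterator) : picked(Picked::Range) {}
};

// Same constructor set with the extra int overload added.
struct WithIntOverload {
    Picked picked;
    WithIntOverload(std::size_t, const int& = int()) : picked(Picked::SizeT) {}
    WithIntOverload(int, const int& = int()) : picked(Picked::Int) {}
    template <class InputIterator>
    WithIntOverload(InputIterator, InputIterator) : picked(Picked::Range) {}
};
```

        Constructing WithoutIntOverload(5, 5) selects the range template, while WithIntOverload(5, 5) selects the int overload. Passing a real size_t as the first argument selects the size_t constructor, because the template cannot deduce one type from two different argument types.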

        Some readers may wonder why the range constructor needs its own template parameter, which makes the code harder to read. The benefit is considerable: because the iterator type is a template parameter, the same constructor accepts any kind of iterator, whether raw pointers or iterators from another container, so every such source can be used to construct our vector.
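
        For comparison, here is how the corresponding range constructor of std::vector is used with three different iterator sources. The sketch uses the standard container rather than the class in this article, and the function names are made up for the example.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Raw pointers into an array act as iterators.
inline int sum_from_array() {
    int raw[] = {1, 2, 3, 4};
    std::vector<int> v(raw, raw + 4);
    int sum = 0;
    for (int x : v) sum += x;
    return sum;
}

// Iterators of another container type work as well.
inline std::vector<int> from_list() {
    std::list<int> source = {10, 20, 30};
    return std::vector<int>(source.begin(), source.end());
}

// So do string iterators, which yield char elements.
inline std::vector<char> from_string() {
    std::string s = "abc";
    return std::vector<char>(s.begin(), s.end());
}
```

        The range constructor in this article should accept the same three sources, since it only relies on !=, ++ and * of the iterator type.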

2.2 Assignment operator overloading, copy construction (difficulties)

code:

vector<T>& operator=(const vector<T>& va)
{
	// skip self-assignment
	if (this != &va)
	{
		// must use new, because the constructors of custom types have to run
		T* temp = new T[va.capacity()];
		// copy the data element by element; memcpy would only be a shallow copy
		//memcpy(temp, va._start, sizeof(T)* va.size());
		size_t len = va.size();
		for (size_t i = 0; i < len; ++i)
		{
			// nested assignment: calls T's own operator=
			temp[i] = va[i];
		}
		// release the old space, if there is any
		delete[] _start;
		_start = temp;
		_finish = _start + va.size();
		_end_of_storage = _start + va.capacity();
	}
	return *this;
}

        Assignment only makes sense between objects of the same type, and the reason for taking const vector<T>& should be familiar, so let us get straight to the point.

        Why does the code assign the elements one by one in a loop instead of calling the library function memcpy()? memcpy() looks simpler and more convenient. It is not that the blogger did not want to use it; reality gave the blogger a slap and showed how troublesome a deep copy is for a class template.

        Here is an example. Suppose we have a vector<vector<int>>. We allocate new space for n vector<int> objects and then copy those vector<int> objects with memcpy. We cannot write a special copy routine just for this case either, because the class is a template and has to work for every T.

        What we want is for the target object to own new space for all of its data. Does memcpy achieve that?

        Only at the outer level. The outer array of vector<int> objects does get new space, but memcpy copies each inner vector<int> byte by byte, so the inner _start, _finish and _end_of_storage pointers of the copy still point at the original's buffers. Those inner buffers also need to be allocated anew. So memcpy works only as long as T is never a custom type that owns resources.

        That is why I use a loop and assign one element at a time. The key statement is:

temp[i] = va[i];

        This statement is not as simple as it looks. For a built-in type there is no space to worry about, and the value is simply copied. For a custom type, the assignment calls that type's own operator=, which in turn assigns its elements, and so on: the calls nest until a built-in type is reached.

        In other words, at every assignment the compiler checks whether the operands are of a custom type with an overloaded assignment operator. If so, that overload is called, and the process repeats until no overloaded assignment operator is left, that is, until no more space needs to be allocated.

        With this approach the code compiles and runs without errors.
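
        The shallow copy problem can be reproduced safely with a plain struct that holds a pointer. In the sketch below, Handle is a hypothetical stand-in for the pointers inside an inner vector<int>; the static_assert uses the standard std::vector to show that such a type is not meant to be copied byte by byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// std::vector owns heap memory, so it is not trivially copyable.
static_assert(!std::is_trivially_copyable<std::vector<int>>::value,
              "vector<int> must be copied through its copy operations");

// Hypothetical stand-in for the pointer held inside an inner vector.
struct Handle {
    int* data;
    std::size_t count;
};

// Byte-wise copy: both handles end up pointing at the same heap block.
inline bool memcpy_shares_storage() {
    int* block = new int[3]{1, 2, 3};
    Handle a{block, 3};
    Handle b{nullptr, 0};
    std::memcpy(&b, &a, sizeof(Handle));
    bool shared = (a.data == b.data);
    delete[] block;  // freed once; freeing through both handles would be a double free
    return shared;
}

// Element-wise copy: the destination gets its own block with equal contents.
inline bool elementwise_copy_is_independent() {
    int* block = new int[3]{1, 2, 3};
    Handle a{block, 3};
    Handle b{new int[a.count], a.count};
    for (std::size_t i = 0; i < a.count; ++i) b.data[i] = a.data[i];
    b.data[0] = 99;  // modify the copy only
    bool independent = (a.data != b.data) && (a.data[0] == 1) && (b.data[0] == 99);
    delete[] a.data;
    delete[] b.data;
    return independent;
}
```

        After the memcpy both handles share one buffer, so two destructors would free it twice. The element-wise version leaves the original untouched when the copy changes.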

Copy construction:

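// copy constructor
vector(const vector<T>& va)
	:_start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{
	// the pointers must be null first, because operator= calls delete[] _start
	*this = va;
}

        The copy constructor can simply reuse the assignment operator overload, since the logic is the same and the blogger also wants to be lazy. The members have to be initialized to null first, because operator= releases the old space with delete[] _start and an uninitialized pointer there would be undefined behavior.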
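
        Putting the pieces of this section together, the following trimmed sketch checks that a nested container is copied deeply. MiniVec is a hypothetical, cut-down version of the class in this article, with the growth logic inlined into push_back to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily trimmed version of the vector in this article.
template <class T>
class MiniVec {
public:
    MiniVec() : _start(nullptr), _finish(nullptr), _end_of_storage(nullptr) {}

    // Pointers are nulled first so that operator= can safely delete[] _start.
    MiniVec(const MiniVec& other)
        : _start(nullptr), _finish(nullptr), _end_of_storage(nullptr) {
        *this = other;
    }

    MiniVec& operator=(const MiniVec& other) {
        if (this != &other) {
            T* temp = new T[other.capacity()];
            for (std::size_t i = 0; i < other.size(); ++i) {
                temp[i] = other._start[i];  // recurses into T's own operator=
            }
            delete[] _start;
            _start = temp;
            _finish = _start + other.size();
            _end_of_storage = _start + other.capacity();
        }
        return *this;
    }

    ~MiniVec() { delete[] _start; }

    void push_back(const T& val) {
        if (_finish == _end_of_storage) {
            std::size_t old_size = size();
            std::size_t new_cap = capacity() == 0 ? 4 : capacity() * 2;
            T* temp = new T[new_cap];
            for (std::size_t i = 0; i < old_size; ++i) temp[i] = _start[i];
            delete[] _start;
            _start = temp;
            _finish = _start + old_size;
            _end_of_storage = _start + new_cap;
        }
        *_finish = val;
        ++_finish;
    }

    T& operator[](std::size_t i) { return _start[i]; }
    const T& operator[](std::size_t i) const { return _start[i]; }
    std::size_t size() const { return static_cast<std::size_t>(_finish - _start); }
    std::size_t capacity() const {
        return static_cast<std::size_t>(_end_of_storage - _start);
    }

private:
    T* _start;
    T* _finish;
    T* _end_of_storage;
};

// Copies a nested container, changes the copy, and returns the original's value.
inline int original_after_modifying_copy() {
    MiniVec<int> row;
    row.push_back(1);
    MiniVec<MiniVec<int>> grid;
    grid.push_back(row);
    MiniVec<MiniVec<int>> copy(grid);
    copy[0][0] = 42;
    return grid[0][0];
}
```

        The original still holds 1 after the copy is changed to 42, which shows that the inner buffers were duplicated as well.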

2.3 Destructor:

// destructor
~vector()
{
	// delete[] on a null pointer is safe, so no check is needed
	delete[] _start;
	_start = nullptr;
	_finish = nullptr;
	_end_of_storage = nullptr;
}

        There is nothing to say about the destructor, just look at the code.

3. Expansion

3.1 reserve

code:

void reserve(size_t n)
{
	if (n > capacity())
	{
		size_t len = size();

		T* temp = new T[n];

		if (_start != nullptr)
		{
			//memcpy(temp, _start, sizeof(T) * size());
			for (size_t i = 0; i < len; ++i)
			{
				temp[i] = _start[i];
			}
			delete[] _start;
		}
		_start = temp;
		_finish = _start + len;
		_end_of_storage = _start + n;
	}
}

        Like string, the sequence table and similar data structures, vector avoids shrinking its capacity whenever it can. So when n is not larger than the current capacity, reserve returns without doing anything.

        The nested deep copy problem appears here as well, so memcpy cannot be used and the elements are assigned one by one in a loop. Also remember to release the old space after moving to the new one, to prevent a memory leak.
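
        The same two rules can be observed on std::vector, which the class in this article imitates. The sketch below uses the standard container; ReserveReport and reserve_demo are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Values collected by the demo so they can be inspected afterwards.
struct ReserveReport {
    std::size_t capacity_after_grow;
    std::size_t capacity_after_small_request;
    std::string first_element;
};

inline ReserveReport reserve_demo() {
    std::vector<std::string> v;
    v.push_back("kept");
    v.reserve(100);                    // grows: elements are carried over
    std::size_t grown = v.capacity();
    v.reserve(10);                     // not larger than capacity: no effect
    return ReserveReport{grown, v.capacity(), v[0]};
}
```

        The capacity is at least 100 after the first call, stays the same after the second, and the stored string survives the move to the new space.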

3.2 resize

code:

void resize(size_t n, const T& val = T())
{
	// shrinking does not release capacity; only _finish is moved back
	if (n < size())
	{
		_finish = _start + n;
	}
	else if (n > size())
	{
		// number of elements to append
		size_t len = n - size();
		// grow only if the requested size exceeds the current capacity
		if (n > capacity())
		{
			reserve(n);
		}

		// fill the new positions with val
		while (len-- > 0)
		{
			*_finish = val;
			_finish++;
		}
	}
	// n == size(): nothing to do
}

        resize does one thing more than reserve: it changes the size. If n is smaller than the current size, _finish is simply moved back. If n is larger, the extra positions are filled with val, and reserve is reused when n exceeds the current capacity.
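
        As a quick reference for the expected behavior, this is what the two branches do on std::vector. The sketch uses the standard container, and resize_demo is a name made up for the example.

```cpp
#include <cassert>
#include <vector>

// Shrinks a five-element vector to two elements, then grows it to four.
inline std::vector<int> resize_demo() {
    std::vector<int> v = {1, 2, 3, 4, 5};
    v.resize(2);     // shrink: keeps the first two elements
    v.resize(4, 9);  // grow: appends two copies of 9
    return v;
}
```

        The result holds 1, 2, 9, 9: the elements cut off by the first call are gone, and the new positions contain val.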

4. Insertion and deletion

        This part is relatively simple and the comments should be enough to follow it, so here is the code directly:

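// push_back only has to grow the storage when it is full,
// including the case where the capacity is still zero
// (the asserts in this section need #include <cassert>)
void push_back(const T& val)
{
	if (_finish == _end_of_storage)
	{
		reserve(capacity() == 0 ? 4 : capacity() * 2);
	}
	// if no growth was needed, _start and _end_of_storage stay as they are;
	// if the storage grew, reserve has already repointed all three members
	*_finish = val;
	++_finish;
}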

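// insert val before pos
iterator insert(iterator pos, const T& val)
{
	// check that the insert position is valid
	assert(pos >= _start);
	assert(pos <= _finish);

	// remember the offset before a possible reallocation
	size_t len = pos - _start;
	// grow if the storage is full
	if (_finish == _end_of_storage)
	{
		reserve(capacity() == 0 ? 4 : capacity() * 2);
	}

	// recompute pos from the offset, since reserve may have moved the data
	pos = _start + len;

	// one past the last element
	iterator end = _finish;

	// shift the elements from pos onward one slot to the right
	while (end > pos)
	{
		*(end) = *(end - 1);
		--end;
	}
	// works for built-in and custom types alike, thanks to operator=
	*pos = val;

	// the end moves one slot to the right
	++_finish;
	return pos;
}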

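// erase the element at pos
void erase(iterator pos)
{
	// check that pos is valid
	assert(pos >= _start);
	assert(pos < _finish);

	// starting from the element after pos, move each element one slot forward
	iterator start = pos + 1;

	// no growth or shrinking happens here, so the storage is not reallocated
	while (start != _finish)
	{
		*(start - 1) = *start;
		++start;
	}

	// after an erase the iterator may refer to an unexpected element,
	// so callers should treat it as invalid;
	// this is why the function returns nothing
	--_finish;
}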

// pop_back: check that there is still an element to remove
void pop_back()
{
	assert(_start != _finish);
	--_finish;
}
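
        To get a feeling for the growth policy used by push_back and insert, the sketch below counts how often the storage would be reallocated. It only simulates the size and capacity bookkeeping; reallocations_for is a hypothetical helper, not part of the class.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many times the storage grows when n elements are appended to an
// empty vector with the policy "start at 4, then double".
inline std::size_t reallocations_for(std::size_t n) {
    std::size_t size = 0;
    std::size_t capacity = 0;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            capacity = (capacity == 0) ? 4 : capacity * 2;
            ++reallocations;
        }
        ++size;
    }
    return reallocations;
}
```

        Appending 100 elements takes 6 reallocations (capacities 4, 8, 16, 32, 64, 128), so the number of full copies grows only logarithmically with the number of elements.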

5. Iterator operations

// iterators
iterator begin()
{
	return _start;
}
iterator end()
{
	return _finish;
}
const_iterator begin() const
{
	return _start;
}
const_iterator end() const
{
	return _finish;
}

// size and capacity
size_t size() const
{
	return _finish - _start;
}
size_t capacity() const
{
	return _end_of_storage - _start;
}

// operator[]: const and non-const versions, both checking that pos is in range
T& operator[](size_t pos)
{
	assert(pos < size());

	return _start[pos];
}
const T& operator[](size_t pos) const
{
	assert(pos < size());

	return _start[pos];
}

        On iterator invalidation: different compilers handle it in different ways, and the blogger does not go into the details here. The advice is simple: after inserting or deleting data through an iterator, do not keep using that iterator, or reassign it first, to prevent out-of-bounds access.
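
        The "reassign it first" advice looks like this with std::vector, whose insert and erase both return a fresh iterator. Note that the erase in this article returns nothing, so this pattern applies to the standard container; the two function names are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Inserts value before the first element equal to target and returns the
// element the refreshed iterator points at.
inline int insert_and_refresh(std::vector<int>& v, int target, int value) {
    std::vector<int>::iterator it = v.begin();
    while (it != v.end() && *it != target) ++it;
    it = v.insert(it, value);  // the old iterator may dangle; use the returned one
    return *it;
}

// Removes all even numbers; erase returns the iterator after the removed element.
inline void erase_evens(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end();) {
        if (*it % 2 == 0) {
            it = v.erase(it);
        } else {
            ++it;
        }
    }
}
```

        In both functions the iterator is never used again after the modifying call without first being replaced by the returned value.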


        That is the blogger's full understanding of simulating vector, and I hope it helps everyone.


Origin blog.csdn.net/weixin_52345097/article/details/129346123