C++ (7): Simulating the STL vector

Table of contents

member variable of vector

Constructor

reserve

size()

capacity()

push_back

 some small bugs

assignment operator overloading

destructor

 [] operator overloading

resize

pop_back

Insert

 iterator invalidation

erase

2D array problem

Conclusion


Translation software will tell you that vector means a mathematical vector, but it is really a sequence-table container. It is much like the sequence table we implemented earlier, except that templates make it far more general. Next, we will read the STL source to understand how vector works and then implement our own.

Use of vectors:

#include <iostream>
#include <vector>
using namespace std;

int main()
{
	vector<int> v;

	v = { 1,2,3,4,5 };

	for (auto& e : v)
	{
		cout << e;
	}
}

Because it is a template, vector can store custom types as well as built-in ones; for example, a vector can hold other vectors.
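As a small illustration of that point, the sketch below stores a custom type and a nested vector in std::vector. The Point struct and the two helper functions are names made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical custom type, used only for this illustration.
struct Point
{
	int x;
	int y;
};

// A vector holding a custom type.
std::vector<Point> make_points()
{
	std::vector<Point> pts;
	pts.push_back(Point{ 1, 2 });
	pts.push_back(Point{ 3, 4 });
	return pts;
}

// A vector holding vectors: rows x cols elements, all set to val.
std::vector<std::vector<int>> make_grid(std::size_t rows, std::size_t cols, int val)
{
	return std::vector<std::vector<int>>(rows, std::vector<int>(cols, val));
}
```

Elements of the nested version are reached with two subscripts, for example grid[1][2].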

 Of course, vector has many more interfaces that are not covered here. If you need them, see the reference at https://cplusplus.com/reference/

vector is clearly convenient to use, and our old acquaintance the template parameter is back, so let's simulate and implement it.

member variable of vector

 Drawing on what we learned from the sequence table, we would guess that the member variables of vector look like this:

namespace myvector
{
	template<class T>
	class vector
	{

	public:
	private:
		T* _v;
		size_t _size;
		size_t _capacity;
	};
}

But things did not go as expected: checking the STL source code gave us a surprise.

 Why are there three iterators? The iterator type turns out to be two nested typedefs whose underlying type is still T*. start is easy to understand as the head pointer of the sequence table, but why have size and capacity become finish and end_of_storage?

We found the answer in "STL Source Code Analysis".

 So there is no essential difference between the two designs: finish - start gives the size and end_of_storage - start gives the capacity, and storing pointers also fits generic programming. I still prefer the names size and capacity, so the code below keeps _size and _capacity, but as pointers alongside _start. All three members have type iterator, which is a typedef for T*.

Constructor

Let's implement a basic no-argument version first

// Default (no-argument) constructor
vector()
	:_start(nullptr), _size(nullptr), _capacity(nullptr)
{}

 Then there is the parameterized version


// Construct with n copies of val
vector(int n, const T& val = T())
	:_start(nullptr), _size(nullptr), _capacity(nullptr)
	{
		reserve(n);
		for (int i = 0; i < n; ++i)
		{
			push_back(val);
		}
	}

reserve

 To get a vector with basic functionality, we first solve the expansion problem; after that, push_back is easy to write.

reserve must only grow the buffer, never shrink it, so it needs the current capacity. Unlike our string, which stored its size and capacity directly, vector stores pointers, so we write two small helper functions to compute them. They are not difficult, and they are commonly used anyway.

When two pointers of the same type that point into the same array are subtracted, the result is the number of elements between them.
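To make the pointer arithmetic concrete, here is a minimal sketch on a plain int buffer. The struct ThreePointers and the functions used and total are hypothetical names that mirror start / finish / end_of_storage from the STL source.

```cpp
#include <cstddef>

// Three pointers into one buffer, mirroring start / finish / end_of_storage.
struct ThreePointers
{
	int* start;
	int* finish;
	int* end_of_storage;
};

// Elements in use. The difference is counted in elements, not in bytes.
std::size_t used(const ThreePointers& p)
{
	return static_cast<std::size_t>(p.finish - p.start);
}

// Elements the buffer can hold in total.
std::size_t total(const ThreePointers& p)
{
	return static_cast<std::size_t>(p.end_of_storage - p.start);
}
```

For an 8-element buffer with finish three slots past start, used reports 3 and total reports 8, regardless of sizeof(int).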

size()

size_t size() const
{
	return _size - _start;
}

capacity()

size_t capacity() const
{
	return _capacity - _start;
}

With these two helpers, reserve can be implemented. The logic is much the same as for string: first check that the request does not shrink the buffer, then allocate a new buffer of n elements, copy the old data into it with memcpy, and finally reset _size and _capacity. Both are pointers into the old buffer, which has been freed, so they are recomputed from the new head pointer: the new _start plus the original element count.

The following code has a small bug, which we will come back to shortly.

void reserve(const size_t n)
{
	if (n > capacity())
	{
		T* tmp = new T[n];
        

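        // If there is no old space yet (_start is a null pointer), nothing
        // needs to be copied. Otherwise, copy the old data over first.
		if (_start != nullptr)
		{
			// memcpy counts in bytes, not elements
			memcpy(tmp, _start, sizeof(T) * size());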
			delete[] _start;
		}

		_start = tmp;
		_size = _start + size();
		_capacity = _start + n;
	}
}

push_back

 With reserve in place, push_back is not difficult.

		void push_back(const T& val)
		{
			if (_size == _capacity)
			{
				size_t newcapacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(newcapacity);
			}

			*_size = val;
			++_size;
		}

 some small bugs

 At this point a basic vector is usable, but one problem remains: running the code above as-is fails at runtime.

What happened? Why is _size still null? Stepping through in the debugger, everything before the expansion works, so the problem must be in the expansion itself, specifically in how _size is assigned.

Going back to reserve, we find that size() is called after _start has been updated. At that moment _size still points into the old space while _start already points to the new one, so subtracting the two pointers gives a meaningless result.

So we need to save the old element count before switching buffers:

void reserve(const size_t n)
 {
    if (n > capacity())
	{
		T* tmp = new T[n];

		size_t oldsize = size();

		if (_start != nullptr)
		{
			memcpy(tmp, _start, sizeof(T)*size());
			delete[] _start;
		}

		_start = tmp;
		_size = _start + oldsize;
		_capacity = _start + n;
	}
}

Insert five elements to try it out:

 No problem.

assignment operator overloading

 Next comes assignment. The assignment operator that the compiler generates by default performs a shallow copy, so two objects end up sharing one buffer, which is then freed twice and crashes the program. We therefore have to implement it ourselves.

Assignment happens between objects of the same type, so both the parameter and the return value are vector<T>.

We borrow the modern idiom from our earlier string: swap the contents of a copy with *this. For that we first implement a simple swap function.

void swap(vector<T>& v)
{
	std::swap(_start, v._start);
	std::swap(_size, v._size);
	std::swap(_capacity, v._capacity);
}

 So that the object on the right-hand side of the assignment is not affected, the parameter is passed by value rather than by reference: the argument is copy-constructed into tmp, and we then swap with tmp.

vector<T>& operator = (vector<T> tmp)
{
    swap(tmp);
    return *this;
}
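The by-value parameter is initialized by the copy constructor, which this walkthrough does not show. If the compiler-generated one is used, tmp shares its buffer with the right-hand side and that buffer is freed twice. A deep-copy constructor might look like the sketch below; IntBuf is a hypothetical, stripped-down class that stores int only to keep the example short.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stripped-down container: copy construction + swap idiom only.
class IntBuf
{
public:
	IntBuf() : _start(nullptr), _size(nullptr), _capacity(nullptr) {}

	// n copies of val
	IntBuf(std::size_t n, int val)
		: _start(n ? new int[n] : nullptr), _size(_start + n), _capacity(_start + n)
	{
		for (std::size_t i = 0; i < n; ++i) _start[i] = val;
	}

	// Deep copy: allocate a separate buffer and copy element by element.
	IntBuf(const IntBuf& other)
		: _start(nullptr), _size(nullptr), _capacity(nullptr)
	{
		std::size_t n = other.size();
		if (n > 0)
		{
			_start = new int[n];
			for (std::size_t i = 0; i < n; ++i) _start[i] = other._start[i];
			_size = _capacity = _start + n;
		}
	}

	~IntBuf() { delete[] _start; }

	void swap(IntBuf& v)
	{
		std::swap(_start, v._start);
		std::swap(_size, v._size);
		std::swap(_capacity, v._capacity);
	}

	// tmp is a deep copy of the right-hand side. After the swap it owns our
	// old buffer and releases it when it goes out of scope.
	IntBuf& operator=(IntBuf tmp)
	{
		swap(tmp);
		return *this;
	}

	std::size_t size() const { return static_cast<std::size_t>(_size - _start); }
	int& operator[](std::size_t i) { return _start[i]; }

private:
	int* _start;
	int* _size;
	int* _capacity;
};
```

After b = a, writing to b[0] leaves a[0] untouched, because the two objects own separate buffers.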

destructor

		// Destructor
		~vector()
		{
			delete[] _start;
			_start = _size = _capacity = nullptr;
		}
		 

 [] operator overloading

T& operator[](size_t n)
{
	assert(n < size());
	return *(_start+n);
}		

resize

 resize follows much the same logic as in string; the only thing to watch is filling in the default value.

When n > capacity(), the capacity needs to be expanded; when n > size(), the extra slots are filled with val; when n <= size(), the size is simply reset to n.

void resize(const size_t n,T val = T())
{
	if (n > capacity())
	{
		reserve(n);
	}

    if (n > size())
    {
		while (_size < _start + n)
	    {
			*_size = val;
			++_size;
		}
	}
	else
	{
		_size = _start + n;
	}
}

To support custom types, the default argument for val is a value-initialized temporary, T(). For class types this calls the default constructor, and it is also valid for built-in types, where it yields zero.

T val = T()
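A quick way to see what T() produces for different types is the sketch below; default_value is a helper name invented for this illustration.

```cpp
#include <string>

// Returns a value-initialized T: zero (or a null pointer) for built-in types,
// the default-constructed object for class types.
template<class T>
T default_value()
{
	return T();
}
```

default_value<int>() is 0, default_value<int*>() is a null pointer, and default_value<std::string>() is an empty string, which is why one default argument can serve every element type.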

pop_back

		void pop_back()
		{
			assert(_size  > _start);

			--_size;
		}

Insert

 Let's look at the interface description first.

 insert takes an iterator and inserts the data at that position, so we first implement a simple iterator.

// Iterators
typedef T* iterator;

iterator begin() 
{
	return _start;
}

iterator end()
{
	// one past the last element
	return _size;
}

A range-based for loop is expanded by the compiler into a loop over begin() and end(), with auto deducing the element type. So once our container provides these two functions, range for works on it and traverses the container normally.
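The sketch below shows that expansion side by side with the hand-written loop. Digits is a hypothetical fixed-size container whose iterator is a plain pointer, just like in our vector.

```cpp
#include <cstddef>

// Hypothetical fixed-size container holding 1..5.
class Digits
{
public:
	typedef int* iterator;

	Digits()
	{
		for (std::size_t i = 0; i < 5; ++i) _data[i] = static_cast<int>(i) + 1;
	}

	iterator begin() { return _data; }
	iterator end() { return _data + 5; }   // one past the last element

private:
	int _data[5];
};

// Range for: the compiler turns this into a begin()/end() loop.
int sum_range_for(Digits& d)
{
	int sum = 0;
	for (auto& e : d) sum += e;
	return sum;
}

// Roughly what the range for above expands to.
int sum_manual(Digits& d)
{
	int sum = 0;
	for (Digits::iterator it = d.begin(); it != d.end(); ++it) sum += *it;
	return sum;
}
```

Both functions visit the same five elements, which only works because end() points one past the last element rather than at it.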

Next is the main logic of insert

		void insert(iterator n, const T& val)
		{
			assert(n >= _start);
			// Expand if full
			if (_size == _capacity)
			{
				size_t newcapacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(newcapacity);
			}

			if (n < _size)
			{
				iterator end = _size;
				while (end > n)
				{
					*end = *(end - 1);
					--end;
				}
				*n = val;
				++_size;
			}

		}

 It crashed. What happened? Let's debug it.

Debugging shows that n is not between _start and _size at all, so the while loop runs far too many times. The problem only appears when the capacity is expanded, which leads to a first conclusion: after expansion, n no longer points into the vector's buffer. This is the iterator invalidation problem.

 iterator invalidation

The cause is that reserve moves the elements to a new buffer and frees the old one, while the incoming iterator (n in the code, called pos in the discussion below) still points into the old buffer.

 To move pos to the right place in the new space, we save its offset from _start before expanding, in the same way that reserve saves the old size, and then recompute pos from the new _start.

		void insert(iterator n, const T& val)
		{
			// n == _size is allowed: it inserts at the end
			assert(n >= _start && n <= _size);

			// Expansion invalidates the iterator, so it must be updated
			if (_size == _capacity)
			{
				size_t length = n - _start;
				size_t newcapacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(newcapacity);
				n = _start + length;
			}

			// Shift [n, _size) one slot to the right
			iterator end = _size;
			while (end > n)
			{
				*end = *(end - 1);
				--end;
			}

			*n = val;
			// One more element: advance _size
			++_size;
		}

That is the specific cause of the failure inside insert. What "iterator invalidation" usually refers to, though, is that the caller's iterator can no longer be used after insert. Because the iterator is passed by value, the copy outside insert is not updated; if the caller keeps using it, it may access memory out of bounds.
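For comparison, std::vector::insert returns an iterator to the newly inserted element, so callers of the standard container can refresh their iterator instead of reusing the old one. The sketch below shows that pattern; insert_before is a hypothetical helper, and our simulated insert returns void, so it would need the same return value to be used this way.

```cpp
#include <vector>

// Inserts val before the first element equal to key. Returns the value that
// follows the inserted element, or -1 if key is not present.
int insert_before(std::vector<int>& v, int key, int val)
{
	std::vector<int>::iterator it = v.begin();
	while (it != v.end() && *it != key) ++it;
	if (it == v.end()) return -1;

	// insert may reallocate, so the old `it` must not be used afterwards.
	// The return value is a valid iterator to the inserted element.
	it = v.insert(it, val);
	return *(it + 1);   // the element that was originally found
}
```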

erase

void erase(iterator n)
{
	assert(n >= _start);
	assert(n < _size);
			
	if (n == _size - 1)
	{
		pop_back();
		return;
	}

	iterator cur = n + 1;
	while (cur < _size)
	{
		*(cur - 1) = *cur;
		++cur;
	}
	--_size;
}
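An iterator passed to erase has the same problem on the caller's side: after the call it no longer refers to the element it used to. std::vector::erase returns the iterator following the removed element, and the usual loop relies on that, as in the sketch below. remove_evens is a hypothetical helper written against std::vector; the simulated erase above returns void and would need to return n to support this pattern.

```cpp
#include <vector>

// Removes every even value from v.
void remove_evens(std::vector<int>& v)
{
	std::vector<int>::iterator it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
			it = v.erase(it);   // continue from the element after the erased one
		else
			++it;
	}
}
```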

2D array problem

 vector is meant to serve all kinds of element types, so storing other containers inside a vector should work as well. Let's try to build a two-dimensional array with our own vector.
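The test itself is not reproduced in the text. Its shape is roughly as follows, shown here against std::vector; build_rows is a hypothetical helper, and swapping in the simulated vector is an assumption about how the original test was written.

```cpp
#include <cstddef>
#include <vector>

// Pushes `rows` rows of `cols` copies of val, one row at a time, so the outer
// vector has to expand several times along the way.
std::vector<std::vector<int>> build_rows(std::size_t rows, std::size_t cols, int val)
{
	std::vector<std::vector<int>> vv;
	for (std::size_t i = 0; i < rows; ++i)
	{
		vv.push_back(std::vector<int>(cols, val));
	}
	return vv;
}
```

With only a few rows the outer buffer never grows; pushing more rows than the initial capacity is what exercises reserve on elements that are themselves vectors.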

 At first there is no problem, but as soon as an expansion occurs, the program crashes.

 Debugging shows that the crash happens in the destructor.

 Tracing back to the source: an error in the destructor means something went wrong when the space was allocated and filled, so the culprit should be our reserve.

 So why does the error only show up with a two-dimensional array?

memcpy copies the inner vectors byte by byte, so each copy in the new space holds the same _start pointer as the original in the old space. delete[] on the old space then runs the inner destructors and frees those inner buffers, leaving the new elements with dangling pointers that are freed a second time later. To avoid this, we abandon memcpy in favor of a deep copy: after opening the new space, each element is assigned with operator=, which produces a copy with the same content at a different address, and that copy is what lives in the expanded space.
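Whether a byte-wise copy is valid for a given element type can be checked at compile time with the standard trait std::is_trivially_copyable. The sketch below wraps it in a helper; memcpy_is_safe is a name invented for this illustration and is not part of the implementation above.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// True when copying the raw bytes of a T (as memcpy does) is a valid copy.
template<class T>
constexpr bool memcpy_is_safe()
{
	return std::is_trivially_copyable<T>::value;
}
```

The helper is true for int and double but false for std::string and std::vector<int>, which own heap memory and therefore need the element-by-element copy.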

		void reserve(const size_t n)
		{
			if (n > capacity())
			{
				T* tmp = new T[n];

				size_t oldsize = size();

				if (_start)
				{
					for (size_t i = 0; i < oldsize; ++i)
					{
						tmp[i] = _start[i];
					}
                    // Free the old space to prevent a memory leak
					delete[]_start;
				}



				_start =  tmp;
				_size  =  tmp + oldsize;
				_capacity = _start + n;
			}
		}

Conclusion

 Through this simulated implementation we have covered the basic structure of vector and how its interfaces work. In essence it is close to the sequence table; the difference is that, to serve built-in and custom types generically, its member variables are iterators, which here are simply pointers to the template parameter type T. It is not hard to implement, but the details (expansion, iterator invalidation, deep copies) need extra care.

 That concludes this overview of simulating vector. Thanks for reading, and I hope it helps a little!


Origin blog.csdn.net/m0_53607711/article/details/128760687