[STL] Implementing a simple vector by hand

Table of contents

1. Read the source code

2. Framework construction

3. Iterator of vector

4. Copy construction and assignment of vector

copy construction

assignment

5. Common important interface implementations of vector

implementation of operator[ ]

Implementation of the insert interface

Erase interface implementation

Implementation of the pop_back interface

resize interface implementation

Source code sharing

Write at the end:


1. Read the source code

If you want to implement a vector yourself, reading the library source to understand how it works is an essential first step.

But when we open the source of vector and face a large pile of code, where should we start?

Start from the core of the class, that is, from its member variables.

The member variables are all of type iterator. What is that?

Let's trace it back to its definition:

iterator turns out to be a typedef for T*.

So we can already guess that the iterator of vector is really just a native pointer.

Back to the topic: what role do these member variables play?

To find out, we can look at the constructor and the insertion interface.

Start with the constructors. Most of them are overloads whose real work is wrapped in helper functions,

and the default constructor does nothing special: it just initializes the three members to 0.

So we still have to look at the insertion interface.

The push_back in the source says: if finish != end_of_storage, call construct and then ++finish.

From this we can make a first guess about the three members start, finish, and end_of_storage:

start is the beginning of the array and finish is the end of the data (one past the last element),

while end_of_storage is the end of the allocated capacity. The if statement therefore checks whether the array still has room;

if it does, one element is constructed in place and finish is advanced. We don't know yet what construct is,

but don't get caught up in details when reading source code. Look at the overall framework first.

We can guess that the else branch handles the case where the array has to grow. It calls insert_aux,

so let's look at that function next.

The function is large, so let's analyze it piece by piece.

It begins with the same capacity check again, because insert_aux is not only called from push_back,

so the check is needed when it is called from elsewhere.

The else branch comes next, starting with the growth strategy:

the capacity becomes 1 on the first expansion and doubles in every other case.
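The growth rule just described can be sketched as a tiny helper. The name next_capacity is hypothetical and does not come from the library source; it only illustrates the "1 the first time, double afterwards" policy.

```cpp
#include <cstddef>

// Hypothetical helper: illustrates the growth rule, not the library's code.
// An empty vector grows to capacity 1; otherwise the capacity doubles.
inline std::size_t next_capacity(std::size_t old_capacity) {
    return old_capacity != 0 ? 2 * old_capacity : 1;
}
```

Starting from 0, repeated growth gives 1, 2, 4, 8, and so on, so a long run of push_back calls only reallocates occasionally.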

Next, allocate is called. This goes through STL's own allocator (the "space configurator") to request memory.

Presumably the STL authors considered calling malloc for every allocation too slow, so the allocator implements a memory pool internally.

The next piece of logic copies the existing data into the new space,

and then calls construct to insert the new element. I still won't look at its implementation here.

Finally, the old space is released and the member variables are updated.

One small addition:

this code uses try/catch to handle exceptions. To prevent a memory leak, the catch block destroys what was built and frees the new memory.
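To make the leak-prevention idea concrete, here is a minimal sketch written with plain try/catch and new[]/delete[] rather than the library's allocator. The helper copy_to_new_buffer and the test type Fragile are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not the library's code): copy `count` elements into a
// freshly allocated buffer of `new_cap` slots. If any copy throws, the new
// buffer is released before the exception is passed on, so nothing leaks.
template <class T>
T* copy_to_new_buffer(const T* src, std::size_t count, std::size_t new_cap) {
    T* tmp = new T[new_cap];
    try {
        for (std::size_t i = 0; i < count; ++i) {
            tmp[i] = src[i];   // may throw for a user-defined T
        }
    } catch (...) {
        delete[] tmp;          // undo the allocation
        throw;                 // rethrow to the caller
    }
    return tmp;
}

// Hypothetical test type: counts live objects and can be told to throw.
struct Fragile {
    static int live;     // number of Fragile objects currently alive
    static int budget;   // assignments allowed before one throws
    Fragile() { ++live; }
    Fragile(const Fragile&) { ++live; }
    ~Fragile() { --live; }
    Fragile& operator=(const Fragile&) {
        if (budget == 0) throw std::runtime_error("copy failed");
        --budget;
        return *this;
    }
};
int Fragile::live = 0;
int Fragile::budget = 1000;
```

If an assignment throws halfway through, the count of live Fragile objects returns to what it was before the call, which is the property the catch block is there to guarantee.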

I do have to complain about a habit of old-school C++ programmers here: the exception handling is written with macros, and piles of macros make the code uncomfortable to read.

Back to the topic. We want to understand the meaning of the member variables, but push_back is wrapped in several layers,

so let's check our guess against a simpler growth routine, reserve.

Ignore everything else and look only at what happens to the member variables:

start = tmp, which all but confirms that start points to the beginning of the array;

finish = tmp + old_size, where old_size is the number of elements before the expansion, so our guess about finish is right too;

end_of_storage = start + n, where n is the capacity requested from reserve.

Now we have a general understanding of what the member variables mean.
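The roles of the three pointers can be summarized in a small sketch. The struct name ThreePointers and its helper functions are illustrative assumptions, not the library's declarations.

```cpp
#include <cstddef>

// Minimal sketch of the layout (names are illustrative, not the library's):
//
//   start              finish            end_of_storage
//     |                   |                    |
//     v                   v                    v
//     [  used elements   ][   spare capacity   ]
template <class T>
struct ThreePointers {
    T* start;           // first element
    T* finish;          // one past the last element in use
    T* end_of_storage;  // one past the end of the allocated block

    std::size_t size() const {
        return static_cast<std::size_t>(finish - start);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_of_storage - start);
    }
    bool full() const { return finish == end_of_storage; }
};
```

Size and capacity are never stored; both fall out of pointer subtraction, and the "is there room" test in push_back is a single pointer comparison.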

As promised, let's see what the implementation of construct looks like.

It turns out that construct is just a placement new. When we set up storage for an element of a custom type,

raw memory from malloc is not enough; the constructor of that type has to be called on it.

And why is there a matching destroy? Because when resources are cleaned up, destroy

calls the destructor of the custom type so the element can release whatever it owns.
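Here is a minimal sketch of that pair of operations. The helper names construct_sketch and destroy_sketch and the Counted test type are assumptions made for this example; they mirror the idea, not the exact signatures in the library.

```cpp
#include <new>   // placement new

// Illustrative helpers in the spirit of construct/destroy.
template <class T>
void construct_sketch(T* p, const T& value) {
    ::new (static_cast<void*>(p)) T(value);  // run T's copy constructor on raw memory
}

template <class T>
void destroy_sketch(T* p) {
    p->~T();                                 // run the destructor, keep the memory
}

// Hypothetical type that counts how many objects are alive.
struct Counted {
    static int alive;
    int id;
    explicit Counted(int i) : id(i) { ++alive; }
    Counted(const Counted& other) : id(other.id) { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;
```

Allocation and construction are separate steps here: the raw block can come from ::operator new (or malloc), and the object only comes to life when construct_sketch runs on it.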

2. Framework construction

Without further ado, let's start writing our own vector.

First we put up a quick skeleton so the code can run:

#pragma once

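#include <cassert>   // assert, used by operator[], insert, and erase below
#include <iostream>
#include <vector>    // std::vector, used later for comparison with the library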

using namespace std;

namespace xl {
	template<class T>
	class vector {
    public:
		typedef T* iterator;

	private:
		iterator _start;
		iterator _finish;
		iterator _end_of_storage;

	public:
		vector()
			: _start(nullptr)
			, _finish(nullptr)
			, _end_of_storage(nullptr)
		{}
		
	public:
		iterator begin() {
			return _start;
		}

		iterator end() {
			return _finish;
		}
			 
	public:
		void reserve(size_t n) {
			if (n > capacity()) {
				size_t old_size = size(); // save the size before _start changes
				T* tmp = new T[n];
				if (_start) {
					// copy element by element so that T's assignment operator runs
					for (size_t i = 0; i < old_size; i++) {
						tmp[i] = _start[i];
					}
					delete[] _start;
				}
				_start = tmp;
				_finish = _start + old_size;
				_end_of_storage = _start + n;
			}
		}

		void push_back(const T& x) {
			if (_finish == _end_of_storage) {
				size_t new_capacity = capacity() == 0 ? 4 : capacity() * 2;
				reserve(new_capacity);
			}
			*_finish = x;
			_finish++;
		}

	public:
		// size_t rather than int: these values are compared against size_t elsewhere
		size_t size() const {
			return _finish - _start;
		}

		size_t capacity() const {
			return _end_of_storage - _start;
		}
	};
}

We have implemented a constructor, push_back, and a basic iterator.

Now we can run the code:

void test() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	for (auto e : v) cout << e << " ";
	cout << endl;
}

Here we use range-based for directly: it is built on begin() and end(), so if range-based for works, the iterator works too.
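To see why range-based for doubles as an iterator test, the sketch below writes the same loop twice: once by hand with begin() and end(), and once with range-based for. IntRange is a hypothetical stand-in container whose iterators are raw pointers, like our vector's.

```cpp
// A hypothetical minimal container: begin()/end() returning raw pointers is
// all the range-based for loop needs.
struct IntRange {
    int data[5];
    int* begin() { return data; }
    int* end() { return data + 5; }
};

// Roughly what `for (auto e : r) total += e;` expands to.
inline int sum_by_hand(IntRange& r) {
    int total = 0;
    for (int* it = r.begin(); it != r.end(); ++it) {
        int e = *it;
        total += e;
    }
    return total;
}

// The same loop written with range-based for.
inline int sum_range_for(IntRange& r) {
    int total = 0;
    for (auto e : r) total += e;
    return total;
}
```

Both functions visit the same elements in the same order, so a broken begin() or end() would show up immediately in a range-based for loop.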

The program prints 1 2 3 4 5.

Next we add the all-important destructor:

~vector()
{
	if (_start) {
		delete[] _start;
		_start = _finish = _end_of_storage = nullptr;
	}
}

3. Iterator of vector

Let's look at such a scenario:

void Print(const xl::vector<int>& v) {
	for (auto e : v) cout << e << " ";
	cout << endl;
}

void test2() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	for (auto e : v) cout << e << " ";
	cout << endl;

	Print(v);
}

This does not compile.

Why?

When we built the skeleton, we only implemented the ordinary (non-const) begin() and end().

Print takes its parameter by const reference, and calling a non-const member function on a const object would widen its access rights (drop the const), which is not allowed.

So we need to add const overloads that return a const iterator:

public:
    typedef T* iterator;
    typedef const T* const_iterator;

public:
	iterator begin() {
		return _start;
	}

	iterator end() {
		return _finish;
	}

	const_iterator begin() const {
		return _start;
	}

	const_iterator end() const {
		return _finish;
	}

Let's quickly test it out:

void Print(const xl::vector<int>& v) {
	for (auto e : v) cout << e << " ";
	cout << endl;
}

void test2() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	for (auto e : v) cout << e << " ";
	cout << endl;

	Print(v);
}

Output: the sequence 1 2 3 4 5 is printed twice, once by the loop and once by Print.

4. Copy construction and assignment of vector

copy construction

Here is the traditional way of writing:

// Traditional approach: allocate, then copy element by element
vector(const vector<T>& v)
	: _start(nullptr)
	, _finish(nullptr)
	, _end_of_storage(nullptr)
{
	_start = new T[v.capacity()];
	for (size_t i = 0; i < v.size(); i++) {
		_start[i] = v._start[i];
	}
	_finish = _start + v.size();
	_end_of_storage = _start + v.capacity();
}

Of course, there are many ways to implement it; use whichever you are most comfortable with~

assignment

Here I go straight to the modern approach (copy-and-swap), because it is really convenient to implement:

void swap(vector<T>& v) {
	std::swap(_start, v._start);
	std::swap(_finish, v._finish);
	std::swap(_end_of_storage, v._end_of_storage);
}

// Modern approach: v is a by-value copy; swap with it and let it free our old data
vector<T>& operator=(vector<T> v) {
	swap(v);
	return *this;
}
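Why copy-and-swap works is easier to see on a small standalone class. The sketch below uses a hypothetical IntBuffer (not part of our vector) with the same three ingredients: a copy constructor, a swap, and an assignment operator that takes its argument by value.

```cpp
#include <cstddef>
#include <utility>  // std::swap

// Hypothetical class used only to illustrate copy-and-swap.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n, int fill = 0)
        : _data(new int[n]), _size(n) {
        for (std::size_t i = 0; i < n; ++i) _data[i] = fill;
    }
    IntBuffer(const IntBuffer& other)
        : _data(new int[other._size]), _size(other._size) {
        for (std::size_t i = 0; i < _size; ++i) _data[i] = other._data[i];
    }
    ~IntBuffer() { delete[] _data; }

    void swap(IntBuffer& other) {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    // `other` is already a copy (made by the copy constructor at the call
    // site). After the swap it holds our old buffer and frees it on return.
    IntBuffer& operator=(IntBuffer other) {
        swap(other);
        return *this;
    }

    std::size_t size() const { return _size; }
    int& operator[](std::size_t i) { return _data[i]; }

private:
    int* _data;
    std::size_t _size;
};
```

The assignment operator never allocates or frees anything itself: the copy constructor does the allocation and the destructor of the by-value parameter does the cleanup. Self-assignment is also safe, since the copy is made before anything is swapped.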

5. Common important interface implementations of vector

implementation of operator[ ]

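// Return a reference so that v[i] = x modifies the stored element.
T& operator[](size_t pos) {
	assert(pos < size());
	return _start[pos];
}

// const overload: read-only access for const vectors.
const T& operator[](size_t pos) const {
	assert(pos < size());
	return _start[pos];
}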

Let's test it out:

void test1() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	for (int i = 0; i < 5; i++) {
		cout << v[i] << " ";
	}
	cout << endl;
}

Output: 1 2 3 4 5

Implementation of the insert interface

void insert(iterator pos, const T& x) {
	assert(pos >= _start && pos <= _finish);

	if (_finish == _end_of_storage) {
		size_t len = pos - _start; // guard against iterator invalidation (after growing, pos would still point into the old block)
		size_t new_capacity = capacity() == 0 ? 4 : capacity() * 2;
		reserve(new_capacity);
		pos = _start + len;
	}
	iterator end = _finish - 1;
	while (end >= pos) {
		*(end + 1) = *end;
		end--;
	}
	*pos = x;
	_finish++;
}

After implementing insert, we don't need to implement push_back ourselves anymore.

Just reuse insert directly:

void push_back(const T& x) {
	//if (_finish == _end_of_storage) {
	//	size_t new_capacity = capacity() == 0 ? 4 : capacity() * 2;
	//	reserve(new_capacity);
	//}
	//*_finish = x;
	//_finish++;

	insert(end(), x);
}

Let's test it out:

void test2() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	for (auto e : v) cout << e << " ";
	cout << endl;
}

This is the same test code as before, and it still prints 1 2 3 4 5.

That takes care of iterator invalidation inside insert.

Now consider this scenario:

void test2() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(4);
	v.push_back(5);

	xl::vector<int>::iterator it = v.begin() + 2;
	v.insert(it, 3);
	*it += 10;

	for (auto e : v) cout << e << " ";
	cout << endl;

}

We insert a 3 and then add 10 to it through it, so we would expect 13 to be printed.

However, the output still shows 3.

Why?

Stepping through in a debugger, everything looks normal up to the call to insert.

After insert returns, though, it points at garbage. Why?

Inside insert we did guard against invalidation by re-pointing pos after the expansion,

but pos is a by-value parameter, so changing it does not affect the caller's it. The old space was freed during the expansion, so it is left dangling and invalid.

So how do we solve it?

Let's see how the library source handles it (when you are stuck on a detail like this, the source is a good place to look).

The source solves the problem with a return value:

insert returns an iterator pointing to the newly inserted element. If the source is hard to follow, the documentation says the same thing:

it describes the return value as an iterator that points to the newly inserted element, so the caller should assign the result of insert back to its iterator instead of reusing the old one.
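Applied to our own class, the fix could look like the sketch below. It is a trimmed-down copy of the vector from this article placed in a hypothetical namespace sketch so that it compiles on its own; only insert differs, in that it returns pos.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
template <class T>
class vector {
public:
    typedef T* iterator;

    vector() : _start(nullptr), _finish(nullptr), _end_of_storage(nullptr) {}
    ~vector() { delete[] _start; }
    vector(const vector&) = delete;             // copying omitted in this sketch
    vector& operator=(const vector&) = delete;

    iterator begin() { return _start; }
    iterator end() { return _finish; }
    std::size_t size() const { return static_cast<std::size_t>(_finish - _start); }
    std::size_t capacity() const { return static_cast<std::size_t>(_end_of_storage - _start); }

    void reserve(std::size_t n) {
        if (n <= capacity()) return;
        std::size_t old_size = size();
        T* tmp = new T[n];
        for (std::size_t i = 0; i < old_size; ++i) tmp[i] = _start[i];
        delete[] _start;
        _start = tmp;
        _finish = _start + old_size;
        _end_of_storage = _start + n;
    }

    // Returns an iterator to the newly inserted element, which stays valid
    // even if the insertion reallocated the storage.
    iterator insert(iterator pos, const T& x) {
        assert(pos >= _start && pos <= _finish);
        if (_finish == _end_of_storage) {
            std::size_t len = static_cast<std::size_t>(pos - _start);  // remember the offset
            reserve(capacity() == 0 ? 4 : capacity() * 2);
            pos = _start + len;                                        // re-point into the new block
        }
        for (iterator it = _finish; it != pos; --it) *it = *(it - 1);  // shift right by one
        *pos = x;
        ++_finish;
        return pos;
    }

private:
    iterator _start;
    iterator _finish;
    iterator _end_of_storage;
};
}  // namespace sketch
```

With this version the earlier test is written as it = v.insert(it, 3); followed by *it += 10;, and the element that was inserted becomes 13 as intended.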

Erase interface implementation

void erase(iterator pos) {
	assert(pos >= _start && pos < _finish);
	iterator it = pos + 1;
	while (it != _finish) {
		*(it - 1) = *it;
		it++;
	}
	_finish--;
}

Let's test it out:

void test3() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);

	xl::vector<int>::iterator it = v.begin();
	v.erase(it);

	for (auto e : v) cout << e << " ";
	cout << endl;
}

Output: 2 3 4

That looks fine, but it is not the whole story. Consider another scenario:

void test3() {
	xl::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	xl::vector<int>::iterator it = v.begin();
	while (it != v.end()) {
		v.erase(it);
		it++;
	}

	for (auto e : v) cout << e << " ";
	cout << endl;
}

The program crashes.

Why did it crash?

After erase, the iterator passed in should be treated as invalid: the elements behind it have shifted forward, so it++ skips one element each round and eventually steps past end(). Let's try the same loop with the library's vector.

Running the same code with std::vector:

The VS implementation performs a strict check and does not allow us to use an iterator after it has been passed to erase,

so the it++ makes the program report an error.

How does the library expect us to handle this?

Again we have to look at the source.

It solves this problem with a return value as well:

erase returns an iterator to the same position, which now refers to the element that followed the erased one.

Let's test that with the library's vector:

void test3() {
	vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	v.push_back(5);

	vector<int>::iterator it = v.begin();
	while (it != v.end()) {
		it = v.erase(it);
	}

	for (auto e : v) cout << e << " ";
	cout << endl;
}

Nothing is printed, so every element was indeed deleted.

Now let's change our own code the same way:

iterator erase(iterator pos) {
	assert(pos >= _start && pos < _finish);
	iterator it = pos + 1;
	while (it != _finish) {
		*(it - 1) = *it;
		it++;
	}
	_finish--;
	return pos;
}

With this change, a loop written as it = v.erase(it) works with our vector too.
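A common use of the returned iterator is erasing only some elements while walking the container. The sketch below uses std::vector so that it stands alone; remove_evens is a hypothetical helper name, and the same loop shape should apply to our vector once erase returns an iterator.

```cpp
#include <vector>

// Remove every even value. erase returns the iterator that follows the
// removed element, so we only advance manually when nothing was erased.
inline void remove_evens(std::vector<int>& v) {
    std::vector<int>::iterator it = v.begin();
    while (it != v.end()) {
        if (*it % 2 == 0) {
            it = v.erase(it);   // it now refers to the next element
        } else {
            ++it;
        }
    }
}
```

The key point is that the loop never does both erase and ++it in the same round; doing both is exactly what skipped elements in the crashing example.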

Implementation of the pop_back interface

Just reuse erase directly:

void pop_back() {
	erase(end() - 1);
}

resize interface implementation

First, one new piece of knowledge:

Once C++ gained templates, built-in types were upgraded so that they can also be initialized with constructor-style syntax: int() yields 0, which lets generic code write T() for any type.

Look at the code:

void test4() {
	int i = 0;
	int j = 1;

	int a = int();
	int b = int(1);

	cout << i << " " << j << " " << a << " " << b << endl;
}

Output: 0 1 0 1
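The reason this matters for templates can be shown with a one-line generic function. default_value is a hypothetical helper name used only for this sketch.

```cpp
#include <string>

// T() value-initializes: built-in types become zero, class types run their
// default constructor. The same expression works for every T.
template <class T>
T default_value() {
    return T();
}
```

For example, default_value<int>() is 0, default_value<int*>() is a null pointer, and default_value<std::string>() is an empty string, all produced by the same T() expression.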

Ok, let's look at this interface again:

void resize(size_t n, const T& val = T()) {
	if (n < size()) {
		_finish = _start + n;
	}
	else {
		reserve(n); // make sure there is room for n elements (does nothing if capacity() >= n)
		while (_finish != _start + n) {
			*_finish = val;
			_finish++;
		}
	}
}

By giving val the default value T(), resize works for both custom types and built-in types.
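The three cases resize has to handle (grow with the default value, grow with an explicit value, shrink) can be checked against the library's std::vector, whose behavior our version is meant to mirror. resize_steps is a hypothetical helper that records the contents after each call.

```cpp
#include <vector>

// Records the contents of a vector after each resize call.
inline std::vector<std::vector<int>> resize_steps() {
    std::vector<std::vector<int>> snapshots;
    std::vector<int> v;
    v.resize(3);     snapshots.push_back(v);  // grow, default value int() == 0
    v.resize(5, 9);  snapshots.push_back(v);  // grow, explicit fill value
    v.resize(2);     snapshots.push_back(v);  // shrink, extra elements dropped
    return snapshots;
}
```

The snapshots are {0, 0, 0}, then {0, 0, 0, 9, 9}, then {0, 0}. Running the same three calls on xl::vector is a quick way to compare the two implementations.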

Of course, vector has many more interfaces than these, but we have implemented the core ones; if you are interested, you can implement the rest yourself~

Source code sharing

Gitee Link: Simple STL Simulated: Simulated Simple STL (gitee.com)

Write at the end:

The above is the content of this article, thank you for reading.

If you feel that you have gained something, you can give the blogger a like.

If there are omissions or mistakes in the article, please send the blogger a private message or point them out in the comments~

Origin blog.csdn.net/Locky136/article/details/131740411