C++ STL vector: from function prototypes to a block-by-block implementation

Contents

1. Function prototypes, analyzed through a simple example

2. Iterator design ideas and analysis of the vector iterator

3. A minimal overall framework that quickly shows results

4. Key functions, analyzed and implemented block by block

Design of the constructors: write one, then reuse it for the rest

range construction

copy construction

assignment overloading

move construction (steal the resources of an expiring object)

Design of insert and erase: return values, where iterators are invalidated, and details that are easy to get wrong

insert

erase

5. Overall code (implementation code)

6. Summary: how to write an STL container, iterator design ideas (what an iterator is), and iterator invalidation

A methodology for implementing an STL container


1. Function prototypes, analyzed through a simple example

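  • A simple usage test:
#include <iostream>
#include <vector>
using namespace std;

int main() {
	std::vector<int> vint;
	vint.reserve(3);                  // changes the capacity only
	cout << vint.capacity() << endl;  // at least 3
	vint.resize(10, 5);               // changes the size: ten elements, each 5
	cout << vint.size() << endl;      // 10
	for (auto& e : vint) {
		cout << e << " ";
	}
	cout << endl;
	return 0;
}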

 This shows that resize changes the size, while reserve changes the capacity by allocating space in advance. Calling reserve up front reduces the number of expansions and therefore improves efficiency.
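To make the point about reserve concrete, here is a small sketch that counts how often capacity() changes while elements are appended, with and without a reserve call. The helper names countReallocations and reserveAvoidsReallocation are assumptions made for this illustration, and the exact growth factor of std::vector is implementation-defined, so only the comparison matters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and counts how many times
// capacity() changes, i.e. how many expansions took place.
std::size_t countReallocations(std::vector<int>& v, std::size_t n) {
	std::size_t changes = 0;
	std::size_t lastCapacity = v.capacity();
	for (std::size_t i = 0; i < n; ++i) {
		v.push_back(static_cast<int>(i));
		if (v.capacity() != lastCapacity) {
			++changes;
			lastCapacity = v.capacity();
		}
	}
	return changes;
}

// Compares a vector that grows on demand with one that reserved up front.
bool reserveAvoidsReallocation() {
	std::vector<int> plain;
	std::vector<int> reserved;
	reserved.reserve(1000); // single allocation before any push_back

	std::size_t plainCount = countReallocations(plain, 1000);
	std::size_t reservedCount = countReallocations(reserved, 1000);

	assert(reservedCount == 0); // the capacity was already sufficient
	return plainCount > reservedCount;
}
```

Calling reserveAvoidsReallocation() from main should return true: the reserved vector never expands, while the plain one expands several times on its way to 1000 elements.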

  • The iterator functions (begin, end and so on) return iterators. Only a container that provides iterators supports range-based traversal: the for (auto& e : vint) loop used above works only because iterators back it.
  • To verify this, we simply use a function template over iterators to simulate an implementation of for_each.
  • A small but very important point: taking the iterator type and the functor type as template parameters is a very common pattern. It is used whenever a function has to be applied to every element of a range:
  • template<class InputIterator, class Function>
    	void Func(InputIterator first, InputIterator last, Function f) {
    		while (first != last) {
    			f(*first);
    			++first;
    		}
    		return;
    	}

    For example:

#include <iostream>
#include <vector>
using namespace std;

namespace tyj {
	// Functor that prints one element followed by a space
	template<class T >
	struct Print {
		void operator()(T& val)const {
			cout << val << " ";
		}
	};
	// Applies f to every element in [first, last)
	template<class InputIterator, class Function>
	void for_each(InputIterator first, InputIterator last, Function f) {
		while (first != last) {
			f(*first);
			++first;
		}
	}
}

int main() {
	std::vector<int> vint;
	vint.reserve(3);
	cout << vint.capacity() << endl;
	vint.resize(10, 5);
	cout << vint.size() << endl;
	//for (auto& e : vint) {
	//	cout << e << " ";
	//}
	tyj::for_each(vint.begin(), vint.end(), tyj::Print<int>());
	cout << endl;
	return 0;
}
  • This shows that the for (auto& e : container) traversal is itself implemented on top of iterators.
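The same point can be checked from the other direction. The sketch below assumes a hypothetical type IntBox whose only interface is begin() and end(); that alone is enough for a range-based for loop, and the hand-written iterator loop next to it is roughly what the compiler generates. All names here are made up for the illustration.

```cpp
#include <cassert>

// Hypothetical minimal container: all it offers is begin() and end().
struct IntBox {
	int data[4];
	int* begin() { return data; }
	int* end() { return data + 4; }
};

// Sums the elements with a range-based for loop.
int sumWithRangeFor(IntBox& box) {
	int sum = 0;
	for (auto& e : box) {
		sum += e;
	}
	return sum;
}

// Roughly what the compiler generates for the loop above:
// an ordinary iterator loop from begin() to end().
int sumWithIterators(IntBox& box) {
	int sum = 0;
	for (auto it = box.begin(), last = box.end(); it != last; ++it) {
		auto& e = *it;
		sum += e;
	}
	return sum;
}

// Both loops visit the same elements, so the sums must agree.
void checkLoopsAgree(IntBox& box) {
	assert(sumWithRangeFor(box) == sumWithIterators(box));
}
```

If begin() or end() is removed from IntBox, the range-based loop no longer compiles, which is another way to see that it depends on iterators.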

2. Iterator design ideas and analysis of the vector iterator

  • An iterator behaves much like a pointer. It can be a class or simply a raw pointer: either a class that overloads the ++, --, -> and * operators, or a raw pointer that supports them natively.
  • The underlying storage of vector is a dynamic array that maintains one contiguous linear space, so ordinary raw pointers can be used directly as iterators: a pointer already supports operator++, operator-- and operator*.
		typedef T* iterator;
		typedef const T* const_iterator;
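As a quick check that raw pointers really are usable as iterators, the following sketch passes plain int pointers to std::find and std::sort from the standard library. The wrapper functions indexOf and sortArray are hypothetical names introduced only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns the index of value in the array [first, first + n), or n if absent.
// The raw pointers first and last are handed to std::find as iterators.
std::size_t indexOf(const int* first, std::size_t n, int value) {
	const int* last = first + n;
	const int* pos = std::find(first, last, value);
	return static_cast<std::size_t>(pos - first);
}

// Sorts a plain array in place, again using raw pointers as iterators.
void sortArray(int* first, std::size_t n) {
	std::sort(first, first + n);
	assert(std::is_sorted(first, first + n));
}
```

Because T* already satisfies everything these algorithms ask of an iterator, the two typedefs above are all the iterator design a contiguous container needs.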

3. A minimal overall framework that quickly shows results

  • A word of explanation first. When writing something large, we can reduce the cost of debugging by first implementing only the blocks needed to see a result. Once the general framework is in place and visibly works, the remaining parts can be filled in.
  • Expansion mechanism when inserting elements: if expansion is required, the capacity is generally doubled. Expansion has three steps: create the new space, transfer the data, and destroy the original space. Remember to transfer the data with a deep copy (element-by-element assignment), not memcpy. Why? memcpy only copies bytes, so for an element type that owns heap memory, such as string, the old and new elements would share the same buffer, and destroying the original space would leave the new elements pointing at freed memory.
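The three expansion steps can be sketched in isolation. The function growBuffer below is a hypothetical helper, not part of the vector in this article; it assumes the old buffer was allocated with new T[] and holds size valid elements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper showing the three-step expansion for a buffer
// allocated with new T[]. Returns the new buffer; the old one is freed.
template <class T>
T* growBuffer(T* oldData, std::size_t size, std::size_t newCapacity) {
	assert(newCapacity >= size);
	// 1. create the new space
	T* newData = new T[newCapacity];
	// 2. transfer the data element by element, so that T's own copy
	//    assignment runs and each element gets its own resources
	for (std::size_t i = 0; i < size; ++i) {
		newData[i] = oldData[i];
	}
	// 3. destroy the original space (runs the old elements' destructors)
	delete[] oldData;
	return newData;
}
```

With T = std::string, step 2 gives every new element its own character buffer. Had step 2 been a memcpy, step 3 would free buffers that the new elements still point to.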

 

#include <iostream>
using namespace std;

namespace tyj {
	template <class T>
	class vector {
	public:
		// Raw pointers serve as iterators; public so that callers can name the types
		typedef T* iterator;
		typedef const T* const_iterator;

		vector() // default constructor
			: start(nullptr)
			, finish(nullptr)
			, end_of_storage(nullptr) {
		}
		vector(int n, const T& val = T()) // construct n copies of val
			: start(new T[n]) {
			finish = start + n;
			end_of_storage = finish;
			for (int i = 0; i < n; ++i) {
				start[i] = val;
			}
		}

		iterator begin() {
			return start;
		}

		const_iterator begin() const {
			return start;
		}

		iterator end() {
			return finish;
		}

		const_iterator end() const {
			return finish; // one past the last element, same as the non-const overload
		}

		size_t size() const {
			return finish - start;
		}

		size_t capacity() const {
			return end_of_storage - start;
		}
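		void reserve(size_t n) { // allocate space in advance
			if (n <= capacity()) return;
			// A larger block is needed; three steps
			// 1. create the new space
			T* pTemp = new T[n];
			// 2. transfer the data element by element (deep copy, not memcpy)
			size_t _size = size();
			for (size_t i = 0; i < _size; ++i) {
				pTemp[i] = start[i];
			}
			// 3. destroy the original space
			delete[] start;
			start = pTemp;
			finish = start + _size;
			end_of_storage = start + n;
		}

		void push_back(const T& val) {
			if (size() == capacity()) {
				reserve(size() == 0 ? 8 : (size() << 1));
			}
			// store the element
			*finish = val;
			finish++; // move the end iterator forward
		}

		// Release the storage; copy control is left to the full implementation
		~vector() {
			delete[] start;
		}
	private:
		// valid elements live in [start, finish)
		iterator start; // start of the space in use
		iterator finish; // end of the space in use
		iterator end_of_storage; // end of the allocated space
	};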

	template <class T >
	struct Print {
		void operator()(T& val) const {
			cout << val << " ";
		}
	};

	template<class InputIterator, class Function>
	void for_each(InputIterator first, InputIterator last, Function f) {
		while (first != last) {
			f(*first);
			++first;
		}
	}
}

int main() {
	tyj::vector<int> vint;
	for (int i = 0; i < 5; ++i)
		vint.push_back(i);
	cout << vint.size() << endl;
	tyj::for_each(vint.begin(), vint.end(), tyj::Print<int>());
	cout << endl;
	return 0;
}

4. Key functions, analyzed and implemented block by block. Trivial parts are skipped; only the parts that may be unclear are written up.

  • Next comes the question of how to design and analyze each function.

Design of the constructors: write one, then reuse it for the rest

  • range construction

  • swap, which prepares the ground for code reuse

  • copy construction

  • assignment overloading

Assignment works the same way: the by-value parameter vec is built by reusing the copy constructor, and the function then steals vec's contents and makes them its own. It is like a bandit taking over someone's identity: I become you, and you become thin air.

  • move construction (even sneakier: it directly takes over the identity of an object that is about to die)
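The reuse chain in the list above can be sketched on a reduced class. IntVec is a hypothetical int-only container written for this illustration; it follows the same idea as the full tyj::vector code later in the article, but it is a minimal sketch rather than that implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical int-only container, reduced to what the reuse chain needs.
class IntVec {
public:
	IntVec() : start(nullptr), finish(nullptr), end_of_storage(nullptr) {}

	// Range construction: the one constructor that does real work.
	template <class InputIterator>
	IntVec(InputIterator first, InputIterator last)
		: start(nullptr), finish(nullptr), end_of_storage(nullptr) {
		for (; first != last; ++first) {
			push_back(*first);
		}
	}

	// swap only exchanges three pointers; everything below reuses it.
	void swap(IntVec& other) {
		std::swap(start, other.start);
		std::swap(finish, other.finish);
		std::swap(end_of_storage, other.end_of_storage);
	}

	// Copy construction: build a temporary from the range, then take it over.
	IntVec(const IntVec& other)
		: start(nullptr), finish(nullptr), end_of_storage(nullptr) {
		IntVec tmp(other.start, other.finish);
		swap(tmp);
	}

	// Move construction: take the expiring object's storage, leave it empty.
	IntVec(IntVec&& other) noexcept
		: start(nullptr), finish(nullptr), end_of_storage(nullptr) {
		swap(other);
	}

	// Assignment: the by-value parameter is already a copy (or a moved-in
	// object), so swapping with it is all that is left to do.
	IntVec& operator=(IntVec other) {
		swap(other);
		return *this;
	}

	~IntVec() { delete[] start; }

	void push_back(int val) {
		if (finish == end_of_storage) {
			std::size_t oldSize = size();
			std::size_t newCap = oldSize == 0 ? 4 : oldSize * 2;
			int* tmp = new int[newCap];
			for (std::size_t i = 0; i < oldSize; ++i) tmp[i] = start[i];
			delete[] start;
			start = tmp;
			finish = tmp + oldSize;
			end_of_storage = tmp + newCap;
		}
		*finish++ = val;
	}

	std::size_t size() const { return static_cast<std::size_t>(finish - start); }

	int operator[](std::size_t i) const {
		assert(i < size());
		return start[i];
	}

private:
	int* start;
	int* finish;
	int* end_of_storage;
};
```

Only the range constructor and push_back touch elements. The copy constructor, the assignment operator and the move constructor are each a swap plus, at most, one reused constructor call, which is why a bug fixed in the range constructor is fixed everywhere.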

Design of insert and erase: return values, where iterators are invalidated, and details that are easy to get wrong

  • insert

  • erase

  • Both insert and erase return an iterator. Why? The returned iterator replaces the old one, which may have been invalidated. This is why the code below must take the return value of erase instead of continuing to use the old iterator:
  • // Erase every element: take the iterator returned by erase and do not
    // also advance it, otherwise elements are skipped and it can run past end()
    void test() {
    	std::vector<int> vec(5, 1);
    	for (auto it = vec.begin(); it != vec.end(); ) {
    		it = vec.erase(it);
    	}
    }
    
    // Fragment (v and PrintVector come from the full code below):
    // erase only the even numbers
    it = v.begin();
    PrintVector(v);
    while (it != v.end()) {
    	if (*it % 2 == 0)
    		it = v.erase(it);
    	else
    		++it;
    }
  • So far, the more useful parts have been analyzed.
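The same rule applies to insert. Here is a short sketch using std::vector, with hypothetical helper names insertBefore and eraseEvens, that always continues with the iterator returned by the modifying call.

```cpp
#include <cassert>
#include <vector>

// Inserts value before the first element equal to target (or at the end
// if target is absent) and returns the value read back through the
// iterator that insert returned.
int insertBefore(std::vector<int>& v, int target, int value) {
	std::vector<int>::iterator pos = v.begin();
	while (pos != v.end() && *pos != target) {
		++pos;
	}
	// insert may expand the storage, so the old pos must not be used
	// afterwards; the returned iterator points at the inserted element.
	pos = v.insert(pos, value);
	assert(*pos == value);
	return *pos;
}

// Removes every even number; erase returns the iterator that follows
// the removed element, so the loop only advances when nothing was erased.
void eraseEvens(std::vector<int>& v) {
	for (std::vector<int>::iterator it = v.begin(); it != v.end(); ) {
		if (*it % 2 == 0) {
			it = v.erase(it);
		} else {
			++it;
		}
	}
}
```

Starting from 1 2 3 4, insertBefore(v, 3, 30) leaves 1 2 30 3 4, and eraseEvens(v) then leaves 1 3.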

5. Overall code (implementation code)

#include <vector>
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <assert.h>
using namespace std;


namespace tyj {
	template <class T >
	class vector {
	public:
		// Raw pointers serve as iterators
		typedef T* iterator;
		typedef const T* const_iterator;
		// Default constructor
		vector()
			: start(nullptr)
			, finish(nullptr)
			, end_of_storage(nullptr) {
		}

		vector(int n, const T& val = T()) 
			: start(new T[n]), 
			finish(start + n), 
			end_of_storage(start + n) {
			// fill in the data
			for (int i = 0; i < n; ++i) {
				start[i] = val;
			}
		}

		// Range constructor
		template<class InputIterator>
		vector(InputIterator first, InputIterator last)
			: start(nullptr)
			, finish(nullptr)
			, end_of_storage(nullptr) { // push_back reads these, so they must be initialized
			// build from the range by reusing push_back
			while (first != last) {
				push_back(*first++);
			}
		}
		// Copy construction, assignment and move construction below all reuse swap
		void swap(vector<T>& vec) {
			std::swap(start, vec.start);
			std::swap(finish, vec.finish);
			std::swap(end_of_storage, vec.end_of_storage);
		}

		vector(const vector<T>& vec) 
			: start(nullptr)
			, finish(nullptr)
			, end_of_storage(nullptr) {

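			vector<T> tmp(vec.begin(), vec.end());
			swap(tmp); // reuse the range constructor
		}

		vector<T>& operator=(vector<T> vec) {
			swap(vec); // vec was built by the copy (or move) constructor; reuse it
			return *this;
		}

		// Move constructor: steal the heap storage of an expiring object.
		// It is about to die anyway, so its resources may as well build this one.
		vector(vector<T>&& vec) 
			: start(nullptr)
			, finish(nullptr)
			, end_of_storage(nullptr) {
			swap(vec);
		}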

		iterator begin() {
			return start;
		}

		const_iterator begin() const {
			return start;
		}

		iterator end() {
			return finish;
		}

		const_iterator end() const {
			return finish;
		}

		void resize(size_t n, const T& val = T()) {
			if (n <= size()) {
				finish = start + n; // shrinking only moves the end iterator back
				return;
			}
			// growing: the extra elements are set to val
			if (n > capacity()) reserve(n); // expand first
			// then assign
			while (finish != start + n) {
				*finish++ = val;
			}
		}

		void reserve(size_t n) {
			// allocate space in advance, in three steps
			if (n <= capacity()) return;
			// a larger block is needed
			T* pTemp = new T[n];
			size_t sz = size(); // remember the current number of elements
			// transfer the data element by element
			for (size_t i = 0; i < sz; ++i) {
				pTemp[i] = start[i];
			}

			delete [] start; // release the original space
			start = pTemp;
			finish = pTemp + sz;
			end_of_storage = start + n;
		}

		void push_back(const T& val) {
			if (finish == end_of_storage) {
				reserve(capacity() > 0 ? capacity() << 1 : 4);
			}
			// there is room now, so store the element
			*finish++ = val;
		}

		bool empty() const {
			return start == finish; // no elements, even if storage is allocated
		}

		size_t size() const {
			return finish - start;
		}

		size_t capacity() const {
			return end_of_storage - start;
		}

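		// Insert one element before pos. The main hazard is that expansion
		// invalidates the caller's iterators, including pos itself.
		iterator insert(iterator pos, const T& val) {
			assert(pos >= start && pos <= finish); // pos must be valid
			if (finish == end_of_storage) {
				size_t step = pos - start;
				// expand first
				reserve(capacity() > 0 ? capacity() << 1 : 4);
				// the old pos is invalid after expansion, so recompute it
				pos = start + step;
			}
			// take the old finish and move finish forward by one
			iterator it = finish++;
			// shift elements to make room
			while (it != pos) {
				*it = *(it - 1);
				--it;
			}
			*pos = val; // store the inserted value
			return pos; // iterator to the inserted element
		}

		// After erasing, the following element has been moved into pos,
		// so returning pos returns the element after the erased one.
		iterator erase(iterator pos) {
			// erase by overwriting
			assert(pos >= start && pos < finish);
			iterator it = pos + 1;
			while (it != finish) { // stop before finish; *finish is not an element
				*(it - 1) = *it; // shift left by one
				++it;
			}
			--finish; // move the end iterator back
			return pos;
		}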

		~vector() {
			if (start) {
				delete[] start;
			}
			start = finish = end_of_storage = nullptr;
		}
	private:
		T* start;
		T* finish;
		T* end_of_storage;
	};

	template<class InputIterator, class Function>
	void for_each(InputIterator first, InputIterator last, Function f) {
		while (first != last) {
			f(*first);
			++first;
		}
	}
	template<class T>
	struct Print {
		void operator()(T& val) const {
			cout << val << " ";
		}
	};
}
template<class T>
void PrintVector(tyj::vector<T>& vec) {
	tyj::for_each(vec.begin(), vec.end(), tyj::Print<T>());
	cout << endl;
}

test code

void TestVector3()
{
	int a[] = { 1, 2, 3, 4 };
	tyj::vector<int> v(a, a + sizeof(a) / sizeof(a[0]));

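	// use find to get an iterator to the element 3
	auto pos = find(v.begin(), v.end(), 3);
	// pos may be invalidated: insert can invalidate iterators as well
	// insert 30 before pos
	v.insert(pos, 30);
	PrintVector(v);
	// erase the element 3; pos is looked up again because the old one is stale
	pos = find(v.begin(), v.end(), 3);
	v.erase(pos);
	PrintVector(v);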
}

void TestVector1()
{
	// constructors used in the same order as described above:
	tyj::vector<int> first; // empty vector of ints

	tyj::vector<int> second(4, 100); // four ints with value 100

	tyj::vector<int> third(second.begin(), second.end()); // iterating through second
	tyj::vector<int> fourth(third); // a copy of third
	// the iterator constructor can also be used to construct from arrays:
	int myints[] = { 16, 2, 77, 29 };
	tyj::vector<int> fifth(myints, myints + sizeof(myints) / sizeof(int));
	std::cout << "The contents of fifth are:";
	for (tyj::vector<int>::iterator it = fifth.begin(); it != fifth.end(); ++it)
		std::cout << *it << " ";
	std::cout << endl;
	// test copying when T is string
	tyj::vector<string> strV;
	strV.push_back("1111");
	strV.push_back("2222");
	strV.push_back("3333");
	strV.push_back("4444");

}

void TestVector2()
{
	// insert 4 elements with push_back
	tyj::vector<int> v;
	v.push_back(1);
	v.push_back(2);
	v.push_back(3);
	v.push_back(4);
	PrintVector(v);
	// modify the elements through an iterator
	auto it = v.begin();
	while (it != v.end())
	{
		*it *= 2;
		++it;
	}
	PrintVector(v);

	// a type that provides iterators with begin and end also supports the C++11 range-based for
	for (auto e : v)
		cout << e << " ";
}

// iterator invalidation
void TestVector4()
{
	int a[] = { 1, 2, 3, 4 };
	tyj::vector<int> v(a, a + sizeof(a) / sizeof(a[0]));
	// erasing the element at pos invalidates pos
	auto pos = find(v.begin(), v.end(), 3);
	v.erase(pos);
	cout << *pos << endl; // invalid access: pos must not be used here
	// inserting at pos also invalidates pos.
	// insert may trigger an expansion; afterwards pos still points
	// into the old space, which has already been released.
	pos = find(v.begin(), v.end(), 3);
	v.insert(pos, 30);
	cout << *pos << endl; // invalid access: pos must not be used here
	// goal: erase all even numbers from v
	// the commented-out loop below is wrong: for an even number, erase invalidates it,
	// and applying ++it to the invalidated iterator can crash the program
	auto it = v.begin();

	//while (it != v.end())
	//{
	//	if (*it % 2 == 0)
	//		v.erase(it);
	//	++it; // skips an element after every erase
	//}

	// the loop must be written as follows: erase returns the position after the erased element
	it = v.begin();
	PrintVector(v);
	while (it != v.end())
	{
		if (*it % 2 == 0)
			it = v.erase(it);
		else
			++it;
	}
}


int main() {
	//tyj::vector<int> vec1;
	//for (int i = 0; i < 5; ++i) {
	//	vec1.push_back(i);
	//}
	//PrintVector<int>(vec1);
	//tyj::vector<int> vec2;
	//vec2.resize(5, 1);
	//PrintVector<int>(vec2);
	//int nums[5] = { 1, 2, 3, 4, 5 };
	//tyj::vector<int> vec3(nums, nums + sizeof(nums) / sizeof(int));
	//PrintVector<int>(vec3);
	//tyj::vector<int> vec4(vec3);
	//PrintVector<int>(vec4);
	//TestVector3();
	//TestVector2();
	//TestVector4();
	TestVector1();
	return 0;
}

6. Summary: how to write an STL container, iterator design ideas (what an iterator is), and iterator invalidation

  • A methodology for implementing an STL container

  • If you want to write an STL container, first work out where to start: find reference documentation and analyze the components and interfaces.
  • Recommended documentation site:    www.cplusplus.com   
  • To write a container, separate the iterator design from the overall design. Implement the iterator and the simplest insertion interface first (a complete framework), then test the insertion interface. This gives an early result and tells you whether the basic framework is set up correctly.
  • Read the documentation to analyze the key interface functions. You can also buy a book on STL source code analysis and design the interfaces following Mr. Hou Jie's description, implementing the useful interfaces one by one, testing continuously and picking up details along the way.
  • Summarize the pitfalls in the implementation details, then practice rewriting the whole framework quickly until it is well memorized.
  • The vector is in fact not the best place to discuss iterator design. T* is a native pointer into contiguous linear storage, and a native pointer already supports ++, -- and *. Because the elements are stored contiguously, little effort went into the iterator design here.
  • An iterator feels a bit like a smart pointer class, though not quite. It manages access to the elements of a container and supports ++, -- and * even when the underlying storage structure is complex.
  • Iterators exist for unification: they are the glue between the various containers and the algorithms. They give containers of every layout one uniform traversal interface, so generic algorithms can operate on all of them. They are a particularly important component of the STL.

Generic algorithms such as the for_each shown earlier are impossible to write without iterators.
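To illustrate the glue role, the sketch below runs one generic function over three different kinds of storage: a plain array, a std::vector, and a hand-made linked list. The names sumRange, Node and NodeIterator are hypothetical and exist only for this example; the list iterator is deliberately minimal and provides just the operations that sumRange uses.

```cpp
#include <cassert>
#include <vector>

// Hypothetical generic algorithm: needs only !=, ++ and * from the iterator.
template <class InputIterator>
int sumRange(InputIterator first, InputIterator last) {
	int total = 0;
	for (; first != last; ++first) {
		total += *first;
	}
	return total;
}

// Hypothetical singly linked list node; the storage is not contiguous.
struct Node {
	int value;
	Node* next;
};

// Iterator class that hides the pointer chasing behind ++ and *.
class NodeIterator {
public:
	explicit NodeIterator(Node* n) : node(n) {}
	int& operator*() const {
		assert(node != nullptr);
		return node->value;
	}
	NodeIterator& operator++() {
		node = node->next; // step to the next node, not the next address
		return *this;
	}
	bool operator!=(const NodeIterator& other) const { return node != other.node; }
private:
	Node* node;
};
```

sumRange(arr, arr + 3), sumRange(v.begin(), v.end()) and sumRange(NodeIterator(&head), NodeIterator(nullptr)) all compile from the same template. The algorithm never learns how the elements are stored, which is exactly the unification the iterator provides.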

  • Iterator invalidation is, put bluntly, a memory problem. After an expansion the original iterators can no longer be used, because they point into the original space, and that space was released during the expansion. Accessing it is an error. This is why the return value of every operation that may invalidate iterators must be taken: the new iterator replaces the old one. Write it = e.erase(it), not e.erase(it++): after the erase you cannot know what state the old iterator is in, while the returned iterator is valid and refers to the element that followed the erased one.
  • In short, both insertion and erasure can invalidate iterators. Insertion invalidates them because of expansion; erasure invalidates them because the element the iterator referred to has been removed and the following elements have moved.
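The memory side of this can be observed directly with std::vector. In the sketch below, pushBackMovesStorage and readAfterGrowth are hypothetical helper names; the first records the address of the storage before and after an expansion, and the second shows one common way to stay safe, which is to remember an index and rebuild the iterator afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if appending one element to a full vector moved its storage.
// Addresses are kept as integers so no dangling pointer is ever used.
bool pushBackMovesStorage() {
	std::vector<int> v(4, 1);
	while (v.size() < v.capacity()) {
		v.push_back(1); // fill up until size() == capacity()
	}
	assert(v.size() == v.capacity());
	std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());
	v.push_back(2); // no room left, so the vector must expand
	std::uintptr_t after = reinterpret_cast<std::uintptr_t>(v.data());
	return before != after;
}

// Safe pattern: remember the index, not the iterator, across the growth.
int readAfterGrowth() {
	std::vector<int> v(4, 7);
	std::vector<int>::size_type index = 2;
	v[index] = 42;
	for (int i = 0; i < 100; ++i) {
		v.push_back(i); // may expand several times
	}
	std::vector<int>::iterator it = v.begin() + index; // rebuilt after growth
	return *it;
}
```

Any iterator obtained before the expansion would still hold the old address, which is why it must be replaced rather than reused.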

Origin blog.csdn.net/weixin_53695360/article/details/123248476