Using vector and its simulated implementation

Table of contents

1. Introduction and use of vector

1. Introduction to vector

2. Use of vector

1. Definition of vector

2. Use of vector iterators

3. Vector space growth problem

4. Adding, deleting, finding, and modifying vector elements

3. Vector iterator invalidation (key points)

1. Operations that change the underlying storage

2. Deleting the element at a specified position: erase

3. How g++ handles invalidated iterators under Linux

2. Vector in-depth analysis and simulated implementation

1. Simulated implementation of the core interface of std::vector

2. The copy problem with memcpy

3. Understanding dynamic two-dimensional arrays



1. Introduction and use of vector

1. Introduction to vector

1. vector is a sequence container that represents an array whose size can change.
2. Like arrays, vectors store their elements in contiguous memory, so elements can be accessed by subscript as efficiently as in an array. Unlike an array, a vector's size can change dynamically, and the container manages its storage automatically.
3. Internally, vector uses a dynamically allocated array to store its elements. When new elements are inserted and the array is full, the storage has to grow: a new array is allocated and all elements are moved into it. This is relatively expensive, so vector does not reallocate every time an element is added.
4. Allocation strategy: vector allocates some extra space to accommodate possible growth, so its capacity may be larger than the storage actually needed. Different libraries use different strategies to trade off memory usage against reallocation. In every case, reallocations should happen only at logarithmically growing intervals of size, so that inserting an element at the end takes amortized constant time.
5. vector therefore uses more memory than a plain array in exchange for managing its storage and growing dynamically in an efficient way.
6. Compared with the other dynamic sequence containers (deque, list and forward_list), vector is more efficient at accessing elements and relatively efficient at adding and removing elements at the end. Insertions and deletions anywhere else are less efficient, and vector's iterators and references are less stable than those of list and forward_list, because a reallocation invalidates them.
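
Points 2 and 3 above can be checked with a short sketch. The helper names is_contiguous and size_after_growth are made up for this illustration; the code only uses std::vector members that exist in the standard library.

```cpp
#include <cstddef>
#include <vector>

// Returns true if element i sits exactly i slots after the first element,
// i.e. the vector's storage is one contiguous array.
bool is_contiguous(const std::vector<int>& v)
{
    if (v.empty())
        return true;
    const int* base = v.data();
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        if (&v[i] != base + i)
            return false;
    }
    return true;
}

// Shows that size is managed by the container: it follows push_back/pop_back.
std::size_t size_after_growth()
{
    std::vector<int> v;          // size 0
    for (int i = 0; i < 10; ++i)
        v.push_back(i);          // size grows automatically, up to 10
    v.pop_back();                // size shrinks by one
    return v.size();
}
```

Calling is_contiguous on any vector<int> should return true, and size_after_growth returns 9 without the caller ever managing memory.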

2. Use of vector

When learning vector, get into the habit of consulting the vector reference documentation. vector is very important in practice, and it is enough to be familiar with its commonly used interfaces.

1. Definition of vector
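
As a minimal sketch of the most common ways to define a vector (default, fill, range, and copy construction), assuming only the standard constructors; the function name constructor_sizes is hypothetical and exists only to make the result easy to inspect:

```cpp
#include <cstddef>
#include <vector>

// Collects the sizes of vectors built with the most common constructors.
std::vector<std::size_t> constructor_sizes()
{
    std::vector<int> empty;                                     // default: no elements
    std::vector<int> filled(4, 7);                              // fill: four copies of 7
    std::vector<int> ranged(filled.begin(), filled.end() - 1);  // range [first, last): three elements
    std::vector<int> copied(ranged);                            // copy: independent duplicate

    return { empty.size(), filled.size(), ranged.size(), copied.size() };
}
```

The returned sizes are 0, 4, 3 and 3, in that order.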

2. Use of vector iterators

Note: all iterator ranges are half-open, [first, last): closed on the left and open on the right. The range does not have to come from a vector; iterators of other types can be passed as well, as long as the element types match.

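The following code demonstrates iterator use:

#include <iostream>
#include <vector>
using namespace std;

void Print(const vector<int>& v)
{
	// A const object is traversed and printed through a const_iterator
	vector<int>::const_iterator it = v.begin();
	while (it != v.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
}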
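
The note about iterator ranges can be illustrated with a small sketch that builds vectors from a std::list and from a plain array. Both function names are hypothetical; the only library feature relied on is vector's range constructor.

```cpp
#include <list>
#include <vector>

// Builds a vector from a std::list's iterators, skipping the last element
// to show that the range is half-open: [first, last).
std::vector<int> from_list_without_last(const std::list<int>& src)
{
    if (src.empty())
        return {};
    auto last = src.end();
    --last;                       // [begin, last) excludes the final element
    return std::vector<int>(src.begin(), last);
}

// Builds a vector from plain pointers into a C array.
std::vector<int> from_array()
{
    int a[] = { 10, 20, 30, 40 };
    return std::vector<int>(a, a + 4);   // a + 4 is one past the last element
}
```

Passing a list holding 1, 2, 3 to from_list_without_last yields a vector holding 1, 2, because the element that last refers to is not part of the range.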

3. Vector space growth problem

 

1. Running a capacity test under vs and g++ shows that the capacity grows by a factor of about 1.5 under vs and 2 under g++. The growth factor is an implementation choice: vs uses the P.J. Plauger version of the STL, and g++ uses the SGI version.
2. resize allocates space and also initializes the new elements, so it changes size.
3. reserve only allocates space and leaves size unchanged. If you know in advance roughly how much space is needed, reserve reduces the cost of repeated expansion.
// If the approximate number of elements is known in advance, reserve enough space up front
// to avoid the inefficiency of growing while inserting
void TestVector()
{
	vector<int> v;
	size_t sz = v.capacity();
	v.reserve(100); // set the capacity in advance to avoid reallocating while inserting
	cout << "making bar grow:\n";
	for (int i = 0; i < 100; ++i)
	{
		v.push_back(i);
		if (sz != v.capacity())
		{
			sz = v.capacity();
			cout << "capacity changed: " << sz << '\n';
		}
	}
}

Testing shows that the space is allocated in advance: the capacity becomes 100 once and does not change again during the insertions.
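
For comparison, here is a rough sketch of the same loop without the reserve call, recording each capacity change instead of printing it, plus a check of the resize/reserve difference from the list above. The function names are invented for this example, and the exact capacities depend on the library, so none are asserted here.

```cpp
#include <cstddef>
#include <vector>

// Records every capacity value observed while pushing n elements
// into a vector that did not reserve space first.
std::vector<std::size_t> capacity_trace(int n)
{
    std::vector<int> v;
    std::vector<std::size_t> trace;
    std::size_t last = v.capacity();
    for (int i = 0; i < n; ++i)
    {
        v.push_back(i);
        if (v.capacity() != last)
        {
            last = v.capacity();
            trace.push_back(last);   // a reallocation happened
        }
    }
    return trace;
}

// reserve changes capacity only; resize changes size as well.
bool reserve_keeps_size_resize_changes_it()
{
    std::vector<int> a;
    a.reserve(100);
    std::vector<int> b;
    b.resize(100);
    return a.size() == 0 && a.capacity() >= 100 && b.size() == 100;
}
```

With a growth factor of 2 the trace for 100 elements has only a handful of entries, which is why push_back stays amortized constant time even without reserve.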

4. Adding, deleting, finding, and modifying vector elements

Signatures of the most important interfaces:

void push_back (const value_type& val);

void pop_back();

template <class InputIterator, class T>   
InputIterator find (InputIterator first, InputIterator last, const T& val);

iterator insert (iterator position, const value_type& val);
void insert (iterator position, size_type n, const value_type& val);

iterator erase (iterator position);
iterator erase (iterator first, iterator last);
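
The interfaces above might be combined as in the following sketch. The function name edit_demo is made up for illustration; note that find comes from <algorithm> and is not a member of vector.

```cpp
#include <algorithm>
#include <vector>

// Walks through add, delete, find and modify on one small vector.
std::vector<int> edit_demo()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);                // v: 1 2 3 4
    v.pop_back();                  // v: 1 2 3

    // find is a generic algorithm, so it takes an iterator range.
    auto pos = std::find(v.begin(), v.end(), 2);
    if (pos != v.end())
        pos = v.insert(pos, 20);   // v: 1 20 2 3, pos now refers to 20

    pos = std::find(v.begin(), v.end(), 3);
    if (pos != v.end())
        v.erase(pos);              // v: 1 20 2

    v[0] = 10;                     // modify through operator[]: 10 20 2
    return v;
}
```

The iterator is re-obtained from find before each modifying call, rather than reused, which sidesteps the invalidation issues discussed in the next section.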

3. Vector iterator invalidation (key points)

Iterators are used very widely. Their main purpose is to let algorithms work without caring about the underlying data structure. Underneath, an iterator is either a pointer or a wrapper around a pointer; for example, vector's iterator can be implemented as the raw pointer T*. An iterator becomes invalid when the memory that its underlying pointer refers to has been destroyed. Continuing to use an invalidated iterator means using memory that has already been released, and the program may crash.

1. Operations that change the underlying storage

For example, resize, reserve, insert, assign, and push_back may all invalidate iterators.

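#include <iostream>
#include <vector>
using namespace std;

int main()
{
	vector<int> v{1,2,3,4,5,6};
	auto it = v.begin();

	// Increase the number of elements to 100 and fill the new positions with 8;
	// the underlying storage is expanded during the operation
	// v.resize(100, 8);

	// reserve changes the capacity without changing the number of elements;
	// the underlying storage may be reallocated during the operation
	// v.reserve(100);

	// Inserting elements may trigger an expansion that releases the old storage
	// v.insert(v.begin(), 0);
	// v.push_back(8);

	// Assigning new contents to the vector may change the underlying storage
	v.assign(100, 8);

	// it was obtained before assign, so using it here is the error being demonstrated
	while (it != v.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	return 0;
}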
Reason for the error: any of the operations above may cause the vector to expand, which releases its old storage. The iterator it still points into that released storage, so the print loop operates on memory that has already been freed, and the program crashes at run time.
Solution: after any of the operations above, reassign it (for example, it = v.begin()) before using it to access the vector's elements again.
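
A minimal sketch of that fix, with a hypothetical function name and a sum in place of printing so the result is easy to check:

```cpp
#include <vector>

// Sums the elements after an assign() that may reallocate the storage.
int sum_after_assign()
{
    std::vector<int> v{1, 2, 3, 4, 5, 6};
    auto it = v.begin();

    v.assign(100, 8);     // may reallocate: the old value of it must not be used
    it = v.begin();       // reassign before use

    int sum = 0;
    while (it != v.end())
    {
        sum += *it;
        ++it;
    }
    return sum;           // 100 elements, each equal to 8
}
```

The only change from the failing program is the line that reassigns it after assign.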

2. Deleting the element at a specified position: erase

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main()
{
	int a[] = { 1, 2, 3, 4 };
	vector<int> v(a, a + sizeof(a) / sizeof(int));
	// Use find to get an iterator to the element 3
	vector<int>::iterator pos = find(v.begin(), v.end(), 3);
	// Erasing the element at pos invalidates the pos iterator
	v.erase(pos);
	cout << *pos << endl; // illegal access: pos is no longer valid
	return 0;
}
After erase removes the element at pos, the elements after pos are moved forward, and the underlying storage does not change. In theory the iterator should therefore remain usable. However, if pos referred to the last element, then after the deletion pos is at the end position, where there is no element, and pos is invalid. For this reason, when an element at any position in a vector is erased, vs treats the iterator at that position as invalid. The C++ standard takes the same view: erase invalidates iterators and references at or after the point of the erase.

Both of the following programs are meant to delete all even numbers from the vector. Which one is correct, and why?
// Version 1
#include <iostream>
#include <vector>
using namespace std;

int main()
{
	vector<int> v{ 1, 2, 3, 4 };
	auto it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
			v.erase(it);
		++it;
	}
	return 0;
}


// Version 2
int main()
{
	vector<int> v{ 1, 2, 3, 4 };
	auto it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
			it = v.erase(it);    // erase returns an iterator to the element after the erased one
		else
			++it;
	}
	return 0;
}

The first version is wrong: it keeps using an invalidated iterator, and its deletion logic is also wrong. With the data above, after the program erases 2, the position that it refers to now holds 3; ++it then moves the iterator to 4, so 3 is never checked. The last element, 4, is even, so it is erased as well. After that, ++it moves the iterator past _finish (the end position), so it never equals v.end() and the loop runs out of bounds. The second version is correct: erase returns an iterator to the element after the erased one, and the iterator is advanced only when nothing was erased.
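
As an alternative that the original does not cover, the same task is often written with the erase-remove idiom, which avoids handling iterators inside a loop at all. This is a sketch under that assumption; the function name without_evens is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Removes all even numbers using the erase-remove idiom.
std::vector<int> without_evens(std::vector<int> v)
{
    // remove_if shifts the kept elements to the front and returns
    // an iterator to the new logical end.
    auto new_end = std::remove_if(v.begin(), v.end(),
                                  [](int x) { return x % 2 == 0; });
    v.erase(new_end, v.end());   // one range erase drops the leftover tail
    return v;
}
```

For the vector 1 2 3 4 this returns 1 3. Because the elements are shifted only once, it also avoids the repeated moves that erasing one element at a time causes.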

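3. How g++ handles invalidated iterators under Linux

// 1. After the expansion the iterator is invalid; the program still runs, but the output is wrong
#include <iostream>
#include <vector>
using namespace std;

int main()
{
	vector<int> v{1,2,3,4,5};
	auto it = v.begin();
	cout << "Capacity before expansion: " << v.capacity() << endl;
	// Set the capacity to 100 with reserve in order to invalidate the iterator
	v.reserve(100);
	cout << "Capacity after expansion: " << v.capacity() << endl;
	// After the reserve above, it is invalid. Under vs the program crashes right away, but not under Linux.
	// The program may run, but its output is wrong.
	while (it != v.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	return 0;
}
Output:
Capacity before expansion: 5
Capacity after expansion: 100
0 2 3 4 5 409 1 2 3 4 5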


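// 2. After erase removes an element at any position, the iterator still works under Linux,
// because the storage is unchanged and the following elements were shifted forward, so it still points to valid storage
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

int main()
{
	vector<int> v{1,2,3,4,5};
	vector<int>::iterator it = find(v.begin(), v.end(), 3);
	v.erase(it);
	cout << *it << endl;
	while (it != v.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	return 0;
}

The program runs normally and prints:
4
4 5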



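// 3. If the erased element was the last one, it is at end after the deletion and ++it moves it past end.
// The iterator is then invalid, the loop condition never stops it, and the program crashes.
#include <iostream>
#include <vector>
using namespace std;

int main()
{
	vector<int> v{1,2,3,4,5};
	// With this data the last element is even, which triggers the crash:
	// vector<int> v{1,2,3,4,5,6};
	auto it = v.begin();
	while (it != v.end())
	{
		if (*it % 2 == 0)
			v.erase(it);
		++it;
	}
	for (auto e : v)
		cout << e << " ";
	cout << endl;
	return 0;
}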

As the three examples show, SGI STL (used by g++ under Linux) is not very strict about detecting iterator invalidation, and its handling is not as extreme as that of vs. After an iterator is invalidated, the code does not necessarily crash, but its result cannot be trusted. If the iterator is no longer within the range [begin, end), the program will most likely crash.

Solution to iterator invalidation: reassign the iterator before using it again.

2. Vector in-depth analysis and simulated implementation

1. Simulated implementation of the core interface of std::vector

#pragma once

#include <iostream>
#include <utility>   // std::swap
#include <assert.h>
using namespace std;


namespace Kevin
{
	template<class T>
	class vector
	{
	public:
		// vector's iterator is a raw pointer
		typedef T* iterator;
		typedef const T* const_iterator;

		////////////////////////////////////////
		// Construction and destruction
		vector()
			: _start(nullptr)
			, _finish(nullptr)
			, _endOfStorage(nullptr)
		{}

		vector(size_t n, const T& value = T())
			: _start(nullptr)
			, _finish(nullptr)
			, _endOfStorage(nullptr)
		{
			reserve(n);
			while (n--)
			{
				push_back(value);
			}
		}

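		/*
		* In theory, once vector(size_t n, const T& value = T()) is provided,
		* vector(int n, const T& value = T()) should not be needed. However, for
		* vector<int> v(10, 5);
		* T is instantiated as int, and the compiler treats both 10 and 5 as int,
		* so it does not choose vector(size_t n, const T& value = T()).
		* It chooses vector(InputIterator first, InputIterator last) instead,
		* because both arguments have the same type and InputIterator can be deduced as int.
		* But 10 and 5 are not an iterator range, so compilation fails.
		* This overload is therefore added.
		*/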
		vector(int n, const T& value = T())
			: _start(new T[n])
			, _finish(_start+n)
			, _endOfStorage(_finish)
		{
			for (int i = 0; i < n; ++i)
			{
				_start[i] = value;
			}
		}

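		// If iterator were used as the parameter type, the range [first, last) could only come from a vector.
		// Declaring a separate template parameter lets [first, last) come from any container.
		template<class InputIterator>
		vector(InputIterator first, InputIterator last)
			: _start(nullptr)
			, _finish(nullptr)
			, _endOfStorage(nullptr)   // the members must be initialized before push_back uses them
		{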
			while (first != last)
			{
				push_back(*first);
				++first;
			}
		}

		vector(const vector<T>& v)
			: _start(nullptr)
			, _finish(nullptr)
			, _endOfStorage(nullptr)
		{
			reserve(v.capacity());
			iterator it = begin();
			const_iterator vit = v.cbegin();
			while (vit != v.cend())
			{
				*it++ = *vit++;
			}
			_finish = it;
		}

		vector<T>& operator=(vector<T> v)
		{
			swap(v);
			return *this;
		}

		~vector()
		{
			if (_start)
			{
				delete[] _start;
				_start = _finish = _endOfStorage = nullptr;
			}
		}

		////////////////////////////////////////
		// Iterators
		iterator begin()
		{
			return _start;
		}

		iterator end()
		{
			return _finish;
		}

		const_iterator cbegin() const
		{
			return _start;
		}

		const_iterator cend() const
		{
			return _finish;
		}

		////////////////////////////////////////
		// Capacity
		size_t size() const 
		{ 
			return _finish - _start; 
		}

		size_t capacity() const 
		{ 
			return _endOfStorage - _start; 
		}

		bool empty() const 
		{ 
			return _start == _finish; 
		}

		void reserve(size_t n)
		{
			if (n > capacity())
			{
				size_t oldSize = size();
				// 1. Allocate the new storage
				T* tmp = new T[n];

				// 2. Copy the elements
				// Would it be a problem to use memcpy directly here? Think about it.
				//if (_start)
				//	memcpy(tmp, _start, sizeof(T)*oldSize);

				if (_start)
				{
					for (size_t i = 0; i < oldSize; ++i)
						tmp[i] = _start[i];

					// 3. Release the old storage
					delete[] _start;
				}

				_start = tmp;
				_finish = _start + oldSize;
				_endOfStorage = _start + n;
			}
		}

		void resize(size_t n, const T& value = T())
		{
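			// 1. If n is not greater than the current size, shrink the number of elements to n
			if (n <= size())
			{
				_finish = _start + n;
				return;
			}

			// 2. Grow the storage if there is not enough space
			if (n > capacity())
				reserve(n);

			// 3. Extend size to n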
			iterator it = _finish;
			_finish = _start + n;
			while (it != _finish)
			{
				*it = value;
				++it;
			}
		}

		////////////////////////////////////////
		// Element access
		T& operator[](size_t pos) 
		{ 
			assert(pos < size());
			return _start[pos]; 
		}

		const T& operator[](size_t pos)const 
		{ 
			assert(pos < size());
			return _start[pos]; 
		}

		T& front()
		{
			return *_start;
		}

		const T& front()const
		{
			return *_start;
		}

		T& back()
		{
			return *(_finish - 1);
		}

		const T& back()const
		{
			return *(_finish - 1);
		}
		////////////////////////////////////////
		// Modifiers
		void push_back(const T& x) 
		{ 
			insert(end(), x); 
		}

		void pop_back() 
		{ 
			erase(end() - 1); 
		}

		void swap(vector<T>& v)
		{
			std::swap(_start, v._start);
			std::swap(_finish, v._finish);
			std::swap(_endOfStorage, v._endOfStorage);
		}

		iterator insert(iterator pos, const T& x)
		{
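			assert(pos >= _start && pos <= _finish);

			// Grow first if there is no free space
			if (_finish == _endOfStorage)
			{
				// reserve releases the old storage, so remember pos as an offset
				size_t offset = pos - _start;
				size_t newCapacity = (0 == capacity()) ? 1 : capacity() * 2;
				reserve(newCapacity);

				// After growing, recompute pos inside the new storage
				pos = _start + offset;
			}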

			iterator end = _finish - 1;
			while (end >= pos)
			{
				*(end + 1) = *end;
				--end;
			}

			*pos = x;
			++_finish;
			return pos;
		}

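		// Returns an iterator to the element after the erased one,
		// which makes it possible to erase while traversing without using an invalidated iterator
		iterator erase(iterator pos)
		{
			assert(pos >= _start && pos < _finish);

			// Shift the following elements forward to overwrite pos
			iterator begin = pos + 1;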
			while (begin != _finish) {
				*(begin - 1) = *begin;
				++begin;
			}

			--_finish;
			return pos;
		}
	private:
		iterator _start;		// points to the start of the data block
		iterator _finish;		// points one past the last valid element
		iterator _endOfStorage;  // points to the end of the allocated storage
	};
}

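2. The copy problem with memcpy

// Uses the simulated vector defined above; std::string stands in for
// any element type that manages its own resource
#include <string>

int main()
{
	Kevin::vector<std::string> v;
	v.push_back("1111");
	v.push_back("2222");
	v.push_back("3333");
	return 0;
}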

Suppose reserve in the simulated vector copied the elements with memcpy. Would the code above have any problems?

Yes. Inserting "2222" requires allocating new space, because the first push_back leaves the capacity at 1, so reserve has to copy the existing element into the new array.

memcpy copies memory byte for byte: it reproduces the contents of one block of memory in another block, unchanged.

If the elements are of a built-in type, memcpy is efficient and correct. If the elements are of a custom type that manages a resource, memcpy causes an error, because it is a shallow copy: the new element ends up holding the same pointer as the old one, and when reserve then deletes the old array, the old element's destructor releases the resource that the new element still points to.

Conclusion: if objects manage resources, do not use memcpy to copy them. memcpy is a shallow copy, and using it this way can cause memory leaks or even crash the program.
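
The difference between the two kinds of copy can be observed safely with a small sketch. Buffer is a hypothetical resource-owning type invented for this illustration, not part of the simulated vector; the bytes are copied into a plain byte array so that no second owner is ever destroyed.

```cpp
#include <cstring>

// Hypothetical resource-owning type, used only for this illustration.
struct Buffer
{
    char* data;

    explicit Buffer(const char* s)
        : data(new char[std::strlen(s) + 1])
    {
        std::strcpy(data, s);
    }

    // Deep copy: the new object gets its own allocation.
    Buffer(const Buffer& other)
        : data(new char[std::strlen(other.data) + 1])
    {
        std::strcpy(data, other.data);
    }

    Buffer& operator=(const Buffer&) = delete;

    ~Buffer() { delete[] data; }
};

// Copies the raw bytes of src into a byte array (what memcpy inside reserve
// would do) and reports whether the copied pointer aliases src.data.
bool memcpy_shares_pointer(const Buffer& src)
{
    unsigned char raw[sizeof(Buffer)];
    std::memcpy(raw, &src, sizeof(Buffer));      // shallow, byte-for-byte

    char* copied = nullptr;
    std::memcpy(&copied, raw, sizeof(copied));   // read back the pointer member
    return copied == src.data;                   // same address: two would-be owners
}

// The copy constructor allocates a separate buffer with equal contents.
bool copy_ctor_owns_separate_buffer(const Buffer& src)
{
    Buffer copy(src);
    return copy.data != src.data && std::strcmp(copy.data, src.data) == 0;
}
```

Both functions return true for any Buffer. If the byte copy were treated as a real Buffer and destroyed, data would be deleted twice, which is the crash described above; the element-by-element assignment in reserve avoids this by invoking the type's own copy logic.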

3. Understanding dynamic two-dimensional arrays

vector<vector<int>> vv(n);
This constructs a dynamic two-dimensional array vv with n elements. Each element is itself a vector<int>, and each row is initially empty. If n is 5, vv holds five empty rows.

After the rows are filled, each row manages its own separately allocated array of int, so different rows can have different lengths.

A dynamic two-dimensional array built with the standard library's vector has exactly this layout.
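
One way to fill such an array is sketched below. Pascal's triangle is an arbitrary choice made for this example because its rows have different lengths, and the function name pascal_rows is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Builds the first n rows of Pascal's triangle as a jagged 2D array.
std::vector<std::vector<int>> pascal_rows(std::size_t n)
{
    std::vector<std::vector<int>> vv(n);     // n empty rows
    for (std::size_t i = 0; i < n; ++i)
    {
        vv[i].resize(i + 1, 1);              // row i gets i + 1 elements, all 1
        for (std::size_t j = 1; j < i; ++j)  // interior cells only
            vv[i][j] = vv[i - 1][j - 1] + vv[i - 1][j];
    }
    return vv;
}
```

For n equal to 5, the last row is 1 4 6 4 1. Each vv[i] is resized independently, which shows that the rows are separate vectors rather than slices of one block.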


Origin blog.csdn.net/weixin_65592314/article/details/129446007