【C++】STL---vector

1. Introduction to vector

  1. A vector is a sequence container that represents an array whose size can change.
  2. Like arrays, vectors store their elements in contiguous memory. This means the elements can be accessed by subscript as efficiently as in an array. Unlike an array, however, the size of a vector can change dynamically, and the container manages its storage automatically.
  3. Internally, a vector stores its elements in a dynamically allocated array. When new elements are inserted and the array is full, it must grow: the vector allocates a new, larger array and moves all elements into it. This is a relatively expensive operation, so the vector does not reallocate every time an element is added.
  4. Allocation strategy: a vector allocates some extra space to accommodate possible growth, so its capacity may be larger than the storage actually needed for its elements. Different libraries use different strategies to balance memory usage against reallocation cost.
  5. As a result, a vector occupies more storage than a plain array in exchange for the ability to manage its storage and grow dynamically in an efficient way.
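
As a rough illustration of points 3 and 4, the sketch below counts how often std::vector reallocates while elements are appended one at a time. The helper name count_reallocations is made up for this example, and the exact count depends on the library's growth strategy; the only assumption is that it stays far below the number of insertions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times std::vector reallocates while n elements are appended.
// A reallocation is detected by a change in capacity().
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();

    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<int>(i));
        assert(v.capacity() >= v.size());   // capacity never lags behind size

        if (v.capacity() != last_capacity)
        {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000) on a typical implementation should return a small number compared with 1000, which is the extra-space trade-off described above.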

2. Simulation implementation of vector

When learning vector, be sure to read the documentation (Introduction to vector documentation); vector is very important in practice. In everyday use it is enough to be familiar with the common interfaces, so we go straight to a simulated implementation. We implement the common interfaces and explain their usage and pitfalls along the way.

First, we put the vector in our own namespace, namespace Young. Second, in vs2019 the vector is implemented with three iterators, which point to the beginning of the data block, the end of the valid data, and the end of the storage capacity. This is similar to the implementation of string: the members are expressed in a different form but are essentially the same. The declaration is as follows:

		namespace Young
		{
			// Class template for generic programming
			template <class T>
			class vector
			{
			public:
				typedef T* iterator;
				typedef const T* const_iterator;
		
			private:
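				// Default member initializers
				iterator _start = nullptr;  // points to the start of the data block
				iterator _finish = nullptr;  // points to the end of the valid data
				iterator _endofstorage = nullptr; // points to the end of the storage capacity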
			};
		}

1. Capacity-related interfaces

(1)size

To get the number of valid elements, we subtract two iterators: the one pointing to the beginning of the data block from the one pointing to the end of the valid data. The implementation is as follows:

			// Number of valid elements
			size_t size() const
			{
				return _finish - _start;
			}

(2)capacity

The interface for obtaining the capacity is similar to the one above:

			// Capacity of the allocated storage
			size_t capacity() const
			{
				return _endofstorage - _start;
			}
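
To make the pointer arithmetic behind size and capacity concrete, here is a minimal sketch that applies the same three-pointer layout to a plain int buffer. ThreePointers and make_view are hypothetical names used only for this illustration; they are not part of Young::vector.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three members of a vector of int.
struct ThreePointers
{
    int* start;            // beginning of the data block
    int* finish;           // end of the valid data
    int* end_of_storage;   // end of the storage capacity

    std::size_t size() const { return static_cast<std::size_t>(finish - start); }
    std::size_t capacity() const { return static_cast<std::size_t>(end_of_storage - start); }
};

// Describes a caller-owned buffer of `total` slots, of which `used` are valid.
ThreePointers make_view(int* buffer, std::size_t used, std::size_t total)
{
    assert(used <= total);
    return ThreePointers{ buffer, buffer + used, buffer + total };
}
```

For a buffer of 8 slots with 3 valid elements, size() yields 3 and capacity() yields 8, because subtracting two pointers into the same array gives the number of elements between them.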

(3)reserve

We implemented reserve for string earlier, and the reserve of vector is similar. When n (the requested capacity) is greater than capacity(), the storage is expanded; otherwise nothing happens, and the capacity is never reduced. reserve changes only the capacity, not the size. The implementation is as follows:

			// Reserve storage for at least n elements
			void reserve(size_t n)
			{
				if (n > capacity())
				{
					T* tmp = new T[n];
					size_t sz = size();
	
					if (_start)
					{
						// Do not copy the data with memcpy: it would make shallow copies
						// memcpy(tmp, _start, sizeof(T) * sz);
	
						for (size_t i = 0; i < sz; i++)
						{
							tmp[i] = _start[i];
						}
	
						delete[] _start;
					}
	
					_start = tmp;
					_finish = _start + sz;
					_endofstorage = _start + n;	
				}
			}

Note that we cannot use memcpy to copy the data, because it causes a shallow copy problem. Consider a vector whose elements are of the custom type string.

When we append to an empty v, there is no storage yet, so space must be allocated; by default we allocate room for 4 elements. Inserting the fifth element requires another expansion, and this is where the problem appears.

The data block maintained by _start holds string objects, and the _str pointer inside each one points to that string's character data. During expansion, memcpy copies the block byte by byte into tmp, so the _str of each string in tmp still points to the original character data. When we execute delete[] _start;, the strings in the old block are destroyed and their character data is released, yet the strings in tmp still point to the released memory. This is a dangling (wild) pointer problem, so memcpy cannot be used to copy the data.

Therefore, we copy by assignment, as shown below. For a custom type such as the string above, assignment calls the type's own assignment operator overload, which performs a deep copy, so the problem above does not occur:

				for (size_t i = 0; i < sz; i++)
				{
					tmp[i] = _start[i];
				}

Also note that we must record the original size before copying the data. After the copy, once tmp has been assigned to _start, we update _finish and _endofstorage by adding the original size and the new capacity, respectively, to _start.
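
The element-wise copy can be tried in isolation. The following is a minimal sketch using std::string as the element type; grow_by_assignment is a hypothetical helper written for this example, not a function of the vector class above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies `count` strings into a freshly allocated block of
// `new_capacity` elements using assignment (a deep copy), then frees the old block.
std::string* grow_by_assignment(std::string* old_block, std::size_t count, std::size_t new_capacity)
{
    assert(new_capacity >= count);

    std::string* tmp = new std::string[new_capacity];
    for (std::size_t i = 0; i < count; ++i)
    {
        tmp[i] = old_block[i];   // std::string::operator= copies the characters
    }

    delete[] old_block;          // safe: tmp holds independent copies
    return tmp;
}
```

Because each string in the new block owns its own character data, releasing the old block leaves the new elements intact, which is exactly what a byte-wise memcpy cannot guarantee.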

(4)resize

			// Resize to n elements and initialize the new elements with value
			void resize(size_t n,const T& value = T())
			{
				// If n does not exceed the current size, just move _finish
				if (n <= size())
				{
					_finish = _start + n;
				}
	
				// Otherwise make room and append copies of value at the end of the data
				else
				{
					reserve(n);
	
					while (_finish < _start + n)
					{
						*_finish = value;
						_finish++;
					}
				}
			}
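
The two branches of resize can be observed with std::vector, whose resize behaves the same way for these two cases. The function name resize_demo is only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Shrinks a vector and then grows it again with a fill value.
std::vector<int> resize_demo()
{
    std::vector<int> v{1, 2, 3, 4, 5};

    v.resize(3);        // n <= size(): only the first three elements remain
    assert(v.size() == 3);

    v.resize(6, 9);     // n > size(): three copies of 9 are appended
    return v;
}
```

The result holds 1 2 3 9 9 9: the shrink drops the tail, and the growth fills only the newly added positions with the given value.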

(5)empty

			// Whether the vector is empty (const so it also works on const objects)
			bool empty() const
			{
				return _start == _finish;
			}

2. [] overload

vector also supports random access by subscript, so we overload [] to provide it. The implementation is as follows:

Non-const objects:

			// operator[] overload
			T& operator[](size_t pos)
			{
				assert(pos < size());
	
				return _start[pos];
			}

const object:

			const T& operator[](size_t pos) const
			{
				assert(pos < size());
	
				return _start[pos];
			}

3. Iterator

Non-const objects:

			// Iterators
			iterator begin()
			{
				return _start;
			}
	
			iterator end()
			{
				return _finish;
			}

const object:

			const_iterator begin() const
			{
				return _start;
			}
	
			const_iterator end() const
			{
				return _finish;
			}

4. Modify data-related interfaces

(1)push_back

When appending at the end, first check whether the storage is full, and expand it if so. If the vector is empty at the beginning, we allocate room for 4 elements by default. The implementation is as follows:

			// Append at the end
			void push_back(const T& x)
			{
				if (size() == capacity())
				//if (_finish == _endofstorage)
				{
					reserve(capacity() == 0 ? 4 : capacity() * 2);
				}
	
				*_finish = x;
				_finish++;
			}
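
The growth rule inside push_back (start at 4, then double) can be traced without any storage at all. The sketch below is only a model of that rule; next_capacity and capacity_after are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// The growth rule used by push_back above: start at 4, then double.
std::size_t next_capacity(std::size_t current)
{
    return current == 0 ? 4 : current * 2;
}

// Capacity after appending n elements one at a time to an empty vector.
std::size_t capacity_after(std::size_t n)
{
    std::size_t cap = 0;
    for (std::size_t size = 0; size < n; ++size)
    {
        if (size == cap)          // storage is full: expand before inserting
        {
            cap = next_capacity(cap);
        }
    }
    assert(cap >= n);
    return cap;
}
```

With this rule the fifth push_back triggers the first doubling (4 to 8) and the ninth triggers the next one (8 to 16).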

(2)pop_back

Removing the last element only requires decrementing _finish. The implementation is as follows:

			// Remove the last element
			void pop_back()
			{
				assert(size() > 0);
	
				_finish--;
			}

(3)insert

Insert data at position pos. The implementation is as follows:

			// Insert value at position pos
			void insert(iterator pos, const T& value)
			{
				assert(pos >= _start);
				assert(pos <= _finish);
	
				if (_finish == _endofstorage)
				{
					// Record the offset of pos so the iterator is not invalidated by the reallocation
					size_t len = pos - _start;
					reserve(capacity() == 0 ? 4 : capacity() * 2);
	
					// After reallocating, add len to the new _start so pos keeps its position relative to _start
					pos = _start + len;
				}
	
				// Shift the elements one slot toward the end
				iterator end = _finish - 1;
				while (end >= pos)
				{
					*(end + 1) = *end;
					end--;
				}
	
				// Insert the new element
				*pos = value;
				_finish++;
			}

When implementing insert, watch out for iterator invalidation. In the code above, if the offset of pos were not recorded and the storage had to be expanded, pos would still point into the original block, which has already been released, so it would become a dangling pointer. To avoid this, we record the offset of pos from _start before the expansion and, once the new block is allocated, update pos so that it sits at the same offset in the new block. This is the first kind of iterator invalidation.
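
The same save-and-rebuild idea can be sketched with std::vector. Here reserve is called with a value larger than capacity() to force a reallocation, which plays the role of the expansion branch; insert_with_saved_offset is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts `value` at `index` after a forced reallocation, rebuilding the
// iterator from a saved offset instead of reusing the stale one.
std::vector<int> insert_with_saved_offset(std::vector<int> v, std::size_t index, int value)
{
    assert(index <= v.size());

    std::vector<int>::iterator pos = v.begin() + static_cast<std::ptrdiff_t>(index);
    std::ptrdiff_t len = pos - v.begin();   // record the offset before expanding

    v.reserve(v.capacity() * 2 + 1);        // larger than capacity(): storage moves, old pos is invalid
    pos = v.begin() + len;                  // rebuild pos in the new storage

    v.insert(pos, value);
    return v;
}
```

Inserting 3 at index 2 of the sequence 1 2 4 produces 1 2 3 4, even though the storage moved between computing the position and using it.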


(4)erase

Delete the data at position pos. The implementation is as follows:

			// Erase the element at position pos
			iterator erase(iterator pos)
			{
				assert(pos >= _start);
				assert(pos < _finish);
	
				// Shift the following elements one slot toward the front
				iterator end = pos + 1;
				while (end < _finish)
				{
					*(end - 1) = *end;
					end++;
				}
				// One fewer valid element
				_finish--;
	
				return pos;
			}

Using erase exposes the second kind of iterator invalidation. Suppose we want to delete every even number in the data, and the loop advances the iterator with it++ after every step, including after an erase.

The even numbers are then not all deleted. Why? Take two adjacent 2s in the data. After the first 2 is erased, the elements behind it shift forward, so it now points to what was originally the second 2. The it++ that follows then skips over it. The same happens with the 6 later in the data.

This kind of invalidation can also crash the program. If the data ends with a single 6, then after that 6 is erased, it is equal to v1.end(). The it++ that follows moves it one position past v1.end(), so it never compares equal to v1.end() and the loop does not terminate normally; it keeps running beyond the valid data.

What's the solution? erase returns the position that pos occupies after the deletion, as in the implementation above. The caller assigns this return value back to it and does not advance it again in that step; it is advanced only when nothing was erased:

			if (*it % 2 == 0)
			{
				it = v1.erase(it);
			}
			else
			{
				it++;
			}
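
Put into a complete loop, the pattern looks like the sketch below. It uses std::vector, whose erase also returns the position following the removed element; the function name erase_evens and the sample data in the note are assumptions made for this example, not the data from the original figures.

```cpp
#include <cassert>
#include <vector>

// Removes every even number, continuing from the iterator returned by erase.
std::vector<int> erase_evens(std::vector<int> v)
{
    std::vector<int>::iterator it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
        {
            it = v.erase(it);   // continue from the returned position, no ++ here
        }
        else
        {
            ++it;               // advance only when nothing was erased
        }
    }

    for (int x : v)
    {
        assert(x % 2 != 0);     // postcondition: no even number is left
    }
    return v;
}
```

For the input 1 2 2 3 4 5 6, which contains both adjacent even numbers and an even number at the very end, the result is 1 3 5.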

Conclusion: after insert or erase, treat the iterator that was passed in as invalid and do not access it again.

(5)swap

We implement the swap member of our vector with the swap function of the standard library, as follows:

			// Swap
			void swap(vector<T>& v)
			{
				std::swap(_start, v._start);
				std::swap(_finish, v._finish);
				std::swap(_endofstorage, v._endofstorage);
			}

(6)clear

To clear the data, just set _finish to _start. The implementation is as follows:

			// Clear the data
			void clear()
			{
				_finish = _start;
			}

5. Constructor

Because we gave default values at the member declarations, the parameterless constructor can be written with an empty body: the default values are applied through the initialization list.

			// Constructor
			vector()
			{}

We then overload other forms of the constructor. For example, vector<int> v(10,0) allocates 10 elements and initializes them to 0:

			// vector<int> v(10,0);
			vector(int n, const T& value = T())
			{
				reserve(n);
	
				for (int i = 0; i < n; i++)
				{
					push_back(value);
				}
			}

The default argument is an anonymous object, T(), because we do not know what type T is. For a built-in type, T() is initialized to 0 (or nullptr for pointers). We learned earlier that the compiler leaves built-in members uninitialized by default, but an anonymous object written as T() is initialized. Note that the parameter must be a const reference, because an anonymous object is a temporary and can only bind to a const reference. For a custom type, T() calls the type's default constructor.
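
The behavior of T() for different types can be checked directly. In this small sketch, default_value and check_default_values are hypothetical helpers; default_value returns the same expression that the constructor uses as its default argument.

```cpp
#include <cassert>
#include <string>

// Returns T(), the anonymous object used as the default argument above.
template <class T>
T default_value()
{
    return T();
}

// Built-in types come out zeroed; class types use their default constructor.
void check_default_values()
{
    assert(default_value<int>() == 0);
    assert(default_value<double>() == 0.0);
    assert(default_value<int*>() == nullptr);
    assert(default_value<std::string>().empty());
}
```

This is why vector<int> v(10) built with the constructor above would hold ten zeros rather than ten indeterminate values.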


We also overload a constructor that builds the vector from an iterator range, implemented as follows:

			// vector<string> v(str.begin(),str.end());
			template <class InputIterator>
			vector(InputIterator first, InputIterator last)
			{
				while (first != last)
				{
					push_back(*first);
					first++;
				}
			}

Here a function template is used inside a class template. The template parameter InputIterator applies only to this constructor, so the range can come from the iterators of any container, not just from another vector.
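
As a usage sketch, std::vector offers the same iterator-range constructor, so a vector can be built from the characters of a string. The helper name chars_of is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a vector<char> from a string through the iterator-range constructor.
std::vector<char> chars_of(const std::string& s)
{
    std::vector<char> result(s.begin(), s.end());
    assert(result.size() == s.size());   // one element per character
    return result;
}
```

The iterators passed in belong to std::string, not to a vector, which is the flexibility the template parameter provides.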

6. Copy constructor

The copy constructor only needs to reserve as much space as the source object and then insert its elements one by one. The implementation is as follows:

			// Copy constructor -- v2(v1);
			vector(const vector<T>& v)
			{
				reserve(v.capacity());
				for (auto& e : v)
				{
					push_back(e);
				}
			}

7. Assignment operator overloading

The assignment operator takes its parameter by value, so tmp is a copy of the right-hand operand. It then swaps the data of *this with tmp and returns *this. When tmp goes out of scope, its destructor is called automatically and releases the old data. The implementation is as follows:

			// Assignment operator overload -- v2 = v1;
			vector<T>& operator=(vector<T> tmp)
			{
				swap(tmp);
	
				return *this;
			}
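
The pass-by-value-then-swap technique is independent of vector itself. Below is a minimal sketch on a hypothetical class, IntBuffer, that owns a plain int array; it is written only to show the pattern and is not part of the implementation above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal owner of an int array, used only to show copy-and-swap.
class IntBuffer
{
public:
    explicit IntBuffer(std::size_t n, int value = 0)
        : _data(new int[n]), _size(n)
    {
        for (std::size_t i = 0; i < n; ++i) _data[i] = value;
    }

    IntBuffer(const IntBuffer& other)
        : _data(new int[other._size]), _size(other._size)
    {
        for (std::size_t i = 0; i < _size; ++i) _data[i] = other._data[i];
    }

    // Pass by value: tmp is already a deep copy of the right-hand operand.
    IntBuffer& operator=(IntBuffer tmp)
    {
        swap(tmp);
        return *this;   // tmp now holds the old data and frees it in its destructor
    }

    ~IntBuffer() { delete[] _data; }

    void swap(IntBuffer& other)
    {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    std::size_t size() const { return _size; }

    int& operator[](std::size_t i)
    {
        assert(i < _size);
        return _data[i];
    }

private:
    int* _data;
    std::size_t _size;
};
```

One design consequence is that the assignment operator needs no separate cleanup or copy logic: the copy constructor makes the copy and the destructor of tmp releases the old block.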

8. Destructor

The destructor only needs to release the block that _start points to, because _start, _finish, and _endofstorage all point into the same block of memory, just at different positions. The implementation is as follows:

			// Destructor
			~vector()
			{
				delete[] _start;
				_start = _finish = _endofstorage = nullptr;
			}


Origin blog.csdn.net/YoungMLet/article/details/132230486