[C++] STL_vector iterator invalidation problem

1. Introduction

The main job of an iterator is to let algorithms work without caring about the underlying data structure. Underneath, an iterator is either a pointer or a wrapper around one. For example, in a simple implementation (like the simulated vector in this article), the iterator of string is the raw pointer char*, and the iterator of vector is the raw pointer T*. Iterator invalidation therefore means that the space pointed to by the iterator's underlying pointer has been destroyed, so using the iterator means using memory that has already been released. Continuing to use an invalid iterator is undefined behavior, and the program may crash.
Now that we understand what iterator invalidation is, let's analyze which vector operations cause it.

2. Situation 1: Operations that change the underlying space

The interfaces that can change the underlying space include resize, reserve, insert, assign, push_back, and so on.
Reason:
All of these interfaces may expand the capacity. An expansion usually allocates a new block at a different address, copies the elements over, and releases the original block. An iterator obtained before the expansion still refers to the released space, so it is invalid, and using it may crash the program.
Solution:
After any operation that may expand the capacity, reassign the iterator before using it again.
Example:
Let's take a look at the insert interface.
[Figure: inserting 30 before 3 into a full vector, which forces an expansion]

As the figure shows, we want to insert 30 before 3, but the space is already full, so the capacity must be expanded first. The expansion allocates a new block somewhere else, copies the data from the old block into it, and releases the old block. _start now points to the head of the new block, but the iterator it (pos in the code) still refers to a position in the old block, which means it is invalid. The fix is to remember the position of it relative to _start and, once the new block is ready, point it at the same relative position in the new block. (Method: compute the distance len from _start to it before expanding, then set it to the new _start + len.)
Code:

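iterator insert(iterator pos, const T& x)
{
    assert(pos >= _start);
    assert(pos <= _finish);

    if (_finish == _endOfStorage)
    {
        // Record the distance from _start to pos first, because pos is
        // invalidated once reserve moves the data to a new block.
        size_t len = pos - _start;
        reserve(capacity() == 0 ? 4 : 2 * capacity());
        // Update pos so that it points into the new block.
        pos = _start + len;
    }

    // Shift the elements in [pos, _finish) one slot to the right.
    // Starting at _finish avoids forming a pointer before _start.
    iterator end = _finish;
    while (end > pos)
    {
        *end = *(end - 1);
        --end;
    }

    *pos = x;
    ++_finish;

    // pos was passed by value, so the caller must use this return value.
    return pos;
}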

3. Situation 2: Erasing the element at a specified position

The erase interface also causes iterator invalidation. Let's analyze how.
Cause:
After erase deletes the element at pos, the elements after pos move forward and the underlying space does not change, so in theory the iterator should still be usable. However, if pos happens to be the last element, then after the deletion pos is exactly the position of end, where there is no element, so pos is invalid. For this reason, when any element of a vector is erased, VS treats the iterator at that position as invalid. The C++ standard says the same: erase invalidates iterators at or after the erased position.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    int a[] = { 1, 2, 3, 4 };
    vector<int> v(a, a + sizeof(a) / sizeof(int));
    // Use find to get an iterator to the element 3.
    vector<int>::iterator pos = find(v.begin(), v.end(), 3);
    // Erase the element at pos; this invalidates pos.
    v.erase(pos);
    cout << *pos << endl; // invalid access: pos is no longer valid
    return 0;
}

Solution:
The root of the problem is the tail deletion, so erase returns an iterator to the position that follows the deleted element. Our simulated implementation deletes by overwriting (each element at it + 1 is copied over it), so the position to return is still pos. After --_finish, if that position equals _finish, a loop that compares against end() simply stops there. As long as the caller uses the returned iterator, the invalidation problem is solved.
Code:

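iterator erase(iterator pos)
{
    assert(pos >= _start);
    assert(pos < _finish);

    // Shift the elements in (pos, _finish) one slot to the left.
    // The bound must be _finish, not _endOfStorage: the slots between
    // the two hold no valid elements.
    iterator it = pos + 1;
    while (it < _finish)
    {
        *(it - 1) = *it;
        ++it;
    }
    --_finish;

    // pos now refers to the element that followed the erased one,
    // or equals _finish if the last element was erased.
    return pos;
}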

4. Iterator invalidation detection under g++

Under Linux, g++ is not very strict about detecting iterator invalidation, and it does not handle it as aggressively as vs2019 does.
Let's look at how the same code behaves under vs2019 and under g++ in the following situations.

4.1 Expansion

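#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> v{ 1, 2, 3, 4, 5 };
    for (size_t i = 0; i < v.size(); ++i)
        cout << v[i] << " ";
    cout << endl;

    auto it = v.begin();
    cout << "Capacity before expansion: " << v.capacity() << endl;

    // reserve(100) moves the data to a new block, which invalidates it.
    v.reserve(100);
    cout << "Capacity after expansion: " << v.capacity() << endl;

    // it still refers to the old, released block.
    while (it != v.end())
    {
        cout << *it << " ";
        ++it;
    }
    cout << endl;

    return 0;
}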

Result under g++: the program runs, but the values printed through it are wrong.
Result under vs2019: the program crashes.
Conclusion: After the expansion the iterator is invalid. Under g++ the program can still run but the result is wrong, while under vs it crashes directly.

4.2 erase at an arbitrary position (non-tail deletion)

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> v{ 1, 2, 3, 4, 5 };
    vector<int>::iterator it = find(v.begin(), v.end(), 3);

    v.erase(it);          // it is invalidated here
    cout << *it << endl;  // uses the invalidated iterator

    while (it != v.end())
    {
        cout << *it << " ";
        ++it;
    }
    cout << endl;

    return 0;
}

Result under g++: the program runs.
Result under vs2019: the program crashes.
Conclusion: For a non-tail deletion the space does not change and the iterator still refers to the same space. Under g++ the elements after the erased one move forward, the position of it still holds an element, and the iterator keeps working. Under VS, any erase marks the iterator as invalid.

4.3 erase in a loop (tail deletion)

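#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> v{ 1, 2, 3, 4, 5, 6 };

    auto it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
            v.erase(it);  // the return value is ignored, so it is invalid
        ++it;             // advances an invalidated iterator
    }

    for (auto e : v)
        cout << e << " ";
    cout << endl;

    return 0;
}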

Under g++ the program also goes wrong, and under vs2019 there is no need to even try: it crashes directly.
Conclusion: When the last element is erased, no data needs to be moved; _finish simply moves back and now coincides with it, so it is equal to end(). The iterator is invalid at this point, and the following ++it steps past end(), so the program crashes.

5. Summary

This article covered the iterator invalidation caused by expansion, insertion, and deletion. g++ is not strict about detecting invalid iterators, whereas vs is very strict and crashes directly.
1. After any capacity expansion the iterator should be updated, because we cannot know in advance which expansion moves the data to a new address.
2. When inserting at any position, the iterator must be updated once an expansion happens. In essence this is the same as updating the iterator after an expansion.
3. When erasing at any position: under g++ a non-tail deletion does not run into the invalidation problem, but a tail deletion must be handled carefully; under vs2019 every erase is treated as invalidating the iterator, and using it crashes directly.

Origin blog.csdn.net/Ljy_cx_21_4_3/article/details/132517537