[C++] String class simulation implementation Part 1 (with complete source code)

Preface

In the previous article, we introduced in detail how to use some of the common interfaces of the string class. In this article, we write our own simplified implementation of string to understand more deeply how it works.

1. Basic structure of string

In the previous article we learned that:

Under the hood, string is a character array that supports dynamic growth. With that structure settled, we can start writing our own implementation.

First create a new header file string.h and define a string class:

class string
{
public:
    // member functions
private:
    char*  _str;
    size_t _size;
    size_t _capacity;
};

Here the string class has three member variables: the character pointer _str points to the dynamically allocated array, _size records the number of valid characters, and _capacity records the capacity (excluding the terminating '\0').

However, the standard library already has a string class. To avoid a name conflict, we define a namespace and put our own string class inside it:

namespace w
{
    class string
    {
    public:
        // member functions
    private:
        char*  _str;
        size_t _size;
        size_t _capacity;
    };
}

2. Constructor, destructor

2.1 Implementation of constructor

2.1.1 Constructor with parameters

First, let's implement a constructor that takes a parameter.

The string class in the standard library has many constructors. Here we implement only the most commonly used one, which builds a string from a C string (const char* str).

In the previous article, we mentioned that members should be initialized in the initialization list whenever possible, so a first attempt might be to initialize _str directly from the parameter str.

But this does not compile. First, it is a permission amplification problem (discussed in the previous article): str is a const char*, so the characters it points to cannot be modified, while _str is a char*, which does allow modification. Second, even if the initialization were allowed, _str would point at a string literal, whose contents cannot be modified.
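To make the two problems concrete, here is a minimal sketch. The helper name copy_c_string is made up for this illustration and is not part of the class. The commented-out line is the kind of initialization the compiler rejects; the rest shows the copy-based alternative.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper for illustration only: returns a writable copy of a
// read-only C string. The caller owns the result and must delete[] it.
char* copy_c_string(const char* str)
{
    assert(str != nullptr);

    // char* p = str;  // error: invalid conversion from 'const char*' to 'char*'
    //                 // (read-only access would be amplified to read-write)

    char* p = new char[std::strlen(str) + 1]; // +1 for the terminating '\0'
    std::strcpy(p, str);                      // copies the characters and the '\0'
    return p;
}
```

Because the result is a private copy, writing to it (for example p[0] = 'H') is well defined, which would not be the case for a pointer into a string literal.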

So what should we do? Instead of storing the parameter directly, we allocate our own space and copy the characters into it with strcpy:

string(const char* str)
    : _str(new char[strlen(str) + 1])
    , _size(strlen(str))
    , _capacity(strlen(str))
{
    strcpy(_str, str);
}

By the way, here we provide an interface to return a string:

const char* c_str() const
{
    return _str;
}

Next we create a test.cpp file to test the interfaces we wrote (see test_string1 in section 6.2).

2.2 Destructor

Here we give the destructor directly:

~string()
{
    delete[] _str;
    _str = nullptr;
    _size = _capacity = 0;
}

2.3 Parameterless constructor

Sometimes we want to create a string object without passing any argument, so we also need a parameterless constructor.

Suppose the parameterless constructor simply initializes _str to a null pointer.

Is this really feasible?
If _str is a null pointer, the c_str function we just implemented returns a null pointer, and printing that pointer crashes the program. The c_str interface in the standard library, by contrast, still returns a valid (empty) C string even when the string is empty.

So what should we do here? We can write this:

string()
    : _str(new char[1])
    , _size(0)
    , _capacity(0)
{
    _str[0] = '\0';
}

Here we allocate one byte for _str and store '\0' in it. This way the problem above does not occur.

2.4 Merging no-parameter and parameterized constructors

As mentioned before, a constructor whose parameters all have default values can serve as both the parameterless constructor and the parameterized one.

Let's look at several possible default values for the parameter str.
Can the default value be the character '\0'? Definitely not. The types do not match: one is a character and the other is a string (const char*).
Can the default value be a null pointer? Definitely not either. strlen(str) would then be called on a null pointer.

In fact, the default value should be an empty string, as in string(const char* str = "") in section 6.1.
Here we directly give an empty string literal, which still contains a terminating '\0'.
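As a minimal sketch of the merged constructor, the class below keeps only the pieces needed to show the default argument at work. The name MiniString is an assumption made for this example; copying is disabled only to keep the sketch short.

```cpp
#include <cassert>
#include <cstring>

// Illustrative stand-in for w::string with only the pieces needed here.
class MiniString
{
public:
    // The default argument "" is an array holding just '\0', so strlen(str)
    // is 0 and strcpy copies only the terminator.
    MiniString(const char* str = "")
        : _str(new char[std::strlen(str) + 1])
        , _size(std::strlen(str))
        , _capacity(std::strlen(str))
    {
        std::strcpy(_str, str);
    }

    ~MiniString() { delete[] _str; }

    // Copying is disabled in this sketch (deep copy is covered in section 5).
    MiniString(const MiniString&) = delete;
    MiniString& operator=(const MiniString&) = delete;

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }
    std::size_t capacity() const { return _capacity; }

private:
    char*       _str;
    std::size_t _size;
    std::size_t _capacity;
};
```

With this single constructor, both MiniString a; and MiniString b("hello"); compile, and a.c_str() is a valid empty C string rather than a null pointer.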

3. String traversal

3.1 operator[]

We know that in the standard library, you can access a character in a string by subscript. Let's overload operator[].

First we need to implement the size() interface, which simply returns _size.

Next we implement the operator[] overloads (the full code is in section 6.1).
We implement two versions: the normal version is used by ordinary objects and the const version by const objects. The two functions form an overload set.
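The sketch below shows how the compiler chooses between the two versions. FixedText and first_char are names invented for this example; the fixed buffer just avoids memory management so the overloads stay in focus.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative fixed-size wrapper (not the article's class) showing how the
// const and non-const overloads of operator[] are selected.
class FixedText
{
public:
    std::size_t size() const { return _size; }

    // Chosen for non-const objects: returns a writable reference.
    char& operator[](std::size_t pos)
    {
        assert(pos < _size);
        return _buf[pos];
    }

    // Chosen for const objects: returns a read-only reference.
    const char& operator[](std::size_t pos) const
    {
        assert(pos < _size);
        return _buf[pos];
    }

private:
    char        _buf[6] = "hello";
    std::size_t _size = 5;
};

// Takes a const reference, so s[0] resolves to the const overload.
inline char first_char(const FixedText& s)
{
    return s[0];
}
```

On a non-const object, t[0] = 'j' compiles because the first overload returns char&. Inside first_char the same assignment would be rejected, since the const overload returns const char&.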

Let's verify it with test_string1 in section 6.2.

3.2 Iterator simulation implementation (simple implementation)

In addition to using [] to traverse and access a string object, we can also use iterators.

As we said, an iterator can be understood as something that behaves like a pointer, but it is not necessarily a pointer.
We mentioned earlier that there are several versions of the STL, and their implementations may differ.
In fact, the string iterator under VS is not implemented as a raw pointer, while the SGI version used under G++ is essentially pointer-based.
So here we implement the iterator as a pointer: iterator is a typedef for char*, begin() returns _str, and end() returns _str + _size.

Let's verify it with the iterator loop in test_string1 (section 6.2).

Similarly, we can also use a range-based for loop to traverse the string (see test_string1 in section 6.2).
Underneath, range-based for uses iterators.
You can think of the range-based for syntax as somewhat similar to the macros we learned before: the compiler rewrites it into an iterator loop in which *it is assigned to ch. It is a mechanical replacement, which is why it only works if the class provides functions named begin() and end().
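To illustrate the replacement, the sketch below writes the same loop twice: once as a range-based for, and once as roughly the iterator loop the compiler turns it into. CharRange and the two function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative type with pointer iterators, mirroring the approach above.
struct CharRange
{
    typedef const char* const_iterator;
    const char* data;
    std::size_t size;

    const_iterator begin() const { return data; }
    const_iterator end() const { return data + size; }
};

// Counts occurrences of target with a range-based for loop.
inline std::size_t count_with_range_for(const CharRange& r, char target)
{
    std::size_t n = 0;
    for (auto ch : r)
    {
        if (ch == target) ++n;
    }
    return n;
}

// Roughly what the compiler generates for the loop above.
inline std::size_t count_with_iterators(const CharRange& r, char target)
{
    std::size_t n = 0;
    for (auto it = r.begin(), last = r.end(); it != last; ++it)
    {
        auto ch = *it; // the loop variable is initialized from *it
        if (ch == target) ++n;
    }
    return n;
}
```

Renaming begin() to something else, such as Begin(), makes the range-based version stop compiling while the hand-written loop can be adjusted, which shows that the rewrite relies on those exact names.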

3.3 const iterator simulation implementation

Here we implement the const version for const objects to use: const_iterator is a typedef for const char*, and begin() and end() get const overloads that return it (see section 6.1).

4. Inserting, deleting, searching and modifying data

First, let's implement push_back() and append(). Both insert data, so we must consider expansion.
If we need to expand, how much should we grow at a time?
For push_back, doubling the capacity each time is fine, but for append doubling may not be enough.
Why?
If the current capacity is 10 and we append a string of length 25, doubling the capacity to 20 is still not enough.
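The arithmetic can be checked with a small sketch. Both helper names are hypothetical; the starting capacity of 4 for an empty string matches the push_back in section 6.1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the capacity after one doubling step
// (an empty string starts at 4 in this sketch).
inline std::size_t doubled(std::size_t capacity)
{
    return capacity == 0 ? 4 : capacity * 2;
}

// Hypothetical helper: true if doubling alone leaves room for `extra`
// more characters on top of the current `size`.
inline bool doubling_is_enough(std::size_t size, std::size_t capacity, std::size_t extra)
{
    return size + extra <= doubled(capacity);
}
```

For a full string with size 10 and capacity 10, one more character fits after doubling to 20, but 25 more characters need room for 35, so append has to ask for at least _size + len.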

So here we use another string interface, reserve, which changes the capacity to the size we specify and does the expansion for us.
Let's implement reserve first.

4.1 reserve

Let's first take a look at how to implement reserve: it allocates a new block of n + 1 bytes, copies the old contents into it, releases the old block, and updates _capacity (see section 6.1).
When the parameter n is less than _capacity, the capacity would shrink if the if check were missing. As in the library, our reserve should not reduce the capacity, so this check is needed.
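The steps can be sketched on a plain struct with the same three fields. Buffer and the free function reserve are stand-ins for this example; the class version in section 6.1 works on its own members.

```cpp
#include <cassert>
#include <cstring>

// Illustrative buffer with the same three fields as the string class.
struct Buffer
{
    char*       str;
    std::size_t size;
    std::size_t capacity; // excludes the terminating '\0'
};

// Grows the buffer so it can hold at least n characters; never shrinks it.
inline void reserve(Buffer& b, std::size_t n)
{
    if (n > b.capacity)
    {
        char* tmp = new char[n + 1]; // +1 for the terminating '\0'
        std::strcpy(tmp, b.str);     // copy the existing characters and '\0'
        delete[] b.str;              // release the old block
        b.str = tmp;
        b.capacity = n;
    }
}
```

Calling reserve with a value smaller than the current capacity leaves the buffer untouched, which is exactly what the if check guarantees.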

4.2 push_back and append

With reserve in place, we continue with push_back and append.

In push_back we simply choose to double the capacity.

In append the capacity is expanded to at least _size + len.

Let's try them out (test_string2 in section 6.2).

4.3 +=

Although we have push_back and append, we prefer to use the overloaded +=. Underneath, += can simply be implemented with push_back and append.

Let's try it out (test_string2 in section 6.2).

4.4 insert

For insert, we mainly implement these two versions from the library: one inserts n copies of a character at position pos, and the other inserts a C string at position pos.

First, let's implement inserting n characters at position pos.
The logic is relatively simple: first determine whether expansion is needed, then insert the data. If the insertion point is in the middle, the existing data has to be moved back first.
A first version moves the characters with a loop of the form while (end >= pos), where end is a size_t that starts at _size.

Is there any problem with writing it this way? Testing an insertion in the middle of the string seems to show no problem. Is it really okay?

Let's look at a special case: inserting data at pos = 0.

The program hangs here. Why?
When pos = 0, the loop is still entered when end equals 0. What does end become after that? Is it -1?
The type of end is size_t, an unsigned integer, so decrementing end from 0 does not give -1 but the maximum value of size_t. The condition end >= pos stays true, the accesses go out of bounds, the loop never ends normally, and the program crashes.

So how do we solve it? Can we change end to int?

That does not work either. When end is compared with pos, end is an int but pos is a size_t, so end is implicitly converted to an unsigned value before the comparison (the usual arithmetic conversions, C language knowledge), and -1 again becomes a huge number. So how should we solve it?

There are many solutions. We use one of them, based on npos, which we mentioned in the previous article: npos is a static const size_t member with the value -1, and the loop condition additionally checks end != npos.
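The wraparound and the npos check can be tried in isolation. The sketch below only counts loop iterations instead of moving characters; backward_steps and minus_one_ge are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many times the "move backwards" loop body runs when end is
// unsigned. Without the extra check the loop would never stop for pos == 0.
inline std::size_t backward_steps(std::size_t size, std::size_t pos)
{
    const std::size_t npos = static_cast<std::size_t>(-1); // largest size_t value
    std::size_t steps = 0;
    std::size_t end = size;
    while (end >= pos && end != npos) // end != npos stops once end wraps past 0
    {
        ++steps;
        --end; // 0 - 1 wraps around to npos for an unsigned type
    }
    return steps;
}

// Shows why switching end to int does not help: in a mixed comparison the
// int is converted to size_t, so -1 becomes a huge value.
inline bool minus_one_ge(std::size_t pos)
{
    int end = -1;
    return static_cast<std::size_t>(end) >= pos; // what end >= pos does implicitly
}
```

For size 5 and pos 0 the body runs for end = 5, 4, 3, 2, 1, 0 and then stops, and minus_one_ge returns true for any pos, which is why an int counter alone would not end the loop.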

Let's test it again (test_string3 in section 6.2).

We have just inserted characters; now we insert a string. The logic is the same as above. The only difference is that above we moved the data back by n positions, while here we move it back by strlen(str) positions.

Let's test it (test_string3 in section 6.2).

4.5 erase

Next let's implement erase, which deletes len characters starting at position pos.

In the first case, pos + len is less than the length of the string. We need to delete the len characters starting at pos but keep the characters after them, so we move the later data forward to overwrite the characters being deleted.
In the other case, len is large enough that pos + len is greater than or equal to the length of the string, so everything from pos onward is deleted. The same applies if the len parameter is not passed and takes its default value npos. These two situations can be handled together: we only need to put '\0' at position pos and set _size to pos.
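The two cases can be sketched as a free function on a raw buffer. This is a variation rather than the class code: erase_chars is a made-up name, the tail is shifted with memmove instead of a hand-written loop, and the test len >= size - pos is used because it cannot overflow and also covers npos.

```cpp
#include <cassert>
#include <cstring>

// Erases up to len characters starting at pos from a '\0'-terminated buffer
// of length size, in place. Returns the new length.
inline std::size_t erase_chars(char* str, std::size_t size, std::size_t pos, std::size_t len)
{
    assert(pos <= size);
    if (len >= size - pos)
    {
        // Everything from pos onward goes: just terminate the string at pos.
        str[pos] = '\0';
        return pos;
    }
    // Otherwise shift the tail (including '\0') left by len characters.
    std::memmove(str + pos, str + pos + len, size - pos - len + 1);
    return size - len;
}
```

Starting from "helloworld", erasing 3 characters at position 5 gives "hellold"; erasing 30 characters at position 5 then gives "hello", because the request runs past the end.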

Let's test it out (test_string4 in section 6.2).

Of course, to be consistent with the standard library, erase returns a reference to the string itself (string&).

4.6 find

Next let's implement find. The implementation is very simple: it traverses the string looking for the character. If the character is found, find returns its subscript; otherwise it returns npos.

find also supports searching for a substring starting from position pos. Here we reuse the C library function strstr.
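A minimal sketch of the strstr-based search, written as a free function so it can be tried on its own. find_sub and the file-level npos constant are assumptions for this example.

```cpp
#include <cassert>
#include <cstring>

const std::size_t npos = static_cast<std::size_t>(-1);

// Returns the index of the first occurrence of sub in str at or after pos,
// or npos if there is none. str must be '\0'-terminated with length size.
inline std::size_t find_sub(const char* str, std::size_t size, const char* sub, std::size_t pos = 0)
{
    assert(pos <= size);
    const char* ptr = std::strstr(str + pos, sub);
    // Pointer subtraction turns the match address back into an index.
    return ptr ? static_cast<std::size_t>(ptr - str) : npos;
}
```

strstr returns a pointer into the searched buffer, so subtracting the start of the buffer (not the start of the search) gives an index relative to the whole string.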

Let's test it (test_string5 in section 6.2).

4.7 substr

Next let's implement substr. Its logic is also very simple.

One thing to note is the length check: when the requested length len is npos or runs past the end of the string, the substring we take extends from pos to the end of the string.

5. Copy construction

Let's first write a piece of code like test_string6 in section 6.2: create s1, then construct s2 from it.
There is a copy construction here: s2 is copy-constructed from s1.

5.1 Shallow copy: the default copy constructor

From the previous article on classes and objects, we know that the compiler generates a copy constructor by default if we do not write one ourselves. Let's run the code above without writing our own.

The program crashes with the classic shallow copy problem. As discussed in previous articles, if no copy constructor is explicitly defined, the compiler generates a default one, which copies the object member by member, so for built-in types such as pointers it just copies the value. This kind of copy is called a shallow copy, or value copy. Here both objects end up with _str pointing at the same memory, which is then released twice when the two destructors run. Once a class manages a resource, the copy constructor must be written by hand; otherwise the copy is shallow and problems occur.

5.2 Deep copy: writing the copy constructor

Here we need to implement the copy constructor ourselves and perform a deep copy: allocate a new block of the same capacity and copy the characters into it (see string(const string& s) in section 6.1).
Let's test it (test_string6 in section 6.2).
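The effect of a deep copy can be observed with a small sketch. Text is a name invented for this example and holds only the pointer; assignment is disabled because it is outside what this part covers.

```cpp
#include <cassert>
#include <cstring>

// Illustrative owner of a heap-allocated C string.
class Text
{
public:
    explicit Text(const char* s)
        : _str(new char[std::strlen(s) + 1])
    {
        std::strcpy(_str, s);
    }

    // Deep copy: allocate a separate block and copy the characters, so the
    // two objects never share (or double-free) the same memory.
    Text(const Text& other)
        : _str(new char[std::strlen(other._str) + 1])
    {
        std::strcpy(_str, other._str);
    }

    Text& operator=(const Text&) = delete; // assignment is out of scope here

    ~Text() { delete[] _str; }

    char* data() { return _str; }
    const char* c_str() const { return _str; }

private:
    char* _str;
};
```

After Text b(a); the two objects hold different addresses with equal contents, so changing b through data() leaves a untouched and each destructor frees its own block.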

6. Source code (Part 1)

6.1 string.h

#include <iostream>
#include <cstring>   // strlen, strcpy, strstr
#include <cassert>   // assert
using namespace std;

namespace w
{
    class string
    {
    public:
        typedef char* iterator;
        typedef const char* const_iterator;

        iterator begin()
        {
            return _str;
        }

        iterator end()
        {
            return _str + _size;
        }

        const_iterator begin() const
        {
            return _str;
        }

        const_iterator end() const
        {
            return _str + _size;
        }

        // Serves as both the parameterless and the parameterized constructor.
        string(const char* str = "")
            : _str(new char[strlen(str) + 1])
            , _size(strlen(str))
            , _capacity(strlen(str))
        {
            strcpy(_str, str);
        }

        // Deep copy: allocate a separate block and copy the characters.
        string(const string& s)
        {
            _str = new char[s._capacity + 1];
            strcpy(_str, s._str);
            _size = s._size;
            _capacity = s._capacity;
        }

        ~string()
        {
            delete[] _str;
            _str = nullptr;
            _size = _capacity = 0;
        }

        const char* c_str() const
        {
            return _str;
        }

        size_t size() const
        {
            return _size;
        }

        char& operator[](size_t pos)
        {
            assert(pos < _size);
            return _str[pos];
        }

        const char& operator[](size_t pos) const
        {
            assert(pos < _size);
            return _str[pos];
        }

        void reserve(size_t n)
        {
            // Only grow; a smaller n leaves the capacity unchanged.
            if (n > _capacity)
            {
                char* tmp = new char[n + 1];
                strcpy(tmp, _str);
                delete[] _str;
                _str = tmp;
                _capacity = n;
            }
        }

        void push_back(char ch)
        {
            if (_size == _capacity)
            {
                // double the capacity
                reserve(_capacity == 0 ? 4 : _capacity * 2);
            }

            _str[_size] = ch;

            ++_size;
            _str[_size] = '\0';
        }

        void append(const char* str)
        {
            size_t len = strlen(str);
            if (_size + len > _capacity)
            {
                // expand to at least _size + len
                reserve(_size + len);
            }

            strcpy(_str + _size, str);
            _size += len;
        }

        string& operator+=(char ch)
        {
            push_back(ch);
            return *this;
        }

        string& operator+=(const char* str)
        {
            append(str);
            return *this;
        }

        void insert(size_t pos, size_t n, char ch)
        {
            assert(pos <= _size);

            if (_size + n > _capacity)
            {
                // expand to at least _size + n
                reserve(_size + n);
            }

            // Move [pos, _size] (including '\0') back by n positions, from the end.
            // end is unsigned: once it wraps past 0 it equals npos, so stop there.
            size_t end = _size;
            while (end >= pos && end != npos)
            {
                _str[end + n] = _str[end];
                --end;
            }

            for (size_t i = 0; i < n; i++)
            {
                _str[pos + i] = ch;
            }

            _size += n;
        }

        void insert(size_t pos, const char* str)
        {
            assert(pos <= _size);

            size_t len = strlen(str);
            if (_size + len > _capacity)
            {
                // expand to at least _size + len
                reserve(_size + len);
            }

            // Move [pos, _size] (including '\0') back by len positions, from the end.
            // end is unsigned: once it wraps past 0 it equals npos, so stop there.
            size_t end = _size;
            while (end >= pos && end != npos)
            {
                _str[end + len] = _str[end];
                --end;
            }

            for (size_t i = 0; i < len; i++)
            {
                _str[pos + i] = str[i];
            }

            _size += len;
        }

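        string& erase(size_t pos, size_t len = npos)
        {
            assert(pos <= _size);

            // len >= _size - pos means "erase to the end"; written this way
            // so that a very large len cannot overflow pos + len.
            if (len == npos || len >= _size - pos)
            {
                _str[pos] = '\0';
                _size = pos;
            }
            else
            {
                // Shift the tail (including '\0') forward over the erased part.
                size_t end = pos + len;
                while (end <= _size)
                {
                    _str[pos++] = _str[end++];
                }
                _size -= len;
            }

            return *this;
        }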

        size_t find(char ch, size_t pos = 0) const
        {
            assert(pos < _size);

            for (size_t i = pos; i < _size; i++)
            {
                if (_str[i] == ch)
                {
                    return i;
                }
            }

            return npos;
        }

        size_t find(const char* str, size_t pos = 0) const
        {
            assert(pos < _size);

            const char* ptr = strstr(_str + pos, str);
            if (ptr)
            {
                // convert the match address back into a subscript
                return ptr - _str;
            }
            else
            {
                return npos;
            }
        }

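        string substr(size_t pos = 0, size_t len = npos) const
        {
            assert(pos < _size);

            // Clamp the length to what is left after pos; the comparison is
            // written so that a very large len cannot overflow pos + len.
            size_t n = len;
            if (len == npos || len > _size - pos)
            {
                n = _size - pos;
            }

            string tmp;
            tmp.reserve(n);
            for (size_t i = pos; i < pos + n; i++)
            {
                tmp += _str[i];
            }

            return tmp;
        }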

    private:
        char*  _str;
        size_t _size;
        size_t _capacity;

    public:
        const static size_t npos;
    };

    const size_t string::npos = -1;
}


6.2 test.cpp

#include "string.h"

void test_string1()
{
    w::string s1("hello world");
    cout << s1.c_str() << endl;

    // traverse with operator[]
    for (size_t i = 0; i < s1.size(); i++)
    {
        cout << s1[i] << " ";
    }
    cout << endl;

    // traverse with an iterator
    w::string::iterator it = s1.begin();
    while (it != s1.end())
    {
        cout << *it << " ";
        ++it;
    }
    cout << endl;

    // traverse with a range-based for loop
    for (auto ch : s1)
    {
        cout << ch << " ";
    }
    cout << endl;
}

void test_string2()
{
    w::string s1("hello world");
    cout << s1.c_str() << endl;

    s1.push_back(' ');
    s1.push_back('#');
    s1.append("hello");
    cout << s1.c_str() << endl;

    w::string s2("hello world");
    cout << s2.c_str() << endl;

    s2 += ' ';
    s2 += '#';
    s2 += "hello code";
    cout << s2.c_str() << endl;
}

void test_string3()
{
    w::string s1("helloworld");
    cout << s1.c_str() << endl;

    s1.insert(5, 3, '#');
    cout << s1.c_str() << endl;

    s1.insert(0, 3, '#');
    cout << s1.c_str() << endl;

    w::string s2("helloworld");
    s2.insert(5, "%%%%%");
    cout << s2.c_str() << endl;
}

void test_string4()
{
    w::string s1("helloworld");
    cout << s1.c_str() << endl;

    s1.erase(5, 3);
    cout << s1.c_str() << endl;

    s1.erase(5, 30);
    cout << s1.c_str() << endl;

    s1.erase(2);
    cout << s1.c_str() << endl;
}

void test_string5()
{
    w::string s1("helloworld");
    cout << s1.find('w', 2) << endl;
}

void test_string6()
{
    w::string s1("hello world");
    w::string s2(s1);

    cout << s1.c_str() << endl;
    cout << s2.c_str() << endl;
}




int main()
{
    test_string6();
    return 0;
}

7. Summary

The length of the article is limited, and the remaining content will be explained in the next article.

Origin blog.csdn.net/weixin_69423932/article/details/132773877