C++: Simulation implementation of string

1. Preliminary instructions

Note: this article uses the C library functions strcpy, strstr, strncpy, and strlen.
I will explain what each one does and how it is used as we go.
If you want a thorough understanding of these functions,
you can take a look at my previous blog post:
Conquer the C language string functions (super detailed explanation, full of useful information)

1. Why implement the string container ourselves

For example, consider the LeetCode string addition problem.
Since implementing string ourselves is so important to us,
let us embark on the journey of implementing it.

2. The general framework we want to implement

After learning how to use the string container,
we know that it is essentially a sequence table (a dynamic array) that only stores char data,
with an implicit '\0' after the last character.
So our string container can be defined like this:

#define _CRT_SECURE_NO_WARNINGS 1
#pragma once
#include <iostream>
#include <string.h>
#include <assert.h>
using namespace std;
namespace wzs
{
	class string
	{
	public:
		//npos
		static size_t npos;
	private:
		char* _str;//the character array held by the string container
		int _size;
		int _capacity;
	};
	//static size_t string::npos = -1;//err: "static" must not be repeated on the out-of-class definition
	size_t string::npos = -1;//yes
}

The following is the framework we want to implement

#define _CRT_SECURE_NO_WARNINGS 1
#pragma once
#include <iostream>
#include <string.h>
#include <assert.h>
using namespace std;
namespace wzs
{
	class string
	{
	public:
		//npos
		static size_t npos;
		//1. constructor with a default argument
		string(const char* str = "");
		~string();
		//2. copy constructor
		string(const string& s);
		//3. copy assignment operator
		string& operator=(const string& s);
		//4. return a C-style string
		const char* c_str() const;
		//5.iterator
		typedef char* iterator;
		iterator begin();
		iterator end();
		typedef const char* const_iterator;
		const_iterator begin() const;
		const_iterator end() const;
		//6. stream insertion operator
		//friend ostream& operator<<(ostream& out, const string& s);
		//7. operator[] overloads
		char& operator[](int index);
		const char& operator[](int index) const;
		//8.size() capacity()
		const int size() const;
		const int capacity() const;
		//9. reserve: shrinking is not allowed here
		void reserve(int capacity);
		//10.push_back
		void push_back(const char c);
		//11.append
		void append(const char* str);
		//12.+=
		string& operator+=(const char c);
		string& operator+=(const char* str);
		//13.insert
		void insert(size_t pos, const char c);
		void insert(size_t pos, const char* str);
		//14.erase
		void erase(size_t pos = 0, size_t len = npos);
		//15.resize
		void resize(size_t n, char c = '\0');
		//16.swap
		void swap(string& s);
		//17.find
		size_t find(const string& s, size_t pos = 0) const;
		size_t find(const char c, size_t pos = 0) const;
		//18.substr
		string substr(size_t pos = 0, size_t len = npos)const;
		//19.clear
		void clear();
		//20.empty
		bool empty()const;
		//21.>>
		//friend istream& operator>>(istream& in, string& s);
	private:
		char* _str;
		int _size;
		int _capacity;
	};
	//static size_t string::npos = -1;//err
	size_t string::npos = -1;//yes
	ostream& operator<<(ostream& out, const string& s);
	istream& operator>>(istream& in, string& s);
}

Below we will complete these functions one by one

You will notice that I have commented out the friend declarations for stream insertion and stream extraction.
In fact, for the string class, these two operators do not need to be friends of the class.
The reasons are:
First, a friend breaks encapsulation and increases coupling, so it is best avoided when possible.
Second, the only reason to use a friend is to access private data of the class.
But for the string class,
the private data _size, _capacity and _str are all reachable through public interfaces,
so stream insertion and stream extraction only need to call those interfaces.
There is no need for a friend to access the private data of the class.
2. Default member function

1. Constructor

//1. constructor with a default argument
string(const char* str = "")
{
	//1. allocate space
	int len = strlen(str);
	_str = new char[len + 1];
	//2. copy the data
	strcpy(_str, str);
	//3. set _size and _capacity
	_size = len;
	_capacity = len;
}

What we implement here is a constructor whose parameter has a default value, so it also serves as the default constructor.
Note a few points:
1. strlen does not count the '\0' when calculating the length of a string, so when allocating space we need len+1 bytes, because the '\0' must also be stored at the end
2. Here

strcpy(_str,str);
copies the data of str, from its first element up to and including the '\0', into _str
The '\0' is copied as well!!!

For this constructor,
the steps are:
1. Allocate space
2. Copy the data
3. Set _size and _capacity
2.Copy constructor

1. Traditional writing method

Note: the _str of a string points to memory allocated on the heap.
So we need to implement a deep copy; otherwise problems such as releasing the same memory more than once will occur.

//2. copy constructor
string(const string& s)
{
	//1. allocate space
	int len = s._size;
	_str = new char[len + 1];
	//2. copy the data
	strcpy(_str, s._str);
	//3. set _size and _capacity
	_size = len;
	_capacity = len;
}

Basically the same as the constructor just now

2. Modern writing style

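string(const string& s)
	:_str(nullptr)
	,_size(0)
	,_capacity(0)
{
	string tmp(s._str);
	swap(tmp);//this->swap(tmp)
}

Use s._str to call the constructor and build tmp,
then swap tmp with *this.
When tmp goes out of scope, its destructor releases whatever *this held before the swap,
which is why _str is first initialized to nullptr (delete[] on nullptr does nothing).

Very clever.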

3. Destructor

~string()
{
	//release the space
	delete[] _str;
	_str=nullptr;
	//reset _size and _capacity
	_size = 0;
	_capacity = 0;
}

1. Release the space
2. Reset _size and _capacity

4. Assignment operator overloading

1. Traditional writing method

//3. copy assignment operator
string& operator=(const string& s)
{
	if (this != &s)//guards against the redundant work of assigning an object to itself
	{
		//1. allocate a new block
		int len = s._size;
		char* tmp = new char[len + 1];
		//2. copy the data of s._str into tmp
		strcpy(tmp, s._str);
		//3. release the original space
		delete[] _str;
		_str = tmp;
		tmp = nullptr;
		//4. set _size and _capacity
		_size = len;
		_capacity = len;
	}
	return *this;
}

The steps are:
1. Allocate a new block
2. Copy the data
3. Release the original space
4. Set _size and _capacity

2. Modern writing style

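//modern style 1
string& operator = (const string& s)
{
	if (this != &s)
	{
		//call the copy constructor, then swap
		string tmp(s);
		swap(tmp);
	}
	return *this;
}
//modern style 2
//take the argument by value, then swap
//(define only one of the two versions: with both present, s1 = s2 is ambiguous)
string& operator = (string s)
{
	swap(s);
	return *this;
}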

3. Traversal and access

1.operator[] operator overloading

//7. operator[] overloads
//as in the STL, there are two overloads:
//one is not const-qualified,
//the other is const-qualified
char& operator[](int index)
{
	//the STL string allows [] to access the final '\0'
	assert(index >= 0 && index <= _size);
	return _str[index];
}
const char& operator[](int index) const
{
	//the STL string allows [] to access the final '\0'
	assert(index >= 0 && index <= _size);
	return _str[index];
}

Just return a reference to the character at the given position.

2.iterator iterator

We covered iterators at the end of "C++ takes you through the use of string containers",
so there is no need to go into details here.

As in the STL, the iterator comes in two versions: const and non-const
//5.iterator
typedef char* iterator;
iterator begin()
{
	return _str;
}
iterator end()
{
	return _str + _size;
}
typedef const char* const_iterator;
const_iterator begin() const
{
	return _str;
}
const_iterator end() const
{
	return _str + _size;
}

4. Capacity related functions

1.size,capacity

Because size and capacity are read-only from outside the class, both functions are declared const.

//8.size() capacity()
const int size() const
{
	return _size;
}
const int capacity() const
{
	return _capacity;
}

2.reserve

The version we implement here does not allow shrinking.
Whether reserve may shrink the capacity depends on the specific library implementation.
This was also mentioned in our blog post: C++ takes you through the use of string containers

//9. reserve: shrinking is not allowed here
void reserve(int capacity)
{
	//only grow when growth is actually needed
	if (_capacity < capacity)
	{
		//1. allocate a new block (+1 for the '\0')
		char* tmp = new char[capacity + 1];
		//2. copy the existing data into the new block
		strcpy(tmp, _str);
		//3. release the original space
		delete[] _str;
		//4. point _str at the new block
		_str = tmp;
		tmp = nullptr;
		//5. update _capacity
		_capacity = capacity;
	}
}

1. Open up a new space
2. Copy the original data to the new space
3. Release the original space
4. Point _str to the new space
5. Modify _capacity

3.resize

Here I reuse erase and push_back.
You can read this resize last: no other interface calls it, so it does no harm to leave it until the end.

//15.resize
//the character has the default argument '\0', which merges the two library versions into one
void resize(size_t n, char c = '\0')
{
	//1. n<_size  :  erase the extra characters; _size changes, _capacity does not
	if (n < _size)
	{
		erase(n);//erase updates _size
	}
	//2. _size<=n<=_capacity  :  append c until _size == n
	else if (n <= _capacity)
	{
		//append
		while (_size < n)
		{
			push_back(c);//push_back updates _size
		}
	}
	//3. n>_capacity:  reserve is needed
	else
	{
		//grow the capacity to n
		reserve(n);//reserve updates _capacity
		//append
		while (_size < n)
		{
			push_back(c);//push_back updates _size
		}
	}
}

There are three cases:
1. n < _size: erase the extra characters
2. _size <= n <= _capacity: append c until _size == n
3. n > _capacity: reserve first, then append c until _size == n
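
To make the cases concrete, here is a small sketch that uses std::string rather than our class, so it can be run on its own. The helper name resized_copy is an assumption for illustration. Cases 2 and 3 give the same visible result; they differ only in whether the capacity grows, which is implementation-specific for std::string and therefore not checked here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of s resized to n characters, padded with c.
std::string resized_copy(std::string s, std::size_t n, char c = '\0')
{
	s.resize(n, c);   // shrinks when n < size(), pads with c when n > size()
	return s;
}
```

For instance, resizing "hello" to 3 keeps "hel", and resizing it to 7 with 'x' gives "helloxx".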

5. Tail insertion

1.push_back

//10.push_back
void push_back(const char c)
{
	//1. if the capacity is not enough: expand
	if(_capacity==_size)
	{
		int newcapacity = _capacity == 0 ? 4 : _capacity * 2;
		reserve(newcapacity);
	}
	//2. append at the tail
	_str[_size] = c;
	//3. update _size
	_size++;
	//4. do not forget the '\0' at the end
	_str[_size] = '\0';
}

1. If the capacity is not enough: expand the capacity
2. Append at the tail
3. Update _size
4. Do not forget the '\0' at the end

2.append

//11.append
void append(const char* str)
{
	//1. if the capacity is not enough: expand
	int len = strlen(str);
	int newcapacity = _size + len;
	//if the capacity is sufficient, reserve does nothing (see the reserve function)
	reserve(newcapacity);
	//2. copy the data
	strcpy(_str + _size, str);
	//3. update _size
	_size += len;
}

1. If the capacity is not enough: expand the capacity
2. Copy the data
3. Update _size

Note:
_str+_size points to the end of _str (the '\0'),
so strcpy starts copying at the end of _str and overwrites that '\0'.

Here comes the question:
Can I use strcat?
It is not recommended,
because strcat first traverses the target string to find the '\0'
and only then copies.
That adds the cost of traversing the target string,
whereas we find the end of the target string (the '\0') directly with _str+_size.

It can be seen that once we understand how library functions work underneath, we can eliminate a lot of unnecessary work.
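
The comparison can be sketched on a plain char buffer. The helper append_at_end below is hypothetical and assumes the caller already knows the current length and has left enough room in the buffer; it produces the same result as strcat without the scan for '\0'.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: appends src to buf, whose current length `size` is already known.
// buf must have room for size + strlen(src) + 1 bytes. Returns the new length.
std::size_t append_at_end(char* buf, std::size_t size, const char* src)
{
	std::size_t len = std::strlen(src);
	std::strcpy(buf + size, src);   // starts writing at the old '\0', no scan of buf
	return size + len;
}
```

The saving grows with the length of the target string, since strcat has to walk all of it on every call.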

3.operator+= operator overloading

Just reuse push_back and append
There are also two versions

//12.+=
string& operator+=(const char c)
{
	push_back(c);
	return *this;
}
string& operator+=(const char* str)
{
	append(str);
	return *this;
}

6. Insert and delete at specified position

1.insert

1.Insert one character version

Note: there is a pitfall below.
Take a look at the code below. What is wrong with it?
Hints:
1. The error is in the condition of the while loop
2. The error only shows up when inserting at the head (pos == 0)

void insert(size_t pos, const char c)
{
	assert(pos >= 0 && pos <= _size);
	//1. expand if the capacity is not enough
	if (_capacity == _size)
	{
		int newcapacity = _capacity == 0 ? 4 : _capacity * 2;
		reserve(newcapacity);
	}
	//2. move the data in [pos,_size] one position back
	//the data at _size (the '\0') is moved as well so that
	//we do not have to add the '\0' at the end separately afterwards,
	//because the '\0' has already been moved to the end
	size_t end = _size;
	while (end >= pos)
	{
		_str[end + 1] = _str[end];
		end--;
	}
	//3. insert character c at pos
	_str[pos]=c;
	//4. update _size
	_size++;
}

1. If the capacity is not enough, expand it
2. Move the data of [pos,_size] back
3. Insert character c at position pos
4. Update _size

Let's step through it in the debugger.
At first, watch how _str changes (to understand how the data is moved back).
Then, when end is about to reach 0, watch how end changes.
Because end is of type size_t (an unsigned integer),
it is always greater than or equal to 0.
When pos==0, while(end>=pos) is always true, so the code falls into an infinite loop.
So what should we do?
The first solution: an explicit cast

//shift the data in [pos,_size] to the right
//solution 1: explicit cast
size_t end = _size;
while ((int)end >= (int)pos)
{
	_str[end + 1] = _str[end];
	end--;
}

insert now runs normally.
end still decrements past 0 and wraps around to a value of more than 4.2 billion (on a 32-bit build),
but in the while condition end is cast to int (a temporary of type int is created, and that temporary is used in the comparison).
So the comparison actually performed is while(-1>=0), and the loop exits.

But there is one subtle point:
what if I write it like this? Will it run correctly?

int end = _size;
while (end >= pos)
{
	_str[end + 1] = _str[end];
	end--;
}

The answer is: it still does not work.
When the int end is compared with the size_t pos,
the int is implicitly converted to a temporary of type size_t, and that temporary is compared with pos.
That is, -1, -2, -3... converted to size_t are compared with pos.
We all know that -1, -2 and -3 converted to size_t are the largest values of the unsigned type.
They are certainly not smaller than pos.
So it is still an infinite loop.
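
The implicit conversion can be spelled out explicitly. In this sketch compare_as_unsigned writes down what the mixed comparison effectively does, and compare_as_signed shows one way to get the intended result; both names are assumptions for illustration, and the signed version assumes pos fits in a long long.

```cpp
#include <cassert>
#include <cstddef>

// What "int end >= size_t pos" effectively does: end is converted to size_t first.
bool compare_as_unsigned(int end, std::size_t pos)
{
	return static_cast<std::size_t>(end) >= pos;
}

// Comparing in a signed type gives the intended result (assumes pos fits in a long long).
bool compare_as_signed(int end, std::size_t pos)
{
	return static_cast<long long>(end) >= static_cast<long long>(pos);
}
```

Many compilers can warn about the mixed comparison (for example with a sign-compare warning enabled), which is a cheap way to catch this pitfall early.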

Is a cast the only option?
Of course not; all roads lead to Rome.
You can also move the data this way:

//13.insert
void insert(size_t pos, const char c)
{
	assert(pos >= 0 && pos <= _size);
	if (_capacity == _size)
	{
		int newcapacity = _capacity == 0 ? 4 : _capacity * 2;
		reserve(newcapacity);
	}
	//shift the data in [pos,_size] to the right
	//solution 2: end is the destination index, so it never has to drop below pos
	size_t end = _size + 1;
	while (end > pos)
	{
		_str[end] = _str[end - 1];
		end--;
	}
	_str[pos] = c;
	_size++;
}

Now, by the time end has dropped to pos (0 for a head insertion), all the data has been moved,
and the while loop exits normally.
Here we will use the second solution;
the first solution works as well.
2. Insert a string version

Similar to inserting a character,
except that the data moves by the length of the string instead of by 1.

void insert(size_t pos, const char* str)
{
	assert(pos >= 0 && pos <= _size);
	//1. expand if the capacity is not enough
	int len = strlen(str);
	int newcapacity = _size + len;
	reserve(newcapacity);
	//2. move the data in [pos,_size] back by len positions
	int end = _size + 1;
	while (end > pos)
	{
		_str[end + len - 1] = _str[end - 1];
		end--;
	}
	//3. insert the first len characters of the string at pos (that is, without the '\0')
	strncpy(_str + pos, str, len);//use strncpy, not strcpy,
	//because we do not want to copy the '\0',
	//and strcpy only stops after copying the '\0'
	//4. update _size
	_size += len;
}

1. If the capacity is not enough, expand it
2. Move the data of [pos,_size] back by len positions
3. Insert the first len characters of the string at position pos (that is, do not insert the '\0')
4. Update _size

strncpy copies a specified number n of characters
instead of always copying through the '\0' like strcpy
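
Here is a standalone sketch of the same idea on a plain char buffer. Note that it swaps the hand-written shifting loop for memmove, which is a different technique from the one used above; insert_c_string is a made-up name, and the caller is assumed to have reserved enough room.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: inserts str at pos in a buffer holding `size` characters plus '\0'.
// buf must have room for size + strlen(str) + 1 bytes. Returns the new size.
std::size_t insert_c_string(char* buf, std::size_t size, std::size_t pos, const char* str)
{
	assert(pos <= size);
	std::size_t len = std::strlen(str);
	// Shift [pos, size] (terminator included) right by len; memmove handles the overlap.
	std::memmove(buf + pos + len, buf + pos, size - pos + 1);
	// Copy exactly len characters, so the '\0' of str is NOT written into the middle.
	std::strncpy(buf + pos, str, len);
	return size + len;
}
```

Had strcpy been used for the last step, its trailing '\0' would have cut the string off right after the inserted text.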

2.erase

//14.erase
void erase(size_t pos = 0, size_t len = npos)
{
	assert(pos >= 0 && pos <= _size);
	//1. compare the length to erase with the length of [pos,_size)
	//if the length to erase is greater than or equal to the length of [pos,_size),
	//just set position pos to '\0' and update _size
	if (len >= _size - pos)
	{
		_str[pos] = '\0';
		_size = pos;
	}
	//if the length to erase is less than the length of [pos,_size),
	//move the data in [pos+len,_size] forward by len positions to overwrite the erased characters,
	//then update _size
	//(the two ranges overlap; strcpy is not specified for overlapping ranges, so memmove is the safer choice)
	else
	{
		strcpy(_str + pos, _str + pos + len);
		_size -= len;
	}
}

1. Compare the length to be erased with the length of [pos,_size)
2. If the length to be erased is greater than or equal to the length of [pos,_size),
just set position pos to '\0' and update _size
3. If the length to be erased is less than the length of [pos,_size),
move the data of [pos+len,_size] forward by len positions to overwrite the characters we want to erase,
and update _size
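
Because the source and destination ranges overlap in step 3, a variant based on memmove is worth sketching. This is an alternative to the strcpy call used in the article, not the article's own code; erase_range is a made-up name and works on a plain char buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: erases up to len characters starting at pos from a buffer
// holding `size` characters plus '\0'. Returns the new size.
std::size_t erase_range(char* buf, std::size_t size, std::size_t pos, std::size_t len)
{
	assert(pos <= size);
	if (len >= size - pos)
	{
		buf[pos] = '\0';   // drop everything from pos onward
		return pos;
	}
	// Move the tail [pos + len, size] (terminator included) forward by len.
	std::memmove(buf + pos, buf + pos + len, size - pos - len + 1);
	return size - len;
}
```

memmove is specified to work for overlapping ranges, so the result does not depend on how the library happens to implement strcpy.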

Note: the condition in the if statement must be len>=_size-pos,
not len+pos>=_size,
because if len is npos,
len+pos overflows
and unexpected errors occur.
For example, with pos == 2 and len == npos:
len+pos becomes 1 after overflowing,
while pos is 2.
Therefore:
strcpy(_str + pos, _str + pos + len);
copies the data starting at position 1 to position 2 until it encounters '\0'.
That is absurd, which is why len>=_size-pos is used here.

With len>=_size-pos,
because len itself is npos,
the if branch is taken instead of the else branch,
so the strcpy that produced all kinds of strange errors is never executed.
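
The two conditions can be compared side by side in a few lines. In this sketch kNpos stands in for npos, and both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// npos modeled the same way as in the article: the largest size_t value.
const std::size_t kNpos = static_cast<std::size_t>(-1);

// Overflow-prone form: len + pos wraps around when len is kNpos.
bool erases_to_end_wrong(std::size_t size, std::size_t pos, std::size_t len)
{
	return len + pos >= size;
}

// Safe form: size - pos cannot wrap as long as pos <= size.
bool erases_to_end_right(std::size_t size, std::size_t pos, std::size_t len)
{
	return len >= size - pos;
}
```

For an 11-character string with pos == 2 and len == kNpos, the first form says "do not erase to the end" and the second says "erase to the end", which is the intended answer.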

7. Search, exchange, interception operations

1.find

As in the STL, we implement two versions:
one searches for a string and one searches for a character.

//17.find
size_t find(const char c, size_t pos = 0) const
{
	assert(pos >= 0 && pos < _size);
	for (size_t i = pos; i < _size; i++)
	{
		if (_str[i] == c)
		{
			return i;
		}
	}
	return npos;
}

Searching for a character is very simple: just traverse once.

To search for a string, we use the strstr substring matching function.

size_t find(const string& s, size_t pos = 0) const
{
	assert(pos >= 0 && pos < _size);
	char* index = strstr(_str + pos, s._str);
	if (index == nullptr)
	{
		return npos;
	}
	else
	{
		return index - _str;
	}
}

The only things to watch for:
strstr returns a null pointer when nothing is found, so an if check is needed;
when a match is found,
just return index-_str (pointer subtraction is basic C knowledge).
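
The pointer arithmetic can be tried outside the class. In this sketch find_from and kNotFound are made-up names; kNotFound plays the role of npos.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t kNotFound = static_cast<std::size_t>(-1);  // plays the role of npos

// Hypothetical helper: index of the first occurrence of needle in text at or after pos.
std::size_t find_from(const char* text, const char* needle, std::size_t pos)
{
	assert(pos <= std::strlen(text));
	const char* hit = std::strstr(text + pos, needle);
	if (hit == nullptr)
	{
		return kNotFound;
	}
	return static_cast<std::size_t>(hit - text);  // pointer difference = index
}
```

Note that the subtraction uses the start of the whole string, not text + pos, so the returned index is relative to the beginning of the string.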

2.swap

//16.swap
void swap(string& s)
{
	//these calls go to the swap function of the standard library (std)
	std::swap(_str, s._str);
	std::swap(_size, s._size);
	std::swap(_capacity, s._capacity);
}

The swap implemented here simply
exchanges the pointers and exchanges _size and _capacity.
The advantage is that no string data has to be copied; only the pointers are exchanged, which makes swap very cheap.
It can be seen that this design is very clever.

3.substr

Please take a look at this code and see whether anything is wrong with it.

string& substr(size_t pos = 0, size_t len = npos)const
{
	assert(pos >= 0 && pos < _size);
	string s;
	//1. compare the requested length with the length of [pos,_size)
	//if the requested length is greater than or equal to it, clamp len so the substring is easy to build
	if (len >= _size - pos)
	{
		len = _size - pos;
	}
	//2. append the characters to form the substring
	for (size_t i = 0; i < len; i++)
	{
		s += _str[i + pos];
	}
	//3. return the substring
	return s;
}

1. Compare the requested length with the length of [pos,_size); if it is larger, clamp len
2. Append the characters to form the substring
3. Return the substring

In fact there is a problem with this code. It is not in our logic
but in using a reference as the return type.

Our string s is a local variable of substr.
It is destroyed automatically (its destructor runs) when it goes out of scope.
But because we return a reference,
the expression s1.substr(1,7) refers to the s that has just been destroyed,
and this destroyed string is then used to copy-construct s2.
So during the copy construction
s._str is read, which the destructor has already set to nullptr, and a null pointer is dereferenced.
Hence the error.

What should we do?
Just change the return type from a reference to a value
and it will be fine.
The prerequisite is that your copy constructor and copy assignment operator perform deep copies rather than shallow copies.
Otherwise the same memory gets released more than once.
To demonstrate, I will remove the copy constructor,
so that the copy constructor generated by the compiler is used (it performs a shallow copy).
Although the return is now by value, the copy is shallow.
In other words, the pointer that comes back still points to memory that has already been released.
Therefore, when s2 is destroyed, that memory is released again.

That is, the same memory is released more than once,
which is a wild (dangling) pointer problem.

Next I restore the copy constructor.

string substr(size_t pos = 0, size_t len = npos)const
{
	assert(pos >= 0 && pos < _size);
	string s;
	//1. compare the requested length with the length of [pos,_size)
	//if the requested length is greater than or equal to it, clamp len so the substring is easy to build
	if (len >= _size - pos)
	{
		len = _size - pos;
	}
	//2. append the characters to form the substring
	for (size_t i = 0; i < len; i++)
	{
		s += _str[i + pos];
	}
	//3. return the substring
	return s;
}

It now runs normally.
The whole process is:
the returned s is copy-constructed into a temporary object,
that temporary is returned,
and then the temporary is used to copy-construct s2.

Further reading on references:
Introduction to C++ - References
Further reading on deep and shallow copies:
C++ classes and objects in detail (constructor, destructor, copy constructor)

8. Several small functions

1.c_str

//4. return a C-style string
const char* c_str() const
{
	return _str;
}

2.clear

//19.clear
void clear()
{
	_str[0] = '\0';//why set the first character to '\0'?
	//to guarantee that
	//printing the object with cout<<c_str() gives the same result as printing it with stream insertion,
	//because cout<<c_str() only stops at '\0',
	//whereas stream insertion prints according to size
	_size = 0;
}

3.empty

//20.empty
bool empty()const
{
	return _size == 0;
}

9. Stream insertion and stream extraction

We met stream insertion and stream extraction before, when writing the date class.
They cannot be overloaded as member functions, because the this pointer would occupy the first parameter position.
So they need to be defined as global functions.
But there is no need for friends, because the public interfaces give access to the data in the class.

1.<<Stream insertion

Print according to _size

ostream& operator<<(ostream& out, const string& s)
{
	for (int i = 0; i < s.size(); i++)
	{
		out << s[i];
	}
	return out;
}

The difference between << and c_str(): << prints size characters regardless of any \0, while printing c_str() stops at the first \0.
Therefore it is recommended to use <<
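
The difference shows up as soon as a string contains an embedded '\0'. The sketch below uses std::string and an output string stream so that it runs on its own; both helper names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Returns what operator<< writes for s: all size() characters, embedded '\0' included.
std::string written_by_stream(const std::string& s)
{
	std::ostringstream out;
	out << s;
	return out.str();
}

// Returns how many characters a C-style print of s.c_str() would show.
std::size_t visible_through_c_str(const std::string& s)
{
	return std::strlen(s.c_str());
}
```

For a five-character string built from "ab\0cd", the stream receives all five characters, while a C-style print stops after "ab".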

2.>>Stream extraction

cin >>, like scanf, treats ' ' and '\n' as separators and skips them, so those characters can never be read with >>.
Therefore, we need to read character by character.

So use the get() function to read characters.
First version:

istream& operator>>(istream& in, string& s)
{
	//note: stream extraction overwrites the existing data,
	//so clear first
	s.clear();
	char ch = in.get();
	//in becomes false at end of input, which also ends the loop
	while (in && ch != ' ' && ch != '\n')
	{
		s += ch;
		ch = in.get();
	}
	return in;
}

But the first version has a flaw:
when the input is particularly long,
s += ch expands the capacity frequently, which is wasteful.
Therefore we can set up a "buffer"-like array.
Second version:

istream& operator>>(istream& in, string& s)
{
	s.clear();
	char tmp[128] = { '\0' };
	char ch = in.get();
	size_t index = 0;
	//in becomes false at end of input, which also ends the loop
	while (in && ch != ' ' && ch != '\n')
	{
		//flush at 127 so that tmp[127] always stays '\0'
		if(index == 127)
		{
			s += tmp;
			index = 0;
		}
		tmp[index++] = ch;
		ch = in.get();
	}
	//flush whatever is left in the buffer
	if (index > 0)
	{
		tmp[index] = '\0';
		s += tmp;
	}
	return in;
}
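
For comparison, here is a standalone sketch of the buffered approach written against std::string. It goes slightly beyond the versions above: it skips leading whitespace and treats end of input as the end of the token, which are assumptions added here and not part of the article's code. The name read_token is made up.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token, in the spirit of operator>>.
// Assumptions beyond the article: leading whitespace is skipped and EOF ends the token.
std::string read_token(std::istream& in)
{
	std::string token;
	char buffer[128];
	std::size_t index = 0;
	int ch = in.get();                       // int, so EOF can be told apart from a char
	while (ch != EOF && std::isspace(ch))    // skip leading whitespace
	{
		ch = in.get();
	}
	while (ch != EOF && !std::isspace(ch))
	{
		if (index == sizeof(buffer))         // buffer full: flush it in one append
		{
			token.append(buffer, index);
			index = 0;
		}
		buffer[index++] = static_cast<char>(ch);
		ch = in.get();
	}
	token.append(buffer, index);             // flush whatever is left (possibly nothing)
	return token;
}
```

Appending with an explicit length means the buffer does not need a trailing '\0', so all 128 bytes can hold data.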

10. Complete code

1.my_string.h:

#define _CRT_SECURE_NO_WARNINGS 1
#pragma once
#include <string>
#include <string.h>
#include <iostream>
#include <assert.h>
using namespace std;
//small functions are defined in the .h file so that they become inline functions
namespace wzs
{
	class string
	{
	public:
		//npos
		static size_t npos;
		//1. constructor with a default argument
		string(const char* str = "");
		~string();
		//2. copy constructor
		string(const string& s);
		//18.substr
		string substr(size_t pos = 0, size_t len = npos)const;
		//3. copy assignment operator
		string& operator=(string s);
		//4. return a C-style string
		const char* c_str() const
		{
			return _str;
		}
		//5.iterator
		typedef char* iterator;
		iterator begin()
		{
			return _str;
		}
		iterator end()
		{
			return _str + _size;
		}
		typedef const char* const_iterator;
		const_iterator begin() const
		{
			return _str;
		}
		const_iterator end() const
		{
			return _str + _size;
		}
		//6. stream insertion operator
		//friend ostream& operator<<(ostream& out, const string& s);
		//7. operator[] overloads
		char& operator[](int index);
		const char& operator[](int index) const;
		//8.size() capacity()
		const int size() const
		{
			return _size;
		}
		const int capacity() const
		{
			return _capacity;
		}
		//9. reserve: shrinking is not allowed here
		void reserve(int capacity);
		//10.push_back
		void push_back(const char c);
		//11.append
		void append(const char* str);
		//12.+=
		string& operator+=(const char c);
		string& operator+=(const char* str);
		//13.insert
		void insert(size_t pos, const char c);
		void insert(size_t pos, const char* str);
		//14.erase
		void erase(size_t pos = 0, size_t len = npos);
		//15.resize
		void resize(size_t n, char c = '\0');
		//16.swap
		void swap(string& s);
		//17.find
		size_t find(const string& s, size_t pos = 0) const;
		size_t find(const char c, size_t pos = 0) const;
		//19.clear
		void clear()
		{
			_str[0] = '\0';
			_size = 0;
		}
		//20.empty
		bool empty()const
		{
			return _size == 0;
		}
		//21.>>
		//friend istream& operator>>(istream& in, string& s);
	private:
		char* _str;
		int _size;
		int _capacity;
	};
	ostream& operator<<(ostream& out, const string& s);
	istream& operator>>(istream& in, string& s);
	void test1();
	void test2();
	void test3();
	void test4();
	void test5();
	void test6();
	void test7();
	void test8();
	void test9();
	void test10();
	void test11();
	void test12();
	void test13();
	void test14();
	void test15();
	void test16();
	void test17();
}

2.my_string.cpp

#include "my_string.h"
namespace wzs
{
		//1. constructor with a default argument
		string::string(const char* str)
		{
			int len = strlen(str);
			_str = new char[len + 1];
			strcpy(_str, str);
			_size = len;
			_capacity = len;
		}
		string::~string()
		{
			delete[] _str;
			_str = nullptr;
			_size = 0;
			_capacity = 0;
		}
		//2. copy constructor
		/*
		string(const string& s)
		{
			int len = s._size;
			_str = new char[len + 1];
			strcpy(_str, s._str);
			_size = len;
			_capacity = len;
		}
		*/
		//modern style of the copy constructor
		//it misbehaves on some compilers:
		//if _str is not initialized to nullptr,
		//we are in trouble, because tmp's destructor runs when this constructor returns
		//so give the members default values or use the initializer list so that _str starts as nullptr
		string::string(const string& s)
			:_str(nullptr)
			, _size(0)
			, _capacity(0)
		{
			//construct first
			string tmp(s.c_str());
			//then swap
			swap(tmp);
		}
		//18.substr
		string string::substr(size_t pos, size_t len)const
		{
			assert(pos >= 0 && pos < _size);
			string s;
			if (len >= _size - pos)
			{
				len = _size - pos;
			}
			for (size_t i = 0; i < len; i++)
			{
				s += _str[i + pos];
			}
			return s;
		}
		//3. copy assignment operator
		/*
		string& operator=(const string& s)
		{
			if (this != &s)
			{
				int len = s._size;
				char* tmp = new char[len + 1];
				strcpy(tmp, s._str);
				delete[] _str;
				_str = tmp;
				tmp = nullptr;
				_size = len;
				_capacity = len;
			}
			return *this;
		}
		*/
		//modern style 1 of the copy assignment operator
		/*
		string& operator=(const string& s)
		{
			if (this != &s)
			{
				//copy-construct first
				string tmp(s);
				//then swap
				swap(tmp);
			}
			return *this;
		}
		*/
		//modern style 2 of the copy assignment operator
		//pass the argument by value
		//passing by value  ->  the copy constructor builds the parameter
		string& string::operator=(string s)
		{
			//no need to check for self-assignment:
			//the copy has already been made, so there is nothing left to save
			swap(s);
			//the memory *this used to own is released by the parameter s
			return *this;
		}
		//7. operator[] overloads
		char& string::operator[](int index)
		{
			//the STL string allows [] to access the final '\0'
			assert(index >= 0 && index <= _size);
			return _str[index];
		}
		const char& string::operator[](int index) const
		{
			//the STL string allows [] to access the final '\0'
			assert(index >= 0 && index <= _size);
			return _str[index];
		}
		//9. reserve: shrinking is not allowed here
		void string::reserve(int capacity)
		{
			if (_capacity < capacity)
			{
				char* tmp = new char[capacity + 1];
				strcpy(tmp, _str);
				delete[] _str;
				_str = tmp;
				tmp = nullptr;
				_capacity = capacity;
			}
		}
		//10.push_back
		void string::push_back(const char c)
		{
			if (_capacity == _size)
			{
				int newcapacity = _capacity == 0 ? 4 : _capacity * 2;
				reserve(newcapacity);
			}
			_str[_size] = c;
			_size++;
			_str[_size] = '\0';
		}
		//11.append
		void string::append(const char* str)
		{
			int len = strlen(str);
			int newcapacity = _size + len;
			reserve(newcapacity);
			strcpy(_str + _size, str);
			_size += len;
		}
		//12.+=
		string& string::operator+=(const char c)
		{
			push_back(c);
			return *this;
		}
		string& string::operator+=(const char* str)
		{
			append(str);
			return *this;
		}
		//13.insert
		void string::insert(size_t pos, const char c)
		{
			assert(pos >= 0 && pos <= _size);
			if (_capacity == _size)
			{
				int newcapacity = _capacity == 0 ? 4 : _capacity * 2;
				reserve(newcapacity);
			}
			//shift the data in [pos,_size] to the right
			//buggy version: infinite loop when pos == 0
			//size_t end = _size;
			//while (end >= pos)
			//{
			//	_str[end + 1] = _str[end];
			//	end--;
			//}
			//solution 1: explicit cast
			//size_t end = _size;
			//while ((int)end >= (int)pos)
			//{
			//	_str[end + 1] = _str[end];
			//	end--;
			//}

			//solution 2
			size_t end = _size + 1;
			while (end > pos)
			{
				_str[end] = _str[end - 1];
				end--;
			}

			_str[pos] = c;
			_size++;
		}
		void string::insert(size_t pos, const char* str)
		{
			assert(pos >= 0 && pos <= _size);
			int len = strlen(str);
			int newcapacity = _size + len;
			reserve(newcapacity);
			//shift the data in [pos,_size] to the right by len positions
			int end = _size + 1;
			while (end > pos)
			{
				_str[end + len - 1] = _str[end - 1];
				end--;
			}
			strncpy(_str + pos, str, len);//use strncpy, not strcpy,
			//because we do not want to copy the '\0',
			//and strcpy only stops after copying the '\0'
			_size += len;
		}
		//14.erase
		void string::erase(size_t pos, size_t len)
		{
			assert(pos >= 0 && pos <= _size);
			//the length to erase is greater than or equal to the length of [pos,_size)
			//(do not write: if (len + pos >= _size), which overflows when len is npos)
			if (len >= _size - pos)
			{
				_str[pos] = '\0';
				_size = pos;
			}
			//move the data in [pos+len,_size] forward by len positions
			else
			{
				strcpy(_str + pos, _str + pos + len);
				_size -= len;
			}
		}
		//15.resize
		void string::resize(size_t n, char c)
		{
    
    
			//1.n<_size  :  删除多余字符,修改_size,但不修改_capacity
			if (n < _size)
			{
    
    
				erase(n);//erase负责修改_size
			}
			//2._size<=n<=_capacity  :  尾插字符c直到_size == n
			else if (n <= _capacity)
			{
    
    
				while (_size < n)
				{
    
    
					push_back(c);//push_back负责修改_size
				}
			}
			//3.n>_capacity:  需要reserve
			else
			{
    
    
				reserve(n);//reserve负责修改_capacity
				while (_size < n)
				{
    
    
					push_back(c);//push_back负责修改_size
				}
			}
		}
		//16.swap
		void string::swap(string& s)
		{
    
    
			std::swap(_str, s._str);
			std::swap(_size, s._size);
			std::swap(_capacity, s._capacity);
		}
		//17.find
		size_t string::find(const string& s, size_t pos) const
		{
    
    
			assert(pos >= 0 && pos < _size);
			char* index = strstr(_str + pos, s._str);
			if (index == nullptr)
			{
    
    
				return npos;
			}
			else
			{
    
    
				return index - _str;
			}
		}
		size_t string::find(const char c, size_t pos) const
		{
    
    
			assert(pos >= 0 && pos < _size);
			for (size_t i = pos; i < _size; i++)
			{
    
    
				if (_str[i] == c)
				{
    
    
					return i;
				}
			}
			return npos;
		}
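	//static size_t string::npos = -1;//error: "static" must not be repeated on the out-of-class definition
	size_t string::npos = -1;//ok: -1 converts to the largest size_t value
	ostream& operator<<(ostream& out, const string& s)
	{
		//print exactly size() characters, so an embedded '\0' does not cut the output short
		for (size_t i = 0; i < s.size(); i++)
		{
			out << s[i];
		}
		return out;
	}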
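	istream& operator>>(istream& in, string& s)
	{
		//the library's stream extraction overwrites the old content of the string
		//(run it and see), so we must clear() first
		s.clear();//clear normally does not release memory: it sets the first character to '\0' and size to 0
		//if clear did not set the first character to '\0',
		//operator<< and c_str() would show different things,
		//because << prints size() characters while c_str() is printed up to the first '\0'
		char tmp[128] = { '\0' };
		//read with get(), a member function of istream, so that we read from the stream we were given;
		//C's getchar() always reads stdin
		//unlike "in >> ch", get() does not skip ' ' and '\n'
		char ch = '\0';
		size_t index = 0;//index is the position in tmp where the next character goes
		//stop at ' ' or '\n' (getline would stop only at '\n')
		//get(ch) fails at end of input; testing it keeps the loop from spinning forever there
		//note: unlike the library version, leading whitespace is not skipped
		while (in.get(ch) && ch != ' ' && ch != '\n')
		{
			//when index reaches 127, append tmp to s and start over,
			//because the last element of tmp must always stay '\0'
			if (index == 127)
			{
				s += tmp;
				index = 0;
			}
			tmp[index++] = ch;
		}
		if (index > 0)//append whatever is left in tmp
		{
			tmp[index] = '\0';
			s += tmp;
		}
		return in;
	}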
}

3.test.cpp

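#include "my_string.h"
#include <string>//std::string, std::to_string and stoi are used by test3 and test17
namespace wzs
{
	void test1()
	{
		const string s1("hello world");
		string s2;//default constructor: empty string
		cout << s1.c_str() << endl;
		cout << s2.c_str() << endl;
		string s3(s1);//copy constructor
		cout << s3.c_str() << endl;
		s2 = s1;//copy assignment
		cout << s2.c_str() << endl;
	}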
	void test2()
	{
		//string s2 = "hello iterator";
		//string::iterator it = s2.begin();
		//while (it != s2.end())//note: iterators cover the half-open range [begin,end)
		//{
		//	cout << *it << " ";//for now, think of this as dereferencing a pointer
		//	it++;//for now, think of this as incrementing (advancing) a pointer
		//}
		//cout << endl;
		//cout << s2 << endl;
		string s2 = "hello iterator";
		string::iterator it = s2.begin();
		while (it != s2.end())//note: iterators cover the half-open range [begin,end)
		{
			*it += 1;//(*it)++ also works, but do not forget the parentheses (operator precedence)
			cout << *it << " ";//for now, think of this as dereferencing a pointer
			it++;//for now, think of this as incrementing (advancing) a pointer
		}
		cout << endl;
		cout << s2 << endl;
	}
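	void test3()
	{
		//with std::string, s[s.size()] is allowed and refers to the terminating '\0'
		std::string s("hello world");
		char c = s[s.size()];

		cout << s[s.size()] << endl;
	}
	void test4()
	{
		string s1("hello world");
		s1.reserve(100);//capacity grows to at least 100; the content is unchanged

	}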
	void test5()
	{
		string s1;
		size_t old_capacity = s1.capacity();
		cout << old_capacity << endl;
		cout << s1 << endl;
		for (int i = 0; i < 100; i++)
		{
			s1.push_back('w');//append the character 'w' to s1
			if (old_capacity != s1.capacity())//print the capacity every time it grows
			{
				cout << s1.capacity() << endl;
				old_capacity = s1.capacity();
			}
		}
		cout << s1 << endl;
	}
	void test6()
	{
		string s("hello world");
		/*s.push_back('2');
		s.push_back('3');
		s.append(" 1124");*/
		s += "2";
		s += '3';
		s += " 1124";
		cout << s << endl;
	}
	void test7()
	{
		string s("[hello world]");
		s.insert(0, 'w');//insert the character 'w' at position 0 (the front)
		cout << s << endl;
	}
	void test8()
	{
		string s("0123456789");
		s.erase(3, 4);//delete 4 characters starting at index 3, i.e. "3456"
		cout << s << endl;
		s.erase(2);//with the default length, delete everything from index 2 on
		cout << s << endl;
		s.erase();//with both defaults, delete everything
		cout << s << endl;
	}
	void test9()
	{
		//string s1("hello world");
		//1.n<size
		//s1.resize(4);
		//cout << s1 << endl;
		//string s1("hello world");
		//2.size<n<capacity
		//cout << s1.capacity() << endl;
		//s1.resize(13,'q');
		//s1.resize(30, '2');
		//cout << s1;

		string s1;
		s1.resize(10, 'w');//resize an empty string: appends ten 'w'
		cout << s1 << endl;
	}
	void test10()
	{
		string s1("123");
		string s2("456");
		s1.swap(s2);
		cout << s1 << endl;
		cout << s2 << endl;
	}
	void test11()
	{
		string s1("abcd 1234 xxxx");
		size_t index1 = s1.find("cd", 1);//search for the string "cd" starting at index 1
		cout << index1 << endl;
		size_t index3 = s1.find(' ', 1);//search for the character ' ' starting at index 1
		cout << index3 << endl;

		//when nothing is found:
		size_t index4 = s1.find("abcd", 1);//search for "abcd" starting at index 1 -> not found, returns npos
		cout << index4 << endl;//prints npos, the largest size_t value

	}
	void test12()
	{
		string s1("hello world");
		string s2 = s1.substr(1, 7);//7 characters starting at index 1
		cout << s2 << endl;
	}
	void test13()
	{
		string s("http://www.baidu.com/index.html?name=mo&age=25#dowell");
		string substr1, substr2, substr3;
		//our goal is:
		//substr1:http
		//substr2:www.baidu.com
		//substr3:index.html?name=mo&age=25#dowell
		size_t pos1 = s.find(':', 0);//search for ':' starting at index 0
		substr1 = s.substr(0, pos1 - 0);//[0,pos1): for a half-open range the length is right end - left end, i.e. pos1-0
		size_t pos2 = s.find('/', pos1 + 3);//pos1 is the ':'; the next search starts at pos1+3, the first 'w'
		substr2 = s.substr(pos1 + 3, pos2 - (pos1 + 3));//the substring in [pos1+3,pos2)
		substr3 = s.substr(pos2 + 1);//from pos2+1 to the end
		//substr returns by value, so the result is a temporary;
		//a temporary can bind only to a const reference, which is what operator= takes
		cout << substr1 << endl;
		cout << substr2 << endl;
		cout << substr3 << endl;

	}
	void test14()
	{
		//char ch1, ch2;
		//cin >> ch1 >> ch2;
		//when several values are read, ' ' and '\n' are treated as separators, not as data,
		//so >> never delivers them; scanf does not deliver them either
		string s1, s2, s3;
		cin >> s1 >> s2 >> s3;
		cout << s1 << endl << s2 << endl << s3 << endl;
	}
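	void test15()
	{
		string s1("012345");
		cout << s1 << endl;
		cout << s1.c_str() << endl;

		//after a '\0' is inserted in the middle, << still prints size() characters,
		//while printing c_str() stops at the first '\0'
		s1.insert(2, '\0');
		cout << s1 << endl;
		cout << s1.c_str() << endl;
	}
	void test16()
	{
		string s1("hello world");
		cout << s1 << endl;
		cout << s1.c_str() << endl;
		//try this with a clear() that does not set the first character to '\0':
		//<< prints nothing, but c_str() still shows the old content
		s1.clear();
		cout << s1 << endl;
		cout << s1.c_str() << endl;
	}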

	//to_string: integer to string
	//stoi: string to integer
	void test17()
	{
		//to_string: integer to string
		std::string s = std::to_string(123);
		//stoi: string to integer
		int i = stoi(s);
		cout << s << endl;
		cout << i << endl;
	}
}

That is the complete simulated implementation of string in C++. I hope it is helpful to everyone!

Origin blog.csdn.net/Wzs040810/article/details/134591693