[C++] A simple simulated implementation of the string class

Table of contents

I. Introduction

II. string.h header file

1.class construction

2.Constructor and destructor

①Constructor

②Destructor

3.c_str function (get c string)

4.empty function (judging empty)

5.push_back function (tail insertion)

6.reserve function (expansion)

7.append function (merge)

8. Overload ‘+=’

9. Overload ‘=’

10. Overload ‘[]’

11.insert function (insert)

①insert character

②Insert string

12.find function (find)

①find characters

②find string

13.erase function (delete)

14.resize function (redefine length)

15.substr function (get substring)

16.swap function (exchange)

17. << stream insertion and >> stream extraction

①Stream insertion

②Stream extraction

18. Iterator settings

19. Size comparison

III. Source code


I. Introduction

        This article implements, by simulation, some commonly used functions of the C++ string class, including push_back, reserve and others. It does not cover every situation, so only a few overloads of each function are provided (string simply has too many overloads...).

        The article has two parts: the first walks through the implementation of string, in a header file named string.h, and the second gives the complete source code (the compiler environment is VS2019). The return type and parameter types of each function are the same as those of string in the library.

II. string.h header file

1.class construction

        The string class has three members in total: _size is the current length, _capacity is the maximum number of characters the buffer can hold (not counting the terminating '\0'), and _str points to the stored string. The skeleton is as follows:

#pragma once
#include <assert.h>
#include <string.h>
#include <iostream>

namespace str
{
	class string
	{
	public:

	private:
		size_t _size;
		size_t _capacity;
		char* _str;
	};
}

2.Constructor and destructor

        When it comes to custom classes, constructors and destructors are indispensable, so let’s simply implement these two functions.

①Constructor

// Constructor (the default argument also makes it the default constructor)
string(const char* str = "")
{
	_size = strlen(str);
	_capacity = _size;
	_str = new char[_capacity + 1];
	// Copy _size + 1 bytes so the trailing '\0' is included
	memcpy(_str, str, _size + 1);
}

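// Copy constructor (deep copy)
string(const string& str)
{
	// new char[n] allocates an array; new char(n) would allocate a single char
	_str = new char[str._capacity + 1];
	_size = str._size;
	_capacity = str._capacity;
	memcpy(_str, str._str, _size + 1);
}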

         There are two commonly used constructors: one initializes from a C string (a string literal, for example), and the other initializes from another object of the same type. Because the second one copies an object that owns heap memory, it must make a deep copy.
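
        To see why the copy has to be deep, here is a minimal sketch. Buf is a hypothetical stand-in for the string class (it is not part of this article's header), reduced to the members that copying needs. Each object allocates its own array, so changing the copy leaves the original untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for str::string, reduced to what copying needs.
class Buf
{
public:
	Buf(const char* s = "")
		: _size(strlen(s)), _str(new char[_size + 1])
	{
		memcpy(_str, s, _size + 1);
	}

	// Deep copy: allocate a separate array, then copy the characters.
	Buf(const Buf& other)
		: _size(other._size), _str(new char[other._size + 1])
	{
		memcpy(_str, other._str, _size + 1);
	}

	// Assignment is out of scope for this sketch.
	Buf& operator=(const Buf&) = delete;

	~Buf() { delete[] _str; }

	char* data() { return _str; }
	const char* data() const { return _str; }

private:
	size_t _size; // declared first so it is initialized before _str
	char* _str;
};

// Returns true when modifying a copy leaves the original unchanged.
bool copy_is_independent()
{
	Buf a("hello");
	Buf b(a);
	b.data()[0] = 'j';
	return strcmp(a.data(), "hello") == 0 && strcmp(b.data(), "jello") == 0;
}
```

        With a shallow copy, both objects would hold the same pointer: the write through b would also change a, and both destructors would call delete[] on the same array.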

②Destructor

// Destructor
~string()
{
	delete[] _str;
    _str = nullptr;
	_size = _capacity = 0;
}

3.c_str function (get c string)

        This function is simple: it just returns the member string directly.

// c_str (returns the C string)
const char* c_str() const
{
	return _str;
}

4.empty function (judging empty)

        This one is also very simple: just check whether _size is 0.

// empty (checks whether the string is empty)
bool empty() const
{
	return _size == 0;
}

5.push_back function (tail insertion)

        push_back is the simplest function for inserting an element, but the class must always keep track of how much memory has been allocated, so before appending at the end, be sure to check whether the allocated space is sufficient.

// push_back (append at the end)
void push_back(char ch)
{
	if (_size == _capacity)
		// Grow to twice the current capacity
		reserve(_capacity == 0 ? 4 : _capacity * 2);

	_str[_size++] = ch;
	_str[_size] = '\0';
}

        The reserve function used here is the one that grows the capacity; the growth factor can be adjusted to suit the actual workload.
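
        To get a feel for how much the growth policy matters, the following sketch only counts reallocations; it does not allocate anything. The function name count_reallocations and the "grow by one" alternative are assumptions made for the comparison, while the doubling rule (0, then 4, then twice the previous value) is the one used in push_back.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many times the buffer would have to be reallocated while
// appending `pushes` characters one at a time, starting from capacity 0.
// doubling == true  : 0 -> 4 -> 8 -> 16 -> ... (the rule used in push_back)
// doubling == false : the capacity grows by exactly one each time
size_t count_reallocations(size_t pushes, bool doubling)
{
	size_t size = 0;
	size_t capacity = 0;
	size_t reallocs = 0;

	for (size_t i = 0; i < pushes; ++i)
	{
		if (size == capacity)
		{
			if (doubling)
				capacity = (capacity == 0) ? 4 : capacity * 2;
			else
				capacity = capacity + 1;
			++reallocs;
		}
		++size;
	}
	return reallocs;
}
```

        For 1000 appended characters, the doubling rule reallocates 9 times (capacities 4, 8, ..., 1024), whereas growing by one reallocates on every single call.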

6.reserve function (expansion)

        reserve only ever grows the capacity. If the current capacity is already at least the requested size when the function is called, nothing is done. Shrinking by destroying and rebuilding the buffer would save space, but the repeated copying would greatly increase the running time, so it is not recommended; only growing is supported. The implementation is as follows:

// reserve (grow the capacity)
void reserve(size_t n)
{
	if (n > _capacity)
	{
		char* tmp = new char[n + 1];
		memcpy(tmp, _str, _size + 1);
		delete[] _str;
		_capacity = n;
		_str = tmp;
	}
}

7.append function (merge)

        This function is not much different from push_back above; it simply adds the target string at the end instead of a single character, so I won't go into details here.

// append (append a C string)
void append(const char* str)
{
	size_t len = strlen(str);
	if (_size + len > _capacity)
	{
		// Grow at least enough for the result to fit exactly
		reserve(_size + len);
	}

	memcpy(_str + _size, str, len + 1);
	_size += len;
}

8. Overload ‘+=’

        As in the library, there are two commonly used forms of += in the string class: += with a character and += with a string. The implementation is very simple: the character version calls push_back directly, while the string version calls append.

// Overloads of '+='
string& operator+=(const char ch)
{
	push_back(ch);
	return *this;
}

string& operator+=(const char* str)
{
    append(str);
	return *this;
}

9. Overload ‘=’

        Because the class owns heap memory and a deep copy is required, = must be overloaded. You can reset the current contents to empty and then use append to copy the new string in.

// Overload of '='
string& operator=(const char* ch)
{
	_size = 0;
	_str[0] = '\0';

	append(ch);
	return *this;
}
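
        The overload above takes a C string. Assigning one string object to another is not covered by the article's code; a common way to handle it is the copy-and-swap idiom, sketched below. Str is a hypothetical reduced class written only for this illustration, not the article's string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical reduced class used only to illustrate copy-and-swap.
class Str
{
public:
	Str(const char* s = "")
		: _size(strlen(s)), _str(new char[_size + 1])
	{
		memcpy(_str, s, _size + 1);
	}

	Str(const Str& other)
		: _size(other._size), _str(new char[other._size + 1])
	{
		memcpy(_str, other._str, _size + 1);
	}

	// The parameter is passed by value, so tmp is already a deep copy.
	// After the swap, tmp owns the old buffer and frees it on destruction.
	Str& operator=(Str tmp)
	{
		swap(tmp);
		return *this;
	}

	void swap(Str& other)
	{
		std::swap(_size, other._size);
		std::swap(_str, other._str);
	}

	~Str() { delete[] _str; }

	const char* c_str() const { return _str; }

private:
	size_t _size; // declared first so it is initialized before _str
	char* _str;
};

// Returns true when assignment copies the characters but not the pointer.
bool assignment_is_deep()
{
	Str a("abc");
	Str b("xyz12");
	b = a;

	Str& alias = a;
	a = alias; // self-assignment is harmless with this idiom

	return strcmp(b.c_str(), "abc") == 0
		&& strcmp(a.c_str(), "abc") == 0
		&& a.c_str() != b.c_str();
}
```

        The design choice here is to reuse the copy constructor and destructor rather than repeat their logic, at the cost of always allocating a fresh buffer.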

10. Overload ‘[]’

        [] returns the character stored at the given subscript, just like indexing an ordinary character array, so the element can be returned directly once the subscript has been checked.

// Access the character at the given subscript
char& operator[](size_t pos)
{
	// Check the subscript itself, not the character stored there
	assert(pos < _size);

	return _str[pos];
}

const char& operator[](size_t pos) const
{
	assert(pos < _size);

	return _str[pos];
}

11.insert function (insert)

        insert is overloaded for the two commonly used cases, inserting characters and inserting a string; the character version takes three parameters (position, count and character). The implementation is as follows, and the points that need attention are marked in the code.

①insert character

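// insert (insert n copies of a character)
void insert(size_t pos,size_t n,char ch)
{
	// Reject an out-of-range subscript
	assert(pos <= _size);

	// Grow if needed
	if (_size + n > _capacity)
	{
		reserve(_size + n);
	}

	// Shift the existing data (including '\0') to the right
	// end is a size_t to avoid a signed/unsigned comparison;
	// the npos test catches the wrap-around below 0
	size_t end = _size;
	while (end >= pos && end != npos)
	{
		_str[end + n] = _str[end];
		--end;
	}

	// Write the inserted data
	for (size_t i = 0; i < n; ++i)
	{
		_str[pos + i] = ch;
	}

	_size += n;
}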

②Insert string

// insert (insert a string)
void insert(size_t pos, const char* str)
{
	// Reject an out-of-range subscript
	assert(pos <= _size);

	// Grow if needed; the new length is _size + len, not pos + len
	size_t len = strlen(str);
	if (_size + len > _capacity)
	{
		reserve(_size + len);
	}

	// Shift the existing data (including '\0') to the right
	size_t end = _size;
	while (end >= pos && end != npos)
	{
		_str[end + len] = _str[end];
		end--;
	}

	// Write the inserted data
	for (size_t i = 0; i < len; ++i)
	{
		_str[pos + i] = str[i];
	}

	_size += len;
}

        Here npos is a static constant of unsigned integer type (size_t) that we define ourselves. It is declared inside the string class and defined in the namespace, outside the class body, with the value -1, which as a size_t is the largest representable value.
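
        The test against npos in the shifting loops exists because size_t is unsigned: decrementing 0 wraps around to the largest value, which is exactly npos. The sketch below isolates that loop on a raw buffer; shift_right and kNpos are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same value as npos in the article: (size_t)-1, the largest size_t.
const size_t kNpos = static_cast<size_t>(-1);

// Moves buf[pos..size] (terminator included) n places to the right.
// The caller must guarantee room for size + n + 1 characters.
void shift_right(char* buf, size_t size, size_t pos, size_t n)
{
	size_t end = size;
	// When pos == 0, --end eventually wraps from 0 to kNpos. Without the
	// second test, `end >= pos` would then be true forever.
	while (end >= pos && end != kNpos)
	{
		buf[end + n] = buf[end];
		--end;
	}
}

// Opens a two-character gap at the front of "world" and fills it.
bool demo_shift()
{
	char buf[16] = "world";
	shift_right(buf, 5, 0, 2);
	buf[0] = 'h';
	buf[1] = 'i';
	return strcmp(buf, "hiworld") == 0;
}
```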

12.find function (find)

        find is overloaded in the same way as insert: one version searches for a character and the other for a string. The second parameter is the position at which the search starts, and the return value is the subscript of the match (or npos if there is none). The implementation is as follows:

①find characters

// find (search for a character)
size_t find(char ch, size_t pos = 0) const
{
	// pos == _size is allowed so that searching an empty string returns npos
	assert(pos <= _size);

	size_t end = _size;
	while (pos < end)
	{
		if (_str[pos] == ch)
			return pos;
		pos++;
	}
	return npos;
}

②find string

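// find (search for a string)
size_t find(const char* str, size_t pos = 0) const
{
	size_t len = strlen(str);
	// No match is possible if fewer than len characters remain
	if (pos > _size || len > _size - pos)
		return npos;

	// Use strstr to locate the substring
	const char* tmp = strstr(_str + pos, str);
	if (tmp)
	{
		return tmp - _str;
	}
	return npos;
}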
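
        As a quick check of the strstr approach, here is the same search written as a free function over a raw character array. find_in and kNpos are hypothetical names for this sketch; the logic (offset the start pointer by pos, then convert the returned pointer back to a subscript) is the one described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const size_t kNpos = static_cast<size_t>(-1);

// Searches the NUL-terminated array `text` of length `size` for `needle`,
// starting at subscript `pos`. Returns the subscript of the first match,
// or kNpos when there is none.
size_t find_in(const char* text, size_t size, const char* needle, size_t pos = 0)
{
	if (pos > size)
		return kNpos;

	const char* hit = strstr(text + pos, needle);
	return hit ? static_cast<size_t>(hit - text) : kNpos;
}
```

        For the text "hello world", searching for "world" gives 6, searching for "o" from position 5 gives 7, and searching for "xyz" gives kNpos.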

13.erase function (delete)

        erase deletes a specified number of characters starting at the target subscript. If the count is not specified, everything up to the end of the stored data is deleted by default. The main things to watch in this function are the check of the target subscript and the check of the requested deletion length.

// erase (delete)
void erase(size_t pos, size_t len = npos)
{
	// Reject an out-of-range subscript
	assert(pos <= _size);

	// Written as len >= _size - pos so that pos + len cannot overflow
	if (len == npos || len >= _size - pos)
	{
		// Delete everything from pos to the end
		_str[pos] = '\0';
		_size = pos;
	}
	else
	{
		// Shift the tail (including '\0') left over the deleted range
		size_t end = pos + len;
		while (end <= _size)
		{
			_str[pos++] = _str[end++];
		}
		_size -= len;
	}
}
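
        The two cases can also be tried out on a plain character array. The sketch below is an alternative formulation, not the article's code: it uses memmove for the shift instead of a hand-written loop, and erase_range and kNpos are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const size_t kNpos = static_cast<size_t>(-1);

// Removes up to `len` characters starting at `pos` from the NUL-terminated
// array `buf` of length `size`, and returns the new length.
// Precondition: pos <= size.
size_t erase_range(char* buf, size_t size, size_t pos, size_t len = kNpos)
{
	// Case 1: the range reaches the end, so simply cut the string at pos.
	if (len >= size - pos)
	{
		buf[pos] = '\0';
		return pos;
	}

	// Case 2: move the tail left. The tail has size - pos - len characters,
	// plus one for the terminating '\0'. memmove handles the overlap.
	memmove(buf + pos, buf + pos + len, size - pos - len + 1);
	return size - len;
}
```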

14.resize function (redefine length)

        This function is very similar to reserve. The difference is that reserve only grows the capacity, whereas resize can grow the capacity (though, again, never shrink it) and also changes the length of the stored string, padding with a fill character or truncating as needed.

// resize (change the length)
void resize(size_t n, char s = '\0')
{
	if (n >= _size)
	{
		reserve(n);

		for (size_t i = _size; i < n; ++i)
		{
			_str[i] = s;
		}
	}

	_size = n;
	_str[n] = '\0';
}
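
        Since the simulated functions are meant to mirror the library, the difference between the two can be checked against std::string itself. The sketch below uses only the standard library; the function name resize_vs_reserve is made up for the example.

```cpp
#include <cassert>
#include <string>

// Returns true if std::string behaves as described: reserve changes only
// the capacity, while resize changes the length (padding or truncating).
bool resize_vs_reserve()
{
	std::string s = "ab";

	s.reserve(100);                 // capacity grows, contents untouched
	bool ok = (s.size() == 2) && (s.capacity() >= 100) && (s == "ab");

	s.resize(5, 'x');               // longer: padded with the fill character
	ok = ok && (s == "abxxx");

	s.resize(1);                    // shorter: truncated
	ok = ok && (s == "a") && (s.size() == 1);

	return ok;
}
```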

15.substr function (get substring)

        substr takes two parameters: the starting subscript and the length of the substring. The result is returned by value, which relies on the deep-copying copy constructor; with a shallow copy the returned object would point at memory that the local variable has already freed.

// substr (get a substring)
string substr(size_t pos = 0, size_t len = npos) const
{
	assert(pos < _size);

	// Clamp len so that both special cases are handled by the same code
	size_t n = len;
	if (len == npos || len > _size - pos)
	{
		n = _size - pos;
	}

	string ret;
	ret.reserve(n);
	for (size_t i = pos; i < pos + n; ++i)
	{
		// += is the most convenient way to append here
		ret += _str[i];
	}

	return ret;
}

16.swap function (exchange)

        swap can be implemented by calling the library's std::swap directly on the members. Note that _size and _capacity must be exchanged as well, not just the pointer.

// swap (exchange)
void swap(string& str)
{
	std::swap(_str, str._str);
	std::swap(_size, str._size);
	std::swap(_capacity, str._capacity);
}

17. << stream insertion and >> stream extraction

        There are two points to note about stream insertion and stream extraction. First, both must be defined outside the class. Second, the first parameter of each is a reference to the output stream or input stream, because the library's stream classes prohibit copying. The implementation is as follows:

①Stream insertion

// Stream insertion
std::ostream& operator<<(std::ostream& out, const str::string& str)
{
	for (auto i : str)
	{
		out << i;
	}

	return out;
}

        The range-based for here relies on iterators, which we have not defined yet; they are set up in section 18.

②Stream extraction

        Stream extraction differs from stream insertion in that capacity growth has to be considered. Appending one character at a time would work, but it could call reserve over and over, which is inefficient, so the version below collects the input in a local buffer first.

// Stream extraction (defined outside the class)
std::istream& operator>>(std::istream& in, str::string& str)
{
	str.clear();

	// Use get() because >> would skip ' ' and '\n' instead of reading them
	char ch;
	// Skip leading blanks and newlines; also stops at end of input
	while (in.get(ch) && (ch == ' ' || ch == '\n'))
	{
	}

	// Collect characters in a local buffer and append them in chunks
	size_t i = 0;
	char tmp[128];
	while (in && ch != ' ' && ch != '\n')
	{
		tmp[i++] = ch;

		if (i == 127)
		{
			tmp[i] = '\0';
			str += tmp;

			i = 0;
		}
		in.get(ch);
	}
	// Append whatever is left in the buffer
	if (i != 0)
	{
		tmp[i] = '\0';
		str += tmp;
	}

	return in;
}

// Clear the contents (defined inside the class)
void clear()
{
	_str[0] = '\0';
	_size = 0;
}
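
        The buffering idea can be tried in isolation with an in-memory stream. In the sketch below, read_word is a hypothetical free function, std::string stands in for the simulated class, and the buffer is deliberately tiny (8 bytes rather than 128) so that the flush path is exercised by ordinary words.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads one blank- or newline-delimited word into `out`, appending in
// chunks so that the destination is extended once per chunk rather than
// once per character.
std::istream& read_word(std::istream& in, std::string& out)
{
	out.clear();

	char ch;
	// Skip leading blanks and newlines; stops at end of input as well.
	while (in.get(ch) && (ch == ' ' || ch == '\n'))
	{
	}

	char tmp[8];
	size_t i = 0;
	while (in && ch != ' ' && ch != '\n')
	{
		tmp[i++] = ch;
		if (i == sizeof(tmp) - 1)
		{
			// Buffer full: terminate it, flush it, and start over.
			tmp[i] = '\0';
			out += tmp;
			i = 0;
		}
		in.get(ch);
	}

	// Flush the remainder.
	if (i != 0)
	{
		tmp[i] = '\0';
		out += tmp;
	}
	return in;
}
```

        With the input "  hello wonderful_world\n", the first call yields "hello" and the second yields "wonderful_world", the latter assembled from three chunks.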

18. Iterator settings

        There is not much to say about iterators: just provide the start position and the end position. The function names must be exactly begin and end, however, otherwise the range-based for will not recognize them.

// Iterators
typedef char* iterator;
typedef const char* const_iterator;
iterator begin()
{
	return _str;
}

iterator end()
{
	return _str + _size;
}

const_iterator begin() const
{
	return _str;
}

const_iterator end() const
{
	return _str + _size;
}
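
        The reason the names matter is that a range-based for is rewritten by the compiler into a loop from begin() to end(). Any type that provides those two members works, as the following sketch suggests; CharSpan and count_char are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal type: a pointer plus a length, with the two member
// functions that range-based for looks up by name.
struct CharSpan
{
	const char* data;
	size_t size;

	const char* begin() const { return data; }
	const char* end() const { return data + size; }
};

// Counts occurrences of `target` using a range-based for.
size_t count_char(const CharSpan& s, char target)
{
	size_t n = 0;
	for (char c : s) // iterates from s.begin() up to, not including, s.end()
	{
		if (c == target)
			++n;
	}
	return n;
}
```

        Renaming begin or end to anything else makes the loop fail to compile, since the compiler no longer finds the functions it needs.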

19. Size comparison

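        Only two overloads, < and ==, need to be implemented directly; the others can be derived from them. The comparison is as follows:

// Comparison operators
bool operator<(const string& str) const
{
	// Compare the common prefix first; if it is equal, the shorter string is smaller
	size_t n = _size < str._size ? _size : str._size;
	int ret = memcmp(_str, str._str, n);
	return ret < 0 || (ret == 0 && _size < str._size);
}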

bool operator==(const string& str) const
{
	return _size == str._size && memcmp(_str, str._str, _size) == 0;
}

bool operator>(const string& str) const
{
	return !(*this < str || *this == str);
}

bool operator>=(const string& str) const
{
	return !(*this < str);
}

bool operator<=(const string& str) const
{
	return !(*this > str);
}

bool operator!=(const string& str) const
{
	return !(*this == str);
}
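
        Lexicographic order means comparing the common prefix first and falling back to the lengths only when the prefix is identical. A three-way helper makes the rule easy to test; compare_bytes is a hypothetical free function for this sketch, not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Three-way lexicographic comparison of two character ranges.
// Returns a negative value if a < b, 0 if they are equal, positive if a > b.
int compare_bytes(const char* a, size_t a_size, const char* b, size_t b_size)
{
	// Step 1: compare only as many characters as both ranges have.
	size_t n = a_size < b_size ? a_size : b_size;
	int r = memcmp(a, b, n);
	if (r != 0)
		return r;

	// Step 2: the common prefix is equal, so the shorter range is smaller.
	if (a_size == b_size)
		return 0;
	return a_size < b_size ? -1 : 1;
}
```

        So "abc" is less than "abd" although the lengths are equal, "ab" is less than "abc" because it is a proper prefix, and "b" is greater than "abc" although it is shorter.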

III. Source code

#pragma once
#include <assert.h>
#include <string.h>
#include <iostream>
#include <utility>

namespace str
{
	class string
	{
	public:
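		// Constructor (the default argument also makes it the default constructor)
		string(const char* str = "")
		{
			_size = strlen(str);
			_capacity = _size;
			_str = new char[_capacity + 1];
			// Copy _size + 1 bytes so the trailing '\0' is included
			memcpy(_str, str, _size + 1);
		}

		// Copy constructor (deep copy)
		string(const string& str)
		{
			// new char[n] allocates an array; new char(n) would allocate a single char
			_str = new char[str._capacity + 1];
			_size = str._size;
			_capacity = str._capacity;
			memcpy(_str, str._str, _size + 1);
		}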

		// Destructor
		~string()
		{
			delete[] _str;
			_str = nullptr;
			_size = _capacity = 0;
		}

		// Iterators
		typedef char* iterator;
		typedef const char* const_iterator;
		iterator begin()
		{
			return _str;
		}

		iterator end()
		{
			return _str + _size;
		}

		const_iterator begin() const
		{
			return _str;
		}

		const_iterator end() const
		{
			return _str + _size;
		}

		// reserve (grow the capacity)
		void reserve(size_t n)
		{
			if (n > _capacity)
			{
				// With new, [] gives the array length, whereas () gives an initial value
				char* tmp = new char[n + 1];
				memcpy(tmp, _str, _size + 1);
				delete[] _str;
				_capacity = n;
				_str = tmp;
			}
		}

		// push_back (append at the end)
		void push_back(char ch)
		{
			if (_size == _capacity)
				// Grow to twice the current capacity
				reserve(_capacity == 0 ? 4 : _capacity * 2);
			
			_str[_size++] = ch;
			_str[_size] = '\0';
		}

		// append (append a C string)
		void append(const char* str)
		{
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				// Grow at least enough for the result to fit exactly
				reserve(_size + len);
			}

			memcpy(_str + _size, str, len + 1);
			_size += len;
		}

		// Overload of '='
		string& operator=(const char* ch)
		{
			_size = 0;
			_str[0] = '\0';

			append(ch);
			return *this;
		}

		// Overloads of '+='
		string& operator+=(const char ch)
		{
			push_back(ch);
			return *this;
		}

		string& operator+=(const char* str)
		{
			append(str);
			return *this;
		}

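		// insert (insert characters or a string)
		void insert(size_t pos,size_t n,char ch)
		{
			// Reject an out-of-range subscript
			assert(pos <= _size);

			// Grow if needed
			if (_size + n > _capacity)
			{
				reserve(_size + n);
			}

			// Shift the existing data (including '\0') to the right
			// end is a size_t to avoid a signed/unsigned comparison; the npos test catches the wrap-around below 0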
			size_t end = _size;
			while (end >= pos && end != npos)
			{
				_str[end + n] = _str[end];
				--end;
			}

			// Write the inserted data
			for (size_t i = 0; i < n; ++i)
			{
				_str[pos + i] = ch;
			}

			_size += n;
		}

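		void insert(size_t pos, const char* str)
		{
			// Reject an out-of-range subscript
			assert(pos <= _size);

			// Grow if needed; the new length is _size + len, not pos + len
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}

			// Shift the existing data (including '\0') to the right
			size_t end = _size;
			while (end >= pos && end != npos)
			{
				_str[end + len] = _str[end];
				end--;
			}

			// Write the inserted data
			for (size_t i = 0; i < len; ++i)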
			{
				_str[pos + i] = str[i];
			}

			_size += len;
		}

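		// erase (delete)
		void erase(size_t pos, size_t len = npos)
		{
			// Reject an out-of-range subscript
			assert(pos <= _size);

			// Written as len >= _size - pos so that pos + len cannot overflow
			if (len == npos || len >= _size - pos)
			{
				// Delete everything from pos to the end
				_str[pos] = '\0';
				_size = pos;
			}
			else
			{
				// Shift the tail (including '\0') left over the deleted range
				size_t end = pos + len;
				while (end <= _size)
				{
					_str[pos++] = _str[end++];
				}
				_size -= len;
			}
		}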

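		// find (search for a character or a string)
		size_t find(char ch, size_t pos = 0) const
		{
			// pos == _size is allowed so that searching an empty string returns npos
			assert(pos <= _size);

			size_t end = _size;
			while (pos < end)
			{
				if (_str[pos] == ch)
					return pos;
				pos++;
			}
			return npos;
		}

		size_t find(const char* str, size_t pos = 0) const
		{
			size_t len = strlen(str);
			// No match is possible if fewer than len characters remain
			if (pos > _size || len > _size - pos)
				return npos;

			// Use strstr to locate the substring
			const char* tmp = strstr(_str + pos, str);
			if (tmp)
			{
				return tmp - _str;
			}
			return npos;
		}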

		// substr (get a substring)
		string substr(size_t pos = 0, size_t len = npos) const
		{
			assert(pos < _size);

			// Clamp len so that both special cases are handled by the same code
			size_t n = len;
			if (len == npos || len > _size - pos)
			{
				n = _size - pos;
			}

			string ret;
			ret.reserve(n);
			for (size_t i = pos; i < pos + n; ++i)
			{
				// += is the most convenient way to append here
				ret += _str[i];
			}

			return ret;
		}

		// resize (change the length)
		void resize(size_t n, char s = '\0')
		{
			if (n >= _size)
			{
				reserve(n);

				for (size_t i = _size; i < n; ++i)
				{
					_str[i] = s;
				}
			}

			_size = n;
			_str[n] = '\0';
		}

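		// Comparison operators
		bool operator<(const string& str) const
		{
			// Compare the common prefix first; if it is equal, the shorter string is smaller
			size_t n = _size < str._size ? _size : str._size;
			int ret = memcmp(_str, str._str, n);
			return ret < 0 || (ret == 0 && _size < str._size);
		}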

		bool operator==(const string& str) const
		{
			return _size == str._size && memcmp(_str, str._str, _size) == 0;
		}

		bool operator>(const string& str) const
		{
			return !(*this < str || *this == str);
		}

		bool operator>=(const string& str) const
		{
			return !(*this < str);
		}

		bool operator<=(const string& str) const
		{
			return !(*this > str);
		}

		bool operator!=(const string& str) const
		{
			return !(*this == str);
		}

		// swap (exchange)
		void swap(string& str)
		{
			std::swap(_str, str._str);
			std::swap(_size, str._size);
			std::swap(_capacity, str._capacity);
		}

		// empty (checks whether the string is empty)
		bool empty() const
		{
			return _size == 0;
		}

		// Clear the contents
		void clear()
		{
			_str[0] = '\0';
			_size = 0;
		}

		// c_str (returns the C string)
		const char* c_str() const
		{
			return _str;
		}

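		// Access the character at the given subscript
		char& operator[](size_t pos)
		{
			// Check the subscript itself, not the character stored there
			assert(pos < _size);

			return _str[pos];
		}

		const char& operator[](size_t pos) const
		{
			assert(pos < _size);

			return _str[pos];
		}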

		
	private:
		size_t _size;
		size_t _capacity;
		char* _str;

	public:
		const static size_t npos;
	};

	const size_t string::npos = -1;
};

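// Stream insertion
std::ostream& operator<<(std::ostream& out, const str::string& str)
{
	for (auto i : str)
	{
		out << i;
	}

	return out;
}

// Stream extraction
std::istream& operator>>(std::istream& in, str::string& str)
{
	str.clear();

	// Use get() because >> would skip ' ' and '\n' instead of reading them
	char ch;
	// Skip leading blanks and newlines; also stops at end of input
	while (in.get(ch) && (ch == ' ' || ch == '\n'))
	{
	}

	// Collect characters in a local buffer and append them in chunks
	size_t i = 0;
	char tmp[128];
	while (in && ch != ' ' && ch != '\n')
	{
		tmp[i++] = ch;

		if (i == 127)
		{
			tmp[i] = '\0';
			str += tmp;

			i = 0;
		}
		in.get(ch);
	}

	// Append whatever is left in the buffer
	if (i != 0)
	{
		tmp[i] = '\0';
		str += tmp;
	}
	return in;
}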

Origin blog.csdn.net/qq_74641564/article/details/131666872