【C++】STL---string

1. Strings in C language

In C, a string is a sequence of characters terminated by '\0'. To make strings easier to work with, the C standard library provides the str family of functions, but these functions are separate from the string data they operate on, which does not fit the idea of OOP. The user also has to manage the underlying storage, so a small mistake can easily cause an out-of-bounds access.

2. String class

  1. string is the class that represents a character string.
  2. Its interface is basically the same as that of a regular container, plus some operations specific to strings.
  3. Under the hood, string is an alias for an instantiation of the basic_string class template: typedef basic_string<char, char_traits<char>, allocator<char>> string;
  4. It cannot operate on multi-byte or variable-length character sequences as characters; it treats the data as a sequence of single char elements.
  5. To use the string class, include the <string> header (#include <string>) and either write using namespace std; or qualify the name as std::string.

The full list of the string class's interfaces can be found in the std::string reference documentation.

Here we need to introduce iterators. In the string class simulated below, an iterator is just a native pointer (a real standard library implementation may use either a pointer or a wrapper class). Iterators are used as follows:

		#include <iostream>
		#include <string>
		using namespace std;
		
		int main()
		{
			string s1("hello,world");
		
			// Using an iterator: iterator is a type nested in string, so it must be qualified with the class scope
			string::iterator it = s1.begin();
			while (it != s1.end())
			{
				cout << *it << ' ';
				it++;
			}
			cout << endl;
		
			return 0;
		}

Here s1.begin() is effectively a pointer to the first character of the string, and s1.end() is a pointer to the position just past the last character, which is where '\0' is stored.

3. Simulate the implementation of the string class

Next, we implement our own version of the string class and explain how each interface is used along the way. Note that we only implement the more common and important interfaces.

Let's first look at the declaration of the class to preview the interfaces we need to implement:

0. Declaration of string class

		#include <cassert>
		#include <cstring>
		#include <iostream>
		#include <utility>
		using namespace std;

		namespace Young
		{
			class String
			{
			public:
		
				// Iterators
				typedef char* iterator;
				typedef const char* const_iterator;
				iterator begin();
				iterator end();
				const_iterator begin() const;
				const_iterator end() const;
		
		
				// Constructor
				String(const char* str = "")
					:_str(new char [strlen(str) + 1])
					,_size(strlen(str))
					,_capacity(_size)
				{
					strcpy(_str, str);
				}
		
		
				// Destructor
				~String()
				{
					delete[] _str;
					_str = nullptr;
					_size = _capacity = 0;
				}
		
				// Swap
				void swap(String& tmp)
				{
					std::swap(_str, tmp._str);
					std::swap(_size, tmp._size);
					std::swap(_capacity, tmp._capacity);
				}
		
		
				// s2(s1)
				// Copy constructor
				String(const String& str)
					:_str(nullptr)
					,_size(0)
					,_capacity(0)
				{
					String tmp(str._str);
					swap(tmp);
				}
		
		
				// s2 = s1
				// Assignment operator overload
				String& operator=(const String& str)
				{
					if (this != &str)
					{
						String tmp(str._str);
						swap(tmp);
					}
		
					return *this;
				}
		
		
				// Reserve space -- does not change _size
				void reserve(size_t n);

				// Resize to n characters -- changes _size
				void resize(size_t n, char c = '\0');


				// Append a character
				void push_back(char c);
				String& operator+=(char c);


				// Append a string
				void append(const char* str);
				String& operator+=(const char* str);
		
		
				// Clear the string
				void clear();


				// Get the length of the string
				size_t size() const;


				// Get the capacity
				size_t capacity() const;


				// operator[] overload   s[1]
				const char& operator[](size_t index) const;
				char& operator[](size_t index);
		
		
				// Comparison operator overloads
				bool operator>(const String& s) const;
				bool operator==(const String& s) const;
				bool operator>=(const String& s) const;
				bool operator<(const String& s) const;
				bool operator<=(const String& s) const;
				bool operator!=(const String& s) const;
		
		
				// Return the underlying C string (const char*)
				const char* c_str() const;

				// Check whether the string is empty
				bool empty() const;

				// find -- search for a character/string starting at pos
				size_t find(char ch, size_t pos = 0) const;
				size_t find(const char* str, size_t pos = 0) const;


				// Get the substring of len characters starting at pos; if len is omitted, take everything up to the end
				String substr(size_t pos, size_t len = npos) const;

				// Insert character ch or string str at position pos
				String& insert(size_t pos, char ch);
				String& insert(size_t pos, const char* str);

				// Erase len characters starting at pos; if len is omitted, erase to the end
				String& erase(size_t pos, size_t len = npos);

				// Print the data
				void Print();
		
			private:
				char* _str;
				size_t _size;
				size_t _capacity;
		
			public:
				const static size_t npos = -1;
			};
		
			// Stream insertion and stream extraction      cout << s1;
			ostream& operator<<(ostream& out, const String& s);
			istream& operator>>(istream& in, String& s);
		}

1. Constructor

The constructor is usually the first thing to consider when implementing a class. The member variables of our String class are char* _str; size_t _size; size_t _capacity;. Although they are all built-in types, the memory that the char* points to has to be allocated manually, so we need to write the constructor explicitly:

			// Constructor
			String(const char* str = "")
				:_str(new char [strlen(str) + 1])
				,_size(strlen(str))
				,_capacity(_size)
			{
				strcpy(_str, str);
			}

The parameter has the default value "", an empty string. When allocating, we request one more element than strlen(str) because the terminating '\0' must be stored as well. Finally, strcpy copies str (including its '\0') into _str.

2. Destructor

Because we allocate the space for the string manually, we also need to release it manually. The destructor is as follows:

			// Destructor
			~String()
			{
				delete[] _str;
				_str = nullptr;
				_size = _capacity = 0;
			}

Note that the form of delete must match the form of new: memory allocated with new[] must be released with delete[].

3. Copy constructor

The copy constructor requires a review of shallow copy and deep copy. We covered them earlier in classes and objects (Part 2), so let's review them now.

  1. Shallow copy: also called value copy. The compiler simply copies the values of the members. If the object manages a resource, several objects end up sharing the same resource. When one of them is destroyed it releases the resource, but the others do not know this and still treat it as valid, so continuing to access it is an access violation (and the resource is released a second time when they are destroyed).

We can solve the shallow copy problem with deep copy: each object owns an independent resource that is not shared with other objects.

  2. Deep copy: if a class manages a resource, its copy constructor, assignment operator overload, and destructor must all be written explicitly, and copying is generally done as a deep copy.

Next we explicitly write a copy constructor of the string class ourselves:

			// String s2(s1);
			// Copy constructor
			String(const String& str)
				:_str(nullptr)
				,_size(0)
				,_capacity(0)
			{
				String tmp(str._str);
				swap(tmp);
			}

Suppose we write String s2(s1);. Then _str belongs to the s2 object and str refers to s1. The idea is to first set _str to nullptr and _size and _capacity to zero in the initializer list, and then use the constructor to create a temporary object with String tmp(str._str);, so that tmp holds a copy of the contents of s1. Finally we swap the resources of s2 and tmp, which completes the copy of s1 into s2. When tmp goes out of scope its destructor runs automatically, and because it now holds nullptr, delete[] has nothing to release.

Note that the swap function used above is a member of our String class, so we need to implement it ourselves as well:

			// Swap
			void swap(String& tmp)
			{
				std::swap(_str, tmp._str);
				std::swap(_size, tmp._size);
				std::swap(_capacity, tmp._capacity);
			}

Inside it, we use the swap function from the standard library (std::swap) to exchange each member.

[Figure: usage of the copy constructor and its output]

4. Assignment operator overloading

Assignment operator overloading is similar to copy construction and is implemented as follows:

			// s2 = s1
			// Assignment operator overload
			String& operator=(const String& str)
			{
				if (this != &str)
				{
					String tmp(str._str);
					swap(tmp);
				}
	
				return *this;
			}

[Figure: usage of the assignment operator and its output]

5. Iterator

In our String class the iterator is just a native pointer, as the typedefs in the class declaration above show. The implementation is straightforward:

		// Iterators
		Young::String::iterator Young::String::begin()
		{
			return _str;
		}
		
		Young::String::iterator Young::String::end()
		{
			return _str + _size;
		}
		
		Young::String::const_iterator Young::String::begin() const
		{
			return _str;
		}
		
		Young::String::const_iterator Young::String::end() const
		{
			return _str + _size;
		}

Because the declarations and definitions are written separately, the return types iterator/const_iterator and the function names must be qualified with their scope (Young::String::). iterator is the iterator used by ordinary objects and const_iterator is the one used by const objects. begin() returns a pointer to the first character of the string, and end() returns a pointer to the position just past the last character.

6. Element access: [] overload

To make it convenient to access the characters of a string, we overload [] so that elements can be accessed by index:

For const objects:

		const char& Young::String::operator[](size_t index) const
		{
			assert(index < _size);
		
			return _str[index];
		}

For non-const objects:

		char& Young::String::operator[](size_t index)
		{
			assert(index < _size);
		
			return _str[index];
		}

7. Stream Insertion and Stream Extraction Overloads

When using a string, it is convenient to be able to print and read it directly, so we overload stream insertion and stream extraction. As mentioned before, these two operators are implemented outside the class: if they were member functions, the this pointer would take the first parameter position, and we could not write cout << s1 in the usual way. The implementation is as follows:

		// Stream insertion    cout << s1;
		ostream& Young::operator<<(ostream& out, const String& s)
		{
			for (size_t i = 0; i < s.size(); i++)
			{
				out << s[i];
			}
			return out;
		}

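Stream insertion only needs to print each character in turn.

		// Stream extraction    cin >> s
		istream& Young::operator>>(istream& in, String& s)
		{
			s.clear();

			char buff[129];
			size_t i = 0;

			// get() does not skip whitespace; the loop also stops
			// if the read fails (for example at end of input)
			char ch;
			while (in.get(ch) && ch != ' ' && ch != '\n')
			{
				buff[i++] = ch;
				if (i == 128)
				{
					// The buffer is full: append it to s and start over
					buff[i] = '\0';
					s += buff;
					i = 0;
				}
			}

			// Append whatever is left in the buffer
			if (i != 0)
			{
				buff[i] = '\0';
				s += buff;
			}

			return in;
		}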

We use a buff array to hold the characters as they are read, and append it to the object only when it is full, which avoids frequent reallocations. Formatted extraction (cin >> ch) skips whitespace by default, so it would never give us the ' ' or '\n' that marks the end of the input. For that reason we read with the get() member function of cin, which returns every character including ' ' and '\n', making the end condition easy to check.

8. Capacity-related interfaces

(1)size

Returns the effective length of the string. Implementation:

		size_t Young::String::size() const
		{
			return _size;
		}

(2)capacity

Returns the capacity of the string. Implementation:

		size_t Young::String::capacity() const
		{
			return _capacity;
		}

(3)clear

Clears the contents of the string. Implementation:

		void Young::String::clear()
		{
			_str[0] = '\0';
			_size = 0;
		}

Clearing the string does not release the space, so it is enough to store '\0' at index 0 and set the length to 0.

(4)empty

Checks whether the string is empty. Implementation:

		bool Young::String::empty() const
		{
			return _size == 0;
		}

We only need to check whether _size is 0.

[Figure: usage of size, capacity, clear, and empty]

(5)reserve

[Figure: reference documentation for the reserve interface]

reserve(n) requests space for n characters, that is, it reserves capacity. If n is greater than _capacity, the storage is enlarged; if n is less than or equal to _capacity, nothing changes. Note that reserve never changes _size. Its implementation is as follows:

		// Reserve space
		void Young::String::reserve(size_t n)
		{
			if (n > _capacity)
			{
				char* tmp = new char[n + 1];
				strcpy(tmp, _str);
		
				delete[] _str;
				_str = tmp;
		
				_capacity = n;
			}
		}

To hold n characters we actually allocate n + 1 elements, so there is room for '\0'. We then copy the original string into the new space, release the old space, and make _str point to the new space tmp.

(6)resize

The difference between reserve and resize is that reserve only changes the capacity, while resize changes _size and can initialize the newly added characters.

		// Resize and initialize
		void Young::String::resize(size_t n, char c)
		{
			// If n is greater than _size, reserve room for n characters, then fill from the old end
			if (n > _size)
			{
				reserve(n);
		
				for (size_t i = _size; i < n; i++)
				{
					_str[i] = c;
				}
		
				_str[n] = '\0';
				_size = n;
			}
		
			// Otherwise, truncate
			else
			{
				_str[n] = '\0';
				_size = n;
			}
		}

If no fill character is passed explicitly, the default value from the declaration, '\0', is used.

9. Modify string related interfaces

(1)push_back

push_back appends a single character to the end of the string.

[Figure: reference documentation for push_back]

The implementation is as follows:

		// Append a character
		void Young::String::push_back(char c)
		{
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : _capacity * 2);
			}
		
			_str[_size++] = c;
			_str[_size] = '\0';
		}

Before appending, we check whether the capacity is full and grow it if so: the capacity is doubled, or set to 4 if it is currently 0.

(2)append

append adds a string to the end of the current string.

[Figure: reference documentation for append]

The documentation lists many overloads. We implement only one of them here, the overload that appends a C string (the third overload listed in the documentation). The implementation is as follows:

		// Append a string
		void Young::String::append(const char* str)
		{
			size_t len = strlen(str);

			// Grow if there is not enough space
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}

			// Copy str, including its '\0', to the end of the current string
			strcpy(_str + _size, str);
			_size += len;
		}

(3) += operator overloading

[Figure: reference documentation for operator+=]

The += operator can append a character, a string, or a string object. Here we implement appending a character and appending a C string. The implementation is as follows:

		// Append a character
		Young::String& Young::String::operator+=(char c)
		{
			push_back(c);
		
			return *this;
		} 

Since push_back and append are already implemented, we only need to reuse them.

		// Append a string
		Young::String& Young::String::operator+=(const char* str)
		{
			append(str);
		
			return *this;
		}

[Figure: usage of push_back, append, and the two += overloads, together with stream insertion and extraction]

(4)insert

insert inserts the character ch or the string str at position pos. We implement both overloads as follows:

		// Insert a character
		Young::String& Young::String::insert(size_t pos, char ch)
		{
			// pos == _size is allowed: it inserts at the end
			assert(pos <= _size);

			// Grow if full
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : _capacity * 2);
			}

			// Shift [pos, _size] (including '\0') one slot to the right
			size_t end = _size + 1;
			while (end > pos)
			{
				_str[end] = _str[end - 1];
				end--;
			}

			// Store the new character
			_str[pos] = ch;
			_size++;

			return *this;
		}

Insert string:

		Young::String& Young::String::insert(size_t pos, const char* str)
		{
			// pos == _size is allowed: it inserts at the end
			assert(pos <= _size);

			// Grow if the inserted string would not fit
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}

			// Shift [pos, _size] (including '\0') len slots to the right;
			// memmove is safe for overlapping ranges
			memmove(_str + pos + len, _str + pos, _size - pos + 1);

			// Copy the new characters into the gap (without the terminating '\0')
			memcpy(_str + pos, str, len);
			_size += len;

			return *this;
		}

(5)erase

erase removes len characters starting at position pos. If len is not given, everything up to the end is removed by default.

For the default, we define a static constant npos of the unsigned type size_t in the class declaration and initialize it with -1. Because the type is unsigned, -1 converts to the largest value size_t can hold, so using npos as the default length means "to the end". Note that when declarations and definitions are written separately, a default argument can only be given in the declaration.

The implementation is as follows:

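		Young::String& Young::String::erase(size_t pos, size_t len)
		{
			assert(pos < _size);

			// Erase to the end (written as len >= _size - pos so that pos + len cannot overflow)
			if (len == npos || len >= _size - pos)
			{
				_str[pos] = '\0';
				_size = pos;
			}

			// Erase len characters
			else
			{
				// Source and destination overlap, so use memmove rather than strcpy;
				// the + 1 moves the terminating '\0' as well
				memmove(_str + pos, _str + pos + len, _size - pos - len + 1);
				_size -= len;
			}

			return *this;
		}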

[Figure: usage of insert and erase]

10. Interface for operating strings

(1)c_str

c_str returns the underlying character array as a const char*. Implementation:

		const char* Young::String::c_str() const
		{
			return _str;
		}

(2)find

find searches for a character or a string starting at position pos. If pos is not given, the search starts at index 0 by default. The implementation is as follows:

Find characters:

		size_t Young::String::find(char ch, size_t pos) const
		{
			assert(pos < _size);
		
			for (size_t i = pos; i < _size; i++)
			{
				if (_str[i] == ch)
				{
					// Return the index
					return i;
				}
			}
		
			return npos;
		}

Find a string:

		size_t Young::String::find(const char* str, size_t pos) const
		{
			assert(pos < _size);
			assert(str);
		
			const char* ret = strstr(_str + pos, str);
			if (ret == nullptr)
				return npos;
		
			// The pointer difference is the index of the match
			return ret - _str;
		}

strstr is a library function that searches for a substring. It returns a pointer to the beginning of the first match if one is found, and a null pointer otherwise.

(3)substr

substr returns the substring of len characters starting at position pos. If len is not given, it defaults to npos, which means up to the end of the string. The implementation is as follows:

		Young::String Young::String::substr(size_t pos, size_t len) const
		{
			assert(pos < _size);

			// Temporary object that collects the result
			String tmp;

			// Clamp len so the substring stops at the end of the string
			// (comparing with _size - pos avoids overflow in pos + len)
			if (len == npos || len > _size - pos)
			{
				len = _size - pos;
			}

			// end is the index one past the last character of the substring
			size_t end = pos + len;

			// Reserve room for len characters
			tmp.reserve(len);

			// Copy the substring
			for (size_t i = pos; i < end; i++)
			{
				tmp += _str[i];
			}

			return tmp;
		}

find and substr are often used together. A typical use is splitting a URL into its protocol, domain name, and resource name.

[Figure: splitting a URL with find and substr]

11. Comparison operator overloading

As before, we only need to implement the > and == operators directly; all the others can reuse these two. The implementation is as follows:

		bool Young::String::operator>(const String& s) const
		{
			return strcmp(_str, s._str) > 0;
		}
		
		bool Young::String::operator==(const String& s) const
		{
			return strcmp(_str, s._str) == 0;
		}
		
		bool Young::String::operator>=(const String& s) const
		{
			return *this > s || *this == s;
		}
		
		bool Young::String::operator<(const String& s) const
		{
			return !(*this >= s);
		}
		
		bool Young::String::operator<=(const String& s) const
		{
			return !(*this > s);
		}
		
		bool Young::String::operator!=(const String& s) const
		{
			return !(*this == s);
		}

[Figure: usage of the comparison operators and their output]

Origin blog.csdn.net/YoungMLet/article/details/132198723