[C++] Learn about STL from string

1. Getting to know STL first

1. What is STL

STL (standard template library): an important part of the C++ standard library. It is not only a library of reusable components but also a software framework that brings together data structures and algorithms.

2. The version of STL

  • Original version (HP version)

    The original version was completed by Alexander Stepanov and Meng Lee at Hewlett-Packard Labs. In the spirit of open source, they allowed anyone to use, copy, modify, distribute, and commercially use the code free of charge, on the sole condition that derived code be as open as the original. The HP version is the ancestor of all STL implementations.

  • PJ version

    Developed by P. J. Plauger and derived from the HP version. It is used by Microsoft Visual C++ and cannot be published or modified. Drawbacks: low readability and unusual identifier naming.

  • RW version

    Developed by Rogue Wave Software and derived from the HP version. It is used by C++ Builder and cannot be published or modified. Readability is average.

  • SGI version

    Developed by Silicon Graphics Computer Systems, Inc. and derived from the HP version. It is used by GCC (Linux), is highly portable, and may be published, modified, or even sold. Its naming and coding style make it very readable. When we later need to read STL source code, this is the version we will mainly refer to.

3. The six major components of STL

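image-20230413011601095

The six components are containers, algorithms, iterators, functors (function objects), adapters, and allocators.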

For learning the STL, a recommended reference website is cplusplus.com

2. string

Now we formally begin studying the STL. While learning it, we focus on two things:

First, become familiar with the commonly used interfaces of the STL class templates.

Second, try to write our own simplified implementations of those class templates.

Note: Along the way we may hit problems we cannot resolve on our own, and then we will consult the library source code. Because each STL implementation is written differently, we settle on the SGI version as our reference, together with Hou Jie's book on STL source code analysis, when writing our implementations.

1. string class template

When we open the website recommended above and search for string, we find that the string we normally use is actually basic_string<char>, that is, the class obtained by instantiating the class template basic_string with char.

image-20230413164133833

basic_string is a class template that can instantiate different classes by passing in different type parameters.

image-20230413164426971

So why is string built from a class template? In C, isn't storing char data enough for a string?

The reason is character encoding. All data in a computer is stored as 0s and 1s. To turn that data into something humans can read, people map specific bit sequences to specific characters; that mapping is an encoding. The earliest widely used encoding, ASCII, maps bit sequences to control characters, punctuation, digits, and the 26 English letters in upper and lower case.

Common encodings today:

  • ASCII

    American Standard Code for Information Interchange, a mapping between stored numeric values and text symbols. Standard ASCII is a 7-bit code with only 128 characters; 8-bit extensions bring the total to 256.

  • Unicode

    Unicode was created to overcome the limitations of traditional character encoding schemes. It assigns a single, unique code point to every character of every language. Those code points are stored using one of the encoding forms utf-8, utf-16, or utf-32.

    utf-8 is compatible with ASCII, is the most commonly used of the three, and saves space for text that is mostly ASCII.

  • gbk

    gbk is an extension of the Chinese national standard GB2312, designed for Chinese text. Chinese characters take two bytes each, while ASCII characters still take one.
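Because different encodings use code units of different widths, the same template can be instantiated with different character types. The sketch below checks, under the assumption of a C++11 compiler, that the familiar string types are just aliases for basic_string instantiations. The helper code_units and the function demo_char_types are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// The standard string types are aliases for instantiations of basic_string.
static_assert(std::is_same<std::string,    std::basic_string<char>>::value,     "string");
static_assert(std::is_same<std::wstring,   std::basic_string<wchar_t>>::value,  "wstring");
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value, "u16string");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value, "u32string");

// Hypothetical helper: number of code units stored, for any character type.
template <class CharT>
std::size_t code_units(const std::basic_string<CharT>& s)
{
    return s.size();
}

void demo_char_types()
{
    std::string    narrow = "abc";   // char code units (one byte each)
    std::u16string utf16  = u"abc";  // char16_t code units, intended for UTF-16
    std::u32string utf32  = U"abc";  // char32_t code units, intended for UTF-32

    // Same interface, same length in code units, different element type.
    assert(code_units(narrow) == 3);
    assert(code_units(utf16)  == 3);
    assert(code_units(utf32)  == 3);
    assert(sizeof(char) == 1);
}
```

Note that size() counts code units, not characters: a non-ASCII character stored as utf-8 in a std::string occupies several char elements.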

2. The constructor of the string class

The most essential member of any class is its constructor, so let's start with the constructors that string provides.

image-20230413210421519

The above are all the constructors of string. Next, let's experiment with code:

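#include <iostream>
#include <string>
using namespace std;	// the examples in this post assume these three lines

void Test_Construct()
{
	char str[] = "hello string";	// a C-style string
	string s1;			// default constructor, no arguments: an empty string holding only the terminating \0
	string s2(str);			// construct a string object from a C-style string
	string s3(s2);			// copy-construct a string object from s2
	string s4(s3, 2, 5);		// substring of s3 starting at index 2 with length 5: "llo s"
	string s5(str, 8);		// the first 8 characters of the C-style string: "hello st"
	string s6(10, 'a');		// 10 copies of the character 'a'
	auto first = s2.begin();
	auto last = s2.end();
	string s7(first, last);		// construct from the iterator range [first,last)
	cout << "s1: " << s1 << endl;
	cout << "s2: " << s2 << endl;
	cout << "s3: " << s3 << endl;
	cout << "s4: " << s4 << endl;
	cout << "s5: " << s5 << endl;
	cout << "s6: " << s6 << endl;
	cout << "s7: " << s7 << endl;
}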

Running the above code, the result is:

image-20230413212541978

Some of the constructor prototypes above use a name we have not seen before: npos. What is npos? Let's check the documentation.

image-20230413212851969

npos is a public static member constant of the class. It is defined with the value -1, but because its type is the unsigned type size_t, that value wraps around to the largest value the type can represent. When npos is passed as a length, we can read it as "until the end of the string".

Since npos is a member constant of the string class, it must be qualified with the class scope when used, i.e. string::npos.
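A minimal sketch of the two roles npos plays: as a length meaning "to the end", and as the "not found" result of find. The function names after_first_space and demo_npos are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the part of s after the first space,
// or the whole string if it contains no space.
std::string after_first_space(const std::string& s)
{
    std::size_t pos = s.find(' ');
    if (pos == std::string::npos)   // find reports "not found" with npos
        return s;
    return s.substr(pos + 1, std::string::npos);   // npos as a length: "to the end"
}

void demo_npos()
{
    // npos is -1 converted to the unsigned size_type, i.e. its largest value.
    assert(std::string::npos == static_cast<std::string::size_type>(-1));

    std::string s = "hello string";
    // The substring constructor uses npos as the default length.
    assert(std::string(s, 6, std::string::npos) == "string");
    assert(std::string(s, 6) == "string");
    assert(after_first_space(s) == "string");
}
```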

3. string internal data access

image-20230413233845379

string provides four interfaces for element access. Two of them, front and back, were added in C++11; as their names suggest, they return the first and last characters of the string. They are not used often, so I won't go into detail here. We mainly focus on at and operator[].

image-20230413234409535

image-20230413234429978

Both interfaces return the character at a given position; operator[] is an overload of the [] operator. Each has a non-const version and a const version. The const version is the one called on a const string, and it returns a const reference, so a const object cannot be modified through it (that is, access rights are never widened).

void Test_Element()
{
	string s = "hello string";
	cout << s.at(4) << endl;	// 'o', via at
	cout << s[4] << endl;		// 'o', via operator[]
}

image-20230414000051982
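One practical difference between the two interfaces is worth a sketch: at checks the index and throws std::out_of_range when it is not less than size(), while operator[] performs no such check. The names at_throws and demo_element_access below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
bool at_throws(const std::string& s, std::size_t pos)
{
    try {
        (void)s.at(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

void demo_element_access()
{
    std::string s = "hello string";

    s[0] = 'H';        // non-const overload returns char&, so we can write
    s.at(6) = 'S';     // at also returns char& on a non-const string
    assert(s == "Hello String");

    const std::string& cs = s;
    assert(cs[4] == 'o');          // const overload returns const char& (read-only)

    assert(!at_throws(s, 11));     // last valid index
    assert(at_throws(s, 12));      // index == size(): at throws
}
```

If an index may come from untrusted input, at is the safer choice; in a loop whose bounds are already checked, operator[] avoids the extra test.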

4. String traversal

For string traversal, we have the following methods:

  1. operator[]
  2. range-based for
  3. iterator

void Test_Element2()
{
	string s = "0123456789";
	// operator[]
	for (size_t i = 0; i < s.size(); ++i)
	{
		cout << s[i] << " ";
	}
	cout << endl;
	// range-based for
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
	// iterator
	string::iterator it = s.begin();
	while (it != s.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
}

image-20230414001152126

5. Iterator of string class

When traversing, we used an iterator to walk through the string. So what is an iterator, and why do iterators exist?

Iterators are a uniform way to access every container that supports traversal. They behave like pointers, but they are not necessarily pointers underneath.

For string, a raw pointer already supports everything an iterator needs, so an implementation can simply use a pointer to the element type as the string iterator. Other implementations may wrap that pointer in a class.
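To illustrate the "uniform access" point, here is a small sketch in which one function template traverses a string, a vector, and a list with exactly the same iterator code. The name join_chars is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: concatenates the char elements of any container,
// using nothing but begin(), end(), ++ and * on its iterators.
template <class Container>
std::string join_chars(const Container& c)
{
    std::string out;
    for (auto it = c.begin(); it != c.end(); ++it)
        out += *it;
    return out;
}

void demo_generic_traversal()
{
    std::string       s = "abc";
    std::vector<char> v = {'a', 'b', 'c'};
    std::list<char>   l = {'a', 'b', 'c'};   // nodes are not contiguous in memory

    assert(join_chars(s) == "abc");
    assert(join_chars(v) == "abc");
    assert(join_chars(l) == "abc");
}
```

A list iterator cannot be a raw pointer, because ++ has to follow a link to the next node, yet the calling code looks identical.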

image-20230413213824032

string provides many different iterator interfaces, which we can classify as follows:

1. Forward iterator

begin and end: begin returns the position of the first character, and end returns the position one past the last valid character, so the iterator range is half-open: *[begin,end)*

image-20230414001904042

image-20230414001919818

Both begin and end also have a const overload, which is selected for const strings and returns a const_iterator.

2. Reverse iterator

A reverse iterator is used in exactly the same way as a forward iterator, except that the data is traversed in reverse order: rbegin refers to the last character, and ++ moves toward the front.

void Test_Iterator()
{
	string s = "0123456789";
	cout << "forward iterator" << endl;
	string::iterator it1 = s.begin();
	while (it1 != s.end())
	{
		cout << *it1 << " ";	// 0 1 2 ... 9
		++it1;
	}
	cout << endl;
	cout << "reverse iterator" << endl;
	string::reverse_iterator it2 = s.rbegin();
	while (it2 != s.rend())
	{
		cout << *it2 << " ";	// 9 8 7 ... 0
		++it2;
	}
	cout << endl;
}

image-20230414002535594

3. const iterator and const reverse iterator

The four interfaces cbegin, cend, crbegin, and crend were added in C++11 to make const iteration explicit. Since begin, end, rbegin, and rend already have const overloads, these four are rarely needed, so I won't cover them in depth. They are used exactly like the previous ones.
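A short sketch of what the c-prefixed interfaces buy you: on a non-const string, begin returns an iterator while cbegin always returns a const_iterator. The function names count_char, reversed, and demo_const_iterators are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper: counts occurrences of target using const iterators.
std::size_t count_char(const std::string& s, char target)
{
    std::size_t n = 0;
    for (std::string::const_iterator it = s.cbegin(); it != s.cend(); ++it)
        if (*it == target)
            ++n;
    return n;
}

// Hypothetical helper: builds a reversed copy from the const reverse range.
std::string reversed(const std::string& s)
{
    return std::string(s.crbegin(), s.crend());
}

void demo_const_iterators()
{
    std::string s = "0123456789";

    // Even on a non-const string, cbegin() yields a read-only iterator.
    static_assert(std::is_same<decltype(s.begin()),
                               std::string::iterator>::value, "begin");
    static_assert(std::is_same<decltype(s.cbegin()),
                               std::string::const_iterator>::value, "cbegin");

    assert(count_char(s, '5') == 1);
    assert(reversed(s) == "9876543210");
}
```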

6. Capacity-related interfaces of string

image-20230414002824152

Of all these interfaces, the most commonly used are:

  • size: returns the length of the string
  • resize: changes the length of the string to n; extra characters are cut off, and new positions are filled with a given character (or '\0' by default)
  • capacity: returns the capacity of the string
  • reserve: requests a change in capacity. If n is greater than the current capacity, storage for at least n characters is allocated, the existing data is copied over, and the old storage is released. If n is less than the current capacity, typical implementations leave the string unchanged (before C++20 this is a non-binding shrink request; since C++20 it has no effect).
  • empty: returns whether the string is empty

The remaining interfaces are not commonly used; a general understanding is enough.

void Test_capacity()
{
	string s = "0123456789";
	cout << "size: " << s.size() << endl;		// 10
	cout << "capacity: " << s.capacity() << endl;	// implementation-dependent, at least 10
	s.reserve(20);
	cout << "capacity: " << s.capacity() << endl;	// at least 20
	s.resize(5);					// keeps only "01234"
	cout << "size: " << s.size() << endl;		// 5
	if (!s.empty())
	{
		string::iterator it1 = s.begin();
		while (it1 != s.end())
		{
			cout << *it1 << " ";
			++it1;
		}
		cout << endl;
	}
	else
	{
		cout << "string is empty" << endl;
	}
}

image-20230414003555655
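The difference between reserve (capacity only) and resize (length and contents) is easy to mix up, so here is a minimal sketch of it. Only guarantees that hold on any conforming implementation are asserted; exact capacity values vary between libraries. The names padded and demo_capacity are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: pads s on the right with fill up to the given width.
std::string padded(std::string s, std::size_t width, char fill)
{
    if (s.size() < width)
        s.resize(width, fill);   // grow: new positions are set to fill
    return s;
}

void demo_capacity()
{
    std::string s = "0123456789";

    s.reserve(100);                 // changes capacity only
    assert(s.capacity() >= 100);    // "at least n", possibly more
    assert(s.size() == 10);         // length is untouched
    assert(s == "0123456789");      // contents are untouched

    s.resize(12, 'x');              // grow and fill with 'x'
    assert(s == "0123456789xx");

    s.resize(5);                    // shrink: extra characters are dropped
    assert(s == "01234");

    s.resize(7);                    // grow with the default fill '\0'
    assert(s.size() == 7 && s[5] == '\0' && s[6] == '\0');
    assert(s.capacity() >= s.size());
}
```

Calling reserve before a long series of appends is a common way to avoid repeated reallocation.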

7. Modification-related interfaces of string

image-20230414003646756

Among them, the commonly used ones are:

  • operator+=: appends to the string. It has three overloads: appending a string (reusing append), appending a C-style string (reusing append), and appending a single character (reusing push_back)
  • insert: inserts a character or string at a given position
  • erase: deletes one character, or a substring of length len, starting at a given position

void Test_Modify()
{
	string s = "abcdefg";
	cout << s << endl;		// abcdefg
	s += 'h';			// append a single character
	cout << s << endl;		// abcdefgh
	s += "ijklm";			// append a C-style string
	cout << s << endl;		// abcdefghijklm
	s.insert(5, 1, 'A');		// insert one 'A' at index 5
	cout << s << endl;		// abcdeAfghijklm
	s.erase(5, 1);			// erase one character at index 5
	cout << s << endl;		// abcdefghijklm
}

image-20230414004529694
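As a small sketch of how these interfaces relate, the code below builds the same string once with operator+= and once with append and push_back, then tries insert and erase with a string argument and a defaulted length. Whether operator+= literally calls append internally is an implementation detail; only the observable results are checked here. All function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Builds "abcdefgh" with the three operator+= overloads.
std::string build_with_operator()
{
    std::string s = "abc";
    s += std::string("de");   // append a string object
    s += "fg";                // append a C-style string
    s += 'h';                 // append a single character
    return s;
}

// Builds the same string with the named member functions.
std::string build_with_named()
{
    std::string s = "abc";
    s.append(std::string("de"));
    s.append("fg");
    s.push_back('h');
    return s;
}

void demo_modify()
{
    assert(build_with_operator() == "abcdefgh");
    assert(build_with_named() == build_with_operator());

    std::string s = "abcdefgh";
    s.insert(3, "XYZ");       // insert a C-style string at index 3
    assert(s == "abcXYZdefgh");
    s.erase(3, 3);            // erase 3 characters starting at index 3
    assert(s == "abcdefgh");
    s.erase(5);               // len defaults to npos: erase to the end
    assert(s == "abcde");
}
```

Inserting or erasing anywhere but the end has to shift the following characters, so these operations are best kept out of hot loops.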

8. Other interfaces

  • c_str: returns a const char* to the contents as a null-terminated C-style string. Many C interfaces (for example fopen or printf) only accept C strings and know nothing about the string type, which is why this interface is provided
  • find: searches the string for a character or substring, starting from a given position (0 by default); returns the index of the first match, or npos if there is none
  • getline: reads from an input stream into a string until a newline is encountered. cin >> s stops at the first whitespace, so the rest of the line cannot end up in the same string; getline avoids that
  • operator>> and operator<<: overloads of stream extraction and stream insertion, so that string works with cin and cout
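The sketch below exercises these interfaces together. It uses std::istringstream instead of cin so that the input is fixed and the example can run unattended; that substitution, along with the names first_line, first_word, and demo_other, is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: reads a whole line, spaces included.
std::string first_line(std::istream& in)
{
    std::string line;
    std::getline(in, line);   // stops at '\n', which is consumed but not stored
    return line;
}

// Hypothetical helper: reads one whitespace-delimited word.
std::string first_word(std::istream& in)
{
    std::string word;
    in >> word;               // stops at the first whitespace
    return word;
}

void demo_other()
{
    std::string s = "hello string";

    // c_str gives a null-terminated view usable by C functions such as strlen.
    assert(std::strlen(s.c_str()) == s.size());

    assert(s.find("string") == 6);              // substring search
    assert(s.find('l', 3) == 3);                // search starts at index 3
    assert(s.find('z') == std::string::npos);   // not found

    std::istringstream a("hello string\nnext");
    std::istringstream b("hello string\nnext");
    assert(first_line(a) == "hello string");
    assert(first_word(b) == "hello");
}
```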

Final notes:

1. Since this is our first contact with the STL, I explained many member functions in detail here. For later STL containers I will skip functions that are repetitive or very similar.
2. One blog post cannot fully explain how to use a class. You still need to learn through practice and read the documentation.

Origin blog.csdn.net/weixin_63249832/article/details/130143704