[C++] Introduction to the basic interfaces of the string class (read the English documentation often)

string directory

Table of contents

 In a hurry? Just read the content under this heading!

1. Introduction to STL

1.1What is STL

1.2STL version

1.3STL six components

1.4 Importance of STL

1.5 How to learn STL

2. What is string? ? (Essentially a class)

3. String class template (What? String actually uses templates??)

3. Three constructions of string (copy construction is also a construction)

4. Three traversal methods of string

4.1 [ ] + subscript (overloaded function of operator[ ])

4.2 Range-based for loop (C++11 syntactic sugar) 

4.3 Iterator traversal

5. String iterator (an iterator is a class that encapsulates a pointer and simulates the operation of a pointer)

5.1begin interface and end interface

5.2const iterator

5.3 The difference between iterators and pointers

6. Capacity of string (resize, reserve, clear)

6.1resize(Resizes the string to a length of n characters) 

6.2reserve reserved space (generally used for expansion) (the size will not be changed)

6.3clear (clear string)

7. Element Access of string (element access)

8. String modification (+=, append, insert, erase, swap)

8.1operator+= (the most awesome tail insertion)

8.2append (append data at the end, but no += awesome)

8.3insert (insert at any position) (generally not used, a waste of time)

8.4erase (delete at any position)

8.5 swap (is a parameter, different from the common one)

9. String operation functions (c_str, find, substr)

9.1c_str (in order to make the interface between C++ and C language, it is regarded as an embassy)

9.2find (search string, also contains npos)

9.3rfind (rfind is very comfortable for finding file suffixes)

9.4find_first_of (find the first occurrence of a character subset)

 9.5substr (truncate string and return)

10. Overloaded functions of non-member functions (getline)


 In a hurry? Just read the content under this heading!

The list below covers the most common member functions and operators of the `std::string` class in C++, with a brief description and a usage example for each:

1. `begin`: Get an iterator pointing to the first character of the string

std::string::iterator it = str.begin();

2. `end`: Get an iterator pointing one position past the last character of the string

std::string::iterator it_end = str.end();

3. `size` (or `length`): Get the size of the string (capacity returns the size of the space opened up by the string)

std::string str = "Hello";
std::cout << "Size: " << str.size() << std::endl;

4. `empty`: Determine whether it is empty

if (str.empty()) {
    std::cout << "String is empty." << std::endl;
}

5. `operator[]`: takes the nth element, equivalent to an array

char firstChar = str[0];

6. `c_str`: Get C-style const char* string

const char* cString = str.c_str();

7. `data`: Get the string content address

const char* data = str.data();

8. `operator+=`: string end insertion operator

str1 += str2;

9. `find`: Find the position of a substring in a string

size_t found = str.find("World");

10. `substr`: get substring

std::string sub = str.substr(6, 5);

11. `compare`: compare strings

int result = str1.compare(str2);
if (result == 0) {
    std::cout << "Strings are equal." << std::endl;
}

12. `operator+`: string concatenation

std::string result = str1 + str2;

13. `operator==`: Determine whether they are equal

if (str1 == str2) {
    std::cout << "Strings are equal." << std::endl;
}

14. `operator!=`: Determine whether it is not equal to

if (str1 != str2) {
    std::cout << "Strings are not equal." << std::endl;
}

15. `operator<`: Determine whether it is less than

if (str1 < str2) {
    std::cout << "str1 is less than str2." << std::endl;
}

16. `insert`: insert characters

str.insert(3, " inserted");

17. `resize`: Change the size of the string
   - Description: The `resize` function changes the length of a string, either growing or shrinking it. If the new size is larger than the current size, the added characters are set to '\0' unless a fill character is passed. (There are three cases, covered in section 6.1.)

 std::string str = "Hello";
     str.resize(8); // grow the string
     std::cout << str << std::endl; // prints "Hello" followed by three '\0' characters
     
     str.resize(3); // shrink the string
     std::cout << str << std::endl; // prints "Hel"

18. `reserve`: Reserve space
   - Description: The `reserve` function is used to reserve storage space for strings to avoid reallocation of memory in subsequent operations. This is useful for reducing the overhead of dynamic memory allocation.

 std::string str;
     str.reserve(100); // reserve room for at least 100 characters
     str = "Hello, World!"; // fits in the reserved space, so no reallocation is needed

Please note that the above examples are for reference only


1. Introduction to STL

1.1What is STL

STL (Standard Template Library): It is an important part of the C++ standard library. It is not only a library of reusable components, but also a software framework of data structures and algorithms.

1.2STL version

Original version:
The original version was completed by Alexander Stepanov and Meng Lee at HP Labs. In the spirit of open source, they stated that anyone may use, copy, modify, distribute, and commercially use the code free of charge. The only condition is that derived versions also be open source, like the original. The HP version is the ancestor of all STL implementations.
PJ version:
Developed by P. J. Plauger, inherited from the HP version, and adopted by Windows Visual C++. It cannot be made public or modified. Drawbacks: low readability and odd symbol naming.
RW version:
Developed by Rogue Wave Software, inherited from the HP version, and adopted by C++Builder. It cannot be made public or modified, and its readability is average.
SGI version:
Developed by Silicon Graphics Computer Systems, Inc., inherited from the HP version, and adopted by GCC (Linux). It is portable, and it can be made public, modified, or even sold. Its naming and programming style make it very readable. When we study STL later and need to read some source code, this is the main version we refer to.

1.3STL six components

STL has six major components: containers, adapters, iterators, allocators, algorithms, and functors.

1.4 Importance of STL

There is a saying online: "If you don't understand STL, don't say you know C++." STL is one of the finest parts of C++. With it, many underlying data structures and algorithms do not need to be reinvented and can be used directly, which greatly improves problem-solving and development efficiency. For this reason, STL is a key topic in written tests and interviews, and it is used constantly at work.

1.5 How to learn STL

A good site for looking up the English documentation is cplusplus.com - The C++ Resources Network

(Note: since its update, cplusplus requires registration before it can be used. We can click "Legacy version" in the upper right corner to return to the old version; personally, I find the old version more pleasant to use.) cplusplus is well suited to beginners: any function interface or function parameter we run into while learning STL can be looked up there.

Read good C++ books: C++ is a difficult language with a lot of details. I still occasionally read STL Source Code Analysis.

If you want the electronic version, you can message me privately! !


2. What is string? (Essentially a class)

In C, a string is a sequence of characters terminated by '\0'. For convenience, the C string.h header provides a set of library functions, but these functions are separate from the strings they operate on, which does not fit object-oriented thinking, and the underlying storage has to be managed by the user. A small mistake can easily lead to out-of-bounds access.

For these reasons, the C++ standard library provides the string class. It offers a variety of interfaces, such as the six default member functions of a class, insertion and deletion, operator overloads, and so on. We can instantiate string objects and then operate on them through these interfaces.

A simplified sketch of the string class layout looks roughly like this (real implementations differ):

namespace std {
	template<class T>
	class string {
	public:
		// the various member functions of string
		
	private:
		T* _str;
		size_t _size;
		size_t _capacity;
		// other members of string, such as npos
	};
}

Note: Strictly speaking, string is not part of STL, because string appeared earlier than STL (which is why both .length() and .size() exist for getting the length). However, string's interfaces are very similar to those of the STL containers, so we can treat string as one of them and study them together.


3. String class template (What? String actually uses templates?)

When we open the documentation and search for string, we find that string is actually the class template basic_string instantiated with the character type char (it pays to read the English documentation patiently).

In effect, it is a dynamically growing character array.

So what is basic_string?

basic_string is a class template that can be instantiated with any character type.

I really couldn't understand the documentation at first, so I used translation software and found that the key word was "generic"! What are generics, you ask? Check out my template blog post: Template Beginner

Therefore, the string we usually use is essentially basic_string<char>. We do not need to instantiate it explicitly ourselves, because the library already provides a typedef:

typedef basic_string<char, char_traits<char>, allocator<char>> string;
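
To make the typedef above concrete, here is a minimal sketch that checks the alias at compile time. The helper name `wide_length_demo` is my own invention for illustration; it only shows that the same template can be instantiated with another character type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

// Compile-time check: std::string names the same type as basic_string<char>
// with the default traits and allocator arguments spelled out.
static_assert(
    std::is_same<std::string,
                 std::basic_string<char, std::char_traits<char>,
                                   std::allocator<char>>>::value,
    "std::string should be basic_string<char>");

// Hypothetical helper: the same template instantiated with wchar_t
// gives std::wstring, which offers the same member functions.
std::size_t wide_length_demo() {
    std::basic_string<wchar_t> w = L"abc";
    std::wstring same_type = w;  // std::wstring is basic_string<wchar_t>
    assert(w == same_type);
    return same_type.size();
}
```

Calling `wide_length_demo()` returns 3, the number of wide characters stored.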


3. Three constructions of string (copy construction is also a construction)

string provides many constructors. We only need to master the most commonly used ones and can look up the rest in the documentation when needed (the three marked "key" are the most common):

(constructor) function name      Description
string() (key)                   Constructs an empty string object, i.e. an empty string
string(const char* s) (key)      Constructs a string object from a C-string
string(size_t n, char c)         Constructs a string object containing n copies of the character c
string(const string& s) (key)    Copy constructor
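
Here is a small sketch that uses each constructor from the table once. The function name `constructor_demo` and the '|' separator are assumptions of mine, chosen only to make the results easy to inspect.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds one string with each constructor from the table
// and returns them joined with '|'.
std::string constructor_demo() {
    std::string empty;             // string(): empty string
    std::string from_c("hello");   // string(const char* s): from a C-string
    std::string repeated(3, 'x');  // string(size_t n, char c): "xxx"
    std::string copy(from_c);      // string(const string& s): copy constructor

    assert(empty.empty());
    assert(copy == from_c);
    return empty + "|" + from_c + "|" + repeated + "|" + copy;
}
```

The returned value is "|hello|xxx|hello": the first field is empty because the default constructor produces an empty string.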


4. Three traversal methods of string

4.1 [ ] + subscript (overloaded function of operator[ ])

We will not show how to use [ ] just yet. Let's first look at how the overload is implemented and what it gains us.

With a raw array we can also use [ ] to access data, but out-of-bounds reads and writes are not reliably detected. Below is an overloaded square-bracket function for an array class that we wrote ourselves:

T& operator[](int index) {
        // use assert to check for out-of-bounds access
        assert(index >= 0 && index < size);
        return data[index];
    }

This keeps the bounds check inside the class, which strengthens encapsulation. The string class overloads square brackets as well.

A typical teaching implementation of string's operator[] is similar to the one above, with the assert condition changed to:

index >= 0 && index < data.size()

 In general, we traverse with [ ] like this:

void test_string2()
{
	string s1("1234");
	// Goal: add 1 to every character in s1

	// Adding 1 to every character requires a traversal; below we learn three ways to traverse a string.
	// 1. subscript + []
	for (size_t i = 0; i < s1.size(); i++)
	{
		s1[i]++;// essentially s1.operator[](i)++
	}
	cout << s1 << endl;// GB2312 is compatible with ASCII, so the result after ++ is 2345.
}

4.2 Range-based for loop (C++11 syntactic sugar) 

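void test_string2()
{
	string s1("1234");
	// 2. range-based for
	for (auto& ch : s1)// auto deduces the element type of s1; taking each element by reference lets us modify s1 in place.
	{
		ch++;
	}
	cout << s1 << endl;
	// For this kind of task, range-based for looks more convenient
}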


4.3 Iterator traversal

[ ] only works for some containers. If you want something universal, look at iterators!

Iterators look and behave much like pointers, but they are not the same thing: an iterator may or may not be a raw pointer underneath. When declaring one, you need to qualify it with the class scope; for example, it1 in the code below is declared with the iterator type from string's class scope. An iterator such as it1 can both read and modify the contents of s1. Besides string, a more awkward container such as list can use iterators in the same way, which shows how universal iterators are.

list<int> lt;
	list<int>::iterator ltit = lt.begin();
	while (ltit != lt.end())
	{
		cout << *ltit << " ";
		ltit++;
	}
	cout << endl;

So when we traverse the string, we use the begin and end interfaces:

int main()
{
	string s1("i love gao_peng_yan");
	string::iterator it1=s1.begin();
	while(it1!=s1.end())
	{
		cout << *it1 ;
		it1++;// don't forget to advance the iterator, otherwise the loop never moves forward
	}

	return 0;
}

It would be unfair to tuck something as important as iterators under a traversal subsection, so please read on!


5. String iterator (an iterator is a class that encapsulates a pointer and simulates the operation of a pointer)

Iterators are universal in C++ and work with most containers. You can think of them as pointers, although not every iterator is implemented as a raw pointer underneath:

typedef char* iterator;  // a simplified way to think of the iterator in string

In actual code:

string::iterator it1=s1.begin();// (the type of it1 belongs to string's class scope)

function name    Description
begin()          Returns an iterator to the first character of the string
end()            Returns an iterator to the position just past the last character (where '\0' sits)
rbegin()         Returns a reverse iterator to the last character of the string
rend()           Returns a reverse iterator to the position just before the first character
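
As a quick illustration of the reverse iterators in the table above, here is a minimal sketch. The helper name `reversed_copy` is hypothetical; it simply walks from rbegin() to rend() and collects what it sees.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: walks s from rbegin() to rend() and collects the
// characters, which yields the string in reverse order.
std::string reversed_copy(const std::string& s) {
    std::string out;
    for (std::string::const_reverse_iterator it = s.rbegin(); it != s.rend(); ++it) {
        out += *it;  // ++ on a reverse iterator moves toward the front of s
    }
    assert(out.size() == s.size());
    return out;
}
```

For example, `reversed_copy("abc")` gives "cba", because rbegin() starts at the last character and rend() marks the position before the first one.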

There is a common misunderstanding about where the end iterator is: it does not point to the last character, but to the position just past it.

5.1begin interface and end interface

When traversing, we use the begin and end interfaces. The documentation shows that both return iterators: begin() returns an iterator to the first character, and end() returns an iterator to the position just past the last character, which is where the terminating \0 sits (end() itself should not be dereferenced). In use, an iterator resembles a pointer: dereferencing it gives the corresponding character, which you can then operate on.


5.2const iterator

When we put const in front of the string, the code failed to compile: a const object was being used with the iterator type meant for non-const objects. The documentation shows that there is indeed a const iterator, const_iterator.

 So we change the iterator type to string::const_iterator, and the code compiles!

Xiaoyang's note:

const string::iterator it1=s1.begin();

string::const_iterator it1=s1.begin();

These are two completely different declarations:

The first means the iterator itself cannot be modified (like a const pointer). We generally don't write it this way, because an iterator is supposed to move forward.

The second means the element the iterator refers to cannot be modified through it.
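
Here is a minimal sketch of traversing a const string with const_iterator. The helper name `count_char` is my own; the point is only that reading through the iterator is allowed while writing is not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of c in a const string.
// begin() on a const string returns a const_iterator, so *it is read-only.
std::size_t count_char(const std::string& s, char c) {
    std::size_t n = 0;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        if (*it == c) {
            ++n;
        }
        // *it = 'x';  // would not compile: const_iterator gives read-only access
    }
    assert(n <= s.size());
    return n;
}
```

For example, `count_char("i love c++", '+')` returns 2. The iterator itself still moves with ++, which is exactly what a `const string::iterator` would not allow.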

5.3 The difference between iterators and pointers

Iterators and pointers are two different concepts. Although they have some similarities in some aspects, they are used in different programming contexts and have different characteristics and uses. Here are the main differences between them:

1. Purpose:
   - Iterator: An iterator is an abstract way of accessing data, usually used to traverse the elements of a collection (such as an array, list, or map). Iterators provide a general way to access a collection's elements without knowing the underlying data structure.
   - Pointer: A pointer is a variable that stores the address of an object in memory. Pointers are usually used for memory operations in low-level languages (such as C and C++), including direct access to memory addresses, dynamic memory allocation, etc.

2. Safety:
   - Iterators: Iterators are generally designed as a safer way to traverse collections, and some implementations add checks (for example, checked iterators in debug builds) against out-of-bounds access and memory errors. Safety guarantees differ between programming languages; in C++, an invalid iterator is as dangerous as an invalid pointer.
   - Pointers: When pointers are used in low-level languages, they can easily cause memory errors, such as null pointer references, out-of-bounds access, etc. Therefore, using pointers requires more caution and requires programmers to ensure safety.

3. Level of abstraction:
   - Iterators: Iterators provide a higher level of abstraction that hides the details of the underlying data structure. This makes the code more readable and maintainable, and helps reduce errors in the program.
   - Pointers: Pointers are a low-level abstraction that directly manipulates memory addresses, requiring programmers to understand the details of memory layout and data structures.

4. Language dependencies:
   - Iterators: Iterators are often used with high-level programming languages (such as Python, C#, Java, etc.) that provide built-in iterators or collection traversal mechanisms.
   - Pointers: Pointers are more common in low-level programming languages (such as C and C++), which directly support memory operations and therefore require the programmer to have a deeper understanding of computer hardware and memory management.


6. Capacity of string (resize, reserve, clear)

String provides some functions that operate on capacity:

function name    Description
size()           Returns the length of the string
capacity()       Returns the capacity of the string
empty()          Returns whether the string is empty

Note: size and capacity are two completely different function interfaces 


6.1resize(Resizes the string to a length of n characters) 

The resize function is used to adjust the size of a string. It is divided into three situations:

  • n is smaller than the size of the original string. resize shrinks the size to n, discarding the characters past position n, but does not change the capacity.
  • n is larger than the size of the original string, but not larger than its capacity. resize fills the new positions after the old size with the character c ('\0' if c is not given).
  • n is larger than the capacity of the original string. resize first expands the storage, then fills the new positions after the old size with the character c.

Case 1 truncates the string. Case 2 fills without expanding; for example, the size can be changed to 13 while the capacity stays the same. Case 3 expands and then fills.
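
The three cases above can be sketched as follows. This is only an illustration: the function name `resize_demo` and the fill characters are my own choices, and the exact capacity values depend on the standard library you use.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper walking through the three resize cases.
std::string resize_demo() {
    std::string s("Hello");

    s.resize(3);       // case 1: n < size, the string is truncated
    assert(s == "Hel");

    s.resize(5, 'x');  // case 2: n > size, the new positions are filled with 'x'
    assert(s == "Helxx");

    // case 3: n > capacity, so the storage has to grow before filling with '!'
    s.resize(s.capacity() + 10, '!');
    assert(s.capacity() >= s.size());

    return s.substr(0, 6);  // first six characters: old content plus one fill character
}
```

The function returns "Helxx!", showing that existing characters are kept and only the new tail is filled.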


6.2reserve reserved space (generally used for expansion) (the size will not be changed)

reserve is used to expand and reserve space, loosely comparable to growing a buffer with realloc in C. There are two cases:

  • n is greater than the capacity of the original string. reserve expands the capacity to at least n;
  • n is less than or equal to the capacity of the original string. The standard does not require the capacity to shrink (VS does not shrink it).

string s1("xiao_yang");
    cout << s1.size() <<endl;
    cout << s1.capacity() << endl << endl;

    s1.reserve(100);// reserve room for 100 characters
    cout << "size and capacity after reserve" << endl;
    cout << s1.size() << endl;
    cout << s1.capacity() << endl;

 Xiaoyang’s note: The reserve function will not change the size and data of the original string.
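
To show why reserving ahead of time is useful, here is a rough sketch that counts how often the capacity changes while appending. The helper name `count_capacity_changes` is hypothetical, and the count without reserve depends on the growth strategy of your standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends n characters one at a time and counts how often
// capacity() changes, i.e. how often the string had to grow its storage.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first) {
    std::string s;
    if (reserve_first) {
        s.reserve(n);       // capacity becomes at least n; size stays 0
        assert(s.empty());
    }
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('a');
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    return changes;
}
```

With `reserve_first` set to true the function returns 0 for any n, because the reserved space already holds n characters. Without reserve, appending 1000 characters triggers at least one expansion.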


6.3clear (clear string)

The clear function empties the string, that is, it sets the size to 0. The standard does not specify whether the capacity changes.


7. Element Access of string (element access)

string provides some interfaces to obtain individual characters in a string:

 operator[ ] (explained in detail in the traversal section above; a quick review here)

It is an operator overload: we can use operator[] to read and modify the character at a given subscript.

Although the output is correct, the compiler may show a warning when the loop index is an int. Why?

The documentation shows that the parameter of [ ] is size_t, an unsigned integer type, so comparing an int index against size() mixes signed and unsigned values. Use size_t for the index instead.
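
Here is a small sketch of element access with a size_t index, plus the bounds-checked member function at(), which throws std::out_of_range on a bad index. The helper names `increment_all` and `char_or` are my own and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: adds 1 to every character, indexing with std::size_t
// (the parameter type of operator[]) to avoid signed/unsigned warnings.
std::string increment_all(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        assert(i < s.size());  // the same kind of check a debug operator[] may perform
        s[i]++;
    }
    return s;
}

// Hypothetical helper: returns s.at(i) when i is valid, otherwise fallback.
// at() checks the index and throws std::out_of_range when it is too large.
char char_or(const std::string& s, std::size_t i, char fallback) {
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

For example, `increment_all("1234")` gives "2345", and `char_or("abc", 10, '?')` gives '?' instead of reading out of bounds.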


 

8. String modification (+=, append, insert, erase, swap)

string provides a series of functions for modifying the contents of strings:

8.1 operator+= (the most awesome tail insertion)

operator+= is an operator overload used to append data to the end of a string. It supports appending a string, a character array, or a single character:

string s1("xiao_yang");
    string s2="   hehe   ";

    s1+=s2;cout << s1 << endl;
    s1+="abcabc";cout << s1 << endl;
    s1+='c';cout << s1 << endl;

8.2append (append data at the end, but no += awesome)

append works much like operator+=: both add data to the end of the string.
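
A minimal sketch of a few append overloads follows. The function name `append_demo` and the sample data are assumptions for illustration; push_back is included only for comparison with appending a single character.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper showing several ways to add data at the end of a string.
std::string append_demo() {
    std::string s("abc");
    s.append("def");              // append a C-string: "abcdef"
    s.append(std::string("gh"));  // append a string object: "abcdefgh"
    s.append(2, '!');             // append two copies of '!': "abcdefgh!!"
    s.append("12345", 2);         // append the first 2 characters of a C-string
    s.push_back('z');             // append a single character
    assert(s.size() == 13);
    return s;
}
```

The result is "abcdefgh!!12z". The extra overloads are what append offers beyond +=, which only takes one argument.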

8.3insert (insert at any position) (generally not used, a waste of time)

The insert function is used to insert data into the string at pos:

#include <iostream>
#include <string>

int main() {
    std::string original_string = "Hello, world!";
    
    // insert a character or substring at a given position
    original_string.insert(7, "there, ");
    std::cout << original_string << std::endl;
    // output: "Hello, there, world!"

    // insert at the end of the string
    original_string.insert(original_string.length(), " How are you?");
    std::cout << original_string << std::endl;
    // output: "Hello, there, world! How are you?"

    // insert at the beginning of the string
    original_string.insert(0, "Hi, ");
    std::cout << original_string << std::endl;
    // output: "Hi, Hello, there, world! How are you?"

    return 0;
}

8.4erase (delete at any position)

erase deletes len characters starting from position pos.

  npos deserves special attention here!

npos is special because its type is size_t, not a plain int. In the first overload of erase, the default value of len is npos. Although npos is defined as -1, it is unsigned, so it is actually the maximum value of size_t. Therefore, if we do not pass len, erase deletes everything from pos to the end of the string.
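
The behavior of erase and the default npos argument can be sketched like this. The function name `erase_demo` and the sample string are my own choices.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper showing erase with and without an explicit length.
std::string erase_demo() {
    // npos is the largest value size_t can hold, written as -1 in its definition.
    assert(std::string::npos == static_cast<std::size_t>(-1));

    std::string s("Hello, world!");
    s.erase(5, 2);  // delete 2 characters starting at index 5
    assert(s == "Helloworld!");

    s.erase(5);     // len defaults to npos: delete from index 5 to the end
    assert(s == "Hello");
    return s;
}
```

The function returns "Hello". Passing only pos is the usual way to cut off the tail of a string.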



8.5 swap (is a parameter, different from the common one)

The swap function is used to exchange the contents of two strings, including the pointed character array, the number of valid data, and the capacity:

string s1("dsjnaiodioasndosdaiosandsaoindsisdaoidnsaionds0asnisdanas");
    string s2;
    cout << "s1 size before swap " << s1.size() << endl;
    cout << "s1 capacity before swap " << s1.capacity() << endl;
    cout << "s2 size before swap " << s2.size() << endl;
    cout << "s2 capacity before swap " << s2.capacity() << endl;

    s1.swap(s2);
    cout << endl << endl << endl;
    cout << "s1 size after swap " << s1.size() << endl;
    cout << "s1 capacity after swap " << s1.capacity() << endl;
    cout << "s2 size after swap " << s2.size() << endl;
    cout << "s2 capacity after swap " << s2.capacity() << endl;


9. String operation functions (c_str, find, substr)

string provides a series of functions that operate on string:

9.1c_str (in order to make the interface between C++ and C language, it is regarded as an embassy)

Some scenarios, such as network transmission and fopen, only accept C-style strings (character arrays) and cannot take C++ string objects. For these, string provides c_str, which returns a C-style string:

#include<iostream>
#include<string>
#include<cstring>
using namespace std;
int main()
{
    string s1("haha");
    cout << strlen(s1.c_str()) << endl;
    return 0;
}


9.2find (search string, also contains npos)

find returns the position where a character, a character array, or a string object first appears in the string. If nothing is found, it returns npos:

#include<iostream>
#include<string>
using namespace std;
int main()
{
    string s1("haha");
    string s2("ah");
    cout << s1.find(s2) << endl;   // 1: "ah" first appears at index 1
    cout << s1.find("h") << endl;  // 0
    cout << s1.max_size() << endl; // the largest size a string can reach on this implementation
    return 0;
}
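
find also accepts a starting position as a second argument, which makes it easy to collect every occurrence rather than just the first. The sketch below assumes a hypothetical helper named `find_all` and does not handle an empty pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects the start index of every (possibly
// overlapping) occurrence of pattern in text.
std::vector<std::size_t> find_all(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());  // an empty pattern is not handled in this sketch
    std::vector<std::size_t> positions;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {      // npos means "not found"
        positions.push_back(pos);
        pos = text.find(pattern, pos + 1);  // resume one past the last hit
    }
    return positions;
}
```

For example, `find_all("haha", "ha")` yields the positions 0 and 2, and a pattern that never appears yields an empty vector.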


 

9.3rfind (rfind is very comfortable for finding file suffixes)

find searches from front to back by default, while rfind searches from back to front:

 string s1("test.cpp");
    string s2("hehe.c.zip");
    cout << s1.find('.') << endl;
    cout << s2.rfind(".") << endl;
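
Since rfind is handy for file suffixes, here is a minimal sketch that combines it with substr. The helper name `file_extension` is hypothetical, and returning an empty string when there is no dot is my own design choice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text after the last '.' in name,
// or an empty string when name contains no '.'.
std::string file_extension(const std::string& name) {
    std::size_t dot = name.rfind('.');  // search from the back
    if (dot == std::string::npos) {
        return "";
    }
    assert(dot < name.size());
    return name.substr(dot + 1);        // everything after the last dot
}
```

For "hehe.c.zip" this returns "zip", because rfind locates the last dot rather than the first one.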


9.4find_first_of (find the first occurrence of a character subset)

 find_first_of returns the position of the first character in the string that matches any character in the given character, character array, or string:

#include <iostream>
#include <string>

int main() {
    std::string str = "Hello, World!";
    std::string charactersToFind = "aeiou";  // the set of characters to look for

    // use find_first_of to locate the first matching character
    size_t found = str.find_first_of(charactersToFind);

    if (found != std::string::npos) {
        std::cout << "First matching character '" << str[found] << "' found at position " << found << "." << std::endl;
    } else {
        std::cout << "No matching character found." << std::endl;
    }

    return 0;
}

 9.5substr (truncate string and return)

string s1("haha hehe heihei");
    cout << s1.substr(0,s1.size()) << endl;
    cout << s1.substr(5,10) << endl;// start at index 5 and take 10 characters
    cout << s1.substr(10,10) << endl;


10. Overloaded functions of non-member functions (getline)

getline is especially useful in online judge (OJ) problems, because cin >> stops reading at the first space:

 

string s1("haha hehe heihei");
    getline(cin,s1);
    cout << s1 << endl;//

    getline(cin,s1,' ');
    cout << s1 << endl;
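
To compare the two reading styles without typing input by hand, the sketch below feeds text through std::istringstream instead of cin. The helper names `read_lines` and `first_word` are assumptions for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line from a stream with getline,
// keeping the spaces inside each line.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: operator>> stops at the first whitespace,
// so only the first word is extracted.
std::string first_word(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    in >> word;
    assert(word.find(' ') == std::string::npos);
    return word;
}
```

Given the input "haha hehe", `first_word` returns only "haha", while `read_lines` keeps the whole line "haha hehe" as one element. The same functions work with cin, since cin is also a std::istream.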

 


Hope this article can help you! !


Origin blog.csdn.net/weixin_62985813/article/details/133000689