[Introduction to C++] Common methods of the string class (detailed explanation of 10,000 characters)

1. Introduction to STL

1.1 What is STL

STL (Standard Template Library): an important part of the C++ standard library. It is not only a library of reusable components but also a software framework that brings together data structures and algorithms.

1.2 STL versions

  • Original version (HP version)

The original version was completed by Alexander Stepanov and Meng Lee at Hewlett-Packard Labs. In the spirit of open source, they declared that anyone may freely use, copy, modify, distribute, and commercially use the code without payment. The only condition is that derived versions must remain as open as the original. The HP version is the ancestor of all STL implementations.

  • PJ version

Developed by P. J. Plauger and derived from the HP version. It is used by Windows Visual C++ and may not be published or modified. Drawbacks: low readability and odd symbol naming.

  • RW version

Developed by Rogue Wave Software and derived from the HP version. It is used by C++ Builder, may not be published or modified, and has average readability.

  • SGI version

Developed by Silicon Graphics Computer Systems, Inc. and derived from the HP version. It is used by GCC (Linux), is highly portable, and may be published, modified, and even sold. Its naming and programming style make it very readable. When we study the STL later we will need to read parts of the source code, and this is the version we will mainly refer to.

1.3 Six major components of STL

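The six components are containers, algorithms, iterators, functors (function objects), adapters, and allocators. For now it is enough to know that they exist; we will study them gradually later.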

1.4 Defects of STL

  1. The STL is updated too slowly. This is a serious complaint: the previous version was C++98, and C++03 in between was mostly a minor revision. Only when C++11 came out, 13 years after C++98, was the STL substantially updated.
  2. The STL does not provide thread safety. In a concurrent environment we have to do the locking ourselves, and the lock granularity is fairly coarse.
  3. The STL's extreme pursuit of efficiency makes its internals complicated, for example type traits and iterator traits.
  4. Using the STL leads to code bloat. For example, instantiating vector with several different element types generates a separate copy of the code for each one. Of course, this is caused by the template mechanism itself.

2. Use of string class

2.1 Strings in C language

In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of library functions, but these functions are separate from the string data itself, which does not fit the OOP (object-oriented programming) idea. The underlying storage also has to be managed by the user, and a small mistake can lead to an out-of-bounds access.

2.2 The string class in the standard library

The string class is actually a class instantiated from a class template.
The documentation of the string class shows that it is a typedef of an instantiation of the basic_string class template.

Besides string, three other classes are instantiated from basic_string: wstring, u16string, and u32string.
They all come from the same basic_string class template; the only difference is the character type passed as the template parameter.

Underneath, the string class is a dynamic character array:
  • string is an array of char
  • wstring is an array of wchar_t
  • u16string is an array of char16_t
  • u32string is an array of char32_t

These character types also differ in size.

So why are there so many character types?

This is because every symbol and letter in ASCII has a corresponding ASCII code value.
Memory does not store the letters themselves but their ASCII code values.
But ASCII mainly covers English, while the world has many countries and languages. If we want a computer to display Chinese, for example, ASCII is not enough.
For this reason Unicode was invented (it is compatible with ASCII).
Unicode has several encoding forms: UTF-8, UTF-16, and UTF-32.

To handle these different encodings, different character types were introduced, and so there is the generic basic_string class template, which we can instantiate into string classes for the different character types.

Summarize:

  1. string is the class that represents a string.
  2. Its interface is essentially the same as that of the regular containers, plus some operations specific to strings.
  3. Underneath, string is an alias of an instantiation of the basic_string class template: typedef basic_string<char, char_traits<char>, allocator<char>> string;
  4. string is not aware of multibyte or variable-length character encodings: such sequences are handled byte by byte rather than character by character.
  5. To use the string class, you must write #include <string> and either using namespace std; or the std:: prefix.

2.3 Common interfaces of the string class (only the most commonly used ones are explained)

2.3.1 Common construction of string class objects

Constructor: description
  • string() (important): default constructor; constructs an empty string with a length of zero characters
  • string(const char* s) (important): constructs a string object from a C-style string
  • string(const string& str, size_t pos, size_t len = npos) (rarely used): copies the portion of str that starts at character position pos and spans len characters (or runs to the end of str if str is too short or len is string::npos)
  • string(const char* s, size_t n): constructs a string object from the first n characters of the string pointed to by s
  • string(size_t n, char c): constructs a string object from n copies of the character c
  • string(const string& str) (important): copy constructor
  • template <class InputIterator> string(InputIterator first, InputIterator last): constructs from an iterator range; covered after iterators

Let's start to explain one by one:

  1. string()
    This constructs an empty string.
  2. string(const char* s)
    This constructs a string object from a C-style string.
    Initialization with = from a string literal is also supported because, as discussed earlier, a single-parameter constructor supports implicit type conversion.
  3. string(const string& str, size_t pos, size_t len = npos)
    This constructs a string object from a substring of str. The substring starts at index pos in str and has length len.
    If str is too short, or if len is string::npos, the substring runs to the end of str.
    For example, if len is 30 but fewer than 30 characters remain in str after pos, no error is reported; the substring simply ends at the end of str.
    If len is string::npos, the substring also runs to the end of str, and npos is the default value of len. So what is npos?
    npos is a static member constant with the value -1, but its type is size_t (an unsigned integer type), so it is actually the largest value that size_t can hold.
  4. string(const char* s, size_t n)
    This constructs a string object from the first n characters of the string pointed to by s.
  5. string(size_t n, char c)
    This constructs a string object from n copies of the character c.
  6. string(const string& str)
    Copy construction.

2.3.2 Capacity operations on string objects


  1. size and length
    Both return the length of the string.
    You may wonder why two interfaces exist for the same function.
    The reason is historical. string appeared earlier than the STL; strictly speaking, string does not belong to the STL. It comes from the C++ standard library and existed before the STL appeared.
    string was originally designed with length, but when the STL arrived later, its other data structures used size, so a size was added to string for consistency.
    Therefore size() and length() are implemented in exactly the same way. size() was introduced to match the interface of the other containers, and in general size() is the one used.
  2. max_size
    It returns the maximum length the string can reach.
    In practice a string can never be that long, and the value differs from platform to platform.
  3. capacity
    It returns the capacity of the current string object, that is, how much space is currently allocated to it, expressed in bytes.
    The capacity does not include the space for '\0', because '\0' is not counted as a valid character.

As for the other capacity interfaces, you can look them up in the documentation for now; I will explain them later.

2.3.3 Modification operation of string class object


  1. push_back
    As the name suggests, push_back inserts at the tail (it appends one character).
  2. append
    To append a string, use append.
    Many overloads are provided, but the most commonly used one simply appends a string.
  3. operator+=
    In practice we rarely use push_back and append and prefer operator+= instead.
    string overloads += (operator overloading was covered in an earlier article), and it is very convenient to use: it accepts a character, a C-style string, or another string.

2.3.4 resize and reserve

With the above knowledge, let's look back at resize and reserve from the capacity operations.

Before that, let's observe how a string object expands in the process of continuously inserting data.

#include <iostream>
#include <string>
using namespace std;

int main()
{
	string s;
	size_t sz = s.capacity();
	cout << "initial capacity: " << sz << '\n';
	for (int i = 0; i < 100; ++i)
	{
		s.push_back('c');
		// capacity() changes only when the string has expanded
		if (sz != s.capacity())
		{
			sz = s.capacity();
			cout << "capacity changed: " << sz << '\n';
		}
	}
	return 0;
}

In the author's run under VS, almost every expansion roughly doubles the capacity. The exact growth policy is implementation-specific, so other standard libraries may grow differently.

Now that we understand the expansion mechanism, let's look at reserve. reserve changes the capacity: if we know in advance how much space we need, we can allocate it in one go instead of expanding again and again.
If we call reserve(100), the resulting capacity is not necessarily exactly 100. For reasons such as alignment the implementation may give a little more, but it will never be less than 100.
So when we know how much space we need, reserve allocates it ahead of time, which reduces the number of expansions and improves efficiency.

What is resize for, then?

resize not only allocates space but also initializes the newly added space.
If we do not specify the second parameter, the new positions are filled with '\0' by default. We can also specify the fill character ourselves.
If the n we pass is less than the current string length, resize deletes the extra content.
Note that in this case only size changes; capacity does not.
In general, capacity is not shrunk lightly. Shrinking in place is generally not supported because of how the underlying memory management works.
To shrink in place, it would have to be possible to release part of a block: we request a block of space and give back only the unused portion.
But releasing only part of a block is not supported, just as free requires the pointer passed to it to point to the start of the allocated block.
So a real shrink has to happen elsewhere: allocate a new, smaller block, copy the needed data over, and then release the original block. Shrinking therefore comes at the cost of performance; the system does not support it natively and we have to do it ourselves. So do not shrink unless you have to.

2.3.5 Iterator (Forward)

Now we want to traverse a string object. We can use [] because string overloads operator[], or we can use a range-based for loop. Besides these methods, we can also use iterators.

Let's take a simple example:

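#include <iostream>
#include <string>
using namespace std;

int main()
{
	string s1("hello world");
	// begin() points to the first character, end() to one past the last
	string::iterator it = s1.begin();
	while (it != s1.end())
	{
		cout << *it << " ";
		++it;
	}
	return 0;
}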

Here it is an iterator of the string class that we defined (string::iterator is a type). At this stage you can think of an iterator as something like a pointer (though it is not necessarily a pointer).
begin returns an iterator pointing to the first character of the string.
end returns an iterator pointing to the position just past the last character.
We can picture them as pointers to those two positions.

2.3.6 Reverse iterators

Besides the forward traversal shown above, iterators also support reverse traversal, which is done with reverse iterators.

rbegin() returns a reverse iterator pointing to the last character of the string.
rend() returns a reverse iterator pointing to the position just before the first character of the string.

Let's revisit the previous example, this time in reverse:

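#include <iostream>
#include <string>
using namespace std;

int main()
{
	string s1("hello world");
	// ++ on a reverse iterator moves toward the front of the string
	string::reverse_iterator it = s1.rbegin();
	while (it != s1.rend())
	{
		cout << *it << " ";
		++it;
	}
	return 0;
}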


2.3.7 const iterator (forward & reverse)

For a const object, which cannot be modified, an ordinary iterator will not do. If we think of an iterator as something like a pointer, an ordinary iterator would let us modify the object by dereferencing it, which would amplify the access rights of the const object.
Looking at begin(), we see that when a const object calls begin, a const iterator (const_iterator) is returned. An ordinary iterator can both read and modify data, but a const iterator can only read.

Likewise, when a const object calls rbegin() and rend(), the iterator returned is a const_reverse_iterator.

C++11 also provides another set of functions, cbegin, cend, crbegin, and crend, which always return const iterators.

2.3.8 Element access

string overloads [], so we can use it directly to read or modify the character at a given index.

operator[] has a normal version and a const version. Called on a normal object, [] returns char&; called on a const object, it returns const char&, which cannot be modified.

at does the same thing as []. The difference between the two is how they handle out-of-bounds access:
with [], an out-of-bounds access is reported as an error directly. In the author's environment the implementation checks the index with an assertion (the C++ standard itself leaves out-of-bounds [] undefined). at throws an exception instead.

back and front return the last and first characters. We can get the same result with [], so a basic understanding is enough.

2.3.9 insert and erase

With insert we can insert characters and strings into a string object.
insert provides several overloads; we only need to master a few commonly used ones.
Suppose we want to insert the string hello in front of world. We can use the overload insert(size_t pos, const char* s): the first parameter is the position to insert at, and the second is the string to insert.
To insert a space at the fifth position, we can use insert(size_t pos, size_t n, char c), which inserts n copies of the character c at position pos.

We can also specify the position with an iterator, using insert(iterator p, char c).

Note: for string, frequent use of insert is not recommended. The underlying storage of a string is a character array, and as we know from data structures, inserting an element into a sequential list requires shifting the existing data, which is relatively inefficient.

Now let's look at erase:
erase deletes elements from a string object.

A simple example: erase can be used to remove the trailing spaces of a string.

2.3.10 replace, find, rfind, substr

Let's take a look at replace:
replace substitutes new content for part of the string. Again we only cover the commonly used overloads.

Let's take an example: suppose we need to replace the spaces in s with "hhh". The overload replace(size_t pos, size_t len, const char* s) replaces the len characters starting at pos with the given string.

Now let's look at find:
find searches the string for a substring or a character and returns the index of the match. If nothing is found, it returns npos.

For example, we can call find to look for "w" in s.

Now let's look at rfind:
find returns the first match, searching from front to back; rfind returns the last match, searching from back to front.

Now let's look at substr:
substr returns a specified substring of the string object: substr(pos, len) returns the len characters starting at position pos.
For example, we can get the substring of length five that starts at position 6.

2.3.11 string::swap

Unlike the general-purpose swap in the standard library, string::swap is a member function: it takes another string object and exchanges its contents with the current object.

2.3.12 c_str

Now let's look at c_str:
It returns a pointer to the character array that holds the contents of the current string object; the type is const char*. The array is terminated by '\0', so the pointer can be passed to C functions that expect a C-style string.

2.3.13 getline

Let's take an example:

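#include <iostream>
#include <string>
using namespace std;

int main()
{
	string s;
	cin >> s;	// operator>> stops at the first whitespace character
	cout << s << endl;
	return 0;
}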

If we input hello world, will it be printed as expected?
It will not. When we read with cin we may be entering several values, and by default spaces or newlines separate the values we enter.
So the hello world we typed is treated as two values separated by a space: cin reads hello, the part before the space, and world is left in the buffer.

We can solve this problem with getline:
getline reads until it meets a newline, and it also lets us specify a different delimiter. The first parameter is the input stream (here cin), and the second is the string object that receives the input.

2.4 Summary

That covers the common interfaces of string. string has many more interfaces; if you run into something unclear later, I suggest reading the official string documentation.


Origin blog.csdn.net/weixin_69423932/article/details/132584567