Table of contents
- 1. Introduction to STL
- 2. Use of string class
- 2.1 Strings in C language
- 2.2 The string class in the standard library
- 2.3 Common interface description of string class (only explain the most commonly used interface)
- 2.3.1 Common construction of string class objects
- 2.3.2 Capacity operation of string objects
- 2.3.3 Modification operation of string class object
- 2.3.4 `resize` and `reserve`
- 2.3.5 Iterator (Forward)
- 2.3.6 Reverse iterators
- 2.3.7 const iterator (forward & reverse)
- 2.3.8 Element access
- 2.3.9 `insert` and `erase`
- 2.3.10 replace, find, rfind, substr
- 2.3.11 string::swap
- 2.3.12 c_str
- 2.3.13 getline
- 2.4 Summary
1. Introduction to STL
1.1 What is STL
STL (Standard Template Library) is an important part of the C++ standard library. It is not only a library of reusable components but also a software framework that bundles data structures and algorithms.
1.2 STL versions
- Original version
The original version was completed by Alexander Stepanov and Meng Lee at Hewlett-Packard Labs. In the spirit of open source, they declared that anyone may freely use, copy, modify, distribute, and commercially use the code without payment. The only condition is that derived work must be released as openly as the original. The HP version is the ancestor of all STL implementations.
- P. J. version
Developed by P. J. Plauger and derived from the HP version; adopted by Windows Visual C++. It may not be published or modified. Drawbacks: low readability and odd symbol naming.
- RW version
Developed by Rogue Wave and derived from the HP version; adopted by C++ Builder. It may not be published or modified, and its readability is average.
- SGI version
Developed by Silicon Graphics Computer Systems, Inc. and derived from the HP version; adopted by GCC (Linux). It is portable, and it may be published, modified, and even sold. Its naming and programming style make it very readable. When we study the STL later we will need to read some of the source code, and this version is the main reference.
1.3 Six major components of STL
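The six components are containers, algorithms, iterators, functors (function objects), adapters, and allocators. It is enough to know their names for now; we will study each of them step by step later.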
1.4 Defects of STL
- The STL is updated too slowly, which is a serious complaint. The last solid version was C++98, and C++03 in between only made minor revisions. C++11 arrived 13 years after C++98, and only then was the STL updated further.
- The STL containers provide no thread safety of their own. In a concurrent environment we have to do the locking ourselves, and the lock granularity tends to be coarse.
- The STL's extreme pursuit of efficiency makes its internals complicated, for example type traits and iterator traits.
- Using the STL leads to code bloat. For example, using `vector` with three different element types generates three copies of the code. Of course, this is a consequence of template syntax itself.
2. Use of string class
2.1 Strings in C language
In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of functions, but these functions are separate from the string data itself, which does not fit the OOP (object-oriented programming) idea. The user also has to manage the underlying storage, and a small mistake can cause out-of-bounds access.
2.2 The string class in the standard library
The string class is actually a class instantiated from a class template.
The documentation of the string class shows that `string` is a typedef of a class instantiated from the `basic_string` class template.
Besides `string`, `basic_string` has three other instantiations. All of them come from the same class template; they differ only in the template argument, that is, in the character type.
Under the hood, each of them is a dynamic character array:
- `string` is an array of `char`
- `wstring` is an array of `wchar_t`
- `u16string` is an array of `char16_t`
- `u32string` is an array of `char32_t`
These character types also differ in size.
So why are there so many character types?
Every symbol and letter in ASCII has a corresponding ASCII code value. Memory does not store the letters themselves; it stores their code values.
However, ASCII is mainly designed for languages such as English, and the world has many countries and languages. If we want a computer to display Chinese, for example, ASCII is not enough.
For this reason Unicode was created (it is compatible with ASCII). Unicode has several encoding forms: UTF-8, UTF-16, and UTF-32.
The different character types exist to deal with these different encodings, and that is why there is a generic class template, `basic_string`, which we can instantiate into different string classes.
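The relationship between the string types and the encodings can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the character U+4E2D is written with escape sequences so that the result does not depend on the encoding of the source file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// string is just a name for one instantiation of basic_string.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "string is basic_string<char>");

// Number of code units needed to store the single character U+4E2D.
std::size_t utf8_units()  { return std::string("\xE4\xB8\xAD").size(); }  // three char units
std::size_t utf16_units() { return std::u16string(u"\u4E2D").size(); }    // one char16_t unit
std::size_t utf32_units() { return std::u32string(U"\u4E2D").size(); }    // one char32_t unit

// Hypothetical self-check: the same character needs a different
// number of code units depending on the string type.
void code_unit_demo()
{
    assert(utf8_units() > utf16_units());
    assert(utf16_units() == utf32_units());
}
```

Note that `size()` counts code units, not characters, which is why the UTF-8 version reports more than one.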
Summary:
- `string` is a class that represents a string.
- Its interface is basically the same as that of the regular containers, plus some operations specific to strings.
- Under the hood, `string` is an alias of an instantiation of the `basic_string` class template: `typedef basic_string<char, char_traits<char>, allocator<char>> string;`
- `string` handles bytes independently of the encoding. With multibyte or variable-length characters such as UTF-8, members such as `size` and the iterators still work on bytes, not on whole encoded characters.
- To use the string class, include the `<string>` header and write `using namespace std;` (or qualify the name as `std::string`).
2.3 Common interface description of string class (only explain the most commonly used interface)
2.3.1 Common construction of string class objects
Constructor | Description |
---|---|
`string()` (key point) | Default constructor; constructs an empty string with a length of zero characters |
`string(const char* s)` (key point) | Constructs a string object from a C string |
`string(const string& str, size_t pos, size_t len = npos)` (rarely used) | Copies the portion of `str` that starts at position `pos` and spans `len` characters (or runs to the end of `str` if `str` is too short or `len` is `string::npos`) |
`string(const char* s, size_t n)` | Constructs a string object from the first `n` characters of the string pointed to by `s` |
`string(size_t n, char c)` | Constructs a string object from `n` copies of the character `c` |
`string(const string& str)` (key point) | Copy constructor |
`template <class InputIterator> string(InputIterator first, InputIterator last)` | Range constructor; explained after iterators |
Let's go through them one by one:
`string()`
This constructs an empty string.
`string(const char* s)`
This constructs a string object from a C string. Copy-initialization such as `string s = "hello";` is also supported, because, as discussed before, a single-parameter constructor supports implicit type conversion.
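A minimal sketch of these two constructors and of the implicit conversion. The function names `length_of` and `constructor_demo` are hypothetical helpers made up for this example.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: takes a string by const reference.
size_t length_of(const string& s) { return s.size(); }

size_t constructor_demo()
{
    string s1;             // default constructor: empty string
    string s2("hello");    // direct construction from a C string
    string s3 = "hello";   // copy-initialization through the same constructor
    assert(s1.empty() && s2 == s3);
    // The string literal is converted implicitly to a temporary string here.
    return length_of("hello world");
}
```

Calling `constructor_demo()` from `main` runs the checks; the last line works only because `string(const char*)` is not marked `explicit`.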
`string(const string& str, size_t pos, size_t len = npos)`
This constructs a string object from a substring of `str`. The substring starts at index `pos` in `str` and has a length of `len`.
If `str` is too short, or if `len` is `string::npos`, the substring runs to the end of `str`.
For example, if `len` is 30 but fewer than 30 characters remain after `pos`, no error is reported; the substring simply stops at the end of `str`. If `len` is omitted, it takes its default value `npos`, which also means "up to the end of `str`".
What is `npos`? It is a static member constant with a value of -1, but its type is `size_t` (an unsigned integer), so it is actually the largest value that `size_t` can hold.
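The behavior of `len` and `npos` can be checked with a short sketch. `tail_from` and `npos_demo` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper that simply forwards to the substring constructor.
string tail_from(const string& str, size_t pos, size_t len = string::npos)
{
    return string(str, pos, len);
}

void npos_demo()
{
    string s1("hello world");
    assert(tail_from(s1, 6, 3) == "wor");     // 3 characters starting at index 6
    assert(tail_from(s1, 6, 30) == "world");  // len too large: stops at the end of s1
    assert(tail_from(s1, 6) == "world");      // len omitted: defaults to npos
    // npos is -1 converted to size_t, i.e. the largest size_t value.
    assert(string::npos == static_cast<size_t>(-1));
}
```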
`string(const char* s, size_t n)`
Constructs a string object from the first `n` characters of the string pointed to by `s`.
`string(size_t n, char c)`
Constructs a string object from `n` copies of the character `c`.
`string(const string& str)`
Copy construction: the new object is a copy of `str`.
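A small sketch of the last three constructors; the function name `more_constructors_demo` is hypothetical.

```cpp
#include <cassert>
#include <string>
using namespace std;

string more_constructors_demo()
{
    string a("hello world", 5);  // first 5 characters of the C string: "hello"
    string b(3, 'x');            // 3 copies of 'x': "xxx"
    string c(a);                 // copy construction: c has its own "hello"
    c[0] = 'j';                  // changing the copy ...
    assert(a == "hello");        // ... leaves the original untouched
    return a + b + c;            // "helloxxxjello"
}
```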
2.3.2 Capacity operation of string objects
`size` and `length` both return the length of the string.
You may wonder why two interfaces exist for the same thing. The reason is historical: string appeared earlier than the STL. Strictly speaking, string does not belong to the STL; it is part of the C++ standard library and existed before the STL appeared.
string was originally designed with `length`. When the STL arrived later, its other data structures used `size`, so `size` was added to string for consistency.
Therefore `size()` and `length()` are implemented in exactly the same way; `size()` was introduced only to match the interface of the other containers. In general, `size()` is the one to use.
`max_size`
Returns the maximum length a string can reach. In practice a string can never get that long, and the value differs between platforms.
`capacity`
Returns the capacity of the current string object, that is, how many characters fit in the storage currently allocated to it (for `string`, one character is one byte). The value does not include the space for '\0', because '\0' is not considered a valid character.
For the other capacity functions, consult the documentation for now; I will explain them later.
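The relationships between these functions can be written down as a few checks. This is a sketch with hypothetical helper names; the concrete values of `capacity()` and `max_size()` depend on the implementation, so only the relationships are asserted.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: size() and length() always agree.
bool same_length(const string& s) { return s.size() == s.length(); }

void capacity_demo()
{
    string s("hello world");
    assert(s.size() == 11);                // the terminating '\0' is not counted
    assert(same_length(s));
    assert(s.capacity() >= s.size());      // allocated room is at least the length
    assert(s.max_size() >= s.capacity());  // theoretical upper bound
}
```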
2.3.3 Modification operation of string class object
`push_back`
As the name implies, `push_back` appends one character at the end.
`append`
To append a string, use `append`. It has many overloads, but the most commonly used one appends a string directly.
`operator+=`
In practice we rarely use `push_back` and `append`; we use `operator+=` instead.
string overloads `+=` (operator overloading was covered in an earlier article), which is very convenient to use.
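The three ways of appending can be compared side by side. A minimal sketch; `build_greeting` is a hypothetical function name.

```cpp
#include <cassert>
#include <string>
using namespace std;

string build_greeting()
{
    string s("hello");
    s.push_back(' ');    // append exactly one character
    s.append("wor");     // append a C string
    s += 'l';            // += accepts a single character ...
    s += "d";            // ... a C string ...
    s += string("!");    // ... or another string object
    assert(s.size() == 12);
    return s;
}
```

`+=` covers both the single-character and the string case, which is why it is usually preferred.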
2.3.4 `resize` and `reserve`
With the knowledge above, let's look back at `resize` and `reserve` among the capacity operations.
Before that, let's observe how a string object grows as data is inserted continuously.
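```cpp
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s;
    size_t sz = s.capacity();
    cout << "initial capacity: " << sz << '\n';
    for (int i = 0; i < 100; ++i)
    {
        s.push_back('c');
        // capacity() changes only when the string reallocates
        if (sz != s.capacity())
        {
            sz = s.capacity();
            cout << "capacity changed: " << sz << '\n';
        }
    }
    return 0;
}
```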
How much the capacity grows each time depends on the implementation. With GCC's library each expansion roughly doubles the capacity, while in Visual Studio the growth after the first expansion is closer to 1.5 times.
After this brief look at the expansion mechanism, let's look at `reserve`.
`reserve` lets us change the capacity. If we know how much space we need, we can request it all at once, so we do not need to expand again and again.
Suppose we call `reserve(100)`. The resulting capacity is not necessarily exactly 100; for reasons such as alignment the library may give us some more space, but it will never be less than 100.
So when we know how much space we need, `reserve` allocates it in advance, which reduces expansions and improves efficiency.
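The effect of `reserve` can be measured by counting capacity changes. This is a sketch; `count_reallocations` is a hypothetical helper, and the count without `reserve` depends on the implementation.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: counts how often capacity() changes
// while n characters are appended one by one.
size_t count_reallocations(size_t n, bool reserve_first)
{
    string s;
    if (reserve_first)
        s.reserve(n);              // request all the room up front
    assert(!reserve_first || s.capacity() >= n);

    size_t cap = s.capacity();
    size_t changes = 0;
    for (size_t i = 0; i < n; ++i)
    {
        s.push_back('c');
        if (s.capacity() != cap)   // a change means the string reallocated
        {
            cap = s.capacity();
            ++changes;
        }
    }
    return changes;
}
```

With `reserve_first` set to `true` the function returns 0, because the capacity is already at least `n` before the loop starts.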
So what is `resize` for?
`resize` not only allocates space but also initializes it.
If we do not pass the second argument, the new characters are filled with `\0` by default. We can also specify the fill character ourselves.
If the `n` we pass is smaller than the current length of the string, `resize` deletes the extra content.
Note that in this case only `size` changes; `capacity` does not.
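A minimal sketch of the three cases of `resize`; `resize_demo` is a hypothetical function name. The capacity is deliberately not asserted, since the exact value is up to the implementation.

```cpp
#include <cassert>
#include <string>
using namespace std;

string resize_demo()
{
    string s("hello");
    s.resize(8);          // grows to 8: three '\0' characters are appended
    assert(s.size() == 8 && s[7] == '\0');
    s.resize(10, 'x');    // grows to 10: the two new characters are 'x'
    assert(s[8] == 'x' && s[9] == 'x');
    s.resize(3);          // shrinks to 3: everything after "hel" is removed
    assert(s.capacity() >= s.size());
    return s;
}
```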
In general, capacity is not shrunk lightly. Shrinking in place is usually not supported, because of how the underlying memory management works.
To shrink in place, the allocator would have to release only part of a block: we allocate a block and give back just the part we do not use.
But releasing only part of a block is not supported. `free`, for example, requires the pointer passed to it to point to the start of the allocated block.
So if we really want to shrink, it has to happen elsewhere: allocate a new, smaller block, copy the needed data over, and then release the original block. Shrinking therefore costs performance; the system does not support it natively, and the library has to do it itself. Do not shrink unless you have to.
2.3.5 Iterator (Forward)
Now we want to traverse a string object. We can use `[]`, because string overloads `operator[]`, or a range-based for loop. Besides these methods we can also use iterators.
Let's take a simple example:
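```cpp
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1("hello world");
    string::iterator it = s1.begin();
    // walk from the first character up to, but not including, end()
    while (it != s1.end())
    {
        cout << *it << " ";
        it++;
    }
    return 0;
}
```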
Here `it` is an object of the iterator type defined by the string class (`string::iterator` is a type). At this stage, you can think of an iterator as something like a pointer (though it is not necessarily a pointer).
`begin()` returns an iterator pointing to the first character of the string.
`end()` returns an iterator pointing to the position just past the last character.
We can picture them as pointers to these two positions, so `begin()` and `end()` delimit a half-open range.
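The three traversal styles mentioned in this section can be put next to each other. A sketch with hypothetical helper names; the last function also shows that an ordinary iterator can modify the element it points to.

```cpp
#include <cassert>
#include <cctype>
#include <string>
using namespace std;

// Traversal with operator[] and an index.
size_t count_with_index(const string& s, char c)
{
    size_t n = 0;
    for (size_t i = 0; i < s.size(); ++i)
        if (s[i] == c) ++n;
    return n;
}

// Traversal with a range-based for loop (built on begin()/end()).
size_t count_with_range_for(const string& s, char c)
{
    size_t n = 0;
    for (char ch : s)
        if (ch == c) ++n;
    return n;
}

// Traversal with an iterator; dereferencing gives a modifiable char&.
string to_upper(string s)
{
    for (string::iterator it = s.begin(); it != s.end(); ++it)
        *it = static_cast<char>(toupper(static_cast<unsigned char>(*it)));
    return s;
}
```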
2.3.6 Reverse iterators
Besides the forward traversal shown above, iterators also support reverse traversal, which uses a reverse iterator.
`rbegin()` returns a reverse iterator pointing to the last character of the string.
`rend()` returns a reverse iterator pointing to the position just before the first character.
Here is the previous example again, this time traversed in reverse:
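```cpp
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1("hello world");
    string::reverse_iterator it = s1.rbegin();
    while (it != s1.rend())
    {
        cout << *it << " ";
        it++;  // for a reverse iterator, ++ moves toward the front of the string
    }
    return 0;
}
```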
2.3.7 const iterator (forward & reverse)
A const object cannot be modified. An ordinary iterator behaves like a pointer through which the element can be modified, so an ordinary iterator must not be used with a const object; that would widen the access rights by dropping the const.
Looking at `begin()`, we see that when a const object calls it, a `const_iterator` is returned. An ordinary iterator can both read and modify the data, while a const iterator can only read it.
Likewise, when a const object calls `rbegin()` and `rend()`, they return a `const_reverse_iterator`.
C++11 also provides another set of functions, `cbegin`, `cend`, `crbegin`, and `crend`, which always return const iterators.
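A short sketch of const iterators in use; `count_spaces` and `reversed` are hypothetical helper names. The second function relies on the iterator-range constructor of string.

```cpp
#include <cassert>
#include <string>
using namespace std;

size_t count_spaces(const string& s)
{
    size_t n = 0;
    // s is const here, so begin() and end() return const_iterator.
    for (string::const_iterator it = s.begin(); it != s.end(); ++it)
    {
        if (*it == ' ') ++n;
        // *it = 'x';  // would not compile: a const iterator is read-only
    }
    return n;
}

string reversed(const string& s)
{
    // crbegin()/crend() (C++11) always return const reverse iterators.
    return string(s.crbegin(), s.crend());
}
```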
2.3.8 Element access
string overloads `[]`, so we can index it directly.
`operator[]` has a normal version and a const version. On a normal object `[]` returns `char&`; on a const object it returns `const char&`, which cannot be modified.
`at`
`at` does the same thing as `[]`, but the two differ in how they handle out-of-range access. Out-of-range access through `[]` is undefined behavior; some implementations, such as Visual Studio in debug builds, check it with an internal assertion and abort. `at` throws an exception (`std::out_of_range`) instead.
`front` and `back`
These return the first and the last character. We can get the same result with `[]`, so a quick look is enough.
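The element access functions can be tried out as follows. A minimal sketch; `at_throws` and `access_demo` are hypothetical names. Only `at` is tested with a bad index, because an out-of-range `[]` has no defined result.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
using namespace std;

// Hypothetical helper: reports whether s.at(i) throws out_of_range.
bool at_throws(const string& s, size_t i)
{
    try
    {
        (void)s.at(i);   // bounds-checked access
        return false;
    }
    catch (const out_of_range&)
    {
        return true;
    }
}

string access_demo()
{
    string s("hello");
    s[0] = 'j';          // non-const operator[] returns char&
    s.at(4) = 'y';       // at() returns a reference as well
    assert(s.front() == s[0]);
    assert(s.back() == s[s.size() - 1]);
    return s;
}
```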
2.3.9 `insert` and `erase`
With `insert` we can insert characters and strings into a string object. `insert` has several overloads; we only need to master a few commonly used ones.
Suppose the string holds "world" and we want to insert "hello" in front of it. We can use `insert(size_t pos, const string& str)`: the first parameter is the position to insert at, and the second is the string to insert.
To insert a space at position 5, we can use `insert(size_t pos, size_t n, char c)`, which inserts `n` copies of `c`.
We can also use the iterator version, `insert(iterator p, char c)`.
Note: frequent use of `insert` on a string is not recommended. The underlying storage of a string is a character array, and as we learned in data structures, inserting into a sequential list requires moving data, so it is relatively inefficient.
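The three common forms of `insert` in one sketch; `insert_demo` is a hypothetical function name.

```cpp
#include <cassert>
#include <string>
using namespace std;

string insert_demo()
{
    string s("world");
    s.insert(0, "hello");       // insert a string at index 0: "helloworld"
    s.insert(5, 1, ' ');        // insert 1 space at index 5:  "hello world"
    s.insert(s.begin(), '>');   // iterator version, one char: ">hello world"
    assert(s.size() == 12);
    return s;
}
```

Each call has to shift every character behind the insertion point, which is the cost the note above warns about.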
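Now let's look at `erase`.
`erase` deletes elements from a string object.
For example, `erase(size_t pos, size_t len)` removes `len` characters starting at `pos`; if `len` is omitted, everything from `pos` to the end is removed.
We can also take advantage of `erase` to remove trailing spaces.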
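A sketch of `erase`, including one possible way to strip trailing spaces. The helper names are hypothetical, and `find_last_not_of` is my choice for locating the last non-space character; the original example may have done this differently.

```cpp
#include <cassert>
#include <string>
using namespace std;

string erase_demo()
{
    string s("hello world");
    s.erase(5, 1);         // remove 1 character at index 5: "helloworld"
    s.erase(5);            // remove from index 5 to the end: "hello"
    s.erase(s.begin());    // iterator version removes one character: "ello"
    return s;
}

string trim_trailing_spaces(string s)
{
    // find_last_not_of returns npos if s holds nothing but spaces;
    // npos + 1 wraps around to 0, so the whole string is erased then.
    s.erase(s.find_last_not_of(' ') + 1);
    return s;
}
```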
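2.3.10 replace, find, rfind, substr
Let's look at `replace` first.
`replace` replaces part of a string with new content. Again we only cover a commonly used overload, `replace(size_t pos, size_t len, const string& str)`, which replaces the `len` characters starting at `pos` with `str`.
A typical task is replacing every space in `s` with "hhh", which takes one `replace(pos, 1, "hhh")` call per space.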
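Replacing every space can be sketched with `find` and `replace` together. `replace_spaces` is a hypothetical helper; `find(c, pos)` returns the index of the next match at or after `pos`, or `npos` if there is none.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: returns a copy of s with every space replaced by `with`.
string replace_spaces(string s, const string& with)
{
    size_t pos = s.find(' ');
    while (pos != string::npos)
    {
        s.replace(pos, 1, with);                // swap the one-character space for `with`
        pos = s.find(' ', pos + with.size());   // continue after the text just inserted
    }
    return s;
}
```

Every `replace` call may move the rest of the string, so for long inputs it can be cheaper to build the result in a new string with `+=`.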
Next, `find`.
`find` searches a string for a substring or a character and returns the index of the match. If nothing is found, it returns `npos`.
For example, we can look for "w" in `s`.
Next, `rfind`.
`find` searches from front to back and returns the first match, while `rfind` searches from back to front and returns the last match.
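The difference between `find` and `rfind` shows up as soon as a character occurs twice. A minimal sketch with hypothetical helper names.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: true if `what` occurs anywhere in s.
bool contains(const string& s, const string& what)
{
    return s.find(what) != string::npos;
}

void find_demo()
{
    string s("hello world");
    assert(s.find('o') == 4);             // first 'o', searching front to back
    assert(s.rfind('o') == 7);            // last 'o', searching back to front
    assert(s.find("w") == 6);             // a substring works the same way
    assert(s.find('z') == string::npos);  // no match
}
```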
Finally, `substr`.
`substr` returns a specified substring of a string object.
For example, `s.substr(6, 5)` returns the substring of length five that starts at position 6.
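A sketch of `substr`, together with a small use that combines it with `rfind`. `extension_of` is a hypothetical helper invented for this example.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: the part of a file name after the last '.'.
string extension_of(const string& filename)
{
    size_t dot = filename.rfind('.');
    if (dot == string::npos)
        return string();               // no dot, no extension
    return filename.substr(dot + 1);   // len omitted: runs to the end
}

void substr_demo()
{
    string s("hello world");
    assert(s.substr(6, 5) == "world");  // 5 characters starting at index 6
    assert(s.substr(6) == "world");     // same result with the default len
    assert(s.substr(0, 5) == "hello");
}
```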
2.3.11 string::swap
Unlike the general-purpose `swap` in the standard library, this member `swap` takes another string object and exchanges its contents with the current object.
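Both forms of swapping can be tried in a few lines. A minimal sketch; `swap_demo` is a hypothetical function name.

```cpp
#include <cassert>
#include <string>
#include <utility>
using namespace std;

string swap_demo()
{
    string a("hello"), b("world");
    a.swap(b);            // member swap: exchanges the contents of a and b
    assert(a == "world" && b == "hello");
    swap(a, b);           // non-member swap, which the library overloads for string
    assert(a == "hello" && b == "world");
    return a + " " + b;
}
```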
2.3.12 c_str
Now let's look at `c_str`.
It returns a pointer to the character array held by the current string object. The type is `const char*`, and the array ends with '\0'.
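`c_str` is mostly useful for calling C functions that expect a `const char*`. A sketch with the hypothetical helper `c_length`, which hands the pointer to `strlen`.

```cpp
#include <cassert>
#include <cstring>
#include <string>
using namespace std;

// Hypothetical helper: length as seen by a C function.
size_t c_length(const string& s)
{
    const char* p = s.c_str();   // null-terminated array owned by s
    return strlen(p);            // strlen stops at the first '\0'
}

void c_str_demo()
{
    string s("hello");
    assert(c_length(s) == s.size());
    s += '\0';                   // a string may contain '\0' itself
    s += "world";
    assert(s.size() == 11);      // the string still knows its full length
    assert(c_length(s) == 5);    // the C view ends at the embedded '\0'
}
```

The pointer stays valid only as long as the string is neither modified nor destroyed.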
2.3.13 getline
Let's take an example:
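```cpp
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s;
    cin >> s;   // reads one whitespace-delimited word
    cout << s << endl;
    return 0;
}
```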
Now if we type hello world, will it be printed in full?
No. When we read with `cin >>`, we may enter several values, and by default spaces and newlines separate the values from each other.
So the input `hello world` is treated as two values separated by a space: `cin` reads `hello`, which comes before the space, and `world` is left in the buffer.
We can solve this problem with `getline`.
`getline` keeps reading until it meets a newline, and it also lets us specify a different terminator. The first parameter is the input stream (here `cin`), and the second is the string object that receives the input.
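The difference between `>>` and `getline` can be reproduced without typing anything. In this sketch an `istringstream` stands in for `cin` (an assumption made so the example is repeatable); the helper names are hypothetical, and with real keyboard input you would pass `cin` to `getline` instead.

```cpp
#include <cassert>
#include <sstream>
#include <string>
using namespace std;

// What `in >> s` reads: stops at the first space or newline.
string first_word(const string& input)
{
    istringstream in(input);
    string s;
    in >> s;
    return s;
}

// What getline(in, s) reads: everything up to the newline.
string first_line(const string& input)
{
    istringstream in(input);
    string s;
    getline(in, s);
    return s;
}

// getline with a custom terminator as third argument.
string up_to(const string& input, char delim)
{
    istringstream in(input);
    string s;
    getline(in, s, delim);
    return s;
}
```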
2.4 Summary
That is all for the common interfaces of string. string has many more interfaces; if you run into something unclear later, I suggest reading the official documentation of string.