String, vector, and iterators

  • The C++ language also defines a rich library of abstract data types. Among them, string and vector are the two most important standard library types: the former supports variable-length strings, and the latter represents variable-length collections. Another standard library type is the iterator, a companion type of string and vector that is often used to access the characters of a string or the elements of a vector. The built-in array is a more basic type, and string and vector are abstractions of it. This chapter introduces arrays and the standard library types string and vector.

  • Built-in types are defined directly by the C++ language. These types, such as numbers and characters, represent capabilities inherent in most computer hardware. The standard library defines an additional set of higher-level types that computer hardware does not implement directly. A string represents a variable-length sequence of characters, and a vector stores a variable-length sequence of objects of a given type. We will also introduce the built-in array type. Like other built-in types, arrays are implemented in a way that is closely tied to the hardware, so they are somewhat less flexible than the standard library types string and vector.

  • The library functions used so far all belong to the namespace std, and the programs have marked this explicitly. For example, std::cin means reading from standard input. The scope operator (::) tells the compiler to look in the scope named by the left-hand operand for the name given by the right-hand operand. Therefore, std::cin means to use the name cin from the namespace std. Members of a namespace can also be used in a simpler way. In this section we learn one of the safest methods, the using declaration. With a using declaration, the desired name can be used without a qualifying prefix (such as namespace::). A using declaration has the following form:

    • using namespace::name;
      
  • Once a name has been declared this way, it can be used directly, without the namespace prefix.

  • C++ is a free-form language, so you can put one using declaration on a line or several on the same line. Note, however, that each name must have its own using declaration, and each declaration must end with a semicolon.
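
  • As a minimal sketch of the rule above, the following declares three names from std, one using declaration per name. The helper functions make_greeting and print_greeting are hypothetical names introduced only for illustration.

```cpp
#include <iostream>
#include <string>

// Each name needs its own using declaration, and each ends with a semicolon.
using std::cout;
using std::endl;
using std::string;

// Hypothetical helper: builds a greeting for `name`.
// string can be written without the std:: prefix because of the declaration above.
string make_greeting(const string& name) {
    return "Hello, " + name;
}

// Hypothetical helper: prints the greeting; cout and endl are also unqualified.
void print_greeting(const string& name) {
    cout << make_greeting(name) << endl;
}
```

  • Calling print_greeting("world") from main would write the greeting to standard output; names that were not declared this way (for example std::cin) still need the std:: prefix.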

  • Code in header files generally should not use using declarations. This is because the contents of a header are copied into every file that includes it. If a header contains a using declaration, every file that includes the header gets that declaration as well. For some programs, the inadvertently included names may cause unexpected name conflicts.

  • The standard library type string represents a variable-length character sequence. To use the string type, you must first include the string header file. As part of the standard library, string is defined in the namespace std. The following examples assume that the following code is included:

    • #include <string>
      using std::string;
      
  • On the one hand, the C++ standard specifies in detail the operations provided by library types, and on the other hand, it also imposes some performance requirements on library implementers. Therefore, the standard library types are efficient enough for general applications.

  • How objects of a class are initialized is determined by the class itself. A class can define many ways to initialize its objects, but these ways must differ from one another, either in the number of initializers or in their types. Here are some of the most common ways to initialize string objects:

    • 	string s1;            // default initialization; s1 is an empty string
      	string s2 = s1;       // s2 is a copy of s1
      	string s3 = "hiya";   // s3 is a copy of the string literal
      	string s4(10, 'c');   // s4 is cccccccccc
      
  • A string can be default-initialized, which yields an empty string, that is, a string object with no characters. If a string literal is supplied, all of its characters up to but not including the terminating null character are copied into the newly created string. If a count and a character are supplied, the string contains that character repeated the given number of times.

  • C++ has several forms of initialization, and string lets us see clearly how they differ and how they relate. Initializing a variable with an equals sign (=) performs copy initialization: the compiler copies the initializer on the right-hand side into the newly created object. Without the equals sign, direct initialization is performed.

    • 	string s5 = "hiya";           // copy initialization
      	string s6("hiya");            // direct initialization
      	string s7(10, 'c');           // direct initialization; s7 is cccccccccc
      	string s8 = string(10, 'c');  // copy initialization; s8 is cccccccccc
      
  • The initial value of s8 is string(10, 'c'), which is actually a string object created with two parameters: the number 10 and the character c, and then this string object is copied to s8.

  • In addition to specifying the way to initialize its objects, a class must also define the operations that can be performed on the objects. Among them, the class can not only define operations called through function names, just like the isbn function of the sales_item class, but also define new meanings of various operators such as << and + on objects of this class.

    • os << s          Writes s to the output stream os and returns os
      is >> s          Reads a whitespace-separated string from is into s and returns is
      getline(is, s)   Reads a line from is into s and returns is
      s.empty()        Returns true if s is empty, otherwise false
      s.size()         Returns the number of characters in s
      s[n]             Returns a reference to the character at position n in s; positions start from 0
      s1 + s2          Returns the concatenation of s1 and s2
      s1 = s2          Replaces the characters in s1 with a copy of s2
      s1 == s2         s1 and s2 are equal if they contain exactly the same characters;
      s1 != s2         equality is case-sensitive
      <, <=, >, >=     Compare in dictionary order; comparisons are case-sensitive
  • Like input and output operations of built-in types, such operations on string objects also return the operand on the left side of the operator as its result. Therefore, multiple inputs or multiple outputs can be written together:

    • string s1,s2;
      cin>>s1>>s2;
      cout<<s1<<s2<<endl;
      
  • Sometimes we want to keep the whitespace from the input in the final string. In that case, we should use the getline function instead of the >> operator. getline takes an input stream and a string object. It reads from the given stream up to and including the first newline, and stores what it read, not including the newline, in the string. getline stops reading and returns as soon as it encounters a newline, even if the newline is the first character in the input. If the input starts with a newline, the resulting string is empty.
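
  • To contrast the two ways of reading, here is a small sketch that feeds the same text to >> and to getline. It reads from a std::istringstream rather than cin so that the input is fixed; first_word and first_line are hypothetical helper names used only for this illustration.

```cpp
#include <sstream>
#include <string>

// Reads the first whitespace-delimited word of `input` with >>.
std::string first_word(const std::string& input) {
    std::istringstream in(input);
    std::string word;
    in >> word;              // skips leading whitespace, stops at the next whitespace
    return word;
}

// Reads the first line of `input` with getline.
std::string first_line(const std::string& input) {
    std::istringstream in(input);
    std::string line;
    std::getline(in, line);  // keeps spaces, consumes the newline but does not store it
    return line;
}
```

  • For the input "  hello world" followed by a newline, first_word yields "hello" while first_line yields "  hello world" with the spaces intact. If the input begins with a newline, first_line yields an empty string.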

  • The empty function returns a bool indicating whether the string is empty. Like the isbn member of the sales_item class, empty is a member function of string. To call it, use the dot operator to specify the object on which to run empty. The size function returns the length of a string (that is, the number of characters in it). We can use size to print only lines longer than 80 characters:

    • 	string line;
      	while (getline(cin, line))
      		if (!line.empty())
      			if(line.size()>80)
      				cout << line << endl;
      
  • The string class and most other standard library types define several supporting types. These supporting types embody the machine-independent nature of standard library types, and type size_type is one of them. When used specifically, the scope operator is used to indicate that the name size_type is defined in the class string.

  • Although we don't know the precise type of string::size_type, we do know that it is an unsigned type big enough to hold the size of any string. Any variable used to store the result of string's size function should be of type string::size_type.

  • Because size returns an unsigned type, remember that mixing signed and unsigned values in an expression can produce surprising results. For example, if n is an int holding a negative value, the expression s.size() < n will almost certainly evaluate to true, because the negative value in n is automatically converted to a large unsigned value. If an expression already uses size(), avoid using int in it; this sidesteps the problems caused by mixing int and unsigned.
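
  • The pitfall can be made concrete with a short sketch. The explicit static_cast below spells out the conversion that the compiler would otherwise apply implicitly in s.size() < n. Both function names are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Mirrors what `s.size() < n` does: n is converted to the unsigned size_type,
// so a negative n becomes a very large value.
bool naive_shorter_than(const std::string& s, int n) {
    return s.size() < static_cast<std::string::size_type>(n);
}

// A safer variant: handle negative n before converting.
// No string can have fewer than a negative number of characters.
bool shorter_than(const std::string& s, int n) {
    if (n < 0) return false;
    return s.size() < static_cast<std::string::size_type>(n);
}
```

  • With s holding "hello" and n equal to -1, naive_shorter_than reports true even though 5 is not less than -1, whereas shorter_than reports false.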

  • The equality operators (== and !=) respectively test whether two string objects are equal or not. Equality of string objects means that they have the same length and contain the same characters. The relational operators <, <=, >, and >= respectively test whether a string object is less than, less than or equal to, greater than, or greater than or equal to another string object. The above operators are all in (case-sensitive) lexicographic order:

    • If the lengths of the two string objects are different, and each character of the shorter string object is the same as the corresponding character of the longer string object, the shorter string object is said to be smaller than the longer string object.

    • If two string objects differ at some corresponding position, the result of the comparison is the result of comparing the first pair of characters at which they differ.
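
  • The two rules can be checked with a small sketch. compare_strings is a hypothetical helper that reports the ordering as -1, 0, or 1 using only the operators that string defines.

```cpp
#include <string>

// Returns -1 if a < b, 0 if a == b, and 1 if a > b,
// using string's (case-sensitive) dictionary ordering.
int compare_strings(const std::string& a, const std::string& b) {
    if (a < b) return -1;
    if (a == b) return 0;
    return 1;
}
```

  • By the first rule, "Hello" is less than "Hello World" because it is a prefix of the longer string. By the second rule, "Hiya" is greater than both of them because the first differing pair is 'i' versus 'e'.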

  • In general, the library types strive to be as easy to use as the built-in types, so most library types support assignment. For string, we can assign one string object to another:

    • string st1(10, 'c'), st2; // st1 is cccccccccc; st2 is an empty string
      st1 = st2;                // assignment: both are now empty
      
  • Adding two strings yields a new string whose contents are the left-hand operand followed by the right-hand operand. That is, the result of applying the addition operator (+) to strings is a new string whose characters are a copy of those in the left-hand operand followed by those in the right-hand operand. The compound assignment operator (+=) appends the right-hand string to the left-hand string:

    • 	string s1 = "hello, ", s2 = "world\n";
      	string s3 = s1 + s2; // s3 is hello, world\n
      	s1 += s2;            // equivalent to s1 = s1 + s2
      
  • Because the standard library allows character literals and string literals to be converted to string objects, these literals can be used where a string is expected. When mixing strings with character or string literals in one statement, at least one of the two operands of each addition operator (+) must be a string.
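
  • A brief sketch of this rule follows. join_words is a hypothetical helper; the commented-out line shows a form that does not compile because its first + has two literal operands.

```cpp
#include <string>

// Joins two words, showing which mixes of strings and literals are allowed.
std::string join_words(const std::string& a, const std::string& b) {
    std::string s = a + ", " + b;  // OK: (a + ", ") is a string, which is then added to b
    std::string t = s + '!';       // OK: string + character literal
    // std::string bad = "hello" + ", " + b;  // error: the first + has two string literals
    return t;
}
```

  • The expression a + ", " + b groups as (a + ", ") + b, so each + has a string on at least one side.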

  • Functions in the cctype header file

    • isalnum(c)    True if c is a letter or a digit
      isalpha(c)    True if c is a letter
      iscntrl(c)    True if c is a control character
      isdigit(c)    True if c is a digit
      isgraph(c)    True if c is printable and not a space
      islower(c)    True if c is a lowercase letter
      isprint(c)    True if c is a printable character (that is, c is a space or has a visible representation)
      ispunct(c)    True if c is a punctuation character (that is, c is not a control character, a digit, a letter, or printable whitespace)
      isspace(c)    True if c is whitespace (that is, a space, horizontal tab, vertical tab, carriage return, newline, or form feed)
      isupper(c)    True if c is an uppercase letter
      isxdigit(c)   True if c is a hexadecimal digit
      tolower(c)    If c is an uppercase letter, returns the corresponding lowercase letter; otherwise returns c unchanged
      toupper(c)    If c is a lowercase letter, returns the corresponding uppercase letter; otherwise returns c unchanged
  • In addition to facilities specific to C++, the C++ standard library incorporates the C standard library. C headers have names of the form name.h; the C++ versions are named cname: the .h suffix is removed and the letter c is put in front of the name. The c indicates that the header belongs to the C standard library.

  • Hence, the cctype header has the same contents as ctype.h, but in a form that fits C++ naming conventions. In particular, names defined in a cname header belong to the namespace std, whereas names defined in the .h versions do not.

  • Generally speaking, C++ programs should use the header file named cname instead of name.h. Names in the standard library can always be found in the namespace std. If you use .h format header files, programmers have to keep in mind which ones are inherited from the C language and which ones are unique to the C++ language.

  • If we want to do something with every character in a string, the best approach is a statement introduced by the C++11 standard: the range for statement. It iterates through the elements of a given sequence and performs some operation on each one. Its syntax is:

    • for (declaration : expression)
      statement
      
  • Here, expression is an object that represents a sequence, and declaration defines the variable used to access the elements of that sequence. On each iteration, the variable in declaration is initialized from the next element of expression.

  • A string object represents a sequence of characters, so a string object can be used as the expression part of a range for statement. To give a simple example, we can use the range for statement to output the characters in the string object one per line:

    • 	string str("some string");  // print the characters in str one to a line
      	for (auto c : str)          // for every character in str
      		cout << c << endl;      // print the current character followed by a newline
      
  • The for loop associates the variable c with str. We define the loop control variable the same way we define any other variable. Here the auto keyword lets the compiler determine the type of c, which in this case is char. On each iteration, the next character of str is copied into c, so the loop can be read as "for every character c in the string str, do something." The "something" in this example prints the character followed by a newline.
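
  • Combining the range for with the cctype functions listed earlier, the following sketch counts the punctuation characters in a string. count_punct is a hypothetical name; the cast to unsigned char is a defensive choice, since the cctype functions expect a value representable as unsigned char.

```cpp
#include <cctype>
#include <string>

// Counts the punctuation characters in s.
std::string::size_type count_punct(const std::string& s) {
    std::string::size_type count = 0;   // same type that s.size() returns
    for (auto c : s) {                  // c is a copy of each character in turn
        if (std::ispunct(static_cast<unsigned char>(c)))
            ++count;
    }
    return count;
}
```

  • For "Hello World!!!" the function returns 3, one for each exclamation mark.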

  • If we want to change the characters in a string, we must define the loop variable as a reference type. Remember that a reference is just another name for a given object. When a reference is used as the loop control variable, it is bound to each element of the sequence in turn, and through it we can change the character to which it is bound.

  • As another example, suppose we want to convert a string to all uppercase letters. We can use the library function toupper, which takes a character and returns its uppercase form. To convert the whole string, we call toupper on each character and assign the result back to that character:

    • 	string s("Hello world!!!");  // convert s to uppercase
      	for (auto& c : s)            // for every character in s (note: c is a reference)
      		c = toupper(c);          // c is a reference, so the assignment changes the character in s
      	cout << s << endl;           // prints HELLO WORLD!!!
      
  • A range for works well when we need to process every character. Sometimes, however, we need to access only a single character, or to access characters only until some condition is reached. For example, we might still want to convert characters to uppercase, but only the first character or the first word of the string rather than the whole string.

  • There are two ways to access individual characters in a string: using a subscript or using an iterator. The subscript operator ([ ]) takes a value of type string::size_type that denotes the position of the character to access, and returns a reference to the character at that position. Subscripts for strings start at zero. If the string s has at least two characters, then s[0] is the first character, s[1] is the second, and s[s.size()-1] is the last.

  • A subscript into a string must be greater than or equal to 0 and less than s.size(). Using a subscript outside this range has undefined results; by implication, subscripting an empty string is undefined as well. The value in the subscript is referred to as a "subscript" or an "index", and it can be any expression that yields an integral value. However, if the index has a signed type, its value is converted to the unsigned type that string::size_type represents.

  • Before accessing the specified character, first check whether s is empty. In fact, whenever you use a subscript on a string object, you must confirm that there is indeed a value at that position. If s is empty, the result of s[0] will be undefined.
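
  • As a sketch of this check, the hypothetical helper capitalize_first below uppercases the first character of a string only after confirming that the string is not empty.

```cpp
#include <cctype>
#include <string>

// Returns a copy of s with its first character converted to uppercase.
// An empty string is returned unchanged, because s[0] would be undefined.
std::string capitalize_first(std::string s) {
    if (!s.empty())   // make sure there is a character at position 0
        s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
    return s;
}
```

  • For "some string" the result is "Some string"; for an empty string the subscript is never evaluated.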

  • The logical AND operator (&&) yields true if both of its operands are true, and false otherwise. The important point about this operator is that C++ guarantees the right-hand operand is evaluated only if the left-hand operand is true. In a loop condition that first tests index != s.size() and then uses s[index], this rule ensures that the subscript is used only when it is in range: s[index] is evaluated only if index has not yet reached s.size(). Because index is incremented one step at a time and the loop stops when it equals s.size(), index never exceeds s.size(), so every subscript that is used is less than s.size().
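
  • A sketch of the kind of loop this paragraph describes, assuming the goal is to uppercase the first word of a string. uppercase_first_word is a hypothetical name; the loop condition relies on && evaluating s[index] only after index != s.size() has been found true.

```cpp
#include <cctype>
#include <string>

// Returns a copy of s with its first word (the characters before the
// first whitespace) converted to uppercase.
std::string uppercase_first_word(std::string s) {
    for (std::string::size_type index = 0;
         index != s.size() &&                                  // checked first
         !std::isspace(static_cast<unsigned char>(s[index]));  // only then is s[index] used
         ++index) {
        s[index] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(s[index])));
    }
    return s;
}
```

  • For "some string" the result is "SOME string". For an empty string, index != s.size() is false on the first test, so s[0] is never accessed.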

  • The standard library type vector represents a collection of objects, all of which have the same type. Each object in the collection has an associated index, which gives access to that object. Because a vector "holds" other objects, it is often called a container. To use vector, we must include the appropriate header. The examples that follow assume the following using declaration:

    • 	#include <vector>
      	using std::vector;
      
  • C++ has both class templates and function templates; vector is a class template. Writing a template requires a fairly deep understanding of C++. Fortunately, we can use templates without knowing how to write them. Templates are not themselves classes or functions. Instead, a template can be thought of as instructions to the compiler for generating classes or functions. The process by which the compiler creates a class or function from a template is called instantiation. When we use a template, we specify what kind of class or function we want the compiler to instantiate.

  • For a class template, we specify which class to instantiate by supplying additional information, the nature of which depends on the template. We always supply it the same way: inside a pair of angle brackets following the template's name.

    • 	vector<int> ivec;              // ivec holds objects of type int
      	vector<sales_item> sales_vec;  // holds objects of type sales_item
      	vector<vector<string>> file;   // a vector whose elements are vectors
      
  • A vector can hold objects of most types. Because references are not objects, we cannot have a vector of references. However, we can have vectors of most other (nonreference) built-in types and most class types; in particular, the elements of a vector can themselves be vectors.

  • It is worth noting that earlier versions of C++ used a slightly different syntax to define a vector whose elements are themselves vectors (or another template type). In the past, we had to supply a space between the closing angle bracket of the outer vector and its element type, writing vector<vector<int> > rather than vector<vector<int>>.

    • vector<T> v1                  v1 is an empty vector whose elements are of type T; default initialization
      vector<T> v2(v1)              v2 contains a copy of every element in v1
      vector<T> v2 = v1             Equivalent to v2(v1); v2 contains a copy of every element in v1
      vector<T> v3(n, val)          v3 contains n elements, each with value val
      vector<T> v4(n)               v4 contains n value-initialized elements
      vector<T> v5{a, b, c…}        v5 contains as many elements as there are initializers; each element is initialized by the corresponding initializer
      vector<T> v5 = {a, b, c…}     Equivalent to v5{a, b, c…}
  • We can also supply initial values for the elements when we define a vector. For example, we can copy the elements of one vector into another; the elements of the new vector are then copies of the corresponding elements of the original. The two vectors must have the same type:

    • vector<int> ivec;             // initially empty
      // give ivec some values here
      vector<int> ivec2(ivec);      // copy the elements of ivec into ivec2
      vector<int> ivec3 = ivec;     // copy the elements of ivec into ivec3
      vector<string> svec(ivec2);   // error: svec holds strings, not ints
      
  • Often we can supply only the number of elements a vector should hold and omit the initial value. In that case the library creates a value-initialized element initializer and uses it for every element in the container. The value of that initializer depends on the type of the elements stored in the vector.

  • If the elements of the vector object are built-in types, such as int, the initial value of the element is automatically set to 0. If the element is of a certain class type, such as string, the element is default-initialized by the class:

    • vector<int> ivec(10);     // ten elements, each initialized to 0
      vector<string> svec(10);  // ten elements, each an empty string
      
  • This form of initialization has two restrictions. First, some classes require that an initializer always be supplied explicitly. If the element type does not support default initialization, we must supply an initial element value; supplying only a count is not enough. Second, when we supply only an element count and no initial value, we must use direct initialization:

    • vector<int> vi = 10; // error: must use direct initialization to supply a size
      
  • The 10 here is used to illustrate how to initialize the vector object. Our original intention of using it is to create a vector object containing 10 elements with initialized values, rather than "copying" the number 10 into the vector. Therefore, it is not appropriate to use copy initialization at this time.

  • In some cases, what an initialization means depends on whether we pass the initializers in curly braces or in parentheses. For example, when we initialize a vector from a single integer, that integer might be the size of the vector or the value of an element. Similarly, if we supply exactly two integers, they might be a size and an initial element value, or they might be the values of a two-element vector. We distinguish these meanings by using curly braces or parentheses:

    • vector<int> v1(10);    // v1 has 10 elements, each with value 0
      vector<int> v2{10};    // v2 has 1 element with value 10
      vector<int> v3(10, 1); // v3 has 10 elements, each with value 1
      vector<int> v4{10, 1}; // v4 has 2 elements with values 10 and 1
      
  • When we use parentheses, we are saying that the supplied values are to be used to construct the vector. Thus, the initializer of v1 specifies the size of the vector, and the two initializers of v3 specify the size and the initial element value, respectively.

  • When we use curly braces, we are saying that, if possible, we want to list initialize the vector. That is, the values inside the braces are treated as a list of element initializers whenever that is possible; other ways of initializing the object are considered only when list initialization cannot be performed. The values supplied for v2 and v4 can be used as element values, so both are list initialized: v2 has one element and v4 has two.

  • On the other hand, if we use braces and the supplied values cannot be used to list initialize the object, those values are used to construct it instead. For example, to list initialize a vector of strings, we must supply values that can be used as strings. In this case there is no confusion about whether the braces list initialize the elements or construct a vector of a given size:

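    • vector<string> v5{"hi"};     // list initialization: v5 has one element
      vector<string> v6("hi");     // error: cannot construct a vector from a string literal
      vector<string> v7{10};       // v7 has ten default-initialized elements
      vector<string> v8{10, "hi"}; // v8 has ten elements with value "hi"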
      
  • Although every statement above except the second uses curly braces, only v5 is list initialized. To list initialize a vector, the values inside the braces must match the element type. A string cannot be initialized from an int, so the values supplied for v7 and v8 cannot serve as element initializers. Once it determines that list initialization is not possible, the compiler looks for other ways to initialize the vector from the given values, here by treating them as an element count and an optional element value.
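
  • The difference between the parenthesized and braced forms can be verified by inspecting the resulting sizes. The sketch below wraps each form in a small function; the make_ names are hypothetical and only mirror the variable names used in the text.

```cpp
#include <string>
#include <vector>

// Parentheses: the values are used to construct the vector (count, optional value).
std::vector<int> make_v1() { return std::vector<int>(10); }     // ten elements, each 0
std::vector<int> make_v3() { return std::vector<int>(10, 1); }  // ten elements, each 1

// Braces with values that match the element type: list initialization.
std::vector<int> make_v2() { return std::vector<int>{10}; }     // one element: 10
std::vector<int> make_v4() { return std::vector<int>{10, 1}; }  // two elements: 10 and 1

// Braces with values that cannot initialize a string: falls back to construction.
std::vector<std::string> make_v7() { return std::vector<std::string>{10}; }        // ten empty strings
std::vector<std::string> make_v8() { return std::vector<std::string>{10, "hi"}; }  // ten copies of "hi"
```

  • Calling size() on each result gives 10, 10, 1, 2, 10, and 10 respectively, which matches the comments in the listings above.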

  • Directly initializing the elements of a vector is suitable in three cases: when there are only a few known initial values, when we want a copy of another vector, or when all elements should have the same initial value. More commonly, when we create a vector we do not know how many elements we will need or what their values will be. Even when we do know all the values, if there are many different ones it is cumbersome to specify them when we create the vector.

  • A better approach is to create an empty vector and then use the vector member function push_back to add elements at run time. push_back takes a value and "pushes" it onto the back of the vector as a new last element. For example:

    • vector<int> v2;        // empty vector
      for (int i = 0; i != 100; ++i)
      	v2.push_back(i);   // append sequential integers to v2
      // at the end of the loop v2 has 100 elements, with values 0 through 99
      
  • Similarly, if the exact number of elements in the vector object is not known until runtime, you should use the method just described to create the vector object and assign values ​​to it. For example, sometimes you need to read data in real time and assign it to a vector object:

    • // read words from the standard input and store them as elements of a vector
      string word;
      vector<string> text;       // empty vector
      while (cin >> word) {
      	text.push_back(word);  // append word to text
      }
      
  • The C++ standard requires that vector implementations be able to add elements efficiently at run time. Because vectors grow efficiently, it is usually unnecessary, and can even hurt performance, to define a vector with a specific size. The exception is when all the elements need the same value. If the values differ, it is usually more efficient to define an empty vector and add elements as the values become known at run time. vector also offers ways to further improve run-time performance when adding elements. Starting with an empty vector and adding elements at run time is distinctly different from how built-in arrays are used in C and most other languages. In particular, if you are used to C or Java, you might expect that it is best to define the vector at its expected size. In fact, the opposite is usually the case.

  • Because elements can be added to a vector efficiently and conveniently, many programming tasks are greatly simplified. However, this simplicity places some extra obligations on our programs: we must ensure that any loops we write are correct even if the loop changes the size of the vector. As we use vector more, we will learn about other implicit requirements. One of them is worth pointing out now: if the body of a loop adds elements to a vector, we cannot use a range for loop.
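
  • One way to respect this requirement is to loop over indices and to fix the number of iterations before the loop starts. The following is a sketch under that assumption; with_doubled is a hypothetical helper, not something taken from the text.

```cpp
#include <vector>

// Returns v followed by a doubled copy of each of its original elements.
// An index loop is used instead of a range for because push_back may
// invalidate the iterators that a range for relies on.
std::vector<int> with_doubled(std::vector<int> v) {
    const std::vector<int>::size_type original_size = v.size();  // fixed before the loop
    for (std::vector<int>::size_type i = 0; i != original_size; ++i) {
        const int doubled = v[i] * 2;  // read the element before the vector grows
        v.push_back(doubled);
    }
    return v;
}
```

  • Given the elements 1, 2, 3, the result holds 1, 2, 3, 2, 4, 6. Testing i against v.size() directly would never terminate here, because the size grows on every iteration.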

  • In addition to push_back, vector also provides several other operations, most of which are similar to string-related operations. Some of the more important ones are listed below.

    • v.empty()          Returns true if v contains no elements; otherwise returns false
      v.size()           Returns the number of elements in v
      v.push_back(t)     Adds an element with value t to the end of v
      v[n]               Returns a reference to the element at position n in v
      v1 = v2            Replaces the elements in v1 with copies of the elements in v2
      v1 = {a, b, c…}    Replaces the elements in v1 with copies of the elements in the list
      v1 == v2           v1 and v2 are equal if and only if they have the same number of elements
      v1 != v2           and each element in v1 equals the corresponding element in v2
      <, <=, >, >=       Have their usual meanings, using dictionary ordering
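  • We access the elements of a vector in much the same way as the characters of a string: through their position in the vector. For example, we can use a range for to process all the elements of a vector:

    • vector<int> v{1, 2, 3, 4, 5, 6, 7, 8, 9};
      for (auto& i : v)      // for each element in v (note: i is a reference)
      	i *= i;            // square the element value
      for (auto i : v)       // for each element in v
      	cout << i << " ";  // print the element
      cout << endl;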
      
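  • The first loop defines the control variable i as a reference so that we can use i to assign new values to the elements of v; the type of i is deduced by auto. This loop uses a new form of compound assignment operator. As we have seen, += adds the right-hand operand to the left-hand operand and stores the result in the left-hand operand. Similarly, *= multiplies the left-hand operand by the right-hand operand and stores the result in the left-hand operand. Finally, the second loop prints every element.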

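  • The empty and size members of vector behave exactly like the string members of the same name: empty returns a bool indicating whether the vector has any elements, and size returns the number of elements in the vector. The return type of size is the size_type defined by the vector type. Note that the type must include the element type, as in vector<int>::size_type.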

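  • The equality and relational operators behave like the corresponding string operators (see Section 3.2.2, page 79). Two vectors are equal if and only if they have the same number of elements and the elements at corresponding positions have the same value. The relational operators compare in dictionary order: if two vectors have different sizes but the elements at every shared position are equal, the vector with fewer elements is less than the one with more elements; if any elements differ, the relationship between the vectors is determined by the first pair of elements that differ.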

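  • Programmers new to C++ sometimes think that subscripting a vector adds elements. It does not. The following code tries to add ten elements to the vector ivec:

    • vector<int> ivec; // empty vector
      for (decltype(ivec.size()) ix = 0; ix != 10; ++ix)
          ivec[ix] = ix; // disaster: ivec has no elements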
      
  • This code is in error: ivec is an empty vector, so there are no elements to subscript. As noted earlier, the right way to write this loop is to use push_back:

    • for (decltype(ivec.size()) ix = 0; ix != 10; ++ix)
          ivec.push_back(ix); // ok: adds a new element with value ix
      
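  • One point about subscripting must be clear: we may subscript only elements that are known to exist. For example,

    • vector<int> ivec;      // empty vector
      cout << ivec[0];       // error: ivec has no elements
      vector<int> ivec2(10); // vector with ten elements
      cout << ivec2[10];     // error: the valid indices of ivec2 are 0 through 9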
      
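  • Subscripting an element that does not exist is an error, but it is not one the compiler detects. The behavior is undefined, and in practice the program often reads an unpredictable value at run time. Unfortunately, subscripting nonexistent elements is an extremely common mistake with serious consequences. So-called buffer overflow errors are exactly this kind of error, and they are a major cause of security problems in applications on PCs and other devices.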

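  • We already know that we can use subscripts to access the characters of a string or the elements of a vector. There is a more general mechanism for the same purpose: iterators. In addition to vector, the library defines several other kinds of containers. All of the library containers have iterators, but only a few of them also support the subscript operator. Strictly speaking, string is not a container type, but string supports many container-like operations. vector supports subscripting, as does string; string supports iterators, as does vector.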

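  • Like pointers, iterators give us indirect access to an object. For an iterator, that object is an element in a container or a character in a string. We can use an iterator to access an element, and an iterator can move from one element to another. As with pointers, an iterator may be valid or invalid. A valid iterator either denotes an element or denotes the position one past the last element in a container; all other iterators are invalid.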

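  • Unlike pointers, we do not use the address-of operator to obtain an iterator. Instead, types that have iterators have members that return iterators. In particular, these types have members named begin and end. The begin member returns an iterator that denotes the first element (or first character). Given the following statement:

    • // the compiler determines the types of b and e
      // b denotes the first element of v and e denotes one past the last element
      auto b = v.begin(), e = v.end(); // b and e have the same type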
      
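  • The end member returns an iterator positioned "one past the end" of the container (or string). That is, the iterator denotes a nonexistent "off the end" element. Such an iterator has no real meaning of its own; it is only a marker indicating that we have processed all the elements in the container. The iterator returned by end is often called the off-the-end iterator, or simply the end iterator. As a special case, if the container is empty, begin and end return the same iterator.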

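  • The table below lists some of the operations that iterators support. We can compare two valid iterators using == and !=. Two iterators are equal if they denote the same element or if both are off-the-end iterators for the same container; otherwise they are unequal.

    • *iter Returns a reference to the element denoted by the iterator iter
      iter->mem Dereferences iter and fetches the member named mem from that element; equivalent to (*iter).mem
      ++iter Increments iter to refer to the next element in the container
      --iter Decrements iter to refer to the previous element in the container
      iter1 == iter2, iter1 != iter2 Compares two iterators for equality (inequality). Two iterators are equal if they denote the same element or if both are off-the-end iterators for the same container; otherwise they are unequal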
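  • As with pointers, we can dereference an iterator to obtain the element it denotes. The iterator being dereferenced must be valid and must actually denote an element. Dereferencing an invalid iterator or an off-the-end iterator is undefined behavior.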

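  • As an example, an earlier program used the subscript operator to change the first letter of a string to uppercase. The following code does the same thing with an iterator: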

    • string s("some string");
      if (s.begin() != s.end()) { // make sure s is not empty
          auto it = s.begin();    // it denotes the first character in s
          *it = toupper(*it);     // make that character uppercase
      }
      // s is now: Some string
      
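  • Iterators use the increment operator (++) to move from one element to the next. Incrementing an iterator is logically similar to incrementing an integer: an integer is incremented by "adding 1" to its value, whereas an iterator is incremented by "advancing one position".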

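  • Because the iterator returned by end does not denote an element, it may not be incremented or dereferenced. Using increment, the program that capitalizes the first word of a string can be written with iterators:

    • // process the characters in s until we run out of characters or hit whitespace
      for (auto it = s.begin(); it != s.end() && !isspace(*it); ++it)
          *it = toupper(*it); // make the current character uppercase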
    
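  • Like the earlier program, this loop iterates through the characters of s and stops when it encounters a whitespace character; the earlier version used subscripts, and this one uses an iterator. The loop starts by initializing it from s.begin(), so that it denotes the first character in s (if there is one). The condition checks whether it has reached the end of s. If not, the condition dereferences it and passes the result to isspace to see whether we have hit whitespace. At the end of each iteration, ++it advances the iterator to the next character in s. The loop body is the same as the last statement in the preceding if: it dereferences it, passes the result to toupper to obtain the corresponding uppercase letter, and assigns that letter back to the character denoted by it.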

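  • Programmers coming to C++ from C or Java might be surprised to see != rather than < in a for loop. C++ programmers use != as a matter of habit, for the same reason that they prefer iterators to subscripts: this coding style works with every container the library provides. As noted, only a few library types, such as string and vector, have the subscript operator. Similarly, the iterators of all the library containers define == and !=, but most of them do not define <. By routinely using iterators and !=, we do not have to worry about the precise type of container we are processing.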

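  • Just as we do not know the precise type of the size_type member of a string or vector, we generally do not know (and do not need to know) the precise type of an iterator. Instead, the library types that have iterators define types named iterator and const_iterator that represent the actual iterator types:

    • vector<int>::iterator it;        // it can read and write vector<int> elements
      string::iterator it2;            // it2 can read and write characters in a string
      vector<int>::const_iterator it3; // it3 can read but not write elements
      string::const_iterator it4;      // it4 can read but not write characters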
      
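  • A const_iterator behaves like a pointer to const: it can read but not modify the element it denotes. An object of type iterator, by contrast, can both read and write. If a vector or string is const, we may use only its const_iterator type; if the vector or string is not const, we can use either iterator or const_iterator.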

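  • The term iterator has three different meanings: it may refer to the concept of an iterator, to the iterator type defined by a container, or to an object that is an iterator. The important point is that there is a collection of conceptually related types. A type is an iterator if and only if it supports a common set of operations that let us access an element in a container and move from one element to another. Each container class defines a type named iterator, and that type supports the operations required by the iterator concept.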

  • The type returned by begin and end depends on whether the object is const. If the object is const, begin and end return a const_iterator; if the object is not const, they return iterator:

    • vector<int> v;
      const vector<int> cv;
      auto it1 = v.begin();  // it1 has type vector<int>::iterator
      auto it2 = cv.begin(); // it2 has type vector<int>::const_iterator
      
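  • This default behavior is not always what we want. When an object only needs to be read and never written, it is usually best to use a const type (such as const_iterator). To let us ask specifically for a const_iterator, the C++11 standard introduced two new functions, cbegin and cend:

    • auto it3 = v.cbegin(); // it3 has type vector<int>::const_iterator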
      
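  • Dereferencing an iterator yields the object that the iterator denotes. If that object has a class type, we may want to access one of its members. For example, given a vector of strings, we might want to know whether an element is empty. If it is an iterator into that vector, we only need to check whether the string that it denotes is empty:

    • (*it).empty() // dereferences it and calls the member empty on the resulting object
      *it.empty()   // error: attempts to fetch the member named empty from it,
                    // but it is an iterator and has no member named empty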
      
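  • Because the dot operator has higher precedence than dereference, the second expression is interpreted as a request to fetch the empty member from the object named it. However, it is an iterator and has no member named empty, so the second expression is an error. This is why the parentheses in the first expression are required.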

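  • To simplify expressions such as this one, the language defines the arrow operator (->). The arrow operator combines dereference and member access into a single operation; that is, it->mem means the same as (*it).mem.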

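  • For example, assume we have a vector of strings named text that holds the data from a text file. Each element is either a sentence or an empty string that marks a paragraph break. To print the contents of the first paragraph in text, we can write a loop that iterates through text until it encounters an empty element:

    • // print each line in text up to the first blank line
      for (auto it = text.cbegin(); it != text.cend() && !it->empty(); ++it)
          cout << *it << endl;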
      
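  • We start by initializing it to denote the first element in text. The loop continues until we have processed every element in text or have found an element that is empty. As long as there are elements left and we have not seen an empty element, each iteration prints the current element. Note that because the loop only reads the elements of text and never writes to them, it uses cbegin and cend to control the iteration.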

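  • Although a vector can grow dynamically, this ability has some side effects. One restriction we already know: we cannot add elements to a vector inside a range for loop. Another is that any operation that changes the size of a vector, such as push_back, potentially invalidates all iterators into that vector. Remember: a loop that uses iterators must not add elements to the container to which the iterators refer.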

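  • Incrementing an iterator moves it one element at a time, and all the library containers have iterators that support increment. Similarly, we can use == and != to compare two valid iterators into any of the library container types. Iterators for string and vector support additional operations that can move an iterator several elements at a time and that allow iterators to be compared with the relational operators. Together, these operations are known as iterator arithmetic.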

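  • Elements of sequential and associative containers are accessed through iterators. An iterator is an object that acts as an intermediary between a container and the algorithms that operate on it. An iterator can point to an element in a container, and that element can be read or written through the iterator; in this respect an iterator resembles a pointer. Iterators of different containers differ in capability, and the capability of a container's iterators determines whether the container can be used with a given STL algorithm. For example, the sort algorithm needs random-access iterators to reach the elements, so some containers cannot be used with it.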

  • Iterator capabilities of different containers:

    • Container         Iterator capability
      vector            random access
      deque             random access
      list              bidirectional
      set / multiset    bidirectional
      map / multimap    bidirectional
      stack             no iterators
      queue             no iterators
      priority_queue    no iterators
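  • We can add an integer to an iterator (or subtract one from it). The result is an iterator moved forward (or backward) by that many positions. The resulting iterator must denote either an element in the same vector (or string) or the position one past its last element.

    • // compute an iterator to the element closest to the midpoint of vi
      auto mid = vi.begin() + vi.size() / 2;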
      
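  • If vi has 20 elements, then vi.size()/2 is 10, so mid is set equal to vi.begin() + 10. Because subscripts start at 0, the iterator denotes vi[10], the element ten positions past the first. In addition to testing for equality, iterators for string and vector can be compared with the relational operators (<, <=, >, >=). The two iterators must be valid and must denote elements in (or the position one past the end of) the same container.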

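  • We can subtract two iterators as long as they denote elements in, or the position one past the end of, the same container. The result is the distance between the iterators: the number of positions by which the right-hand iterator must be advanced to reach the left-hand one. The result type is a signed integral type named difference_type. Both string and vector define difference_type; because the distance may be positive or negative, the type is signed.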

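  • A classic algorithm that uses iterator arithmetic is binary search. A binary search looks for a given value in a sorted sequence. It starts at the middle of the sequence. If the middle element is the one we want, the search is done. Otherwise, if that element is smaller than the one we want, the search continues in the second half of the sequence; if it is larger, the search continues in the first half. We compute a new middle element within the reduced range and repeat, until we either find the target or run out of elements to search.

    • // text must be sorted
      // beg and end denote the range we are searching
      auto beg = text.begin(), end = text.end();
      auto mid = text.begin() + (end - beg) / 2; // initial midpoint
      // loop while there are still elements to check and we have not found sought
      while (mid != end && *mid != sought) {
          if (sought < *mid)  // is the element we want in the first half?
              end = mid;      // if so, adjust the range to ignore the second half
          else                // the element we want is in the second half
              beg = mid + 1;  // start looking just after mid
          mid = beg + (end - beg) / 2; // new midpoint
      }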
      
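  • The program starts by defining three iterators: beg denotes the first element in the search range, end denotes the position one past the last element, and mid denotes the element in the middle. Initially, the search range is the entire vector named text.

  • The loop first checks whether the search range is empty. If mid is equal to the current value of end, we have run out of elements to search; the condition fails and the loop ends. Otherwise, mid denotes an element, and we check whether that element is the one we want. If it is, the loop also ends.

  • Inside the loop body, the program narrows the search range by moving beg or end. If the element denoted by mid is greater than sought, then sought, if it is in text at all, must appear before the element denoted by mid. We can therefore ignore the elements from mid onward, which we do by assigning mid to end. Otherwise, *mid is less than sought, and the element we want must appear after the one denoted by mid. In that case we change the range by making beg denote the position just after mid. We have already verified that mid is not the element we want, so it can be left out of the rest of the search.

  • When the loop ends, mid is either equal to end or denotes the element we are looking for. If mid equals end, the element is not in text.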

  • Iterators can be grouped by capability into the following categories:

    • Input iterator

    • Output iterator

    • Forward iterator

    • Bidirectional iterator

    • Random-access iterator

  • The iterator is one of the components of the C++ STL. It provides a uniform way to traverse the elements of a container. Different containers are built on different data structures, and each data structure is traversed differently internally, but the code that traverses a container through iterators looks the same for all of them.

Origin blog.csdn.net/weixin_43424450/article/details/132344639