C++ Getting Started Guide: Detailed Analysis of String Class Documents (very classic, recommended to collect)


1. Interpretation of string class

The string class documentation is summarized below.
Note: The following translation is for reference only.

  1. string is a class that represents a sequence of characters.
  2. Support for such objects is provided by the standard string class, whose interface is similar to that of a standard container of characters, with additional features designed specifically for manipulating strings of single-byte characters.
  3. The string class uses char as its character type, with the default char_traits and allocator types (see basic_string for more information on the template).
  4. The string class is an instantiation of the basic_string class template: it instantiates basic_string with char, and uses char_traits and allocator as the default template arguments (see basic_string for more information on the template).
  5. Note that this class handles bytes independently of the encoding used: if it holds sequences of multibyte or variable-length characters (such as UTF-8), all members of this class (such as length or size) and its iterators still operate in terms of bytes, not actual encoded characters.
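
To make point 5 concrete, here is a minimal sketch of the difference between bytes and encoded characters. The helper names utf8_code_points and utf8_sample are hypothetical, invented for this illustration, and the counting logic assumes the input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: returns two Chinese characters spelled out as raw
// UTF-8 bytes (three bytes each), so the example does not depend on the
// encoding of the source file.
std::string utf8_sample()
{
    return "\xE4\xBD\xA0\xE5\xA5\xBD";
}

void demo_bytes_vs_characters()
{
    const std::string s = utf8_sample();
    assert(s.size() == 6);             // size() counts bytes
    assert(s.length() == 6);           // length() is the same value
    assert(utf8_code_points(s) == 2);  // but there are only two characters
}
```

Calling demo_bytes_vs_characters() from main runs the checks; size() and length() report six bytes for a string that a reader would see as two characters.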

2. Description of common interfaces of string class

2.1 Common constructions of string class objects

Constructor                 Description
string()                    Constructs an empty string object, that is, an empty string
string(const char* s)       Constructs a string object from a C-style string
string(size_t n, char c)    Constructs a string object containing n copies of the character c
string(const string& s)     Copy constructor
#include <string>
using namespace std;

void Teststring()
{
    string s1;              // construct an empty string object s1
    string s2("hello bit"); // construct s2 from a C-style string
    string s3(s2);          // copy-construct s3 from s2
}
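
The fill constructor string(size_t n, char c) from the table is not exercised in the code above. The following is a small sketch that covers all four constructors and checks the results; the function name demo_constructors is an assumption made for this illustration.

```cpp
#include <cassert>
#include <string>

void demo_constructors()
{
    std::string s1;               // default constructor: empty string
    std::string s2("hello bit");  // from a C-style string
    std::string s3(5, 'x');       // fill constructor: five copies of 'x'
    std::string s4(s2);           // copy constructor

    assert(s1.empty());
    assert(s2.size() == 9);
    assert(s3 == "xxxxx");
    assert(s4 == s2);
}
```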

2.2 Capacity operation of string class objects

Function name          Description
size (emphasis)        Returns the number of valid characters in the string
length                 Returns the number of valid characters in the string
capacity               Returns the total size of the allocated space
empty (emphasis)       Returns true if the string is empty, otherwise false
clear (emphasis)       Clears the valid characters
reserve (emphasis)     Reserves space for the string
resize (emphasis)      Changes the number of valid characters to n, filling any added positions with the character c

Small tips:

  1. size() and length() are implemented in exactly the same way. size() was introduced to keep the interface consistent with other containers, and it is the one normally used.
  2. clear() only removes the valid characters in the string; in common implementations it does not change the size of the underlying space.
  3. resize(size_t n) and resize(size_t n, char c) both change the number of valid characters in the string to n. They differ when the number of characters increases: resize(n) fills the added positions with the null character '\0', while resize(size_t n, char c) fills them with the character c. Note: when resize increases the number of elements, the underlying capacity may grow; when it reduces the number of elements, the total size of the underlying space stays the same.
  4. reserve(size_t res_arg=0): reserves space for the string without changing the number of valid elements. When the argument is smaller than the current capacity, reserve does not have to reduce the capacity (since C++20 it never shrinks; in earlier standards a smaller argument is only a non-binding shrink request).

Code demo:
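
A minimal sketch of the capacity operations described above, written for illustration rather than taken from the original article. The function name demo_capacity is an assumption; only behavior guaranteed by the standard is asserted, so exact capacity values are deliberately not checked.

```cpp
#include <cassert>
#include <string>

void demo_capacity()
{
    std::string s("hello");
    assert(s.size() == 5 && s.length() == 5);  // size() and length() agree

    // reserve() grows the capacity but leaves the valid characters alone.
    s.reserve(100);
    assert(s.size() == 5);
    assert(s.capacity() >= 100);

    // resize(n, c) pads with c when the string grows.
    s.resize(8, 'x');
    assert(s == "helloxxx");

    // resize(n) pads with the null character '\0' when the string grows.
    s.resize(10);
    assert(s.size() == 10 && s[9] == '\0');

    // Shrinking just drops the trailing characters.
    s.resize(3);
    assert(s == "hel");

    // clear() removes the valid characters; whether the capacity is kept
    // is up to the implementation, so it is not asserted here.
    s.clear();
    assert(s.empty() && s.size() == 0);
}
```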


2.3 Access and traversal operations of string class objects

Function name            Description
operator[] (emphasis)    Returns the character at position pos; the const overload is called on const string objects
begin + end              begin returns an iterator to the first character; end returns an iterator to the position after the last character
rbegin + rend            rbegin returns a reverse iterator to the last character; rend returns a reverse iterator to the position before the first character
range for                C++11 adds the more concise range-based for loop as a way to traverse a string
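
The four traversal styles in the table can be sketched as follows. This is an illustrative example only; the function name demo_traversal is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

void demo_traversal()
{
    std::string s("hello");

    // operator[] returns a reference, so a character can be changed in place.
    s[0] = 'H';
    assert(s == "Hello");

    // Forward iterators: from begin() up to, but not including, end().
    std::size_t count_l = 0;
    for (std::string::iterator it = s.begin(); it != s.end(); ++it) {
        if (*it == 'l') {
            ++count_l;
        }
    }
    assert(count_l == 2);

    // Range-based for (C++11) visits the same characters with less syntax.
    std::string copy;
    for (char ch : s) {
        copy += ch;
    }
    assert(copy == s);

    // Reverse iterators: rbegin() starts at the last character.
    std::string reversed(s.rbegin(), s.rend());
    assert(reversed == "olleH");
}
```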

2.4 Modification operations of string class objects

Function name             Description
push_back                 Appends the character c to the end of the string
append                    Appends a string to the end of the string
operator+= (emphasis)     Appends the string str to the end of the string
c_str (emphasis)          Returns a C-style string
find + npos (emphasis)    Searches forward from position pos for the character c and returns its position in the string; returns npos if it is not found
rfind                     Searches backward from position pos for the character c and returns its position in the string
substr                    Returns the n characters of str starting at position pos
Small tips:
  • To append a single character to the end of a string, s.push_back(c), s.append(1, c) and s += c are equivalent. The += operator is used most often, because it can append strings as well as single characters.
  • If you can roughly estimate how many characters a string will hold, call reserve first to set the space aside in advance.
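
As a rough illustration of the modification and search functions above, the sketch below builds a string piece by piece and then takes it apart again. The host name "www.example.com" and the function name demo_modify are sample choices made for this example, not part of the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

void demo_modify()
{
    std::string s;
    s.reserve(32);          // rough size estimate, avoids repeated growth

    s.push_back('w');       // append one character
    s.append(2, 'w');       // append two copies of a character
    s += '.';               // += works with a single character...
    s += "example.com";     // ...and with a string
    assert(s == "www.example.com");

    // find searches forward, rfind searches backward.
    const std::size_t first_dot = s.find('.');
    const std::size_t last_dot = s.rfind('.');
    assert(first_dot == 3);
    assert(last_dot == 11);

    // A failed search returns npos.
    assert(s.find('#') == std::string::npos);

    // substr(pos, n) copies n characters; omitting n copies to the end.
    assert(s.substr(first_dot + 1, last_dot - first_dot - 1) == "example");
    assert(s.substr(last_dot + 1) == "com");

    // c_str() exposes a null-terminated view for C APIs such as strlen.
    assert(std::strlen(s.c_str()) == s.size());
}
```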

2.5 Non-member functions of string class

Function                           Description
operator+                          Use it sparingly: it returns by value, so the deep copy makes it inefficient
operator>> (emphasis)              Overloaded input operator
operator<< (emphasis)              Overloaded output operator
getline (emphasis)                 Reads a whole line into a string
relational operators (emphasis)    Compare two strings
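
The non-member functions can be tried without a console by reading from and writing to string streams. This is a sketch for illustration; the function name demo_non_members and the sample text are assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

void demo_non_members()
{
    std::istringstream input("hello world\nsecond line");

    // operator>> stops at whitespace.
    std::string word;
    input >> word;
    assert(word == "hello");

    // getline reads the rest of the current line, spaces included.
    std::string rest;
    std::getline(input, rest);
    assert(rest == " world");

    std::string line;
    std::getline(input, line);
    assert(line == "second line");

    // operator+ builds and returns a new string by value.
    const std::string joined = word + rest;
    assert(joined == "hello world");

    // operator<< writes the characters to an output stream.
    std::ostringstream output;
    output << joined;
    assert(output.str() == "hello world");

    // Relational operators compare lexicographically.
    assert(std::string("apple") < std::string("banana"));
    assert(std::string("abc") != "abd");
}
```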

3. Description of string structure under vs and g++

Note: The following structure is verified on a 32-bit platform. The pointer occupies 4 bytes on a 32-bit platform.

  • The structure of string under vs
    A string occupies 28 bytes in total, and its internal structure is slightly more complicated. First there is a union that defines the storage space for the string's characters:
     . When the string length is less than 16, an internal fixed character array is used for storage
     . When the string length is greater than or equal to 16, space is allocated from the heap
union _Bxty
{
    // storage for small buffer or pointer to larger one
    value_type _Buf[_BUF_SIZE];
    pointer _Ptr;
    char _Alias[_BUF_SIZE]; // to permit aliasing
} _Bx;

This design makes sense: in most cases a string is shorter than 16 characters, so once the string object is created it already holds a fixed 16-character array internally and needs no heap allocation, which is efficient.
Secondly, there is a size_t field that stores the length of the string and a size_t field that stores the total capacity of the allocated space.
Finally, there is a pointer used for other purposes.
Therefore, the object occupies 16 + 4 + 4 + 4 = 28 bytes in total.
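
One way to observe the small-buffer behavior described above is to check whether a string's character data lives inside the string object itself. The sketch below is an assumption-laden illustration: the helper stored_inline is hypothetical, the result for short strings depends on the compiler, standard library, and build settings, and only the long-string case is asserted because every mainstream implementation places it on the heap.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: reports whether the character buffer of s lies
// inside the string object itself (small buffer) rather than on the heap.
bool stored_inline(const std::string& s)
{
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto buffer = reinterpret_cast<std::uintptr_t>(s.data());
    return buffer >= object_begin && buffer < object_end;
}

void demo_layout()
{
    // 100 characters cannot fit in any small buffer, so the data must be
    // stored outside the object.
    const std::string long_text(100, 'x');
    assert(!stored_inline(long_text));

    // For a short string the answer is implementation-specific, so it is
    // computed here but not asserted.
    const std::string short_text("hi");
    const bool short_is_inline = stored_inline(short_text);
    (void)short_is_inline;
}
```

Printing sizeof(std::string) alongside stored_inline for strings of length 15 and 16 is a quick way to compare a given toolchain with the layout described here.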

  • The structure of string under g++
    Under g++, string is implemented with copy-on-write. The string object occupies 4 bytes in total and contains only one pointer internally. This describes the older libstdc++ implementation; since GCC 5 the default std::string uses a small internal buffer instead of copy-on-write.
    That pointer will point to a block of heap space containing the following fields:
     . The total size of the space
     . The effective length of the string
     . The reference count
     . A pointer to the heap space that stores the string
struct _Rep_base
{
    size_type _M_length;
    size_type _M_capacity;
    _Atomic_word _M_refcount;
};
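
To illustrate the copy-on-write idea (shared buffer plus reference count), here is a deliberately simplified sketch. It is not the libstdc++ implementation: the class CowText is hypothetical, it borrows std::shared_ptr for the reference count instead of a hand-written _Rep block, and it is only meant for single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration only; not how libstdc++ does it.
class CowText
{
public:
    explicit CowText(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // The compiler-generated copy constructor copies the shared_ptr,
    // which only increments the reference count (no character copy).

    char at(std::size_t pos) const { return data_->at(pos); }
    std::size_t size() const { return data_->size(); }

    // Writing triggers the deep copy if the buffer is shared.
    void set(std::size_t pos, char c)
    {
        detach();
        data_->at(pos) = c;
    }

    bool shares_buffer_with(const CowText& other) const
    {
        return data_ == other.data_;
    }

    long ref_count() const { return data_.use_count(); }

private:
    // Give this object a private buffer before it is modified.
    void detach()
    {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};

void demo_cow()
{
    CowText a("hello");
    CowText b = a;                     // cheap copy: buffer is shared
    assert(a.shares_buffer_with(b));
    assert(a.ref_count() == 2);

    b.set(0, 'j');                     // first write: b gets its own copy
    assert(!a.shares_buffer_with(b));
    assert(a.at(0) == 'h');
    assert(b.at(0) == 'j');
    assert(a.ref_count() == 1);
}
```

The trade-off is that copies are cheap until the first write, at the cost of a reference-count check on every modifying operation.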


Origin blog.csdn.net/Zhenyu_Coder/article/details/134228886