C++20: list initialization, aggregate initialization, mandatory copy elision, designated initialization...

what is initialization

The type of a variable determines the size and layout of the memory it occupies. When a variable receives a specific value at the moment it is created, we say that the variable is initialized.

Classification and Standard

Variable initialization is an important part of the C++ standard. C++ initialization falls into two broad categories, direct initialization and copy initialization; the other forms of initialization can be classified under these two.

The C++ standard stipulates that initializing a variable with an equals sign (=) is copy initialization; otherwise it is direct initialization. Note that the = in a declaration is not the assignment operator: no operator= is called.
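
The difference between the two forms becomes visible with an explicit constructor. The following is a minimal sketch, not taken from the original article; the type Meters and the function direct_init_value are hypothetical names used only for illustration.

```cpp
#include <type_traits>

// Hypothetical type for illustration: its only constructor is explicit,
// so it can be used for direct initialization but not for copy initialization.
struct Meters {
    explicit Meters(double v) : value(v) {}
    double value;
};

// Direct initialization considers explicit constructors.
double direct_init_value() {
    Meters m(2.5);   // OK: direct initialization
    return m.value;
}

// Meters n = 2.5;   // error: copy initialization cannot use an explicit constructor

// std::is_constructible models direct initialization,
// std::is_convertible models copy initialization.
static_assert(std::is_constructible<Meters, double>::value, "direct initialization works");
static_assert(!std::is_convertible<double, Meters>::value, "copy initialization is rejected");
```

Calling direct_init_value() returns the stored value; the two static_assert lines check the distinction at compile time.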

C++ standard

Ordered by the time of their introduction into the C++ standard, the initialization-related features can be summarized as follows:

  • C++98: direct initialization, copy initialization, aggregate initialization, parenthesis initialization;
  • C++11: list initialization and initializer_list;
  • C++17: mandatory copy elision, list-initialization type deduction, aggregate initialization extension (base classes allowed);
  • C++20: further changes to aggregate initialization, designated initializers.

C++98

C++98 is the first standardized version of the C++ language. The forms of initialization it defines include direct initialization, copy initialization, parenthesis initialization, curly-brace initialization, and aggregate initialization.

Under the classification above, parenthesis initialization is a form of direct initialization, and aggregate initialization written with = { ... } is a special kind of copy initialization.

direct initialization

Initialization that does not use the = form is direct initialization. It generally initializes the variable by calling a matching constructor directly. For example:

std::string s(5, 'c'); // s is initialized to "ccccc"

copy initialization

Copy initialization is the counterpart of direct initialization. Conceptually, it initializes the object through the copy constructor of its type. For example:

std::string name = "liuguang";

In the C++98 standard, initializing name conceptually takes two steps, based on an implicit conversion:

  • Call the std::string constructor that takes a const char* to create a temporary std::string, temp;
  • Call the std::string copy constructor with temp as its argument to complete the initialization of name.

In practice, however, compilers optimize the copy away (copy elision): when debugging, we see that name is initialized by calling the const char* constructor directly.

So the initialization of name is formally copy initialization, but after optimization it is easily mistaken for direct initialization. Options such as GCC's -fno-elide-constructors (and, in some cases, MSVC's /Od) can turn off this optimization where the standard does not make it mandatory.
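
The effect of this optimization can be observed by counting copy constructor calls. The sketch below uses a hypothetical class Name as a stand-in for std::string. It assumes a compiler with copy elision enabled, which is the default everywhere and is required from C++17 on.

```cpp
#include <string>

// Hypothetical class that counts how often its copy constructor runs.
struct Name {
    static int copies;
    std::string text;
    Name(const char* s) : text(s) {}                          // converting constructor
    Name(const Name& other) : text(other.text) { ++copies; }  // counted copy
};
int Name::copies = 0;

// Returns the number of copy constructions observed for `Name n = "...";`.
int copies_for_copy_init() {
    Name::copies = 0;
    Name n = "liuguang";   // copy initialization syntax
    (void)n;
    return Name::copies;
}
```

With elision in effect, copies_for_copy_init() reports zero copies even though the syntax is copy initialization. Compiling in a pre-C++17 mode with -fno-elide-constructors should make the conceptual copy visible.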

parenthesis initialization

Parenthesis initialization initializes a variable with (); it is named after its syntactic form. For example:

double d(1.3);          // d is initialized to 1.3
std::string s(5, 'c');  // s is initialized to "ccccc"

aggregate initialization

Aggregate initialization generally applies to the following two scenarios:

  • Array initialization
char a[3] = {'a', 'b', '\0'};
  • A class, struct or union that satisfies the following conditions
    - no private or protected non-static data members
    - no user-declared constructors
    - no virtual member functions
    - no base class
    - no in-class member initializers (a C++11 restriction that was lifted in C++14)
struct AggrPerson
{
    std::string name;
    int age;
};

AggrPerson aggrPerson = {"liuguang", 20};

curly brace initialization

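In the C++98 standard, curly braces have only one use outside aggregate initialization: a scalar may be copy-initialized from a single braced value. The form without =, such as int units{2};, only became valid with C++11 list initialization. For example:

int units = {2};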

C++11

The C++11 standard is the second major version of the C++ language and holds an important position in the history of C++. C++98 has many different forms of initialization, and C++11 introduces list initialization to unify them.

list initialization

Because C++98 offers so many forms of initialization, initialization had become extremely complicated. To address this, C++11 extends curly-brace initialization so that it can be used everywhere. This new form is called list initialization, and it allows us to initialize objects of all types uniformly with {}.

int a[] = {1, 2, 3};
int aa[]{1, 2, 3};
std::vector<int> aas{1, 2, 3};
std::unordered_map<std::string, int> az{{"1", 1}, {"2", 2}};

Narrowing conversions

Narrowing generally refers to an implicit conversion that may change the value or lose precision. List initialization has an important property when applied to built-in types: it rejects narrowing. If initializing a variable from a braced list risks losing information, the program is ill-formed and the compiler reports a diagnostic. Moreover, list initialization is the only form of initialization in C++11 that prevents narrowing, which is its most important difference from the other forms.

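// Conversion from floating point to integer
int aVal = 2.0;     // compiles under C++98/03
int aVal2 = {2.0};  // C++11: compile error, narrowing conversion

// Conversion from double to float (the value is out of range for float)
float c = 1e70;     // compiles under C++98/03
float d = {1e70};   // C++11: compile error, narrowing conversion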

Typical scenarios that lead to narrowing:

  • Converting a floating-point type to an integer type
  • Converting a higher-precision floating-point type to a lower-precision one (for example, double to float)
  • Converting an integer or unscoped enumeration type to a floating-point type
  • Converting an integer or unscoped enumeration type to an integer type that cannot represent all of its values
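
One detail worth illustrating: a conversion in the categories above is not narrowing when the source is a constant expression whose value fits the target type. The sketch below is not from the original article, and accepted_values_sum is a hypothetical name. It shows initializers that are accepted, with the rejected ones left as comments.

```cpp
// Each active initializer is accepted because the source is a constant
// expression whose value fits the target type, so it is not narrowing.
int accepted_values_sum() {
    char c{65};         // OK: the constant 65 fits in char
    float f{2.5};       // OK: the double constant 2.5 is representable as float
    long long big{42};  // OK: int to long long never loses information
    unsigned u{7};      // OK: the constant 7 is non-negative and fits

    // int i{2.5};        // rejected: double to int is narrowing
    // int n = 300;
    // char small{n};     // rejected: n is not a constant expression
    // unsigned bad{-1};  // rejected: -1 cannot be represented in unsigned

    return c + static_cast<int>(f * 2) + static_cast<int>(big) + static_cast<int>(u);
}
```

Uncommenting any of the rejected lines should produce a narrowing diagnostic from the compiler.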

initializer_list

std::initializer_list is a class template introduced by the C++11 standard library. It can represent an initializer list of any length, but every element must have type T or be implicitly convertible to T. The declaration of initializer_list:

template <class T> class initializer_list;

When calling a function that takes an initializer_list, enclose all of the arguments in a single pair of curly braces. For example:

// Variable number of arguments, all of the same type
void errorMsg(std::initializer_list<std::string> str)
{
    for (auto beg = str.begin(); beg != str.end(); ++beg)
        std::cout << *beg << " ";
    std::cout << std::endl;
}

// Calls
errorMsg({"hello", "error", error});  // error is a std::string
errorMsg({"hello2", well});           // well is a std::string

It should be noted that the elements in the initializer_list object are always constant values, and we cannot change the value of the elements in the initializer_list object.
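
As a self-contained sketch of these points, the hypothetical function sum below accepts a braced list of any length. The two vector helpers show a related consequence: braces prefer a constructor that takes std::initializer_list. All function names here are illustrative and not part of the original article.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Sums any number of ints passed in one braced list.
int sum(std::initializer_list<int> values) {
    int total = 0;
    for (const int& v : values) {  // elements are const; assigning to them would not compile
        total += v;
    }
    return total;
}

// Braces select the std::initializer_list constructor when one exists.
std::size_t braced_size() {
    std::vector<int> v{10, 2};   // two elements: 10 and 2
    return v.size();
}

// Parentheses select the (count, value) constructor instead.
std::size_t parenthesized_size() {
    std::vector<int> v(10, 2);   // ten elements, each equal to 2
    return v.size();
}
```

For example, sum({1, 2, 3}) adds three values, while braced_size() and parenthesized_size() differ even though both receive the arguments 10 and 2.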

C++17

C++17 further improves initialization with three changes: mandatory copy elision, which concerns copy initialization of objects; list-initialization type deduction, which concerns auto type deduction; and an extension of aggregate initialization that allows base classes.

List initialization type deduction

When combined with auto, the list-initialization rules of C++11 often produce results that programmers do not expect and lead to mistakes. For this reason, C++17 refines the rules.

The rules for auto with list initialization in C++17 can be summarized as follows:

  • auto var {element}; deduces var as the same type as element
  • auto var {element1, element2, …}; is ill-formed and produces a compilation error
  • auto var = {element1, element2, …}; deduces var as std::initializer_list<T>, where T is the element type

auto var = {element1, element2, …};
This form of type deduction requires all elements to have exactly the same type; no conversion to a common type is applied.

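/* Direct list initialization: rules changed in C++17 */
auto v {1};           // OK: v is deduced as int
auto w {1, 2};        // error: only a single element is allowed

/* Copy list initialization: unchanged from C++11 */
auto x = {1};         // OK: x is deduced as std::initializer_list<int>
auto y = {1, 2};      // OK: y is deduced as std::initializer_list<int>
auto z = {1, 2, 3.0}; // error: all elements in the list must have the same type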
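
The deduced types can be checked at compile time. The following sketch is an illustration, not part of the original article, and deduced_list_size is a hypothetical name. It assumes a compiler that implements the C++17 rule, which GCC and Clang reportedly also apply in earlier language modes.

```cpp
#include <initializer_list>
#include <type_traits>

// Verifies the deduced types with static_assert and returns a value
// derived from the deduced objects.
int deduced_list_size() {
    auto single{1};          // direct list initialization with one element
    auto braced = {1};       // copy list initialization, one element
    auto several = {1, 2};   // copy list initialization, two elements

    static_assert(std::is_same<decltype(single), int>::value,
                  "auto x{1} deduces int");
    static_assert(std::is_same<decltype(braced), std::initializer_list<int>>::value,
                  "auto x = {1} deduces std::initializer_list<int>");
    static_assert(std::is_same<decltype(several), std::initializer_list<int>>::value,
                  "auto x = {1, 2} deduces std::initializer_list<int>");

    // 1 (the int) + 1 (size of braced) + 2 (size of several)
    return static_cast<int>(single + braced.size() + several.size());
}
```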

Mandatory copy elision

The C++ standard has long tried to reduce temporaries and copy operations in order to improve performance. Before C++17, many compilers already elided these copies, but the standard only permitted the optimization and did not require it. C++17 makes copy elision mandatory in certain cases.

Copy elision can appear in four scenarios: RVO (Return Value Optimization), NRVO (Named Return Value Optimization), construction of an object from a temporary, and exception objects (throwing and catching by value). C++17 guarantees elision only where the initializer is a prvalue, that is, for RVO and for construction from a temporary; in the NRVO and exception cases elision remains permitted but optional.

NRVO (Named Return Value Optimization)

When a function returns a named local object by value, the compiler may apply NRVO and construct the object directly in the caller's storage. C++17 does not make this mandatory, but mainstream compilers commonly perform it in simple cases such as the following:

class Thing
{
public:
  Thing();
  virtual ~Thing();
  Thing(const Thing&);
};

Thing f()
{
  Thing t;
  return t;
}

Thing t2 = f();

The assembly generated for the code above (Compiler Explorer, https://godbolt.org/, can produce such listings):

f():
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        mov     rdi, rax
        call    Thing::Thing() [complete object constructor]
        nop
        mov     rax, QWORD PTR [rbp-8]
        leave
        ret
t2:
        .zero   1
__static_initialization_and_destruction_0():
        push    rbp
        mov     rbp, rsp
        mov     eax, OFFSET FLAT:t2
        mov     rdi, rax
        call    f()
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:t2
        mov     edi, OFFSET FLAT:_ZN5ThingD1Ev
        call    __cxa_atexit
        nop
        pop     rbp
        ret
_GLOBAL__sub_I_f():
        push    rbp
        mov     rbp, rsp
        call    __static_initialization_and_destruction_0()
        pop     rbp
        ret

We can see that f() calls the constructor only once, and it does so on the address passed in by the caller (rdi, which holds the address of t2). The object is therefore constructed directly in t2: no temporary is created for the return value, and neither a copy nor a move takes place.
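
The same observation can be made at run time by counting copy constructor calls. The sketch below uses a hypothetical class Counted. Because NRVO is an optional optimization, the count is zero when the compiler applies it and one when it does not (assuming C++17, where the copy from the returned value into the variable itself is always elided).

```cpp
// Hypothetical class that counts copy constructions.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

Counted make_named() {
    Counted local;
    return local;      // NRVO candidate: elision is allowed, not required
}

// Returns how many copies were made while initializing from make_named().
int copies_with_nrvo() {
    Counted::copies = 0;
    Counted result = make_named();
    (void)result;
    return Counted::copies;
}
```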

RVO (Return Value Optimization)

In C++17, when a function returns a prvalue (a temporary) of its return type, copy elision is mandatory.

class Thing
{
public:
  Thing();
  ~Thing();
  Thing(const Thing&);
};

Thing f()
{
  return Thing();
}

Thing t2 = f();

The assembly generated for the code above (Compiler Explorer, https://godbolt.org/, can produce such listings):

f():
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        mov     rdi, rax
        call    Thing::Thing() [complete object constructor]
        mov     rax, QWORD PTR [rbp-8]
        leave
        ret
t2:
        .zero   1
__static_initialization_and_destruction_0():
        push    rbp
        mov     rbp, rsp
        mov     eax, OFFSET FLAT:t2
        mov     rdi, rax
        call    f()
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:t2
        mov     edi, OFFSET FLAT:_ZN5ThingD1Ev
        call    __cxa_atexit
        nop
        pop     rbp
        ret
_GLOBAL__sub_I_f():
        push    rbp
        mov     rbp, rsp
        call    __static_initialization_and_destruction_0()
        pop     rbp
        ret

We can see that f() again calls the constructor only once, directly on the storage of t2 whose address the caller passes in. No temporary is created for the return value, and no copy or move takes place. The generated code has the same shape as in the NRVO case; the difference is that here C++17 guarantees the result.
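
Because this elision is guaranteed, a C++17 function can even return a type that has no usable copy or move constructor. The following sketch must be compiled as C++17 or later; the type Pinned and the functions are hypothetical names used only for illustration.

```cpp
// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int id;
    explicit Pinned(int i) : id(i) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Returning a prvalue: C++17 constructs the result directly in the caller's object.
Pinned make_pinned(int id) {
    return Pinned(id);
}

int pinned_id() {
    Pinned p = make_pinned(7);  // well-formed in C++17 and later, ill-formed in C++14
    return p.id;
}
```

Under C++14 the same code is rejected, since the copy or move constructor must be accessible there even when the compiler elides the call.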

Constructed from a temporary

If an object is initialized from a temporary (a prvalue) of the same type, copy elision is likewise mandatory. For example:

class Thing
{
public:
  Thing();
  ~Thing();
  Thing(const Thing&);
};

Thing t2 = Thing();
Thing t3 = Thing(Thing());

The generated assembly (Compiler Explorer, https://godbolt.org/, can produce such listings):

t2:
        .zero   1
t3:
        .zero   1
__static_initialization_and_destruction_0():
        push    rbp
        mov     rbp, rsp
        mov     edi, OFFSET FLAT:t2
        call    Thing::Thing() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:t2
        mov     edi, OFFSET FLAT:_ZN5ThingD1Ev
        call    __cxa_atexit
        mov     edi, OFFSET FLAT:t3
        call    Thing::Thing() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:t3
        mov     edi, OFFSET FLAT:_ZN5ThingD1Ev
        call    __cxa_atexit
        nop
        pop     rbp
        ret
_GLOBAL__sub_I_t2:
        push    rbp
        mov     rbp, rsp
        call    __static_initialization_and_destruction_0()
        pop     rbp
        ret

From the assembly above we can conclude that t2 and t3 are handled identically: each is constructed in place by a single call to the default constructor. No temporary is materialized, and none of the intermediate copy constructions takes place.
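
A run-time check of the same behavior, as a minimal sketch: the hypothetical class Probe records which constructors run while one object is initialized from nested temporaries.

```cpp
// Hypothetical class that records which constructors ran.
struct Probe {
    static int defaults;
    static int copies;
    Probe() { ++defaults; }
    Probe(const Probe&) { ++copies; }
};
int Probe::defaults = 0;
int Probe::copies = 0;

// Initializes one object from nested temporaries and encodes the counts
// as defaults * 10 + copies.
int constructions_for_nested_temporaries() {
    Probe::defaults = 0;
    Probe::copies = 0;
    Probe p = Probe(Probe());
    (void)p;
    return Probe::defaults * 10 + Probe::copies;
}
```

With elision in effect the function returns 10, meaning one default construction and no copies.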

Exception object caught by value

When an exception is thrown and caught by value, copy elision is permitted but not mandatory. For example:

struct Thing
{
  Thing();
  Thing(const Thing&);
};

void foo()
{
  Thing c;
  throw c;
}

int main()
{
  try
  {
    foo();
  }
  catch (Thing c)
  {
  }
}

The generated assembly (Compiler Explorer, https://godbolt.org/, can produce such listings):

foo():
        push    rbp
        mov     rbp, rsp
        push    r12
        push    rbx
        sub     rsp, 16
        lea     rax, [rbp-17]
        mov     rdi, rax
        call    Thing::Thing() [complete object constructor]
        mov     edi, 1
        call    __cxa_allocate_exception
        mov     rbx, rax
        lea     rax, [rbp-17]
        mov     rsi, rax
        mov     rdi, rbx
        call    Thing::Thing(Thing const&) [complete object constructor]
        mov     edx, 0
        mov     esi, OFFSET FLAT:typeinfo for Thing
        mov     rdi, rbx
        call    __cxa_throw
        mov     r12, rax
        mov     rdi, rbx
        call    __cxa_free_exception
        mov     rax, r12
        mov     rdi, rax
        call    _Unwind_Resume
main:
        push    rbp
        mov     rbp, rsp
        push    rbx
        sub     rsp, 24
        call    foo()
.L8:
        mov     eax, 0
        jmp     .L10
        mov     rbx, rax
        mov     rax, rdx
        cmp     rax, 1
        je      .L7
        mov     rax, rbx
        mov     rdi, rax
        call    _Unwind_Resume
.L7:
        mov     rax, rbx
        mov     rdi, rax
        call    __cxa_get_exception_ptr
        mov     rdx, rax
        lea     rax, [rbp-17]
        mov     rsi, rdx
        mov     rdi, rax
        call    Thing::Thing(Thing const&) [complete object constructor]
        mov     rax, rbx
        mov     rdi, rax
        call    __cxa_begin_catch
        call    __cxa_end_catch
        jmp     .L8
.L10:
        mov     rbx, QWORD PTR [rbp-8]
        leave
        ret
typeinfo for Thing:
        .quad   vtable for __cxxabiv1::__class_type_info+16
        .quad   typeinfo name for Thing
typeinfo name for Thing:
        .string "5Thing"

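Analyzing the assembly, we find two calls to the copy constructor Thing::Thing(Thing const&): one in foo(), which copies the local c into the exception object allocated by __cxa_allocate_exception, and one in main, which copies the exception object into the c of catch(Thing c). Neither copy is elided in this build, which shows that copy elision for exceptions is optional rather than mandatory. Catching by reference, as in catch(const Thing& c), avoids the second copy.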
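
The copies involved in throwing and catching can also be counted at run time. The sketch below uses a hypothetical exception type Failure and compares catching by value with catching by reference. The exact counts depend on which optional elisions the compiler performs, so the code makes no assumption about them.

```cpp
// Hypothetical exception type that counts copy constructions.
struct Failure {
    static int copies;
    Failure() {}
    Failure(const Failure&) { ++copies; }
};
int Failure::copies = 0;

void raise() {
    Failure f;
    throw f;   // copies f into the exception object unless the compiler elides it
}

// Catching by value may copy the exception object into the handler's variable.
int copies_when_caught_by_value() {
    Failure::copies = 0;
    try { raise(); } catch (Failure f) { (void)f; }
    return Failure::copies;
}

// Catching by reference binds to the exception object without copying it.
int copies_when_caught_by_reference() {
    Failure::copies = 0;
    try { raise(); } catch (const Failure&) {}
    return Failure::copies;
}
```

Catching by reference never needs more copies than catching by value, which is one reason it is the usual recommendation.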

Aggregate initialization that allows inheritance

Starting from C++17, aggregates can also have base classes, so structs derived from other classes/structs can be aggregate-initialized:

struct NewPerson : AggrPerson
{
    bool isMan;
};

NewPerson person{{"liuguang", 20}, true};

Aggregate initialization now supports passing initial values to the members of the base class with nested braces, which can also be omitted:

NewPerson person{"liuguang", 20, true};

As of C++17, aggregates are defined as follows:

  • an array
  • or a class type (class, struct, or union) that has:
    – no user-provided or explicit constructors
    – no constructors inherited via using declarations
    – no private or protected non-static data members
    – no virtual functions
    – no virtual, private or protected base classes

For an aggregate to be initialized this way, the initialization must also not rely on any private or protected base class members or constructors.

struct A   // aggregate in C++17
{
    A() = delete;
};
struct B   // aggregate in C++17
{
    B() = default;
    int i = 0;
};
struct C   // aggregate in C++17
{
    C(C&&) = default;
    int a, b;
};

A a{};                  // valid in C++17
B b = {1};              // valid in C++17
auto* c = new C{2, 3};  // valid in C++17

C++17 introduces a new type trait, std::is_aggregate<>, for testing whether a type is an aggregate.

template<typename T>
struct D : std::string, std::complex<T>
{
    std::string data;
};

D<float> s{{"hello"}, {4.5, 6.7}, "world"};  // valid since C++17
std::cout << std::is_aggregate<decltype(s)>::value; // prints: 1
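
To tie the C++17 rules together, here is a self-contained sketch that must be compiled as C++17 or later. BasePerson and DerivedPerson are stand-ins for the AggrPerson and NewPerson types of this section, and WithPrivateMember is a hypothetical counterexample.

```cpp
#include <string>
#include <type_traits>

// Stand-ins for the AggrPerson and NewPerson types used in this section.
struct BasePerson {
    std::string name;
    int age;
};

struct DerivedPerson : BasePerson {   // public base: still an aggregate in C++17
    bool isMan;
};

struct WithPrivateMember {            // private data member: not an aggregate
    int get() const { return hidden; }
private:
    int hidden = 0;
};

static_assert(std::is_aggregate_v<BasePerson>);
static_assert(std::is_aggregate_v<DerivedPerson>);
static_assert(!std::is_aggregate_v<WithPrivateMember>);

// Nested braces initialize the base class subobject; they may also be omitted.
int combined_age() {
    DerivedPerson nested{{"liuguang", 20}, true};
    DerivedPerson flat{"liuguang", 20, true};
    return nested.age + flat.age;
}
```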

C++20

C++20 further refines aggregate initialization in two respects: first, a class with any user-declared constructor is no longer an aggregate; second, C++ gains designated initializers, largely compatible with those of the C language.

Disallow aggregate constructors

Proposal P1008R1, adopted for C++20, specifies that as soon as a class or struct declares any constructor, even a defaulted or deleted one, it is no longer an aggregate. For example:

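struct A   // aggregate in C++17 and earlier, not an aggregate in C++20
{
    A() = delete;
};
struct B   // aggregate in C++17 and earlier, not an aggregate in C++20
{
    B() = default;
    int i = 0;
};
struct C   // aggregate in C++17 and earlier, not an aggregate in C++20
{
    C(C&&) = default;
    int a, b;
};

A a{};                  // ill-formed in C++20, valid in C++17 and earlier
B b = {1};              // ill-formed in C++20, valid in C++17 and earlier
auto* c = new C{2, 3};  // ill-formed in C++20, valid in C++17 and earlier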
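
The change can be observed with std::is_aggregate. The sketch below uses the hypothetical types Plain and Defaulted and needs at least C++17 for the trait; the result for Defaulted then depends on the language mode the file is compiled in.

```cpp
#include <type_traits>

struct Plain {            // no constructor declared: an aggregate in every mode
    int i;
};

struct Defaulted {        // constructor declared, even though it is defaulted
    Defaulted() = default;
    int i;
};

// Expected to be true when compiled as C++17 and false as C++20 or later.
constexpr bool defaulted_ctor_is_aggregate() {
    return std::is_aggregate<Defaulted>::value;
}

constexpr bool plain_is_aggregate() {
    return std::is_aggregate<Plain>::value;
}
```

Compiling the same file once with -std=c++17 and once with -std=c++20 should flip the result of defaulted_ctor_is_aggregate().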

Designated initialization

C++20 supports the following two forms of designated initialization:

T object = { .designator1 = arg1, .designator2 { arg2 } ... };
T object { .designator1 = arg1, .designator2 { arg2 } ... };
  • Each designator must name a direct non-static data member of T, and all designators used in the expression must appear in the same order as the data members of T. For example:
struct A
{
    int x;
    int y;
    int z;
};

A a{.y = 2, .x = 1}; // Error: the order of the designators does not match the declaration order
A b{.x = 1, .z = 2}; // OK: b.y is initialized to 0
  • Each direct non-static data member named by a designator is initialized from the brace-or-equals initializer that follows the designator. Narrowing conversions are prohibited.
  • Designated initializers can be used to initialize a union to a state other than its first member. Only one initializer may be provided for a union.
union u
{
    int a;
    const char* b;
};

u f = {.b = "asdf"};         // OK: the active member of the union is b
u g = {.a = 1, .b = "asdf"}; // Error: only one initializer may be provided
  • Elements of a non-union aggregate for which no designated initializer is provided are initialized in the same way as when the number of initializer clauses is less than the number of members (the default member initializer is used if one is provided, otherwise empty list initialization).
struct A
{
    std::string str;
    int n = 42;
    int m = -1;
};

A{.m = 21}  // initializes str with {}, which calls the default constructor
            // then initializes n with = 42
            // then initializes m with = 21
  • If the aggregate initialized with a designated initializer clause has an anonymous union member, the corresponding designated initializer must name one of the members of that anonymous union.
struct C
{
    union
    {
        int a;
        const char* p;
    };

    int x;
} c = {.a = 1, .x = 3}; // initializes c.a with 1 and c.x with 3
  • Out-of-order designated initialization, nested designated initialization, mixing of designated and regular initializers, and designated initialization of arrays are supported in the C programming language but not in C++.
struct A { int x, y; };
struct B { struct A a; };

struct A a = {.y = 1, .x = 2}; // valid in C, invalid in C++ (out of order)
int arr[3] = {[1] = 5};        // valid in C, invalid in C++ (array)
struct B b = {.a.x = 0};       // valid in C, invalid in C++ (nested)
struct A a = {.x = 1, 2};      // valid in C, invalid in C++ (mixed)
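
Putting the designated-initializer rules together, the following is a minimal C++20 sketch. The types Options and Value and the helper functions are hypothetical names chosen for illustration only.

```cpp
#include <string>

// Hypothetical aggregate with default member initializers.
struct Options {
    std::string name;   // no default member initializer: initialized from {}
    int retries = 3;
    int timeout = -1;
};

// Designators follow declaration order; the skipped member `retries`
// keeps its default member initializer.
Options make_options() {
    return Options{.name = "job", .timeout = 30};
}

union Value {
    int number;
    const char* text;
};

// A designator can make a member other than the first one active.
const char* active_text() {
    Value v = {.text = "asdf"};
    return v.text;
}
```

Swapping the two designators in make_options(), so that .timeout comes before .name, would be rejected in C++ because the order no longer matches the declaration order.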

Summary

Variable initialization is an important part of the C++ standard. C++ initialization falls into two broad categories, direct initialization and copy initialization, and the other forms can be classified under these two. Following this classification, this article introduced: direct initialization, copy initialization, aggregate initialization, and parenthesis initialization from C++98; list initialization and initializer_list from C++11; mandatory copy elision, list-initialization type deduction, and the aggregate initialization extension (base classes allowed) from C++17; and the aggregate changes and designated initializers of C++20.

Origin blog.csdn.net/liuguang841118/article/details/127912564