How C++ implements polymorphism with virtual functions: memory layout of objects and virtual function tables

Basic concepts

C++ implements dynamic polymorphism through virtual functions. The C++ standard does not specify how virtual functions must be implemented, but in practice compilers implement them with virtual function tables (the details are up to each compiler). To understand dynamic polymorphism, we therefore need to understand how the virtual function table works.

Member functions declared with the virtual keyword are virtual functions.
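
As a minimal sketch of what this means in practice, the following example contrasts a virtual call with a non-virtual one. The classes Animal and Dog and the function describe are made up for this illustration and do not appear elsewhere in this article.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, used only for this illustration.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }  // virtual: chosen at run time
    std::string kind() const { return "Animal"; }        // non-virtual: chosen at compile time
};

struct Dog : Animal {
    std::string sound() const override { return "Woof"; }
    std::string kind() const { return "Dog"; }  // hides Animal::kind, does not override it
};

// The static type of a is Animal; its dynamic type may be Dog.
std::string describe(const Animal& a) {
    return a.kind() + ":" + a.sound();
}
```

With these definitions, describe(Dog{}) yields "Animal:Woof": kind() is selected from the static type, while sound() is selected from the dynamic type.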

For every class that has virtual functions, and for every class derived from one, the compiler generates a virtual function table at compile time, called the virtual table (vtbl) for short. The virtual table is an array of virtual function pointers.

Every object of a class with virtual functions contains a virtual table pointer (vfptr). The compiler generates this pointer, and it is initialized during object construction. vfptr points to the first virtual function pointer in the virtual table (that is, the value of vfptr is the address of that first entry).
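
To make the mechanism concrete, here is a hand-written imitation of a virtual table in plain C++. It is only a sketch of the idea, not what any particular compiler emits; VTable, Object and all function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// A hand-written imitation of compiler-generated dispatch.
// Every name here is hypothetical.
struct Object;

struct VTable {
    std::string (*f)(const Object*);  // slot 0
    std::string (*g)(const Object*);  // slot 1
};

struct Object {
    const VTable* vptr;  // plays the role of vfptr: first member, set on construction
    long value;
};

std::string base_f(const Object*) { return "Base::f"; }
std::string base_g(const Object*) { return "Base::g"; }
std::string derived_f(const Object*) { return "Derived::f"; }

// One table per "class", built once and shared by all objects of that class.
const VTable base_vtable = {base_f, base_g};
const VTable derived_vtable = {derived_f, base_g};  // f overridden, g inherited

// "Constructors": the only place where vptr is set.
Object make_base() { return Object{&base_vtable, 1}; }
Object make_derived() { return Object{&derived_vtable, 2}; }

// A "virtual call": load vptr, pick the slot, call through the pointer.
std::string call_f(const Object& o) { return o.vptr->f(&o); }
std::string call_g(const Object& o) { return o.vptr->g(&o); }
```

The caller never needs to know which "class" an Object belongs to: call_f on the result of make_derived() reaches derived_f, while call_g still reaches base_g.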

Print object layout or virtual table layout

VS2017

In the project properties, open C/C++ -> Command Line and add /d1 reportSingleClassLayoutXXX (XXX is the class name). After recompiling, the object layout and virtual table layout appear in the build output window.

View all class layouts

/d1 reportAllClassLayout 

View specific class layout

/d1 reportSingleClassLayoutXXX

gcc

View virtual table layout

g++ -fdump-class-hierarchy  Base.cpp

// newer g++ (e.g. 8.3.1), where -fdump-class-hierarchy is no longer available
g++ -fdump-lang-class Base.cpp

This generates a file such as Base.cpp.002t.class (the numeric part of the name varies with the gcc version), which contains the virtual table layout.

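Printing the object layout or virtual table with gdb

Enable printing of virtual tables
set print vtbl on

Print the virtual table of an object
i vtbl OBJECT

Print an object's actual derived type, as determined from its virtual table
set p object on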

Memory layout of objects and virtual tables

To make testing on both Windows and Linux easier, all examples are built with an x64 compiler, and data members use the type intptr_t (an integer with the same size as a pointer).

The following sections cover the memory layout with virtual functions in five cases: no inheritance, single inheritance, multiple inheritance, diamond inheritance, and virtual inheritance.

All tests below were run on VS2017 or gcc5.4.0.

No inheritance, virtual functions

Object layout: The first member is the virtual table pointer.

Virtual table layout: The order of virtual function pointers is the order of virtual function declarations.

#include <cstdint>
#include <iostream>
using namespace std;

class NoInhert_A {
public:
    NoInhert_A() { cout << "NoInhert_A::NoInhert_A()" << endl; }
    virtual ~NoInhert_A() { cout << "NoInhert_A::~NoInhert_A()" << endl; }
    virtual void f() { cout << "NoInhert_A::f()" << endl; }
    virtual void g() { cout << "NoInhert_A::g()" << endl; }
    intptr_t a = 1;
};

VS2017 

You can see that the size of class NoInhert_A is 16 bytes, the virtual table pointer occupies 8 bytes, and the variable a occupies 8 bytes. The virtual table pointer vfptr points to the first virtual function pointer of the virtual table, which is the destructor. Different virtual function pointers can be accessed through the offset of vfptr.
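
The numbers can also be checked without compiler switches. The sketch below uses a hypothetical stand-in class, Layout_A, with the same shape as NoInhert_A, and assumes a mainstream implementation that places a single virtual table pointer at the start of the object; the C++ standard itself guarantees none of this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for NoInhert_A, reduced to what affects the layout.
struct Layout_A {
    virtual ~Layout_A() {}
    virtual void f() {}
    virtual void g() {}
    std::intptr_t a = 1;
};

// Distance in bytes from the start of the object to the member a.
std::ptrdiff_t offset_of_a() {
    Layout_A obj;
    const char* start = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.a);
    return member - start;
}
```

On x64 with VS2017 or gcc, sizeof(Layout_A) is expected to be 16 and offset_of_a() to return 8. Adding more virtual functions should leave both values unchanged, because only the table grows, not the object.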

gcc5.4.0

no_inhert.cpp.002t.class file output:

gdb output:

You can see that the object layout is the same as VS, both are 16 bytes.

The virtual table pointer vfptr points to the first virtual function pointer of the virtual table. The table holds two virtual destructor entries followed by the two virtual functions. This is where VS and gcc differ: gcc emits two destructor pointers, one that only destroys the object (used for objects that are not deleted, such as those on the stack) and one that destroys the object and then frees its memory (used by delete for heap objects).
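
The difference between the two kinds of destruction can be observed from ordinary code. The following sketch uses a hypothetical class Tracked with a class-specific operator delete; it shows the behavior that the two gcc entries serve, not the entries themselves.

```cpp
#include <cassert>
#include <new>

// Hypothetical class that counts destructor runs and deallocations.
struct Tracked {
    static int destroyed;
    static int freed;

    virtual ~Tracked() { ++destroyed; }

    // Class-specific operator delete, so that freeing memory is observable.
    static void operator delete(void* p) {
        ++freed;
        ::operator delete(p);
    }
};

int Tracked::destroyed = 0;
int Tracked::freed = 0;

void destroy_on_stack() {
    Tracked t;  // leaving the scope runs the destructor only
}

void destroy_on_heap() {
    Tracked* p = new Tracked;
    delete p;  // runs the destructor, then operator delete
}
```

After destroy_on_stack() the counters read destroyed == 1 and freed == 0; after a following destroy_on_heap() they read destroyed == 2 and freed == 1.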

In the gcc dump, the virtual table actually starts before the address that vfptr points to. Its first item is offset_to_top, the offset from the location of this virtual table pointer to the top (start) address of the object. Here it is 0, because the virtual table pointer is the first item in the object's memory. It is nonzero only when the virtual table pointer is not at the start of the object, as happens with multiple inheritance.

The second item of the virtual table is type_info, which is the RTTI pointer, pointing to runtime type information, used for runtime type identification, and used for typeid and dynamic_cast.
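
A short sketch of the two language features that rely on this pointer. The hierarchy Shape, Circle, Square and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// typeid applied to a reference of polymorphic type reports the dynamic type.
bool is_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// dynamic_cast on a pointer returns a null pointer when the dynamic type
// does not match.
const Square* as_square(const Shape* s) {
    return dynamic_cast<const Square*>(s);
}
```

Both functions only receive a Shape, yet they can tell a Circle from a Square, because the run-time type information is reachable from the object itself.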

The overall memory layout is as follows:

Single inheritance

Single inheritance and multi-level inheritance are relatively simple.

Object layout: members of the parent class first, then members of the subclass.

Virtual table layout: When a subclass overrides a virtual function, the virtual function pointer of the parent class is replaced by the virtual function pointer of the subclass.

#include <cstdint>
#include <iostream>
using namespace std;

class SingleInhert_A {
public:
    SingleInhert_A() { cout << "SingleInhert_A::SingleInhert_A()" << endl; }
    virtual ~SingleInhert_A() { cout << "SingleInhert_A::~SingleInhert_A()" << endl; }
    virtual void f() { cout << "SingleInhert_A::f()" << endl; }
    virtual void g() { cout << "SingleInhert_A::g()" << endl; }
    intptr_t a = 1;
};

class SingleInhert_B : public SingleInhert_A {
public:
    SingleInhert_B() { cout << "SingleInhert_B::SingleInhert_B()" << endl; }
    virtual ~SingleInhert_B() { cout << "SingleInhert_B::~SingleInhert_B()" << endl; }
    virtual void f() override { cout << "SingleInhert_B::f()" << endl; }
    virtual void h() { cout << "SingleInhert_B::h()" << endl; }
    intptr_t b = 2;
};

int main() {
    SingleInhert_B obj;
    return 0;
}

VS2017

gcc5.4.0

single_inhert.cpp.002t.class file output:

gdb output:

The overall memory layout is as follows:

You can see that in the virtual table of SingleInhert_B, f is overridden, g is inherited, and h is a new virtual function.
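
The same conclusions can be checked from code. This is a sketch with hypothetical reduced classes SA and SB that mirror SingleInhert_A and SingleInhert_B; the address and size checks assume a mainstream x64 implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical reduced versions of SingleInhert_A / SingleInhert_B.
struct SA {
    virtual ~SA() {}
    virtual std::string f() { return "SA::f"; }
    virtual std::string g() { return "SA::g"; }
    std::intptr_t a = 1;
};

struct SB : SA {
    std::string f() override { return "SB::f"; }
    virtual std::string h() { return "SB::h"; }
    std::intptr_t b = 2;
};

// Calls made through a base-class pointer use the table of the dynamic type.
std::string call_f_and_g(SA* p) { return p->f() + " " + p->g(); }

// With single inheritance the base subobject normally starts at the same
// address as the derived object, so the conversion changes no bits.
bool base_at_same_address(SB* p) {
    return static_cast<void*>(static_cast<SA*>(p)) == static_cast<void*>(p);
}
```

For an SB object, call_f_and_g returns "SB::f SA::g": the slot of f was replaced and the slot of g was inherited.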

Multiple inheritance

#include <cstdint>
#include <iostream>
using namespace std;

// 16 bytes
class Base1 {
public:
    Base1() { cout << "Base1::Base1()" << endl; }
    virtual ~Base1() { cout << "Base1::~Base1()" << endl; }
    virtual void f() { cout << "Base1::f()" << endl; }
    virtual void g() { cout << "Base1::g()" << endl; }
    intptr_t a = 1;
};

// 16 bytes
class Base2 {
public:
    Base2() { cout << "Base2::Base2()" << endl; }
    virtual ~Base2() { cout << "Base2::~Base2()" << endl; }
    virtual void f() { cout << "Base2::f()" << endl; }
    virtual void g() { cout << "Base2::g()" << endl; }
    virtual void h() { cout << "Base2::h()" << endl; }
    intptr_t b = 2;
};

// 40 bytes
class Derived : public Base1, public Base2 {
public:
    Derived() { cout << "Derived::Derived()" << endl; }
    virtual ~Derived() { cout << "Derived::~Derived()" << endl; }
    void f() override { cout << "Derived::f()" << endl; }
    void h() override { cout << "Derived::h()" << endl; }
    virtual void k() { cout << "Derived::k()" << endl; }
    intptr_t c = 3;
};

int main() {
    Derived d;
    return 0;
}

VS2017

/d1 reportSingleClassLayoutDerived

You can see the object layout: first the members of the parent class Base1, then the members of Base2, and finally the members of the subclass Derived.

Note that Derived inherits from two parent classes, so it has two virtual tables, and a Derived object holds two virtual table pointers. Derived shares the first virtual table with Base1.

The overall memory layout is as follows:

You can see that f() and the destructor of the subclass Derived override the corresponding entries of Base1, &Base1::g and &Base2::g are inherited unchanged, and Derived's new &Derived::k is appended to the end of the Base1 virtual table. Derived and Base1 share the first virtual table.

What problem does multiple inheritance need to solve?

Multiple inheritance requires adjusting the this pointer.

Consider the following situation:

Base2* pb2 = new Derived;
delete pb2;

When the address of the Derived object is assigned to a Base2 pointer, the pointer must be moved toward higher addresses by sizeof(Base1), so that it points at the Base2 subobject and the members of Base2 can be accessed.

When pb2 is deleted, the this pointer must be moved back by sizeof(Base1), so that the destructor of Derived runs on the start address of the whole object.
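
The adjustment is visible from ordinary code. The sketch below uses hypothetical reduced classes MBase1, MBase2 and MDerived in place of Base1, Base2 and Derived, and assumes the layout described in this article (first base at offset 0, second base directly after it).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reduced versions of Base1, Base2 and Derived.
struct MBase1 { virtual ~MBase1() {} std::intptr_t a = 1; };
struct MBase2 { virtual ~MBase2() {} std::intptr_t b = 2; };
struct MDerived : MBase1, MBase2 { std::intptr_t c = 3; };

// Byte distance between the start of the object and its MBase2 subobject.
std::ptrdiff_t base2_offset(MDerived* d) {
    MBase2* pb2 = d;  // the implicit conversion adjusts the pointer
    return reinterpret_cast<char*>(pb2) - reinterpret_cast<char*>(d);
}

// dynamic_cast<void*> yields the address of the most derived object,
// i.e. it undoes the adjustment.
bool recovers_start(MDerived* d) {
    MBase2* pb2 = d;
    return dynamic_cast<void*>(pb2) == static_cast<void*>(d);
}
```

On x64 with VS2017 or gcc, base2_offset is expected to return 16, which is sizeof(MBase1). Deleting an MDerived through an MBase2 pointer works because the virtual destructor call performs the reverse adjustment.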

Summary of multiple inheritance under VS2017

  • If the subclass overrides a virtual function, the corresponding entry in the parent class's virtual table is replaced by the subclass's virtual function pointer.
  • The subclass shares a virtual table with the first parent class, and the virtual function pointers unique to the subclass are appended to the end of that table.
  • If there are n parent classes, the subclass has n virtual tables in total.
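
The first point of the summary can be observed directly. In this sketch, P1, P2 and Child are hypothetical reduced versions of Base1, Base2 and Derived.

```cpp
#include <cassert>
#include <string>

// Hypothetical reduced versions of Base1, Base2 and Derived.
struct P1 {
    virtual ~P1() {}
    virtual std::string f() { return "P1::f"; }
    virtual std::string g() { return "P1::g"; }
};

struct P2 {
    virtual ~P2() {}
    virtual std::string f() { return "P2::f"; }
    virtual std::string g() { return "P2::g"; }
};

struct Child : P1, P2 {
    // One definition overrides both P1::f and P2::f.
    std::string f() override { return "Child::f"; }
};

std::string via_p1(P1* p) { return p->f() + " " + p->g(); }
std::string via_p2(P2* p) { return p->f() + " " + p->g(); }
```

For a Child object, via_p1 returns "Child::f P1::g" and via_p2 returns "Child::f P2::g": the slot of f is replaced in the table reached through either parent, while each g stays with its own parent.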

gcc5.4.0

multiple_inhert.cpp.002t.class output:

gdb output:

Different from the VS virtual table implementation, under gcc:

The two vtables appear to be merged, with the second vtable obtained by offset. For convenience, we will refer to them as two virtual tables in the following introduction.

&Derived::h, which overrides the virtual function h of Base2, is placed in the first virtual table. When a Derived object is accessed through a Derived pointer (whose address coincides with that of the Base1 subobject), calling h() does not require adjusting the this pointer; when it is accessed through a Base2 pointer, the this pointer must be adjusted. This is exactly the opposite of the mumble function in the multiple inheritance example in "Inside the C++ Object Model". My personal guess is that the gcc implementation has changed since then.

Note that offset_to_top in the second virtual table is -16: adding it to the address of the second virtual table pointer moves back 16 bytes to the start address of the object, that is, this -= 16.
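
For the curious, offset_to_top can be read at run time. This is a non-portable sketch that ASSUMES the Itanium C++ ABI used by gcc and clang, where the slot two entries before the address in the virtual table pointer holds offset_to_top; it does not apply to VS. TBase1, TBase2, TDerived and read_offset_to_top are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reduced versions of Base1, Base2 and Derived.
struct TBase1 { virtual ~TBase1() {} std::intptr_t a = 1; };
struct TBase2 { virtual ~TBase2() {} std::intptr_t b = 2; };
struct TDerived : TBase1, TBase2 { std::intptr_t c = 3; };

// ASSUMPTION: Itanium C++ ABI (gcc, clang). There the vptr points at the
// first function slot, slot [-1] holds the type_info pointer and slot [-2]
// holds offset_to_top. This is not portable C++ and does not work with VS.
std::ptrdiff_t read_offset_to_top(const void* subobject) {
    const std::ptrdiff_t* vptr;
    std::memcpy(&vptr, subobject, sizeof vptr);  // copy the vptr out of the object
    return vptr[-2];
}
```

Under that assumption, on x64 the function should return 0 for the TBase1 subobject of a TDerived and -16 for its TBase2 subobject.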

Therefore, under gcc, a virtual call through a Base2 pointer adjusts the this pointer unless the virtual function of Base2 is not overridden.

Diamond inheritance

#include <cstdint>
#include <iostream>
using namespace std;

/**
 * @brief 16 bytes
 * vptr_A
 * a
*/
class A {
public:
    A() { cout << "A::A()" << endl; }
    virtual ~A() { cout << "A::~A()" << endl; }
    virtual void f() { cout << "A::f()" << endl; }
    intptr_t a = 1;
};

/**
 * @brief 24 bytes
 * vptr_B
 * a
 * b
*/
class B : public A {
public:
    B() { cout << "B::B()" << endl; }
    virtual ~B() { cout << "B::~B()" << endl; }
    virtual void g() { cout << "B::g()" << endl; }
    intptr_t b = 2;
};

/**
 * @brief 24 bytes
 * vptr_C
 * a
 * c
*/
class C : public A {
public:
    C() { cout << "C::C()" << endl; }
    virtual ~C() { cout << "C::~C()" << endl; }
    virtual void h() { cout << "C::h()" << endl; }
    intptr_t c = 3;
};

/**
 * @brief 56 bytes
 * --------------------
 * vptr_B
 * a
 * b
 * --------------------
 * vptr_C
 * a
 * c
 * --------------------
 * d
*/
class Tom : public B, public C {
public:
    Tom() { cout << "Tom::Tom()" << endl; }
    virtual ~Tom() { cout << "Tom::~Tom()" << endl; }
    void f() override { cout << "Tom::f()" << endl; }
    virtual void k() { cout << "Tom::k()" << endl; }
    intptr_t d = 4;
};

int main() {
    Tom t;
    t.a = 2; // error: Tom::a is ambiguous
    return 0;
}

Problem with diamond inheritance

  1. There are two copies of the common parent class's data, which wastes space. (The common base class is constructed and destructed twice.)
  2. There is ambiguity. The data and functions of the common parent class cannot be accessed directly; they must be qualified with a class name and ::, for example t.B::a.
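
Both problems can be demonstrated with a small sketch. DA, DB, DC and DTom are hypothetical reduced versions of A, B, C and Tom; the size check in the note assumes the x64 layouts described in this article.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced versions of A, B, C and Tom (non-virtual diamond).
struct DA { virtual ~DA() {} std::intptr_t a = 1; };
struct DB : DA { std::intptr_t b = 2; };
struct DC : DA { std::intptr_t c = 3; };
struct DTom : DB, DC { std::intptr_t d = 4; };

// Writes to the two copies of a through qualified names and reports
// whether they are really separate objects.
bool has_two_copies(DTom& t) {
    t.DB::a = 10;  // the copy inherited through DB
    t.DC::a = 20;  // the copy inherited through DC
    // t.a = 30;   // would not compile: the member a is ambiguous
    return &t.DB::a != &t.DC::a && t.DB::a == 10 && t.DC::a == 20;
}
```

has_two_copies returns true, and on x64 sizeof(DTom) is expected to be 56, that is sizeof(DB) + sizeof(DC) + 8, because the DA part is stored twice.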

Solution: Use virtual inheritance

VS2017

/d1 reportSingleClassLayoutTom

Object layout:

Virtual table layout:

From the object layout above, you can see that there are two copies of the data member a of A. Because of the ambiguity, we cannot access a directly through t.a.

gcc5.4.0

diamond_inhert.cpp.002t.class output:

gdb output: 

For diamond inheritance, the gcc layout follows the same logic as VS.

Diamond inheritance + virtual inheritance

#include <cstdint>
#include <iostream>
using namespace std;

/**
 * @brief 16 bytes
*/
class VA {
public:
    VA() { cout << "VA::VA()" << endl; }
    virtual ~VA() { cout << "VA::~VA()" << endl; }
    virtual void f() { cout << "VA::f()" << endl; }
    intptr_t a = 1;
};

/**
 * @brief VS2017: 40 bytes, gcc5.4.0: 32 bytes
*/
class VB : virtual public VA {
public:
    VB() { cout << "VB::VB()" << endl; }
    virtual ~VB() { cout << "VB::~VB()" << endl; }
    //void f() override { cout << "VB::f()" << endl; }
    virtual void g() { cout << "VB::g()" << endl; }
    intptr_t b = 2;
};

/**
 * @brief VS2017: 40 bytes, gcc5.4.0: 32 bytes
*/
class VC : virtual public VA {
public:
    VC() { cout << "VC::VC()" << endl; }
    virtual ~VC() { cout << "VC::~VC()" << endl; }
    virtual void h() { cout << "VC::h()" << endl; }
    intptr_t c = 3;
};

/**
 * @brief VS2017: 80 bytes, gcc5.4.0: 56 bytes
*/
class VTom : public VB, public VC {
public:
    VTom() { cout << "VTom::VTom()" << endl; }
    virtual ~VTom() { cout << "VTom::~VTom()" << endl; }
    void f() override { cout << "VTom::f()" << endl; }
    virtual void k() { cout << "VTom::k()" << endl; }
    intptr_t d = 4;
};

int main() {
    VTom v;
    cout << sizeof(VA) << endl;
    cout << sizeof(VB) << endl;
    cout << sizeof(VC) << endl;
    cout << sizeof(VTom) << endl;
    return 0;
}

Virtual inheritance solves the problems of diamond inheritance. Once the diamond is changed to use virtual inheritance, there is only one copy of VA.
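
A small sketch that checks this claim. WA, WB, WC and WTom are hypothetical reduced versions of VA, VB, VC and VTom.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced versions of VA, VB, VC and VTom.
struct WA { virtual ~WA() {} std::intptr_t a = 1; };
struct WB : virtual WA { std::intptr_t b = 2; };
struct WC : virtual WA { std::intptr_t c = 3; };
struct WTom : WB, WC { std::intptr_t d = 4; };

// Both inheritance paths lead to the same WA subobject.
bool shares_one_base(WTom& t) {
    WA* through_b = static_cast<WB*>(&t);
    WA* through_c = static_cast<WC*>(&t);
    t.a = 42;  // unambiguous: there is only one a
    return through_b == through_c && through_b->a == 42 && t.WB::a == t.WC::a;
}
```

shares_one_base returns true: the unqualified t.a compiles, and the two conversion paths yield the same address.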

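"Inside the C++ Object Model" recommends not declaring non-static data members in virtual base classes.

The common base class VA is a virtual base class.

The constructor of the virtual base class is called only once, by the constructor of the most derived class (VTom here), before the direct base classes VB and VC are constructed; the constructors of VB and VC do not call it again.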
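
Which constructor initializes the virtual base can be checked by giving the base a constructor argument. In this sketch, CA, CB, CC and CTom are hypothetical classes, and the integer only records who performed the initialization.

```cpp
#include <cassert>

// Hypothetical classes: the int records which constructor initialized CA.
struct CA {
    static int constructions;
    int initialized_by;
    explicit CA(int who) : initialized_by(who) { ++constructions; }
    virtual ~CA() {}
};

int CA::constructions = 0;

struct CB : virtual CA { CB() : CA(1) {} };
struct CC : virtual CA { CC() : CA(2) {} };

struct CTom : CB, CC {
    // The most derived class initializes the virtual base itself; the
    // CA(1) and CA(2) initializers in CB and CC are ignored for a CTom.
    CTom() : CA(3) {}
};
```

Constructing one CTom runs the CA constructor exactly once and leaves initialized_by == 3. A standalone CB object, where CB is itself the most derived class, gets initialized_by == 1.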

VS2017

/d1 reportSingleClassLayoutVTom

Object layout:

The two direct base classes have vbptr, which is a virtual base class pointer. vbptr points to the virtual base class table vbtable.

Next come the members of the subclass, and then the 4-byte vtordisp, preceded by 4 bytes of padding to maintain 8-byte alignment.

Finally, there are the members of the virtual base class.

Virtual table layout:

You can see that the subclass's new virtual function pointer &VTom::k is placed in the virtual table of the first direct base class VB.

Let’s look at the virtual base class table:

  1. The first item in the virtual base class table is the offset of the start address of the containing class relative to vbptr, which is -8 here.
  2. The second item is the offset of the virtual base class's virtual table pointer relative to vbptr: 64-8=56 for the vbptr of VB and 64-32=32 for the vbptr of VC.

gcc5.4.0

virtual_inhert.cpp.002t.class output:

Vtable for VTom
VTom::_ZTV4VTom: 21u entries
0     40u
8     (int (*)(...))0
16    (int (*)(...))(& _ZTI4VTom)
24    (int (*)(...))VTom::~VTom
32    (int (*)(...))VTom::~VTom
40    (int (*)(...))VB::g
48    (int (*)(...))VTom::f
56    (int (*)(...))VTom::k
64    24u
72    (int (*)(...))-16
80    (int (*)(...))(& _ZTI4VTom)
88    (int (*)(...))VTom::_ZThn16_N4VTomD1Ev
96    (int (*)(...))VTom::_ZThn16_N4VTomD0Ev
104   (int (*)(...))VC::h
112   18446744073709551576u
120   18446744073709551576u
128   (int (*)(...))-40
136   (int (*)(...))(& _ZTI4VTom)
144   (int (*)(...))VTom::_ZTv0_n24_N4VTomD1Ev
152   (int (*)(...))VTom::_ZTv0_n24_N4VTomD0Ev
160   (int (*)(...))VTom::_ZTv0_n32_N4VTom1fEv

Construction vtable for VB (0x0x7f648bdb1888 instance) in VTom
VTom::_ZTC4VTom0_2VB: 13u entries
0     40u
8     (int (*)(...))0
16    (int (*)(...))(& _ZTI2VB)
24    0u
32    0u
40    (int (*)(...))VB::g
48    0u
56    18446744073709551576u
64    (int (*)(...))-40
72    (int (*)(...))(& _ZTI2VB)
80    0u
88    0u
96    (int (*)(...))VA::f

Construction vtable for VC (0x0x7f648bdb1820 instance) in VTom
VTom::_ZTC4VTom16_2VC: 13u entries
0     24u
8     (int (*)(...))0
16    (int (*)(...))(& _ZTI2VC)
24    0u
32    0u
40    (int (*)(...))VC::h
48    0u
56    18446744073709551592u
64    (int (*)(...))-24
72    (int (*)(...))(& _ZTI2VC)
80    0u
88    0u
96    (int (*)(...))VA::f

VTT for VTom
VTom::_ZTT4VTom: 7u entries
0     ((& VTom::_ZTV4VTom) + 24u)
8     ((& VTom::_ZTC4VTom0_2VB) + 24u)
16    ((& VTom::_ZTC4VTom0_2VB) + 80u)
24    ((& VTom::_ZTC4VTom16_2VC) + 24u)
32    ((& VTom::_ZTC4VTom16_2VC) + 80u)
40    ((& VTom::_ZTV4VTom) + 144u)
48    ((& VTom::_ZTV4VTom) + 88u)

Class VTom
   size=56 align=8
   base size=40 base align=8
VTom (0x0x7f648be1dd90) 0
    vptridx=0u vptr=((& VTom::_ZTV4VTom) + 24u)
  VB (0x0x7f648bdb1888) 0
      primary-for VTom (0x0x7f648be1dd90)
      subvttidx=8u
    VA (0x0x7f648bd785a0) 40 virtual
        vptridx=40u vbaseoffset=-24 vptr=((& VTom::_ZTV4VTom) + 144u)
  VC (0x0x7f648bdb1820) 16
      subvttidx=24u vptridx=48u vptr=((& VTom::_ZTV4VTom) + 88u)
    VA (0x0x7f648bd785a0) alternative-path

gdb output:

Although gdb outputs the members of VA at the front, the actual object layout is like this:

Overall memory layout:

The virtual base offsets are the offsets of the virtual base class's virtual table pointer VA::vfptr relative to the start address of each direct base class: 40 for VB and 24 for VC.
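
The same offsets can be measured at run time by comparing subobject addresses. OA, OB, OC, OTom, Offsets and measure are hypothetical names; the concrete numbers depend on the compiler and ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reduced versions of VA, VB, VC and VTom.
struct OA { virtual ~OA() {} std::intptr_t a = 1; };
struct OB : virtual OA { virtual void g() {} std::intptr_t b = 2; };
struct OC : virtual OA { virtual void h() {} std::intptr_t c = 3; };
struct OTom : OB, OC { std::intptr_t d = 4; };

struct Offsets {
    std::ptrdiff_t ob_to_oa;  // from the OB subobject to the OA subobject
    std::ptrdiff_t oc_to_oa;  // from the OC subobject to the OA subobject
    std::ptrdiff_t ob_to_oc;  // from the OB subobject to the OC subobject
};

// Converts to each subobject and measures the distances in bytes.
Offsets measure(OTom& t) {
    const char* ob = reinterpret_cast<const char*>(static_cast<OB*>(&t));
    const char* oc = reinterpret_cast<const char*>(static_cast<OC*>(&t));
    const char* oa = reinterpret_cast<const char*>(static_cast<OA*>(&t));
    return Offsets{oa - ob, oa - oc, oc - ob};
}
```

With the gcc layout dumped above, the first two values correspond to 40 and 24. Whatever the ABI, the two offsets differ by exactly the distance between the two direct base subobjects, since both lead to the same OA.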

The overall virtual table lays out the part for VB first, then the part for VC, and finally the part for the virtual base class VA.

The virtual base class part uses thunks, that is, small stubs that adjust the this pointer before jumping to the actual virtual function.

After the destructor entries, the first part of the virtual table holds &VB::g, the first virtual function pointer of the direct base class VB, followed by &VTom::f, which overrides the virtual base class's function, and the subclass's newly added &VTom::k.

On the whole, the virtual table is a continuous array, but logically we can think that there are 3 virtual tables here.

There is no vbptr under gcc.



Origin blog.csdn.net/ET_Endeavoring/article/details/124830482