The principles of polymorphism in C++

Preface

The previous article introduced the concept of polymorphism. This article explains in detail how it works under the hood.

Here is a question that often appears on written tests: what is sizeof(Base)?

Why is the answer not 8?
Inspect the object in the debugger and look carefully: there is an extra pointer at the head of the object.
This pointer is called the virtual function table pointer.
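The effect of the hidden pointer can be seen with a small sketch. The two classes below are hypothetical (the Base class from the original question is not reproduced here); they hold the same data and differ only in whether Func1 is virtual.

```cpp
#include <cstddef>

// Hypothetical classes: identical data, but only one has a virtual function.
struct PlainBase {
    void Func1() {}
    int _b = 1;
    char _ch = 'a';
};

struct VirtualBase {
    virtual void Func1() {}
    int _b = 1;
    char _ch = 'a';
};

// Extra bytes taken by the hidden virtual table pointer (plus any padding).
constexpr std::size_t kVptrOverhead = sizeof(VirtualBase) - sizeof(PlainBase);

static_assert(sizeof(VirtualBase) > sizeof(PlainBase),
              "a class with a virtual function carries a hidden pointer");
```

The exact sizes depend on the platform's pointer size and alignment rules, so the sketch only checks that the virtual version is larger.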

That was only the warm-up. What matters is what follows, the principle of polymorphism:
what exactly is in the table that this pointer points to?

The principle of polymorphism

Look at the code below. There are two objects, mike and johnson, and both contain a virtual table pointer.

#include <iostream>
using namespace std;

class Person {
public:
	virtual void BuyTicket() { cout << "Buy ticket - full price" << endl; }
};

class Student : public Person {
public:
	virtual void BuyTicket() { cout << "Buy ticket - half price" << endl; }
};

void Func(Person& p)
{
	p.BuyTicket();
}

int main()
{
	Person mike;
	Func(mike);
	Student johnson;
	Func(johnson);
	return 0;
}

We have talked before about what constitutes polymorphism:
which function runs depends on the object that the pointer or reference points to, not on the type of the pointer or reference.

Why? How is it achieved?
If the pointer points to a parent class object, the parent's virtual function is called; if it points to a subclass object, the subclass's virtual function is called.
How?

The virtual table of a parent class object stores the addresses of the parent's virtual functions, and the virtual table of a subclass object stores the addresses of the subclass's virtual functions.
How does the compiler do it?
The compiler first checks whether the call satisfies the conditions for polymorphism. If it does not, the call address is determined at compile time.

How is it determined?
The compiler looks at the static type of p, which is Person, finds the function in Person, and uses that function's address.

If the call is polymorphic,
the address is looked up at run time in the virtual table of the object being pointed to.
The compiler's job is simple: it strictly checks whether the conditions for polymorphism are met.
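The two outcomes can be compared side by side. This is a minimal sketch with hypothetical classes (Animal and Dog are not from the article): one member function is non-virtual and one is virtual, and both are called through a reference to the parent type.

```cpp
#include <string>

// Hypothetical classes for illustration only.
struct Animal {
    std::string Name() const { return "Animal"; }           // non-virtual
    virtual std::string Sound() const { return "..."; }     // virtual
    virtual ~Animal() = default;
};

struct Dog : Animal {
    std::string Name() const { return "Dog"; }              // hides Animal::Name
    std::string Sound() const override { return "Woof"; }   // overrides
};

// Non-virtual call: bound at compile time from the static type (Animal).
std::string StaticCall(const Animal& a) { return a.Name(); }

// Virtual call through a reference: resolved at run time via the virtual table.
std::string DynamicCall(const Animal& a) { return a.Sound(); }
```

Passing a Dog to StaticCall still yields "Animal", because the conditions for polymorphism are not met, while DynamicCall yields "Woof".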

You can confirm this by stepping through the disassembly in the debugger.

A situation that constitutes polymorphism:
when p.BuyTicket(); is compiled, the instruction does not know which function it will end up calling. Why?
Because there are two possibilities for what p refers to: a Person object, or the Person part of a Student object.

The essence of the generated assembly is that the call has nothing to do with the type of the pointer or reference.
It looks at the object being pointed to: pointing to a parent object calls the parent's function, and pointing to a subclass object calls the subclass's function.

Polymorphism comes down to how the call is translated into assembly.
A call that is not polymorphic has its address fixed directly. A polymorphic call is translated into a sequence of instructions that look the address up.
What are these instructions for? The address cannot be fixed at compile time, because we do not know which object is making the call. Suppose the reference refers to a parent object.
The code reads the first four bytes of the object (eight in a 64-bit build) to get the virtual table pointer, follows it to the virtual table, and fetches the virtual function's address from the table. The call goes through that address.

If the reference refers to a subclass object, it is bound to the parent part sliced out of that object, and the same lookup finds the subclass's virtual table.

Looking only at the instruction for p.BuyTicket();, there is no way to tell whether p refers to a subclass object or a parent object.
The assembly instructions are the same, so why are the results different?
Because different objects are passed in, and different objects have different virtual tables.
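To make the lookup concrete, here is a hand-written imitation of the mechanism. It is only a sketch: every name in it (FakePerson, FakeVTable, and so on) is invented for illustration, and real compilers choose their own layouts.

```cpp
#include <string>

// A hand-written imitation of what the compiler generates (illustrative only;
// real layouts are implementation-defined).
struct FakePerson;
using BuyTicketFn = std::string (*)(const FakePerson*);

struct FakeVTable {
    BuyTicketFn buyTicket;   // slot 0
};

struct FakePerson {
    const FakeVTable* vptr;  // hidden pointer at the head of the object
};

std::string PersonBuyTicket(const FakePerson*) { return "full price"; }
std::string StudentBuyTicket(const FakePerson*) { return "half price"; }

// One table per class, shared by every object of that class.
const FakeVTable kPersonTable{&PersonBuyTicket};
const FakeVTable kStudentTable{&StudentBuyTicket};

struct FakeStudent {
    FakePerson base{&kStudentTable};  // parent part, with the subclass's table
};

// The "same instruction" for every caller: load vptr, load slot, call.
std::string Func(const FakePerson& p) { return p.vptr->buyTicket(&p); }
```

Func contains exactly one code path, yet a FakePerson built with kPersonTable and the base part of a FakeStudent produce different results, because the table each object points to is different.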

Why is overriding a virtual function also called overwriting?
Because once a subclass overrides a virtual function, the corresponding slot in the subclass's virtual table, which starts out as a copy of the parent's,
is overwritten with the address of the subclass's own function.

You can think of it this way: overriding is a concept at the syntax level, and overwriting is a concept at the implementation level.

Conditions required for polymorphism

Now you can work backwards to the conditions for polymorphism.
1. Why must the virtual function be overridden?
Because the function's slot in the virtual table needs to be overwritten.

2. Why must the call go through a pointer or reference?
Because a pointer or reference can point to either a parent class object or a subclass object.

Why not store the virtual function addresses directly at the head of the object?
Because a class may have many virtual functions, and storing them all in every object would be wasteful.
Also, all objects of the same type share the same virtual table, so one copy is enough.

Virtual function table: essentially an array of virtual function pointers.

This becomes clearer when a class has several virtual functions.

Let's see what overwriting looks like.
Suppose only the first virtual function has been overridden. You can think of the subclass as first copying the parent's table
and then overwriting the overridden slot with its own function. A slot that is not overridden is not overwritten.

The virtual function table is fixed at compile time. Without an override a slot holds one address, and
with an override it holds another.

The table may contain several addresses, so which one does a given call use?
That is decided by the declaration order of the virtual functions: each function gets a fixed slot.

3. Can polymorphism be achieved through a parent class object?
A pointer or reference to the parent class can be bound to a subclass object by slicing, and a parent class object can also be initialized from a subclass object by slicing.
So why can't an object achieve polymorphism? What is the reason in principle?
The call is translated into instructions at compile time. If p is an object of type Person, the compiler simply calls Person's function directly.

An object can be sliced too, so why is it not polymorphic?
The difference between pointers or references and objects is that their slicing works a little differently.

What happens when a pointer or reference is sliced?
If it points to or refers to a parent class object, nothing special happens.
If it points to a subclass object, the parent part of the subclass object is cut out, and the pointer or reference designates that part.
The virtual table pointer in that part still points to the subclass's virtual table.

What happens with an object?
If it is initialized from a parent object, there is no problem. What if it is initialized from a subclass object?
When a subclass object is sliced into a parent object, the members are copied and the copy constructor is called.
This raises a question: is the virtual table pointer copied as well?
It is not, so the virtual table of a parent class object always contains the parent's virtual functions.
The compiler does not dare copy it, because copying would cause a serious problem.
Things would get messy: if the virtual table pointer were copied, it would be
impossible to know whether a parent class object's virtual table holds the subclass's virtual functions or the parent's.

Therefore, when an object is sliced, only the members are copied and the virtual table pointer is not.
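The difference between the two kinds of slicing can be observed directly. This sketch reuses the Person and Student names from the earlier example but returns strings instead of printing; the helper functions ByReference and ByValue are invented for illustration.

```cpp
#include <string>

struct Person {
    virtual std::string BuyTicket() const { return "full price"; }
    virtual ~Person() = default;
};

struct Student : Person {
    std::string BuyTicket() const override { return "half price"; }
};

// By reference: p designates the Person part of the original object,
// whose virtual table pointer still refers to the real type's table.
std::string ByReference(const Person& p) { return p.BuyTicket(); }

// By value: p is a brand-new Person copy-constructed from the argument,
// so its virtual table pointer is always Person's.
std::string ByValue(Person p) { return p.BuyTicket(); }
```

Passing the same Student to both functions gives "half price" by reference and "full price" by value.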

See for yourself in the debugger: the virtual table pointer has not changed.

Only the address of the virtual function will be stored in the virtual table.

Another question:
is it correct to say that virtual functions are stored in the virtual table?
No. Virtual functions are placed in the code segment just like ordinary functions.
The only difference is that the address of each virtual function is also put into the virtual function table.

This involves knowledge of the Linux operating system, in particular how a process's address space is laid out. You can read more about it.

You can check whether objects of the same type share the same virtual table. They do.
The virtual tables of the parent class and the subclass are different, because the subclass needs its own independent table once it overrides a function.

What the debugger's watch window shows has been processed, so it may not reflect exactly what is in memory.
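Since the watch window can be misleading, one way to check the sharing claim is to read the pointer out of the object itself. The sketch below assumes that the virtual table pointer is stored at the very start of the object, which holds on MSVC and g++/clang but is not promised by the C++ standard. VptrOf is a helper invented for this illustration.

```cpp
#include <cstring>

struct Base {
    virtual void Func1() {}
    virtual void Func2() {}
    int _b = 1;
};

struct Derive : Base {
    void Func1() override {}
    int _d = 2;
};

// Reads the first pointer-sized bytes of an object. ASSUMPTION: the virtual
// table pointer sits at the head of the object, as on MSVC and g++/clang.
template <class T>
const void* VptrOf(const T& obj) {
    const void* vptr = nullptr;
    std::memcpy(&vptr, static_cast<const void*>(&obj), sizeof(vptr));
    return vptr;
}
```

Two Base objects should report the same value, and a Derive object should report a different one.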

#include <iostream>
using namespace std;

class Base
{
public:
	virtual void Func1()
	{
		cout << "Base::Func1()" << endl;
	}

	virtual void Func2()
	{
		cout << "Base::Func2()" << endl;
	}

	// Not virtual, so its address never enters the virtual table.
	void Func3()
	{
		cout << "Base::Func3()" << endl;
	}

private:
	int _b = 1;
};

class Derive : public Base
{
public:
	// Overrides Base::Func1; Func2 is inherited unchanged.
	virtual void Func1()
	{
		cout << "Derive::Func1()" << endl;
	}
private:
	int _d = 2;
};

int main()
{
	Base b1;
	Base b2;
	Base b3;

	Derive d;

	b1.Func1();
	b1.Func3();
	return 0;
}

Virtual function table

Remember: it is not the virtual function itself that goes into the virtual table, but the address of the virtual function.
"Virtual table" is short for virtual function table.
In essence, a virtual table is an array of function pointers.

Base class virtual table
The base class's virtual table holds the addresses of its two virtual functions, Func1 and Func2.

The virtual table of the derived class
The derived class's virtual table also holds the addresses of two virtual functions.
The difference is that you can think of the subclass's table as a copy of the parent's table.
What happens after the copy?
Wherever a virtual function is overridden, its slot is overwritten with the address of the overriding function.

In essence, polymorphism is implemented through virtual tables.
A pointer or reference to the parent class can point to a parent class object or to a
subclass object. Pointing to a subclass object means designating the parent part cut out of that object.
You can think of it this way: as far as the pointer is concerned, everything it sees is a parent class object.
One is a parent class object in its own right, and the other is the parent part cut out of a subclass object.

For ptr->Func1();, the generated assembly is the same in both cases.
Whatever the object is, the code goes to its virtual table to find the address of the virtual function.
So pointing to a parent object calls the parent's function, and pointing to a subclass object calls the subclass's function.

Suppose a virtual function Func4() is added to the derived class.

Now Func1() overrides the base version, while Func4() is a new virtual function that overrides nothing.
Is Func4() in the virtual table?
In the watch window it does not appear. Where did Func4() go?
Func4() is a virtual function, so why is it not in the virtual table?

Let's look at the memory window.
Func1() and Func2() are both there, so is the next entry Func4()?
How can we verify it?
Could we print the address of Func4() and compare? That works, but more complicated situations come later.
This is single inheritance. What about multiple inheritance and diamond inheritance?

Next, we will use a new technique: printing the virtual table from a program.

Printing the virtual table from a program

How do we print it?
Suppose we already have the address of the virtual table, that is, the address of an array of function pointers.
How do we print it?

The elements are function pointers, which are awkward to work with,
so the first step is to typedef the function pointer type.
Function pointer syntax is unusual.
A typedef normally puts the existing type first and the new name after it, but that does not work for a function pointer.
When declaring a function pointer variable or typedef, the name goes in the middle, inside the parentheses.
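As a sketch of the syntax described above: the name VF_PTR matches the one used later in this article, while the signature void() is an assumption that fits the virtual functions shown so far. Hello, g_table, g_calls, and CallAll are invented for the example.

```cpp
#include <type_traits>

// The name goes in the middle: VF_PTR is "pointer to function taking no
// arguments and returning void". The signature is an assumption that matches
// the virtual functions used in this article.
typedef void (*VF_PTR)();

// Equivalent modern spelling, where the name does come first.
using VF_PTR_ALIAS = void (*)();
static_assert(std::is_same<VF_PTR, VF_PTR_ALIAS>::value, "same type");

int g_calls = 0;                 // illustrative counter
void Hello() { ++g_calls; }

// A null-terminated array of function pointers, shaped like a virtual table.
VF_PTR g_table[] = {&Hello, &Hello, nullptr};

// Calls every entry until the terminating nullptr; returns how many it called.
int CallAll(VF_PTR* table) {
    int n = 0;
    for (; table[n] != nullptr; ++n) {
        table[n]();
    }
    return n;
}
```

With the typedef in place, an array of function pointers is declared and traversed like any other array.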

Printing an array is simple, but we do not know how big this array is, because
different classes have virtual tables of different sizes. Under g++ the count has to be hard-coded: if you know there are three entries,
you print three. The VS family, however, gives us a way to traverse the table.

When VS stores a virtual table, it places a nullptr at the end of the array, but
g++ does not.

If you do not see the nullptr under VS, clean the solution and then rebuild it.

Continuing on, we now need to get the address of the virtual table.
How do we get it?
The virtual table pointer is in the first 4 bytes (32-bit) or the first 8 bytes (64-bit) of the object.
How do we read the first 4 bytes of the object?

Think back to checking for big-endian or little-endian, where you wanted to read the low-order byte.
Given an integer, there are two ways to get its first byte:
1. Define a union (this requires defining an extra type).
2. Cast the address of the int to char* and then dereference it.

Here we use the second method: cast the object's address to int* and dereference it.
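For reference, the second method applied to a plain integer looks roughly like this. FirstByte and IsLittleEndian are names invented for the sketch, and the probe value is arbitrary.

```cpp
#include <cstdint>

// Method 2 from the list above: take the address of the integer,
// reinterpret it as a pointer to bytes, and dereference.
// Reading an object through unsigned char* is always allowed.
unsigned char FirstByte(const std::uint32_t& value) {
    return *reinterpret_cast<const unsigned char*>(&value);
}

// Little-endian machines store the low-order byte first.
bool IsLittleEndian() {
    const std::uint32_t probe = 0x11223344u;
    return FirstByte(probe) == 0x44;
}
```

The same idea, casting the address to a different pointer type so that the dereference reads a chosen number of bytes, is what gets the virtual table pointer out of an object.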
But dereferencing an int* yields an int, which cannot be passed as the function's parameter, so the int is cast to the corresponding pointer type.

Can the int not simply be converted automatically when it is passed as an argument?
No. That would be an implicit conversion, and C++ only supports implicit conversion between closely related types,
such as int, double, and char.

A pointer is an address, but the pointer's type determines how many bytes are read when it is dereferenced.

Note that sizeof cannot be used to get the size of this array. It stops working as soon as an array is passed as a parameter.
Besides, this is not an ordinary array. Only for a static array that we define ourselves can sizeof compute the element count,
and nowhere else.
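The sizeof limitation is easy to demonstrate with an ordinary array. The names g_numbers, kCount, and SizeSeenByCallee below are made up for the sketch.

```cpp
#include <cstddef>

int g_numbers[5] = {1, 2, 3, 4, 5};

// In the scope where the array is defined, sizeof sees the whole array.
constexpr std::size_t kCount = sizeof(g_numbers) / sizeof(g_numbers[0]);

// Once passed to a function, the array has decayed to a pointer,
// so sizeof only reports the size of the pointer itself.
std::size_t SizeSeenByCallee(int* arr) {
    return sizeof(arr);
}
```

A virtual table reached through a pointer is in the second situation, which is why its length has to come from a terminator or a hard-coded count.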

There is a more direct way to write the cast.
This version is already simplified by the typedef. Written out with the raw function pointer type, it would be unreadable.

Let me help you understand it.

Why not just cast &b to VF_PTR* directly?
Let me start with the conclusion: this won't work.
First, where is the address we want to pass? In the first 4 or 8 bytes of the object.
A dereference is required to read those first 4 or 8 bytes out.
&b is a pointer to the object. Call the object's address position 1 and the virtual table's address position 2. Which one do we need to pass? Position 2.
Casting &b directly passes position 1. The pointer to position 2 is stored in the first 4 bytes of the object, so how do we get it out?

Cast &b to VF_PTR** and dereference it. The dereference reads 4 bytes in a 32-bit build and 8 bytes in a 64-bit build.

What is the difference between these two ways of writing it?
The first, going through int*, has a limitation: it only works in a 32-bit build and
not in a 64-bit build, because an int is always read as 4 bytes.
The second works in both. Dereferencing a VF_PTR** yields a VF_PTR*, and a VF_PTR* is
4 bytes in 32-bit and 8 bytes in 64-bit.
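Putting the pieces together, a table printer might look like the sketch below. It is not the article's exact code: the classes are simplified, TableOf and PrintVFT are names chosen here, and the pointer-sized read is done with memcpy (equivalent in effect to dereferencing a VF_PTR**) to stay clear of aliasing rules. It assumes the virtual table pointer is at the head of the object and that the table is a plain array of function addresses, which is how MSVC and g++/clang behave but is not guaranteed by the standard.

```cpp
#include <cstdio>
#include <cstring>

struct Base {
    virtual int Func1() { return 1; }
    virtual int Func2() { return 2; }
    int _b = 1;
};

struct Derive : Base {
    int Func1() override { return 11; }
    int _d = 2;
};

// Treated purely as an opaque address here; the entries are never called.
typedef void (*VF_PTR)();

// ASSUMPTION: the virtual table pointer is the first thing in the object.
// Copies exactly sizeof(VF_PTR*) bytes, so it works in 32-bit and 64-bit.
template <class T>
VF_PTR* TableOf(T& obj) {
    VF_PTR* table = nullptr;
    std::memcpy(&table, static_cast<void*>(&obj), sizeof(table));
    return table;
}

// Prints the first `count` slots. The count is hard-coded by the caller
// because g++ does not terminate the table with nullptr.
void PrintVFT(VF_PTR* table, int count) {
    for (int i = 0; i < count; ++i) {
        std::printf("[%d]: %p\n", i, reinterpret_cast<void*>(table[i]));
    }
}
```

Printing two slots for a Base and for a Derive should show a different address in slot 0 (the overridden Func1) and the same address in slot 1 (the inherited Func2).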

Now we can print the addresses of the virtual functions in the virtual table, but how do we confirm which function each address belongs to?
Here is a trick: call the function through each address, and have every virtual function print its own name.

A question: the parent class has no Func4(), so how can it be in this virtual table?
Because this virtual table does not belong to the parent class. It is reached through the parent part of the subclass object, but
Func4() is a subclass function and, strictly speaking, the table belongs to the subclass.

The parent's virtual table and the subclass's virtual table are not the same table. The subclass starts from a copy of the parent's table,
overwrites the slots it overrides, and adds its own new virtual functions to the same table.

At what stage is the virtual table generated?
At compile time. The addresses of these functions are known at compile time, so the virtual tables of the parent class and the subclass can both be built then.

When is the virtual table pointer in the object initialized?
In the constructor's initialization list. You can confirm this by stepping through in the debugger.

Where is the virtual table stored?
First of all, the virtual table is not in the object. What the object contains is the virtual table pointer.
Could the table be on the stack?
Absolutely not. Multiple objects point to the same virtual table, and the stack only holds stack frames, which are destroyed when the function returns.
Could it be on the heap?
In theory, but it would be unreasonable. The heap is for dynamic allocation, so in practice no.

Next we can check whether it is in the static area or the constant area.
Just print a few addresses and compare them.
Comparing the distances between the addresses, the virtual table's address is closest to the constant area.

Think about it: will the virtual table change after compilation?
It may be built up during compilation, especially the subclass's table,
but it never changes at run time, so the constant area is the natural place for it.

The same comparison shows where functions live.
A compiled function is a sequence of instructions, and the address of that sequence is the address of the function. The function is placed in the
code segment (the constant area).
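Part of this reasoning can be checked without relying on how any particular platform arranges its segments. The sketch below, which again assumes the virtual table pointer sits at the head of the object, confirms that an object on the stack and an object on the heap point to one and the same table, and that the table lives inside neither object. The function names are invented for the example.

```cpp
#include <cstring>
#include <memory>

struct Base {
    virtual void Func1() {}
    virtual ~Base() = default;
    int _b = 1;
};

// ASSUMPTION: the virtual table pointer is stored at the head of the object.
const void* VptrOf(const Base& obj) {
    const void* vptr = nullptr;
    std::memcpy(&vptr, static_cast<const void*>(&obj), sizeof(vptr));
    return vptr;
}

// True if an object on the stack and an object on the heap share one table,
// and that table's address is neither of the two objects' addresses.
bool TableIsSharedAndOutsideObjects() {
    Base onStack;
    std::unique_ptr<Base> onHeap(new Base);
    const void* table = VptrOf(onStack);
    return table == VptrOf(*onHeap) &&
           table != static_cast<const void*>(&onStack) &&
           table != static_cast<const void*>(onHeap.get());
}
```

To go further and compare against the static and constant areas, print the returned address next to the addresses of a static variable and a string literal, as the article does.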

Multiple inheritance virtual function table

Base1 and Base2 are straightforward. The key is Derive, which inherits from both.
Look at the watch window first.

Derive should have two virtual tables, because it inherits
one from Base1 and one from Base2. It overrides func1(), while func2() remains unchanged.

Now a question: where does the subclass's own func3() go?
Let's find out by printing the virtual tables.
Another question: the first virtual table pointer is at the start of the object, but where is the second one when we want to print its table?

The two virtual table pointers live in two separate base parts of the object, and they are not adjacent,
because Base1 has other member variables besides its virtual table pointer.

There are two ways to reach the second one:
1. Skip over Base1 by adding sizeof(Base1) to the object's address.
2. Use slicing and let the pointer offset happen automatically. (Assigning &d to a Base2* adjusts the pointer.)
For the first way, writing &d + sizeof(Base1) is wrong: &d is a Derive*, and Derive* + 1 skips a whole Derive. The address must first be cast to char*, because char* + 1 skips exactly one byte.

The result: func3() was placed in the first table.
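The pointer arithmetic rule behind the first way can be checked in isolation. Widget and StrideOf are hypothetical names used only for this sketch.

```cpp
#include <cstddef>

struct Widget {          // hypothetical type, large enough to make the point
    int values[4];
};

// Byte distance moved by "p + 1" for any object pointer type.
template <class T>
std::ptrdiff_t StrideOf(T* p) {
    return reinterpret_cast<char*>(p + 1) - reinterpret_cast<char*>(p);
}
```

For a Widget* the stride is sizeof(Widget), while for the same address viewed as char* it is 1, which is why the cast to char* has to come before adding sizeof(Base1).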

Understanding pointer offsets.
Consider the following question first: take pointers of different types to the same Derive object and ask which of them hold the same address.
The question is easy once you understand slicing.
Although p1 and p3 hold the same address, they mean different things: one sees only the first base part, and the other sees the whole object.

The base that is inherited first is declared first, and the one declared first is placed at the front of the object.
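A sketch of that kind of question, with simplified classes and an invented helper named Observe. It relies on the usual layout in which the first base sits at offset zero, which mainstream compilers follow for classes like these.

```cpp
struct Base1 { virtual void func1() {} int b1 = 1; };
struct Base2 { virtual void func1() {} int b2 = 2; };
struct Derive : Base1, Base2 {   // Base1 is inherited first
    void func1() override {}
    int d1 = 3;
};

struct Addresses {
    const void* p1;  // as Base1*
    const void* p2;  // as Base2*
    const void* p3;  // as Derive*
};

// Converts one object's address to each pointer type and records the result.
Addresses Observe(Derive& d) {
    Base1* p1 = &d;
    Base2* p2 = &d;   // the compiler adds the offset of the Base2 part here
    Derive* p3 = &d;
    return Addresses{p1, p2, p3};
}
```

On such compilers p1 and p3 compare equal as raw addresses, while p2 points further into the object, past the Base1 part.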

func1 is overridden, and the override lands in two places: the slot in Base1's virtual table and the slot in Base2's virtual table are both overwritten.
But there is a very strange phenomenon here, and it is the real difficulty. Nine out of ten people learning C++ trip over it.

Have you noticed that the two addresses recorded for the overridden func1 are different?
First, is this function really func1, the overridden func1?
Yes, because the string printed next to each address comes from calling the function at that address.

So why are the addresses different?
This goes deep and is hard to understand. The only way to see it is to read the assembly.

Both calls, one through a Base1 pointer and one through a Base2 pointer, are translated into assembly, and the two pieces of assembly are different. Do they end up calling the same function?
Is the address they call the same?
Many people assume it is, because the conditions for polymorphism are met here.
You can think of the second address as a wrapper. The call cannot be completed without the wrapper, and this is where our usual understanding falls a little short.
There are deep reasons for this.

The addresses change from run to run, which makes comparison harder. This has to do with how the process is loaded: the addresses are relocated.

ptr1 makes a normal call.
With ptr2, you saw several jmp instructions in a row. Why?
The jmp chain acts as the wrapper.
Among those instructions there is one very special one: it subtracts a value from ecx.
ecx stores the this pointer.

Consider what happens when the subclass's function is called.
ptr1 needs no adjustment because it happens to point to the beginning of the subclass object.
When a subclass function is called, the this pointer must point to the subclass object.

When ptr2 calls the subclass's function, the this pointer it carries is wrong, because it points to the Base2 part.

The job of this instruction is to correct the this pointer.
The amount subtracted is not necessarily 8. It is the size of Base1.

It follows that if Base2 were inherited first, calls through a Base1 pointer would need the correction instead.
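The correction can be observed from C++ without reading assembly. In this sketch (simplified classes, with a helper named ThisSeenViaBase2 invented for the purpose) the override returns the this pointer it receives.

```cpp
struct Base1 { virtual const void* func1() { return this; } int b1 = 1; };
struct Base2 { virtual const void* func1() { return this; } int b2 = 2; };

struct Derive : Base1, Base2 {
    // Returns the this pointer that the override actually receives.
    const void* func1() override { return this; }
    int d1 = 3;
};

// Calls the override through a Base2 pointer, which points into the middle
// of the Derive object, and reports what `this` was inside the function.
const void* ThisSeenViaBase2(Derive& d) {
    Base2* ptr2 = &d;
    return ptr2->func1();
}
```

Even though ptr2 holds an address inside the object, the value returned is the address of the whole Derive object, which shows that something adjusted this on the way into the function.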

Static polymorphism and dynamic polymorphism

Some sources distinguish static polymorphism from dynamic polymorphism.

What is static polymorphism?
Function overloading.

At the language level, "static" generally refers to compile time.

Function overloading is resolved at compile time.

What is dynamic polymorphism?
Virtual function calls, which are resolved at run time.

In essence, the first is hard-wired at compile time, while the second is looked up at run time.
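A minimal sketch of the two kinds side by side. Describe, Shape, Circle, and NameOf are hypothetical names chosen for the example.

```cpp
#include <string>

// Static polymorphism: the overload is picked at compile time from the
// argument's static type.
std::string Describe(int) { return "int"; }
std::string Describe(double) { return "double"; }

// Dynamic polymorphism: the function is picked at run time from the
// dynamic type of the object.
struct Shape {
    virtual std::string Name() const { return "Shape"; }
    virtual ~Shape() = default;
};
struct Circle : Shape {
    std::string Name() const override { return "Circle"; }
};

std::string NameOf(const Shape& s) { return s.Name(); }
```

Describe(1) and Describe(1.0) are different functions chosen by the compiler, whereas NameOf contains a single call whose target depends on the object passed in.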

Diamond virtual inheritance

A has a virtual function func, B overrides func, C overrides func, and
D does not override it. This does not compile.
If D does not override func, the compiler reports an error. Why? Because the override is ambiguous.

B has overridden func and C has overridden func. So whose virtual function should be placed in A's virtual table?
If you are interested in this question, you can explore it yourself. I will not answer it here.
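For readers who want a starting point, here is a sketch of the version that does compile. It assumes the class names A, B, C, and D from the description above, uses virtual inheritance, and returns strings so that the chosen function is visible. CallThroughA is a helper invented for the example.

```cpp
#include <string>

struct A {
    virtual std::string func() { return "A"; }
    virtual ~A() = default;
};
struct B : virtual A {
    std::string func() override { return "B"; }
};
struct C : virtual A {
    std::string func() override { return "C"; }
};

// Without this override the class fails to compile: func would have two
// final overriders (B::func and C::func) for the single shared A part.
struct D : B, C {
    std::string func() override { return "D"; }
};

std::string CallThroughA(A& a) { return a.func(); }
```

Once D provides its own override, a call through a reference to A on a D object resolves to D's function.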


Origin blog.csdn.net/weixin_68359117/article/details/135113344