function overloading, overriding, reentrancy

foreword

Recently I was asked about the difference between function overloading, overriding, and reentrancy, and being asked out of the blue left me a little confused. This article summarizes the three concepts.

In fact, overloading, overriding, and reentrancy are concepts on different dimensions; they are often compared only because their names are similar.
Overloading and overriding are the pair most often confused. Both are C++ concepts; the C language supports neither. The overloading section below explains in detail why C does not.

overload

First, the definition: overloading means that different functions share the same function name but differ in the number or types of their parameters. At a call, the matching function is selected according to the arguments.
Overloading solves the naming problem for functions that do similar things. Take the addition below: if every parameter type needed its own function name, the names would be hard to remember and would grow long.

The C++ functions below are all written as global functions; member functions can of course be overloaded as well. Global functions are used here so that the code can be compared with C later, to explain why C++ can support overloading.
All three functions are named add(). At a call, the matching function is found from the types and number of the arguments passed in, and the correct code is executed. There is no need to remember a set of complicated names, which is very friendly to the programmer.

#include <iostream>

int add(int a, int b)
{
    return a + b;
}

double add(double a, double b)
{
    return a + b;
}

int add(int a, int b, int c)
{
    return a + b + c;
}

int main()
{
    int a = 1, b = 2, c = 3;
    int d = add(a, b);
    return 0;
}

Next, why can function overloading be implemented at all, and how does the program end up executing the correct code?
This comes down to how C++ is compiled. The compiler selects the overload at each call from the argument types, and it records a symbol for every function in the symbol table of the compiled program. These symbols are what the linker uses to connect each call to the corresponding function or variable. In C++, the symbol of a compiled function is built from the function name plus its parameter types (this is known as name mangling).

On Linux you can inspect the symbol table directly with the nm command. In the output below, _Z3adddd, _Z3addii, and _Z3addiii are the symbols generated for the three add functions in the code above. Because each symbol encodes the function name together with the number and types of the parameters, every overload gets a distinct symbol, and a call can be bound to the matching code through its parameter types or parameter count. This also makes it easy to see why functions that differ only in return type are not overloads: these symbols carry no return type information, so the correct function could not be identified from the return value.

yang@legion:~/Desktop/sample/rewrite$ nm test2 
............
...........
0000000000000890 T __libc_csu_fini
0000000000000820 T __libc_csu_init
                 U __libc_start_main@@GLIBC_2.2.5
0000000000000784 T main
00000000000006a0 t register_tm_clones
0000000000000630 T _start
0000000000201010 D __TMC_END__
000000000000074e T _Z3adddd
000000000000073a T _Z3addii
0000000000000768 T _Z3addiii
00000000000007ba t _Z41__static_initialization_and_destruction_0ii
                 U _ZNSt8ios_base4InitC1Ev@@GLIBCXX_3.4
                 U _ZNSt8ios_base4InitD1Ev@@GLIBCXX_3.4
00000000000008a4 r _ZStL19piecewise_construct
0000000000201011 b _ZStL8__ioinit

The program above is C++. C and C++ have a great deal in common, so it is natural to wonder why C does not support function overloading. The answer again lies in the symbol table.
Consider the following piece of C code:

#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

int add(double a, double b)
{
    return a + b;
}

int main()
{
    int a = 0, b = 1;
    int c;
    c = add(a, b);
    return 0;
}

Compiling this code fails immediately: the compiler reports conflicting types for add, because the second definition is treated as a redefinition of the same function.

yang@legion:~/Desktop/sample/rewrite$ gcc test.c -o test
test.c:8:5: error: conflicting types for ‘add’
 int add(double a, double b)
     ^~~
test.c:3:5: note: previous definition of ‘add’ was here
 int add(int a, int b)

The compilation fails precisely because C treats the two functions as the same function; C therefore does not support function overloading.
If you delete the second add() definition, compile, and look at the symbol table, the reason is clear at a glance. As the first line of the output below shows, there is only one symbol, add. In other words, the symbol generated when C code is compiled contains only the function name, with no information about the parameters.
Two functions with the same name are therefore considered a redefinition.

yang@legion:~/Desktop/sample/rewrite$ nm test
00000000000005fa T add
0000000000201010 B __bss_start
0000000000201010 b completed.7698
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000201000 D __data_start
0000000000201000 W data_start
.............
............
............
                 U __libc_start_main@@GLIBC_2.2.5
000000000000060e T main
0000000000000560 t register_tm_clones
00000000000004f0 T _start
0000000000201010 D __TMC_END__

The difference between the symbols generated by C and C++ also explains why C declarations are wrapped in extern "C" when they are used from C++ code: it tells the C++ compiler to use the unmangled C symbol name, so that the call links to the C function correctly.

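There are several rules for overloading:
① The parameter lists must differ, in the number or the types of the parameters. A different return type alone is not enough.
② Overloaded member functions may have different access specifiers.
③ Overloads may throw different exceptions.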

override

Because C++ also supports global function definitions, overloading sometimes raises the question of why C cannot do the same. Overriding, by contrast, is purely an object-oriented concept.
First, the definition: overriding (also called rewriting) means reimplementing, in a derived class, a virtual function of the base class (note that it must be a virtual function). The function name and parameters stay the same, but the function body is different.
Overriding exists to provide object-oriented polymorphism. Different derived classes override a function of the base class, execute different logic, and so exhibit different behavior. Polymorphism itself is covered well in other material.
This article focuses on the rules of overriding, why those rules exist, and how an override comes to replace the base class function.
Here are the overriding rules first:
① The parameter list must be exactly the same as that of the overridden function. Otherwise the function is not an override; in C++ it is a separate function with the same name that hides the base class version.
② The return type must be the same as that of the overridden function. C++ makes one exception: covariant return types, where the override returns a pointer or reference to a class derived from the one the base function returns.
③ In C++ the access specifier does not take part in overriding: an override may be declared with a different access level from the base function. (The rule that an override must not be more restrictive comes from Java.)
④ The exception specification must not be looser than that of the overridden function: if the base function is noexcept, the override must be noexcept too. (The rule about checked exceptions comes from Java; C++ has no checked exceptions.)

Take item ① first. That the parameter lists must match is easy to accept: from the analysis of overloaded functions above, we know that the symbol of a compiled C++ function includes its parameters. That explanation is only partial, though. It does not explain why the match must be exact, and it cannot explain item ② at all: why is the return type constrained, when the function symbol contains no return type?
The answer lies in the different mechanisms behind overloading and overriding. Overloading works because the C++ function symbol encodes the parameter types.
Overriding does not go through the symbol table but through the virtual function table. When a class declares a member function with the virtual keyword, mainstream compilers generate a virtual function table for that class, and each object of the class stores a pointer to the table (the virtual function pointer), typically at the head of the object.
The virtual function pointer has the size of an ordinary pointer: 4 bytes on a 32-bit platform and typically 8 bytes on a 64-bit one. So when calculating the size of an object, you must account for this pointer in addition to the memory occupied by the member variables. The number of virtual functions does not change the object size, because the table is stored once per class, not in each object.
Virtual functions achieve polymorphic behavior through the virtual function table. The essence of an override is that the derived class's table entry points to the derived function, and the call goes through that entry as a function pointer of the base function's type. Seen as a function pointer, it becomes clear why the return type is constrained and why the parameter list must match exactly.
The code below shows an example of overriding.

#include <iostream>
#include <memory>

using namespace std;

class Animal
{
public:
    Animal() = default;
    virtual ~Animal() = default;

    virtual int voice(int a, int b)
    {
        cout << "animal voice ..." << endl;
        return 0;
    }
};

class cat : public Animal
{
public:
    cat() = default;
    ~cat() override = default;

    // The parameter names are deliberately swapped; only the types matter.
    int voice(int b, int a) override
    {
        cout << "cat voice ..." << endl;
        return 0;
    }
};

int main()
{
    unique_ptr<Animal> p;
    p.reset(new cat());
    p->voice(1, 2);  // dispatches to cat::voice through the base pointer

    return 0;
}

One point in this example deserves attention. When overriding voice(), I deliberately swapped the names of the formal parameters (int b, int a), yet the code compiles without problems and the polymorphic call works. This is because only the parameter types are part of the function's signature, and therefore of the function pointer type; the names of the formal parameters are not restricted.

reentrancy

Finally, function reentrancy, or reentrant functions. Unlike the other two, this is a dynamic (runtime) concept.
A reentrant function is one that can be used concurrently by multiple tasks without risk of data errors. Put differently, it is a function that can safely be interrupted: it may be interrupted at any point to run another block of code, and when that code finishes, the original invocation resumes and still runs correctly.
In any multitasking system, a function may be interrupted during execution: control passes to the OS scheduler, another piece of code runs, and the function is resumed afterwards. If the function uses shared resources such as global variables, the interruption may cause problems. Such a function is non-reentrant, and it cannot safely be run in a multitasking environment.
Functions that meet any of the following conditions are in most cases non-reentrant:

  • The function body uses static data structures.
  • The function calls dynamic memory allocation or release functions, such as new or malloc.
  • The function body performs I/O operations.

For example, the following function uses a global variable, so it is non-reentrant.

int c;
void swap(int *a, int *b)
{
    c = *a;
    *a = *b;
    *b = c;
}

Why are most functions that meet the conditions above non-reentrant?

In a multitasking environment, isn't the context saved before an interruption and restored afterwards? How can a function break just from being interrupted?
The reason is that the interrupt does save some context, but only a small amount: the return address, the CPU registers, and similar state. Global variables, buffers, and other data areas used inside the function are not protected. After the interrupt returns and execution resumes, the values of that data may have changed, so they are unreliable.
Reentrant functions are therefore the functions that are safe to use in a multitasking environment.
Based on this, follow these rules when writing reentrant functions:
1. Do not use (or return) static data or global variables, unless access is made mutually exclusive, for example with a semaphore.
2. Do not call dynamic memory allocation or release functions.
3. Do not call any non-reentrant functions (such as standard I/O functions).

Origin blog.csdn.net/yangcunbiao/article/details/130876163