Notes on a gcc "undefined reference" link error

1. Preliminary summary

Last week, while using a third-party library, I hit a link error reporting an undefined symbol. Yet when I inspected the third-party .so with readelf, the symbol was clearly defined; it just did not look quite the same as the symbol named in the error. The cause comes down to what extern "C" does and how g++ and gcc handle symbol names differently. I have put off writing this up for a long time, so let's finish it today.

2. Symbol differences

1. Phenomenon

Here we reproduce the situation with a minimal example. First, prepare the source for a shared library, "func.cpp", together with its header "func.h":

// func.h
#include <stdio.h>

int func(int a, int b);

// func.cpp
#include "func.h"

int func(int a, int b) {
  return a + b;
}

Compile it into the corresponding shared library with the following command:

g++ -shared -fPIC -o libfunc.so func.cpp

Next, write a main function that calls the func function implemented by the library. The source, "main.c", is as follows:

// main.c
#include "func.h"

int main () {
  int a = 1, b = 1;
  int c = 0;
  c = func(a, b);
  printf("%d + %d = %d\n", a, b, c);
  
  return 0;
}

Now try to compile main.c and link it against libfunc.so with the following command:

gcc main.c -o exec -L./ -lfunc

This reproduces the problem I hit at work when linking the third-party library: the linker reports that main.c references a symbol that is never defined, undefined reference to 'func'.

2. Preliminary research

I was quite confused at the time. int func(int a, int b) is clearly defined in func.cpp, and I did link against the corresponding library, as the "-lfunc" in the command shows. So why does the linker say the symbol is undefined? I used readelf to check how func appears in libfunc.so:

readelf -a ./libfunc.so | grep "func"

The output shows that libfunc.so does define a symbol for the function, but under the name "_Z4funcii" rather than the "func" named in the link error. This mismatch is why the linker cannot find the definition. Judging by the form of the name, the symbol was mangled by the compiler because the library was built with g++, while gcc uses the plain name "func" when compiling main.c, so the reference stays unresolved.

3. Differences in symbol handling between g++ and gcc at compile time

With a problem like this, I quickly looked up some references to firm up my understanding of how g++ mangles symbols.

It can be summarized in the following points:
1) Apart from global variables, which are not mangled, every mangled symbol starts with "_Z";
2) A symbol that lives in a namespace or class has its qualified name wrapped in "N" ... "E";
3) Every namespace name, class name, function name, or variable name is encoded as the number of characters in the name followed by the name itself;
4) Names are encoded in order from the outermost scope to the innermost;
5) For a function, the parameter types are encoded in order of appearance.
(Refer to the blog post here: https://blog.csdn.net/roland_sun/article/details/43233565)

Following these rules, we can mangle the func function by hand:
1) func is a global function, so the symbol starts with "_Z";
2) It is not inside a namespace or class, so no "N" ... "E" wrapper is needed;
3) The function name "func" has four characters, so it is encoded as "4func";
4) The parameters are "int a, int b", and int is encoded as "i", so the parameter part is "ii".

To sum up, g++ should turn "func" into "_Z4funcii", which is exactly what readelf showed us.

The mangling is done entirely by g++, and the user never sees it. gcc, on the other hand, does not mangle symbols. Why the difference?

This is because C++ adds features that C lacks, such as namespaces, classes, and function overloading, so functions with the same name can exist in different namespaces, in different classes, or even in the same class. To keep every symbol in the compiled object file unique, g++ mangles the names from the C++ source. Once we understand the mangling rules, we can work back from a mangled symbol to the entity it refers to.

3. Problem solving

Now that we have found the root cause of the problem, we have to find a solution. Since the third-party library is directly compiled by the third party and released to me, I can't recompile it without source code files, so I can only find a solution from my side.

1. Compile main.c with g++

Since we cannot change the symbols inside the library, the only option is to change the symbols generated when compiling main.c. So I tried compiling main.c with g++.

We compile the source file into the object file "main.o" with the following command (g++ treats main.c as C++ source):

g++ -c -o main.o main.c

After compiling, we use readelf to check how the func symbol appears in the object file. In "main.o", the symbol "func" now has the mangled form "_Z4funcii", so the link step should no longer complain that the symbol cannot be found. Trying it confirms this: linking against the library now succeeds, and the generated executable "exec" runs without any problems. At this point the problem is basically solved.

But we would still like to follow the convention that g++ compiles C++ files and gcc compiles C files. What should we do then? For a C-style interface like this, we want to keep the original symbol name rather than the mangled one, and for that we need to declare the interface with extern "C".

2. extern "C"

We can often see the following structure in the header file:

#ifdef __cplusplus
extern "C" {
#endif

/*
 ... symbol declarations ...
 */

#ifdef __cplusplus
}
#endif

What does this code do? Let's apply it to the earlier test code and see what effect it has. Modify the header file func.h to look like this:

// func.h
#ifndef _FUNC_H_
#define _FUNC_H_

#include <stdio.h>

#ifdef __cplusplus
extern "C" {
#endif

int func(int a, int b);

#ifdef __cplusplus
}
#endif

#endif

Then recompile "libfunc.so" against the modified header and use readelf to check the "func" symbol in the new library again.

With the code above added to the header, the symbol name in the library is no longer mangled. Does that mean main.c can now be compiled and linked directly with gcc? Trying it shows that it can: the executable links without errors. Combined with the earlier discussion of how g++ and gcc treat symbols differently, we can conclude that extern "C" tells the compiler to give the enclosed declarations C linkage: the code is still compiled as C++, but the symbol names are not mangled.

So for a C-style interface, we should declare it with extern "C". Then a C source file that references the symbol can be compiled with gcc, and there is no need to compile the C file with g++ just to keep the symbol names consistent.

4. Summary

So far, we have solved the undefined symbol problem encountered this time and dug into the principles behind it and the corresponding solutions. In hindsight the problem is simple and easy to fix, but when it happened I was at a loss. Perhaps working overtime late had left my brain less than flexible, hahaha.

Although the problem is simple, we still need to dig into the root cause, understand the relevant principles, and then try to solve the problem from more than one angle. That is how we grow!

Experts are also welcome to visit my personal website (www.ccccxy.top/coding) to leave comments and guidance. This article will also be posted there, thank you!

Life is endless, bugs are endless, programmers, the road is long and difficult~


Origin blog.csdn.net/qq_38894585/article/details/109843764