Environment:
- CentOS 7.6
- gcc 4.8.5
1. From test.c to test.out
The experiments here run on Linux. Linux does not require a suffix on executables; this article uses .out, following gcc's default output name a.out.
First look at the following code:
test.c
#include <stdio.h>

int main()
{
    printf("ok\n");
    return 0;
}
First, we compile it into test.out with gcc test.c --save-temps -o test.out, keeping the intermediate files.
The --save-temps option keeps the intermediate files of the entire compilation process (test.i, test.s and test.o) instead of deleting them.
Based on these intermediate files, we walk through the process directly:
Step 1: Preprocessing
The preprocessor first processes the content of test.c, that is, it handles directives such as #include... and #ifdef.... After this step, no #include... lines remain, and the result is written to test.i. For more on preprocessing, please refer to "c: Preprocessing Directives (#include, #define, #if, etc.)".
In short, the generated file contains the expanded content of stdio.h followed by our own main function.
Step 2: Compile
In this step, gcc performs lexical analysis and optimization on test.i and finally converts it into the assembly file test.s. Note that test.s is still a text file.
Step 3: Assemble
In this step, the assembler translates the instructions in test.s into binary form. The output test.o is a binary file in ELF format (the ELF file format is covered later).
test.o is also known as a relocatable file, which we can confirm with the file command.
Step 4: Link
In this step, the linker links test.o with the resources it references: it merges them, reassigns memory addresses, and finally generates test.out. Running the file command on test.out shows that it is an ELF executable.
2. GCC is a compilation driver
In the four steps above, we mentioned the preprocessor, the compiler, the assembler and the linker. Each of the four has its own dedicated program, for example:
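- Preprocessor: /usr/bin/cpp
- Compiler: cc1 for C or cc1plus for C++, stored in gcc's private directory (gcc -print-prog-name=cc1 prints the path); /usr/bin/cc and /usr/bin/c++ are themselves drivers like gcc
- Assembler: /usr/bin/as
- Linker: /usr/bin/ld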
In a Windows environment, these separate programs are easier to see.
Since gcc is a compilation driver, we can naturally run the steps one by one without the gcc command, for example:
- cpp test.c test.i : preprocessing
- as test.s -o test.o : assembly
Why are the other two steps not listed? I ran into various errors with them during the experiment and gave up.
Now that we know the four steps of compilation, can we control at which step to stop?
Of course. For that we return to the gcc command:
- Preprocess only: test.c => test.i
gcc -E test.c -o test.i
- Preprocess and compile: test.c => test.s
gcc -S test.c -o test.s
- Preprocess, compile and assemble: test.c => test.o
gcc -c test.c -o test.o
- The whole process, producing the executable: test.c => test.out
gcc test.c -o test.out
3. About ELF files
Above we mentioned that test.o and test.out are binary files in ELF (Executable and Linkable Format) format. So what is an ELF file?
See the introduction in Baidu Encyclopedia.
In other words, of the things we care about, ELF can represent 4 types of files:
- Object code: test.o
- Executable file: test.out
- Dynamic library: test.so
- Core dump file (rarely used, generally to assist debugging)
So what does the internal format of ELF look like?
Again, see Baidu Encyclopedia.
We won't go any deeper here; a general idea is enough.
4. About linking
Of the four steps above, the one that puzzles us most is probably linking: why is it needed at all?
Linking actually serves two purposes:
- Layout
Assume we have two files, test.c and libadd.c, and test.c calls a function in libadd.c. When compiling, the compiler first generates the object files test.o and libadd.o separately. Because they are compiled separately, the addresses used by the assembly instructions in both test.o and libadd.o are assumed to start from 0; that is, neither file knows about the other. The linker's layout step combines their address spaces so that they do not overlap.
- Relocation
Still assume the two files test.c and libadd.c above. When test.c calls the function, all it has is the declaration int add(int x,int y). Where is the actual implementation of this function? It is not in test.o, so the call site in test.o is callq 0x0000: since the address of the function is unknown, it is filled with 0 for now.
Therefore, the linker has to help test.o find the implementation of this function. libadd.o happens to contain the definition of this function, so the linker fills the address from libadd.o into test.o.
These are the two main purposes of the linker. I simplified it above, but it is actually very complicated.
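The example given here is static linking. There is also dynamic linking (used, for example, for standard functions such as the printf we call), in which case what is stored at the call site leads to the dynamic library loader, which resolves the real address at run time.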
5. Dynamic library and static library
When discussing linking above, we also mentioned static linking and dynamic linking. Static linking copies the referenced library code into the executable, while dynamic linking does not need to copy it. As a result, dynamic linking is used far more widely than static linking.
Having understood dynamic and static linking, we should also get to know dynamic libraries and static libraries.
Now let's experiment:
5.1 Generate static library
A static library is simply compiled object code (such as libadd.o and libsub.o) packed into an archive; the usual suffix is *.a.
First, prepare three files:
test.c
#include <stdio.h>

int add(int x, int y);
int sub(int x, int y);

int main()
{
    int x = 20, y = 10;
    printf("x+y=%d\n", add(x, y));
    printf("x-y=%d\n", sub(x, y));
    printf("ok\n");
    return 0;
}
libadd.c
int add(int x, int y)
{
    return x + y;
}
libsub.c
int sub(int x, int y)
{
    return x - y;
}
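Now, we compile them separately with gcc -c, producing test.o, libadd.o and libsub.o.
Next, let's turn libadd.o and libsub.o into a static library:
ar rcs ...
Here r means replace (insert the files, replacing members of the same name), c means create the archive, and s writes a symbol index into it.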
5.2 Call static library compilation
Continuing from the above, we link test.o and libaddsub.a to generate test.out.
5.3 Using dynamic libraries
Note: a dynamic library is a binary file in ELF format, not an archive, and its suffix is *.so
Above, we built a static library from libadd.c and libsub.c. Now we build a dynamic library from each of them:
# Generate libadd.so (-fPIC requests position-independent code, as recommended for shared libraries)
gcc -shared -fPIC libadd.c -o libadd.so
# Generate libsub.so
gcc -shared -fPIC libsub.c -o libsub.so
Now, let's compile the executable with the dynamic libraries:
# Generate test.out
gcc test.c libadd.so libsub.so -o test.out
But when we execute test.out, we are disappointed: it fails to start because libadd.so cannot be found.
Why is this so? Isn't libadd.so right there in the current directory?
This comes down to how Linux loads dynamic libraries:
Linux looks for dynamic libraries in the paths specified by its configuration, not in the current directory. So where is this configuration?
In /etc/ld.so.conf.
This file says to look inside /etc/ld.so.conf.d/*, and on this machine the only directory found there is /usr/lib64/mysql (note: default directories such as /lib64 are also searched, even though they are not listed).
So which dynamic libraries can currently be found?
They can be viewed with ldconfig -p.
In addition, the file /etc/ld.so.cache is the cache of dynamic libraries; running the ldconfig command forces /etc/ld.so.cache to be updated.
Now we know our options: either put libadd.so and libsub.so into a system directory such as /lib64, or add the directory containing libadd.so to a configuration file in the /etc/ld.so.conf.d directory. This is indeed a workaround.
However, there is another solution: the LD_LIBRARY_PATH environment variable, which lists extra directories to search for dynamic libraries.
We can then put the setting of this environment variable into /etc/profile.
If we don't want to leave any traces on the system, then we can write a script with the following content:
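# Directory that contains this script (and the .so files)
current_dir=$(cd "$(dirname "$0")"; pwd)
# Add that directory to the loader's search path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$current_dir
"$current_dir"/test.out
With this script, test.out finds both libraries and runs normally.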
6. Some understanding of C language compilation
In fact, from the compilation process above we should realize that C compilation as a whole is divided into two steps:
- Each individual .c file is compiled separately into object code *.o;
- Multiple *.o files and static or dynamic libraries are linked into an executable such as test.out
So when building C, each file is compiled on its own first, and the resources are integrated afterwards.
Consequently, a single .c file does not have to contain the implementation of a function it uses, but to call the function it must declare it before the call (the same applies to global variables).
7. gcc common compilation options
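In addition to the compilation commands above, we commonly use
gcc -g -O test.c -o test.out
Here -g generates debugging information (needed if we want to debug, for example with gdb);
-O is an optimization option.
Note that -Og, written as a single option, is a separate optimization level that optimizes without hurting the debugging experience; it does not generate debugging information by itself.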
8. Supplement: other commands that accompany gcc
8.1 objdump
8.1.1 Display the internal information of libaddsub.a
8.1.2 Display the disassembly of libadd.o
8.1.3 Display symbol table information
For more on objdump, refer to objdump --help or man objdump
8.2 readelf
Above, we said that test.o, test.so and test.out are all binary files in ELF format. Now we use readelf to look at them:
8.2.1 Display ELF file header information
8.2.2 Display program header table information