C: Let's talk about the compilation process of the C language

Environment:

  • CentOS 7.6
  • gcc 4.8.5

1. From test.c to test.out

The experiments here run on Linux. Linux executables do not need any particular suffix (gcc names its default output a.out); this article uses the .out suffix for executables.

First look at the following code:

test.c

#include <stdio.h>

int main()
{
    
    
    printf("ok\n");
    return 0;
}

First, we compile it into test.out with gcc test.c --save-temps -o test.out, keeping the intermediate files, as follows:
[Figure: directory contents after compiling with --save-temps]

The --save-temps option keeps the intermediate files produced during the whole compilation process.

Based on these intermediate files, the process can be explained directly:

[Figure: the compilation pipeline, test.c → test.i → test.s → test.o → test.out]

Step 1: Preprocessing

The preprocessor first preprocesses test.c, that is, it handles directives such as #include... and #ifdef.... After processing, no #include... directives remain, and the result is test.i. For details on preprocessing, please refer to "c: Preprocessing Directives (#include, #define, #if, etc.)".
In short, the content of the generated file is roughly as follows:
[Figure: excerpt of test.i]

Step 2: Compile

In this step, gcc performs lexical and syntax analysis and optimization on test.i, and finally converts it into the assembly file test.s. Note that test.s is still text, as shown in the figure below:
[Figure: contents of test.s]

Step 3: Assemble

In this step, the assembler translates the instructions in test.s into binary form. The output test.o is a binary file in ELF format (the ELF file format is covered later).
test.o is also known as a relocatable file, which we can observe with the file command:
[Figure: output of file test.o]

Step 4: Link

In this step, the linker links test.o with the resources it references, mainly merging them together and reassigning memory addresses, and finally generates test.out. We can observe the result with the file command:
[Figure: output of file test.out]

2. GCC is a compilation driver

In the four steps above we mentioned the preprocessor, the compiler, the assembler, and the linker. Each of the four has a corresponding dedicated program, such as:

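  • Preprocessor: /usr/bin/cpp
  • Compiler: cc1, the compiler proper, which lives in GCC's own installation directory rather than in /usr/bin (/usr/bin/cc and /usr/bin/c++ are themselves drivers, like gcc)
  • Assembler: /usr/bin/as
  • Linker: /usr/bin/ld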

If we are in a Windows environment, we can see this more clearly:
[Figure: the corresponding programs in a Windows GCC installation]
Since gcc is a compilation driver, we can naturally compile step by step without the gcc command, for example:

  • Preprocessing: cpp test.c test.i
  • Assembly: as test.s -o test.o

Why are the other two steps not listed? Because I ran into various errors during the experiment and gave up...

Now that we know the four steps of compilation, can we control at which step to stop?
Of course. We return to the gcc command:

  • Preprocess only: test.c => test.i
    gcc -E test.c -o test.i
  • Preprocess and compile: test.c => test.s
    gcc -S test.c -o test.s
  • Preprocess, compile and assemble: test.c => test.o
    gcc -c test.c -o test.o
  • The whole process, producing the executable: test.c => test.out
    gcc test.c -o test.out

3. About ELF files

Above we mentioned that test.o and test.out are binary files in ELF format. So what is an ELF file?

Let's look directly at the introduction in Baidu Encyclopedia:
[Figure: Baidu Encyclopedia's introduction to ELF]
In other words, for our purposes ELF can represent four types of files:

  • Object code: test.o
  • Executable file: test.out
  • Dynamic library: test.so
  • Core dump file (used less often, generally to assist debugging)

So what does the internal format of ELF look like?

Again, from Baidu Encyclopedia:
[Figure: internal layout of an ELF file]

We won't go any deeper here; a general idea is enough.

4. About linking

Of the four steps above, the one that raises the most questions is probably linking: why is it needed at all?

Linking actually serves two purposes:

    1. Layout
      Assume we have two files, test.c and libadd.c, and that test.c calls a function defined in libadd.c. When compiling, the compiler first generates the object files test.o and libadd.o separately. Because they are compiled separately, the addresses used by the instructions in both test.o and libadd.o are assumed to start from 0; that is, the two files know nothing about each other. The linker's layout step combines their address spaces so that they do not overlap.
    2. Relocation
      Assume the same two files, test.c and libadd.c. When test.c calls the function, all it has is the declaration int add(int x,int y). Where is the actual implementation? There is nothing in test.o, so the call site in test.o is callq 0x0000; that is, since the address of the function is not known, it is filled with 0 for now.
      Therefore, the linker has to find the implementation of this function for test.o. libadd.o happens to contain the definition of this function, so the linker fills the function's address from libadd.o into test.o.

These are the two main jobs of the linker. The description above is simplified; the real process is much more complicated.

The example given here is static linking. There is also dynamic linking (used, for example, for standard functions such as the printf we call), in which the address of printf is filled in by the dynamic library loader.

5. Dynamic library and static library

Above, when discussing linking, we also mentioned static linking and dynamic linking. Static linking copies the referenced library code into the executable, while dynamic linking does not need to copy it. As a result, dynamic linking is used far more often than static linking.

Having understood dynamic linking and static linking, we should also get to know dynamic libraries and static libraries.
Now let's experiment:

5.1 Generate static library

A static library is simply compiled object code (such as libadd.o and libsub.o) packed into a single archive file, and its usual suffix is *.a.

First, prepare three files:

test.c

#include <stdio.h>

int add(int x,int y);
int sub(int x,int y);

int main()
{
    
    
    int x=20,y=10;
    printf("x+y=%d\n",add(x,y));
    printf("x-y=%d\n",sub(x,y));
    printf("ok\n");
    return 0;
}

libadd.c

int add(int x,int y)
{
    
    
    return x+y;
}

libsub.c

int sub(int x,int y)
{
    
    
    return x-y;
}

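Now, we compile them separately:
[Figure: compiling test.c, libadd.c and libsub.c into object files]
Now, let's pack libadd.o and libsub.o into a static library:
[Figure: creating libaddsub.a with ar rcs]

In ar rcs ..., r means replace (insert the files, replacing any existing members), c means create the archive, and s writes a symbol index into it.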

5.2 Compile against the static library

Continuing from the above, we link test.o and libaddsub.a to generate test.out:
[Figure: linking test.o with libaddsub.a and running test.out]

5.3 Using dynamic libraries

Note: a dynamic library is a binary file in ELF format, not an archive, and its suffix is *.so

Above, we built a static library from libadd.c and libsub.c. Now let's build a dynamic library from each of them:

# Generate libadd.so
gcc -shared libadd.c -o libadd.so
# Generate libsub.so
gcc -shared libsub.c -o libsub.so

Now, let's build the executable against the dynamic libraries:

# Generate test.out
gcc test.c libadd.so libsub.so -o test.out

But when we run test.out, we are disappointed:
[Figure: test.out fails to start because libadd.so cannot be found]

Why is this? Isn't libadd.so right there in the current directory?

This comes down to how Linux loads dynamic libraries:

Linux looks for dynamic libraries in the paths specified by its configuration rather than in the current directory. So where is this configuration?
In /etc/ld.so.conf:
[Figure: contents of /etc/ld.so.conf]
This file in turn includes /etc/ld.so.conf.d/*:
[Figure: contents of /etc/ld.so.conf.d/]
On this machine, the only directory configured there is /usr/lib64/mysql (note: default directories such as /lib64 are also searched but are not listed).

So which dynamic libraries are currently known to the system?
They can be listed with ldconfig -p:
[Figure: output of ldconfig -p]

In addition, /etc/ld.so.cache is the cache of dynamic libraries; running the ldconfig command forces this file to be updated.

Now we know that we can either put libadd.so and libsub.so into a system directory such as /lib64, or add the directory containing libadd.so to a configuration file under /etc/ld.so.conf.d (and then run ldconfig to refresh the cache). This is indeed a workaround.

However, there is another solution, which is to use the LD_LIBRARY_PATH environment variable, as follows:
[Figure: setting LD_LIBRARY_PATH and running test.out successfully]

Then, we can put this environment variable setting into /etc/profile.

If we don't want to leave any traces on the system, we can write a script with the following content:

# Directory that contains this script
current_dir=$(cd "$(dirname "$0")" && pwd)
# Add that directory to the library search path
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$current_dir"
./test.out

The effect is as follows:
[Figure: running test.out through the script]

6. Some understanding of C language compilation

In fact, from the compilation process above, we should realize that compiling C is divided into two overall steps:

  • Each individual C file is compiled separately into object code *.o;
  • The multiple *.o files and any static or dynamic libraries are linked into an executable such as test.out

So when compiling C, each file is compiled individually first, and the results are then combined.
This is also why a single .c file does not need to contain the implementation of a function it uses, but must declare that function before calling it (the same applies to global variables).

7. gcc common compilation options

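In addition to the compilation commands above, we commonly use:
gcc -O -g test.c -o test.out

Here -g generates debugging information (needed if we want to debug, for example with gdb);
-O is an optimization option. Note that the single option -Og is something else: an optimization level intended for debugging, which does not itself generate debugging information.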

8. Supplement: other commands that accompany gcc

8.1 objdump

8.1.1 Display internal information of libaddsub.a

[Figure: objdump output for libaddsub.a]

8.1.2 Display disassembly of libadd.o

[Figure: objdump disassembly of libadd.o]

8.1.3 Display symbol table information

[Figure: objdump symbol table output]

For more on objdump, refer to objdump --help or man objdump

8.2 readelf

Above, we said that test.o, test.so, and test.out are all binary files in ELF format. Now let's look at them with readelf:

8.2.1 Display ELF file header information

[Figure: readelf file header output]

8.2.2 Display program header table information

[Figure: readelf program header table output]


Origin blog.csdn.net/u010476739/article/details/127384988