In-depth understanding of the process of static linking

The process of static linking

Each source file is compiled separately into a relocatable object file. The object file uses the same ELF format as an executable file, but it cannot be run directly: the root cause is that no final virtual addresses are assigned at compile time.

The reasons are as follows:

For a function or variable defined in the current source file, the compiler can assign an address (in practice, an offset within the object file's own segment). If the source file references functions or global variables defined in other files, however, their addresses cannot be determined at compile time. The determination of virtual addresses is therefore postponed to the linking stage, where the system linker combines the object files into an executable file. For simplicity, static linking is used here to analyze what the linker does. The linking process mainly consists of space and address allocation, symbol resolution, and relocation.
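To make the idea of postponed addresses concrete, here is a minimal Python sketch of what a single object file knows about its symbols before linking. The file name a.o and the symbol names main, shared, and swap are hypothetical, and the dictionary layout is a simplification for illustration, not the real ELF symbol table format.

```python
# Toy model of one relocatable object file's symbol table (not real ELF).
# Defined symbols carry a segment and an offset inside that segment;
# symbols defined in other files have no location yet.
a_o_symbols = {
    "main":   {"segment": ".text", "offset": 0x0},  # defined in this file
    "shared": {"segment": None, "offset": None},    # defined elsewhere
    "swap":   {"segment": None, "offset": None},    # defined elsewhere
}


def undefined_symbols(symbols):
    """Return the names whose location is unknown at compile time."""
    return sorted(name for name, info in symbols.items()
                  if info["segment"] is None)


print(undefined_symbols(a_o_symbols))
```

The names that this function returns are exactly the ones the linker has to fill in later.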

1. Space and address allocation:

    Scan all object files for the lengths, attributes and start addresses of their individual segments.

    Merge the symbol tables of the object files: the symbol tables of all object files are combined into one global symbol table.

    Merge similar segments: the .text segments of all object files are merged into the .text segment of the output file, and the .data and .bss segments are merged into their corresponding output segments in the same way. After this step the linker knows the segment lengths of every input object file and can calculate the length and position of each merged segment in the output file, so the virtual addresses of all functions and variables can be determined. On Linux, ELF executables are traditionally allocated virtual addresses starting from 0x08048000 on 32-bit x86 and from 0x400000 on 64-bit x86-64.
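The allocation step can be sketched roughly as follows. This is a toy model under stated assumptions: the file names and segment sizes are made up, and it ignores ELF headers, alignment, and page boundaries, which a real linker has to account for.

```python
# Toy address allocation: merge same-named segments and assign addresses.
BASE = 0x08048000  # traditional 32-bit x86 Linux base mentioned in the text

# Hypothetical inputs: per-file segment sizes in bytes.
object_files = {
    "a.o": {".text": 0x30, ".data": 0x00, ".bss": 0x00},
    "b.o": {".text": 0x40, ".data": 0x04, ".bss": 0x00},
}


def allocate(object_files, base):
    """Lay out segments of the same kind back to back.

    Returns (segment_start, placement): segment_start[seg] is where the
    merged segment begins, and placement[(file, seg)] is where that file's
    piece of the segment begins.
    """
    segment_start = {}
    placement = {}
    addr = base
    for seg in (".text", ".data", ".bss"):
        segment_start[seg] = addr
        for name, sizes in object_files.items():
            placement[(name, seg)] = addr
            addr += sizes.get(seg, 0)
    return segment_start, placement


segment_start, placement = allocate(object_files, BASE)

# A symbol's final address is the start of its file's piece of the segment
# plus the symbol's offset inside that piece (offset 0 is assumed here).
swap_addr = placement[("b.o", ".text")] + 0x0
print(hex(swap_addr))
```

Once every piece has a start address, any symbol's final address follows from its file, its segment, and its offset, which is why this step has to come before relocation.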

2. Symbol resolution and relocation:

All the symbol information collected in the first step is placed in the global symbol table. Some entries are references to symbols and some are definitions of symbols. Each symbol reference requires symbol resolution and address relocation so that it points at the final virtual address of the referenced symbol. Address relocation is the core of the linking process.

Symbol resolution: for each symbol that an instruction references from another object file, find the symbol's correct address, that is, look up its entry in the global symbol table. If the symbol is not found, the linker reports an undefined symbol error; if it is defined more than once, the linker reports a symbol redefinition error. Symbol resolution is performed together with relocation.
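The two error cases above can be illustrated with a small Python sketch. Everything here is hypothetical: the file names, symbol names, addresses, the LinkError class, and the wording of the error messages. It also treats every definition as equally strong, which real linkers do not.

```python
class LinkError(Exception):
    """Raised by this toy linker when resolution fails."""


def build_global_table(object_files):
    """Merge per-file definitions; a name defined twice is an error."""
    table = {}
    for obj in object_files.values():
        for sym, addr in obj["defined"].items():
            if sym in table:
                raise LinkError(f"symbol redefinition: {sym}")
            table[sym] = addr
    return table


def resolve(object_files):
    """Check that every referenced symbol has a definition somewhere."""
    table = build_global_table(object_files)
    for obj in object_files.values():
        for sym in obj["referenced"]:
            if sym not in table:
                raise LinkError(f"symbol undefined: {sym}")
    return table


# Hypothetical inputs: addresses are assumed to be already allocated.
object_files = {
    "a.o": {"defined": {"main": 0x08048000},
            "referenced": ["shared", "swap"]},
    "b.o": {"defined": {"swap": 0x08048030, "shared": 0x08048070},
            "referenced": []},
}

table = resolve(object_files)
print(hex(table["swap"]))
```

Removing b.o from the input makes resolve raise the undefined error, and adding a second file that also defines swap makes it raise the redefinition error.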

Relocation: correct the instructions that refer to functions or variables in other object files. Before the real address is known, a placeholder address is used in the instruction. Because the global symbol table was merged in the first step, every symbol now has its own address, so each reference can be relocated. Relocation is driven by the relocation table in the ELF object file, which records the locations in each segment that need to be patched. It is carried out together with symbol resolution.
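A minimal sketch of the patching step, assuming the simplest possible case: each relocation entry is a hypothetical (offset, symbol) pair, and the fix is to write the symbol's absolute 32-bit address in little-endian order over the placeholder. Real ELF relocation entries also carry a type (for example absolute or PC-relative), which this toy model leaves out.

```python
import struct


def apply_relocations(text, relocations, symbol_table):
    """Overwrite each 4-byte placeholder with the symbol's final address."""
    patched = bytearray(text)
    for offset, symbol in relocations:
        addr = symbol_table[symbol]
        # "<I" = little-endian unsigned 32-bit integer
        struct.pack_into("<I", patched, offset, addr)
    return bytes(patched)


# Hypothetical machine code: one opcode-like byte followed by a
# 4-byte placeholder address (all zeros) that the compiler left behind.
text = bytes([0xA1, 0x00, 0x00, 0x00, 0x00])

# Hypothetical relocation table: patch 4 bytes at offset 1 with 'shared'.
relocations = [(1, "shared")]

# Hypothetical result of address allocation and symbol resolution.
symbol_table = {"shared": 0x08048070}

patched = apply_relocations(text, relocations, symbol_table)
print(patched.hex())
```

The opcode byte is left untouched and only the placeholder bytes change, which is the sense in which the relocation table tells the linker where to patch.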

<<Programmer's self-cultivation>>
