The following content was compiled from online resources while studying. If anything here infringes your rights, please let me know and I will remove it.
References
(1) ELF file format analysis, mergerly's blog, CSDN
(2) How many sections are Linux C/C++ object files and executable files divided into?, Zhihu (the author's blog is recommended)
1. Introduction to ELF format
On Linux, the .o object file produced by compiling and assembling C/C++ source code, and the executable file produced by linking, are generally in the ELF format (Executable and Linkable Format).
Object files (relocatable files), executable files, dynamic link libraries (shared object files), and core dump files on Linux are all in ELF format.
An ELF file is organized into sections (often loosely called segments). The number of sections can be controlled from the code, but a program file usually contains at least the following:
- .text section: the code section, which stores the compiled machine code. Non-static local variables do not get a section of their own: they live on the stack (or in registers) at run time, and only the instructions that operate on them are stored here.
- .data section: stores initialized global variables and initialized local static variables.
- .bss section: stores uninitialized global variables and uninitialized local static variables, which default to 0. It takes up no space in the file; only its size is recorded.
This article describes the format of the ELF file, not the layout of a process's address space. The two are related, however: when an ELF file is executed, the contents of each region of the process address space are loaded from the corresponding sections of the file.
2. Examples
This article uses a simple C program to show how C/C++ code maps to the sections of an ELF file.
//main.c
#include <stdio.h>

long global_n1;      // global variables are zero-initialized by default (NULL for pointers)
long global_n2 = 10;

long sum_func(long a, long b)
{
    static long local_static_n1; // local static variables are zero-initialized by default
    static long local_static_n2 = 123;
    static long local_static_n3 = 456;
    return a + b;
}

int main(void)
{
    long sum = sum_func(global_n1, global_n2);
    printf("sum=%ld\n", sum);
    return 0;
}
To keep the analysis simple, we compile the code above into an object file rather than an executable, because linking an executable introduces many additional symbols and sections.
gcc -c -o main.o main.c
Then use the objdump tool to inspect the main.o object file. The -t option displays the symbol table.
xjh@ubuntu:~/iot/tmp$ objdump -t main.o
main.o: file format elf32-i386
SYMBOL TABLE:
00000000 l df *ABS* 00000000 main.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000000 l d .rodata 00000000 .rodata
00000004 l O .data 00000004 local_static_n3.1831
00000008 l O .data 00000004 local_static_n2.1830
00000000 l O .bss 00000004 local_static_n1.1829
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .eh_frame 00000000 .eh_frame
00000000 l d .comment 00000000 .comment
00000004 O *COM* 00000004 global_n1
00000000 g O .data 00000004 global_n2
00000000 g F .text 0000000d sum_func
0000000d g F .text 0000003f main
00000000 *UND* 00000000 printf
xjh@ubuntu:~/iot/tmp$
From this you can see which section each symbol (function name or variable name) belongs to:
| Section | Symbols | Remark |
|---|---|---|
| .data | local_static_n3, local_static_n2, global_n2 | Stores initialized global variables and initialized local static variables. |
| .bss | local_static_n1, global_n1 | Stores uninitialized global variables and uninitialized local static variables. In main.o, global_n1 is still listed as a common symbol (*COM*); the linker places it in .bss. |
| .text | sum_func, main | Stores code. |
The size of each section can be viewed with the size command. As shown below, the text size is 173 bytes. The code of sum_func and main accounts for 13 + 63 = 76 bytes of that (0xd and 0x3f in the symbol table); the text column also counts other read-only sections such as .rodata and .eh_frame. The data size is 12 bytes (the three 4-byte variables in .data in the table above). The bss size is 4 bytes: only local_static_n1 is counted, because in main.o global_n1 is still a common symbol (*COM*) that has not yet been assigned to .bss.
xjh@ubuntu:~/iot/tmp$ size main.o
text data bss dec hex filename
173 12 4 189 bd main.o
xjh@ubuntu:~/iot/tmp$
As mentioned above, linking a program into an executable file introduces many additional symbols and sections. Let's verify this. Note that in the output below, global_n1 now resides in .bss.
xjh@ubuntu:~/iot/tmp$ gcc main.o -o main.elf
xjh@ubuntu:~/iot/tmp$ objdump -t main.elf
main.elf: file format elf32-i386
SYMBOL TABLE:
08048154 l d .interp 00000000 .interp
08048168 l d .note.ABI-tag 00000000 .note.ABI-tag
08048188 l d .note.gnu.build-id 00000000 .note.gnu.build-id
080481ac l d .gnu.hash 00000000 .gnu.hash
080481cc l d .dynsym 00000000 .dynsym
0804821c l d .dynstr 00000000 .dynstr
08048268 l d .gnu.version 00000000 .gnu.version
08048274 l d .gnu.version_r 00000000 .gnu.version_r
08048294 l d .rel.dyn 00000000 .rel.dyn
0804829c l d .rel.plt 00000000 .rel.plt
080482b4 l d .init 00000000 .init
080482e0 l d .plt 00000000 .plt
08048320 l d .text 00000000 .text
080484e4 l d .fini 00000000 .fini
080484f8 l d .rodata 00000000 .rodata
0804850c l d .eh_frame_hdr 00000000 .eh_frame_hdr
08048540 l d .eh_frame 00000000 .eh_frame
08049f08 l d .init_array 00000000 .init_array
08049f0c l d .fini_array 00000000 .fini_array
08049f10 l d .jcr 00000000 .jcr
08049f14 l d .dynamic 00000000 .dynamic
08049ffc l d .got 00000000 .got
0804a000 l d .got.plt 00000000 .got.plt
0804a018 l d .data 00000000 .data
0804a02c l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l df *ABS* 00000000 crtstuff.c
08049f10 l O .jcr 00000000 __JCR_LIST__
08048360 l F .text 00000000 deregister_tm_clones
08048390 l F .text 00000000 register_tm_clones
080483d0 l F .text 00000000 __do_global_dtors_aux
0804a02c l O .bss 00000001 completed.6600
08049f0c l O .fini_array 00000000 __do_global_dtors_aux_fini_array_entry
080483f0 l F .text 00000000 frame_dummy
08049f08 l O .init_array 00000000 __frame_dummy_init_array_entry
00000000 l df *ABS* 00000000 main.c
0804a024 l O .data 00000004 local_static_n3.1831
0804a028 l O .data 00000004 local_static_n2.1830
0804a030 l O .bss 00000004 local_static_n1.1829
00000000 l df *ABS* 00000000 crtstuff.c
0804860c l O .eh_frame 00000000 __FRAME_END__
08049f10 l O .jcr 00000000 __JCR_END__
00000000 l df *ABS* 00000000
08049f0c l .init_array 00000000 __init_array_end
08049f14 l O .dynamic 00000000 _DYNAMIC
08049f08 l .init_array 00000000 __init_array_start
0804a000 l O .got.plt 00000000 _GLOBAL_OFFSET_TABLE_
080484e0 g F .text 00000002 __libc_csu_fini
0804a020 g O .data 00000004 global_n2
00000000 w *UND* 00000000 _ITM_deregisterTMCloneTable
08048350 g F .text 00000004 .hidden __x86.get_pc_thunk.bx
0804a018 w .data 00000000 data_start
00000000 F *UND* 00000000 printf@@GLIBC_2.0
0804a02c g .data 00000000 _edata
080484e4 g F .fini 00000000 _fini
0804a018 g .data 00000000 __data_start
00000000 w *UND* 00000000 __gmon_start__
0804a01c g O .data 00000000 .hidden __dso_handle
080484fc g O .rodata 00000004 _IO_stdin_used
00000000 F *UND* 00000000 __libc_start_main@@GLIBC_2.0
0804a034 g O .bss 00000004 global_n1
08048470 g F .text 00000061 __libc_csu_init
0804a038 g .bss 00000000 _end
08048320 g F .text 00000000 _start
080484f8 g O .rodata 00000004 _fp_hw
0804a02c g .bss 00000000 __bss_start
0804842a g F .text 0000003f main
00000000 w *UND* 00000000 _Jv_RegisterClasses
0804a02c g O .data 00000000 .hidden __TMC_END__
00000000 w *UND* 00000000 _ITM_registerTMCloneTable
0804841d g F .text 0000000d sum_func
080482b4 g F .init 00000000 _init
xjh@ubuntu:~/iot/tmp$