What segments does an ELF format file consist of?

The following content was compiled from online resources while studying this topic. If anything here infringes your rights, please let me know and I will remove it.


1. Introduction to ELF format

Under Linux, the .o object file produced by compiling and assembling C/C++ source code, and the executable program produced by linking, are generally in the ELF file format (Executable and Linkable Format).

Object files (relocatable files), executable files, dynamic link libraries (shared object files), and core dump files on Linux are all in ELF format.
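As a rough illustration of how these four kinds of file can be told apart, the sketch below reads the e_type field of the ELF header from a byte buffer. The function name elf_file_kind and the enum are made up for this example; the offsets and values (magic bytes, EI_DATA at index 5, e_type at offset 16) follow the ELF specification.

```c
#include <stddef.h>

/* Hypothetical names for this sketch; the numeric values match e_type in the ELF header. */
enum elf_kind {
    KIND_NOT_ELF     = -1,
    KIND_NONE        = 0,  /* ET_NONE */
    KIND_RELOCATABLE = 1,  /* ET_REL:  .o object file */
    KIND_EXECUTABLE  = 2,  /* ET_EXEC: executable file */
    KIND_SHARED      = 3,  /* ET_DYN:  shared object */
    KIND_CORE        = 4,  /* ET_CORE: core dump */
    KIND_OTHER       = 5   /* OS- or processor-specific values */
};

/* Classify the first bytes of a file. buf must hold at least the
 * 16-byte e_ident array plus the 2-byte e_type field that follows it. */
enum elf_kind elf_file_kind(const unsigned char *buf, size_t len)
{
    unsigned type;

    if (len < 18)
        return KIND_NOT_ELF;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return KIND_NOT_ELF;

    if (buf[5] == 1)            /* ELFDATA2LSB: little-endian fields */
        type = buf[16] | (unsigned)buf[17] << 8;
    else if (buf[5] == 2)       /* ELFDATA2MSB: big-endian fields */
        type = (unsigned)buf[16] << 8 | buf[17];
    else
        return KIND_NOT_ELF;

    return type <= 4 ? (enum elf_kind)type : KIND_OTHER;
}
```

To try it, read the first 18 bytes of a file with fread and pass them in. Note that position-independent executables built by many current toolchains report the shared-object type rather than the executable type, so this field alone does not always separate the two.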

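An ELF file is organized into sections. They are often loosely called segments, as in this article; strictly speaking, a segment is a group of sections that the loader maps into memory. Code can add sections of its own, but a program file usually contains at least the following:

  • text section (.text): the code section, which stores the compiled machine code. Note that non-static local variables are not stored in any section of the file: they are created on the stack (or held in registers) at run time, and only the instructions that operate on them are in .text.
  • data section (.data): stores initialized global variables and initialized local static variables.
  • bss section (.bss): holds uninitialized global variables and uninitialized local static variables (their default value is 0). It takes up no space in the file; only its size is recorded.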
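The mapping from declarations to sections can be sketched as follows. This is only an illustration, assuming GCC or Clang targeting ELF: the section attribute is a compiler extension, and the section name .my_section and every identifier here are invented for the example.

```c
/* placement.c: build with `gcc -c placement.c`, then inspect with `objdump -t placement.o` */

long initialized_global = 10;       /* non-zero initializer: .data */
long zero_global;                   /* no initializer: .bss (may show as *COM* in the .o file) */
const long read_only_global = 7;    /* const data: typically .rodata */

/* Compiler extension: ask for a section of our own (hypothetical name). */
__attribute__((section(".my_section"))) long custom_placed = 99;

long next_id(void)                  /* the machine code of the function: .text */
{
    static long last_id = 100;      /* initialized local static: .data */
    long step = 1;                  /* automatic local: stack or register, no section */

    last_id += step;
    return last_id;
}
```

Because last_id lives in .data rather than on the stack, it keeps its value between calls, so successive calls to next_id return 101, 102, and so on.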

This article describes the format of the ELF file, not the layout of a process's address space. The two are related, though: when the ELF file is executed, each region of the process address space gets its contents (or, for .bss, its size) from the corresponding section of the file.
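One way to observe this relationship at run time is through the symbols etext, edata and end, which the linker provides on Linux (see `man 3 end`). The sketch below is an assumption-laden illustration, not a general tool: it relies on the usual Linux layout in which code comes first, then initialized data, then .bss, and the names region and classify_address are made up here.

```c
#include <stdint.h>

/* Provided by the linker on Linux; they mark addresses and must be declared by hand. */
extern char etext, edata, end;

enum region { REGION_TEXT, REGION_DATA, REGION_BSS, REGION_OTHER };

long initialized_global = 10;   /* contents come from the file's .data section */
long zero_global;               /* only sized in the file; zero-filled at load time (.bss) */

/* Classify an address against the boundaries of the loaded program image. */
enum region classify_address(uintptr_t addr)
{
    if (addr < (uintptr_t)&etext)
        return REGION_TEXT;     /* at or below the end of the code */
    if (addr < (uintptr_t)&edata)
        return REGION_DATA;     /* read-only and initialized data */
    if (addr < (uintptr_t)&end)
        return REGION_BSS;      /* zero-initialized data */
    return REGION_OTHER;        /* heap, stack, shared libraries, ... */
}
```

Passing the address of classify_address itself should give REGION_TEXT, initialized_global REGION_DATA, and zero_global REGION_BSS. A non-static local variable falls into REGION_OTHER, since it lives on the stack and has no counterpart in the file.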

2. Examples

This article takes a simple C program as an example to illustrate how the C/C++ code corresponds to the segments in the ELF file.

//main.c
#include <stdio.h>

long global_n1; // uninitialized global: defaults to 0 (a pointer would default to NULL)
long global_n2 = 10; 

long sum_func(long a, long b)  
{
  static long local_static_n1; // uninitialized local static: defaults to 0
  static long local_static_n2 = 123;
  static long local_static_n3 = 456;
  return a + b;
}

int main(void)
{
  long sum = sum_func(global_n1, global_n2);
  printf("sum=%ld\n", sum);
  return 0;
}

To simplify the analysis, we compile the code above into an object file rather than an executable, because linking an executable pulls in many additional symbols and sections.

gcc -c -o main.o main.c

Then use the objdump tool to inspect the main.o object file. The -t option prints the symbol table (disassembly would be -d instead).

xjh@ubuntu:~/iot/tmp$ objdump -t main.o

main.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    df *ABS*	00000000 main.c
00000000 l    d  .text	00000000 .text
00000000 l    d  .data	00000000 .data
00000000 l    d  .bss	00000000 .bss
00000000 l    d  .rodata	00000000 .rodata
00000004 l     O .data	00000004 local_static_n3.1831
00000008 l     O .data	00000004 local_static_n2.1830
00000000 l     O .bss	00000004 local_static_n1.1829
00000000 l    d  .note.GNU-stack	00000000 .note.GNU-stack
00000000 l    d  .eh_frame	00000000 .eh_frame
00000000 l    d  .comment	00000000 .comment
00000004       O *COM*	00000004 global_n1
00000000 g     O .data	00000004 global_n2
00000000 g     F .text	0000000d sum_func
0000000d g     F .text	0000003f main
00000000         *UND*	00000000 printf


xjh@ubuntu:~/iot/tmp$ 

From this we can tell which section each symbol (function name, variable name) belongs to:

Section  Symbols                                       Remark
.data    local_static_n3, local_static_n2, global_n2   Initialized global variables and initialized local static variables.
.bss     local_static_n1                               Uninitialized local static variable.
*COM*    global_n1                                     Uninitialized global variable. In the object file it is only a common symbol; the linker later places it in .bss.
.text    sum_func, main                                Code.

The size of each section can be viewed with the size command. As shown below, the text column is 173 bytes. That is more than the machine code of sum_func and main, which totals 0x0d + 0x3f = 76 bytes according to the symbol table, because size also counts read-only sections such as .rodata and .eh_frame under text. The data column is 12 bytes (the three 4-byte variables in .data), and the bss column is 4 bytes (local_static_n1 only; global_n1 is still a common symbol at this stage and is not counted until the linker assigns it to .bss).

xjh@ubuntu:~/iot/tmp$ size main.o
   text	   data	    bss	    dec	    hex	filename
    173	     12	      4	    189	     bd	main.o
xjh@ubuntu:~/iot/tmp$

As mentioned above, linking the program into an executable introduces many additional symbols and sections. Let's verify this.

xjh@ubuntu:~/iot/tmp$ gcc main.o -o main.elf
xjh@ubuntu:~/iot/tmp$ objdump -t main.elf 

main.elf:     file format elf32-i386

SYMBOL TABLE:
08048154 l    d  .interp	00000000              .interp
08048168 l    d  .note.ABI-tag	00000000              .note.ABI-tag
08048188 l    d  .note.gnu.build-id	00000000              .note.gnu.build-id
080481ac l    d  .gnu.hash	00000000              .gnu.hash
080481cc l    d  .dynsym	00000000              .dynsym
0804821c l    d  .dynstr	00000000              .dynstr
08048268 l    d  .gnu.version	00000000              .gnu.version
08048274 l    d  .gnu.version_r	00000000              .gnu.version_r
08048294 l    d  .rel.dyn	00000000              .rel.dyn
0804829c l    d  .rel.plt	00000000              .rel.plt
080482b4 l    d  .init	00000000              .init
080482e0 l    d  .plt	00000000              .plt
08048320 l    d  .text	00000000              .text
080484e4 l    d  .fini	00000000              .fini
080484f8 l    d  .rodata	00000000              .rodata
0804850c l    d  .eh_frame_hdr	00000000              .eh_frame_hdr
08048540 l    d  .eh_frame	00000000              .eh_frame
08049f08 l    d  .init_array	00000000              .init_array
08049f0c l    d  .fini_array	00000000              .fini_array
08049f10 l    d  .jcr	00000000              .jcr
08049f14 l    d  .dynamic	00000000              .dynamic
08049ffc l    d  .got	00000000              .got
0804a000 l    d  .got.plt	00000000              .got.plt
0804a018 l    d  .data	00000000              .data
0804a02c l    d  .bss	00000000              .bss
00000000 l    d  .comment	00000000              .comment
00000000 l    df *ABS*	00000000              crtstuff.c
08049f10 l     O .jcr	00000000              __JCR_LIST__
08048360 l     F .text	00000000              deregister_tm_clones
08048390 l     F .text	00000000              register_tm_clones
080483d0 l     F .text	00000000              __do_global_dtors_aux
0804a02c l     O .bss	00000001              completed.6600
08049f0c l     O .fini_array	00000000              __do_global_dtors_aux_fini_array_entry
080483f0 l     F .text	00000000              frame_dummy
08049f08 l     O .init_array	00000000              __frame_dummy_init_array_entry
00000000 l    df *ABS*	00000000              main.c
0804a024 l     O .data	00000004              local_static_n3.1831
0804a028 l     O .data	00000004              local_static_n2.1830
0804a030 l     O .bss	00000004              local_static_n1.1829
00000000 l    df *ABS*	00000000              crtstuff.c
0804860c l     O .eh_frame	00000000              __FRAME_END__
08049f10 l     O .jcr	00000000              __JCR_END__
00000000 l    df *ABS*	00000000              
08049f0c l       .init_array	00000000              __init_array_end
08049f14 l     O .dynamic	00000000              _DYNAMIC
08049f08 l       .init_array	00000000              __init_array_start
0804a000 l     O .got.plt	00000000              _GLOBAL_OFFSET_TABLE_
080484e0 g     F .text	00000002              __libc_csu_fini
0804a020 g     O .data	00000004              global_n2
00000000  w      *UND*	00000000              _ITM_deregisterTMCloneTable
08048350 g     F .text	00000004              .hidden __x86.get_pc_thunk.bx
0804a018  w      .data	00000000              data_start
00000000       F *UND*	00000000              printf@@GLIBC_2.0
0804a02c g       .data	00000000              _edata
080484e4 g     F .fini	00000000              _fini
0804a018 g       .data	00000000              __data_start
00000000  w      *UND*	00000000              __gmon_start__
0804a01c g     O .data	00000000              .hidden __dso_handle
080484fc g     O .rodata	00000004              _IO_stdin_used
00000000       F *UND*	00000000              __libc_start_main@@GLIBC_2.0
0804a034 g     O .bss	00000004              global_n1
08048470 g     F .text	00000061              __libc_csu_init
0804a038 g       .bss	00000000              _end
08048320 g     F .text	00000000              _start
080484f8 g     O .rodata	00000004              _fp_hw
0804a02c g       .bss	00000000              __bss_start
0804842a g     F .text	0000003f              main
00000000  w      *UND*	00000000              _Jv_RegisterClasses
0804a02c g     O .data	00000000              .hidden __TMC_END__
00000000  w      *UND*	00000000              _ITM_registerTMCloneTable
0804841d g     F .text	0000000d              sum_func
080482b4 g     F .init	00000000              _init


xjh@ubuntu:~/iot/tmp$ 


Origin blog.csdn.net/oqqHuTu12345678/article/details/129486106