Detailed explanation of ELF file format

Summary

This article describes the ELF file format for Linux systems.

Introduction to ELF file format

ELF (Executable and Linkable Format) is a binary file format used to store object files, executable files, shared libraries, and core dumps. It is the standard binary format on Linux.

There are four kinds of ELF files:

1. Executable file: An ELF file that contains a fully linked program and can be run directly by the operating system.

2. Shared object file: An ELF file that contains functions and variables that several executable programs can use at the same time. It is usually called a dynamic link library or shared library. Because multiple programs share one copy of the library, disk and memory usage are reduced.

3. Object file: An ELF file that contains compiled code and data but has not yet been linked into an executable file or shared library. The linker combines one or more object files to produce the final executable or shared library.

4. Core dump file (core file): A file generated on Linux and other UNIX systems when a process crashes or terminates abnormally. It records the state of the process at that moment, including its memory image and register contents. Core files are used with a debugger to find the cause of the failure.

ELF object file structure diagram

The left side of the figure below shows the structure of an ELF object file, and the right side shows the structure of an ELF executable file.

ELF file structure diagram

ELF file structure definition

The ELF file structure is defined in the header file elf.h, and its typical path is:

/usr/include/elf.h

ELF file type

The e_type field of the ELF header identifies the file type. The four types above, plus ET_NONE for "no type", are defined as follows:

#define ET_NONE  0      /* No file type */
#define ET_REL   1      /* Relocatable file type */
#define ET_EXEC  2      /* Executable file type */
#define ET_DYN   3      /* Shared object file type */
#define ET_CORE  4      /* Core file type */
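As a minimal sketch, the e_type values above can be turned into readable names with a switch. The helper name elf_type_name is hypothetical and is not part of elf.h; only the ET_* constants come from the header.

```c
#include <elf.h>
#include <stdint.h>

/* Map an e_type value to a short description.
   elf_type_name is a hypothetical helper for illustration only. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case ET_NONE: return "none";
    case ET_REL:  return "relocatable";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object";
    case ET_CORE: return "core";
    default:      return "unknown";   /* OS- or processor-specific ranges */
    }
}
```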

ELF header

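#define EI_NIDENT (16)

typedef struct elf32_hdr{
    unsigned char e_ident[EI_NIDENT];     /* Magic number and other identification info */
    Elf32_Half    e_type;                 /* ELF file type, see the definitions in the previous section */
    Elf32_Half    e_machine;              /* Target architecture, e.g. Intel 80386 */
    Elf32_Word    e_version;              /* Object file version */
    Elf32_Addr    e_entry;                /* Program entry point */
    Elf32_Off     e_phoff;                /* Program header table file offset */
    Elf32_Off     e_shoff;                /* Section header table file offset */
    Elf32_Word    e_flags;                /* Processor-specific flags */
    Elf32_Half    e_ehsize;               /* ELF header size */
    Elf32_Half    e_phentsize;            /* Size of one program header table entry */
    Elf32_Half    e_phnum;                /* Number of program header table entries */
    Elf32_Half    e_shentsize;            /* Size of one section header table entry */
    Elf32_Half    e_shnum;                /* Number of section header table entries */
    Elf32_Half    e_shstrndx;             /* Section header string table index */
} Elf32_Ehdr;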
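The header sits at offset 0 of the file, so reading it is a single fread into the structure. The following is only a sketch: read_elf32_header is a hypothetical helper, and it assumes the file uses the same byte order as the host, which a real tool would have to check and handle.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Read an Elf32_Ehdr from the start of fp.
   Returns 0 on success, -1 on a short read, bad magic, or non-32-bit class.
   Assumption: file byte order equals host byte order (no swapping is done). */
int read_elf32_header(FILE *fp, Elf32_Ehdr *out)
{
    if (fseek(fp, 0, SEEK_SET) != 0)
        return -1;
    if (fread(out, sizeof *out, 1, fp) != 1)
        return -1;
    if (memcmp(out->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* not an ELF file */
    if (out->e_ident[EI_CLASS] != ELFCLASS32)
        return -1;                      /* 64-bit files need Elf64_Ehdr */
    return 0;
}
```

After a successful call, fields such as e_type, e_entry, e_phoff, and e_shoff can be read directly from the structure.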

The e_ident field holds the magic number and the identification bytes that follow it. For example, as displayed by readelf -h:

Magic:       7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class:       ELF32
Data:        2's complement, little endian
Version:     1 (current)
OS/ABI:      UNIX - System V
ABI Version: 0
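The identification bytes can be decoded with the EI_* index constants and ELFCLASS*/ELFDATA* values from elf.h. The sketch below is illustrative only; the function names is_elf, elf_class_name, and elf_data_name are hypothetical helpers, not library functions.

```c
#include <elf.h>
#include <string.h>

/* Return 1 if the identification array starts with the bytes 7f 45 4c 46. */
int is_elf(const unsigned char ident[EI_NIDENT])
{
    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}

/* Byte EI_CLASS (index 4) selects the 32-bit or 64-bit layout. */
const char *elf_class_name(const unsigned char ident[EI_NIDENT])
{
    switch (ident[EI_CLASS]) {
    case ELFCLASS32: return "ELF32";
    case ELFCLASS64: return "ELF64";
    default:         return "invalid";
    }
}

/* Byte EI_DATA (index 5) selects the byte order of multi-byte fields. */
const char *elf_data_name(const unsigned char ident[EI_NIDENT])
{
    switch (ident[EI_DATA]) {
    case ELFDATA2LSB: return "little endian";
    case ELFDATA2MSB: return "big endian";
    default:          return "invalid";
    }
}
```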

Program header table

The program header table describes the segments that the loader maps into memory. It is present in executable files (ET_EXEC) and shared objects (ET_DYN) and is optional elsewhere.

Relocatable files of type ET_REL (extension .o) normally do not contain one.

Program header entry definition:

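typedef struct elf32_phdr{
  Elf32_Word  p_type;        /* Segment type */
  Elf32_Off   p_offset;      /* Offset of the segment from the start of the file */
  Elf32_Addr  p_vaddr;       /* Virtual address of the segment in memory */
  Elf32_Addr  p_paddr;       /* Physical address of the segment */
  Elf32_Word  p_filesz;      /* Size of the segment in the file */
  Elf32_Word  p_memsz;       /* Size of the segment in memory */
  Elf32_Word  p_flags;       /* Segment flags */
  Elf32_Word  p_align;       /* Segment alignment in memory */
} Elf32_Phdr;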

Example of program header table:

  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
-----------------------------------------------------------------------------
  EXIDX          0x000884 0x00008884 0x00008884 0x00008 0x00008 R   0x4
  PHDR           0x000034 0x00008034 0x00008034 0x00140 0x00140 R E 0x4
  INTERP         0x000174 0x00008174 0x00008174 0x00013 0x00013 R   0x1
  LOAD           0x000000 0x00008000 0x00008000 0x00890 0x00890 R E 0x8000
  LOAD           0x000f04 0x00010f04 0x00010f04 0x0014c 0x00154 RW  0x8000
  DYNAMIC        0x000f10 0x00010f10 0x00010f10 0x000f0 0x000f0 RW  0x4
  NOTE           0x000188 0x00008188 0x00008188 0x00020 0x00020 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4
  GNU_RELRO      0x000f04 0x00010f04 0x00010f04 0x000fc 0x000fc R   0x1
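Two columns of this listing are worth decoding by hand. The Flg column is the p_flags bit mask (PF_R, PF_W, PF_X in elf.h), and a MemSiz larger than FileSiz means the loader zero-fills the remainder, which is how space for .bss is created. In the second LOAD entry above, 0x154 - 0x14c leaves 8 zero-filled bytes. A minimal sketch, with hypothetical helper names:

```c
#include <elf.h>
#include <stdint.h>

/* Render p_flags as a three-letter string such as "R E" or "RW ".
   out must have room for 4 bytes. */
void phdr_flags_string(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'R' : ' ';
    out[1] = (p_flags & PF_W) ? 'W' : ' ';
    out[2] = (p_flags & PF_X) ? 'E' : ' ';
    out[3] = '\0';
}

/* Number of bytes in memory that are not backed by file contents
   and are therefore zero-filled when the segment is loaded. */
uint32_t phdr_zero_fill(const Elf32_Phdr *ph)
{
    return ph->p_memsz > ph->p_filesz ? ph->p_memsz - ph->p_filesz : 0;
}
```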

Section header table

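The section header table is usually located at the end of the ELF file. Its file offset is given by the e_shoff field of the ELF header, the number of entries by e_shnum, and the size of each entry by e_shentsize. The 64-bit form of an entry is defined as follows: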

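typedef struct {
    Elf64_Word    sh_name;      /* Section name: an index into the section header string table */
    Elf64_Word    sh_type;      /* Section type (SHT_*) */
    Elf64_Xword   sh_flags;     /* Section attributes: writable, executable, occupies memory at run time, etc. */
    Elf64_Addr    sh_addr;      /* Virtual address of the section at execution */
    Elf64_Off     sh_offset;    /* File offset of the section */
    Elf64_Xword   sh_size;      /* Section size in bytes */
    Elf64_Word    sh_link;      /* Link to another section */
    Elf64_Word    sh_info;      /* Additional section information */
    Elf64_Xword   sh_addralign; /* Section alignment */
    Elf64_Xword   sh_entsize;   /* Entry size if the section holds a table */
} Elf64_Shdr;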
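Locating an entry and its name takes two small steps: the entry's file offset is e_shoff plus the index times e_shentsize, and sh_name is a byte offset into the section header string table (the section whose index is e_shstrndx). The sketch below assumes the string table has already been loaded into memory; shdr_offset and section_name are hypothetical helpers, and the extended numbering scheme used for very large section counts is ignored.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* File offset of section header number index, or 0 if index is out of range. */
uint64_t shdr_offset(const Elf64_Ehdr *eh, uint16_t index)
{
    if (index >= eh->e_shnum)
        return 0;
    return eh->e_shoff + (uint64_t)index * eh->e_shentsize;
}

/* Name of a section. shstrtab points to the loaded contents of the
   section header string table and shstrtab_size is its length in bytes.
   Returns NULL if sh_name lies outside the table. */
const char *section_name(const char *shstrtab, size_t shstrtab_size,
                         const Elf64_Shdr *sh)
{
    if (sh->sh_name >= shstrtab_size)
        return NULL;
    return shstrtab + sh->sh_name;
}
```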

Section content

The section names and meanings predefined by the system are as follows:

  Name           sh_type        Description
-----------------------------------------------------------------------------
  .text          SHT_PROGBITS   Code section, containing the executable instructions of the program
  .data          SHT_PROGBITS   Initialized data that appears in the program's memory image
  .bss           SHT_NOBITS     Uninitialized data; occupies no space in the file
  .rodata        SHT_PROGBITS   Read-only data
  .comment       SHT_PROGBITS   Version control information
  .dynsym        SHT_DYNSYM     Dynamic linking symbol table
  .shstrtab      SHT_STRTAB     Section header string table (section names)
  .strtab        SHT_STRTAB     String table
  .symtab        SHT_SYMTAB     Symbol table

ELF file analysis

objdump displays information about object files, executables, shared libraries, and static libraries, including disassembly and section contents. It is part of the GNU Binutils suite.

Disassemble the executable sections

objdump -d hello.o

Show file header information

objdump -f hello.o

Display the full contents of a named section

objdump -s -j .comment hello.o

Comparison between ELF and PE

PE is derived from COFF (Common Object File Format), and ELF was designed as the successor to COFF on UNIX System V, so the two formats share many concepts. The comparison between the two is as follows:

  1. An ELF file begins directly with the ELF header and carries no historical DOS baggage. For compatibility with DOS programs, a PE file begins with a DOS header and a DOS stub program; the e_lfanew field of the DOS header gives the file offset of the PE header.
  2. The ELF header roughly corresponds to the image file header of a PE file.
  3. The ELF program header table plays a role similar to the PE optional header.
  4. In an ELF file the section header table usually follows the section contents; its offset is given by the e_shoff field of the ELF header and the number of entries by the e_shnum field. In a PE file the section table precedes the section contents, immediately after the PE optional header, and the number of entries is given by the NumberOfSections field of the file header.
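The difference in item 1 is easy to see when identifying a file from its first bytes. The sketch below checks for the ELF magic at offset 0, and for PE it follows e_lfanew (the little-endian 32-bit value at offset 0x3c of the DOS header) to the "PE\0\0" signature. The enum and the function detect_format are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum file_format { FORMAT_UNKNOWN, FORMAT_ELF, FORMAT_PE };

/* Classify a buffer holding the beginning of a file. */
enum file_format detect_format(const unsigned char *buf, size_t len)
{
    /* ELF: the magic 7f 'E' 'L' 'F' is at the very start of the file. */
    if (len >= 4 && memcmp(buf, "\177ELF", 4) == 0)
        return FORMAT_ELF;

    /* PE: DOS header starts with "MZ"; e_lfanew at 0x3c locates the PE header. */
    if (len >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        uint32_t off = (uint32_t)buf[0x3c]
                     | (uint32_t)buf[0x3d] << 8
                     | (uint32_t)buf[0x3e] << 16
                     | (uint32_t)buf[0x3f] << 24;
        if ((size_t)off + 4 <= len && memcmp(buf + off, "PE\0\0", 4) == 0)
            return FORMAT_PE;
    }
    return FORMAT_UNKNOWN;
}
```

The ELF check needs only four bytes, while the PE check needs one level of indirection through the DOS header.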

The ELF file format is simpler than the PE file format.

Conclusion

ELF is the standard format for executables, shared libraries, and object files on Linux and most other UNIX-like systems. Understanding it is important because:

1. During development, compilers and linkers emit executables and shared libraries in ELF format. Knowing the format helps programmers understand how the compiler and linker work and how their options affect the resulting program.

2. When debugging and optimizing a program, knowledge of ELF helps in locating and interpreting the symbol table, debugging information, code sections, and data sections, which makes it easier to follow program execution and to fix defects.

3. When developing disassemblers or binary analysis tools, knowledge of ELF is needed to parse the structure of executables and shared libraries and to interpret their contents.

4. In security research, knowledge of ELF helps in analyzing exploitation techniques such as stack overflows, format string vulnerabilities, and ROP chains, and in identifying and analyzing malware.

In short, understanding the ELF file format can help us become more efficient and proficient in program development, debugging, analysis, and security.

Origin blog.csdn.net/bigwave2000/article/details/132645847