Linux ELF File Format Analysis

1. Data structure definitions

ELF-related type definitions on Linux:

/* Type for a 16-bit quantity.  */
typedef uint16_t Elf64_Half;

/* Types for signed and unsigned 32-bit quantities.  */
typedef uint32_t Elf64_Word;
typedef int32_t  Elf64_Sword;

/* Types for signed and unsigned 64-bit quantities.  */
typedef uint64_t Elf64_Xword;
typedef int64_t  Elf64_Sxword;

/* Type of addresses.  */
typedef uint64_t Elf64_Addr;

/* Type of file offsets.  */
typedef uint64_t Elf64_Off;

/* Type for section indices, which are 16-bit quantities.  */
typedef uint16_t Elf64_Section;

/* Type for version symbol information.  */
typedef Elf64_Half Elf64_Versym;

typedef struct
{
  unsigned char e_ident[16];            /* Magic number and other info */
  Elf64_Half    e_type;                 /* Object file type */
  Elf64_Half    e_machine;              /* Architecture */
  Elf64_Word    e_version;              /* Object file version */
  Elf64_Addr    e_entry;                /* Entry point virtual address */
  Elf64_Off     e_phoff;                /* Program header table file offset */
  Elf64_Off     e_shoff;                /* Section header table file offset */
  Elf64_Word    e_flags;                /* Processor-specific flags */
  Elf64_Half    e_ehsize;               /* ELF header size in bytes */
  Elf64_Half    e_phentsize;            /* Program header table entry size */
  Elf64_Half    e_phnum;                /* Program header table entry count */
  Elf64_Half    e_shentsize;            /* Section header table entry size */
  Elf64_Half    e_shnum;                /* Section header table entry count */
  Elf64_Half    e_shstrndx;             /* Section header string table index */
} Elf64_Ehdr;

Source file (taken from the book 程序员自我修养):

/*
 * SimpleSection.c
 *
 * Linux:
 *   gcc -c SimpleSection.c
 *
 * Windows:
 *   cl SimpleSection.c /c /Za
 */

int printf( const char* format, ... );
int global_init_var = 84;
int global_uninit_var;

void func1( int i )
{
  printf( "%d\n",  i );
}

int main(void)
{
    static int static_var = 85;
    static int static_var2;

    int a = 1;
    int b;

    func1( static_var + static_var2 + a + b );

    return a;
}

Build it with gcc: gcc SimpleSection.c -o SimpleSection

[root@iot-vm ~]# readelf -h SimpleSection
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400440
  Start of program headers:          64 (bytes into file)
  Start of section headers:          4512 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         30
  Section header string table index: 27
[root@iot-vm ~]# 
[root@iot-vm ~]# 
[root@iot-vm ~]# hexdump -C SimpleSection
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 3e 00 01 00 00 00  40 04 40 00 00 00 00 00  |..>.....@.@.....|
00000020  40 00 00 00 00 00 00 00  a0 11 00 00 00 00 00 00  |@...............|
00000030  00 00 00 00 40 00 38 00  09 00 40 00 1e 00 1b 00  |....@.8...@.....|

Using the hex dump above, we can analyze the header byte by byte.

2. ELF header analysis

e_ident
    7f -- fixed magic byte
    45 4c 46 --> 'E', 'L', 'F'
    02 -> 64-bit class (00 - invalid, 01 - 32-bit, 02 - 64-bit)
    01 -> little-endian (00 - invalid, 01 - little-endian, 02 - big-endian)
    01 -> version, always 1
    00 -> OS/ABI (00 - UNIX System V)
    00 -> ABI version
    00 00 00 00 00 00 00 --> padding

Note that every field after e_ident is stored little-endian; the value in brackets is the same bytes written most significant byte first.
e_type -> 02 00 [00 02]
    00 01 ET_REL  (usually a .o file)
    00 02 ET_EXEC (executable file)
    00 03 ET_DYN  (usually a .so file)


e_machine -- 3e 00 [00 3e] --> #define EM_X86_64       62              /* AMD x86-64 architecture */

e_version -- 01 00 00 00 [00 00 00 01]

e_entry   -- 40 04 40 00 00 00 00 00 [00 00 00 00 00 40 04 40]  --> program entry point address (0x400440)

e_phoff   -- 40 00 00 00 00 00 00 00 [00 00 00 00 00 00 00 40]  --> program header table offset: 64 bytes into the file

e_shoff   -- a0 11 00 00 00 00 00 00 [00 00 00 00 00 00 11 a0]  --> section header table offset: 4512 bytes into the file

e_flags   -- 00 00 00 00

e_ehsize  -- 40 00 [00 40] --> ELF header size -> 64 bytes

e_phentsize -- 38 00 [00 38] --> program header table entry size -> 56 bytes

e_phnum -- 09 00 [00 09] --> program header table entry count -> 9 entries; they can be listed with readelf -l <file>
 
e_shentsize --> 40 00 [00 40] --> section header table entry size -> 64 bytes

e_shnum --> 1e 00 [00 1e] --> section header table entry count -> 30 entries; they can be listed with readelf -S <file>

e_shstrndx --> 1b 00 [00 1b] --> section header string table index -> 27
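
The manual decoding above can also be reproduced in code. The following is a minimal sketch in C, assuming a little-endian ELF64 file and the field offsets implied by the Elf64_Ehdr definition shown earlier. The names read_le, example_ehdr and the EH_* constants are made up for this illustration; the bytes are copied from the hexdump of SimpleSection.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read an unsigned little-endian integer of
 * `size` bytes (1 to 8) starting at p. */
uint64_t read_le(const unsigned char *p, size_t size)
{
    uint64_t value = 0;
    for (size_t i = size; i > 0; i--)
        value = (value << 8) | p[i - 1];   /* highest address holds the top byte */
    return value;
}

/* Byte offsets of the Elf64_Ehdr fields that follow e_ident (bytes 0..15). */
enum {
    EH_TYPE = 16, EH_MACHINE = 18, EH_VERSION = 20, EH_ENTRY = 24,
    EH_PHOFF = 32, EH_SHOFF = 40, EH_FLAGS = 48, EH_EHSIZE = 52,
    EH_PHENTSIZE = 54, EH_PHNUM = 56, EH_SHENTSIZE = 58,
    EH_SHNUM = 60, EH_SHSTRNDX = 62
};

/* The first 64 bytes of SimpleSection, copied from the hexdump. */
const unsigned char example_ehdr[64] = {
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x3e, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xa0, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00,
    0x09, 0x00, 0x40, 0x00, 0x1e, 0x00, 0x1b, 0x00
};
```

For instance, read_le(example_ehdr + EH_SHOFF, 8) gives 4512 (0x11a0) and read_le(example_ehdr + EH_SHSTRNDX, 2) gives 27, matching the readelf -h output. Reading byte by byte avoids depending on the host's endianness or on struct alignment.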

3. Program Header

sizeof(Elf64_Phdr) is 56 bytes (p_type and p_flags are 32-bit Elf64_Word fields):

typedef struct {
    Elf64_Word    p_type;      /* entry type */
    Elf64_Word    p_flags;     /* flags */
    Elf64_Off     p_offset;    /* offset */
    Elf64_Addr    p_vaddr;     /* virtual address */
    Elf64_Addr    p_paddr;     /* physical address */
    Elf64_Xword   p_filesz;    /* file size */
    Elf64_Xword   p_memsz;     /* memory size */
    Elf64_Xword   p_align;     /* memory & file alignment */
} Elf64_Phdr;

From the ELF header analysis above, there are 9 program headers in total:

e_phentsize -- 38 00 [00 38] --> program header table entry size -> 56 bytes

e_phnum -- 09 00 [00 09] --> program header table entry count -> 9 entries; they can be listed with readelf -l <file>
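
The entry size and the position of each program header follow directly from these fields. Below is a small sketch, assuming the standard ELF64 layout in which p_type and p_flags are 32-bit words. The struct example_phdr (a local mirror of Elf64_Phdr built from fixed-width types) and the helper phdr_offset are illustrative names, not part of any header.

```c
#include <stdint.h>

/* Local mirror of Elf64_Phdr using fixed-width types (illustrative name). */
struct example_phdr {
    uint32_t p_type;    /* Elf64_Word  */
    uint32_t p_flags;   /* Elf64_Word  */
    uint64_t p_offset;  /* Elf64_Off   */
    uint64_t p_vaddr;   /* Elf64_Addr  */
    uint64_t p_paddr;   /* Elf64_Addr  */
    uint64_t p_filesz;  /* Elf64_Xword */
    uint64_t p_memsz;   /* Elf64_Xword */
    uint64_t p_align;   /* Elf64_Xword */
};

/* 4 + 4 + 6 * 8 = 56 bytes, the value stored in e_phentsize. */
_Static_assert(sizeof(struct example_phdr) == 56,
               "ELF64 program header entries are 56 bytes");

/* File offset of program header `index`: e_phoff + index * e_phentsize. */
uint64_t phdr_offset(uint64_t e_phoff, uint16_t e_phentsize, uint16_t index)
{
    return e_phoff + (uint64_t)e_phentsize * index;
}
```

With e_phoff = 64 and e_phentsize = 56, the nine entries of the example file occupy file offsets 64 through 567, and the last one starts at phdr_offset(64, 56, 8) = 512.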

The possible values of p_type and p_flags are:

#define PT_NULL         0               /* Program header table entry unused */
#define PT_LOAD         1               /* Loadable program segment */
#define PT_DYNAMIC      2               /* Dynamic linking information */
#define PT_INTERP       3               /* Program interpreter */
#define PT_NOTE         4               /* Auxiliary information */
#define PT_SHLIB        5               /* Reserved */
#define PT_PHDR         6               /* Entry for header table itself */
#define PT_TLS          7               /* Thread-local storage segment */
#define PT_NUM          8               /* Number of defined types */
#define PT_LOOS         0x60000000      /* Start of OS-specific */
#define PT_GNU_EH_FRAME 0x6474e550      /* GCC .eh_frame_hdr segment */
#define PT_GNU_STACK    0x6474e551      /* Indicates stack executability */
#define PT_GNU_RELRO    0x6474e552      /* Read-only after relocation */
#define PT_LOSUNW       0x6ffffffa
#define PT_SUNWBSS      0x6ffffffa      /* Sun Specific segment */
#define PT_SUNWSTACK    0x6ffffffb      /* Stack segment */
#define PT_HISUNW       0x6fffffff
#define PT_HIOS         0x6fffffff      /* End of OS-specific */
#define PT_LOPROC       0x70000000      /* Start of processor-specific */
#define PT_HIPROC       0x7fffffff      /* End of processor-specific */

/* Legal values for p_flags (segment flags).  */

#define PF_X            (1 << 0)        /* Segment is executable */
#define PF_W            (1 << 1)        /* Segment is writable */
#define PF_R            (1 << 2)        /* Segment is readable */
#define PF_MASKOS       0x0ff00000      /* OS-specific */
#define PF_MASKPROC     0xf0000000      /* Processor-specific */
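
Since p_flags is a bit mask, a segment's permissions are read by testing each bit. The sketch below formats the flags as a three-character string, similar in spirit to the flags column printed by readelf -l. The function format_p_flags and the SEG_* constants are hypothetical names; the constants simply repeat the PF_X, PF_W and PF_R values so that the snippet compiles on its own.

```c
#include <stdint.h>

/* Same values as PF_X, PF_W and PF_R, repeated under illustrative names. */
enum { SEG_X = 1 << 0, SEG_W = 1 << 1, SEG_R = 1 << 2 };

/* Write "RWE"-style permissions into out (4 bytes) and return out.
 * A cleared bit is shown as a space. */
char *format_p_flags(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & SEG_R) ? 'R' : ' ';
    out[1] = (p_flags & SEG_W) ? 'W' : ' ';
    out[2] = (p_flags & SEG_X) ? 'E' : ' ';
    out[3] = '\0';
    return out;
}
```

A typical code segment has p_flags = 5 (PF_R | PF_X), which this helper renders as "R E"; a typical data segment has p_flags = 6 (PF_R | PF_W), rendered as "RW ".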

4. Section Header

sizeof(Elf64_Shdr) is 64 bytes:

typedef struct
{
  Elf64_Word    sh_name;                /* Section name (string tbl index) */
  Elf64_Word    sh_type;                /* Section type */
  Elf64_Xword   sh_flags;               /* Section flags */
  Elf64_Addr    sh_addr;                /* Section virtual addr at execution */
  Elf64_Off     sh_offset;              /* Section file offset */
  Elf64_Xword   sh_size;                /* Section size in bytes */
  Elf64_Word    sh_link;                /* Link to another section */
  Elf64_Word    sh_info;                /* Additional section information */
  Elf64_Xword   sh_addralign;           /* Section alignment */
  Elf64_Xword   sh_entsize;             /* Entry size if section holds table */
} Elf64_Shdr;

Below are the section types, i.e. the possible values of sh_type. Pay particular attention to SHT_SYMTAB (symbol table) and SHT_STRTAB (string table).

#define SHT_NULL          0             /* Section header table entry unused */
#define SHT_PROGBITS      1             /* Program data */
#define SHT_SYMTAB        2             /* Symbol table */
#define SHT_STRTAB        3             /* String table */
#define SHT_RELA          4             /* Relocation entries with addends */
#define SHT_HASH          5             /* Symbol hash table */
#define SHT_DYNAMIC       6             /* Dynamic linking information */
#define SHT_NOTE          7             /* Notes */
#define SHT_NOBITS        8             /* Program space with no data (bss) */
#define SHT_REL           9             /* Relocation entries, no addends */
#define SHT_SHLIB         10            /* Reserved */
#define SHT_DYNSYM        11            /* Dynamic linker symbol table */
#define SHT_INIT_ARRAY    14            /* Array of constructors */
#define SHT_FINI_ARRAY    15            /* Array of destructors */
#define SHT_PREINIT_ARRAY 16            /* Array of pre-constructors */
#define SHT_GROUP         17            /* Section group */
#define SHT_SYMTAB_SHNDX  18            /* Extended section indices */
#define SHT_NUM           19            /* Number of defined types.  */
#define SHT_LOOS          0x60000000    /* Start OS-specific.  */
#define SHT_GNU_ATTRIBUTES 0x6ffffff5   /* Object attributes.  */
#define SHT_GNU_HASH      0x6ffffff6    /* GNU-style hash table.  */
#define SHT_GNU_LIBLIST   0x6ffffff7    /* Prelink library list */
#define SHT_CHECKSUM      0x6ffffff8    /* Checksum for DSO content.  */
#define SHT_LOSUNW        0x6ffffffa    /* Sun-specific low bound.  */
#define SHT_SUNW_move     0x6ffffffa
#define SHT_SUNW_COMDAT   0x6ffffffb
#define SHT_SUNW_syminfo  0x6ffffffc
#define SHT_GNU_verdef    0x6ffffffd    /* Version definition section.  */
#define SHT_GNU_verneed   0x6ffffffe    /* Version needs section.  */
#define SHT_GNU_versym    0x6fffffff    /* Version symbol table.  */
#define SHT_HISUNW        0x6fffffff    /* Sun-specific high bound.  */
#define SHT_HIOS          0x6fffffff    /* End OS-specific type */
#define SHT_LOPROC        0x70000000    /* Start of processor-specific */
#define SHT_HIPROC        0x7fffffff    /* End of processor-specific */
#define SHT_LOUSER        0x80000000    /* Start of application-specific */
#define SHT_HIUSER        0x8fffffff    /* End of application-specific */

The section header table starts at file offset 0x11a0. Each entry is 64 bytes, and there are 30 entries in total:

e_shoff   -- a0 11 00 00 00 00 00 00 [00 00 00 00 00 00 11 a0]  --> section header table offset: 4512 bytes into the file

e_shentsize --> 40 00 [00 40] --> section header table entry size -> 64 bytes

e_shnum --> 1e 00 [00 1e] --> section header table entry count -> 30 entries; they can be listed with readelf -S <file>

e_shstrndx --> 1b 00 [00 1b] --> section header string table index -> 27
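
These three fields are enough to convert between a section index and the file offset of its header. The following sketch shows both directions; shdr_offset and shdr_index are hypothetical helper names introduced only for this illustration.

```c
#include <stdint.h>

/* File offset of section header `index`: e_shoff + index * e_shentsize. */
uint64_t shdr_offset(uint64_t e_shoff, uint16_t e_shentsize, uint16_t index)
{
    return e_shoff + (uint64_t)e_shentsize * index;
}

/* Inverse mapping: index of the section header that starts at file_offset,
 * or -1 if the offset is not the start of an entry inside the table. */
long shdr_index(uint64_t e_shoff, uint16_t e_shentsize, uint16_t e_shnum,
                uint64_t file_offset)
{
    if (e_shentsize == 0 || file_offset < e_shoff)
        return -1;

    uint64_t delta = file_offset - e_shoff;
    if (delta % e_shentsize != 0 || delta / e_shentsize >= e_shnum)
        return -1;

    return (long)(delta / e_shentsize);
}
```

For the example file, shdr_offset(0x11a0, 64, 27) is 0x1860, and the header dumps at 0x1320 and 0x18e0 in this section map back to indices 6 and 29.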

Examples:

First, the entry starting at 0x000011a0 is all zeros: this is the NULL section header (SHT_NULL).

000011a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000011b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000011c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000011d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

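Second, the string table sections (sh_type = 3, SHT_STRTAB). This example has three of them, at section header indices 6, 27 and 29: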

00001320  56 00 00 00 03 00 00 00  02 00 00 00 00 00 00 00  |V...............|
00001330  18 03 40 00 00 00 00 00  18 03 00 00 00 00 00 00  |..@.............|
00001340  3f 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |?...............|
00001350  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

00001860  11 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00001870  00 00 00 00 00 00 00 00  95 10 00 00 00 00 00 00  |................|
00001880  08 01 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001890  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

000018e0  09 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
000018f0  00 00 00 00 00 00 00 00  b0 1f 00 00 00 00 00 00  |................|
00001900  8a 02 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001910  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

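Analysis of the first string table header (at 0x1320):

sh_name  -- 56 00 00 00  [00 00 00 56] --> how is the name determined?
sh_type  -- 03 00 00 00  [00 00 00 03] --> SHT_STRTAB
sh_flags -- 02 00 00 00 00 00 00 00  [00 00 00 00 00 00 00 02] --> SHF_ALLOC

How is the name determined? The value 0x56 is an index (a byte offset) into the section header string table. Resolving it takes three values: e_shstrndx, sh_offset and sh_name.

e_shstrndx --> the last field of the ELF header; it is the index of the section header string table within the section header table. Its value here is 27.

sh_offset --> the offset of a section's contents from the start of the file (file position 0)

sh_name --> the offset of the current section's name within that string table

In code: file_base + shdrs[shstrndx].sh_offset + shdr->sh_name (where file_base is the address of the very first byte of the file)

1) First locate section header 27 (0x11a0 + 27 * 0x40 = 0x1860)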

00001860  11 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00001870  00 00 00 00 00 00 00 00  95 10 00 00 00 00 00 00  |................|
00001880  08 01 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001890  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

Field by field:

sh_name      -- 11 00 00 00   [00 00 00 11]
sh_type      -- 03 00 00 00   [00 00 00 03]
sh_flags     -- 00 00 00 00 00 00 00 00
sh_addr      -- 00 00 00 00 00 00 00 00
sh_offset    -- 95 10 00 00 00 00 00 00  [00 00 00 00 00 00 10 95]
sh_size      -- 08 01 00 00 00 00 00 00  [00 00 00 00 00 00 01 08]
sh_link      -- 00 00 00 00
sh_info      -- 00 00 00 00
sh_addralign -- 01 00 00 00 00 00 00 00  [00 00 00 00 00 00 00 01]
sh_entsize   -- 00 00 00 00 00 00 00 00

2) Start of the string table

file_base + shdrs[shstrndx].sh_offset  => 0 + shdrs[27].sh_offset => 0 + 0x1095 = 0x1095

3) Resolve the section name from sh_name

file_base + shdrs[shstrndx].sh_offset + shdr->sh_name => 0x1095 + 0x56 = 0x10eb => ".dynstr"

Other section names are resolved the same way.
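
The three steps can be sketched as one C function that works on the raw bytes of a file loaded into memory. This is only an illustration under stated assumptions: the file is little-endian ELF64, section_name and read_le64 are hypothetical names, and the offsets 24 and 32 are the positions of sh_offset and sh_size inside Elf64_Shdr. The bounds checks are an addition of this sketch and are not part of the formula itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SH_OFFSET = 24, SH_SIZE = 32, SHDR_SIZE = 64 };  /* Elf64_Shdr layout */

/* Read a 64-bit little-endian value from p. */
uint64_t read_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Return the NUL-terminated name found at offset sh_name in the section
 * header string table, or NULL if anything falls outside the file. */
const char *section_name(const unsigned char *file_base, size_t file_size,
                         uint64_t e_shoff, uint16_t e_shentsize,
                         uint16_t e_shstrndx, uint32_t sh_name)
{
    /* Step 1: locate shdrs[e_shstrndx] in the section header table. */
    if (e_shentsize < SHDR_SIZE || e_shoff > file_size)
        return NULL;
    uint64_t entry = e_shoff + (uint64_t)e_shentsize * e_shstrndx;
    if (entry > file_size || file_size - entry < SHDR_SIZE)
        return NULL;

    /* Step 2: the string table starts at file_base + sh_offset. */
    uint64_t str_off  = read_le64(file_base + entry + SH_OFFSET);
    uint64_t str_size = read_le64(file_base + entry + SH_SIZE);
    if (str_off > file_size || str_size > file_size - str_off ||
        sh_name >= str_size)
        return NULL;

    /* Step 3: the name is at file_base + sh_offset + sh_name. */
    const unsigned char *name = file_base + str_off + sh_name;
    if (memchr(name, '\0', (size_t)(str_size - sh_name)) == NULL)
        return NULL;   /* not terminated inside the table */
    return (const char *)name;
}
```

For the example file the call would use e_shoff = 0x11a0, e_shentsize = 64, e_shstrndx = 27 and sh_name = 0x56, so the lookup reads the string at file offset 0x10eb, which the analysis above identifies as ".dynstr".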

5. Other

For the actual loading and parsing process, see the function load_elf_binary in the Linux kernel source file fs/binfmt_elf.c.

Reposted from blog.csdn.net/xxb249/article/details/118572367