2019-2020-1 20199315 "Linux Kernel Principles and Analysis": Week 8 Assignment

How executable programs work

Compilation

On Linux, compiling C source code is divided into four steps: preprocessing, compilation, assembly, and linking.

Preprocessing

The preprocessing stage does the following:

  • Removes all comments;

  • Removes all #define directives and expands all macro definitions;

  • Processes all conditional compilation directives;

  • Processes #include directives by inserting the included file at the position of the directive;

  • Adds file name and line number markers.

The corresponding shell command is:

gcc -E hello.c -o hello.i
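
To make the preprocessing steps above concrete, here is a small C sketch. The names SQUARE, VERBOSE, LABEL, square_area, and build_label are invented for illustration and are not taken from the original hello.c. Running gcc -E on a file like this should show the comments gone, SQUARE expanded in place, and only one branch of the #ifdef kept.

```c
#include <stdio.h>               /* replaced by the full text of stdio.h */

/* This comment disappears after preprocessing. */
#define SQUARE(x) ((x) * (x))    /* hypothetical macro, expanded where it is used */

#ifdef VERBOSE                   /* hypothetical flag; enable with gcc -DVERBOSE */
#define LABEL "verbose"
#else
#define LABEL "quiet"
#endif

/* After gcc -E the body reads: return ((side) * (side)); */
int square_area(int side)
{
    return SQUARE(side);
}

/* After gcc -E, LABEL is replaced by one of the two string literals. */
const char *build_label(void)
{
    return LABEL;
}
```

Compiling with and without -DVERBOSE and comparing the two .i files is a quick way to see conditional compilation at work.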

Compilation

The compilation stage translates the preprocessed high-level language code into assembly language.

The corresponding shell command is:

gcc -S hello.c -o hello.s -m32

Assembly

The assembly stage translates the assembly language code into binary machine code.

The corresponding shell command is:

gcc -c hello.c -o hello.o -m32

Linking

The linking stage collects the code and data produced by the earlier stages and combines them into a single file that can be loaded into memory and executed.

The corresponding shell command is:

gcc hello.o -o hello -m32 -static

I wrote a simple hello world program as a test:

ELF executable files

ELF (Executable and Linking Format) is an object file format. It defines what each type of object file contains and the layout in which that content is stored.

ELF file type

ELF files fall into three categories:

  • Relocatable file: holds code and data suitable for linking with other object files to create an executable file or a shared object file. (On Linux these are object files with the .o suffix; static libraries with the .a suffix are archives of such files.)

  • Executable file: holds a program ready to be executed. (E.g. bash, gcc, etc.)

  • Shared object file: a shared library. Holds code and data suitable for use by the link editor and by the dynamic linker. (The file extension is .so on Linux.)
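
The three categories correspond to values of the e_type field in the ELF header. As a minimal sketch, the function below maps those values to the category names used above. The numeric values 1, 2, and 3 come from the ELF specification; the enum names and elf_type_name are made up for this example.

```c
/* e_type values from the ELF specification (ET_REL, ET_EXEC, ET_DYN in <elf.h>). */
enum {
    TYPE_REL  = 1,   /* relocatable file */
    TYPE_EXEC = 2,   /* executable file */
    TYPE_DYN  = 3    /* shared object file */
};

/* Hypothetical helper: name the category for a given e_type value. */
const char *elf_type_name(unsigned e_type)
{
    switch (e_type) {
    case TYPE_REL:  return "relocatable file";
    case TYPE_EXEC: return "executable file";
    case TYPE_DYN:  return "shared object file";
    default:        return "other";
    }
}
```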

ELF file format

The main structure of an ELF file is shown below:

ELF Header

The ELF header is 52 bytes long in a 32-bit file (64 bytes in a 64-bit file). It records basic information about the file: the file class (32-bit / 64-bit), the program entry address, and the start offsets, entry sizes, and entry counts of the other parts of the ELF file.

Use the readelf -h hello command in the shell to view the ELF header of the executable file hello:

Section Header Table

Sections are the smallest containers for data in an ELF file. All the content held in one section has the same nature, for example:

1) the .text section holds executable code;

2) the .data section holds initialized data;

3) the .bss section holds uninitialized data;

4) sections whose names begin with .rel hold relocation entries;

5) the .symtab or .dynsym section holds symbol information;

6) the .strtab or .dynstr section holds string information;

7) other sections serve other purposes, such as debugging or dynamic linking and loading.

Which sections an ELF file actually contains is determined by its Section Header Table (SHT). The SHT has one entry per section that describes that section: mainly its name, type, size, and byte offset within the ELF file.

Use the readelf -S hello command in the shell to view the Section Header Table of the executable file hello:

As the ELF header indicated, the Section Header Table has 31 entries in total.

Among them, the .symtab section stores the program's symbol table, which can be viewed with readelf -s hello:

The Type column gives the type of each symbol and the Bind column gives its binding; together they make up the st_info field.
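
As a small sketch of how the two columns share one byte: in the ELF specification the binding occupies the upper four bits of st_info and the type the lower four bits. The helper names below are invented for this example; the bit layout and the numeric values match the ELF32_ST_BIND and ELF32_ST_TYPE macros in <elf.h>.

```c
/* Upper four bits of st_info: symbol binding (0 LOCAL, 1 GLOBAL, 2 WEAK). */
unsigned sym_bind(unsigned char st_info)
{
    return st_info >> 4;
}

/* Lower four bits of st_info: symbol type (0 NOTYPE, 1 OBJECT, 2 FUNC, 3 SECTION, 4 FILE). */
unsigned sym_type(unsigned char st_info)
{
    return st_info & 0xf;
}

/* Hypothetical helpers that turn the numbers into the labels readelf -s prints. */
const char *bind_name(unsigned bind)
{
    static const char *names[] = { "LOCAL", "GLOBAL", "WEAK" };
    return bind < 3 ? names[bind] : "OTHER";
}

const char *type_name(unsigned type)
{
    static const char *names[] = { "NOTYPE", "OBJECT", "FUNC", "SECTION", "FILE" };
    return type < 5 ? names[type] : "OTHER";
}
```

For example, an st_info byte of 0x12 decodes to binding 1 and type 2, which readelf -s shows as GLOBAL and FUNC.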

Program Header Table

In the linking step, the sections of the relocatable object files serve as the linker's input, so these sections are often called input sections.

While linking an executable file or a dynamic library, the linker merges sections of the same name from the different relocatable object files into a single section of that name. It then merges sections that share the same attributes (for example, read-only and loadable) into so-called segments. Segments are the linker's output and are often referred to as output sections.

A single segment typically contains several different sections. For example, a loadable, read-only segment usually includes the executable code section .text, the read-only data section .rodata, and the symbol section .dynsym used by the dynamic linker. Sections are used by the linker, while segments are used by the loader, which loads the segments a program needs into memory in order to run it. Just as the Section Header Table states which sections a relocatable file contains, an executable file or dynamic library needs an information structure that states which segments it contains. That structure is the Program Header Table.
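
The relationship between sections and segments can be sketched as a range check on file offsets. This is a simplified model, not the exact rule readelf applies when it prints its section-to-segment mapping; the struct and function names are hypothetical, and only the file offset and size fields are considered.

```c
#include <stdint.h>

/* Hypothetical minimal views of one section header entry and one program header entry. */
struct section_view { uint32_t offset, size; };     /* sh_offset, sh_size */
struct segment_view { uint32_t offset, filesz; };   /* p_offset, p_filesz */

/* Returns 1 if the section's bytes lie entirely inside the segment's file range. */
int section_in_segment(struct section_view sec, struct segment_view seg)
{
    uint64_t sec_end = (uint64_t)sec.offset + sec.size;
    uint64_t seg_end = (uint64_t)seg.offset + seg.filesz;
    return sec.offset >= seg.offset && sec_end <= seg_end;
}
```

Sections such as .bss occupy no bytes in the file, so a real tool also has to compare memory addresses; that case is left out of this sketch.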

Use the readelf -l hello command in the shell to view the Program Header Table of the executable file hello:

How the Linux kernel loads and starts an executable program

Experiment requirements

Understand the compiling and linking process and the ELF executable file format; for details, refer to the first week;

Write a program that uses the exec* library functions to load an executable file. Dynamic linking comes in two forms, dynamic linking when the executable is loaded and run-time dynamic linking; practice both ways of using a dynamic link library. For details, refer to the second week;

Use gdb to trace and analyze sys_execve, the kernel handler for the execve system call, and verify your understanding of what the Linux system has to do to load an executable program. For details, refer to the third week. Doing the experiment in the Shiyanlou Linux virtual machine environment is recommended.

Pay special attention to these questions: Where does the new executable program start executing? Why can the new executable program run smoothly after the execve system call returns? How does the return of the execve system call differ between statically linked and dynamically linked executables?
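
For the exec* exercise in the requirements above, a minimal user-space sketch might look like the following. The function name run_and_wait is made up for this example; fork, execv, and waitpid are the standard POSIX calls. The child process replaces its own image with the program at path, which is where the kernel's sys_execve comes into play.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at path with the given argument vector.
 * Returns the child's exit status, or -1 if it could not be started or did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0) {
        execv(path, argv);       /* on success this never returns */
        _exit(127);              /* reached only if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that code after execv in the child runs only when the call fails, because a successful execv replaces the whole process image.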

Experiment procedure

1. Start by updating the kernel, then overwrite test.c with test_exec.c:

Viewing test.c, you can see the newly added exec system call

Switch to viewing hello.c

View the Makefile

Run make rootfs to boot the kernel and verify the exec function

Freeze the kernel, then start gdb in a second terminal from the menu directory

First stop at sys_execve, then set the other breakpoints; press c to keep running until the sys_execve breakpoint is reached

new_ip is the address of the first instruction executed on return to user mode

Exit debugging and enter readelf -h hello to see the ELF header of hello

Open a new window and enter gdb; you can see that the kernel stack is being modified

struct pt_regs *regs is the part at the bottom of the kernel stack. When an interrupt occurs, esp and ip are pushed onto it. Modifying the EIP value saved on the kernel stack (that is, replacing the pushed value with new_ip) makes new_ip the starting point of the new program.
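
The effect of that modification can be modeled in user space as follows. This is only an illustration: regs_model and start_thread_model are invented names, and the real struct pt_regs has many more fields. The kernel's start_thread also takes a new stack pointer, which is included here as new_sp.

```c
/* User-space model of the saved user registers at the bottom of the kernel stack. */
struct regs_model {
    unsigned long ip;   /* saved instruction pointer */
    unsigned long sp;   /* saved stack pointer */
};

/* Overwrite the saved ip and sp, so that the return to user mode resumes at new_ip. */
void start_thread_model(struct regs_model *regs,
                        unsigned long new_ip, unsigned long new_sp)
{
    regs->ip = new_ip;
    regs->sp = new_sp;
}
```

For a statically linked program new_ip is the entry point from the ELF header; for a dynamically linked program it is the entry point of the dynamic linker, which later jumps to the program itself.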

Experiment analysis

ELF header analysis

Recall the ELF header shown earlier:

The ELF header is 52 bytes long, so the first 52 bytes are read with a hex dump command and analyzed below.

Command: hexdump -x hello -n 52

  • First row: this corresponds to e_ident[EI_NIDENT]. Allowing for the little-endian display, the actual content is 7f454c46010101000000000000000000. The first four bytes, 7f454c46, are the fixed ELF magic number (0x45, 0x4c, 0x46 are the ASCII codes of 'E', 'L', 'F'), which marks this as an ELF object. The next byte, 01, indicates a 32-bit object; the following byte, 01, indicates little-endian encoding; the byte after that, 01, is the version of the file header. The remaining bytes default to 0.

  • Second row: e_type is 0x0002, an executable file. e_machine is 0x0003, the Intel 80386 processor architecture. e_version is 0x00000001, the current version. e_entry is 0x04080a8d, the entry point. e_phoff is 0x00000034, so the program header table starts at offset 0x34, i.e. 52 bytes, exactly the size of the ELF header.

  • Third row: e_shoff is 0x000a20f0, the offset of the section header table. e_flags is 0x00000000, meaning no processor-specific flags are set. e_ehsize is 0x0034, so the ELF header is 52 bytes long. e_phentsize is 0x0020, so each program header table entry is 32 bytes long. e_phnum is 0x0006, the number of entries in the program header table. e_shentsize is 0x0028, so each section header table entry is 40 bytes long.

  • Fourth row: e_shnum is 0x001f, so the section header table has 31 entries. e_shstrndx is 0x001c, the index of the section name string table within the section header table.
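
The field-by-field reading above can also be done in code. The sketch below decodes the same fields from a 52-byte buffer, assuming a 32-bit little-endian ELF file. The struct elf32_summary and the function names are invented for this example; the byte offsets follow the Elf32_Ehdr layout in the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the 32-bit ELF header fields discussed above. */
struct elf32_summary {
    uint16_t e_type, e_machine;
    uint32_t e_entry, e_phoff, e_shoff;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
};

/* Read little-endian 16-bit and 32-bit values byte by byte. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf is not a 32-bit little-endian ELF header. */
int parse_elf32_header(const unsigned char *buf, size_t len,
                       struct elf32_summary *out)
{
    if (len < 52 || memcmp(buf, "\x7f" "ELF", 4) != 0)
        return -1;
    if (buf[4] != 1 || buf[5] != 1)   /* byte 4: 32-bit class, byte 5: little-endian */
        return -1;

    out->e_type      = le16(buf + 16);
    out->e_machine   = le16(buf + 18);
    out->e_entry     = le32(buf + 24);   /* e_version occupies bytes 20..23 */
    out->e_phoff     = le32(buf + 28);
    out->e_shoff     = le32(buf + 32);
    out->e_ehsize    = le16(buf + 40);   /* e_flags occupies bytes 36..39 */
    out->e_phentsize = le16(buf + 42);
    out->e_phnum     = le16(buf + 44);
    out->e_shentsize = le16(buf + 46);
    out->e_shnum     = le16(buf + 48);
    out->e_shstrndx  = le16(buf + 50);
    return 0;
}
```

Reading the bytes one at a time keeps the decoder independent of the byte order of the machine it runs on.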

Analysis of the exec() call structure

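int do_execve(struct filename *filename,
    const char __user *const __user *__argv,
    const char __user *const __user *__envp)
{
    // Wrap the raw user-space pointers in the type do_execve_common expects
    struct user_arg_ptr argv = { .ptr.native = __argv };
    struct user_arg_ptr envp = { .ptr.native = __envp };
    return do_execve_common(filename, argv, envp);
}


// Abridged: allocation of bprm and all error handling are omitted
static int do_execve_common(struct filename *filename,
                struct user_arg_ptr argv,
                struct user_arg_ptr envp)
{
    struct linux_binprm *bprm;
    int retval;

    // Check the limit on the number of processes

    // Pick the least-loaded CPU to run the new program
    sched_exec();

    // Fill in the linux_binprm structure
    retval = prepare_binprm(bprm);

    // Copy the file name, environment variables, and command-line arguments
    retval = copy_strings_kernel(1, &bprm->filename, bprm);
    retval = copy_strings(bprm->envc, envp, bprm);
    retval = copy_strings(bprm->argc, argv, bprm);

    // Calls search_binary_handler internally
    retval = exec_binprm(bprm);

    // exec succeeded
    return retval;
}

static int exec_binprm(struct linux_binprm *bprm)
{
    int ret;

    // Walk the formats list and pick the load function that matches the file format
    ret = search_binary_handler(bprm);
    // ...
    return ret;
}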

do_execve calls do_execve_common, which relies mainly on exec_binprm; the key call inside exec_binprm is search_binary_handler. This is the internal processing of sys_execve.
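
The idea behind search_binary_handler can be sketched in user space: walk a list of format handlers and stop at the first one that accepts the file. Everything below is a simplified model with invented names (binprm_model, binfmt_model, and the *_model functions); the real kernel keeps its handlers in a linked list and the real load functions do far more than check a magic number.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct linux_binprm: only the first bytes of the file. */
struct binprm_model { unsigned char buf[128]; };

/* Simplified stand-in for struct linux_binfmt: a name plus a load function. */
struct binfmt_model {
    const char *name;
    int (*load_binary)(const struct binprm_model *bprm);
};

/* Accepts files that start with the ELF magic number. */
static int load_elf_model(const struct binprm_model *bprm)
{
    return memcmp(bprm->buf, "\x7f" "ELF", 4) == 0 ? 0 : -ENOEXEC;
}

/* Accepts files that start with "#!". */
static int load_script_model(const struct binprm_model *bprm)
{
    return (bprm->buf[0] == '#' && bprm->buf[1] == '!') ? 0 : -ENOEXEC;
}

static const struct binfmt_model formats[] = {
    { "elf",    load_elf_model },
    { "script", load_script_model },
};

/* Try each format in turn; the first handler that does not return -ENOEXEC wins. */
int search_binary_handler_model(const struct binprm_model *bprm,
                                const char **chosen)
{
    for (size_t i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        int ret = formats[i].load_binary(bprm);
        if (ret != -ENOEXEC) {
            *chosen = formats[i].name;
            return ret;
        }
    }
    *chosen = NULL;
    return -ENOEXEC;
}
```

A handler returns -ENOEXEC to say "not my format", which lets the loop move on to the next handler without treating the mismatch as a hard failure.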

Problems encountered

The first command I ran inside gdb reported "No such file or directory"

I found that I had started gdb directly in the LinuxKernel directory; I should have changed into the menu directory first and then started gdb

This is what I learned in my seventh week of studying Linux. If anything is lacking, corrections are welcome and appreciated.

That is all.

Origin www.cnblogs.com/qianxiaoxu/p/11819722.html