From process creation to the main function: what happens in between?

A few days ago, a reader asked: after a process is created, how does execution reach the main function I wrote?

This article answers that question.

First, the scope of the discussion: the C/C++ languages.

This article mainly discusses how processes and threads are created and initialized at the operating system level. Languages such as Python and Java, which run on interpreters and virtual machines, take a longer path to their main function (it includes the internal startup of the interpreter or virtual machine), and I will discuss that later when I have the opportunity. Here we focus on how the main function of native languages such as C/C++ is entered.

The article covers the process on the two main platforms, Linux and Windows.

Create process

The first step is to create a process.

On Linux, a new process is generally started with fork plus one of the exec family of functions. The former "forks" the current process into a twin child process, and the latter replaces the program image of that child with a new executable file.

The fork and exec functions are APIs that the operating system provides to applications. They eventually enter the operating system kernel through system calls, and the kernel's process management mechanism completes the creation of the process.
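As a rough illustration of the fork + exec pattern described above, here is a minimal sketch. The helper name spawn_and_wait is hypothetical (it does not come from the article), and the example assumes a POSIX system:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `file` with no arguments and return its
 * exit status, or -1 if fork/wait fails or the child ended abnormally. */
int spawn_and_wait(const char *file)
{
    pid_t pid = fork();              /* duplicate the calling process */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        char *const argv[] = { (char *)file, NULL };
        execvp(file, argv);
        _exit(127);                  /* reached only if exec failed */
    }

    /* Parent: wait for the child and report how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_wait("true") from a program's main function should return 0 on a typical Linux system, because the child ends up running the true utility instead of a copy of the caller.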

The kernel is responsible for creating the process, which mainly involves the following tasks:

  • Create a data structure used to describe the process in the kernel, task_struct on Linux

  • Create the page directory and page table of the new process to build the memory address space of the new process

For historical reasons, the early Linux kernel had no concept of a thread. It used a task, described by task_struct, to represent a running instance of a program: a process.

In the kernel, one task corresponds to one task_struct, that is, one process, and the kernel's scheduling unit is also the task_struct.

Later, multithreading emerged. To support it in the Linux kernel, a task_struct now actually represents a thread, and a process is described by combining multiple task_structs into a group (through the thread group ID field, tgid, inside the structure). This is why threads on Linux are also called lightweight processes.

An important job of the fork system call is to create the task_struct of the new process. Once it exists, the process has a scheduling unit, so it can take part in scheduling and get the chance to run.
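From user space, this grouping shows up as two IDs: getpid() reports the thread group ID shared by every thread of a process, while the gettid system call reports the ID of the individual thread. The sketch below uses hypothetical helper names and calls the system call through syscall(), since not every C library ships a gettid() wrapper:

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel-level ID of the calling thread (one task_struct per thread). */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Returns 1 when the calling thread is the leader of its thread group,
 * i.e. its thread ID equals the process ID reported by getpid(). */
int is_thread_group_leader(void)
{
    return current_tid() == getpid();
}
```

In the main thread of a process the two IDs are equal. In any additional thread, getpid() stays the same while current_tid() differs.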

Load executable file

After fork succeeds, the parent and child are like a cell that has just undergone mitosis: the two processes are "almost" identical.
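To make "almost identical" concrete, the following sketch (the function name is hypothetical) checks two things: the child starts with a copy of the parent's data, and a later write in the child does not affect the parent:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child inherited the parent's value and the child's
 * later write stayed private to the child; 0 if not; -1 on error. */
int fork_copy_is_private(void)
{
    int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int inherited = value;   /* the child starts with the parent's data */
        value = 99;              /* this write only touches the child's copy */
        _exit(inherited == 1 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    int child_saw_copy = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return child_saw_copy && value == 1;
}
```

The two processes differ in a few respects, such as their process IDs and the return value of fork itself, which is how the code above tells parent and child apart.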

For the child process to run a new program, it must call one of the exec functions to replace its program image.

The exec functions are also wrappers around a system call. Calling them enters the kernel's sys_execve, which does the real work.
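One way to observe that exec replaces the program but not the process is to compare process IDs. In this sketch (hypothetical function name, assuming /bin/sh exists), the child becomes a shell that prints its own PID, and the parent compares it with the PID that fork returned:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the program started by exec reports the same PID that
 * fork() gave to the child, 0 if not, -1 on error. */
int exec_keeps_pid(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: route stdout into the pipe, then become /bin/sh. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);              /* reached only if exec failed */
    }

    /* Parent: read the PID printed by the new program image. */
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    return atol(buf) == (long)pid;
}
```

The PID is unchanged because the kernel keeps the same task_struct and only swaps the contents of the address space.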

There are many details to this work. One important task is to load the executable file into the process's address space and parse it to extract its entry address.

Code written in high-level languages such as C and C++ is ultimately compiled into an executable file. On Linux this file is in ELF format; on Windows it is in PE format.

Both ELF and PE files record the executable's entry address in their file headers, which indicates where execution of the program should begin.
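For ELF, the header field in question is e_entry. The sketch below reads it with the structures from <elf.h>; the function name is hypothetical, and the code assumes the file uses the same byte order as the machine running it:

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read the entry address (e_entry) from an ELF file header.
 * Returns 0 on success, -1 if the file cannot be read or is not ELF.
 * Assumption: the file's byte order matches the host's. */
int read_elf_entry(const char *path, uint64_t *entry)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr hdr;                  /* large enough for both ELF classes */
    memset(&hdr, 0, sizeof hdr);
    size_t n = fread(&hdr, 1, sizeof hdr, f);
    fclose(f);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;

    if (hdr.e_ident[EI_CLASS] == ELFCLASS64 && n == sizeof hdr) {
        *entry = hdr.e_entry;
        return 0;
    }
    if (hdr.e_ident[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h32;
        memcpy(&h32, &hdr, sizeof h32);
        *entry = h32.e_entry;
        return 0;
    }
    return -1;
}
```

On Linux, passing "/proc/self/exe" reads the header of the running program itself. For a position-independent executable the value is an offset that the loader relocates at run time, so it may not match the final address in memory.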

Where does this entry address point? Is it our main function? Before answering that, there is a key question to settle first: once the process has been created, how does it get to this entry address?

On both Windows and Linux, application threads frequently move back and forth between user space and kernel space. This happens in the following situations:

  • System call

  • Interrupt

  • Exception

When returning from the kernel, how does the thread know where it came from, and where in user space to resume execution?

The answer is that on entry to the kernel, the thread's context (the contents of certain registers, such as the instruction pointer EIP) is saved on the thread's kernel stack, recording where it came from. When the thread returns from the kernel, this information is loaded back from the stack and execution continues from the original location.

As mentioned earlier, the child process enters the kernel through the sys_execve system call. Once the executable file has been parsed and the ELF entry address obtained, the kernel modifies the context saved on the stack so that EIP points to that entry address. When sys_execve finishes and returns to user space, execution begins directly at the entry of the new program. (For a dynamically linked program, the kernel first starts the dynamic linker named in the ELF file, which then jumps to this entry address.)

This leads to an important property: under normal circumstances the exec functions do not return. Once the call succeeds, execution resumes at the entry of the new executable rather than in the code that called exec.

It is also worth mentioning that Linux is not limited to ELF. The kernel selects a loader through its binary format handlers, so other executable formats can be supported as well (historically a.out, and further formats can be registered through binfmt_misc).

In addition to binary executables, scripts are supported too. When a file starts with a #! line, the kernel launches the interpreter named on that line and passes it the script.
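The script case can be tried out with the sketch below. The function name and the temporary path under /tmp are assumptions made for the example, which also assumes that /bin/sh exists and that /tmp allows execution. The script is passed directly to execv, and the kernel starts the interpreter named in its first line:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write a tiny script, exec it directly, and return its exit status
 * (-1 on error). */
int run_shebang_script(void)
{
    char path[] = "/tmp/shebang_demo_XXXXXX";
    const char *script = "#!/bin/sh\nexit 7\n";

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    int ok = write(fd, script, strlen(script)) == (ssize_t)strlen(script)
             && fchmod(fd, 0700) == 0;
    close(fd);                   /* close before exec to avoid ETXTBSY */
    if (!ok) {
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        char *const argv[] = { path, NULL };
        execv(path, argv);       /* the kernel runs /bin/sh on the script */
        _exit(127);              /* reached only if exec failed */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    unlink(path);
    if (waited < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The script exits with status 7, so that is the value the function is expected to return when the interpreter was started successfully.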

From ELF entry to main function

The above explains how a new process reaches the entry address of the executable file.

That leaves a question: what is at the entry address? Is it our main function?

Here is a simple C program that will output the classic hello world after running:

#include <stdio.h>
int main() {
    printf("hello, world!\n");
    return 0;
}

Compiling it with gcc produces an ELF executable, which the readelf command can analyze. In this build, the entry address of the ELF file is 0x400430.

Next, we open the file in IDA, a powerful disassembler, to see which function is located at the entry address 0x400430.

It turns out that the entry point is a function called _start, not our main function.

At the end of _start, the __libc_start_main function is called. This function is located in libc.so.

You may be wondering where this function came from, since we never used it in our code.

In fact, before the main function is entered, another important job has to be done: initializing the C/C++ runtime library. That is what __libc_start_main does.

When compiling with GCC, the toolchain automatically links in the runtime library and its startup code, which wrap our main function and call it.

glibc is open source, so we can find the libc-start.c file of the project on GitHub and see what __libc_start_main really looks like. Our main function is called by it.
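A simple way to see that code runs before main is the constructor attribute, a GCC/Clang extension (not standard C). In this sketch, with hypothetical names, the startup code calls early_init before main is entered:

```c
/* Set by a function that the runtime calls during startup, before main(). */
static int initialized_before_main = 0;

/* GCC/Clang extension: constructor functions are invoked by the startup
 * code before main() is entered. */
__attribute__((constructor))
static void early_init(void)
{
    initialized_before_main = 1;
}

/* Returns 1 if early_init() has already run. */
int startup_flag(void)
{
    return initialized_before_main;
}
```

If startup_flag() is called as the very first statement of main, it already returns 1, which shows that main is not the first code to run in the process.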

Complete process

At this point, we have traced the complete path: creating the process with fork, replacing the executable image through the exec functions, starting execution at the ELF entry point, and finally reaching our main function.

Some differences on Windows

The following briefly introduces some differences in this process on Windows.

The first step is again to create a process. Windows merges the two steps of fork and exec into one: the CreateProcess family of functions does both, and the path of the child process's executable file is passed in as a parameter.
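For readers more familiar with POSIX, the closest analogue to this one-step model is posix_spawn. The sketch below is only an illustration of "create the process and name its program in a single call". It is not the Windows API, and the helper name is hypothetical:

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start `file` in a new process with one call and
 * return its exit status, or -1 on error. */
int spawn_one_step(const char *file)
{
    pid_t pid;
    char *const argv[] = { (char *)file, NULL };

    /* One call creates the child and starts the named program in it. */
    if (posix_spawnp(&pid, file, NULL, NULL, argv, environ) != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

As with CreateProcess, the caller never runs any code inside the child. It only describes the new process and receives its identifier.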

Unlike Linux, where the boundary between process and thread is blurry, the Windows kernel defines the two concepts separately: processes are represented by the EPROCESS structure and threads by the ETHREAD structure.

Therefore, on Windows, once the process-related setup is done, a separate execution unit that takes part in kernel scheduling must be created: the first thread of the process, the main thread. This work is also encapsulated in the CreateProcess family of functions.

After the main thread of the new process is created, it begins to take part in system scheduling. Where does the main thread start executing? The kernel specifies this explicitly when it creates the thread: nt!KiThreadStartup. This is a kernel function, and the thread starts executing from there.

Once the thread has started there, the Windows Asynchronous Procedure Call (APC) mechanism runs an APC that was queued in advance. This brings the execution flow into the application layer, where the initialization of the Windows process takes place, such as loading core DLL files (Kernel32.dll, Ntdll.dll) and so on.

Then, again through the APC mechanism, execution turns to the entry point of the executable file.

The mechanism behind this is similar to Linux: the main function is not entered directly. The C/C++ runtime library is initialized first, and its startup function wraps and finally calls our main function.

The following is the complete process on Windows, from process creation to our main function (high-definition large image: https://bbs.pediy.com/upload/attach/201604/501306_qz5f5hi1n3107kt.png):

Now you know how execution gets, step by step, from process start to your main function. If anything is still unclear, please leave a comment.


Origin blog.csdn.net/xuanyuan_fsx/article/details/109351232