Why does the fork function return twice?

Foreword

The fork function creates a new process, called the child process, which runs concurrently with the process that made the fork system call, called the parent process.

After the child process is created, both processes continue with the instruction that follows the fork() system call. The child starts with a copy of the parent's execution state: the same program counter value, the same CPU register contents, and descriptors for the same open files that the parent had. fork() takes no parameters and returns an integer value.

Here are the different values returned by fork():

  • Negative value (-1): the child process could not be created. The value is returned to the caller and errno is set.
  • Zero: returned in the newly created child process.
  • Positive value: returned in the parent process (the caller). The value is the process ID of the newly created child process.

Two main reasons for failure:

  1. The number of processes has reached the limit imposed by the system (system-wide or per user); errno is set to EAGAIN.
  2. The system does not have enough memory; errno is set to ENOMEM.

Code example

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
 
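int main() {
    printf("before fork: \n");
    // fork() splits execution here into a parent and a child; both continue
    // concurrently from this point and differ only in the value stored in pid
    pid_t pid = fork();
    printf("after fork: \n");
    if (pid == -1) {
        perror("fork");
        return 1;   // no child was created, so stop here
    }
 
    if (pid > 0) {
        printf("pid: %d\n", pid);
        // in the parent: fork() returned the child's pid
        printf("this is parent process, pid: %d, ppid: %d\n",  getpid(), getppid());
    } else if (pid == 0){
        // in the child: it has no child of its own, so fork() returned 0
        printf("this is child process, pid: %d, ppid: %d\n",  getpid(), getppid());
    }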
 
    for (int i = 0; i < 5; i++) {
        printf("i: %d, pid: %d\n", i, getpid());
        sleep(1);
    }
 
    return 0;
}

The output is as follows:

before fork: 			
after fork: 				# the parent ran first in this run
pid: 146087					# the parent's return value is the child's pid
this is parent process, pid: 146086, ppid: 144773
i: 0, pid: 146086			# the parent's own pid
after fork: 				# the child starts running
this is child process, pid: 146087, ppid: 146086 # the child's pid and ppid
i: 0, pid: 146087  				# the child's pid (Linux time slice: 4ms~x00ms)
i: 1, pid: 146086				# the parent's pid
i: 1, pid: 146087
i: 2, pid: 146086
i: 2, pid: 146087
i: 3, pid: 146086
i: 3, pid: 146087
i: 4, pid: 146086
i: 4, pid: 146087

Why does fork return twice?

fork is called once, but by the time the kernel has finished duplicating the caller there are two processes, and each of them is in the middle of that same fork call. Because the parent's stack segment is copied along with the rest of its state, both processes are inside the fork function, waiting to return. fork therefore returns twice, once in the parent process and once in the child process, and the two return values are different. The process is as shown in the figure below.

A curious property of the fork call is that it is called only once but returns twice. It has three possible return values:

(1) In the parent process, fork returns the process ID of the newly created child process;
(2) In the child process, fork returns 0;
(3) If an error occurs, fork returns -1 in the calling process and no child is created.

If fork succeeds, there are two processes afterwards: the child process and the parent process. Both run the same code that follows the call, so the value returned by fork is how the program tells which of the two it is currently running in.

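One netizen explains why fork returns different values in the parent and the child like this: "In fact, it is equivalent to a linked list. The processes form a linked list, and the value that fork returns in the parent points to the process ID of the child. Because the child has no child of its own, fork returns 0 in it." This is an analogy rather than a description of the kernel. The practical reason is that a process can have many children and has no other call that gives it the ID of the child it just created, so fork must hand that ID to the parent. The child has exactly one parent, which it can always find with getppid(), so 0 is enough to identify it as the child.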

After calling fork, there are logically two copies of the data, heap, and stack, and one copy of the code, which the two processes share as a read-only code segment. Both processes return from the fork function, and the arrows in the figure mark where each one is executing. On systems that implement fork with copy-on-write, such as Linux, the data and stack pages are also shared at first; a page is physically copied only when the parent or the child writes to it, and that is the moment the two processes actually split.

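The child process starts executing at the return from fork. Why does it not start again from the top of the program, from #include? Because fork copies the current state of the process, not just the program. By the time fork is called, the parent has already executed the earlier part of the logic, and the child inherits that progress: it has the whole program available, but its program counter, stack, and variables are copies of the parent's at the moment of the call, so it continues with the code that comes next.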

Origin blog.csdn.net/Jason_Lee155/article/details/131523130