Zombie process (Zombie)

1. Zombie process

A process should be destroyed once it has finished its work (that is, after the program in the main function has run to completion). Sometimes, however, a finished process lingers as a "zombie process" and keeps occupying system resources such as a process-table entry and a PID. Zombie processes are one source of load on the system, so they should be eliminated. They must also be eliminated in the correct way, otherwise they will keep reappearing.

As can be seen from the figure below, the PID of the parent process is 1166, and the PID of the child process is 1167.

[Figure: process list showing the parent process (PID 1166) and the child process (PID 1167)]

If we kill the child process with the command kill -9 1167, what signal will the parent process receive?

As the figures below show, the parent process receives the SIGCHLD signal, and the child process becomes a zombie process.

[Figure: the parent process receives SIGCHLD]

[Figure: the child process is listed as a zombie]

On a Unix system, if a child process terminates while its parent is still alive, and the parent does not call wait()/waitpid() to collect it, the child becomes a zombie process.

A zombie process has already terminated and no longer does any work, but the kernel has not yet discarded it, because the parent process may still need some information about the child, such as its exit status. As a developer, you should never allow zombie processes to accumulate.

How to kill a zombie process?

  • Restart the computer.
  • Manually kill the parent process of the zombie. The zombie is then adopted by another process (typically init, PID 1), which reaps it, so the zombie disappears automatically.
  • Handle SIGCHLD in the parent. When a child process terminates or stops, SIGCHLD is sent to its parent, so any program that calls fork() should catch SIGCHLD and call wait()/waitpid() in the handler.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>

// Signal handler.
// Note: printf() is not async-signal-safe; it is called here only for demonstration.
void sig_usr(int signo)
{
    int status;

    switch (signo)
    {
        case SIGUSR1:
            printf("Received SIGUSR1, pid=%d!\n", getpid());
            break;

        case SIGCHLD:
            printf("Received SIGCHLD, pid=%d!\n", getpid());

            // waitpid() collects the child's termination status, so the child does not remain a zombie.
            // First argument: -1 means wait for any child process.
            // Second argument: receives the child's status information.
            // Third argument: extra options; WNOHANG makes waitpid() return immediately instead of blocking.
            pid_t pid = waitpid(-1, &status, WNOHANG);

            if (pid == 0) return;  // no child has terminated yet, so waitpid() returned 0 at once
            if (pid == -1) return; // the waitpid() call failed; return immediately as well
            break;                 // reaching this point means the child was reaped successfully
    }
}

int main(int argc, char* const* argv)
{
    pid_t pid;

    printf("Process started!\n");

    // signal(): the first argument is a signal, the second is a pointer to the handler function for that signal
    if (signal(SIGUSR1, sig_usr) == SIG_ERR)
    {
        printf("Cannot catch SIGUSR1!\n");
        exit(1);
    }

    if (signal(SIGCHLD, sig_usr) == SIG_ERR)
    {
        printf("Cannot catch SIGCHLD!\n");
        exit(1);
    }

    pid = fork(); // create a child process

    // Check whether the child process was created successfully
    if (pid < 0)
    {
        printf("Failed to create the child process!\n");
        exit(1);
    }

    // From here on, the parent and the child run concurrently
    for (;;)
    {
        sleep(1);
        printf("Slept 1 second, pid=%d!\n", getpid());
    }

    // Not reached: the loop above never exits
    printf("Goodbye!\n");

    return 0;
}


2. Reasons for generating zombie processes

The argument passed to the exit function and the value returned by the return statement of the main function are both passed to the operating system. The operating system does not destroy the child process until this value has been delivered to the parent process that created the child. A process in this intermediate state is a zombie process. In other words, it is the operating system that turns the child process into a zombie process.

Q: When will this zombie process be destroyed?

Answer: When the child's exit argument or the return value of its main function has been passed to the parent process that created the child.

Q: How are these values passed to the parent process?

Answer: The operating system does not deliver these values to the parent process on its own initiative. It hands the value over only when the parent process explicitly requests it through a function call. If the parent never requests the child's termination status, the operating system keeps holding that status, and the child remains a zombie indefinitely. In other words, parents are responsible for collecting their own children.

The next example will create a zombie process.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
	pid_t pid = fork();
	
	if (pid == 0)    // child process
	{
		puts("Hi, I am a child process");
	}
	else
	{
		// Print the child's PID. It can be used to check the child's state (zombie or not).
		printf("Child Process ID: %d\n", pid);
		
		// Pause the parent for 30 seconds. When the parent terminates, its zombie child is destroyed as well,
		// so the parent is delayed here to leave time to observe the zombie.
		sleep(30);    // Sleep 30 sec.
	}

	if (pid == 0)
		puts("End child process");
	else
		puts("End parent process");
	
	return 0;
}

Compile and run:

gcc zombie.c -o zombie
./zombie

[Figure: console output of ./zombie while the parent process is paused]

After the program starts, it pauses in the state shown above. Before it leaves this state (that is, within 30 seconds), verify from another console window that the child process is a zombie.

[Figure: process list in a second console, showing PID 1387 in state Z+]

The listing shows that the process with PID 1387 is in the zombie state (Z+). After the 30-second wait, the parent process (PID 1386) and the zombie child process (PID 1387) are destroyed together.
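
The same check can also be done from code. The following is a rough sketch that assumes a Linux system, where /proc/<pid>/stat reports the process state letter after the command name. It will not work on systems without that file. The helper names process_state and observe_zombie are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Return the state letter of a process ('R', 'S', 'Z', ...) as reported by
   /proc/<pid>/stat, or '?' if it cannot be read. Linux-specific. */
char process_state(pid_t pid)
{
    char path[64], buf[512];
    char *p;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return '?';
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return '?';
    }
    fclose(fp);

    /* Format: "pid (comm) state ...". The command name may itself contain
       ')', so search for the last one. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Fork a child that exits at once, poll until it shows up as a zombie,
   then reap it. Returns the last state observed before reaping. */
char observe_zombie(void)
{
    const struct timespec step = { 0, 100 * 1000 * 1000 };  /* 100 ms */
    char state = '?';
    int i;
    pid_t pid = fork();

    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0);               /* child: terminate immediately */

    for (i = 0; i < 50; i++) {  /* poll for up to about 5 seconds */
        state = process_state(pid);
        if (state == 'Z')
            break;
        nanosleep(&step, NULL);
    }

    waitpid(pid, NULL, 0);      /* reap the child so the zombie goes away */
    return state;
}
```

On a Linux system, observe_zombie() is expected to return 'Z', because the child has exited but the parent has not yet called waitpid() at the moment the state is read.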

3. Use the wait function to destroy the zombie process

As mentioned above, in order to destroy the child process, the parent process should actively request to obtain the return value of the child process.

#include <sys/wait.h>

pid_t wait(int *statloc);

// Returns the PID of the terminated child process on success, -1 on failure

If a child process has already terminated when this function is called, the value it passed on termination (the argument of the exit function, or the return value of the main function) is stored in the memory pointed to by the function's argument. That memory also holds other information, however, so the value must be extracted with the following macros:

  • WIFEXITED: Returns "true" when the child process terminates normally.
  • WEXITSTATUS: Returns the return value of the child process.

That is to say, when passing the address of the variable status to the wait function, the following code should be written after calling the wait function:

if (WIFEXITED(status))    // Did the child terminate normally?
{
    puts("Normal termination!");
    printf("Child pass num: %d", WEXITSTATUS(status));    // If so, what value did it return?
}
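
WIFEXITED is false when the child was killed by a signal, as in the earlier kill -9 experiment. As a hedged illustration, the sketch below also checks the standard POSIX macros WIFSIGNALED and WTERMSIG. The helpers describe_status and run_child are hypothetical names introduced only for this example.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write a human-readable description of a wait() status into buf. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with status %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "other state change");
}

/* Fork a child. If sig is 0, the child exits with the given code.
   Otherwise the child waits in pause() and the parent kills it with sig.
   Returns the raw status from waitpid(), or -1 on error. */
int run_child(int code, int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig == 0)
            _exit(code);
        for (;;)
            pause();            /* wait here until a signal arrives */
    }

    if (sig != 0)
        kill(pid, sig);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

For instance, passing the result of run_child(7, 0) to describe_status should report a normal exit with status 7, while run_child(0, SIGKILL) should be reported as killed by a signal.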

The following example builds on the above. In it, the child processes no longer become zombie processes.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
	int status;

	// The child created here terminates through the return statement in main (return 3).
	pid_t pid = fork();
	
	if (pid == 0)
	{
		return 3;
	}
	else
	{
		printf("Child PID: %d\n", pid);

		// The child created here terminates by calling the exit function (exit(7)).
		pid = fork();
		
		if (pid == 0)
		{
			exit(7);
		}
		else
		{
			printf("Child PID: %d\n", pid);
			
			// Call wait. Information about a terminated child is stored in status, and that child is completely destroyed.
			wait(&status);

			// WIFEXITED checks whether the child terminated normally. If so, WEXITSTATUS extracts its return value.
			if (WIFEXITED(status))
				printf("Child send one: %d\n", WEXITSTATUS(status));
			
			// Two child processes were created, so call wait and the macros once more.
			wait(&status);
			
			if (WIFEXITED(status))
				printf("Child send two: %d\n", WEXITSTATUS(status));
			
			// Delay the termination of the parent so that the state of the children can be inspected.
			sleep(30); // Sleep 30 sec.
		}
	}

	return 0;
}

Compile and run:

gcc wait.c -o wait
./wait

Output result:

[Figure: program output]

[Figure: process list without PIDs 1497 and 1498]

It can be seen that there are no processes with PIDs 1497 and 1498 in the system at this time, because the wait function was called and the child processes were completely destroyed. In addition, the values 3 and 7 returned when the two child processes terminated were passed to the parent process.

This is how zombie processes are eliminated by calling the wait function. Note that if no child process has terminated when wait is called, the caller blocks until one does, so use this function with caution.
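
When the number of children is not known in advance, wait can be called in a loop until it reports that no children remain. The following is only a sketch under that assumption: wait returns -1 with errno set to ECHILD once every child has been collected. The helper names spawn_children and reap_all are illustrative.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork n children; child i exits with status i.
   Returns n, or -1 if fork() fails. */
int spawn_children(int n)
{
    int i;

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(i);           /* child: terminate immediately */
    }
    return n;
}

/* Block in wait() until every child has been collected.
   Returns the number of children reaped. */
int reap_all(void)
{
    int status, count = 0;
    pid_t pid;

    for (;;) {
        pid = wait(&status);
        if (pid > 0) {
            if (WIFEXITED(status))
                printf("child %ld exited with %d\n",
                       (long)pid, WEXITSTATUS(status));
            count++;
        } else if (errno == EINTR) {
            continue;           /* interrupted by a signal: try again */
        } else {
            break;              /* ECHILD: no children left to wait for */
        }
    }
    return count;
}
```

After spawn_children(3), a call to reap_all() should return 3, and a second call should return 0 immediately because no children remain. Each wait call still blocks while children are running, so this pattern suits a parent that has nothing else to do in the meantime.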

4. Use the waitpid function to destroy the zombie process

Because the wait function can block the caller, consider calling the waitpid function instead. It is the second way to eliminate zombie processes, and it can be used without blocking.

#include <sys/wait.h>

pid_t waitpid(pid_t pid, int *statloc, int options);

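// Returns the PID of the terminated child process (or 0) on success, -1 on failure
// pid: PID of the child process to wait for; if -1 is passed, any child process is waited for, as with wait
// statloc: same meaning as the statloc parameter of wait
// options: pass the constant WNOHANG declared in sys/wait.h so that the call does not block even if no child has terminated; it returns 0 instead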

The following describes an example of calling the waitpid function. The program does not block when the waitpid function is called.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
	int status;
	pid_t pid = fork();
	
	if (pid == 0)
	{
		// Call sleep to delay the child. The child terminates about 15 seconds later.
		sleep(15);
		return 24;
	}
	else
	{
		// Call waitpid in a while loop. WNOHANG is passed as the third argument, so 0 is returned while no child has terminated.
		while (!waitpid(-1, &status, WNOHANG))
		{
			sleep(3);
			puts("sleep 3sec.");
		}

		if (WIFEXITED(status))
			printf("Child send %d\n", WEXITSTATUS(status));
	}

	return 0;
}

Compile and run:

gcc waitpid.c -o waitpid
./waitpid

Output result:

[Figure: program output]

It can be seen that the loop body (the "sleep 3sec." message) was executed 5 times in total. This also shows that the waitpid call does not block.


Origin blog.csdn.net/qq_42815188/article/details/129392190