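Column: Concurrent Programming

Homepage: My Homepage

Motto: As heaven moves with vigor, the noble person strives ceaselessly for self-improvement; as the earth is receptive, the noble person carries all things with great virtue.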

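Preface

About this column: with hands-on practice as the main thread, it introduces the core knowledge of concurrent programming step by step, from basic to advanced, along with the key points of concurrency performance on SMP systems. The aim is not only to prepare you for interviews but also to raise your skills to the level of concurrent architecture.

Overview

On Linux, every process has an ID, the PID, and the ID of its parent process is the PPID. Running ps -ef shows the PID and PPID of each process.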

UID          PID    PPID C STIME TTY          TIME CMD

root           1       0 0 Mar07 ?        00:00:19 /usr/lib/systemd/systemd --switched-root --system --deserialize 17

root           2       0 0 Mar07 ?        00:00:00 [kthreadd]

root           3       2 0 Mar07 ?        00:00:00 [rcu_gp]

root           4       2 0 Mar07 ?        00:00:00 [rcu_par_gp]

root           5       2 0 Mar07 ?        00:00:00 [slub_flushwq]

root           6       2 0 Mar07 ?        00:00:00 [netns]

root           8       2 0 Mar07 ?        00:00:00 [kworker/0:0H-events_highpri]

root          10       2 0 Mar07 ?        00:00:00 [mm_percpu_wq]

root          11       2 0 Mar07 ?        00:00:00 [rcu_tasks_kthread]

root          12       2 0 Mar07 ?        00:00:00 [rcu_tasks_rude_kthread]

root          13       2 0 Mar07 ?        00:00:00 [rcu_tasks_trace_kthread]

root          14       2 0 Mar07 ?        00:00:02 [ksoftirqd/0]

root          15       2 0 Mar07 ?        00:01:57 [rcu_preempt]

root          16       2 0 Mar07 ?        00:00:00 [migration/0]

root          18       2 0 Mar07 ?        00:00:00 [cpuhp/0]

root          19       2 0 Mar07 ?        00:00:00 [cpuhp/1]

root          20       2 0 Mar07 ?        00:00:00 [migration/1]

root          21       2 0 Mar07 ?        00:00:01 [ksoftirqd/1]

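As the listing shows, the very first process on Linux has PID 0. It spawns two children with PIDs 1 and 2, and every other process descends from one of those two.

In fact, all processes on Linux form a tree rooted at PID 0. PID 0 creates PID 1, which is the root of the user-space branch of the tree, and PID 2 (kthreadd), which is the root of the kernel-thread branch.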
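The same parent/child link that ps -ef prints can be observed from code with getpid() and getppid(). The following is a minimal sketch, not part of the original article; the helper name child_ppid_matches_parent is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child's getppid() equals the
 * parent's getpid(), 0 if it does not, and -1 on error. */
int child_ppid_matches_parent(void)
{
    pid_t parent = getpid();   /* captured before fork, inherited by the child */
    pid_t pid = fork();
    int status = 0;

    if (pid < 0)
        return -1;
    if (pid == 0)              /* child: report the comparison via its exit code */
        _exit(getppid() == parent ? 0 : 1);

    /* parent: stay alive until the child has finished its check */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Calling child_ppid_matches_parent() from main should return 1 as long as the parent is still alive while the child runs, which the waitpid call ensures here.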

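Why does every process need a parent? When a process exits, its parent has to reap it (collect its exit status with wait or waitpid) so that the kernel can release its process-table entry. Until that happens the exited process stays around as a zombie, which leaks resources.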
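Reaping can be sketched in a few lines. In the example below (an illustration with a made-up helper name, reap_child), the child exits immediately and remains a zombie until the parent's waitpid call collects its exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with exit_code,
 * reaps it, and returns the collected exit code (-1 on error). */
int reap_child(int exit_code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);      /* child: terminate right away */

    /* Between the child's exit and this call the child is a zombie;
     * waitpid() collects its status and lets the kernel free its entry. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A parent that never calls wait or waitpid would leave one zombie entry behind for each exited child for as long as the parent itself keeps running.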

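How do we write a multi-process concurrent program on Linux? Let's look at several approaches.

fork: parent and child processes

By the time main runs, we are already inside a process. The fork system call splits off a second process, and the two are related as parent and child. Why do we say that?

First, at creation the child receives a copy of the parent's user stack and data segments, and from then on the two processes run in their own address spaces. The child therefore inherits everything the parent had in memory up to the fork; after that the two are completely independent.

Second, when the child exits, the kernel notifies the parent with the SIGCHLD signal, and the parent can wait for the child's exit with waitpid.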

The following example demonstrates this:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#include <errno.h>
#include <string.h>


int global_var = 100;

int main(int argc ,char *argv[])
{
        int pid = -1;
        char *str = NULL;
        int str_len = strlen("I come from parent process.") + 1; /* +1 for the terminating '\0' */

        str = malloc(str_len);
        if(NULL == str)
        {
                printf("memory is not enough.\n");
                exit(-1);
        }
        memset(str, 0x00, str_len);
        strncpy(str, "I come from parent process.", str_len);


        pid = fork();
        if(pid == 0)
        {
                // in child
                printf("here in child ,my pid is %d\n", getpid());

                printf("parent bring info: global_var:%d, str:%s\n", global_var, str);
        }
        else if(pid > 0)
        {
                // in parent
                printf("here in parent, child pid is %d\n", pid);
        }
        else
        {
                // error
                printf("fork error[%s]\n",strerror(errno));
        }

        return 0;
}

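The child can read the variables the parent created before the fork, including dynamically allocated memory. What it sees is its own copy, so changes made after the fork are not visible to the other process.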
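To make the "own copy" point concrete, here is a small sketch (the helper name parent_value_after_child_write is hypothetical): the child overwrites a global, and the parent checks its own value after the child has exited.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 100;

/* Hypothetical helper: the child sets counter to 200 in its own address
 * space; the function returns the parent's value afterwards (-1 on error). */
int parent_value_after_child_write(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 200;                      /* modifies the child's copy only */
        _exit(counter == 200 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return counter;                         /* the parent's copy is untouched */
}
```

The function returns 100: the child's write landed in its private copy of the page, so the parent never sees it.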

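Background processes

Sometimes we want a program to keep running in the background even after we close the terminal, like the processes listed by ps. How can we do that?

After creating the child, the parent exits first. The child becomes an orphan and is adopted by process 1 (init or systemd), so its parent PID becomes 1. It now runs in the background like a daemon: it can only be observed with ps and stopped with kill. (A full daemon usually also calls setsid() to detach from the controlling terminal, which this example skips.)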
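The reparenting step can be observed without leaving the calling program. The sketch below is an illustration, and the helper name orphan_reparented is an assumption. It forks an intermediate process that exits right away, and lets the grandchild report the PPID it sees afterwards through a pipe. On many systems the reported value is 1, but it can be another process if a "subreaper" is configured, so the code only checks that the PPID changed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the orphaned grandchild saw its PPID
 * change after its parent exited (new value stored in *new_ppid),
 * 0 if it did not change in time, -1 on error. */
int orphan_reparented(pid_t *new_ppid)
{
    int fds[2];
    pid_t mid, ppid = -1;

    if (pipe(fds) < 0)
        return -1;

    mid = fork();
    if (mid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (mid == 0) {                        /* intermediate parent */
        pid_t self = getpid();
        pid_t child = fork();
        if (child == 0) {                  /* grandchild: the future orphan */
            struct timespec ts = {0, 1000000};   /* 1 ms */
            int tries;
            close(fds[0]);
            /* Poll until the kernel has reparented us (bounded wait) */
            for (tries = 0; tries < 5000 && getppid() == self; tries++)
                nanosleep(&ts, NULL);
            ppid = getppid();
            if (write(fds[1], &ppid, sizeof ppid) != (ssize_t)sizeof ppid)
                _exit(1);
            _exit(0);
        }
        _exit(child < 0 ? 1 : 0);          /* exit first, orphaning the child */
    }

    close(fds[1]);
    waitpid(mid, NULL, 0);                 /* reap the intermediate parent */
    if (read(fds[0], &ppid, sizeof ppid) != (ssize_t)sizeof ppid)
        ppid = -1;
    close(fds[0]);

    if (ppid < 0)
        return -1;
    *new_ppid = ppid;
    return ppid != mid;
}
```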

Let's verify this with an example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#include <errno.h>
#include <string.h>


int global_var = 100;

int main(int argc ,char *argv[])
{
        int pid = -1;
        char *str = NULL;
        int str_len = strlen("I come from parent process.") + 1; /* +1 for the terminating '\0' */

        str = malloc(str_len);
        if(NULL == str)
        {
                printf("memory is not enough.\n");
                exit(-1);
        }
        memset(str, 0x00, str_len);
        strncpy(str, "I come from parent process.", str_len);


        pid = fork();
        if(pid == 0)
        {
                // in child
                printf("here in child ,my pid is %d\n", getpid());

                printf("parent bring info: global_var:%d, str:%s\n", global_var, str);

                sleep(30);

                printf("child will exit now.\n");
                exit(0);
        }
        else if(pid > 0)
        {
                // in parent
                printf("here in parent, child pid is %d\n", pid);
        }
        else
        {
                // error
                printf("fork error[%s]\n",strerror(errno));
        }

        printf("parent exit now");

        return 0;
}

here in parent, child pid is 3430010

parent exit nowhere in child ,my pid is 3430010

parent bring info: global_var:100, str:I come from parent process.

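After the program prints the three messages above, the shell prompt returns. (The parent's last message has no trailing newline, so the child's first line is glued onto it.) Now check the process list: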

ps -ef|grep a.out

root 3430010 1 0 08:21 pts/7 00:00:00 ./a.out

Now the parent PID of a.out is 1: the process has been adopted by process 1.

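execv: running an unrelated program

A call to execv replaces the memory image of the current process with the program passed in, while the PID stays the same. Once the image has been replaced, the variables inherited from the parent are no longer accessible. Take special care with dynamically acquired resources such as sockets and file handles: descriptors stay open across execv by default, so close the ones the new program does not need before calling it.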
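Instead of closing every descriptor by hand, a descriptor can be marked close-on-exec so the kernel closes it automatically when an exec call succeeds. The following is a minimal sketch using the standard fcntl interface; the helper name set_cloexec is made up for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: marks fd as close-on-exec.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);        /* read the descriptor flags */

    if (flags < 0)
        return -1;
    /* Add FD_CLOEXEC while keeping any other descriptor flags */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

When the descriptor is created by open, passing O_CLOEXEC in the open flags achieves the same result in a single step.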

The exec family has several variants; use whichever fits your needs.

int execl(const char *path, const char *arg, ...);

int execlp(const char *file, const char *arg, ...);

int execle(const char *path, const char *arg, ..., char * const envp[]);

int execv(const char *path, char *const argv[]);

int execvp(const char *file, char *const argv[]);
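The variants differ mainly in how arguments are passed ("l" takes them as a NULL-terminated list, "v" as an array) and in whether the program is looked up in PATH (the "p" suffix). A small sketch of both styles follows; the helper names are hypothetical, and /bin/sh is assumed to exist.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Waits for pid and returns its exit code, or -1 on error. */
static int wait_exit_code(pid_t pid)
{
    int status = 0;

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* List form: arguments are written inline and terminated by NULL. */
int run_with_execl(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        _exit(127);                        /* reached only if execl fails */
    }
    return wait_exit_code(pid);
}

/* Vector form with PATH lookup: arguments come from an array. */
int run_with_execvp(void)
{
    char *const argv[] = {"sh", "-c", "exit 4", NULL};
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp("sh", argv);                /* "sh" is searched for in PATH */
        _exit(127);                        /* reached only if execvp fails */
    }
    return wait_exit_code(pid);
}
```

The list form is convenient when the arguments are known at compile time; the vector form fits when they are built at run time.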

On success execv does not return; it returns only if the call fails. Let's demonstrate with an example.


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/wait.h>

int global_var = 100;

int main(int argc ,char *argv[])
{
        int pid = -1;
        char *str = NULL;
        int str_len = strlen("I come from parent process.") + 1; /* +1 for the terminating '\0' */
        char *child_argv[] = {"test","I come from parent process.", NULL};
        int status = 0;

        str = malloc(str_len);
        if(NULL == str)
        {
                printf("memory is not enough.\n");
                exit(-1);
        }
        memset(str, 0x00, str_len);
        strncpy(str, "I come from parent process.", str_len);


        pid = fork();
        if(pid == 0)
        {
                // in child
                printf("here in child ,my pid is %d\n", getpid());

                printf("parent bring info: global_var:%d, str:%s\n", global_var, str);

                execv("./test",child_argv);

                printf("child will exit now.\n");
                exit(0);
        }
        else if(pid > 0)
        {
                // in parent
                printf("here in parent, child pid is %d\n", pid);
        }
        else
        {
                // error
                printf("fork error[%s]\n",strerror(errno));
        }

        /* wait for any one child process to exit */
        pid = waitpid(-1, &status, 0);
        if(pid > 0)
        {
                printf("child %d is exited\n",pid);
        }

        printf("parent exit now\n");

        return 0;
}

Here test is a simple program; its code is as follows:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc ,char *argv[])
{

        printf("test programme running, pid[%d]\n",getpid());

        if(argc > 1)
        {
                printf("input:%s\n", argv[1]);
        }
        sleep(30);
        return 0;
}

gcc test.c -o test

Compile it, then run the parent program.
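Because execv returns only on failure, any code placed after it is effectively the error path. The sketch below handles that path explicitly (the helper name and the nonexistent path ./no-such-program-for-demo are assumptions for illustration). It borrows the shell convention of exit code 127 for "not found" and 126 for other exec failures.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: tries to exec a program that does not exist and
 * returns the child's exit code as seen by the parent (-1 on error). */
int exec_missing_program(void)
{
    char *const argv[] = {"no-such-program-for-demo", NULL};
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv("./no-such-program-for-demo", argv);
        /* Only reached when execv failed; errno tells us why */
        _exit(errno == ENOENT ? 127 : 126);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Using _exit rather than exit in the failed child avoids flushing stdio buffers that were inherited from the parent a second time.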

vfork

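The child created by vfork shares the parent's memory, including the data segment, and it runs before the parent: the parent is suspended until the child calls execv or exits.

In other words, as long as the child has neither called execv nor terminated, the parent does not run. Only the child is running during that time, and it can access the parent's data directly.

Why does vfork exist? Early implementations of fork had to copy the parent's entire memory image, which was very expensive. Copy-on-write was introduced later, and the cost dropped considerably.

So is vfork still useful? Yes. After fork, parent and child live in separate memory spaces yet see exactly the same virtual addresses, which relies on the hardware's segmentation and paging support (an MMU). On systems without that mechanism, vfork is the only option.

Let's demonstrate.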

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/wait.h>
int global_var = 100;

int main(int argc ,char *argv[])
{
        int pid = -1;
        char *str = NULL;
        int str_len = strlen("I come from parent process.") + 1; /* +1 for the terminating '\0' */
        char *child_argv[] = {"test","I come from parent process.", NULL};
        int status = 0;

        str = malloc(str_len);
        if(NULL == str)
        {
                printf("memory is not enough.\n");
                exit(-1);
        }
        memset(str, 0x00, str_len);
        strncpy(str, "I come from parent process.", str_len);


        pid = vfork();
        if(pid == 0)
        {
                // in child
                printf("here in child ,my pid is %d\n", getpid());

                printf("parent bring info: global_var:%d, str:%s\n", global_var, str);

                sleep(4);
                printf("I'm subprocess %d, will be entering execv\n",getpid());
                execv("./test",child_argv);

                printf("child will exit now.\n");
                _exit(0); /* reached only if execv fails; a vfork child should use _exit */
        }
        else if(pid > 0)
        {
                // in parent
                for(int i=5;i>=0;i--)
                        printf("here in parent, child pid is %d\n", pid);
        }
        else
        {
                // error
                printf("vfork error[%s]\n",strerror(errno));
        }

        /* wait for any one child process to exit */
        pid = waitpid(-1, &status, 0);
        if(pid > 0)
        {
                printf("child %d is exited\n",pid);
        }

        printf("parent exit now\n");

        return 0;
}

here in child ,my pid is 3518145

parent bring info: global_var:100, str:I come from parent process.

I'm subprocess 3518145, will be entering execv

here in parent, child pid is 3518145

here in parent, child pid is 3518145

here in parent, child pid is 3518145

here in parent, child pid is 3518145

here in parent, child pid is 3518145

here in parent, child pid is 3518145

test programme running, pid[3518145]

input:I come from parent process.

In the output, the parent's loop prints only after the child has called execv.
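The demo above calls printf and sleep inside the vfork child, which works on Linux but is stricter than what POSIX promises: the child is only expected to call one of the exec functions or _exit. A more conservative sketch of the vfork-then-exec pattern is shown below; the helper name is hypothetical and /bin/sh is assumed to exist.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "sh -c 'exit 5'" via vfork + execv and
 * returns the exit code collected by the parent (-1 on error). */
int vfork_exec_status(void)
{
    /* Prepare everything before vfork: the child borrows our memory */
    char *const argv[] = {"sh", "-c", "exit 5", NULL};
    int status = 0;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv("/bin/sh", argv);   /* on success the parent resumes here */
        _exit(127);               /* exec failed: leave without touching stdio */
    }

    /* The parent was suspended until the child called execv or _exit */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```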

clone

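clone creates a child process much like fork does, but offers more flexible options: the caller chooses, piece by piece, whether the child shares the parent's execution context, such as the virtual address space, the file descriptor table, and the signal handlers.

Besides creating processes, clone can also create threads, which is what makes it more powerful.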
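As a rough illustration of that flexibility, the sketch below uses the glibc clone wrapper twice: once with CLONE_VM, so the child shares the caller's address space as a thread would, and once without, so the child gets its own copy as with fork. The names child_fn and counter_after_clone and the 64 KiB stack size are assumptions, and the stack is assumed to grow downward, as it does on x86 and ARM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* clone() and the CLONE_* flags are Linux/GNU extensions */
#endif
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

static volatile int shared_counter = 0;

/* Entry point of the child; its return value becomes the exit status. */
static int child_fn(void *arg)
{
    (void)arg;
    shared_counter++;
    return 0;
}

/* Hypothetical helper: returns the caller's view of shared_counter after
 * the child has run (-1 on error). share_vm selects CLONE_VM. */
int counter_after_clone(int share_vm)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    /* SIGCHLD in the flags lets the caller reap the child with waitpid() */
    int flags = SIGCHLD | (share_vm ? CLONE_VM : 0);
    int result = -1;
    pid_t pid;

    if (stack == NULL)
        return -1;

    shared_counter = 0;
    /* Pass the top of the buffer because the stack grows downward */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, NULL);
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        result = shared_counter;

    free(stack);
    return result;
}
```

With CLONE_VM the child's increment is visible to the caller, so the result is 1; without it the child increments a private copy and the caller still sees 0. Real threads combine further flags such as CLONE_FS, CLONE_FILES, CLONE_SIGHAND and CLONE_THREAD, which application code normally leaves to a threading library.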

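Closing

Author email: [email protected]
If you spot any errors or omissions, please point them out so we can learn from each other.

Note: do not repost without permission!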

Five ways to create a process on Linux: fork/vfork/execv/clone
