Linux Process Control in Detail

I. Process overview

1. The concept of a process

Linux is a multiuser, multitasking operating system. Multiuser means that several users can operate the computer at the same time; multitasking means that Linux can carry out several tasks at the same time. A process can be understood, in simple terms, as a running program, so because Linux is multitasking it can run many processes simultaneously.
A process is the smallest unit of resource management in the operating system. It is a dynamic entity: one execution of a program. Processes make it possible to run several programs concurrently, which improves resource utilization and system throughput.

A process has the following characteristics:

  • Dynamic: a process is one execution of a program on the processor, so it is dynamic by nature;
  • Concurrent: several processes can reside in memory and make progress at the same time;
  • Independent: although many processes run in the same physical memory, each runs in its own virtual address space and they do not interfere with each other;
  • Asynchronous: each process advances at its own speed, so processes run asynchronously with respect to one another.

2. Differences between processes, programs, and threads

Processes and programs: a process is dynamic, a program is static. A process is a program in execution, whereas a program is executable code stored on disk.
Processes and threads: so that a computer can do more work at the same time, a process is further divided into threads. A thread lives inside a process and is a smaller basic unit that can run independently. A thread owns almost no resources of its own (essentially only its stack and registers); all threads of a process share the resources that belong to that process, and several threads of the same process can run at the same time.

3. Process identifiers

Linux has a large number of processes running at any moment. How do we find a particular process or query information about it? Every process carries an "identity card": its process ID (PID). Each process is identified by a unique ID, so IDs and processes correspond one to one. The process ID is a non-negative integer. Besides the process ID, each process has several other identifiers; the functions that return them are declared in the unistd.h header file. The following functions return the identifiers of the calling process:

Function declaration      Description
pid_t getpid(void)        Returns the process ID
pid_t getppid(void)       Returns the process ID of the parent process
uid_t getuid(void)        Returns the real user ID of the process
uid_t geteuid(void)       Returns the effective user ID of the process
gid_t getgid(void)        Returns the real group ID of the process
gid_t getegid(void)       Returns the effective group ID of the process
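
As a minimal sketch of how these calls are used, the function below prints every identifier of the calling process. The name print_process_ids is made up for this illustration; the values are cast to long because the exact widths of pid_t, uid_t, and gid_t vary between systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print the identifiers of the calling process and return its process ID.
 * print_process_ids is a hypothetical helper name used for illustration. */
pid_t print_process_ids(void)
{
    pid_t pid = getpid();

    printf("pid=%ld  ppid=%ld\n", (long)pid, (long)getppid());
    printf("uid=%ld  euid=%ld\n", (long)getuid(), (long)geteuid());
    printf("gid=%ld  egid=%ld\n", (long)getgid(), (long)getegid());
    return pid;
}
```

Calling print_process_ids() from main() prints three lines; the numbers differ from run to run and from user to user.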

4. The structure of a process

Although a process can be understood as a running program, a Linux process actually consists of three parts: a code segment, a data segment, and a heap/stack segment.

  • The code segment stores the program's executable code;
  • The data segment stores the program's global variables, constants, and static variables;
  • The heap/stack segment: the heap stores dynamically allocated variables, and the stack is used for function calls and stores function parameters and the local variables defined inside functions.

This brings us to the memory image of a process, that is, how the kernel lays out an executable file in memory. To turn a program into a process, the operating system loads the executable from disk into memory. From low addresses to high addresses the layout is:

  • Code segment: the binary machine code. The code segment is read-only and can be shared by several processes. For example, when a parent process creates a child process, the two share the code segment, while the child receives its own copies of the parent's data segment, heap, and stack.
  • Data segment: stores initialized variables, including initialized global variables and initialized static variables.
  • Uninitialized data segment: stores uninitialized global and static variables; also called the bss segment.
  • Heap: stores the variables that the program allocates dynamically.
  • Stack: used for function calls; stores return addresses, function parameters, and the local variables defined inside functions.
  • The highest addresses hold the command-line arguments and environment variables.

The figure below shows the layout of the memory image:
[Figure: layout of the process memory image]
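
A rough way to see this layout on a running system is to print one address from each region. The sketch below does that; the names initialized_global, uninitialized_static, and print_memory_layout are invented for this example, and the addresses printed depend on the platform and on address space layout randomization, so only their relative order is of interest.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;       /* lives in the data segment */
static int uninitialized_static;   /* lives in the bss segment  */

/* Print one sample address per region; returns 0 on success, -1 on error.
 * All names here are hypothetical and chosen for illustration only. */
int print_memory_layout(void)
{
    int local = 0;                            /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic);   /* lives on the heap  */
    if (dynamic == NULL)
        return -1;

    printf("code  segment: %p\n", (void *)print_memory_layout);
    printf("data  segment: %p\n", (void *)&initialized_global);
    printf("bss   segment: %p\n", (void *)&uninitialized_static);
    printf("heap         : %p\n", (void *)dynamic);
    printf("stack        : %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

On a typical Linux system the printed addresses increase from the code segment towards the stack, although the C language itself makes no such promise.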

Differences between the executable program and the memory image:

  • The executable program resides on disk, whereas the memory image resides in memory;
  • The executable program has no heap or stack, because these are allocated only when the program is loaded into memory;
  • Although the executable program has an uninitialized data segment, that segment is not stored in the executable file on disk;
  • The executable file is static and unchanging, whereas the memory image changes dynamically as the program executes.

5. Process states

A Linux process can be in one of the following states:

  • Running state: the process is running, or is in the run queue waiting to run
  • Interruptible wait state: the process is waiting for an event to complete; it can be woken by a signal or a timer
  • Uninterruptible wait state: the process is waiting for an event to complete; it cannot be woken by a signal or a timer and must wait until the event occurs
  • Stopped state: the process has been stopped by a signal, or it is being traced (a process being traced by a debugger is in this state)
  • Zombie state: the process has terminated, but its process descriptor still exists until the parent process calls wait() and releases it

The ps command shows the current state of processes.

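II. Basic process operations

1. Creating a process

Processes can be created in two ways, in the foreground or in the background; they can also be classified as created by the operating system or created by a parent process.
Processes created by the operating system are peers: there is no resource inheritance between them. At boot the operating system creates a number of processes that manage and allocate system resources; these are the system processes. A process created by an existing process is called a child process, and the process that created it is called the parent process. The child belongs to its parent, and a child can in turn create processes of its own, generation after generation. A child process inherits almost all of its parent's resources.
This section covers processes created in the background, that is, by a parent process. Three groups of functions are generally involved:

  • the fork() function
  • the vfork() function
  • the exec() family of functions (used after fork() or vfork() to run a different program)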

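Creating a process with fork()

Using fork() requires the sys/types.h and unistd.h header files. The return type is pid_t, a signed integer type. In the parent process fork() returns the ID of the child; in the child process it returns 0; on failure it returns -1 and no child is created.

A child process is normally created with fork() (creating a process is often called "forking" a process). The name says what the function does: it forks a child process off the current process, in other words it creates a new process.
Once the child exists, parent and child compete for the CPU. Which of them runs first is up to the scheduler and must not be relied on; on a single CPU the other one waits in the run queue.
What sets fork() apart from other functions is that it is called once but returns twice.
Ordinary functions return once per call, or not at all. fork() appears to return twice because it creates a new process as a copy of the calling one. A process is a running program, and after the call both processes continue from the point where fork() returns, so the single call returns once in the parent and once in the child.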
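After the current process calls fork(), the code effectively splits into two parts:

  • The code before the fork() call runs only in the current process
  • The code after the fork() call runs in the current process and runs once more in the child. Note that the effects of the code before fork() are preserved in the child: all data of the current process is copied into the child, so the two processes execute the second half of the code without interfering with each other

The following code illustrates this:
(It was easier to compile under Linux, so screenshots are used here; please bear with them.)
[Figure: source code of the fork() example]
The program's output is:
[Figure: output of the fork() example]
The code falls into two parts, which are easier to explain separately:
First part: it defines the variable k and calls fork() to create the process. k is used to verify that the second part runs independently in the current process and in the child, and that the child inherited the parent's data.
Second part: it handles the return value of fork(), using a conditional to tell whether the code is running in the current process or in the child. If the return value is 0, the code is running in the child: getpid() returns the child's ID (the calling process is the child) and getppid() returns the parent's ID. If the return value is greater than 0, the code is running in the parent and the returned pid is the child's ID, so getpid() returns the parent's ID.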
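
Since the screenshots did not survive, here is a minimal sketch of a program that matches the description above. It is not the author's original code: the function name fork_demo, the use of the child's exit status to report its k, and the waitpid() call that makes the parent print last are all assumptions added for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's final value of k, or -1 on error. */
int fork_demo(void)
{
    int k = 1;                       /* first part: set up data, then fork */

    fflush(stdout);                  /* keep buffered output out of the child */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    k++;                             /* second part: runs in both processes,
                                        each on its own copy of k */
    if (pid == 0) {
        /* Child: getpid() is the child's ID, getppid() the parent's. */
        printf("child : pid=%ld ppid=%ld k=%d\n",
               (long)getpid(), (long)getppid(), k);
        fflush(stdout);
        _exit(k);                    /* report the child's k to the parent */
    }

    /* Parent: pid holds the child's ID, getpid() is the parent's own ID. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    printf("parent: pid=%ld child=%ld k=%d\n",
           (long)getpid(), (long)pid, k);

    /* Both copies of k were incremented exactly once. */
    if (!WIFEXITED(status) || WEXITSTATUS(status) != k)
        return -1;
    return k;
}
```

Called from main(), fork_demo() prints one line from the child and one from the parent, both with k=2, which shows that each process incremented its own copy.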

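Creating a process with vfork()

vfork() is called in the same way as fork() and also creates a child process.
The difference is that a child created with vfork() shares the parent's memory space instead of receiving a copy, and the parent is suspended until the child calls _exit() or one of the exec() functions. As a consequence, when the child changes a variable, the value seen by the parent changes too.
The execution order after vfork() is therefore fixed: the child runs first, then the parent.
The code is almost the same as the previous example, with minor changes:
[Figure: source code and output of the vfork() example]

The obvious difference from the previous run is the value of k: 2 in the child and 3 in the parent. The child runs first and its k++ makes k equal to 2; because the child and the parent share the variable, the parent's own k++ then makes it 3. Note that POSIX leaves the behavior undefined when a vfork() child modifies any data other than the variable holding vfork()'s return value, so this result illustrates what Linux happens to do and should not be relied on.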
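
For comparison, the sketch below shows a use of vfork() that stays within what POSIX allows: the child does nothing except call _exit(), and the parent, which was suspended in the meantime, collects the exit code. Unlike the screenshot example it deliberately does not touch k in the child. The function name vfork_demo and the exit code 7 are arbitrary choices for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit code (7 in this sketch), or -1 on error. */
int vfork_demo(void)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0)
        _exit(7);        /* child: only _exit() or an exec function is safe;
                            the parent stays suspended until this call */

    /* Parent: resumes only after the child has exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In real programs the child usually calls one of the exec() functions at this point instead of _exit().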

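The exec() family of functions

A child created with fork() or vfork() runs code copied from its parent, and running the same code twice is not very useful. That is what the exec() functions are for. Note that exec is a family of functions, not a single one.
To have the child run a different program, use one of the exec() functions.
An exec call does not create a new process, however. When a process calls exec, the system replaces its code with the code of the new program, discards the old data segment and heap/stack segment, and allocates new ones for the new program. The process ID, among other process attributes, is retained. To the system it is still the same process, only running a different program. You can think of exec as rewriting the program of the process.
On Linux the exec family has six forms, declared in the unistd.h header file: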

#include <unistd.h>
int execl(const char *path, const char *arg, ... /*, (char *) NULL */);
int execlp(const char *file, const char *arg, ... /*, (char *) NULL */);
int execle(const char *path, const char *arg, ... /*, (char *) NULL, char *const envp[] */);
int execv(const char *path, char *const argv[]);
int execve(const char *path, char *const argv[], char *const envp[]);
int execvp(const char *file, char *const argv[]);

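These functions are provided by the system library. Include unistd.h before using them (sys/types.h is commonly included alongside it). A program that needs to access the environment variables directly also declares the external global variable environ, for example: extern char **environ;

Environment variables: to make the shell more flexible to use, Linux introduces environment variables. They hold things such as the user's home directory, the terminal type, and the current directory; together they define the user's working environment, hence the name.

environ points to the environment of the current process, a NULL-terminated array of strings of the form NAME=value. One of these variables, PATH, is what allows a program to be started by name alone, just as the shell runs vim or gcc without the full path being typed.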

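Because the names of these functions are so similar and easy to mix up, here are the naming rules of the exec family (every name starts with exec):

-p
The letter p stands for PATH. When the function name contains p, the argument does not have to be the full path of the file; the file name alone is enough, because the function searches the directories listed in the PATH environment variable.

-l
The letter l stands for list: each command-line argument of the new program is passed as a separate argument of the call. The number of arguments is not fixed, but the list must end with a NULL argument to mark the end.

-v
The letter v stands for vector: the arguments are passed as an array of pointers, and the last element of the array must be NULL to mark the end.

-e
The letter e stands for environment: the function accepts a new set of environment variables.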
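
The naming rules can be tried out side by side. In the sketch below each function forks a child that runs /bin/sh with a different exec variant and returns the child's exit code. The helper wait_exit_code, the demo_* function names, the shell commands, and the variable CODE are all invented for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for the child and return its exit code, or -1 on error. */
static int wait_exit_code(pid_t pid)
{
    int status;
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* l + p: arguments given as a list, program located through PATH. */
int demo_execlp(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        execlp("sh", "sh", "-c", "exit 5", (char *)NULL);
        _exit(127);                 /* reached only if exec failed */
    }
    return wait_exit_code(pid);
}

/* v: arguments given as a NULL-terminated array, full path required. */
int demo_execv(void)
{
    char *argv[] = { "sh", "-c", "exit 6", NULL };
    pid_t pid = fork();
    if (pid == 0) {
        execv("/bin/sh", argv);
        _exit(127);
    }
    return wait_exit_code(pid);
}

/* l + e: argument list followed by a new environment for the program. */
int demo_execle(void)
{
    char *envp[] = { "CODE=7", NULL };
    pid_t pid = fork();
    if (pid == 0) {
        execle("/bin/sh", "sh", "-c", "exit $CODE", (char *)NULL, envp);
        _exit(127);
    }
    return wait_exit_code(pid);
}
```

The three functions return 5, 6, and 7 respectively; in the last case the value comes from the environment handed to execle(), not from the parent's own environment.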

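We start with execve(), because on Linux the other five functions end up calling execve(). Its name contains both v and e. Below, the execve.c program uses execve() to run the program built from new.c:

execve.c program
[Figure: source code of execve.c]
new.c program
[Figure: source code of new.c]
Result:
[Figure: output of running execve.c]
The example creates a child process and calls execve() in the child to run another file. As the result shows, the rest of the original code after the execve() call is not executed: execve() replaced the code segment, data segment, and heap/stack segment of the process, so the new child only runs the code of the new program, and the parent's and the child's code no longer have anything to do with each other. Once execve() succeeds, the code that was there before is gone.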
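
The screenshots are not available, so the sketch below is a stand-in rather than the author's execve.c and new.c. It forks a child, replaces the child with /bin/sh through execve(), and lets the shell print its own PID ($$) into a pipe. The parent then checks that this PID equals the one fork() returned, which shows that the process stays the same while its program changes. The function name execve_keeps_pid, the pipe, and the shell command are assumptions made for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the exec'ed program reports the PID that fork() returned,
 * 0 if it reports a different one, -1 on error. */
int execve_keeps_pid(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        char *argv[] = { "sh", "-c", "echo $$", NULL };
        char *envp[] = { "PATH=/bin:/usr/bin", NULL };

        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);   /* the new program writes to the pipe */
        close(fds[1]);
        execve("/bin/sh", argv, envp);
        _exit(127);                    /* reached only if execve() failed */
    }

    /* Parent: read what the new program printed, then reap the child. */
    close(fds[1]);
    char buf[32] = "";
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || n <= 0)
        return -1;
    return strtol(buf, NULL, 10) == (long)pid;
}
```

The _exit(127) line after execve() plays the role of "the rest of the original code": it runs only when the replacement did not happen.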

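2. Waiting for a process

Waiting synchronizes a parent process with its children. The parent usually calls wait() to wait until a child has finished.
wait() suspends the parent until one of its children terminates. Its return value is the PID of the child that terminated, or -1 on error.
waitpid() can wait for one specific child.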
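
A small sketch of both calls, assuming the caller has no other children: two children exit with different codes, waitpid() picks out the second one by PID, and wait() then collects whichever child is left. The names spawn_child and wait_demo and the exit codes 1 and 2 are made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child that exits at once with exit_code; returns its PID or -1. */
static pid_t spawn_child(int exit_code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(exit_code);
    return pid;
}

/* Returns 0 if both children were collected as expected, -1 otherwise. */
int wait_demo(void)
{
    pid_t first = spawn_child(1);
    if (first < 0)
        return -1;
    pid_t second = spawn_child(2);
    if (second < 0) {
        waitpid(first, NULL, 0);     /* do not leave a zombie behind */
        return -1;
    }

    int status;
    int ok = 1;

    /* waitpid(): wait for one specific child, here the second one. */
    if (waitpid(second, &status, 0) != second ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 2)
        ok = 0;

    /* wait(): wait for any child; only the first one is left. */
    if (wait(&status) != first ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 1)
        ok = 0;

    return ok ? 0 : -1;
}
```

The macros WIFEXITED and WEXITSTATUS decode the status word: the first tells whether the child ended through exit() or _exit(), the second extracts its exit code.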

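3. Terminating a process

To end a process, call exit() or _exit().

exit()
This function never returns to its caller, so it has no return value on success or failure and reports no error. It is declared in stdlib.h.

_exit()
Like exit(), this function never returns to its caller, has no return value, and reports no error. It is declared in unistd.h.

Note that exit() flushes the stdio buffers before ending the process, whereas _exit() ends the process immediately without flushing them. For a child created with vfork(), therefore, use only _exit(): the child shares the parent's memory, and calling exit() in the child would flush and tear down stdio state that belongs to the parent, which can corrupt or lose the parent's data.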
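
The buffering difference can be observed with a pipe. In the sketch below the child writes "hello" to stdout without a newline, so the text sits in the stdio buffer, and then ends with either exit() or _exit(); the parent counts how many bytes actually arrive. The function name bytes_flushed and the test string are assumptions for this illustration, and the sketch assumes stdout has not been switched to unbuffered mode.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of bytes that reach the pipe, or -1 on error. */
long bytes_flushed(int use_exit)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    fflush(stdout);                 /* start the child with an empty buffer */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        fputs("hello", stdout);     /* no newline: stays in the stdio buffer */
        if (use_exit)
            exit(0);                /* flushes the buffer, then terminates */
        _exit(0);                   /* terminates at once, buffer is lost  */
    }

    /* Parent: count everything the child managed to write. */
    close(fds[1]);
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

bytes_flushed(1) returns 5 because exit() writes the buffered text out, while bytes_flushed(0) returns 0 because _exit() discards it.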

Origin blog.csdn.net/qq_43743762/article/details/100857415