Linux process basics

At bottom, a computer can only do very simple things, such as adding two numbers or fetching the value stored at an address in memory. The most basic of these actions is called an instruction. A program is a series of such instructions. Through programs, we can make the computer perform complex operations. Most of the time a program is stored as an executable file. Such an executable file is like a recipe: the computer follows it to cook a delicious meal.

 

So what is the difference between a program and a process?

 

A process is a concrete execution of a program. A recipe on its own is of no use; to get a dish, someone has to actually follow it step by step. A process is the act of executing a program, just as cooking is the act of following a recipe. The same program can be executed many times, each time loaded into its own separate region of memory, which gives multiple processes. Different processes can also have their own separate IO interfaces.
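To make "one program, several processes" concrete, here is a minimal sketch in Python. The one-line PROGRAM and the helper name run_twice are made up for illustration; the sketch only assumes a Python interpreter is available as sys.executable.

```python
import subprocess
import sys

# Hypothetical one-line "program": print the PID of the process running it.
PROGRAM = "import os; print(os.getpid())"


def run_twice():
    """Start the same program twice and return the two PIDs it reports."""
    procs = [
        subprocess.Popen([sys.executable, "-c", PROGRAM],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(2)
    ]
    # Both processes were started before either is collected,
    # so they are two separate processes running the same program.
    return [int(p.communicate()[0]) for p in procs]


if __name__ == "__main__":
    first, second = run_twice()
    print("same program, two processes:", first, second)
```

The two numbers differ on every run, because each execution of the program is its own process with its own PID.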

 

An important job of the operating system is to serve processes, for example by allocating memory for them and keeping track of process information. It is as if the operating system had prepared a well-equipped kitchen for us to cook in.

 

Viewing processes

 

First, we can use the ps command to list the running processes, for example $ ps -eo pid,comm,cmd, whose output is described below.

 

(The -e flag lists all processes; -o pid,comm,cmd says that we want the PID, COMMAND, and CMD columns.)

 

 

Each row represents a process and has three columns. The first column, PID (process ID), is an integer. Every process has a unique PID that identifies it, and a process can refer to other processes by their PIDs. The second column, COMMAND, is the short name of the process. The third column, CMD, is the program the process is running, together with the arguments it was started with.

 

(Some entries in the third column are enclosed in square brackets []. These are parts of the kernel dressed up as processes so that the operating system can manage them more easily. We will not consider them here.)
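On Linux the same three columns can be read from the /proc filesystem, which is where ps gets its information. The following is a rough sketch rather than a reimplementation of ps: it assumes /proc is mounted, and list_processes is a hypothetical helper name.

```python
import os


def list_processes():
    """Return (pid, comm, cmd) tuples, roughly like `ps -eo pid,comm,cmd`."""
    rows = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():          # only numeric directories are processes
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                comm = f.read().strip()
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                args = f.read().split(b"\0")   # arguments are NUL-separated
        except OSError:
            continue                     # the process exited or is not readable
        cmd = " ".join(a.decode(errors="replace") for a in args if a)
        # An empty command line usually means a kernel thread;
        # ps prints such entries as [name].
        rows.append((int(entry), comm, cmd or f"[{comm}]"))
    return sorted(rows)


if __name__ == "__main__":
    for pid, comm, cmd in list_processes():
        print(pid, comm, cmd)
```

The bracketed fallback mirrors what ps prints when a process has no command line of its own.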

 

Look at the first row: its PID is 1 and its name is init. This process was created by executing the file (program) /bin/init. When Linux boots, init is the first process the system creates, and it stays alive until we shut the computer down. This process is especially important, and we will keep coming back to it.

 

How to create a process

 

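In fact, when the computer boots, the kernel creates only one process, init. The Linux kernel provides no system call that builds a new process from scratch. Every other process descends, directly or indirectly, from init through the fork mechanism: a new process is obtained by an existing process copying itself, and that copying is fork. fork is a system call. Processes live in memory, and each process has a region of memory of its own (its address space). When a process forks, Linux sets up a new address space for the new process and copies the contents of the old process's space into it (in practice the copy is made lazily, page by page, using copy-on-write). From then on the two processes run side by side.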

 

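The old process becomes the parent process of the new one, and the new process is correspondingly the child process of the old one. Besides its PID, a process also has a PPID (parent PID), which stores the PID of its parent. If we keep following PPIDs upward, we always end up at init. All processes therefore form a tree rooted at init.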
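Following PPIDs upward can be sketched in a few lines of Python. This is a Linux-only illustration that assumes /proc is mounted; parent_of and ancestry are hypothetical helper names.

```python
import os


def parent_of(pid):
    """Read the PPID of `pid` from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("PPid:"):
                return int(line.split()[1])
    raise ValueError(f"no PPid line for process {pid}")


def ancestry(pid):
    """Return [pid, parent, grandparent, ...], stopping when the PPID is 0."""
    chain = [pid]
    while True:
        ppid = parent_of(chain[-1])
        if ppid == 0:                    # no visible parent: top of the tree
            return chain
        chain.append(ppid)


if __name__ == "__main__":
    print(" -> ".join(str(p) for p in ancestry(os.getpid())))
```

On an ordinary system the last number printed is 1. Inside a container the chain may stop earlier, because the real parent can live outside the container's view of the process tree.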

 

Below, we list the processes under the current shell:

 

root@vamei:~# ps -o pid,ppid,cmd
 PID  PPID CMD
16935  3101 sudo -i
16939 16935 -bash
23774 16939 ps -o pid,ppid,cmd

 

We can see that the second process, bash, is a child of the first process, sudo, and that the third process, ps, is a child of the second.
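The parent and child relationships in the listing above can also be checked mechanically. The sketch below copies the three rows as text and follows the PPID column; parse and chain are hypothetical helper names.

```python
# The listing shown above, copied as text.
LISTING = """\
  PID  PPID CMD
16935  3101 sudo -i
16939 16935 -bash
23774 16939 ps -o pid,ppid,cmd
"""


def parse(listing):
    """Map each PID to (PPID, command) from `ps -o pid,ppid,cmd` output."""
    table = {}
    for line in listing.splitlines()[1:]:      # skip the header row
        pid, ppid, cmd = line.split(None, 2)   # the command may contain spaces
        table[int(pid)] = (int(ppid), cmd)
    return table


def chain(table, pid):
    """Follow PPIDs upward for as long as the parent is in the table."""
    commands = []
    while pid in table:
        ppid, cmd = table[pid]
        commands.append(cmd)
        pid = ppid
    return commands


if __name__ == "__main__":
    print(" <- ".join(chain(parse(LISTING), 23774)))
```

Starting from ps (PID 23774), the chain passes through -bash and ends at sudo -i, whose own parent (PID 3101) is not part of the listing.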

 

We can also use the $pstree command to display the whole process tree:

 

init─┬─NetworkManager─┬─dhclient
    │                └─2*[{NetworkManager}]
    ├─accounts-daemon───{accounts-daemon}
    ├─acpid
    ├─apache2─┬─apache2
    │         └─2*[apache2───26*[{apache2}]]
    ├─at-spi-bus-laun───2*[{at-spi-bus-laun}]
    ├─atd
    ├─avahi-daemon───avahi-daemon
    ├─bluetoothd
    ├─colord───2*[{colord}]
    ├─console-kit-dae───64*[{console-kit-dae}]
    ├─cron
    ├─cupsd───2*[dbus]
    ├─2*[dbus-daemon]
    ├─dbus-launch
    ├─dconf-service───2*[{dconf-service}]
    ├─dropbox───15*[{dropbox}]
    ├─firefox───27*[{firefox}]
    ├─gconfd-2
    ├─geoclue-master
    ├─6*[getty]
    ├─gnome-keyring-d───7*[{gnome-keyring-d}]
    ├─gnome-terminal─┬─bash
    │                ├─bash───pstree
    │                ├─gnome-pty-helpe
    │                ├─sh───R───{R}
    │                └─3*[{gnome-terminal}]

 

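fork is usually called as a function. The function returns twice: it returns the child's PID to the parent process and 0 to the child process. The child can always look up its own PPID to find out who its parent is, so a parent and its child can each find the other at any time.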
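The double return is easiest to see by running it. Here is a minimal sketch using Python's os.fork, which wraps the fork system call on POSIX systems; fork_once is a hypothetical helper name, and the pipe is only there so the child can report back to the parent.

```python
import os


def fork_once():
    """Fork once; return (child PID, PPID reported by child, parent PID)."""
    parent_pid = os.getpid()
    read_end, write_end = os.pipe()
    pid = os.fork()                      # returns once in each process
    if pid == 0:
        # Child: fork returned 0. Report our PPID through the pipe.
        try:
            os.close(read_end)
            os.write(write_end, str(os.getppid()).encode())
        finally:
            os._exit(0)                  # never fall through into parent code
    # Parent: fork returned the child's PID.
    os.close(write_end)
    with os.fdopen(read_end) as f:
        ppid_seen_by_child = int(f.read())
    os.waitpid(pid, 0)                   # collect the finished child
    return pid, ppid_seen_by_child, parent_pid


if __name__ == "__main__":
    child, reported, me = fork_once()
    print("parent", me, "got child PID", child, "and child saw PPID", reported)
```

The parent learns the child's PID from the return value, and the child learns the parent's PID from getppid, so each side can name the other.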

 

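After calling fork, a program usually branches with an if statement. When the returned PID equals 0, the process is the child, so it is made to execute certain instructions, for example using an exec library function to read another program file and run it in the current process space (this is in fact one of the main reasons we use fork: to create a process for some program). When the returned PID is a positive integer, the process is the parent, and it executes other instructions. In this way, once the child has been created, it can be made to do something different from its parent.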
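The if structure described above might look like the following in Python. This is a sketch under the assumption of a POSIX system; spawn is a hypothetical helper name, and exit code 127 for a failed exec is borrowed from shell convention rather than taken from the text.

```python
import os
import sys


def spawn(argv):
    """Run the program argv[0] in a new process and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the parent with the new program.
        try:
            os.execvp(argv[0], argv)     # does not return on success
        finally:
            os._exit(127)                # reached only if exec failed
    # Parent: fork returned a positive PID, so wait for that child.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    code = spawn([sys.executable, "-c", "print('hello from the new program')"])
    print("child exited with", code)
```

The child keeps the PID it received from fork; exec only swaps the program running inside that process.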

 

Termination of a child process

 

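When a child process terminates, it notifies its parent (through the SIGCHLD signal), releases the memory it occupied, and leaves its exit information in the kernel (an exit code: 0 if it ran successfully, an integer greater than 0 if there was an error or an abnormal condition). This information explains why the process exited. Once the parent learns that the child has terminated, it is responsible for calling the wait system call on that child. wait retrieves the child's exit information from the kernel and frees the space that information occupied there. If the parent terminates before the child, however, the child becomes an orphan process. An orphan is handed over to init, which becomes its new parent, and init takes care of calling wait when that child terminates.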
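Adoption of an orphan can be observed directly. In the sketch below a middle process forks a grandchild and exits at once; the grandchild then watches its PPID change. It assumes a POSIX system, and orphan_new_parent is a hypothetical helper name. On a classic system the new parent is PID 1; on newer Linux setups it may instead be another designated "subreaper" process, so the sketch only reports the new PPID without assuming its value.

```python
import os
import time


def orphan_new_parent():
    """Create an orphan; return (PID of its dead parent, PPID after adoption)."""
    read_end, write_end = os.pipe()
    middle = os.fork()
    if middle == 0:
        middle_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the middle process is gone.
            try:
                os.close(read_end)
                deadline = time.time() + 5
                while os.getppid() == middle_pid and time.time() < deadline:
                    time.sleep(0.01)
                os.write(write_end, str(os.getppid()).encode())
            finally:
                os._exit(0)
        os._exit(0)                      # middle process exits straight away
    os.close(write_end)
    os.waitpid(middle, 0)                # we wait only for our own child
    with os.fdopen(read_end) as f:
        return middle, int(f.read())


if __name__ == "__main__":
    old, new = orphan_new_parent()
    print("orphan's parent changed from", old, "to", new)
```

Note that the top-level process never waits for the grandchild: it is not the grandchild's parent, so that duty falls to whichever process adopted it.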

 

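Of course, a badly written program can leave a child's exit information stranded in the kernel (the parent never calls wait on the child). In that case the child becomes a zombie process. When a large number of zombies pile up, they take up space: each one still holds a PID and an entry in the kernel's process table.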
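A zombie is easy to produce on purpose. The sketch below is Linux-only and assumes /proc is mounted; state_of and make_zombie are hypothetical helper names. The child exits immediately, the parent delays its wait long enough to see the zombie state, and then reaps it.

```python
import os
import time


def state_of(pid):
    """Return the one-letter state from /proc/<pid>/status, e.g. 'Z'."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("State:"):
                return line.split()[1]
    return None


def make_zombie():
    """Return (state before wait, exit code, /proc entry exists after wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(5)                      # child ends; nobody has waited yet
    # Poll briefly until the kernel marks the child as a zombie.
    deadline = time.time() + 5
    while state_of(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)
    before = state_of(pid)
    _, status = os.waitpid(pid, 0)       # reap: the kernel drops the record
    return before, os.WEXITSTATUS(status), os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    print(make_zombie())
```

Between the child's exit and the parent's wait, the child holds no memory of its own but still has a PID and its exit code on record; the wait call is what finally removes it.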

 

Processes and threads

 

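In UNIX, processes and threads are related but distinct things. In Linux, however, a thread is just a special kind of process. Multiple threads can share memory space and IO interfaces. The process is therefore the only way a program is ever carried out on Linux.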
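The difference in sharing can be shown with a small experiment. This is a sketch assuming Linux and Python 3.8 or later (for threading.get_native_id); compare is a hypothetical helper name. Two threads append to a list in the shared memory of one process, while a forked child appends only to its own copy.

```python
import os
import threading


def compare():
    """Return the list as the parent sees it after threads and a fork ran."""
    shared = []                          # lives in this process's memory

    def worker():
        # Same PID as the rest of the process, but its own kernel thread ID.
        shared.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    pid = os.fork()
    if pid == 0:
        shared.append("child only")      # modifies the child's copy only
        os._exit(0)
    os.waitpid(pid, 0)
    return shared


if __name__ == "__main__":
    for item in compare():
        print(item)
```

The parent sees exactly the two entries written by its threads. Each entry carries the process's PID alongside a different kernel-assigned thread ID, which is the sense in which a Linux thread is a process that shares its memory with others.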

 


Origin blog.csdn.net/luoyajingfeng2/article/details/82848715