[Linux Basics Introduction] (13) Linux process concept

1 Introduction

Content

  • On Linux, a program will inevitably stop responding at some point, and a few commands can help keep the system running smoothly. Before using them, we need a basic understanding of processes so that we can use the tools Linux provides more effectively.

Knowledge points

  • Programs and processes
  • Process derivation
  • Job management

2 Understanding of concepts

First of all, what are programs and processes, and what is the difference between them?

Program: Loosely speaking, a program is a series of instructions with a logical, sequential structure that achieves a certain result when executed. It is like going to a restaurant and ordering a beef rice bowl: the waiter follows the recipe for a beef rice bowl, and in the end we get one. A program has to be executed; otherwise it is like a martial arts manual that just sits on a shelf waiting for someone to read it.

Process: A process is one execution of a program on a set of data. In early UNIX, and in Linux 2.4 and earlier, it was the independent basic unit of system resource allocation and scheduling. In the restaurant example, the recipe for the beef rice bowl is the program, while cooking the rice, making the beef broth, combining the two, and bringing the bowl to the table are each a process. It is like actually opening the martial arts manual and practicing it chapter by chapter.

Simply put, a program is a piece of software designed to accomplish a certain task; vim, for example, is a program. A process is a running program.

A program is just a collection of instructions, a static entity. A process, by contrast, has the following characteristics:

  • Dynamic: A process is essentially one execution of a program, and it goes through state changes such as creation and termination. A program is a static entity.
  • Concurrency: Multiple processes can make progress within the same period of time. A program is just a static entity, so concurrency does not apply to it.
  • Independence: A process is allocated resources, scheduled, and run independently.
  • Asynchrony: Each process advances at its own unpredictable speed.
  • Structure: A process consists of a code segment, a data segment, and a PCB (process control block, the unique marker that the process exists). This structure is what allows a process to run independently.

Concurrency: Within a period of time, multiple programs appear active at the macro level and take turns executing (only one executes at any given instant, but several have made progress over the period).

Parallelism: At every instant, multiple programs are executing at the same time. This requires multiple CPUs or CPU cores.

The process itself is not the basic unit of execution but a container for threads. It is like a department that is subdivided into working groups (threads): the resources a working group needs are requested through the level above it (the process).

A thread is the smallest unit that the operating system can schedule. It lives inside a process and is the unit that actually executes. A thread is a single sequential flow of control within a process. A process can contain multiple concurrent threads, each carrying out a different task. Because a thread owns almost no system resources of its own, creating threads and switching between them is cheaper than doing the same with processes.

In short, a running program has at least one process, and a process has at least one thread. Threads are a finer-grained unit than processes, which lets multithreaded programs achieve high concurrency. In addition, each process has its own independent memory space while it runs, whereas the threads of a process share memory, which can greatly improve a program's efficiency.

3 Process attributes

3.1 Classification of processes

According to its function and whom it serves, a process is either a user process or a system process:

  • User process: A process created by running a user program, an application, or a system program outside the kernel. The user can start and stop such processes.
  • System process: A process created by running kernel code to carry out relatively low-level tasks such as memory allocation and process switching. Its operation is not subject to user intervention; even the root user cannot interfere with a system process.

According to the kind of service it provides, a process is an interactive process, a batch process, or a daemon:

  • Interactive process: A process started from a shell in a terminal. It needs to interact with the user while it runs, and it can run in the foreground or in the background.
  • Batch process: A collection of processes that is responsible for starting other processes in sequence.
  • Daemon: A daemon is a process that runs all the time. It typically starts when the Linux system boots and terminates when the system shuts down. Daemons are independent of any controlling terminal and either perform tasks periodically or wait for events to handle. For example, the httpd process runs continuously, waiting for user requests. Another frequently used daemon is cron (crond on the CentOS family), the daemon behind crontab, which periodically runs tasks configured by the user.

3.2 Process derivation

Any discussion of parent and child processes involves two system calls: fork() and exec().

fork() is a system call whose main function is to create a new process from the current one. The new process is the child of the calling process. Apart from its PID and the value returned by fork(), the child is an exact copy of the parent: code segment, memory contents, file descriptors, register state, and so on. fork() is called once but returns twice: it returns zero in the child process and the child's PID in the parent process.

exec() (in practice a family of functions such as execve() and execvp()) is also a system call. Its role is to switch the program being executed in the child process, that is, to replace the code segment and data segment copied from the parent process.
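As an illustration of how fork() and exec() are usually combined, here is a minimal sketch. The helper name run_program is hypothetical and not part of any library, and the choice of execvp() and waitpid() is an assumption about one common way to do it: the parent forks, the child replaces itself with another program, and the parent waits for the child's exit code.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up in PATH) in a child process
 * and returns its exit code, or -1 if it could not be run to completion. */
int run_program(char *const argv[])
{
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                /* CHILD: replace the copied code and data with the new program. */
                execvp(argv[0], argv);
                _exit(127);     /* reached only if execvp() failed */
        }
        /* PARENT: wait until the child has finished. */
        if (waitpid(p, &status, 0) == -1 || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}
```

To try it, call run_program() from a small main() with a NULL-terminated argument array such as { "sh", "-c", "exit 3", NULL }. The value 127 for a failed exec is only a convention borrowed from shells.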

The child process is a copy that the parent process produces through the fork() system call. fork() directly copies the parent's PCB and other process data structures and changes only the PID, so the two processes are identical and only diverge after exec() is executed. Early implementations of fork() consumed a lot of resources, which led to the later vfork(), a much faster variant.

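The basic logic looks like this:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
        pid_t p = fork();

        if (p == (pid_t) -1)
                perror("fork");                 /* ERROR: no child was created */
        else if (p == 0)
                printf("child: pid %d\n", (int) getpid());     /* CHILD */
        else
                printf("parent: child pid %d\n", (int) p);     /* PARENT */
        return 0;
}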
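The statement that the child starts as a copy of the parent can be checked with a small experiment. The sketch below is illustrative only (fork_copy_demo is a made-up name): the child changes a variable and reports the new value through its exit status, while the parent's copy of the same variable stays untouched.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's value of `value` after the child
 * has changed its own copy, or -1 on error. *child_value receives the value
 * the child ended up with. */
int fork_copy_demo(int *child_value)
{
        int value = 1;
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                value = 2;      /* modifies the child's copy only */
                _exit(value);   /* report it through the exit status */
        }
        if (waitpid(p, &status, 0) == -1 || !WIFEXITED(status))
                return -1;
        *child_value = WEXITSTATUS(status);
        return value;           /* the parent's copy was never changed */
}
```

Called from main(), the function should hand back the parent's original value while *child_value holds the value the child wrote.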

When a child process terminates normally, its main() function executes exit(n); or return n;, where n is the exit status, and the system sends the SIGCHLD signal to the parent process. If the child terminates abnormally, the cause is usually a signal it received.

At that point the child has finished executing its code and has returned almost all of its system resources to the system. However, its process control block (PCB) still resides in memory. Because the PCB is the unique marker of a process (it holds the PID and other information), the process still exists and has not fully died. Such a process is called a zombie process.

Under normal circumstances, the parent process is notified by the SIGCHLD signal and then uses the wait(&status) system call to obtain the child's exit status, which records the exit code and the reason for termination. The kernel can then release the PCB of the terminated child from memory. If the parent never does this, the child's PCB stays in memory and the child remains in the system as a zombie process.
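To make the two pieces of information in the exit status concrete, the following sketch decodes it with the standard WIFEXITED/WEXITSTATUS and WIFSIGNALED/WTERMSIG macros. The function child_outcome and its parameters are invented for this illustration, and waitpid() is used in place of wait() only so that the call targets one specific child.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child either exits with `code` or, if `sig` is
 * non-zero, terminates itself with that signal. Returns the exit code or
 * the signal number; *by_signal says which of the two it is. */
int child_outcome(int code, int sig, int *by_signal)
{
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                if (sig != 0)
                        raise(sig);     /* abnormal end: killed by a signal */
                _exit(code);            /* normal end with exit status `code` */
        }
        if (waitpid(p, &status, 0) == -1)
                return -1;
        if (WIFEXITED(status)) {        /* reason: normal termination */
                *by_signal = 0;
                return WEXITSTATUS(status);
        }
        if (WIFSIGNALED(status)) {      /* reason: terminated by a signal */
                *by_signal = 1;
                return WTERMSIG(status);
        }
        return -1;
}
```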

Although a zombie process has given up almost all of its memory, has no executable code, and cannot be scheduled, it still occupies a slot in the process list that records its exit status and other information for the parent to collect. The number of PIDs available on a Linux system is limited, so if a large number of zombie processes accumulate, the system will be unable to create new processes because no PID is available.
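A zombie can be observed directly. The sketch below is Linux-specific and rests on two assumptions: that /proc is mounted, and that the third field of /proc/<pid>/stat is the state letter (Z for a zombie). The names proc_state and zombie_demo are made up. waitid() with WNOWAIT is used to wait until the child has exited without reaping it, so the zombie stays around long enough to be inspected.

```c
#define _POSIX_C_SOURCE 200809L   /* expose waitid() and WNOWAIT */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the state letter from /proc/<pid>/stat ('R', 'S', 'Z', ...),
 * or '?' if the entry cannot be read. */
static char proc_state(pid_t pid)
{
        char path[64], state = '?';
        FILE *f;

        snprintf(path, sizeof path, "/proc/%d/stat", (int) pid);
        f = fopen(path, "r");
        if (f == NULL)
                return '?';
        /* Layout: pid (comm) state ...; assumes comm contains no ')'. */
        if (fscanf(f, "%*d (%*[^)]) %c", &state) != 1)
                state = '?';
        fclose(f);
        return state;
}

/* Creates a zombie and records its state before and after it is reaped. */
int zombie_demo(char *before, char *after)
{
        siginfo_t info;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0)
                _exit(0);               /* the child ends immediately */
        /* Block until the child has exited, but leave it unreaped. */
        if (waitid(P_PID, (id_t) p, &info, WEXITED | WNOWAIT) == -1)
                return -1;
        *before = proc_state(p);        /* the child is a zombie here */
        if (waitpid(p, NULL, 0) == -1)  /* reap it: the kernel frees the entry */
                return -1;
        *after = proc_state(p);         /* the /proc entry should be gone */
        return 0;
}
```

The same state letter appears in the STAT column of ps, which is usually the quicker way to spot zombies on a live system.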

In addition, if the parent process ends first (for example, abnormally) without having reaped a child that is still running, that child is called an orphan process. On Linux, orphan processes are generally "adopted" by the init process and become its children. init takes care of cleaning up after them, so they do no harm.
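The adoption of an orphan can be watched from inside the orphan itself. In this sketch (orphan_demo is a hypothetical name), an intermediate process forks a grandchild and exits at once; the grandchild polls getppid() until it changes and sends the new parent PID back through a pipe. On many systems the result is 1, but a session manager or container runtime may act as the adopter instead, so the code makes no assumption about the exact value.

```c
#define _POSIX_C_SOURCE 200809L   /* expose nanosleep() */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns the PID of the process that adopted the orphan, or -1 on error. */
pid_t orphan_demo(void)
{
        int fds[2];
        pid_t middle, adopter = -1;

        if (pipe(fds) == -1)
                return -1;
        middle = fork();
        if (middle == (pid_t) -1) {
                close(fds[0]);
                close(fds[1]);
                return -1;
        }
        if (middle == 0) {
                pid_t me = getpid();

                if (fork() == 0) {
                        /* GRANDCHILD: wait until it has been re-parented. */
                        struct timespec ts = { 0, 1000000 };    /* 1 ms */
                        pid_t now;

                        while ((now = getppid()) == me)
                                nanosleep(&ts, NULL);
                        if (write(fds[1], &now, sizeof now) != (ssize_t) sizeof now)
                                _exit(1);
                        _exit(0);
                }
                _exit(0);       /* the grandchild's parent ends first */
        }
        close(fds[1]);
        waitpid(middle, NULL, 0);
        if (read(fds[0], &adopter, sizeof adopter) != (ssize_t) sizeof adopter)
                adopter = -1;
        close(fds[0]);
        return adopter;
}
```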

Process 0 is a special process created when the system boots; its job is kernel initialization. Its last action is to call fork() to create a child process that runs the /sbin/init executable. That child is process 1 (PID=1), and process 0 then turns into the swap process (also known as the idle process). Process 1 (the init process) is the first user-mode process. It goes on to call fork() to create the other processes in the system, so it is the parent or ancestor of all of them. It is also a daemon and does not stop until the computer is shut down.

We can see this structure clearly with the following command:

pstree

We can also use the following command, where pid is the unique number of a process, ppid is the pid of its parent process, pgid is the ID of its process group, and command shows the command or script that started the process:

ps -fxo user,ppid,pid,pgid,command

3.3 Process Group and Sessions

Every process is a member of exactly one process group, and process groups are distinguished by their PGID (process group ID). Whenever a process is created, it becomes a member of its parent's process group.

In general, the PGID of a process group equals the PID of its first member. That process is called the process group leader. A process can find the PGID of its own group with the getpgrp() system call. The leader may terminate first; the process group continues to exist with the same PGID until its last member terminates.
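Both rules (a new process inherits its parent's group, and a group leader's PGID equals its PID) can be checked with getpgrp() and setpgid(). The sketch below uses a made-up function pgid_demo and reports what the child observed as bit flags in its exit status.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns bit flags observed in a child process, or -1 on error:
 *   bit 0: the child started in its parent's process group
 *   bit 1: after setpgid(0, 0) it leads a new group whose PGID is its PID */
int pgid_demo(void)
{
        pid_t parent_pgid = getpgrp();
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                int flags = 0;

                if (getpgrp() == parent_pgid)
                        flags |= 1;
                /* Move this process into a new group named after its own PID. */
                if (setpgid(0, 0) == 0 && getpgrp() == getpid())
                        flags |= 2;
                _exit(flags);
        }
        if (waitpid(p, &status, 0) == -1 || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}
```

This is essentially what a job-control shell does for each pipeline it starts: every job gets its own process group.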

Sessions work similarly: whenever a process is created, it becomes a member of the session its parent belongs to. Every process group belongs to exactly one session.

A session is mainly established for a tty, and each process group in the session is called a job. Each session can be connected to one terminal (its controlling terminal). Input from the controlling terminal, and the signals it generates, are delivered to the foreground process group of the session. The point of a session is to hold multiple jobs in one terminal: one job runs in the foreground and directly handles the terminal's input, output, and signals, while the other jobs run in the background.
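The relationship between a session and its controlling terminal can be sketched with getsid() and setsid(). The function session_demo is hypothetical; it relies on the fact that a freshly created session has no controlling terminal, so opening /dev/tty (which always refers to the caller's controlling terminal) is expected to fail there.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getsid() */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns bit flags observed in a child process, or -1 on error:
 *   bit 0: the child started in its parent's session
 *   bit 1: after setsid() it leads a new session and a new process group
 *   bit 2: the new session has no controlling terminal */
int session_demo(void)
{
        pid_t parent_sid = getsid(0);
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                int flags = 0;

                if (getsid(0) == parent_sid)
                        flags |= 1;
                if (setsid() == getpid() && getsid(0) == getpid()
                    && getpgrp() == getpid())
                        flags |= 2;
                if (open("/dev/tty", O_RDWR) == -1)
                        flags |= 4;
                _exit(flags);
        }
        if (waitpid(p, &status, 0) == -1 || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}
```

Daemons commonly use the same setsid() step to detach themselves from the terminal they were started from.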

A foreground job runs in the terminal and can interact with you.

A background job also runs in the terminal, but you cannot interact with it, and its progress is not shown.

3.4 Job management

bash (Bourne-Again shell) supports job control, while sh (Bourne shell) does not.

Each terminal or bash instance can manage only its own jobs, not jobs in other terminals. For example, if I have two bash instances, bash1 and bash2, bash1 can manage only its own jobs and not those of bash2.

We all know that when a process is running in the foreground we can use ctrl + c to terminate it, but this does not work on a process in the background.

We can use the & symbol to run a command in the background:

ls &

The shell prints a line such as [1] 236, which gives the job number of the job followed by the PID of the process. The Done in the last line indicates that the command has finished executing in the background.

We can also use ctrl + z to stop the current job and put it in the background. Jobs that were stopped and placed in the background can be viewed with the jobs command:

jobs

The first column shows the number of each job placed in the background. In the second column, + marks the job most recently placed in the background; it is also the default job, meaning that a job-control command given without a job number acts on it first. - marks the second-most-recent job (the one before the default), and older jobs get no such symbol. The third column shows the job's status, and the last column shows the command the process is running.

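We can bring a background job to the foreground with this command:

# With no argument, fg brings the default job forward; with an argument, it brings the job with that number forward
# On Ubuntu, zsh requires the %, while bash does not
fg [%jobnumber]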

Earlier, we used ctrl + z to stop a job and place it in the background. To let it continue running in the background, use this command:

# Like fg: with an argument it acts on that job, without one it acts on the default job
bg [%jobnumber]

Since a background job can be brought to the foreground or resumed in the background, there are naturally also ways to terminate a job, restart it, and so on.

# kill is used in the following form
kill -signal %jobnumber

# signal can be one of the values 1-64; list them like this
kill -l

These signal values are commonly used:

Signal value    Effect
-1              Re-read the parameters and run again, similar to a restart (SIGHUP)
-2              Exit, like pressing ctrl+c (SIGINT)
-9              Forcibly terminate the task (SIGKILL)
-15             Terminate the task in the normal way (SIGTERM)

Note

  • If you use kill with a signal value followed directly by a PID, the signal goes to the process with that PID.
  • If you use kill with a signal value followed by %jobnumber, the target is a job, and the number is the ID of a job currently in the background of this bash.
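The PID form of the kill command corresponds to the kill() system call. As a rough sketch (signal_child is an invented name, and the sleeping child merely stands in for a long-running job), the parent below sends a signal to a child by PID and then reads from the exit status which signal ended it.

```c
#define _POSIX_C_SOURCE 200809L   /* expose kill() */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts a child that only sleeps, sends it `sig` by PID, and returns the
 * number of the signal that terminated it (0 if it exited on its own,
 * -1 on error). */
int signal_child(int sig)
{
        int status;
        pid_t p = fork();

        if (p == (pid_t) -1)
                return -1;
        if (p == 0) {
                sleep(10);      /* stand-in for a long-running job */
                _exit(0);
        }
        if (kill(p, sig) == -1) /* same effect as: kill -<sig> <pid> */
                return -1;
        if (waitpid(p, &status, 0) == -1)
                return -1;
        return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With SIGTERM (15) the child could in principle install a handler and clean up before exiting, whereas SIGKILL (9) cannot be caught or ignored, which is why it is the forcible option.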

Origin blog.csdn.net/happyjacob/article/details/107045428