Linux process (1): an in-depth understanding of process concepts and states

Table of contents

What is a process

How to manage processes

Describing a process

What exactly is a PCB?

Content classification of the PCB

Organizing processes

Viewing processes

ps command

Get the process identifier through a system call

getpid()

getppid()

Create a process through a system call - a first look at fork

Process state

Running state (R)

Blocking state (S)

Blocked state

Suspended state

The difference between the ready state and the waiting state

Stopped state (T)

Zombie state (Z)


What is a process

Although we have not formally discussed processes yet, we deal with them all the time. For example, open the Task Manager on Windows.

Everything listed there is a process: QQ, the browser, wallpaper software, and so on. Each one is a process.

In other words, starting a piece of software essentially means starting a process.

Under Linux, running a command or ./xxx likewise creates a process at the system level.

Process concept:

Textbook concepts: an executing instance of a program, a program being executed, etc.

Kernel point of view: the entity to which system resources (CPU time, memory) are allocated.

How to manage processes

Linux can load multiple programs at the same time, so a large number of processes can exist in the system at once.

These processes must be managed. How?

Describe first, organize later.

I highly recommend reading the previous chapter, where I explained "describe first, then organize" by analogy with some everyday examples. After reading it, you will look at the material below from a very different angle.

Describing a process

Let's first understand this picture briefly:

There are many executable programs on the disk. To run a program, it must first be loaded into memory. Note that what gets loaded is the content (code + data) of the file (an executable program is essentially a file). To make later management easier, the operating system defines a structure, the PCB, that describes all the attribute data of the process.

Note that this attribute data has little to do with the file content.

Therefore, the task of managing processes turns into adding, deleting, looking up, and modifying nodes in a linked list of PCB structures.

This gives us a deeper definition of a process.

Process = its code and data + the PCB structure that describes it.

I have been mentioning the PCB all along, so:

What exactly is a PCB?

We know that it describes all the attribute information of a process. When a new executable file is loaded into memory, the kernel also creates a corresponding PCB for it.

Process information is kept in a data structure called the process control block (PCB), which can be understood as a collection of process attributes.
Textbooks call it the PCB (process control block). Under Linux, the PCB is task_struct.

The name of this structure may differ from one operating system to another, but collectively they are all called PCBs.
task_struct is a data structure of the Linux kernel. It lives in RAM (memory) and contains information about the process.

Content classification of the PCB

These are some of the attributes stored in the PCB. We will explain the key ones later.

Identifier: the unique identifier of this process, used to distinguish it from other processes.
Status: task status, exit code, exit signal, etc.
Priority: priority relative to other processes. (This differs from permissions: priority decides who goes first, while permissions decide whether something is allowed at all.)
Program counter: the address of the next instruction to be executed in the program.
Memory pointers: pointers to the program code and process-related data, as well as pointers to memory blocks shared with other processes.
Context data: the data in the processor's registers while the process is executing. This one deserves special attention.
I/O status information: outstanding I/O requests, the I/O devices assigned to the process, and the list of files used by the process.
Accounting information: may include total processor time, total clock time used, time limits, accounting account number, etc.
Other information.
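
To make the list above more concrete, here is a heavily simplified sketch of what such a structure could look like in C. The names toy_pcb, toy_state and toy_pcb_init, and every field name, are invented for illustration. The real task_struct in the Linux kernel is far larger and organized differently.

```c
#include <sys/types.h>

/* Hypothetical, simplified PCB. All names are illustrative only and do
 * not match the real Linux task_struct. */
enum toy_state { TOY_RUNNING, TOY_SLEEPING, TOY_STOPPED, TOY_ZOMBIE };

struct toy_pcb {
    pid_t          pid;            /* identifier */
    enum toy_state state;          /* status */
    int            exit_code;      /* status: exit code */
    int            priority;       /* priority relative to other processes */
    unsigned long  pc;             /* program counter */
    void          *code;           /* memory pointers: program code ... */
    void          *data;           /* ... and process data */
    unsigned long  regs[16];       /* context data: saved registers */
    int            open_files[8];  /* I/O status information */
    unsigned long  cpu_ticks;      /* accounting information */
};

/* Fill in a PCB for a newly created process; everything else is zeroed. */
struct toy_pcb toy_pcb_init(pid_t pid, int priority)
{
    struct toy_pcb p = {0};
    p.pid = pid;
    p.state = TOY_RUNNING;
    p.priority = priority;
    return p;
}
```

Note that nothing in this structure comes from the executable file itself, which is the point made earlier: the attribute data has little to do with the file content.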

Organizing processes

As we said earlier, the attribute information describing a process is the PCB, which is called task_struct under Linux.

All processes in the system are kept in the kernel as a doubly linked list of task_struct structures.

Operations on processes thus become operations on a linked list.
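
The idea of "operations on processes become operations on a linked list" can be sketched as follows. This is a toy model, not kernel code: the type toy_task and the functions list_init, list_add, list_find and list_del are hypothetical names, and the node carries only a PID. The sketch assumes a circular list with a dummy head node.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical PCB node; only the fields needed for the list are shown. */
struct toy_task {
    pid_t pid;
    struct toy_task *prev, *next;
};

/* The list is circular with a dummy head node, so insertion and removal
 * need no special case for an empty list. */
void list_init(struct toy_task *head)
{
    head->prev = head->next = head;
}

/* "Create a process": allocate a node and link it in at the tail. */
struct toy_task *list_add(struct toy_task *head, pid_t pid)
{
    struct toy_task *t = malloc(sizeof *t);
    if (!t)
        return NULL;
    t->pid = pid;
    t->prev = head->prev;
    t->next = head;
    head->prev->next = t;
    head->prev = t;
    return t;
}

/* "Look up a process": walk the list until the PID matches. */
struct toy_task *list_find(struct toy_task *head, pid_t pid)
{
    for (struct toy_task *t = head->next; t != head; t = t->next)
        if (t->pid == pid)
            return t;
    return NULL;
}

/* "Destroy a process": unlink the node and free it. */
void list_del(struct toy_task *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    free(t);
}
```

With this in place, creating, finding, and removing a process are each a few pointer updates on the list.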

Viewing processes

There are three ways to view processes:

ls /proc/<PID> [view the process as a file; <PID> is the process ID]

ps

top [a commonly used performance analysis tool under Linux that displays the resource usage of each process in real time]

Of these three, we generally use ps.

ps command

The ps command displays the status of current processes, similar to the Task Manager on Windows.

The usage is as follows:

ps [options]

The commonly used option combination ajx displays information about all processes.

We write a source file named myproc.c that contains an infinite loop. Then we open a second terminal window: one window runs the program and the other monitors the process.
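
A minimal sketch of such a loop is shown below. The function name heartbeat and the printed message are assumptions made for illustration. The rounds parameter is an addition so that the loop can also be run for a fixed number of iterations.

```c
#include <stdio.h>
#include <unistd.h>

/* Print one line per `interval` seconds.
 * rounds < 0 loops forever, which is the infinite loop the text describes;
 * a non-negative value stops after that many lines (added for short runs).
 * Returns the number of lines printed when it stops. */
int heartbeat(int rounds, unsigned interval)
{
    int printed = 0;
    while (rounds < 0 || printed < rounds) {
        printf("I am a process\n");
        fflush(stdout);           /* make the line visible immediately */
        sleep(interval);
        if (rounds >= 0)
            printed++;
    }
    return printed;
}
```

In myproc.c, a main function that simply calls heartbeat(-1, 1) never returns, which leaves plenty of time to find the process from the other window.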

Because ps ajx displays all processes, we pipe its output through grep to pick out the ones we need:

ps ajx | grep 'myproc'

With the program running in one window, its status can be seen in the other.

With so many processes, how do we identify a particular one?

Get the process identifier through a system call

Each process has an ID, called the PID, that identifies it. There is also a PPID, the ID of the process's parent, which I will talk about later.

The function to call is getpid().

getpid()

Let's look at the usage:

The prototype is pid_t getpid(void), declared in <unistd.h>. Here pid_t is a signed integer type.

Let's try it: open myproc.c and have the program print its own PID.

Then, as before, compile the file, open a second window, run the program in one window, and monitor the process in the other.

In this way, the program obtains its own PID and prints it.
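
A minimal sketch of this step might look like the following. The helper name show_pid and the message text are assumptions; only getpid() itself comes from the text.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print this process's PID and return it. */
pid_t show_pid(void)
{
    pid_t id = getpid();          /* getpid() always succeeds */
    printf("I am a process, pid: %d\n", (int)id);
    return id;
}
```

The PID printed by the program should match the PID column that ps reports for myproc in the other window.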

What can we do if we want to stop this running program?

1. Ctrl+C   2. kill -9 PID

The first needs no explanation, so let's look at the effect of the second.

The running process is killed. Signals will be discussed later.

Besides its own ID, a process can also get the ID of its parent. The usage is the same as above, except that the function is getppid().

getppid()

We add a call to getppid() to the original file, then compile and run it in the same way as before.

Now the program has obtained both its own ID and the ID of its parent. This raises a question: who is this parent process?
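
Extending the earlier step, a sketch that prints both identifiers could look like this. The helper name show_ids and the output format are assumptions.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print this process's PID and its parent's PID; return the parent's PID. */
pid_t show_ids(void)
{
    pid_t self = getpid();
    pid_t parent = getppid();
    printf("pid: %d, ppid: %d\n", (int)self, (int)parent);
    return parent;
}
```

Searching the ps output for the printed ppid value shows which process is the parent.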

 

We find that the parent process is bash. So what is bash?

bash is a command processor (a shell) that runs in a text window and executes the commands the user types in. Every program we start from the command line is therefore a child of bash.

Create a process through a system call - a first look at fork

Let's first look at what fork is.

According to its manual page, fork creates a child process. Next, look at the return value.

On success, the PID of the child process is returned to the parent, and 0 is returned to the newly created child.

On failure, -1 is returned to the parent and no child process is created.

This is very unusual. How can a function have two return values?

Consider a program that calls fork() and then prints the return value with a single printf.

Compile and run:

We find that the single printf produces output twice, and the two printed return values are different.

This is because a new child process exists once the parent has called fork(), and both processes go on to execute this printf. That produces two lines of output: one is the child's PID, returned to the parent, and the other is the 0 that the child itself received.

It is equivalent to going from one execution flow to two.
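
A sketch of this experiment is shown below. The function name fork_once and the message format are assumptions. The _exit() in the child and the waitpid() in the parent are additions that are not part of the experiment as described: they keep the child from running the caller's code and let the parent clean up.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Call fork() once and let both processes run the same printf.
 * Returns fork()'s value as seen by the parent, or -1 on failure.
 * The child terminates inside this function. */
pid_t fork_once(void)
{
    fflush(stdout);               /* avoid duplicating buffered output */
    pid_t id = fork();
    if (id < 0) {
        perror("fork");
        return -1;
    }

    /* From here on, two execution flows run this same line. */
    printf("pid %d sees fork() return %d\n", (int)getpid(), (int)id);
    fflush(stdout);

    if (id == 0)
        _exit(0);                 /* child: stop here */
    waitpid(id, NULL, 0);         /* parent: collect the child */
    return id;
}
```

Running it prints two lines from the one printf: the parent's line shows the child's PID, and the child's line shows 0. The order of the two lines is not guaranteed.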

In real use, however, the work is split up: the child process performs its own task and the parent performs its own, so that they do not interfere with each other. This is done by testing the return value with if / else if / else.

If the creation succeeds, fork() returns 0 in the child, so the child takes the second branch; the parent gets the child's PID, which must be greater than 0, so it takes the third branch.

In this way, you can see that the two processes have indeed taken different branches.
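
The three branches can be sketched like this. The function name fork_branches, the messages, and the exit code 7 are arbitrary choices for illustration. The waitpid() call is an addition so that the parent can report what the child did.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, run different code in parent and child, and return the child's
 * exit code as seen by the parent (-1 on any error). */
int fork_branches(void)
{
    fflush(stdout);
    pid_t id = fork();

    if (id < 0) {                 /* first branch: fork failed */
        perror("fork");
        return -1;
    } else if (id == 0) {         /* second branch: the child */
        printf("child:  pid %d, ppid %d\n", (int)getpid(), (int)getppid());
        fflush(stdout);
        _exit(7);                 /* arbitrary exit code for the demo */
    } else {                      /* third branch: the parent */
        printf("parent: pid %d, child pid %d\n", (int)getpid(), (int)id);
        int status = 0;
        if (waitpid(id, &status, 0) < 0)
            return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
}
```

The ppid printed by the child equals the pid printed by the parent, which confirms that the two branches ran in two different processes.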

To sum up, why does fork() have two return values?

1. Because inside fork(), the parent and the child each execute the return statement once. Returning twice naturally yields two return values.

2. Returning twice does not by itself explain how the value is stored. (How can the same variable id be tested in two execution flows and hold a different value in each?) This question will be discussed later.

After fork() creates the child process, which process runs first?

That is not fixed. Which one runs first is decided by the operating system's scheduler.

This is enough understanding of fork() for now; we will explain how it works in depth later.

Process state

A process can be in one of 5 states:

  1. Running state (Running): The process is running or waiting for CPU resources.

  2. Ready state (Ready): The process is ready to run, but has not yet been given CPU resources.

  3. Blocking (interruptible sleep) state (Sleeping): The process is waiting for an event to occur, such as the completion of input or output, a semaphore, or a lock.

  4. Stopped state (Stopped): The process is paused, for example because it received a SIGSTOP signal.

  5. Zombie state (Zombie): The process has ended, but its parent has not yet collected its exit status information. Such a process is called a zombie process.

Here is a diagram of the relationship between them:

The following are several state transitions:

Ready -> Running: the scheduler selects the process
Running -> Ready: the time slice is used up, or the process is preempted
Running -> Blocking: the process requests a service and waits for the response, or waits for a signal to arrive
Blocking -> Ready: the requested service has completed, or the awaited signal has arrived
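
The four transitions listed above can be written down as a small table-like function. This is only a model of the list, not of how a scheduler is implemented. The names pstate and can_transition are hypothetical.

```c
#include <stdbool.h>

enum pstate { READY, RUNNING, BLOCKED };

/* True only for the four transitions listed in the text. */
bool can_transition(enum pstate from, enum pstate to)
{
    switch (from) {
    case READY:   return to == RUNNING;                  /* scheduled */
    case RUNNING: return to == READY || to == BLOCKED;   /* preempted or waiting */
    case BLOCKED: return to == READY;                    /* event arrived */
    }
    return false;
}
```

Note that in this list a blocked process does not go straight back to running: it becomes ready first and then waits for the scheduler like every other ready process.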

We will explain each state in turn.

Running state (R)

Running: The process is running or waiting for CPU resources.

Note that a process in the running state is not necessarily executing on the CPU. Under Linux, as long as its task_struct is queued in the run queue (scheduling queue), it is said to be in the running state.

It is like queuing for food in a cafeteria: if someone asks what you are doing, you say "I am eating", even though no food has reached your mouth yet.
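
One way to observe the R state from inside a program is to read the process's own entry under /proc, which ties in with the "view the process as a file" method mentioned earlier. The sketch below assumes a Linux system with /proc mounted; the function name own_state is hypothetical. It relies on the state letter being the field that follows the parenthesized command name in /proc/self/stat.

```c
#include <stdio.h>
#include <string.h>

/* Return this process's one-letter state as reported by the kernel in
 * /proc/self/stat, or '?' if the file cannot be read or parsed. */
char own_state(void)
{
    char buf[512];
    FILE *f = fopen("/proc/self/stat", "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Layout: "pid (command name) state ...". Search for the last ')'
     * because the command name may itself contain parentheses. */
    char *p = strrchr(buf, ')');
    return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}
```

A process that reads its own state is necessarily executing at that moment, so it sees R.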

Blocking state (S)

Blocked state

Blocking state (Waiting): The process is waiting for an event (a non-CPU resource), such as the completion of input or output, a semaphore, or a lock.

First, note that the system contains many kinds of resources: not only the CPU, but also devices such as network cards, graphics cards, and disks.

So the system has more than one queue. Besides the CPU's run queue, there are queues for disks, network cards, and other devices.

For example, the CPU has a run queue, and the disk has its own queue in which many processes are waiting to access the disk. Suppose the process currently executing on the CPU calls fread and needs to read data from the disk. The process is then removed from the CPU's run queue and placed in the disk's wait queue. A process waiting like this is in the blocking state, and the wait queue is called the blocking queue.

Suspended state

When memory is insufficient, the OS may swap the code and data of a process out to disk as appropriate. The process is then said to be suspended.

A suspended process has temporarily given up the CPU and is marked as "not executable". Usually this happens to a process that is waiting for some event, such as an I/O operation, the arrival of a signal, or a resource that is in short supply.

While a process is suspended, only its task_struct remains in memory, without its code and data.

As mentioned at the beginning, the blocking state is also called the interruptible sleep state. What is interruptible sleep?

Consider a program that sleeps for 100 seconds. We run it and watch it from another window.

The program has entered the sleep state. Now we send it a signal.

After signal 19 (SIGSTOP on x86 Linux) is sent, the state changes, which shows that a sleeping process still responds to signals.

In other words, the process reacts when a signal arrives, just as if I were asleep but could be woken at any time if something came up. This is interruptible sleep.
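
The sleep-then-signal experiment can also be driven from a single program instead of two windows. The sketch below is one possible way to do it, assuming Linux with /proc mounted. The function names proc_state, wait_for_state and sleep_stop_continue are hypothetical, and the polling loop, the extra SIGCONT step, and the final SIGKILL are additions for the demo.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* One-letter state from /proc/<pid>/stat, or '?' if it cannot be read. */
static char proc_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    char *p = strrchr(buf, ')');  /* state follows "(command name) " */
    return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/* Poll for up to about 2 seconds; return the last state seen. */
static char wait_for_state(pid_t pid, char want)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    char s = proc_state(pid);
    for (int i = 0; i < 200 && s != want; i++) {
        nanosleep(&tick, NULL);
        s = proc_state(pid);
    }
    return s;
}

/* Records the child's state while sleeping, after SIGSTOP, and after
 * SIGCONT. Returns 0 on success, -1 if fork() fails. */
int sleep_stop_continue(char seen[3])
{
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {                /* child: sleep, as in the text */
        sleep(100);
        _exit(0);
    }
    seen[0] = wait_for_state(id, 'S');
    kill(id, SIGSTOP);            /* signal 19 on x86 Linux */
    seen[1] = wait_for_state(id, 'T');
    kill(id, SIGCONT);            /* let it carry on sleeping */
    seen[2] = wait_for_state(id, 'S');

    kill(id, SIGKILL);            /* clean up the child */
    waitpid(id, NULL, 0);
    return 0;
}
```

On a typical Linux system the three recorded letters are S, T, S: the sleeping child reacts to SIGSTOP by stopping, and goes back to sleeping once it is continued.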

The counterpart is the D state (disk sleep), also known as deep sleep or uninterruptible sleep, in which the process cannot be woken by signals. It is like saying: I am sleeping, do not disturb.

The difference between the ready state and the waiting state

Ready state: the process is ready to execute and meets all the conditions for being scheduled, including having the resources it needs. It does not currently hold the CPU; it could run immediately and is only waiting for the scheduler to select it and give it CPU time.

Waiting state: the process is temporarily unable to execute because it is waiting for some event, condition, or resource. On entering the waiting state it gives up the CPU, and it cannot do anything further until the event occurs, the condition is met, or the resource becomes available.

In short, a ready process has everything it needs except the CPU and is waiting to be scheduled, while a waiting process is missing something else and is therefore not a candidate for scheduling.

Stopped state (T)

Stopped state (Stopped): The process is paused, for example because it received a SIGSTOP signal.

The stopped state is typically used to pause the execution of a process, for example by a debugger that pauses a process in order to inspect it, or by a system administrator who needs to halt a process temporarily. A stopped process can later return to the ready state and carry on from where it was paused.

Note that the stopped state is different from the sleeping/blocking state. A sleeping process is temporarily unable to execute because it is waiting for some condition or event. A stopped process has been paused from outside, by a human operation or a specific signal.

A stopped process returns to the ready state when it receives the corresponding signal (e.g. SIGCONT), for example via the command kill -CONT <pid>.

Zombie state (Z)

Zombie state (Zombie): The process has ended, but its parent has not yet collected its exit status information. Such a process is called a zombie process.

The main feature of a zombie process is that it has finished executing but has not been completely released: its PCB, which holds the exit status, stays in the kernel until the parent reads it. A zombie no longer executes, but its presence still occupies some system resources.

In the test program, the child process sleeps for 5 seconds and then exits, while the parent keeps running without collecting it. A zombie process appears at that point.

The child process is now in the Z (zombie) state. In the ps output Z+, the + means the process belongs to the foreground process group.

Origin blog.csdn.net/weixin_47257473/article/details/131694790