Chapter 4: Linux Process Concept and Environment Variables




Foreword

A process is the basic unit to which the operating system allocates resources. How are processes represented and manipulated on a Linux system?


1. Von Neumann Architecture

Most computer hardware follows the von Neumann architecture.

  • Input and output devices: collectively referred to as peripherals, including keyboards, microphones, cameras, network cards, disks, etc.
  • Storage: here this refers to main memory (RAM)
  • Central processing unit (CPU): the arithmetic unit and the control unit

1. The meaning of memory

In this model every device, the CPU and the peripherals alike, exchanges data only with memory, never directly with another device.
The advantage is that data the CPU would otherwise have to wait for is staged in memory in advance, so the CPU reads at memory speed rather than peripheral speed, which improves the overall speed of the computer.

2. Data flow

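Even a simple operation involves every part of the system.
Sending a file in QQ: on the sender, the input is the disk and the output is the network card; on the receiver, the input is the network card and the output is the disk.
Chatting in QQ: on the sender, the input is the keyboard and the output is the network card; on the receiver, the input is the network card and the output is the monitor.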

2. Operating system

Any computer system contains a basic collection of programs called an operating system (OS).

Operating systems include:

  • Kernel (process management, memory management, file management, driver management)
  • Other programs (such as function libraries, shell programs, etc.)

1. The purpose of the operating system

  • Interact with hardware and manage all hardware and software resources
  • Provide a good operating environment for user programs (applications)
  1. The hardware follows the von Neumann architecture.

  2. The OS does not trust any user: every access to system hardware or software must go through the OS.

  3. The computer system is layered: hardware and software resources can be reached only through the interfaces the OS exposes.

  4. Library functions: interfaces provided by the language or by third-party libraries (first party: the system; second party: your own code; everything else is third party).

  5. System calls: interfaces provided by the OS itself.

Summary:

  • How a computer manages hardware: describe each device with a struct, then organize those structs in a linked list or another efficient data structure.
  • The operating system is software that manages hardware and software resources. The essence of management is "describe first, then organize": managing something means managing the data that describes it.
  • Management involves three roles: the manager, the executor, and the managed (for example, the OS is the manager, the drivers are the executors, and the underlying hardware is what gets managed).

3. Process

1. Basic concepts

Textbook view: a process is an execution instance of a program, that is, a program being executed.
Kernel view: a process is the entity to which system resources (CPU time, memory) are allocated.
OS view: process = the kernel data structure that describes it (the PCB) + the code and data of the process.

Describe the process - PCB

  • Process information is kept in a data structure called the process control block (PCB), which can be understood as the collection of a process's attributes.
  • On Linux the PCB is struct task_struct; task_struct is one kind of PCB.
  • task_struct is a Linux kernel data structure that lives in RAM and holds the information about a process.

task_struct content categories

  • Identifier: a unique identifier for this process, used to distinguish it from other processes.
  • Status: task state, exit code, exit signal, etc.
  • Priority: priority relative to other processes.
  • Program counter: the address of the next instruction to be executed.
  • Memory pointers: pointers to the program code and process-related data, as well as pointers to memory blocks shared with other processes.
  • Context data: the contents of the processor registers while the process is executing.
  • I/O status information: outstanding I/O requests, I/O devices assigned to the process, and the list of files the process is using.
  • Accounting information: may include total processor time, total clock time used, time limits, account numbers, etc.
  • Other information

2. Check the process

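# list processes
ps axj
# keep the header line and show only the processes matching "test"
ps axj | head -1 && ps axj | grep "test"
# show the resources each process is using
top
# the numbered directories under /proc hold the information of every process currently running
ls /proc
ls -al /proc/xxx    # xxx is the PID of the process to inspect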

3. Create a process through a system call - fork

1. Create a child process

  • fork appears to return twice because, once the child has been created, parent and child share the same code and both continue from the point where fork returns, so each receives its own return value.
  • fork returns 0 in the child and the child's pid in the parent. A child has exactly one parent, but a parent may have many children, so the parent needs each child's pid to tell them apart.
  • Parent and child share code; data is private to each, with a private copy made on write (copy-on-write).

4. Process status

1. Linux kernel source code

/*
 * The task state array is a strange "bitmap" of
 * reasons to sleep. Thus "running" is zero, and
 * you can test for combinations of others with
 * simple bit tests.
 */
static const char * const task_state_array[] = {
    "R (running)",      /* 0 */
    "S (sleeping)",     /* 1 */
    "D (disk sleep)",   /* 2 */
    "T (stopped)",      /* 4 */
    "t (tracing stop)", /* 8 */
    "X (dead)",         /* 16 */
    "Z (zombie)",       /* 32 */
};

  • R, running: the process is not necessarily executing right now; it is either running on a CPU or waiting in the run queue.
  • S, sleeping: the process is waiting for an event to complete (this is also called interruptible sleep).
  • D, disk sleep: also called uninterruptible sleep. A process in this state is usually waiting for I/O to finish.
  • T, stopped: a process can be stopped by sending it the SIGSTOP signal, and a stopped process can be resumed by sending it SIGCONT.
  • X, dead: this is only a return state; it never shows up in the task list.
  • Z, zombie: the process has exited, but its parent has not yet read its exit status.


2. Zombie process

  1. Zombie is a special state. When a process exits and its parent has not read the child's exit code (with the wait() system call), the child becomes a zombie process.
  2. A zombie stays in the process table in a terminated state, waiting for its parent to read its exit status.
  3. So whenever a child exits while its parent is still running and the parent does not read the child's status, the child enters the Z state.


3. Orphan process

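If the parent process exits first, the child that is left behind is called an "orphan process". An orphan is adopted by another process, typically init (PID 1), which reads its exit status when it terminates.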

5. Process priority

Process priority:

  • Process priority is the order in which processes are allocated CPU resources.
  • A process with higher priority runs first. Configuring process priorities is useful in a multitasking Linux environment and can improve system performance.
  • A process can also be bound to a specified CPU; moving unimportant processes onto one CPU can improve the overall performance of the system.
  • Priority versus permission: permission decides whether a process can obtain a resource at all, while priority only decides the order in which the processes that may obtain it actually get it.
  • Priorities exist because resources such as the CPU are limited.

1. View process priority

A command such as ps -l lists, among others, the following columns:

  • UID: the identity of the user who runs the process
  • PID: the ID of this process
  • PPID: the ID of the parent process, that is, the process this one was created from
  • PRI: the priority of this process; the smaller the value, the earlier it is executed
  • NI: the nice value of this process
  1. On Linux, priority is determined by PRI and the nice value together (the smaller the priority value, the higher the priority).
  2. The nice value is a correction applied to the priority; its range is [-20, 19].
  3. A priority can be neither arbitrarily high nor arbitrarily low: the scheduler has to keep a balance so that no process starves.

2. Set the process priority

  • PRI is the priority of the process, in plain terms the order in which the CPU runs it. The smaller the value, the higher the priority.
  • NI is the nice value, a correction applied to the priority: PRI(new) = PRI(old) + nice.
  • When the nice value is negative, the PRI value becomes smaller, so the priority rises and the process is executed sooner.
  • On Linux, adjusting the priority of a process means adjusting its nice value.
  • The nice value ranges from -20 to 19, 40 levels in total.
  • The nice value is not the priority itself; it is the correction that changes the priority.
  • Competition: there are many processes but few CPUs, possibly just one, so processes compete for them. Priorities exist so that tasks complete efficiently and the competition for resources stays reasonable.
  • Independence: each process has its own resources, and processes do not interfere with one another while running.
  • Parallelism: multiple processes run at the same time, each on its own CPU.
  • Concurrency: on a single CPU, process switching lets multiple processes all make progress within a period of time.

6. Environment variables and command line parameters

1. Basic concepts

Environment variables are parameters that describe the environment in which the operating system and its programs run. For example, when we build C/C++ code we never say where the dynamic and static libraries we link against are located, yet linking succeeds and an executable is produced, because the relevant environment variables help the toolchain search for them.
Environment variables usually serve a specific purpose and usually have a global character within the system.

  • PATH: the search path for commands
  • HOME: the user's home directory (the default directory after the user logs in to Linux)
  • SHELL: the current shell; its value is usually /bin/bash
  • To view a variable: echo $NAME (NAME is the name of the environment variable)
2. View environment variables

For example, echo $PATH prints the command search path, and echo $HOME prints the user's home directory.

3. Environment variables usually have global attributes

The command line has two kinds of variables: local variables and environment variables.
Local variables: visible only in the current shell (the command-line interpreter); child processes do not inherit them.
Environment variables: have a "global attribute"; child processes inherit them.

  • echo: display the value of a variable
  • export: set a new environment variable
  • env: display all environment variables
  • unset: remove a variable
  • set: display shell-local variables and environment variables

4. Command line parameters


  • Command line parameters let the same program provide different functions depending on how it is invoked
  • argv is an array of pointers to the argument strings; its last element, argv[argc], is NULL

5. The organization of environment variables and how to obtain environment variables through code

Each program receives an environment table: an array of character pointers, each pointing to an environment string that ends with '\0'. The array itself ends with a NULL pointer.

The table can be walked through the global variable environ, which is defined by libc and has to be declared with extern before use:

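#include <stdio.h>

int main(int argc, char *argv[])
{
    extern char **environ;  /* defined by libc; points to the environment table */

    /* The table ends with a NULL pointer, which stops the loop. */
    for (int i = 0; environ[i]; i++) {
        printf("%s\n", environ[i]);
    }
    return 0;
}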

7. Process address space

1. Distribution of process address space


  • The process address space is not physical memory
  • The process address space exists for the whole life cycle of the process, until the process exits

2. What is a process address space

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int g_val = 0;

int main()
{
    pid_t id = fork();
    if (id < 0)
    {
        perror("fork");
        return 1;
    }
    else if (id == 0)
    {
        // child: the parent sleeps for 3 seconds, so the child modifies
        // g_val first and the parent reads it afterwards
        g_val = 100;
        printf("child[%d]: %d : %p\n", getpid(), g_val, (void *)&g_val);
    }
    else
    {
        // parent
        sleep(3);
        printf("parent[%d]: %d : %p\n", getpid(), g_val, (void *)&g_val);
    }
    sleep(1);
    return 0;
}

Output (the pids and the address depend on the system):

child[3046]: 100 : 0x80497e8
parent[3045]: 0 : 0x80497e8

The address of g_val is the same in the parent and the child, yet the two processes see different values, so the printed address cannot be a physical address.

  • An address seen in a program is never a physical address but a virtual address (in C/C++, & yields a virtual address, not a physical one).
  • Virtual addresses are provided by the operating system, but code and data must reside in physical memory (as the von Neumann architecture requires), so virtual addresses have to be translated into physical addresses; the OS takes care of this automatically.
  • Parent and child share code, while data is private to each (copy-on-write).
  • Once a program starts running, it becomes a process.

  • The address space is the way a process looks at memory. It is an abstraction, represented in the kernel by struct mm_struct, and it lets each process believe that it has the system's memory resources to itself.
  • Region division: the linear address space is divided into regions, each described by a [start, end] pair.
  • Virtual address: every address inside a region's [start, end] range is a virtual address.



Summary

Process management is one of the important functions of the operating system.
Strong beliefs win strong men, and then make them stronger. --Walter Bagehot


Origin blog.csdn.net/yanyongfu523/article/details/129808252