"Introduction to Operating System" study notes (2): CPU virtualization (process)

Study Notes on "Introduction to Operating System" (1): Overview of Operating System

From program to process


Program: a collection of instructions and data, generally stored on disk as an object (executable) file.
Process: the running activity of a program on a given data set. It is the basic unit the system uses for resource allocation and scheduling.
An executable program sits on disk. This static program must be loaded into memory and turned into a dynamic process before the CPU can fetch and execute its instructions. On disk, a program consists mainly of code and static data; a process additionally has a heap and a stack. Each process also has its own "ID card", the process control block. Where do these come from? The sections below explain them one by one, starting with the virtual address space.

1. Virtual address space

Virtual memory: a memory-management technique in which the operating system gives programs virtual addresses and maps them to physical memory through a page table. The operating system also sets aside and manages an area of disk; when physical memory runs out, pages can be moved to that area, so it serves as an extension of physical memory.

Virtual address space: in a multitasking operating system, each process runs in its own memory sandbox, the virtual address space. It consists of kernel space and user space.

2. Process control block

When the operating system creates a process, it sets up a process control block (PCB) for it in kernel space. The PCB is a data structure that describes the current state of the process and holds all the information needed to manage it. In Linux the process control block is the task_struct structure, defined in sched.h. Its main contents are introduced briefly below, using simplified xv6-style fields.
(1) Process state: a process is commonly in one of three states: Ready, Running, or Blocked.

enum proc_state { READY, RUNNING, BLOCKED };

Ready: the process has every resource it needs to run except the CPU, and is waiting to be allocated the CPU.
Running: the process is on the CPU, that is, the CPU is executing the instructions of the process.
Blocked / Waiting: while running, the process requested an event other than the CPU (such as an I/O request) and gave up the CPU until that event completes.
Simulation Homework: Simulation process state transition

(2) Process identifier (PID): a unique number that identifies this process and distinguishes it from other processes.

int pid;		// Process ID

(3) Program counter and registers: the address of the next instruction and the register contents that must be saved when the process is switched out

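// the registers xv6 will save and restore
// to stop and subsequently restart a process
struct context {
  int eip;		// program counter (PC): memory address of the next CPU instruction
  int ebx;		// base register: holds the base address during memory addressing
  int ecx;		// counter register: implicit counter of the loop instruction
  int edx;		// data register: holds the remainder of an integer division
  int esi;		// source index register
  int edi;		// destination index register
  int esp;		// stack pointer register
  int ebp;		// base (frame) pointer register
};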

(4) Memory limits

char *mem;		// Start of process memory
uint sz;		// Size of process memory

(5) Open file list

struct file *ofile[NOFILE]; // Open files

(6) Process pointer: pointers link process control blocks to one another. The system organizes the process list by keeping a ready-queue head pointer, a blocked-queue head pointer, and a running-queue head pointer. Each PCB is attached behind the head pointer that matches the state of its process, forming a queue.

3. User space partition

The source program contains code and data. Compiling and linking turn it into an executable file of machine instructions, which is loaded into memory in binary form. The program accesses data through variables, but binary code has no concept of variables; data can only be accessed through memory addresses. Treating every variable the same way would be inefficient and unrealistic, so memory is partitioned according to the nature of the variables.
For example, local variables are small and accessed frequently, but their life cycle is very short, usually lasting only within one function. A small area of memory is set aside for them and named the stack; it is allocated and reclaimed by the compiler, which is highly efficient. By contrast, a larger structure may not need to be accessed as often, but it has a long life cycle and is usually used across many functions. Another, larger area is set aside for such data and named the heap; it is allocated and freed by the programmer.

4. Create Process

When creating a process, the operating system generates a unique process control block (PCB) for it and attaches it to the process queue. It then sets aside a region of memory to hold the program's code and variables, loads the input parameters argc/argv onto the stack, clears the registers, and places the entry address of main() in the program counter (PC). The CPU executes the instructions of main() and returns to the operating system when it reaches the return statement. The operating system then releases the process's memory and removes the process from the process queue.

That covers how a program becomes a process. Next, let's look at how to operate on processes through the process API.

Process API: interface / C library functions

1. Create a child process: fork()

(1) Header file

#include <unistd.h>

(2) Function prototype

// pid_t is a signed integer type defined in <sys/types.h>
pid_t fork(void);

Return value: a successful call returns twice. In the parent process it returns the PID of the child process, and in the child process it returns 0. If the call fails, -1 is returned.

(3) Function description
When main() starts running, a process already exists for it; this is called the parent process. The fork() system call creates a new process, called the child process. The child gets its own process control block (task_struct) in the kernel and therefore a PID different from the parent's. At the same time, the child copies the rest of the parent (stack, code segment, and so on). Because the copy is taken in the middle of the fork() call, both the parent and the child resume from that call, so fork() returns twice, once in each process, with two different return values.

// p1.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {            // fork failed: exit
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {    // in the child, rc == 0
        printf("hello, I am child (pid:%d)\n", (int) getpid());
    } else {                 // in the parent, rc is the child's PID and getpid() is the parent's PID
        printf("hello, I am parent of %d (pid:%d)\n", rc, (int) getpid());
    }
    return 0;
}

The PID of the parent process is 3838, and the PID of the child process is 3827. After the child process is created, both processes continue from the instruction after the fork() system call, so the child does not print hello world again. Each process then prints one of the two following lines, depending on its value of rc.

2. Block the current process: wait()

(1) Header file

#include <sys/wait.h>

(2) Function prototype

// pid_t is a signed integer type defined in <sys/types.h>
pid_t wait(int *status);

Parameters: if status is not NULL, the termination status of the child process is stored in the location it points to; if you do not care about the termination status, pass NULL.
Return value: the PID of the child process on success, and -1 on failure.

(3) Function description
Without wait(), the parent process and the child process run concurrently and either may finish first. With wait(), the parent blocks until the child finishes; wait() then returns the child's PID and the parent continues.

// p2.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {            // fork failed: exit
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {    // in the child, rc == 0
        printf("hello, I am child (pid:%d)\n", (int) getpid());
        sleep(1);            // wait a while before the child exits
    } else {                 // wc receives the child's PID
        int wc = wait(NULL);
        printf("hello, I am parent of %d (wc:%d) (pid:%d)\n",
	       rc, wc, (int) getpid());
    }
    return 0;
}

The PID of the parent process is 838, the PID of the child process is 841, and the return value of wait () is wc = 841.

3. The exec() function family

(1) Header file

#include <unistd.h>

(2) Function prototype
exec refers to a family of functions; there is no function actually named exec(). execvp() is used here as an example.

int execvp(const char *file, char *const argv[]);

Parameters: file is the name of the program file to run; argv[] is the list of input arguments, terminated by NULL.
Return value: on success the function does not return; on failure it returns -1.

(3) Function description
The exec() function family lets the child process stop being a copy of the parent process and run a completely different program.

// p3.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        printf("hello, I am child (pid:%d)\n", (int) getpid());
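        char *myargs[3];            // strdup() is the library function that copies a string
        myargs[0] = strdup("wc");   // program: "wc" (word count)
        myargs[1] = strdup("p3.c"); // argument: the file to count
        myargs[2] = NULL;           // marks the end of the argument list
        execvp(myargs[0], myargs);  // count lines, words, and bytes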
        printf("this shouldn't print out");
    } else {
        int wc = wait(NULL);
        printf("hello, I am parent of %d (wc:%d) (pid:%d)\n",
	       rc, wc, (int) getpid());
    }
    return 0;
}

The child process replaces its own program with the wc word-count program, which reports p3.c's line count 32, word count 123, and byte count 966.

4. What is the use of the interface API?

In computer science, a shell (so called to distinguish it from the kernel, the core) is software that provides an operating interface for users, that is, a command interpreter, such as cmd under Windows and bash under Unix. Once it is open, the shell acts as the parent process. It accepts a command and uses fork() to create a child process to execute that command. When the command finishes, control returns to the shell, which accepts the next command.
(1) Typing wc p3.c > newfile.rtf at the console makes the shell create a child process in the background that counts the lines, words, and bytes of p3.c and writes the result to newfile.rtf.
(2) Typing ./p4 at the console creates a child process in the background that counts the lines, words, and bytes of p4.c, creates p4.output, and writes the result to it; typing cat p4.output displays the content.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
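        // redirect standard output to a file
        close(STDOUT_FILENO);
        open("./p4.output", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);

        // replace the child's program with "wc"
        char *myargs[3];            // strdup() is the library function that copies a string
        myargs[0] = strdup("wc");   // program: "wc" (word count)
        myargs[1] = strdup("p4.c"); // argument: the file to count
        myargs[2] = NULL;           // marks the end of the argument list
        execvp(myargs[0], myargs);  // count lines, words, and bytes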
    } else {
        int wc = wait(NULL);
		assert(wc >= 0);
    }
    return 0;
}

Code Homework: Use of Process API

5. Relationship between system commands, the interface API, and system calls

System call: a set of "special" interfaces that the operating system provides for user programs to call. Through these interfaces, user programs obtain services from the operating system kernel. For example, a user program can create a process through the process-related system call sys_fork().

Application Programming Interface (API): a function interface that programmers can use directly in user space. It is a predefined function, such as fork(), that gives an application access to a set of system calls.

System command: a system command sits one layer above the API. It is actually an executable program that internally uses the application programming interface (API) to achieve its function.
The process is as follows: the system command gcc -o p1 p1.c -Wall -Werror calls the gcc compiler to compile and generate the executable program p1. The system command ./p1 then executes the program, which creates a process. The process runs the API function fork(), and the kernel uses the system call number of fork() to find the corresponding system call sys_fork() in kernel space, which creates the child process.

"Introduction to Operating System" study notes (3): CPU virtualization (mechanism)
