[OSTEP] 02 Virtualizing the CPU: Processes

Process

Process abstraction

The process is one of the most fundamental abstractions an operating system provides.

The informal definition of a process is very simple: a process is a running program. The program itself is lifeless; it is just a set of instructions (and some static data) sitting on disk. It is the operating system that takes these bytes and gets them running, turning the program into something useful.

The operating system decides when the CPU runs and which instructions it runs. By constantly switching between the instructions of different programs in memory, the OS creates the illusion that multiple processes are executing at the same time.

This naturally brings to mind the I/O interrupt mechanism from a computer organization course. It works in a callback-like way: the CPU interrupts the currently running program, disables interrupts, pushes the return address onto the stack, re-enables interrupts, and jumps to the memory location that the interrupt vector points to. The interrupt service routine saves the current context, such as the register state, and restores it when the service ends.

This saving and restoring resembles the context switch that the operating system performs between processes.

To figure out what must be saved and restored, we first need to know what a process uses, or, in terms of the operating system's abstraction, what constitutes a process. The term for this is machine state.

The machine state consists of memory and registers. The memory state is the region of main memory that the process can address, and the register state is the set of registers the process uses, including special registers such as the program counter (PC).

Policy and mechanism: when implementing an operating system, policies and mechanisms are usually separated into different modules. A mechanism is the low-level machinery that a policy relies on. In process scheduling, for example, the context switch operation is a mechanism, while deciding which process to switch to is a policy.

Describing a process

Data structure

The proc structure of xv6:

// the registers xv6 will save and restore
// to stop and subsequently restart a process
struct context {
    int eip;
    int esp;
    int ebx;
    int ecx;
    int edx;
    int esi;
    int edi;
    int ebp;
};

// the different states a process can be in
enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

// the information xv6 tracks about each process
// including its register context and state
struct proc {
    char *mem;                  // Start of process memory
    uint sz;                    // Size of process memory
    char *kstack;               // Bottom of kernel stack for this process
    enum proc_state state;      // Process state
    int pid;                    // Process ID
    struct proc *parent;        // Parent process
    void *chan;                 // If non-zero, sleeping on chan
    int killed;                 // If non-zero, have been killed
    struct file *ofile[NOFILE]; // Open files
    struct inode *cwd;          // Current directory
    struct context context;     // Switch here to run process
    struct trapframe *tf;       // Trap frame for the current interrupt
};

Process states

A process can be in one of several states, mainly: running, ready, and blocked.

The OS provides APIs for operating on processes. They include at least the following: create, destroy, wait, status, and miscellaneous control.

Process creation

The OS loads the program's code and static data into main memory. Before the process can run, the OS must also allocate memory for its run-time stack.

There are some other initialization tasks as well, such as those related to I/O. In UNIX, each process has three open file descriptors by default: standard input, standard output, and standard error.

Finally, the OS jumps to the entry point of the program, and the CPU begins executing the program's instructions.

System call examples

fork

fork is the system call used to create a process on Linux. (Note that it creates a process, not a thread.)

The interface is a bit strange, but it matches the name well: execution forks. When a process calls fork, both the calling process and the new child process return from the call. In the child the return value is 0; in the parent it is the process ID of the child. A return value less than 0 means an error occurred.

Here is a minimal wrapper for creating a process:

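#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Despite its name, createThread creates a child process, not a thread.
// The parent gets the child's pid back; the child never returns.
int createThread(void (*callback)(void)) {
    // Flush pending output so the child does not inherit a copy of it.
    // (Calling fflush on an input stream such as stdin is undefined behavior.)
    fflush(stdout);
    fflush(stderr);
    int rc = fork();
    if (rc < 0) {
        // fork failed
        perror("fork");
        exit(1);
    } else if (rc == 0) {
        // child: run the callback, then terminate
        callback();
        exit(0);
    }
    // parent: rc is the child's pid
    return rc;
}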

fork copies the whole address space of the process, including the stdio buffers. Any output still sitting in a buffer would therefore be written once by the parent and again by the child, so we flush the output buffers first.

Then we call fork. After the parent returns from fork, it enters neither of the conditional branches, while the child enters the second branch and terminates after calling callback.

Terminating the child here is important. Otherwise the child would return from createThread and continue executing the parent's logic. Unless that is really what we want, it is better to exit right away.

wait

wait blocks the calling process until one of its child processes terminates, and then returns that child's pid. If there are several children, each call reaps whichever child terminates next, which is not necessarily the order in which they were created.

There is also waitpid, which waits for the child with a specific pid; see man waitpid for details.

exec

exec is used to run a different program. After exec is called, the current process loads the code and static data of the given executable and overwrites its own code segment and static data with them. The stack is reinitialized, the arguments are passed to the new program, and execution starts from its entry point. A successful exec never returns to the caller.

Example:

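#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "thread.h"  // the author's header; presumably declares createThread

// Runs in the child: replace the process image with /bin/ls.
void thread(void) {
    execl("/bin/ls", "ls", "-al", (char *)NULL);
    // execl only returns if it failed.
    perror("execl");
}

int main(int argc, char* argv[]) {
    createThread(thread);
    return 0;
}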

The exec family of system calls: exec[v|l][p][e]

v: the arguments are given as a vector, that is, a NULL-terminated char** array.

l: the arguments are given as a list of individual char* parameters, terminated by NULL.

p: the program is searched for in the directories of the PATH environment variable, so the full path need not be given.

e: a new set of environment variables is supplied to the program.

Example:

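char *args[] = {"ls", "-al", NULL};
char *env[] = {"AA=aa", "BB=bb", NULL};

// Each call returns only on failure, so in a real program at most one
// of these would take effect; they are listed together for comparison.
execv("/bin/ls", args);
execvp("ls", args);
execvpe("ls", args, env);   // GNU extension (requires _GNU_SOURCE)

execl("/bin/ls", "ls", "-al", NULL);
execlp("ls", "ls", "-al", NULL);
execle("/bin/ls", "ls", "-al", NULL, env);  // no PATH search: full path needed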

A shell usually works like this: it first creates a child process with fork, then calls exec in the child to replace the child's program with the requested command, and finally uses wait to wait for the child to finish.

Keeping fork and exec separate lets the shell do useful work in the child after fork and before exec, such as redirecting the output or input stream to a file:

$ wc ./thread.c > out

pipe

A pipe behaves much like a queue: data written to one end is read from the other end in the same order. It makes communication between processes possible (otherwise, the data that processes can share is limited).

C prototype:

int pipe(int fd[2]);

A return value of -1 means an error occurred; 0 means success.

After the call returns, two file descriptors have been written into the array fd: fd[0] is the read end and fd[1] is the write end. The pipe's buffer lives in memory rather than on disk, and the descriptors can be used with the usual I/O calls such as read and write.

For example, work8.c:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
#include "thread.h"

// Pipe descriptors
static int fd[2], readPipe, writePipe;

// Child process 1
void child1() {
    char buf[1024] = {0};
    printf("child1 in: ");
    fflush(stdout);

    // %1023s leaves room for the terminating '\0'
    while (scanf("%1023s", buf) == 1) {
        // Read from stdin, write to the pipe
        write(writePipe, buf, sizeof(buf));
        memset(buf, 0, sizeof(buf));

        // Yield the CPU so that child process 2 can print.
        // (This is unreliable; proper synchronization would use a lock
        // in memory shared between the processes, not covered here.)
        sleep(0);
        printf("child1 in: ");
        fflush(stdout);
    }
}

// Child process 2
void child2() {
    char buf[1024];
    while (read(readPipe, buf, sizeof(buf)) > 0) {
        printf("child2 out: ");
        printf("%s", buf);  // never pass input data as the format string
        printf("\n");
        fflush(stdout);
    }
}

int main() {
    // Create the pipe
    if (pipe(fd) != -1) {
        readPipe = fd[0];
        writePipe = fd[1];
    } else {
        printf("error");
        exit(1);
    }

    // Create the child processes
    createThread(child1);
    createThread(child2);

    // The main process waits for a child to exit
    wait(NULL);

    return 0;
}

Both reading and writing can block. A reader waits while the pipe is empty; a writer waits when the pipe's buffer is full, until the reader has consumed some of the data.

dup and dup2

Here are two interesting functions, dup and dup2:

int dup(int oldfd);
int dup2(int oldfd, int newfd);

The shell provides a special syntax for redirecting the stdout of the child process it runs to a file. For example:

$ ls -alh > out.txt

The output of ls is redirected to out.txt, which we can then inspect:

$ cat out.txt
total 72K
drwxrwxr-x 2 devgaolihai devgaolihai 4.0K Dec 31 18:21 .
drwxrwxr-x 6 devgaolihai devgaolihai 4.0K Dec 31 17:58 ..
-rw-rw-r-- 1 devgaolihai devgaolihai  535 Dec 28 18:38 05_fork.c
-rwxrwxr-x 1 devgaolihai devgaolihai  17K Dec 31 18:15 a.out
-rw-rw-r-- 1 devgaolihai devgaolihai  683 Dec 31 18:16 dup.c
-rw-rw-r-- 1 devgaolihai devgaolihai    0 Dec 31 18:21 out.txt
-rw-rw-r-- 1 devgaolihai devgaolihai  283 Dec 31 15:48 thread.h
-rw-rw-r-- 1 devgaolihai devgaolihai  314 Dec 29 22:19 threadTest.c
-rw-rw-r-- 1 devgaolihai devgaolihai  435 Dec 30 21:55 work1.c
-rw-rw-r-- 1 devgaolihai devgaolihai  372 Dec 31 17:35 work2.c
-rw-rw-r-- 1 devgaolihai devgaolihai  586 Dec 31 17:11 work3.c
-rw-rw-r-- 1 devgaolihai devgaolihai  884 Dec 31 17:29 work4.c
-rw-rw-r-- 1 devgaolihai devgaolihai  184 Dec 31 17:35 work7.c
-rw-rw-r-- 1 devgaolihai devgaolihai 1.2K Dec 31 17:07 work8.c
-rwx------ 1 devgaolihai devgaolihai    9 Dec 31 18:15 work8_test.txt

This feature can be implemented with dup2. An example is given below:

dup2(oldfd, newfd) makes the descriptor given as the second parameter refer to the same open file as the descriptor given as the first parameter. If newfd was already open, it is closed first.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include "thread.h"

// Child process
void child() {
    printf("child pro");
}

int main() {
    // Effectively turns STDOUT_FILENO into target:
    // from now on, every operation on STDOUT_FILENO acts on target
    int target = open("./dup_out.txt", O_CREAT | O_TRUNC | O_RDWR, 0664);
    dup2(target, STDOUT_FILENO);
    createThread(child);

    return 0;
}

dup, by contrast, returns a new file descriptor that refers to the same open file as its parameter.

Origin blog.csdn.net/qq_16181837/article/details/112295623