Operating System Practice 05 - File Descriptors and System Calls

1. Concept

1.1 File descriptors

definition:

  • a non-negative integer;
  • Applications use file descriptors to access files;
  • file descriptor, abbreviated as fd.

When a process opens an existing file or creates a new file, the kernel returns a file descriptor.

1.2 System calls

open a file

int open(const char *path, int flags, mode_t mode);
  • The kernel returns a file descriptor, fd, that represents the opened file; on failure, open returns -1.
  • Later reads and writes pass fd to specify which file is to be read or written.

read and write files

ssize_t read(int fd, void *buf, size_t size);
ssize_t write(int fd, const void *buf, size_t size);

fd is the file descriptor returned by open; it specifies the file to be read or written.

1.3 Examples

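// exe1.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
// The three headers above are needed for the open system call
#include <unistd.h>
// unistd.h is needed for the read/write system calls
#include <stdio.h>
// stdio.h is needed for puts

int main()
{
    int fd;
    // O_RDONLY opens the file read-only
    fd = open("/etc/hosts", O_RDONLY);
    if (fd < 0)
        return 1;

    char buf[1024];
    int count;
    // Read data from the file into buf; read returns the number of bytes actually read.
    // One byte is left free for the terminating 0.
    count = read(fd, buf, sizeof(buf) - 1);
    if (count < 0)
        count = 0;
    // Terminate the text in buf with 0 and print it
    buf[count] = 0;
    puts(buf);

    close(fd);
    return 0;
}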

Compile and run.

$ cc -o exe1 exe1.c
$ ./exe1
127.0.0.1 localhost

2. Kernel implementation

2.1 file structure

The kernel uses a file structure to represent an open file. The file structure stores information about the opened file:

  • The index node (inode) corresponding to the file;
  • The current access location of the file;
  • Open mode of the file: read-only, write-only, readable and writable.

2.2 File Descriptor Table

The file descriptor table is an array:

  • The element type of the array is a pointer, and the pointer points to a file structure;
  • Used to save opened files.

When the kernel opens the file:

  • Allocate a file structure to represent the opened file;
  • Save the file structure pointer in the file descriptor table.

The process of opening a file is as follows:

  1. Find the index node (inode) corresponding to the file;
  2. Allocate a file structure; its inode field points to the inode found in step 1, and its file access position field is initialized to 0;
  3. Find a free entry in the file descriptor table, make it point to the file structure from step 2, and return the index of that entry in the array, i.e. fd.

2.3 Process control block

A process control block is a data structure used by the operating system to represent the state of a process. It stores all the information needed to describe the process and to control its execution:

  • process identification information;
  • Processor state;
  • Process scheduling information;
  • The open file list, that is, the file descriptor table, records the files opened by the process.

2.4 Private file descriptor table

The file descriptor table is private to the process:

  • Each process has a private file descriptor table;
  • If there are N processes in the operating system, there are correspondingly N file descriptor tables.

When two processes open different files, the returned file descriptors may be the same:

  • Process A opens the file a.txt, and the return value of open is 3;
  • Process B opens the file b.txt, and the return value of open may also be 3.

Examples are as follows:

There are two files, a.c and b.c, in the current directory.

// Program a.c opens the file /etc/passwd
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    
    
    int fd;

    fd = open("/etc/passwd", O_RDONLY); 
    printf("open(/etc/passwd) = %d\n", fd);
    close(fd);

    return 0;
}

// Program b.c opens the file /etc/hosts
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    
    
    int fd;

    fd = open("/etc/hosts", O_RDONLY); 
    printf("open(/etc/hosts) = %d\n", fd);
    close(fd);

    return 0;
}

Compile and run:

$ ls
a.c b.c
$ cc -o a.exe a.c
$ ./a.exe
open(/etc/passwd) = 3
$ cc -o b.exe b.c
$ ./b.exe
open(/etc/hosts) = 3

Although different files are opened, the returned file descriptors are the same.

3. Standard input and output

3.1 Introduction

When a process starts executing, three standard files are already open:

  • The standard input file, usually corresponding to the keyboard of the terminal;

  • The standard output file, usually corresponding to the screen of the terminal;

  • The standard error output file, usually corresponding to the terminal screen.


The first three entries in the process's file descriptor table have been opened:

  • Item 0 corresponds to standard input;
  • Item 1 corresponds to standard output;
  • Item 2 corresponds to standard error output.

3.2 Predefined file descriptors

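// exe2.c
#include <unistd.h>

int main()
{
    char buf[80];
    int count;
    // read returns the number of bytes actually read
    count = read(0, buf, sizeof(buf));
    // write takes an explicit length, so buf needs no terminating 0
    if (count > 0)
        write(1, buf, count);
    return 0;
}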

File descriptors 0 and 1 can be used directly. Read a line from file descriptor 0 and write the read content to file descriptor 1. Compile and run the program.

$ cc -o exe2 exe2.c
$ ./exe2
hello
hello

3.3 New open file

Verify that a newly opened file gets file descriptor 3.

// exe3.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    
    
    int fd;

    fd = open("/etc/hosts", O_RDONLY);  
    printf("open(/etc/hosts) = %d\n", fd);

    return 0;
}

Compile and run the program.

$ cc -o exe3 exe3.c
$ ./exe3
open(/etc/hosts) = 3

Entries 0, 1, and 2 of the file descriptor table are already occupied, and the kernel hands out the lowest free entry, so the newly opened file descriptor is 3.


// exe4.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    
    
    int fd;
    // Close the predefined file descriptor 2 first, then open the new file
    close(2);
    fd = open("/etc/hosts", O_RDONLY);  
    printf("open(/etc/hosts) = %d\n", fd);

    return 0;
}

Entry 2 of the file descriptor table is now free, so the newly opened file is expected to get file descriptor 2. Compile and run the program.

$ cc -o exe4 exe4.c
$ ./exe4
open(/etc/hosts) = 2

4. Descriptor inheritance

4.1 fork system call

// Prototype
#include <unistd.h>

pid_t fork(void);

fork creates a child process:

  • It creates a separate address space for the child process;

  • It creates a separate file descriptor table for the child process.


The child process copies the following from the parent process:

  • The contents of the code segment and data segment;

  • The file descriptor table.

As a result, the child process inherits the file descriptors that were open in the parent process.

4.2 Examples

The function dump reads the contents of the file that fd refers to and prints them.

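// exe5.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

// dump reads the contents of the file that fd refers to and prints them
void dump(int fd)
{
    // One extra byte so that a full 128-byte read still leaves room for the terminating 0
    char buf[128 + 1];
    int count;

    count = read(fd, buf, sizeof(buf) - 1);
    if (count < 0)
        count = 0;
    buf[count] = 0;
    puts(buf);
}

int main()
{
    pid_t pid;
    int fd;
    // The parent process opens /etc/passwd and gets fd
    fd = open("/etc/passwd", O_RDONLY);
    pid = fork();
    if (pid == 0)
        // The child process uses dump to display the file contents
        dump(fd);
    return 0;
}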

The child process inherits the parent's file descriptor fd and can use it. Compile and run the program.

$ cc -o exe5 exe5.c
$ ./exe5
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:s

The child process correctly prints the contents of the file /etc/passwd, indicating that the file descriptor fd opened by the parent process is valid in the child process.

5. System call dup

5.1 dup prototype

// Prototype
#include <unistd.h>

int dup(int oldfd);

Function:

  • Creates a new file descriptor, newfd, by duplicating the file descriptor oldfd;
  • newfd and oldfd point to the same file.

Parameter:

  • oldfd: the file descriptor to be copied.

Return value:

  • If successful, returns the newly copied file descriptor;
  • Returns -1 on failure.

Before dup: the first 3 entries of the file descriptor table are occupied, and oldfd refers to the entry with index 2.

After dup: dup looks for a free entry; the entry with index 3 is free, so dup returns newfd = 3.

5.2 dup2 prototype

// Prototype
#include <unistd.h>

int dup2(int oldfd, int newfd);

Function:

  • Makes newfd a copy of the file descriptor oldfd; if newfd is already open, it is closed first;

  • newfd and oldfd point to the same file.


Parameters:

  • oldfd: the file descriptor to be copied;

  • newfd: the file descriptor number to create.


Return value:

  • If successful, returns newfd;

  • Returns -1 on failure.


5.3 Explicit output to log file

// exe6.c
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
    
    
    int fd;
    // Create a file named log in the current directory
    fd = open("log", O_CREAT|O_RDWR, 0666);
    // Write the string "hello" to the file log
    write(fd, "hello\n", 6);  
    close(fd);
    return 0;
}

Compile and run the program.

$ cc -o exe6 exe6.c
$ ./exe6
$ cat log
hello

5.4 Redirect to log file

When the process starts, the first 3 file descriptors in its file descriptor table are already open.

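// exe7.c
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
    int fd;
    // Create a file named log in the current directory
    fd = open("log", O_CREAT|O_RDWR, 0666);
    // Use dup2 to redirect standard output to the file log: file descriptor 1 is standard output, and fd refers to the file log
    dup2(fd, 1);
    // Standard output now goes to the file log, so fd is no longer needed and is closed
    close(fd);
    // Write the string "hello" to standard output; because of the redirection, it ends up in the file log
    write(1, "hello\n", 6);
    return 0;
}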

After open("log"), the newly opened file descriptor is 3, because the first 3 entries of the file descriptor table are already occupied.

dup2(fd, 1) changes the file descriptor table in two steps:

  • First, it closes file descriptor 1;

  • Then, it makes file descriptor 1 point to the same file structure as fd.

After close(fd), file descriptor 1 is the only entry that still refers to the file log.


Compile and run the program.

$ cc -o exe7 exe7.c
$ ./exe7
$ cat log
hello

6. System call pipe

6.1 pipe prototype

// Prototype
#include <unistd.h>

int pipe(int fd[2]);

Function: creates a pipe, which has a read end and a write end.

Parameters: fd[0] receives the read end of the pipe; fd[1] receives the write end of the pipe.

Return value: 0 on success; -1 on failure.

The kernel carries out pipe in three steps:

  1. Create a FIFO (first-in, first-out) queue for storing data.
  2. Create two file structures: the read end of the pipe, which reads data from the FIFO queue, and the write end of the pipe, which writes data to the FIFO queue.
  3. Return two file descriptors, fd[0] and fd[1]: fd[0] refers to the read end of the pipe; fd[1] refers to the write end of the pipe.

6.2 Example 1

// exe8.c
#include <stdio.h>
#include <unistd.h>

int main()
{
    
    
    int fd[2];
    char buf[32];

    pipe(fd);
    // Send the string hello, including its terminating 0 (6 bytes), to the pipe with write(fd[1])
    write(fd[1], "hello", 6);
    // Read the data back from the pipe with read(fd[0])
    read(fd[0], buf, sizeof(buf)); 
    printf("Receive:%s\n", buf); 
    return 0;
}

Compile and run the program.

$ cc -o exe8 exe8.c
$ ./exe8
Receive:hello

6.3 Example 2

// exe9.c
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    int pid;
    int fd[2];
    char buf[32];

    // Create the pipe first
    pipe(fd);
    // Then create the child process, which inherits the file descriptors fd[0] and fd[1]
    pid = fork();
    if (pid == 0) {
        // child
        close(fd[0]);
        // The child writes the string hello to the pipe with write(fd[1])
        write(fd[1], "hello", 6);
        exit(0);
    }
    // parent
    close(fd[1]);
    // The parent reads the data from the pipe with read(fd[0])
    read(fd[0], buf, sizeof(buf));
    printf("Receive:%s\n", buf);
    return 0;
}

Compile and run the program.

$ cc -o exe9 exe9.c
$ ./exe9
Receive:hello

The parent process uses pipe() to create the pipe. After fork(), the child process has a copy of the parent's file descriptor table, so both processes hold fd[0] and fd[1].


Child process close(fd[0]): Close the read end of the pipe and use the write end of the pipe. Parent process close(fd[1]): Close the write end of the pipe and use the read end of the pipe.

6.4 Example 3

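// exe10.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    int pid;
    int fd[2];
    char buf[32];

    // Create the pipe
    pipe(fd);
    // Create the child process, which inherits the file descriptors fd[0] and fd[1]
    pid = fork();
    if (pid == 0) {
        // child
        // The child redirects standard output to the write end of the pipe (fd[1]) and uses standard output to send data to the parent
        dup2(fd[1], 1);
        close(fd[0]);
        close(fd[1]);

        // The child writes the string hello to the pipe through file descriptor 1
        write(1, "hello", 6);
        exit(0);
    }
    // parent
    // The parent redirects standard input to the read end of the pipe (fd[0])
    dup2(fd[0], 0);
    close(fd[0]);
    close(fd[1]);

    // The parent reads the data from the pipe through file descriptor 0
    read(0, buf, sizeof(buf));
    printf("Receive:%s\n", buf);
    return 0;
}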

Compile and run the program.

$ cc -o exe10 exe10.c
$ ./exe10
Receive:hello

The parent process uses pipe() to create the pipe. After fork(), the child process has a copy of the parent's file descriptor table.


The child process calls dup2(fd[1], 1), which directs its standard output to the write end of the pipe, fd[1]. The parent process calls dup2(fd[0], 0), which directs its standard input to the read end of the pipe, fd[0].


The child process then calls close(fd[0]) and close(fd[1]) to close its original descriptors for the read and write ends of the pipe, and the parent process does the same. The pipe stays reachable through descriptor 1 in the child and descriptor 0 in the parent.

The parent and child processes are connected by pipes, and the standard output of the child process is connected to the standard input of the parent process.


The shell provides pipelines such as cat /etc/passwd | wc -l, in which the standard output of the cat command is connected to the standard input of the wc command. Here wc counts the lines of /etc/passwd and reports 45. Modify the program exe10.c to implement this pipeline.

// exe11.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    int pid;
    int fd[2];

    pipe(fd);
    pid = fork();
    if (pid == 0) {
        // child
        dup2(fd[1], 1);
        close(fd[0]);
        close(fd[1]);

        // Run the cat command, which sends the contents of /etc/passwd to standard output
        execlp("cat", "cat", "/etc/passwd", NULL);
        // Reached only if execlp fails
        exit(1);
    }
    // parent
    dup2(fd[0], 0);
    close(fd[0]);
    close(fd[1]);

    // Run the wc command, which reads standard input and counts the lines.
    // The parent must not read from the pipe itself first, or wc would miss that data.
    execlp("wc", "wc", "-l", NULL);
    // Reached only if execlp fails
    return 1;
}

Compile and run the program.

$ cc -o exe11 exe11.c
$ ./exe11
45
$ cat /etc/passwd | wc -l
45

Consistent with the results of the pipeline command cat /etc/passwd | wc -l.

Origin blog.csdn.net/weixin_46003347/article/details/123614676