Operating System Practice 05 - File Descriptors and System Calls
1. Concept
1.1 File descriptors
Definition:
- A file descriptor, abbreviated as fd, is a non-negative integer;
- Applications use file descriptors to access files.
When a process opens an existing file or creates a new file, the kernel returns a file descriptor.
1.2 System calls
Open a file
int open(const char *path, int flags, ... /* mode_t mode */);
- The kernel returns a file descriptor fd to represent the file, or -1 on failure;
- The mode argument is only needed when flags contains O_CREAT;
- When reading and writing, fd specifies the file to be read or written.
Read and write files
ssize_t read(int fd, void *buf, size_t size);
ssize_t write(int fd, const void *buf, size_t size);
Here fd is the file descriptor returned by open; it specifies the file to be read or written. Both calls return the number of bytes actually transferred, or -1 on failure.
1.3 Examples
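// exe1.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
// The three headers above are needed for the open system call
#include <unistd.h>
// unistd.h is needed for the read/write/close system calls
#include <stdio.h>
// stdio.h is needed for puts and perror
int main()
{
    int fd;
    // O_RDONLY opens the file read-only
    fd = open("/etc/hosts", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    char buf[1024];
    ssize_t count;
    // Read data from the file into buf; read returns the number of bytes actually read.
    // One byte is left free for the terminating 0.
    count = read(fd, buf, sizeof(buf) - 1);
    if (count < 0)
        count = 0;
    // Terminate the text in buf with 0 and print it
    buf[count] = 0;
    puts(buf);
    close(fd);
    return 0;
}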
Compile and run.
$ cc -o exe1 exe1.c
$ ./exe1
127.0.0.1 localhost
2. Kernel implementation
2.1 File structure
The kernel uses a file structure to represent an open file. The file structure stores information about the opened file:
- The index node (inode) corresponding to the file;
- The current access position of the file;
- The open mode of the file: read-only, write-only, or read-write.
2.2 File Descriptor Table
The file descriptor table is an array:
- Each element of the array is a pointer to a file structure;
- The array records the files that have been opened.
When the kernel opens a file, it:
- Allocates a file structure to represent the opened file;
- Saves a pointer to the file structure in the file descriptor table.
The process of opening a file is as follows:
1. Find the index node (inode) of the file;
2. Allocate a file structure whose inode field points to the inode found in step 1 and whose access position field is initialized to 0;
3. Find a free entry in the file descriptor table, make it point to the file structure from step 2, and return the index of that entry in the array, i.e. fd.
2.3 Process control block
A process control block is a data structure that the operating system uses to represent the state of a process. It stores all the information needed to describe the process and control its execution:
- Process identification information;
- Processor state;
- Process scheduling information;
- The open file list, that is, the file descriptor table, which records the files opened by the process.
2.4 Private file descriptor table
The file descriptor table is private to the process:
- Each process has a private file descriptor table;
- If there are N processes in the operating system, there are correspondingly N file descriptor tables.
Two processes that open different files may receive the same file descriptor:
- Process A opens the file a.txt, and the return value of open is 3;
- Process B opens the file b.txt, and the return value of open may also be 3.
An example follows. There are two files, a.c and b.c, in the current directory.
// Program a.c opens the file /etc/passwd
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
    int fd;
    fd = open("/etc/passwd", O_RDONLY);
    printf("open(/etc/passwd) = %d\n", fd);
    close(fd);
    return 0;
}
// Program b.c opens the file /etc/hosts
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
    int fd;
    fd = open("/etc/hosts", O_RDONLY);
    printf("open(/etc/hosts) = %d\n", fd);
    close(fd);
    return 0;
}
Compile and run:
$ ls
a.c b.c
$ cc -o a.exe a.c
$ ./a.exe
open(/etc/passwd) = 3
$ cc -o b.exe b.c
$ ./b.exe
open(/etc/hosts) = 3
Although different files are opened, the returned file descriptors are the same.
3. Standard input and output
3.1 Introduction
When a process starts executing, three standard files are already open:
- The standard input file, usually connected to the keyboard of the terminal;
- The standard output file, usually connected to the screen of the terminal;
- The standard error file, usually connected to the screen of the terminal.
They occupy the first three entries of the process's file descriptor table:
- Entry 0 corresponds to standard input;
- Entry 1 corresponds to standard output;
- Entry 2 corresponds to standard error.
3.2 Predefined file descriptors
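// exe2.c
#include <unistd.h>
int main()
{
    char buf[80];
    ssize_t count;
    // read returns the number of bytes actually read
    count = read(0, buf, sizeof(buf));
    // write takes an explicit length, so buf needs no terminating 0
    if (count > 0)
        write(1, buf, count);
    return 0;
}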
File descriptors 0 and 1 can be used directly, without calling open. The program reads a line from file descriptor 0 and writes what it read to file descriptor 1. Compile and run the program.
$ cc -o exe2 exe2.c
$ ./exe2
hello
hello
3.3 Newly opened file
Verify that the newly opened file has a file descriptor of 3.
// exe3.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
    int fd;
    fd = open("/etc/hosts", O_RDONLY);
    printf("open(/etc/hosts) = %d\n", fd);
    return 0;
}
Compile and run the program.
$ cc -o exe3 exe3.c
$ ./exe3
open(/etc/hosts) = 3
Entries 0 to 2 of the file descriptor table are occupied, and open returns the lowest free entry, so the newly opened file descriptor is 3.
// exe4.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
    int fd;
    // Close the predefined file descriptor 2 first, then open a new file
    close(2);
    fd = open("/etc/hosts", O_RDONLY);
    printf("open(/etc/hosts) = %d\n", fd);
    return 0;
}
Entry 2 of the file descriptor table is now free, so the newly opened file is expected to get file descriptor 2. Compile and run the program.
$ cc -o exe4 exe4.c
$ ./exe4
open(/etc/hosts) = 2
4. Descriptor inheritance
4.1 fork system call
// Prototype
#include <unistd.h>
pid_t fork(void);
fork creates a child process:
- It creates a separate address space for the child process;
- It creates a separate file descriptor table for the child process.
The child process copies the following from the parent process:
- The contents of the code segment and data segment;
- The file descriptor table.
As a result, the child process inherits the file descriptors opened by the parent process.
4.2 Examples
The function dump reads the contents of the file that fd refers to and prints them.
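// exe5.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
// dump reads the contents of the file that fd refers to and prints them
void dump(int fd)
{
    // 128 bytes of data plus one byte for the terminating 0
    char buf[128 + 1];
    ssize_t count;
    count = read(fd, buf, sizeof(buf) - 1);
    if (count < 0)
        count = 0;
    buf[count] = 0;
    puts(buf);
}
int main()
{
    pid_t pid;
    int fd;
    // The parent process opens /etc/passwd and gets fd
    fd = open("/etc/passwd", O_RDONLY);
    pid = fork();
    if (pid == 0)
        // The child process uses dump to display the file contents
        dump(fd);
    return 0;
}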
The child process inherits the parent's file descriptor fd and can use it. Compile and run the program.
$ cc -o exe5 exe5.c
$ ./exe5
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:s
The child process prints the beginning of /etc/passwd (dump reads only the first 128 bytes), which shows that the file descriptor fd opened by the parent process is valid in the child process.
5. System call dup
5.1 dup prototype
// Prototype
#include <unistd.h>
int dup(int oldfd);
Function:
- Creates a new file descriptor newfd by duplicating the file descriptor oldfd;
- newfd and oldfd point to the same file.
Parameter:
- oldfd: the file descriptor to be duplicated.
Return value:
- On success, returns the new file descriptor;
- On failure, returns -1.
Before dup: entries 0 to 2 of the file descriptor table are occupied, and oldfd is 2, i.e. it refers to the entry at index 2.
After dup: dup finds the lowest free entry, which is the entry at index 3, and makes it point to the same file as oldfd. So dup returns newfd = 3.
5.2 dup2 prototype
// Prototype
#include <unistd.h>
int dup2(int oldfd, int newfd);
Function:
- Makes newfd a copy of the file descriptor oldfd; if newfd is already open, it is closed first;
- newfd and oldfd point to the same file.
Parameters:
- oldfd: the file descriptor to be duplicated;
- newfd: the number of the newly created file descriptor.
Return value:
- On success, returns newfd;
- On failure, returns -1.
5.3 Explicit output to log file
// exe6.c
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
int main()
{
    int fd;
    // Create a file named log in the current directory
    fd = open("log", O_CREAT|O_RDWR, 0666);
    // Write the string "hello" to the file log
    write(fd, "hello\n", 6);
    close(fd);
    return 0;
}
Compile and run the program.
$ cc -o exe6 exe6.c
$ ./exe6
$ cat log
hello
5.4 Redirect to log file
When a process starts, the first 3 entries of its file descriptor table are already open.
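// exe7.c
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
int main()
{
    int fd;
    // Create a file named log in the current directory
    fd = open("log", O_CREAT|O_RDWR, 0666);
    // Use dup2 to redirect standard output to the file log;
    // file descriptor 1 is standard output, and fd refers to the file log
    dup2(fd, 1);
    // Standard output now refers to the file log, so fd is no longer needed and is closed
    close(fd);
    // Write "hello" to standard output; because standard output was redirected,
    // the text ends up in the file log
    write(1, "hello\n", 6);
    return 0;
}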
After open("log"), the newly opened file descriptor is 3, because the first 3 entries of the file descriptor table are already occupied.
dup2(fd, 1) operates on the file descriptor table as follows:
- First, it closes file descriptor 1;
- Then, it makes file descriptor 1 point to the file that fd refers to.
After close(fd), entry 3 is free again and only file descriptor 1 refers to the file log.
Compile and run the program.
$ cc -o exe7 exe7.c
$ ./exe7
$ cat log
hello
6. System call pipe
6.1 pipe prototype
// Prototype
#include <unistd.h>
int pipe(int fd[2]);
Function: creates a pipe, which has a read end and a write end.
Parameters: fd[0] receives the read end of the pipe; fd[1] receives the write end of the pipe.
Return value: 0 on success; -1 on failure.
The kernel performs the following steps:
- Creates a first-in-first-out (FIFO) queue for storing data;
- Creates two file structures: the read end of the pipe, which reads data from the FIFO queue, and the write end of the pipe, which writes data to the FIFO queue;
- Returns two file descriptors, fd[0] and fd[1]: fd[0] points to the read end of the pipe, and fd[1] points to the write end.
6.2 Example 1
// exe8.c
#include <stdio.h>
#include <unistd.h>
int main()
{
    int fd[2];
    char buf[32];
    pipe(fd);
    // Send the string hello (6 bytes including the terminating 0) into the pipe with write(fd[1])
    write(fd[1], "hello", 6);
    // Read the data back from the pipe with read(fd[0])
    read(fd[0], buf, sizeof(buf));
    printf("Receive:%s\n", buf);
    return 0;
}
Compile and run the program.
$ cc -o exe8 exe8.c
$ ./exe8
Receive:hello
6.3 Example 2
// exe9.c
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
    int pid;
    int fd[2];
    char buf[32];
    // Create the pipe first
    pipe(fd);
    // Then create the child process, which inherits the file descriptors fd[0] and fd[1]
    pid = fork();
    if (pid == 0) {
        // child
        close(fd[0]);
        // The child writes the string hello to the pipe with write(fd[1])
        write(fd[1], "hello", 6);
        exit(0);
    }
    // parent
    close(fd[1]);
    // The parent reads the data from the pipe with read(fd[0])
    read(fd[0], buf, sizeof(buf));
    printf("Receive:%s\n", buf);
    return 0;
}
Compile and run the program.
$ cc -o exe9 exe9.c
$ ./exe9
Receive:hello
The parent process uses pipe() to create the pipe. After fork(), the child process has a copy of the parent's file descriptor table.
- The child process calls close(fd[0]): it closes the read end of the pipe and uses the write end.
- The parent process calls close(fd[1]): it closes the write end of the pipe and uses the read end.
6.4 Example 3
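// exe10.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
    int pid;
    int fd[2];
    char buf[32];
    // Create the pipe
    pipe(fd);
    // Create the child process, which inherits the file descriptors fd[0] and fd[1]
    pid = fork();
    if (pid == 0) {
        // child
        // Redirect standard output to the write end of the pipe (fd[1]);
        // the child uses standard output to send data to the parent
        dup2(fd[1], 1);
        close(fd[0]);
        close(fd[1]);
        // The child writes the string hello to the pipe through descriptor 1
        write(1, "hello", 6);
        exit(0);
    }
    // parent
    // Redirect standard input to the read end of the pipe (fd[0])
    dup2(fd[0], 0);
    close(fd[0]);
    close(fd[1]);
    // The parent reads the data from the pipe through descriptor 0
    read(0, buf, sizeof(buf));
    printf("Receive:%s\n", buf);
    return 0;
}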
Compile and run the program.
$ cc -o exe10 exe10.c
$ ./exe10
Receive:hello
The parent process uses pipe() to create the pipe. After fork(), the child process has a copy of the parent's file descriptor table.
The child process calls dup2(fd[1], 1), which directs standard output to the write end of the pipe, fd[1]. The parent process calls dup2(fd[0], 0), which directs standard input to the read end of the pipe, fd[0].
Each process then calls close(fd[0]) and close(fd[1]) to close its original descriptors for the read and write ends of the pipe; descriptors 0 and 1 keep the pipe open.
The parent and child processes are now connected by the pipe: the standard output of the child process is connected to the standard input of the parent process.
The shell provides the same mechanism with the | operator. In cat /etc/passwd | wc -l, the standard output of the cat command is connected to the standard input of the wc command, and wc reports that /etc/passwd has 45 lines.
Modify the program exe10.c to implement the pipeline above.
// exe11.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
    int pid;
    int fd[2];
    pipe(fd);
    pid = fork();
    if (pid == 0) {
        // child
        dup2(fd[1], 1);
        close(fd[0]);
        close(fd[1]);
        // Run cat, which sends the contents of /etc/passwd to standard output
        execlp("cat", "cat", "/etc/passwd", (char *)NULL);
        // Reached only if execlp fails
        exit(1);
    }
    // parent
    dup2(fd[0], 0);
    close(fd[0]);
    close(fd[1]);
    // Do not read from standard input here: any bytes consumed by the
    // parent would be missing from the count that wc produces.
    // Run wc, which reads standard input and counts the lines
    execlp("wc", "wc", "-l", (char *)NULL);
    // Reached only if execlp fails
    return 1;
}
Compile and run the program.
$ cc -o exe11 exe11.c
$ ./exe11
45
$ cat /etc/passwd | wc -l
45
The result is consistent with that of the pipeline command cat /etc/passwd | wc -l.