Linux Basic IO (2): In-depth understanding of Linux file descriptors

I. Introduction

In the previous post we took a first look at the Linux system interface for file operations. It is easy to see that all of these system functions revolve around the file descriptor: the open function returns a file descriptor, and the write, read, and close functions all operate on a file descriptor.
This naturally raises the question: what exactly is a file descriptor? Let's start with the conclusion: a file descriptor is, in essence, an array index. That probably leaves you with a lot of questions, so let me explain in detail below.

II. Linux standard file descriptors

[Question 1]: Why does the fd of a newly opened file always start from 3? What are 0, 1, and 2?

Straight to the conclusion: the standard files with fd = 0~2 are opened by default. They are:

  • 0: standard input (stdin) → keyboard
  • 1: standard output (stdout) → monitor
  • 2: standard error (stderr) → monitor

Talk is cheap, so let's use code to verify the conclusion above:

  1. Read data directly from standard input (fd 0)
  2. Write data directly to standard output (fd 1)

[Summary 1]:

  1. The files with fd = 0~2 correspond to standard input, standard output, and standard error respectively
  2. The standard files are opened by default, so the fd of a newly opened file always counts up from 3 (unless a standard file has been closed manually)

III. The relationship between the file descriptor and the FILE structure

FILE is a structure defined by the C standard library that holds various information about an open file. One thing is certain: the FILE structure must encapsulate an fd. Why? Let's walk through the reasoning:

1. The inevitability of using the system interface
Files are stored on disk, and the disk is a peripheral. Who has access to peripherals? Only the operating system, because it must provide stable services to the layers above it and manage all the software and hardware resources below it.
If file operations could bypass the operating system, how would it know whether a file had been created or destroyed, and how could it provide you with stable services? From this simple reasoning it follows that access to hardware resources must go through the operating system.
For the sake of security and ease of use, the operating system does not trust anyone. It is like a bank: a bank does not open its vault to the public, it only offers a few service windows. The same is true of the operating system, and the windows it offers are the system interface.
From the deduction so far we can conclude that accessing a peripheral requires the system interface provided by the operating system. The C file operation functions are therefore, in essence, wrappers around the system interface. Since the system interface identifies a file only by its fd, the FILE structure is bound to encapsulate one. This is also simple to verify directly, by printing the fd held by each standard stream.




IV. The mapping relationship between processes and files

Before understanding the relationship between the two, let's answer a few questions first, so that everyone can have a basic understanding:

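[Question 2]: What is the essence of opening a file?
Answer: In essence, opening a file loads it into memory. Why memory? This is dictated by the von Neumann architecture: the CPU works directly only with memory, so data on disk has to be brought into memory before it can be operated on.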

[Question 3]: Who actually opens, accesses, and closes files?
Answer: These operations are all done by calling functions, so is it our code that performs them? That view is too simple. When we compile a program into an executable, does any file operation take place? Obviously not. The program executes the corresponding code, and therefore the corresponding file operations, only while it is running. So these operations are really carried out by the process, and a file operation is essentially a relationship between a process and an open file.

[Question 4]: How does the OS manage a large number of open files?
Answer: Once a file is opened it exists in memory, so memory will naturally hold a large number of open files. Does the operating system manage them? Yes. How? With the familiar design idea of "describe first, organize later":
When a file is opened, a corresponding kernel data structure, struct file, is created in the kernel (describe first), and these structures are then linked together in a list (organize later). In pseudocode:

struct file
{
   // member variables for the file's contents and attributes
   struct file* next;
   struct file* prev;
};

 With the above basic understanding, we will talk about the mapping relationship between processes and files based on the Linux kernel source code:

The process control block, task_struct, holds a pointer of type files_struct*. Inside files_struct there is a pointer array, fd_array, whose elements are of type file*, and each pointer in the array corresponds to a file opened by the process. So fd is essentially an index into the fd_array array. The file structure itself holds the basic information about the file. In short, the chain is: task_struct → files_struct → fd_array[fd] → file.

[Question 5]: What are the allocation rules for file descriptors?
Answer: The fd_array array is scanned from the beginning for the smallest unused index, and that index is assigned to the newly opened file. For example, if we manually close the file with fd = 1, the next file we open gets fd = 1.

Origin blog.csdn.net/whc18858/article/details/127712384