Linux system calls open, write, read, close, and related summaries

When learning C, we met the standard library's I/O functions, such as fopen, fwrite, fread, fprintf, and fclose. These are all C library functions that wrap system calls. To understand Linux better, we need to look at the I/O operations that sit closer to the kernel, which means understanding the basic system calls: open, write, read, and close.

First, let's look at the open, write, read, and close system calls.

open

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags, mode_t mode);

open takes three parameters: 
pathname: the name of the file to open or create 
flags: one or more options that control how the file is opened; multiple options are combined with bitwise OR 
Common flags:

  1. O_RDONLY: read-only open
  2. O_WRONLY: open for write only
  3. O_RDWR: read, write open
  4. O_CREAT: If the file does not exist, create the file
  5. O_APPEND: Append write

Exactly one of flags 1, 2, and 3 must be specified. When flag 4 (O_CREAT) is used, the third parameter of open, mode, must also be supplied: it gives the access permissions of the new file.

Return value: Success: File descriptor (fd) for newly opened file 
Failed: -1
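
As a rough illustration of combining flags, the sketch below opens a file for appending and creates it if it does not exist. The helper name open_for_append and the permission value 0644 are assumptions made for this example, not something the text above prescribes.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Hypothetical helper: open `path` for appending, creating it if needed.
 * Exactly one access mode (O_WRONLY here) is ORed with the optional flags.
 * Because O_CREAT is present, the third argument (mode) must be given;
 * 0644 is only an example value and is still filtered by the umask.
 * Returns the new file descriptor, or -1 on failure. */
int open_for_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

A caller would check the result against -1 before using the descriptor, and close it when finished.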

write

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t count);

fd: file descriptor 
buf: the buffer holding the data to write 
count: the number of bytes to write
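
One detail worth knowing is that write returns the number of bytes actually written, which may be less than count. A minimal sketch of a loop that handles this is shown below; the helper name write_all is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes from
 * `buf` have been written, because a single write() may write fewer bytes
 * than requested.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                    /* skip past the bytes already written */
        count -= (size_t)n;
    }
    return 0;
}
```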

read

#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);


The parameters of read closely mirror those of write; only the second differs in meaning: buf is the buffer that receives the data read, and count is the maximum number of bytes to read.
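
read returns the number of bytes it actually stored in buf, and returns 0 at end of file. The sketch below loops until the buffer is full or the input ends; the helper name read_up_to is an assumption made for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to `cap` bytes from fd into buf, looping
 * because one read() may return fewer bytes than requested.
 * Stops at end of file (read returns 0) or when buf is full.
 * Returns the number of bytes stored, or -1 on error. */
ssize_t read_up_to(int fd, void *buf, size_t cap)
{
    char *p = buf;
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, p + total, cap - total);
        if (n == 0)
            break;                 /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```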

close

#include <unistd.h>
int close(int fd);


close takes only the file descriptor. Do not skip this call: every fd you open should be closed once you are done with it, because descriptors are a limited per-process resource.
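
To make the effect of close visible, the following sketch checks whether a descriptor is still valid before and after closing it. Both helper names are hypothetical; fcntl with F_GETFD is used here only as a cheap way to ask the kernel whether the descriptor is open.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open file in this process, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Hypothetical helper: close fd and confirm the descriptor was released.
 * Returns 0 on success, -1 if close() failed (for example because fd was
 * not open) or if the descriptor is somehow still valid afterwards. */
int close_and_verify(int fd)
{
    if (close(fd) == -1)
        return -1;
    return fd_is_open(fd) ? -1 : 0;
}
```

Closing the same descriptor twice fails the second time, which is why the return value of close is worth checking.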

The key parameter fd is involved in these functions, so to understand these functions, you must first understand the file descriptor (fd).

What is a file descriptor? It is a fairly abstract concept, so let's look at how the kernel keeps track of a process's open files.

The PCB (task_struct in Linux) has a files pointer that points to a files_struct structure. Inside files_struct there is an array of file pointers (struct file *fd_array[]), each of which points to a struct file describing one open file. An fd can be understood as an index into this pointer array, so opening a file gives us that file's fd.

How an fd is allocated: 
The new file descriptor is the smallest index in the files_struct array that is not currently in use. 
By default, the first three entries are already taken: index 0 is standard input (stdin), index 1 is standard output (stdout), and index 2 is standard error (stderr). 
Therefore, a newly opened file normally gets fd 3 or higher, but if we close one of the default descriptors, the next file opened takes over the closed fd.
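
The lowest-unused rule can be observed directly. The sketch below is a small experiment, assuming a single-threaded process and that /dev/null is available; the function name is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Demonstrates the "smallest unused index" rule.
 * Opens /dev/null twice, closes the first descriptor, then opens again.
 * Returns 1 if the third open() reused the descriptor that was just
 * closed, 0 if it did not, and -1 if any open() failed. */
int lowest_fd_is_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;

    int b = open("/dev/null", O_RDONLY);   /* gets the next free index */
    if (b < 0) {
        close(a);
        return -1;
    }

    close(a);                              /* index a is now the lowest free one */
    int c = open("/dev/null", O_RDONLY);
    int reused = (c == a);

    if (c >= 0)
        close(c);
    close(b);
    return c < 0 ? -1 : reused;
}
```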

Speaking of fd, we have to distinguish between FILE and fd

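FILE is a structure provided by the C library, while an fd is just an integer handed out by the kernel through system calls, which is closer to the bottom layer. Since the C library ultimately performs its I/O through system calls, every FILE must wrap an fd.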

We can take a look at the FILE structure: 
typedef struct _IO_FILE FILE; in /usr/include/stdio.h

It has this section in its structure:

struct _IO_FILE {
  int _flags;       /* High-order word is _IO_MAGIC; rest is flags. */
#define _IO_file_flags _flags

  /* Buffer-related fields */
  /* The following pointers correspond to the C++ streambuf protocol. */
  /* Note:  Tk uses the _IO_read_ptr and _IO_read_end fields directly. */
  char* _IO_read_ptr;   /* Current read pointer */
  char* _IO_read_end;   /* End of get area. */
  char* _IO_read_base;  /* Start of putback+get area. */
  char* _IO_write_base; /* Start of put area. */
  char* _IO_write_ptr;  /* Current put pointer. */
  char* _IO_write_end;  /* End of put area. */
  char* _IO_buf_base;   /* Start of reserve area. */
  char* _IO_buf_end;    /* End of reserve area. */
  /* The following fields are used to support backing up and undo. */
  char *_IO_save_base; /* Pointer to start of non-current get area. */
  char *_IO_backup_base;  /* Pointer to first valid character of backup area */ 
  char *_IO_save_end; /* Pointer to end of non-current get area. */

  struct _IO_marker *_markers;

  struct _IO_FILE *_chain;

  int _fileno;  /* wraps the fd */

You can see that int _fileno wraps the fd. The excerpt also begins with a large block of buffer-related fields. Why list them? Let's first look at a rather strange example:

  #include <stdio.h>                                                            
  #include <string.h>
  #include <unistd.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <fcntl.h>

  int main(){
      const char *msg1 = "hello printf\n";
      const char *msg2 = "hello fwrite\n";
      const char *msg3 = "hello write\n";

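      printf("%s", msg1);                     /* C library, buffered; msg1 is not used as the format string */
      fwrite(msg2, 1, strlen(msg2), stdout);  /* C library, buffered */
      write(1, msg3, strlen(msg3));           /* system call, no user-space buffer */
      fork();                                 /* the child inherits any unflushed stdio data */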
      return 0;
  }

Output: 
[rlh@localhost test]$ ./hello 
hello printf 
hello fwrite 
hello write

But when we redirect the process's output to a file, something strange happens: 
[rlh@localhost test]$ ./hello > file

[rlh@localhost test]$ cat file 
hello write 
hello printf 
hello fwrite 
hello printf 
hello fwrite

Why is this? It comes from the buffering done by the C library. A C library stream uses one of three buffering modes: (1) unbuffered, (2) line buffered, and (3) fully buffered. 
stdout is line buffered when it is connected to a terminal and fully buffered when it is writing to a file. 
In the example above, write is not affected because it is a system call and has no user-space buffer, while printf and fwrite go through the stdout buffer. When the output is redirected to a regular file, stdout switches from line buffering to full buffering, so the two messages are still sitting in the buffer when fork is called; fork itself does not flush anything. The buffer is part of the process's memory, so after fork the child has its own copy of the same data (via copy-on-write). When the parent and the child exit, each flushes its own copy to the file, so the printf and fwrite messages appear twice, while the write message appears only once and comes first.
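
A common way to avoid the duplicated output is to flush the stream before calling fork, so the buffer is empty when it gets copied. The sketch below is one way to observe the difference; the function name and file path handling are assumptions for illustration. It runs the experiment in a child process whose stdout is reopened onto a file, then counts how many times the line was written.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical experiment: in a child process, point stdout at `path`
 * (a regular file, so the stream is fully buffered), print one line,
 * optionally flush, then fork again and let both processes exit.
 * Returns how many copies of the line reached the file, or -1 on error. */
int count_lines_after_fork(const char *path, int flush_first)
{
    fflush(NULL);                      /* do not let the child inherit our own pending output */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (freopen(path, "w", stdout) == NULL)
            _exit(1);
        printf("hello printf\n");      /* stays in the stdio buffer */
        if (flush_first)
            fflush(stdout);            /* empty the buffer before it is duplicated */

        pid_t inner = fork();          /* both processes now hold the same buffer contents */
        if (inner > 0)
            waitpid(inner, NULL, 0);
        exit(0);                       /* exit() flushes whatever is still buffered */
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char line[64];
    int count = 0;
    while (fgets(line, sizeof line, fp) != NULL)
        if (strcmp(line, "hello printf\n") == 0)
            count++;
    fclose(fp);
    return count;
}
```

Without the flush the line is written once by each process; with the flush it is written once in total, because the buffer is already empty when fork copies it.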

Now let's look at redirection. 
There are three types of redirection:

  1. Output redirection (>): close what fd 1 refers to and open the target file in its place, so fd 1 now refers to that file
  2. Input redirection (<): likewise, close what fd 0 refers to and open the target file in its place
  3. Append redirection (>>): the same as output redirection, except the file is opened with the additional O_APPEND flag
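
Output and append redirection can be sketched in a few lines. Note one substitution: instead of closing fd 1 and relying on the next open to land on index 1, this sketch opens the file first and uses dup2 to make fd 1 refer to it, which gives the same result without depending on fd 0 being open. The helper name run_redirected is hypothetical, and the work is done in a child process so the caller's own stdout is left alone.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper imitating `cmd > path` (append == 0) or
 * `cmd >> path` (append != 0), where "cmd" just writes msg to fd 1.
 * Returns 0 if the child redirected and wrote successfully, else -1. */
int run_redirected(const char *path, int append, const char *msg)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* > truncates the file; >> adds O_APPEND instead */
        int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
        int fd = open(path, flags, 0644);
        if (fd < 0)
            _exit(1);
        if (fd != 1) {
            if (dup2(fd, 1) < 0)       /* fd 1 now refers to the file */
                _exit(1);
            close(fd);                 /* the extra descriptor is no longer needed */
        }
        size_t len = strlen(msg);
        ssize_t n = write(1, msg, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Input redirection follows the same pattern with fd 0 and a file opened with O_RDONLY.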
