Linux system programming (1): File I/O

1. Basic knowledge of UNIX

1.1 UNIX architecture (shown in the figure below)

  • Strictly speaking, an operating system can be defined as the software that controls the computer's hardware resources and provides an environment in which programs run. This software is usually called the kernel because it is relatively small and sits at the core of the environment.
    • The interface to the kernel is called a system call (shaded area in the figure below)
    • The public function library is built on the system call interface. Applications can use either the public function library or the system call
    • A shell is a special application that provides an interface for running other applications

[Figure: UNIX operating system architecture]

1.2 Files and Directories

1.2.1 File system

  • The UNIX file system is a hierarchical structure of directories and files. Everything starts from a directory called the root, whose name is the single character "/"
  • A directory is a file that contains directory entries. Logically, each directory entry can be thought of as containing a file name together with information describing the attributes of the file.
    • File attributes include the file type (regular file, directory, and so on), the file size, the file owner, the file permissions (for example, whether other users can access the file), and the time the file was last modified.
    • The stat and fstat functions return an information structure containing all file attributes.

1.2.2 File name

  • Each name in a directory is called a file name (filename)
    • Only two characters cannot appear in a file name: the slash (/) and the null character.
    • The slash separates the file names that make up a path name, and the null character terminates a path name.
  • For portability, POSIX.1 recommends restricting file names to the following characters: letters (a~z, A~Z), digits (0~9), period (.), dash (-) and underscore (_)
  • When a new directory is created, two file names are created automatically: . (called dot) and .. (called dot-dot)
    • Dot refers to the current directory and dot-dot refers to the parent directory.
    • In the root directory, dot-dot is the same as dot

1.2.3 Path name

  • A sequence of one or more file names separated by slashes (and optionally starting with a slash) forms a path name (pathname)
    • A path name that starts with a slash is an absolute path name; otherwise it is a relative path name. A relative path name refers to a file relative to the current directory.
    • The name of the file system root (/) is a special absolute path name that does not contain the file name

1.2.4 Working directory

  • Each process has a working directory, sometimes called the current working directory. All relative path names are interpreted starting from the working directory. A process can change its working directory using the chdir function.
  • The relative path name doc/memo/joe refers to the file (or directory) joe in the memo directory in the doc directory in the current working directory.
    • It can be seen from the path name that both doc and memo should be directories, but it cannot be distinguished whether joe is a file or a directory.
  • The path name /usr/lib/lint is an absolute path name. It refers to the file (or directory) lint in the lib directory in the usr directory in the root directory
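As a small illustration of how the working directory affects relative path names, the sketch below changes directory with chdir and reads the result back with getcwd. The function name enter_directory is an assumption made for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Change the working directory to `dir`, then store the resulting
 * absolute path name in `buf`. Returns 0 on success, -1 on failure. */
int enter_directory(const char *dir, char *buf, size_t size) {
    if (chdir(dir) == -1) {
        perror("chdir");
        return -1;
    }
    if (getcwd(buf, size) == NULL) {
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

After enter_directory("/", buf, sizeof buf), a second call with ".." leaves buf equal to "/", because dot-dot in the root directory refers to the root itself.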

1.3 Input and output

1.3.1 File descriptor

  • A file descriptor is usually a small non-negative integer that the kernel uses to identify a file being accessed by a particular process. When the kernel opens an existing file or creates a new file, it returns a file descriptor

1.3.2 Standard input, standard output and standard error

  • Whenever a new program is run, all shells open three file descriptors for it, namely standard input, standard output, and standard error.

1.3.3 Unbuffered I/O

  • The functions open, read, write, lseek, and close provide unbuffered I/O. These functions use file descriptors.

1.3.4 Standard I/O

  • The standard I/O functions provide a buffered interface for those unbuffered I/O functions. The most familiar standard I/O function is printf.

1.4 Programs and processes

1.4.1 Program

  • A program is an executable file stored in a directory on disk. The kernel reads a program into memory and executes it as a result of one of the exec functions

1.4.2 Processes and process IDs

  • An executing instance of a program is called a process. Some operating systems use the term task for a program that is being executed.
  • UNIX systems ensure that each process has a unique numerical identifier, called a process ID. The process ID is always a non-negative integer

1.4.3 Process control

  • There are 3 main functions for process control: fork, exec and waitpid (there are 7 variations of the exec function, but they are often referred to collectively as the exec function)
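The three functions are typically combined as follows: fork creates a child, the child calls an exec function, and the parent collects the result with waitpid. This is a minimal sketch under the assumption that the program to run can be found through PATH; run_and_wait is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] (searched for in PATH) with arguments argv,
 * wait for it to finish, and return its exit status.
 * Returns -1 if fork or waitpid fails or the child did not exit normally. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        perror("execvp");      /* reached only if exec failed */
        _exit(127);
    }

    /* Parent: wait for that specific child. */
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the argument vector {"sh", "-c", "exit 3", NULL} should make run_and_wait return 3.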

1.4.4 Threads and thread IDs

  • Usually, a process has only one thread of control: one set of machine instructions executing at a time. Some problems are much easier to solve when several threads of control work on different parts of them. In addition, multiple threads of control can take advantage of the parallelism of multiprocessor systems.
  • All threads within a process share the same address space, file descriptors, stacks, and process-related attributes. Each thread runs on its own stack, although any thread can access the stacks of the other threads in the same process. Because they can access the same memory, threads must synchronize their access to shared data to avoid inconsistencies.
  • Like processes, threads are identified by IDs. A thread ID, however, is local to the process it belongs to: a thread ID from one process has no meaning in another process. Thread IDs are used to refer to a specific thread when manipulating the threads within a process.

1.5 Error handling

  • When an error occurs in a UNIX system function, a negative value is usually returned, and the integer variable errno is usually set to a value that gives more information about the error. Some functions use a different convention. For example, most functions that return a pointer to an object return a null pointer on error
  • POSIX.1 and ISO C define errno as a symbol that expands to a modifiable integer lvalue
    • It can be an integer containing the error number or a function that returns a pointer to the error number.
  • In an environment that supports threads, multiple threads share the process address space, and each thread has its own local errno to prevent one thread from interfering with another thread.
  • Two rules should be noted for errno
    • First: if no error occurs, a routine does not clear errno. Therefore, the value of errno should be examined only when the return value of the function indicates an error.
    • Second: no function will set the errno value to 0, and all constants defined in <errno.h> are not 0
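The two rules suggest a common pattern: check the return value first, and save errno immediately if later calls might overwrite it. The sketch below illustrates this with open; the function name open_error_code is an assumption for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open `path` read-only.
 * Returns 0 on success, otherwise the errno value set by open. */
int open_error_code(const char *path) {
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        /* errno is meaningful only because open reported failure.
         * Save it at once: fprintf below could change errno. */
        int saved = errno;
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Opening a path whose directory does not exist should yield ENOENT, which strerror renders as "No such file or directory" on Linux.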

1.6 User identification

1.6.1 User ID

  • The user ID in a password file entry is a numeric value that identifies the user to the system. The system administrator assigns the user ID at the same time as the user's login name. Users cannot change their user ID, and usually each user has a unique user ID
  • The user whose user ID is 0 is the root user, or superuser. The password file usually contains an entry whose login name is root, and the special privileges of this user are called superuser privileges. Certain operating system functions are available only to the superuser, who has free rein over the system.

1.6.2 Group ID

  • The password file entry also contains the user's group ID, which is a numeric value. Group IDs are also assigned by the system administrator when the login name is assigned. Typically, several entries in the password file have the same group ID. Groups are used to collect users into projects or departments, which allows members of the same group to share resources
  • The group file maps group names to numeric group IDs. The group file is usually /etc/group
  • For each file on disk, the file system stores the user ID and group ID of the file's owner. Storing numbers instead of names saves space: only 4 bytes are needed for the two values (assuming each is stored as a two-byte integer). Numbers are also faster to check, because comparing strings during permission verification takes longer than comparing integers.
  • For users, however, names are more convenient than numbers, so the password file maps login names to user IDs, and the group file maps group names to group IDs.

1.7 Signals

  • A signal is used to notify a process that some condition has occurred. For example, if a process divides by 0, the signal named SIGFPE (floating-point exception) is sent to the process. A process can handle a signal in one of three ways:

    • (1) Ignore the signal. Some signals indicate hardware exceptions, such as dividing by 0 or accessing memory outside the address space of the process. Because the consequences of these exceptions are undefined, ignoring such signals is not recommended.
    • (2) Let the system default action occur. For division by 0, the default action is to terminate the process
    • (3) Provide a function that is called when the signal occurs, which is called catching the signal. By providing a function of our own, we know when the signal occurs and can handle it as we wish.
  • Many conditions generate signals. Two common ways to generate a signal are

    • Two terminal keys: the interrupt key (usually the Delete key or Ctrl+C) and the quit key (usually Ctrl+\), which are used to interrupt the currently running process
    • Calling the kill function. A process calls this function to send a signal to another process. There are restrictions: to send a signal to a process, the sender must be the owner of that process or the superuser

1.8 Time value

  • Historically, UNIX systems have maintained two different time values
    • (1) Calendar time. This value counts the number of seconds that have elapsed since the Epoch: 00:00:00 January 1, 1970, Coordinated Universal Time (UTC) (older manuals refer to UTC as Greenwich Mean Time). These time values are used, for example, to record the time a file was last modified.
      • The basic system data type time_t holds this time value
    • (2) Process time. Also called CPU time, it measures the CPU resources used by a process. Process time is measured in clock ticks, which have historically been 50, 60 or 100 ticks per second
      • The basic system data type clock_t holds this time value
  • When measuring the execution time of a process, the UNIX system maintains 3 time values for the process
    • Clock time
      • Clock time, also called wall clock time, is the total amount of time the process takes to run. Its value depends on the number of other processes running on the system at the same time.
    • User CPU time
      • User CPU time is the CPU time spent executing user instructions
    • System CPU time
      • System CPU time is the CPU time spent in the kernel executing on behalf of the process
      • The sum of user CPU time and system CPU time is often called CPU time
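These values can be read from a running program. The sketch below, which is only one possible approach, uses time for calendar time, times for the user and system CPU time, and sysconf(_SC_CLK_TCK) to convert clock ticks to seconds. The function name report_times is an assumption.

```c
#include <stdio.h>
#include <sys/times.h>
#include <time.h>
#include <unistd.h>

/* Print the calendar time and the CPU time used so far by this process.
 * Returns 0 on success, -1 on failure. */
int report_times(void) {
    long ticks = sysconf(_SC_CLK_TCK);   /* clock ticks per second */
    time_t now = time(NULL);             /* calendar time (time_t) */
    struct tms t;

    if (ticks <= 0 || now == (time_t)-1 || times(&t) == (clock_t)-1)
        return -1;

    printf("calendar time: %lld seconds since the Epoch\n", (long long)now);
    /* tms_utime and tms_stime are clock_t values measured in ticks. */
    printf("user CPU: %.2f s, system CPU: %.2f s (%ld ticks per second)\n",
           (double)t.tms_utime / ticks, (double)t.tms_stime / ticks, ticks);
    return 0;
}
```

From the shell, the time command reports the same three values (real, user, sys) for a whole command.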

1.9 System calls and library functions

  • What is a system call?

    • The Application Programming Interface (API) implemented by the operating system and provided to external applications is a bridge for data interaction between applications and the system.
    • All operating systems provide entry points for various services through which programs request services from the kernel. Various versions of UNIX implementations provide a well-defined, limited number of entry points directly into the kernel. These entry points are called system calls.
  • Generic library functions may invoke one or more kernel system calls, but they are not the entry point to the kernel.

    • For example, the printf function calls the write system call to output a string
    • But the functions strcpy (copy a string) and atoi (convert ASCII to integer) do not use any kernel system calls
  • System calls and library functions both take the form of C functions, both of which provide services to the application

    • Library functions can be replaced, but system calls usually cannot be replaced
    • System calls usually provide a minimal interface, while library functions usually provide more complex functions.
  • Relationship between C standard library functions and system functions/calls, using the example of printing "hello" to the screen

    • A system function (the function documented in the man pages) is a thin wrapper around the corresponding system call

[Figure: how printing "hello" passes from the C standard library through the system function to the kernel]

2. UNIX standards and implementation

2.1 UNIX standardization

2.1.1 ISO C

  • The ISO C standard is now maintained and developed by the ISO/IEC international standardization working group for the C programming language, known as ISO/IEC JTC1/SC22/WG14, or WG14 for short. The intent of the ISO C standard is to make C programs portable to a wide variety of operating systems, not only UNIX systems.
  • Header files defined by the ISO C standard

[Figure: header files defined by the ISO C standard]

2.1.2 IEEE POSIX.1

  • POSIX is a family of standards originally developed by the IEEE (Institute of Electrical and Electronics Engineers). POSIX stands for Portable Operating System Interface. It originally referred only to IEEE Standard 1003.1-1988 (the operating system interface), but was later extended to include many of the standards and draft standards with the 1003 designation, such as the shell and utilities (1003.2). This tutorial uses 1003.1
    • Since the 1003.1 standard specifies an interface rather than an implementation, it does not distinguish between system calls and library functions. All routines in the standard are called functions.
  • Required header files defined by the POSIX.1 standard

[Figure: required header files defined by the POSIX.1 standard]

2.2 UNIX system implementation

2.2.1 4.4 BSD

  • BSD (Berkeley Software Distribution) was developed and distributed by the Computer Systems Research Group of the University of California, Berkeley. 4.2BSD came out in 1983, 4.3BSD was released in 1986, and 4.4BSD was released in 1994

2.2.2 FreeBSD

  • FreeBSD is based on the 4.4BSD-Lite operating system. The FreeBSD project was formed to carry on the BSD line after the Computer Systems Research Group at the University of California, Berkeley, decided to end its work on the BSD versions of the UNIX operating system and the 386BSD project had been neglected for too long.

2.2.3 Linux

  • Linux was created by Linus Torvalds in 1991 as a replacement for MINIX
  • Linux is an operating system that provides a rich, UNIX-like programming environment. Linux is freely available under the GNU General Public License.

2.2.4 Mac OS X

  • Mac OS X uses completely different technology than its previous versions. Its core operating system is called "Darwin" and is based on a combination of the Mach kernel, the FreeBSD operating system, and drivers with object-oriented frameworks and other kernel extensions.

2.2.5 Solaris

  • Solaris is a version of UNIX developed by Sun Microsystems (now Oracle)

2.3 Basic system data types

  • Certain implementation-related data types are defined in the header file <sys/types.h>, which are called basic system data types.

  • Some commonly used basic system data types

[Figure: commonly used basic system data types]

3. File I/O

3.1 Introduction

  • The available file I/O functions open a file, read a file, write a file, and so on
  • Most file I/O in UNIX systems requires only 5 functions: open, read, write, lseek and close

The functions described in this chapter are often called unbuffered I/O (as opposed to the standard I/O functions).

  • Unbuffered means that each read or write invokes a system call in the kernel.
  • These unbuffered I/O functions are not part of ISO C, but they are part of POSIX.1

3.2 File descriptor

  • To the kernel, all open files are referenced by file descriptors

    • The file descriptor is a non-negative integer
    • When opening an existing file or creating a new file, the kernel returns a file descriptor to the process
    • When reading or writing a file, use the file descriptor returned by open or creat to identify the file, and pass it as a parameter to read or write
  • By convention, UNIX system shells set up three descriptors for each process

    • File descriptor 0 is associated with the process's standard input
    • File descriptor 1 is associated with the process's standard output
    • File descriptor 2 is associated with the process's standard error
  • In POSIX.1-compliant applications, the magic numbers 0, 1, and 2, although standardized, should be replaced with the symbolic constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO to improve readability. These constants are defined in the header file <unistd.h>

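A file descriptor is an index into the per-process file descriptor table; each entry in that table points to a file structure in the kernel
PCB (process control block): essentially a structure, one of whose members is the file descriptor table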

[Figure: process control block and file descriptor table]

3.3 Functions open and openat (open or create a file)

3.3.1 Function open and openat parameter analysis

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>  // defines the constants for the flags argument

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode); // the third argument is used only when a new file is created and gives its permissions

int openat(int dirfd, const char *pathname, int flags);
int openat(int dirfd, const char *pathname, int flags, mode_t mode);
  • pathname: the pathname of the file to be opened or created
  • flags: specifies the options for this function. The flags argument is formed by ORing together one or more of the following constants. Exactly one of the five access-mode constants on the first line below must be specified; the others are optional.
    • O_RDONLY (open for reading only), O_WRONLY (open for writing only), O_RDWR (open for reading and writing), O_EXEC (open for execution only), O_SEARCH (open for search only, used for directories)
    • O_APPEND (append to the end of the file each time you write)
    • O_CREAT (create the file if it does not exist; used together with the third argument, mode)
    • O_EXCL (generate an error if O_CREAT is also specified and the file already exists)
    • O_NONBLOCK (set non-blocking mode for this open operation and for subsequent I/O operations)
    • O_TRUNC (if the file exists and is successfully opened for write-only or read-write, truncate its length to 0)
  • function return value
    • If successful, returns the file descriptor
    • If an error occurs, -1 is returned
  • The dirfd parameter distinguishes openat from open. There are three possibilities.
    • The pathname parameter specifies an absolute path name. In this case, the dirfd parameter is ignored and openat behaves like open.
    • The pathname parameter specifies a relative path name and the dirfd parameter is a file descriptor that identifies the directory from which the relative path name is resolved. The dirfd value is obtained by opening that directory.
    • The pathname parameter specifies a relative path name and the dirfd parameter has the special value AT_FDCWD. In this case, the path name is resolved starting from the current working directory and openat behaves like open
  • openat is one of a group of functions added in the latest version of POSIX.1 to address two problems
    • First, it lets threads use relative path names to open files in directories other than the current working directory.
      • All threads in a process share the same current working directory, so it is difficult for several threads of one process to work in different directories at the same time
    • Second, it provides a way to avoid time-of-check-to-time-of-use (TOCTTOU) errors
      • The basic idea of a TOCTTOU error is that a program is vulnerable if it makes two file-based function calls where the second call depends on the result of the first. Because the two calls are not atomic, the file can change between them, which invalidates the result of the first call and makes the final result of the program incorrect.

3.3.2 Filename and pathname truncation

  • In POSIX.1, the constant _POSIX_NO_TRUNC determines whether file names and path names that are too long are truncated or cause an error. The value can vary with the type of file system. fpathconf or pathconf can be used to query which behavior a directory supports.
  • If _POSIX_NO_TRUNC is in effect, an error is returned and errno is set to ENAMETOOLONG when the whole path name exceeds PATH_MAX or any file name in the path name exceeds NAME_MAX
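The limits for a particular directory can be queried at run time with pathconf. The following sketch prints the values for _PC_NAME_MAX, _PC_PATH_MAX and _PC_NO_TRUNC; the function name report_name_limits is an assumption, and the values printed depend on the file system.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Print the file name and path name limits that apply to `dir`.
 * Returns the NAME_MAX value for that directory, or -1 if the query
 * fails or the file system reports no limit. */
long report_name_limits(const char *dir) {
    /* pathconf returns -1 both on error and for "no limit";
     * clearing errno first lets the two cases be told apart. */
    errno = 0;
    long name_max = pathconf(dir, _PC_NAME_MAX);
    if (name_max == -1 && errno != 0) {
        perror("pathconf");
        return -1;
    }

    long path_max = pathconf(dir, _PC_PATH_MAX);
    long no_trunc = pathconf(dir, _PC_NO_TRUNC);
    printf("%s: NAME_MAX=%ld PATH_MAX=%ld NO_TRUNC=%ld\n",
           dir, name_max, path_max, no_trunc);
    return name_max;
}
```

A NO_TRUNC value other than -1 indicates that over-long names produce ENAMETOOLONG instead of being silently truncated.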

3.4 Function close (close an open file)

#include <unistd.h>

int close(int fd);
  • function return value

    • If successful, return 0
    • If an error occurs, -1 is returned
  • Closing a file also releases all record locks held by the process on the file.

  • When a process terminates, the kernel automatically closes all its open files. Many programs take advantage of this feature without explicitly closing the file with close.

3.5 Function creat (create a new file)

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int creat(const char *pathname, mode_t mode);
  • function return value

    • If successful, returns the file descriptor opened for writing only.
    • If an error occurs, -1 is returned
  • This function is equivalent to

    open(path, O_WRONLY | O_CREAT | O_TRUNC, mode)
    

One disadvantage of creat is that the file is opened only for writing. Before the newer version of open was available, creating a temporary file, writing to it, and then reading it back required calling creat, close, and then open. Now a single call to open with O_RDWR | O_CREAT | O_TRUNC does the same job.
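The single-call approach can be sketched as follows: the file is created and opened for both reading and writing, data is written, the offset is moved back to the start, and the same descriptor is used to read the data back. The function name write_then_read is illustrative.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Create (or truncate) `path`, write `msg` to it, then read the data
 * back into `buf` through the same descriptor.
 * Returns the number of bytes read back, or -1 on error. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *buf, size_t size) {
    /* O_RDWR lets one descriptor serve both the write and the read,
     * so no creat/close/open sequence is needed. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0)       /* rewind before reading */
        n = read(fd, buf, size);

    close(fd);
    return n;
}
```

Writing "hello" and reading into a 16-byte buffer should return 5.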

3.3-3.5 Case

Case 1

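// open.c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int main(void) {
    // Open an existing file read-only; open returns -1 on failure
    int fd = open("./AUTHORS.txt", O_RDONLY);
    if (fd == -1) {
        perror("open ./AUTHORS.txt");
        return 1;
    }
    printf("fd = %d\n", fd);

    close(fd);

    return 0;
}
$ gcc open.c -o open

$ ./open
# The output below shows that the file exists and was opened successfully
fd = 3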

Case 2

// open2.c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    
    
    int fd;
    fd = open("./AUTHORS.cp", O_RDONLY | O_CREAT, 0644); // rw-r--r--
    printf("fd = %d\n", fd);

    close(fd);

    return 0;
}
$ gcc open2.c -o open2

$ ./open2
fd = 3

$ ll 
# A new file AUTHORS.cp has been created with the permissions given by 0644
-rw-r--r-- 1 yue yue    0 Sep 10 22:19 AUTHORS.cp

Case 3

// open3.c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

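int main(int argc, char *argv[]) {
    int fd;
    // If the file exists, open it read-only and truncate it to length 0
    // If the file does not exist, create it with permissions 0644
    // Note: POSIX leaves the effect of O_RDONLY | O_TRUNC unspecified; the result below is from the author's Linux system
    fd = open("./AUTHORS.cp", O_RDONLY | O_CREAT | O_TRUNC, 0644); // rw-r--r--
    printf("fd = %d\n", fd);

    close(fd);

    return 0;
}
$ gcc open3.c -o open3

$ ./open3
# The output below shows that the file was opened successfully
fd = 3

$ ll 
# AUTHORS.cp was given some content beforehand; after truncation by O_TRUNC its size is 0
-rw-r--r-- 1 yue yue    0 Sep 10 22:19 AUTHORS.cp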

Case 4

  • When a file is created, the mode argument specifies its access permissions, but the result is also filtered by the umask of the process. The conclusion is
    • File permissions = mode & ~umask
$ umask
0002 # bits set in the umask are cleared from mode: ~umask & 0777 = 0775 (the leading 0 denotes octal)
// open4.c
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    
    
    int fd;
    fd = open("./AUTHORS.cp2", O_RDONLY | O_CREAT | O_TRUNC, 0777); // rwxrwxrwx
    printf("fd = %d\n", fd);

    close(fd);

    return 0;
}
$ gcc open4.c -o open4

$ ./open4
fd = 3

$ ll 
# A new file AUTHORS.cp2 has been created with permissions mode & ~umask = 0775 (rwxrwxr-x)
-rwxrwxr-x 1 yue yue    0 Sep 10 22:38 AUTHORS.cp2*

Case 5

  • Common errors from the open function
    • Opening a file that does not exist
    // open5.c
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <errno.h>
    #include <string.h>
    
    int main(int argc, char *argv[]) {
          
          
        int fd;
    
        fd = open("./AUTHORS.cp4", O_RDONLY);
        printf("fd = %d, errno = %d : %s\n", fd, errno, strerror(errno));
    
        close(fd);
    
        return 0;
    }
    
    $ gcc open5.c -o open5
    
    $ ./open5
    fd = -1, errno = 2 : No such file or directory
    
    • Opening a read-only file for writing (the process lacks the required permission)
    // open6.c
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <errno.h>
    #include <string.h>
    
    int main(int argc, char *argv[]) {
          
          
        int fd;
    
        fd = open("./AUTHORS.cp3", O_WRONLY); // AUTHORS.cp3 has read-only permissions
        printf("fd = %d, errno = %d : %s\n", fd, errno, strerror(errno));
    
        close(fd);
    
        return 0;
    }
    
    $ gcc open6.c -o open6
    
    $ ./open6
    fd = -1, errno = 13 : Permission denied
    
    • Opening a directory for writing only
    $ mkdir mydir # first create a directory
    
    // open7.c
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <errno.h>
    #include <string.h>
    
    int main(int argc, char *argv[]) {
          
          
        int fd;
    
        fd = open("mydir", O_WRONLY);
        printf("fd = %d, errno = %d : %s\n", fd, errno, strerror(errno));
    
        close(fd);
    
        return 0;
    }
    
    $ gcc open7.c -o open7
    
    $ ./open7
    fd = -1, errno = 21 : Is a directory
    

3.6 Function lseek (explicitly set offset for an open file)

#include <sys/types.h>
#include <unistd.h>

off_t lseek(int fd, off_t offset, int whence);
  • Each open file has a "current file offset" associated with it, which is usually a non-negative number measuring the number of bytes counted from the beginning of the file.

  • The l in lseek represents long integer type

  • function return value

    • If successful, return the new file offset
    • If an error occurs, -1 is returned
  • By system default, when a file is opened, this offset is set to 0 unless the O_APPEND option is specified.

  • The interpretation of the parameter offset is related to the value of the parameter whence

    • If whence is SEEK_SET, the offset of the file is set to offset bytes from the beginning of the file.
      • SEEK_SET(0) absolute offset
    • If whence is SEEK_CUR, the offset of the file is set to its current value plus offset. Offset can be positive or negative.
      • SEEK_CUR(1) Offset relative to current position
    • If whence is SEEK_END, the offset of the file is set to the file length plus offset. Offset can be positive or negative.
      • SEEK_END (2) offset relative to the end of the file
  • lseek only records the current file offset in the kernel, it does not cause any I/O operations. This offset is then used for the next read or write operation

  • The file offset can be greater than the current length of the file, in which case the next write to the file extends the file and creates a hole in it, which is allowed. Bytes in the file that have never been written are read back as 0
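The hole described above can be reproduced with a short sketch: write a few bytes, seek forward past the end of the file, and write again. The function name make_file_with_hole and the sample data "abc" and "xyz" are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` containing "abc", a hole of `gap` bytes, then "xyz".
 * Returns the resulting file size, or -1 on error. */
off_t make_file_with_hole(const char *path, off_t gap) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    off_t size = -1;
    if (write(fd, "abc", 3) == 3 &&
        /* The offset becomes 3 + gap, beyond the end of the file.
         * No I/O happens yet and the file size is still 3. */
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&
        /* This write extends the file and leaves the hole behind it. */
        write(fd, "xyz", 3) == 3)
        size = lseek(fd, 0, SEEK_END);   /* offset of the end = size */

    close(fd);
    return size;
}
```

With a gap of 1000 the function should return 1006, and any byte read from inside the hole has the value 0.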

Case 1

  • Reads and writes on the same file descriptor share one file offset
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <fcntl.h>
    
    int main(void) {
          
          
        int fd, n;
        char msg[] = "It's a test for lseek\n";
        char ch;
    
        fd = open("lseek.txt", O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
          
          
            perror("open lseek.txt error");
            exit(1);
        }
    
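        // Write to the open file through fd; afterwards the file offset is at the end of the file
        write(fd, msg, strlen(msg));
        // If the next line is commented out, the offset stays at the end of the file after the write, so the read below returns no data
        lseek(fd, 0, SEEK_SET); // move the file offset back to the beginning of the file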
    
        while ((n = read(fd, &ch, 1))) {
          
          
            if (n < 0) {
          
          
                perror("read error");
                exit(1);
            } 
            write(STDOUT_FILENO, &ch, n);  // write each byte read from the file to the screen
        }
    
        close(fd);
    
        return 0;
    }
    

Case 2

  • Use lseek to get file size
    // lseek_size.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <fcntl.h>
    
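    int main(int argc, char *argv[]) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            exit(1);
        }

        int fd = open(argv[1], O_RDWR);
        if (fd == -1) {
            perror("open error");
            exit(1);
        }

        // lseek returns off_t, which can be wider than int
        off_t length = lseek(fd, 0, SEEK_END);
        printf("file size: %lld\n", (long long)length);

        close(fd);

        return 0;
    }
    
    $ gcc lseek_size.c -o lseek_size
    $ ./lseek_size fcntl.c  # fcntl.c is 678 bytes long
    file size: 678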
    

Case 3

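  • Extending a file with lseek
    • The file size really grows only if an I/O operation follows the seek
    // In Case 2, replace lseek(fd, 0, SEEK_END) with the call below (seek 111 bytes past the end)
    // This alone does not extend the file: its size is unchanged afterwards
    lseek(fd, 111, SEEK_END);
    
    // Add the following line after the printf call (it causes an I/O operation)
    write(fd, "\0", 1); // the file is extended: a hole is appended, followed by this one byte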
    
  • Files can be directly extended using the truncate function
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <fcntl.h>
    
    int main(int argc, char*argv[]) {
          
          
        int ret = truncate("dict.cp", 250);
        printf("ret = %d\n", ret);
    
        return 0;
    }
    

The offset returned by lseek is always measured from the beginning of the file. Getting the file size with lseek works because seeking to the end returns the distance between the start of the file and its end. For a newly opened file, the offset starts at the beginning of the file. Seeking past the end does not by itself extend the file: the size changes only when an I/O operation follows, so at least one byte must be written.

3.7 Function read (read data from open file)

#include <unistd.h>

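// ssize_t is a signed integer type; void * is a generic pointer
// Argument 1: file descriptor; argument 2: buffer that receives the data; argument 3: size of the buffer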
ssize_t read(int fd, void *buf, size_t count);
  • function return value
    • If the read is successful, the number of bytes read is returned. If the end of the file is reached, 0 is returned.
    • If an error occurs, -1 is returned
    • If -1 is returned and errno is EAGAIN or EWOULDBLOCK, the read did not really fail: the descriptor refers to a device or network file opened in non-blocking mode, and no data is available yet.
  • There are various situations in which the actual number of bytes read is less than the requested number of bytes read.
    • 1. When reading a normal file, the end of the file is reached before the required number of bytes is read.
      • For example, if there are 30 bytes before the end of the file is reached, and 100 bytes are required to be read, read returns 30. The next time read is called, it will return 0 (end of file)
    • 2. When reading from a terminal device, usually one line at most is read at a time.
    • 3. When reading from the network, the buffering mechanism in the network may cause the return value to be less than the number of bytes required to be read.
    • 4. When reading from a pipe or FIFO, if the pipe contains less than the required number of bytes, then read will only return the actual number of bytes available.
    • 5. When reading from some record-oriented devices (such as tapes), at most one record is returned at a time
    • 6. When the read is interrupted by a signal after part of the data has already been read
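Because read may return fewer bytes than requested, code that needs a fixed amount of data usually calls it in a loop. The sketch below is one common way to do that; the name read_full is hypothetical and not part of any standard API.

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling read until `count` bytes have arrived, the end of the
 * file is reached, or an error occurs.
 * Returns the number of bytes stored in buf (less than count only at
 * end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count) {
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == 0)
            break;                 /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before any data: retry */
            return -1;
        }
        done += (size_t)n;         /* short read: ask for the rest */
    }
    return (ssize_t)done;
}
```

With a pipe that holds 6 bytes and whose write end has been closed, a request for 10 bytes should return 6.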

3.8 Function write (write data to open file)

#include <unistd.h>

// Argument 1: file descriptor; argument 2: buffer holding the data to write; argument 3: number of bytes to write
ssize_t write(int fd, const void *buf, size_t count);
  • function return value

    • If the write succeeds, the number of bytes written is returned. This is normally equal to count; a smaller value means that only part of the data was written and should be treated as a problem by the caller.
    • If an error occurs, -1 is returned
  • A common reason for write errors is that the disk is full, or the file length limit for a given process has been exceeded.

  • For normal files, writing starts at the current offset of the file. If the O_APPEND option is specified when the file is opened, the file offset is set to the current end of the file before each write operation. After a successful write, the file offset is incremented by the number of bytes actually written

Blocking and non-blocking

  • Blocking: when a process calls a blocking system function, the process is put to sleep and the kernel schedules other processes to run. The process can continue only once the event it is waiting for occurs (for example, a data packet arrives on the network, or the sleep time requested by calling sleep has elapsed). The opposite of the sleep state is the running state. In the Linux kernel, a process in the running state is in one of two situations:

    • Being executed. The CPU is in the context of the process: the program counter holds an instruction address of the process, the general registers hold intermediate results of its computation, and the CPU is executing its instructions and reading and writing its address space.
    • Ready. The process does not need to wait for any event and could run at any time, but the CPU is currently executing another process, so the process waits in a ready queue to be scheduled by the kernel.
  • Reading a regular file does not block: no matter how many bytes are requested, read returns within a bounded time. Reading from a terminal device or the network is different. If the data typed at the terminal does not yet end with a newline, a read on the terminal device blocks. If no data packet has been received from the network, a read on the network descriptor blocks, and how long it blocks is not known in advance: if no data ever arrives, it stays blocked. Similarly, writing to a regular file does not block, but writing to a terminal device or the network may.

    • /dev/tty – terminal file

Blocking read from the terminal

// block_readtty.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    char buf[10];
    ssize_t n;

    // Blocks until the terminal delivers a line (or end of file)
    n = read(STDIN_FILENO, buf, sizeof(buf));
    if (n < 0) {
        perror("read STDIN_FILENO");
        exit(1);
    }
    write(STDOUT_FILENO, buf, n);

    return 0;
}
$ gcc block_readtty.c -o block
$ ./block  # the program blocks waiting for input; type hello and press Enter to finish
hello
hello

Non-blocking read from the terminal

// nonblock_readtty.c
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

#define MSG_TRY "try again\n"
#define MSG_TIMEOUT "time out\n"

int main(void) {
    char buf[10];
    int fd, i;
    ssize_t n = 0;

    // Open /dev/tty in non-blocking mode (the default is blocking)
    fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        perror("open /dev/tty");
        exit(1);
    }
    printf("open /dev/tty ok... %d\n", fd);

    for (i = 0; i < 5; i++) {
        n = read(fd, buf, sizeof(buf));
        if (n >= 0) {
            // Data was read (or end of file was reached)
            break;
        }
        if (errno != EAGAIN) {
            // A real error, not just "no data yet"
            perror("read /dev/tty");
            exit(1);
        }
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        sleep(2);
    }

    if (i == 5) {
        write(STDOUT_FILENO, MSG_TIMEOUT, strlen(MSG_TIMEOUT));
    } else {
        write(STDOUT_FILENO, buf, n);
    }

    close(fd);

    return 0;
}
$ gcc nonblock_readtty.c -o nonblock
$ ./nonblock  # prints "try again" every 2 seconds until a line is typed; gives up with "time out" after 5 attempts

3.9 I/O efficiency

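  • Use the read/write functions to implement a file copy
// Copy the contents of one file to another: open both files, then loop reading from the first and writing to the second
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char* argv[]) {
    char buf[1];  // Deliberately 1 byte, so every byte costs one read and one write system call
    ssize_t n = 0;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <source> <destination>\n", argv[0]);
        exit(1);
    }

    // Open the file named by the first argument read-only
    int fd1 = open(argv[1], O_RDONLY);
    if (fd1 == -1) {
        perror("open argv1 error");
        exit(1);
    }

    // Open the file named by the second argument read-write; create it if it does not exist, truncate it if it does
    int fd2 = open(argv[2], O_RDWR | O_CREAT | O_TRUNC, 0664);
    if (fd2 == -1) {
        perror("open argv2 error");
        exit(1);
    }

    // Read the first file in a loop, at most sizeof(buf) bytes per call;
    // n receives the number of bytes actually read
    while ((n = read(fd1, buf, sizeof(buf))) != 0) {
        if (n < 0) {
            perror("read error");
            break;
        }
        // Write the n bytes held in buf to the file referred to by fd2
        if (write(fd2, buf, n) != n) {
            perror("write error");
            break;
        }
    }

    close(fd1);
    close(fd2);

    return 0;
}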
  • Use the fputc/fgetc functions to implement a file copy
// Uses the C standard library functions fopen(), fgetc() and fputc() to read and write the files
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    FILE *fp, *fp_out;
    int n = 0;

    fp = fopen("hello.c", "r");
    if (fp == NULL) {
        perror("fopen error");
        exit(1);
    }

    fp_out = fopen("hello.cp", "w");
    if (fp_out == NULL) {
        perror("fopen error");
        exit(1);
    }

    // Loop until fgetc reports end of file (EOF)
    while ((n = fgetc(fp)) != EOF) {
        fputc(n, fp_out);  // Write the character just read to the output file
    }

    fclose(fp);
    fclose(fp_out);

    return 0;
}
  • read/write: every byte costs one system call, so the program constantly switches between user mode and kernel mode, which is very time-consuming.
  • fgetc/fputc: the standard I/O library keeps a user-space buffer (4096 bytes here), so data is not handed to the kernel byte by byte and there are far fewer switches between user mode and kernel mode (the library reads ahead into the buffer and writes the buffer out in bulk).

System calls are not necessarily faster than library functions. Where a library function does the job, prefer it.
The standard I/O functions come with a user-level buffer. System calls have no user-level buffer, although the kernel keeps its own buffers.

  • Time results for read operations on Linux with different buffer lengths
    • Most file systems use some kind of read ahead buffering technology to improve performance . When a sequential read is detected, the system attempts to read in more data than the application requires, assuming that the application will read the data quickly. The effect of read-ahead can be seen in the figure below: the clock time with a buffer length as small as 32 bytes is almost the same as with a larger buffer length

Insert image description here

3.10 File sharing

  • UNIX systems support sharing open files between different processes

  • The kernel uses three data structures to represent open files. The relationship between them determines the impact that one process may have on another process in terms of file sharing.

    • (1) Each process has an entry in the process table. The entry contains a table of open file descriptors, which can be thought of as a vector with one entry per descriptor. Associated with each file descriptor are:
      • the file descriptor flags
      • a pointer to a file table entry
    • (2) The kernel maintains a file table for all open files. Each file table entry contains:
      • the file status flags (read, write, append, sync, non-blocking, etc.)
      • the current file offset
      • a pointer to the v-node table entry of the file
    • (3) Each open file (or device) has a v-node structure. The v-node contains the file type and pointers to the functions that operate on the file. For most files, the v-node also contains the file's i-node (index node). This information is read from disk into memory when the file is opened, so all relevant information about the file is readily available.
  • Open file kernel data structure

Insert image description here

The difference in scope between file descriptor flags and file status flags : the former applies only to one descriptor of a process, while the latter applies to all descriptors in any process pointing to the given file table entry

3.11 Atomic operations

Generally speaking, an atomic operation is an operation made up of multiple steps. If the operation is performed atomically, either all steps are performed or none is performed. It is impossible for only a subset of the steps to be performed.

3.11.1 Append to a file

  • Consider a process that appends data to the end of a file

    • For a single process, this program can work normally, but if multiple processes use this method to append data to the same file at the same time, problems will occur.
    if(lseek(fd, 0L, SEEK_END) < 0)   /* position to the end of the file */
        err_sys("lseek error");
    if(write(fd, buf, 100) != 100)    /* then write */
        err_sys("write error");
    
  • Suppose there are two independent processes, A and B, both appending to the same file. Each process has opened the file without using the O_APPEND flag.

    • At this point, each process has its own file table entry, but the two share a single v-node table entry
    • Assume that process A calls lseek, which sets A's current offset for the file to 1500 bytes (the current end of the file)
    • Then the kernel switches processes, and process B calls lseek, which also sets B's current offset for the file to 1500 bytes (the current end of the file)
    • B then calls write, which advances B's current offset for the file to 1600. Because the file has grown, the kernel updates the current file length in the v-node to 1600
    • Then the kernel switches processes again and resumes process A. When A calls write, the data is written starting at A's current file offset (1500), which overwrites the data that process B has just written to the file.

    The problem lies in the logical operation "first position to the end of the file, then write", which uses two separate function calls

    • Solution: make the two operations atomic with respect to other processes. Any operation that requires more than one function call is not atomic, because the kernel may temporarily suspend the process between the function calls.
    • UNIX systems provide an atomic way to do this: set the O_APPEND flag when opening the file. This causes the kernel to position the file at its current end before each write, so there is no need to call lseek before each write.

3.11.2 Functions pread and pwrite

#include <unistd.h>

ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
  • pread function return value

    • If successful, the number of bytes read will be returned. If the end of the file has been read, 0 will be returned.
    • If an error occurs, -1 is returned
  • pwrite function return value

    • If successful, returns the number of bytes written
    • If an error occurs, -1 is returned
  • Calling pread is equivalent to calling lseek and then calling read. However, pread differs from that sequence in two important ways.

    • With pread, the positioning and the read cannot be interrupted: they form one atomic operation.
    • pread does not update the current file offset.

3.12 Functions dup and dup2 (copy an existing file descriptor)

#include <unistd.h>

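// dup duplicates a descriptor, which is mainly useful for keeping a saved copy
int dup(int oldfd);
// dup2 ("dup to") makes newfd a copy of oldfd and returns newfd
int dup2(int oldfd, int newfd);

// cmd: F_DUPFD
// The third (variadic) argument is the lowest acceptable descriptor number:
    // if that number is already in use, the lowest available descriptor above it is returned
    // if that number is free, that descriptor itself is returned
int fcntl(int fd, int cmd, ...);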
  • function return value

    • If successful, return the new file descriptor
    • If an error occurs, -1 is returned
  • The new file descriptor returned by dup is guaranteed to be the lowest-numbered descriptor currently available.

  • With dup2, the newfd parameter specifies the value of the new descriptor.

    • If newfd is already open, it is closed first
    • If oldfd = newfd, dup2 returns newfd without closing it
    • Otherwise, the FD_CLOEXEC file descriptor flag of newfd is cleared, so that newfd stays open when the process calls exec
  • Another way to copy a descriptor is to use the fcntl function. The first two calls below are equivalent

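    dup(oldfd);
    fcntl(oldfd, F_DUPFD, 0);
    
    // The following two forms are not exactly equivalent:
    // (1) dup2 is an atomic operation, whereas close followed by fcntl is two function calls;
        // a signal handler may run between close and fcntl and modify the file descriptors
    // (2) dup2 and fcntl differ in some of the errno values they report
    dup2(oldfd, newfd);
    
    close(newfd);
    fcntl(oldfd, F_DUPFD, newfd);
    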

dup case

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

// argc is the number of arguments, argv[] is the argument list
int main(int argc, char *argv[]) {
    // Open the file named by argv[1] read-only
    int oldfd = open(argv[1], O_RDONLY);       // 0, 1, 2 are in use --- returns 3
    if (oldfd == -1) {
        perror("open error");
        exit(1);
    }

    // Create a new descriptor newfd that refers to the same file as oldfd, and return it
    int newfd = dup(oldfd);    // 4

    printf("newfd = %d\n", newfd);

    return 0;
}

dup2 case

  • Copy an existing file descriptor fd1 onto another file descriptor fd2, and then use fd2 to modify the file that fd1 refers to
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    
    int main(int argc, char *argv[]) {
        int fd1 = open(argv[1], O_RDWR);       // 0, 1, 2 are in use    --- returns 3
        int fd2 = open(argv[2], O_RDWR);       // 0, 1, 2, 3 are in use --- returns 4
    
        // Make fd2 refer to the same file as fd1
        int fdret = dup2(fd1, fd2);         // returns the new file descriptor, fd2
        printf("fdret = %d\n", fdret);
    
        // A freshly opened file has its read/write pointer at the start of the file: if the file is not empty, writing starts at the beginning and overwrites the existing content
        int ret = write(fd2, "1234567", 7); // writes to the file that fd1 refers to
        printf("ret = %d\n", ret);
    
        // Redirect everything written to STDOUT into the file
        dup2(fd1, STDOUT_FILENO);           // screen output now goes to the file that fd1 refers to
    
        printf("---------886\n");
    
        return 0;
    }
    

fcntl case

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char* argv[]) {
    int fd1 = open(argv[1], O_RDWR);

    printf("fd1 = %d\n", fd1);

    // Argument 3 is a descriptor number k: if k is free, k itself becomes the copy of fd1; if k is in use, the lowest available descriptor greater than k is returned
    // 0 is in use, so fcntl returns the lowest available descriptor in the descriptor table
    int newfd = fcntl(fd1, F_DUPFD, 0);
    printf("newfd = %d\n", newfd);

    // 7 is free, so that descriptor itself is returned
    int newfd2 = fcntl(fd1, F_DUPFD, 7);
    printf("newfd2 = %d\n", newfd2);

    int ret = write(newfd2, "YYYYYYY", 7);
    printf("ret = %d\n", ret);

    return 0;
}
$ gcc ls-R.c -o fcntl2
$ ./fcntl2 mycat.c
fd1 = 3
newfd = 4
newfd2 = 7
ret = 7

3.13 Functions sync, fsync and fdatasync

  • Traditional UNIX system implementations have a buffer cache or page cache in the kernel , and most disk I/O is performed through the buffer.
    • When writing data to a file, the kernel usually copies the data to a buffer first, then queues it, and writes it to disk later. This method is called delayed writing.
    • Normally, the kernel writes all deferred write data blocks to disk when it needs to reuse the buffer for other disk block data.
    • In order to ensure the consistency of the actual file system on the disk and the content in the buffer, the UNIX system provides three functions: sync, fsync and fdatasync
#include <unistd.h>

int fsync(int fd);
int fdatasync(int fd);

void sync(void);
  • function return value (fsync and fdatasync; sync returns nothing)
    • If successful, 0 is returned
    • If an error occurs, -1 is returned
  • sync just queues all modified block buffers into the write queue and then returns. It does not wait for the actual disk write operation to end.
    • A system daemon called update periodically calls (usually every 30 seconds) the sync function, which ensures that the kernel's block buffers are flushed regularly
  • The fsync function only works on a file specified by the file descriptor fd , and waits for the disk write operation to complete before returning.
    • fsync can be used in applications such as databases that need to ensure that modified blocks are written to disk immediately
  • The fdatasync function is similar to fsync, but it only affects the data portion of the file
    • In addition to data, fsync also updates file attributes synchronously

3.14 Function fcntl (change the attributes of an open file)

#include <unistd.h>
#include <fcntl.h>

// The third argument can be an integer or a pointer to a structure
int fcntl(int fd, int cmd, ... /* int arg */ );
  • function return value
    • If successful, the return value depends on cmd
      • Duplicate an existing descriptor: F_DUPFD or F_DUPFD_CLOEXEC, returns the new file descriptor
      • Get/set the file descriptor flags: F_GETFD or F_SETFD; F_GETFD returns the flags
      • Get/set the file status flags: F_GETFL or F_SETFL; F_GETFL returns the flags
      • Get/set asynchronous I/O ownership: F_GETOWN or F_SETOWN; F_GETOWN returns a positive process ID or a negative process group ID
      • Get/set record locks: F_GETLK, F_SETLK or F_SETLKW
    • If an error occurs, -1 is returned

Case

// A terminal is read in blocking mode by default; fcntl is used here to switch it to non-blocking reads
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_TRY "try again\n"

int main(void) {
    char buf[10];
    int flags;
    ssize_t n;

    flags = fcntl(STDIN_FILENO, F_GETFL);
    if (flags == -1) {
        perror("fcntl error");
        exit(1);
    }
    flags |= O_NONBLOCK; // bitwise OR turns on the O_NONBLOCK bit in flags
    int ret = fcntl(STDIN_FILENO, F_SETFL, flags);
    if (ret == -1) {
        perror("fcntl error");
        exit(1);
    }

tryagain:
    n = read(STDIN_FILENO, buf, sizeof(buf));
    if (n < 0) {
        if (errno != EAGAIN) {
            perror("read /dev/tty");
            exit(1);
        }
        // No data yet: wait, report, and try again
        sleep(3);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        goto tryagain;
    }
    write(STDOUT_FILENO, buf, n);

    return 0;
}

3.15 Function ioctl

#include <sys/ioctl.h>

int ioctl(int fd, unsigned long request, ...);
  • function return value
    • If an error occurs, -1 is returned
    • If successful, another value is returned (usually 0, depending on the request)
  • Manages device I/O channels and controls device characteristics (mainly used with device drivers)
  • Usually used to obtain the physical characteristics of a file (these characteristics have different values for different file types)

3.16 Incoming and outgoing parameters

#include <string.h>

char* strcpy(char* dest, const char* src);
char* strncpy(char* dest, const char* src, size_t n);
  • Incoming parameters: src

    • pointer as function parameter
    • Usually modified with const keyword
    • The pointer points to the valid area and the read operation is performed inside the function.
  • Outgoing parameters: dest

    • pointer as function parameter
    • Before the function call, the space pointed to by the pointer can be meaningless, but it must be valid
    • Do write operations inside the function
    • After the function call is completed, it serves as the function return value
#include <string.h>

char* strtok(char* str, const char* delim);
char* strtok_r(char* str, const char* delim, char** saveptr);
  • In/out parameters: saveptr
    • A pointer passed as a function parameter
    • Before the function call, the space the pointer points to already holds meaningful data
    • Inside the function, it is read first and then written
    • After the function call is completed, it serves as a function return value

Origin blog.csdn.net/qq_42994487/article/details/132842199