【Linux】 File operation system calls

1. File I/O functions

File I/O covers opening, reading, and writing files. Most file I/O on UNIX systems needs only five functions: open, read, write, lseek, and close. To the kernel, every open file is referenced through a file descriptor. When a process opens an existing file or creates a new one, the kernel returns a file descriptor to the process. To read or write the file, the process passes the descriptor returned by open or creat to read or write. The sections below look at these five functions and a few related ones.

1. open
Call the open function to open or create a file.

#include<fcntl.h>
int open(const char* filename,int flag,.../*mode_t mode */);

(1) filename: the name of the file to open or create. Note: if only a bare file name is given, it is looked up in the current working directory; to reach a file anywhere else, give its path as well as its name.
(2) flag: how the file is opened. It is formed by ORing together one or more of the following constants.
Exactly one of these three constants must be specified:
O_RDONLY: open for reading only
O_WRONLY: open for writing only
O_RDWR: open for reading and writing
The following constants are optional:
O_APPEND: append to the end of the file on every write.
O_CREAT: create the file if it does not exist. This requires the third parameter, mode, which specifies the access permission bits of the new file.
O_EXCL: if O_CREAT is also specified and the file already exists, open fails with an error. This can be used to test whether a file exists and to create it if it does not, with the test and the creation performed as one atomic operation.
O_TRUNC: if the file exists and is opened write-only or read-write, truncate its length to 0.
Some of the synchronized input and output options:
O_DSYNC: make each write wait for the physical I/O to complete, but do not wait for file attributes to be updated if they do not affect the ability to read the data just written.
O_RSYNC: make each read on the file descriptor wait until any pending writes to the same part of the file have completed.
O_SYNC: make each write wait for the physical I/O to complete, including the I/O needed to update the file attributes changed by the write.
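
As a small illustration of the O_CREAT and O_EXCL combination described above, the following sketch creates a file only if it does not exist yet. The helper name create_exclusive and the 0644 mode are assumptions made for this example, not something from the original article.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if it does not exist yet.
 * Returns a write-only descriptor, or -1 with errno set (EEXIST when
 * the file is already there). The existence test and the creation
 * happen inside one open call, so no other process can get in
 * between them. */
int create_exclusive(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

Calling the helper twice on the same path should succeed the first time and fail the second time, which is the behaviour a simple lock file relies on.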

2. close
Call the close function to close an open file.

#include <unistd.h>
int close(int fd);

Closing a file also releases all record locks that the process holds on the file. Note: when a process terminates, the kernel automatically closes all of its open files. Many programs take advantage of this and do not explicitly call close.

3. lseek
Call lseek to explicitly set the offset of an open file. Every open file has an associated "current file offset". Normally, read and write operations start at the current file offset and advance it by the number of bytes read or written. By default, the offset is set to 0 when a file is opened, unless the O_APPEND option is specified.

#include <unistd.h>
off_t lseek(int fd, off_t size, int flag);

(1) flag: the reference point for the move. SEEK_SET sets the file offset to size bytes from the beginning of the file. SEEK_CUR sets the file offset to its current value plus size; size can be positive or negative. SEEK_END sets the file offset to the file length plus size; size can be positive or negative.
(2) Return value: on success, lseek returns the new file offset; on error it returns -1. Note: for regular files the offset must be non-negative, but some devices allow negative offsets. So when checking the return value of lseek, do not test whether it is less than 0; test whether it is equal to -1.

lseek only records the new file offset in the kernel; it does not cause any I/O. The offset is then used by the next read or write. The file offset may be greater than the current length of the file, in which case the next write extends the file and creates a hole in it, which is allowed. Bytes in the hole, which are in the file but have never been written, read back as 0. Disk blocks must be allocated for newly written data, but the hole itself does not need disk blocks.
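
The hole behaviour described above can be tried with a short sketch like the one below. The helper name make_file_with_hole, the "abc"/"xyz" payloads, and the 0644 mode are assumptions chosen for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write "abc", skip `gap` bytes with lseek, then
 * write "xyz". Returns the resulting file length, or -1 on error. */
off_t make_file_with_hole(const char *path, off_t gap)
{
    assert(path != NULL && gap >= 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    off_t end = -1;
    if (write(fd, "abc", 3) == 3 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&  /* only moves the offset */
        write(fd, "xyz", 3) == 3)                 /* this write extends the file */
        end = lseek(fd, 0, SEEK_END);             /* 3 + gap + 3 */

    close(fd);
    return end;
}
```

Reading any byte inside the skipped range afterwards should return 0, and on file systems that support sparse files the hole should not occupy disk blocks.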

4. read
Call the read function to read data from an open file.

#include <unistd.h>
ssize_t read(int fd, void *buf, size_t size);

(1) fd: the file to read from, as returned by open.
(2) Return value: the number of bytes read on success, 0 if the end of the file has been reached, and -1 on error. In the following cases the number of bytes actually read can be less than the number requested:
a. When reading a regular file, the end of the file is reached before the requested number of bytes has been read.
b. When reading from a terminal device, usually at most one line is read at a time.
c. When reading from a network, buffering in the network may cause the return value to be less than the number of bytes requested.
d. When reading from a pipe or FIFO that contains fewer bytes than requested, read returns only the bytes that are available.
e. When reading from certain record-oriented devices (such as tape), at most one record is returned at a time.
f. When a signal interrupts the read after part of the data has already been read.
(3) buf: the buffer that receives the data. Its type is void *, which denotes a generic pointer.
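
Because of the short-read cases listed above, code that needs an exact number of bytes usually calls read in a loop. The following is a minimal sketch of that idea; the helper name read_full is an assumption for this example, not a standard function.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read until `count` bytes have
 * arrived, end of file is reached, or a real error occurs.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL);

    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == 0)
            break;                /* end of file: return what we have */
        if (n == -1) {
            if (errno == EINTR)
                continue;         /* interrupted before any data: retry */
            return -1;
        }
        done += (size_t)n;        /* short read: loop for the rest */
    }
    return (ssize_t)done;
}
```

A return value smaller than count then means end of file was reached, not that a pipe or socket happened to deliver the data in pieces.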

5. write
Call the write function to write data to an open file.

#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t size);

The return value is normally equal to the size parameter; otherwise an error has occurred (write returns -1 on error). Common causes of errors are a full disk or exceeding the file size limit of the process.
For regular files, the write starts at the file's current offset. If the O_APPEND option was specified when the file was opened, the file offset is set to the current end of the file before each write. After a successful write, the file offset is increased by the number of bytes actually written.
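
Since the return value has to be compared with the requested size, a common pattern is a small wrapper that keeps writing until everything is out or an error occurs. This is a sketch under that assumption; write_all is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `count` bytes of buf to fd.
 * Returns count on success, or -1 on error with errno set, for
 * example ENOSPC (disk full) or EFBIG (file size limit exceeded). */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);

    const char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;         /* interrupted before writing: retry */
            return -1;
        }
        done += (size_t)n;        /* partial write: continue with the rest */
    }
    return (ssize_t)done;
}
```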

6. dup and dup2 functions
These two functions duplicate an existing file descriptor.

#include <unistd.h>
int dup(int filedes);
int dup2(int filedes, int filedes2);

The new descriptor returned by dup is guaranteed to be the lowest-numbered available file descriptor. With dup2, the filedes2 parameter specifies the value of the new descriptor. If filedes2 is already open, it is closed first. If filedes equals filedes2, dup2 returns filedes2 without closing it.
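
A typical use of dup2 is redirection: making an existing descriptor number refer to a different file. The sketch below shows the idea; redirect_to_file is a hypothetical helper and the 0644 mode is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor `target` refer to `path`.
 * Whatever `target` referred to before is closed by dup2.
 * Returns 0 on success, -1 on error. */
int redirect_to_file(int target, const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    if (fd != target) {
        if (dup2(fd, target) == -1) {
            close(fd);
            return -1;
        }
        close(fd);   /* target stays open; only the extra descriptor goes */
    }
    return 0;
}
```

For example, redirect_to_file(STDOUT_FILENO, "out.log") would make later writes to standard output land in out.log.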

7. stat, fstat and lstat functions

#include <sys/stat.h>
int stat(const char *restrict pathname, struct stat *restrict buf);
int fstat(int filedes, struct stat *buf);
int lstat(const char *restrict pathname, struct stat *restrict buf);

Return value: all three functions return 0 on success and -1 on error. stat fills in a structure with information about the named file. fstat obtains information about the file already open on the descriptor filedes. lstat is similar to stat, but when the named file is a symbolic link, lstat returns information about the symbolic link itself, not about the file the link refers to.

2. Atomic operations

An atomic operation is an operation composed of multiple steps that is performed as a unit: either all of the steps are executed or none of them is. It is impossible for only a subset of the steps to be executed.
(1) Appending to a file
Suppose the O_APPEND option described earlier is not used, and a program appends to a file by calling lseek to move to the end of the file and then calling write. This works for a single process, but problems arise when several processes append to the same file at the same time. Assume two independent processes, A and B, append to the same file. Each has opened the file without the O_APPEND flag, so each has its own file table entry (and therefore its own file offset), but the two share a single v-node entry. If A seeks to the end of the file and B then seeks to the same position and writes before A does, A's write starts at an offset that is no longer the end of the file and overwrites the data B just wrote.
The problem is that the logical operation "position to the end of the file and write" uses two separate function calls. The solution is to make the two steps a single atomic operation with respect to other processes, which is what opening the file with O_APPEND does: the kernel sets the offset to the current end of the file before each write.
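
The overwrite problem can be reproduced inside a single process, because two separate open calls on the same file also produce two file table entries that share one v-node. The sketch below relies on that assumption; the function name append_from_two_descriptors and the "AAAA"/"BBBB"/"CCCC" payloads are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Descriptors a and b stand in for processes A and B from the text.
 * Each has its own file offset. Returns the final file length,
 * or -1 on error. */
off_t append_from_two_descriptors(const char *path, int use_append)
{
    assert(path != NULL);

    int extra = use_append ? O_APPEND : 0;
    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra, 0644);
    if (a == -1)
        return -1;
    int b = open(path, O_WRONLY | extra);
    if (b == -1) {
        close(a);
        return -1;
    }

    off_t len = -1;
    if (write(a, "AAAA", 4) == 4 &&
        write(b, "BBBB", 4) == 4 &&   /* without O_APPEND this lands at offset 0 */
        write(a, "CCCC", 4) == 4)
        len = lseek(a, 0, SEEK_END);

    close(a);
    close(b);
    return len;
}
```

With use_append set, the three writes should follow one another and the file ends up 12 bytes long. Without it, the write through b overwrites the first 4 bytes and the file is only 8 bytes long.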
(2) pread and pwrite functions
The function prototypes are as follows:

#include <unistd.h>
ssize_t pread(int filedes, void *buf, size_t nbytes, off_t offset);
ssize_t pwrite(int filedes, const void *buf, size_t nbytes, off_t offset);

Return value: pread returns the number of bytes read, 0 if the end of the file has been reached, and -1 on error; pwrite returns the number of bytes written on success and -1 on error.
Calling pread is equivalent to calling lseek followed by read, and calling pwrite is equivalent to calling lseek followed by write, except that the seek and the I/O cannot be interrupted by another operation and the current file offset is not updated.
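
The point that pread leaves the current file offset alone can be checked with a sketch like this one. The helper name offset_after_pread is an assumption for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path`, read n bytes at offset `where`
 * with pread, and return the descriptor's current offset afterwards.
 * Returns -1 on error or if fewer than n bytes could be read. */
off_t offset_after_pread(const char *path, off_t where, char *out, size_t n)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t pos = -1;
    if (pread(fd, out, n, where) == (ssize_t)n)
        pos = lseek(fd, 0, SEEK_CUR);   /* unchanged by pread */

    close(fd);
    return pos;
}
```

For a freshly opened file the returned offset should still be 0, even though the data came from position where.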

3. Examples

Practice one:
With the basic I/O functions above, let's practise: store the data the user types at the terminal in a.txt, and then print the whole contents of a.txt to the terminal.

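#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
	int fd = open("a.txt", O_RDWR | O_CREAT, 0664);// permission bits used if the file is created
	assert(-1 != fd);
	
	while(1)
	{
		printf("input: ");
		char buff[128] = {0};
		// read one line from standard input; fgets also stores the trailing newline in buff
		if(fgets(buff,128,stdin) == NULL)
		{
			break;// end of input
		}
		
		if(strncmp(buff,"end",3) == 0)
		{
			break;
		}
		
		int n = write(fd,buff,strlen(buff));
		if(n<=0)
		{
			perror("write error");// like printf, but appends the reason for the error
			exit(1);
		}
	}
	
	printf("****************************a.txt:*************************\n");
	lseek(fd,0,SEEK_SET);// move the file offset back to the beginning of the file
	
	while(1)
	{
		char buff[128] = {0};// data read from the file is stored in buff
		int n = read(fd,buff,127);
		if( n == 0 )
		{
			printf("END\n");
			break;
		}
		else if(n<0)
		{
			perror("read error");
			exit(1);
		}
		else
		{
			printf("%s",buff);
		}
	}
	close(fd);
}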

Practice two:
Use code to verify that the parent and child processes share a file descriptor opened before the fork.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>

int main()
{
	int fd = open("a.txt", O_RDWR | O_CREAT, 0664);// permission bits used if the file is created
	assert(-1 != fd);

	pid_t n = fork();
	assert(-1 != n);

	if(0 == n)
	{
		while(1)
		{
			char c = 0;
			int len = read(fd, &c, 1);
			if(len <= 0)// stop at end of file or on error
			{
				break;
			}
			printf("child:: %c\n", c);
			sleep(1);
		}
	}
	else
	{
		while(1)
		{
			char c = 0;
			int len = read(fd, &c, 1);
			if(len <= 0)// stop at end of file or on error
			{
				break;
			}
			printf("father:: %c\n", c);
			sleep(1);
		}
	}

	close(fd);

	exit(0);
}

Two runs of the program can produce different results:
[Figure: output of two runs of the program]
These results show that a file descriptor opened before the fork can be used by both the parent and the child, and that the two share the file offset: each character in the file is read by only one of the two processes, never by both. Internally, the child's descriptor table is a copy of the parent's, and both entries point to the same file table entry, which holds the offset:
[Figure: internal implementation]

4. The difference between library functions and system call functions

(1) Concept
First, we need to be clear about what a library function is and what a system call function is. fopen, fread, fwrite, fclose, and fseek, which we learned earlier, are library functions; read, write, close, and the others listed at the beginning of this article are system call functions. A question that comes up often is which of fread and read is more efficient. In fact, read is not always the more efficient one. Every call to read switches from user mode to kernel mode, so reading a file in many small pieces with read is expensive. fread reads a larger block into a buffer in user space and hands out as much as the user asks for, so many small requests are served from that buffer without entering the kernel. When a large amount of data is read at a time, read simply transfers as much data as was requested.
This leads us to the concepts of library functions and system call functions. A system call function is an interface that the kernel exposes to user space: it is called from user mode and executed in kernel mode. A library function, by contrast, is implemented in a function library file and only needs to run in user mode.
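
The buffering idea can be sketched in a few lines. The code below is a deliberately simplified stand-in for what a library such as stdio does, not its actual implementation; the struct buffered_reader type, the function names, and the 4096-byte buffer size are all assumptions for this example.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE 4096

/* Simplified user-space buffer in front of a file descriptor. */
struct buffered_reader {
    int fd;
    char buf[BUF_SIZE];
    ssize_t len;      /* bytes currently held in buf */
    ssize_t pos;      /* index of the next unread byte in buf */
    long syscalls;    /* how many times read was called */
};

void reader_init(struct buffered_reader *r, int fd)
{
    assert(r != NULL);
    r->fd = fd;
    r->len = 0;
    r->pos = 0;
    r->syscalls = 0;
}

/* Returns the next byte (0..255), or -1 at end of file or on error. */
int reader_getc(struct buffered_reader *r)
{
    assert(r != NULL);
    if (r->pos >= r->len) {              /* buffer empty: one system call */
        r->len = read(r->fd, r->buf, BUF_SIZE);
        r->syscalls++;
        r->pos = 0;
        if (r->len <= 0) {
            r->len = 0;
            return -1;
        }
    }
    return (unsigned char)r->buf[r->pos++];   /* served from user space */
}
```

Reading a 10000-byte input one byte at a time through reader_getc needs only a handful of read calls, whereas calling read with a size of 1 would enter the kernel 10000 times.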

(2) Difference
The concepts above already show the difference: a library function is implemented in a function library file, while a system call function is implemented in the kernel. Next, we walk through how a system call is carried out, using open as an example.
In our system, the relationship between them is as shown in the following figure:
[Figure: relationship between library functions and system call functions]
1. First, the system call number corresponding to the function is looked up and saved in the eax register.
2. The system call function triggers interrupt 0x80 (the mechanism used on 32-bit x86 Linux) to trap into the kernel, and the kernel starts executing the interrupt handler. The key instruction in the 0x80 interrupt handler is:

call [_sys_call_table+eax*4]

This instruction uses the system call number stored in the eax register as an index into the kernel's system call table, finds the corresponding kernel function, and executes it.
3. When the kernel function returns, its integer return value (for open, the file descriptor) is placed in the eax register and the CPU switches back to user mode. A mov instruction then copies the value of eax into the variable fd, which is how the return value of the function is saved.
The specific process is as follows:
[Figure: system call process]
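
The role of the system call number can also be observed from C. On Linux with glibc, the syscall function takes a system call number and its arguments and performs the trap into the kernel. The sketch below uses it with SYS_getpid; getpid_by_number is a hypothetical name, and which instruction the library actually uses to enter the kernel depends on the architecture (on x86-64 it is not int 0x80).

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid through its system call number instead of through
 * the usual wrapper function. Linux-specific. */
pid_t getpid_by_number(void)
{
    long ret = syscall(SYS_getpid);   /* number selects the kernel function */

    assert(ret > 0);
    return (pid_t)ret;
}
```

The value returned this way should match what the ordinary getpid wrapper returns.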


Origin blog.csdn.net/qq_43412060/article/details/105460239