This article is included in the column [ Linux System Programming ]
❀I hope it can be helpful to everyone❀
This article was originally created by Feng Junzi
Foreword
You have probably already used file operations such as fopen and fclose. When you first met these interfaces, did they feel vaguely confusing, hard to understand and just as hard to remember?
This chapter explains file operations and also introduces an important concept that will shape how you learn Linux system programming!
Files
What is a file?
In daily use, files seem to be the games and apps we run (.exe) or the C and C++ code we write (.c, .cpp, and so on), all stored on disk. That is the narrow view; the notion of a file is actually much broader.
A file is not just the data it stores, but also its attributes (such as its permissions, its owner, its group, its creation time, and its type).
Everything is a file
To the operating system, anything that can be read from or written to is called a file!
Our keyboard can be read from and our monitor can be written to, so they can also be called files. The files in the narrow sense that we use every day are stored on disk, so reading and writing them means reading and writing the disk, which makes the disk a file too.
Only the operating system has permission to read and write hardware directly!
file operations
Operating on a file means operating on either its data or its attributes.
We know that files can be read and written, but from whose point of view are "read" and "write" defined?
Files on disk: can be both read and written.
Display: printf, cout -> this is a write.
Keyboard: scanf, cin -> this is a read.
Are you a little surprised by the keyboard and the display? Aren't we reading from the display (otherwise how would we see this article)? And aren't we typing into the keyboard, so why is it a read?
Take a file stored on disk as an example. To access it, we write code and run it, so in essence it is the running process that accesses the file!
Therefore we should not interpret read and write from our own perspective but from the perspective of the process, which is loaded into memory. The process reads data from the keyboard, and the process writes information to the display for us to see.
Review the file operation functions of C language
Open a file, the interface provided by C language
write file
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int main()
{
    FILE* fl = fopen("file.txt", "w");
    if (fl == NULL)
    {
        perror("Open");
        return 1;
    }

    // Write to the file in three different ways
    const char* str1 = "hello fputs\n";
    fputs(str1, fl);

    const char* str2 = "hello fwrite\n";
    fwrite(str2, strlen(str2), 1, fl);

    const char* str3 = "hello fprintf\n";
    fprintf(fl, "%s", str3);

    fclose(fl);
    return 0;
}
fopen opens the file in "w" mode. If no file with that name exists, one is created in the current directory.
But which current directory? Whose?
The answer is the current working directory of the process!
read file
Is the effect the same as the cat command?
So let's simulate cat
Simulation of cat command
system interface
The file functions reviewed above are all provided by the C language. But as we learned earlier, only the operating system has permission to read and write hardware; user programs do not. Since functions such as fopen are language-level interfaces, their implementations must ultimately call system interfaces.
This is one reason to learn Linux system programming. When we studied process waiting and process replacement earlier, we already met interfaces provided by the system. Language-level interfaces are concise and convenient, but they hide what actually happens. Take the C file functions we are using now: we know how to use them, yet it is hard to say what they actually do underneath.
Learning Linux system programming therefore helps us understand the underlying principles and broadens our knowledge. Different languages implement their functions differently and wrap the underlying system interfaces in different ways, so knowing the system interfaces lets us understand what the language-level functions are really doing. It also makes it easier to pick up the library interfaces that other languages provide.
language cross-platform
Here is another concept: a language being cross-platform. What does that mean? For example, code you write on Linux can be built and run not only on Linux but also on Windows.
Are the system interfaces provided by Windows and Linux the same? No, they are different.
So how can a language be cross-platform when each platform provides different system interfaces?
The answer is that the programmers who implement the C library write a version of each language-level function for every platform, and conditional compilation selects the version that matches the platform being built for.
File operation interface provided by Linux system
open
man 2 open
Let's take a look at the first open
int open(const char *pathname, int flags);
First, the return value is an int. If the file is opened successfully, open returns the file's descriptor fd (file descriptors are covered in the next article); if it fails, it returns -1.
The first parameter, pathname, is the path of the file, which is easy to understand.
The second parameter is flags. What is it? This is the interesting part, and it involves bit flags.
Let's first look at what the man page lists for this parameter.
These capitalized words are options, and the naming style tells us they are all macros. Each option has its own meaning: some mean append, some mean create, and so on (read through them all afterwards).
With so many options, how can we pass several at once? If each option were a separate parameter, the function would be bloated and ugly, which is unacceptable for a system interface, and it would be inefficient. Bit flags solve this problem neatly.
Bit flags (important)
If an int is used as a parameter, how many bits does it have? 32 (on Linux)
0000 0000 0000 0000 0000 0000 0000 0000
If each bit represents one option, can a single int carry up to 32 options? Yes
This method is efficient and takes up very little space, perfect!
Example
Pass in multiple options by combining them with bitwise OR
Now let's get started using open
We want to perform write-only operations on the log.txt file, so we pass O_WRONLY
And there is no log.txt file in our current directory
After running the program, open fails with "No such file or directory". Why? With fopen in C, a missing file is created automatically, so why is it not created here?
This goes back to what we just said: the language-level interface wraps the system interface to make it easier to use.
If we want the file to be created automatically when it does not exist, the system interface provides the O_CREAT option.
Now let's run the program again
The log.txt file is created, but its permissions are wrong! No mode was supplied, so the permission bits are not meaningful.
This is where the second form of open comes in
int open(const char *pathname, int flags, mode_t mode);
Its third parameter sets the permissions of the file when the file is created
We are no strangers to file permission settings.
Here we pass 0666 (octal), which represents -rw-rw-rw-
After running the program, the permissions of the new file look much more reasonable than last time, but why are they -rw-rw-r--? We clearly asked for -rw-rw-rw-
This is because of the umask!
umask (user-created file mask)
Use the umask command to view the current umask value
Here my umask value is 0002 (octal), which in binary is 0 000 000 010
So how does the mask turn the 0666 we requested into 0664?
0666:  0 110 110 110
0002:  0 000 000 010 -> bitwise inversion: 1 111 111 101
Bitwise AND (&) of 0666 and the inverted mask
Result: 0 110 110 100 = 0664
This is what the mask does: final permissions = mode & ~umask
So is there a way to reset the mask? The system provides an interface
mode_t umask(mode_t mask);
Set the umask value to 0
The permissions of the new file created at this time are correct!
close
There is not much to say about close: pass in the file descriptor fd of an open file and the file is closed
write
ssize_t write(int fd, const void *buf, size_t count);
fd is the file descriptor of the file to write to, buf points to the data to be written, and count is the number of bytes from buf to write
The return value is the number of bytes actually written to the file, or -1 on error
Let's take a look at the code below
Pay attention to the flags passed to open; we want to write aa into the opened log.txt file.
At this point log.txt already contains data
After running the code
we find that the old contents of log.txt were not cleared before aa was written; aa only overwrote the first bytes. This is because we left out one option in open: O_TRUNC
Passing O_WRONLY | O_CREAT | O_TRUNC reproduces the full behavior of fopen("log.txt","w").
And if you want to realize the function of fopen("log.txt", "a"), you need to replace O_TRUNC with O_APPEND
run the program several times
This realizes the complete function of fopen("log.txt", "a")!
read
ssize_t read(int fd, void *buf, size_t count);
The parameters and return value are similar to those of write, except that here buf is the buffer that receives the content read from the file
Read successfully!
Getting to know the file descriptor fd
File descriptors will get their own article, which explains in detail how closely they are tied to files. For now, let's just get a first impression
First of all, in the code we wrote above, the fd we print is always 3
Judging from the results, file descriptors seem to be a series of consecutive integers
And if so, why do they start from 3?
This is because when a C program runs, three streams are opened by default: stdin (standard input), stdout (standard output), and stderr (standard error), which correspond to the keyboard, the display, and the display respectively.
Their file descriptors are stdin (0), stdout (1), and stderr (2).
How to prove it?
Since 1 represents standard output, can this program also output hello world to the screen?
correct!
What is FILE
Let me state the conclusion first: FILE is a structure defined by the C library that holds the information the library keeps about an opened file, including its file descriptor
How to prove it?
_fileno represents the file descriptor fd
For FILE, I will explain it in detail in the next article!