Linux - file manipulation and system interface

                                                              


This article is included in the column [Linux System Programming].


This article was originally created by Feng Junzi.

Foreword

You have probably already used C file functions such as fopen and fclose. When you first met these interfaces, did they feel opaque, hard to understand and hard to remember?

This chapter explains file operations and also introduces an important concept that will shape how you learn Linux system programming.

Files

What is a file?

In daily use, a file seems to be a game or application we run (.exe), or the C and C++ source code we write (.c, .cpp, and so on), all stored on disk. That is a narrow view; the term actually covers much more.

A file is not just the data it stores but also its attributes (such as its permissions, owner, group, timestamps, and type).

Everything is a file

To the operating system, anything that can be read from or written to can be treated as a file.

Keyboards and monitors can be read from or written to, so they can also be treated as files. The ordinary files we use every day are stored on disk, and reading or writing them means reading or writing the disk, so the disk is a file too.

Only the operating system is allowed to read and write hardware directly.

File operations

Operating on a file means operating on its data or on its attributes.

Files can be read and written, but from whose point of view are "read" and "write" defined?

Files on disk: can be both read and written.

Display: printf, cout -> write.

Keyboard: scanf, cin -> read.

Does the classification of the keyboard and the display surprise you? We read from the display (how else would you see this article?), and we type on the keyboard, so why is the display "write" and the keyboard "read"?

Consider a file stored on disk. To access it, we write code and run it; what actually accesses the file is the running process.

So these directions should be understood from the perspective of the process, which is loaded in memory, not from our own. The process reads data from the keyboard and writes information to the display for us to see.

Review: the file operation functions of C

Opening a file with the interface provided by the C language

Writing a file

 #include <stdio.h>
 #include <string.h>

 int main(void)
 {
   // "w": create the file if it does not exist, clear it if it does
   FILE* fl = fopen("file.txt", "w");
   if (fl == NULL)
   {
     perror("fopen");
     return 1;
   }
   // Write to the file in three different ways
   const char* str1 = "hello fputs\n";
   fputs(str1, fl);
   const char* str2 = "hello fwrite\n";
   fwrite(str2, strlen(str2), 1, fl);
   const char* str3 = "hello fprintf\n";
   fprintf(fl, "%s", str3);
   fclose(fl);
   return 0;
 }

When fopen opens a file in "w" mode and the file does not exist, it creates the file in the current directory; if the file already exists, its contents are cleared first.

But which "current directory" is that? Whose current directory?

The answer is the current working directory of the process.

Reading a file

Isn't the effect the same as that of the cat command?

So let's simulate cat.

Simulating the cat command

System interface

The file functions reviewed above are all provided by the C language. As noted earlier, only the operating system may read and write hardware; user programs may not. Functions such as fopen are language-level interfaces, so their implementations must ultimately call system interfaces.

This is why we study Linux system programming. When we learned process waiting and program replacement, we already used interfaces provided by the system. Language-level interfaces are concise and convenient, but using them teaches us little about what actually happens. The C file functions are a good example: we know how to call them, yet it is hard to say what they do underneath.

Learning Linux system programming therefore deepens our understanding of the underlying principles and broadens our view. Different languages implement their library functions differently and wrap the system interfaces in different ways, so knowing the system interfaces shows what any language-level function is really doing. It also makes the library interfaces of other languages easier to pick up.

Cross-platform languages

Another concept is the cross-platform nature of a language. What does cross-platform mean? Code written on Linux can be compiled and run not only on Linux but also on Windows, for example.

Are the system interfaces provided by Windows and Linux the same? No, they are different.

So how can a language be cross-platform when each platform provides different system interfaces?

The answer is that the programmers who implement the C library write a version of each language-level function for every supported platform, and conditional compilation selects the version that matches the platform being built for.

File operation interfaces provided by Linux

open

man 2 open

Let's look at the first form of open:

int open(const char *pathname, int flags);

The return value is an int. On success, open returns the file descriptor (fd) of the opened file (file descriptors are covered in the next article); on failure it returns -1.

The first parameter, pathname, is the path of the file to open.

The second parameter, flags, is more interesting: it is a set of bit flags.

Let's first see what the man page says about this parameter.

The capitalized names (such as O_RDONLY, O_WRONLY, O_CREAT, and O_APPEND) are options, and the naming style tells us they are macros. Each option has its own meaning: some request appending, some request creation, and so on (read through the full list in the man page).

With so many options, how can we pass several at once? Giving every option its own parameter would make the function bloated, awkward, and inefficient, which is unacceptable for a system interface. Bit flags solve this problem neatly.

Bit flags (important)

If an int is used as a parameter, how many bits does it have? 32 (on typical Linux platforms).

0000 0000 0000 0000 0000 0000 0000 0000

If each bit represents one option, can a single int carry 32 independent options? Yes.

This approach is efficient and takes very little space.

Example

Pass in multiple options by combining them with bitwise OR (|); the function then tests each option with bitwise AND (&).

Now let's start using open.

We want to open log.txt for writing only, so we pass O_WRONLY.

 

There is no log.txt file in our current directory.

Running the program fails with "No such file or directory". With fopen in C, a missing file is created automatically in "w" mode, so why is it not created here?

This goes back to what we just said: the language-level interface wraps the system interface to make it more convenient to use.

To create the file automatically when it does not exist, the system interface provides the O_CREAT option.

Now let's run the program again.

log.txt is created, but its permissions are wrong: no mode was passed to open, so the permission bits are unpredictable.

This is where the second form of open comes in:

int open(const char *pathname, int flags, mode_t mode);

Its third parameter sets the permissions of the file when a new file is created.

File permission settings are already familiar to us.

Here we pass 0666 (octal), which stands for -rw-rw-rw-.

After running the program, the permissions of the new file look much more reasonable, but why are they -rw-rw-r-- when we clearly asked for -rw-rw-rw-?

This is because of the umask.

umask (file mode creation mask)

Use the umask command to view the current umask value.

Here my umask is 0002 (octal), which is 0 000 000 010 in binary.

So how does the umask turn the requested 0666 into 0664?

  0666:             0 110 110 110
  0002 inverted:    1 111 111 101   (0002 is 0 000 000 010)
  bitwise AND (&)   -------------
  Result:           0 110 110 100   (0664)

This is what the umask does: the final permissions are mode & ~umask.

Can a program change its own umask? Yes, the system provides an interface:

mode_t umask(mode_t mask);

                                                                                     

 Set the umask value to 0

 

The new file now has exactly the permissions we requested.

close

int close(int fd);

close needs little explanation: pass in the file descriptor fd of an open file and the file is closed.

write

ssize_t write(int fd, const void *buf, size_t count);

fd is the file descriptor of the file to write to, buf points to the data to write, and count is the number of bytes from buf that we want written.

The return value is the number of bytes actually written, or -1 on error.

Let's look at the following code.

Pay attention to the flags passed to open. We want to write aa into the opened log.txt file.

At this point log.txt already contains data.

After running the code:

We find that the old contents of log.txt were not cleared before aa was written; aa only overwrote the first bytes. This is because one option is still missing from the open call: O_TRUNC.

 

With O_WRONLY | O_CREAT | O_TRUNC, open now behaves like fopen("log.txt", "w").

To get the behavior of fopen("log.txt", "a"), replace O_TRUNC with O_APPEND.

Run the program several times:

Each run adds to the end of the file, which matches the behavior of fopen("log.txt", "a").

read

ssize_t read(int fd, void *buf, size_t count);

The parameters and return value are similar to those of write, except that buf is where the data read from the file is stored. A return value of 0 means end of file.

 

 Read successfully!

Getting to know the file descriptor fd

File descriptors get their own article, which will explain in detail how they relate to files. For now, a brief introduction.

In the code above, the fd we print is always 3.

 

The results suggest that file descriptors are a series of consecutive integers.

If so, why do they start at 3?

Because when a C program starts, three streams are already open by default: stdin (standard input), stdout (standard output), and stderr (standard error). They correspond to the keyboard, the display, and the display, respectively.

Their file descriptors are stdin (0), stdout (1), and stderr (2).

How can we prove it?

Since 1 is standard output, can a program print hello world to the screen by writing to fd 1?

Correct!

What is FILE

The conclusion first: FILE is a structure defined by the C library that holds the state of an open stream, including the file descriptor it wraps.

How can we prove it?

In glibc, the _fileno member of FILE holds the file descriptor fd.

FILE will be explained in detail in the next article.


Origin blog.csdn.net/fengjunziya/article/details/131045529