C language file operations (including detailed steps)


1. Why use files?

When we write a program, we naturally want to keep the data it has produced, so that the data stays around until we choose to delete it ourselves. This is the problem of data persistence. Common ways of persisting data include storing it in disk files and storing it in a database. Here we talk about how to put data into disk files.

2. What is a file?

Anything stored on disk is a file; for example, the contents of the folders on the computer's C drive are files. In program design, however, we generally talk about two types of files, classified by function: program files and data files.

1. Program files

These include source files (suffix .c), object files (suffix .obj on Windows), and executable programs (suffix .exe on Windows).

2. Data files

The content of a data file is not a program but the data that a program reads and writes while it runs, such as a file the program reads its input from or a file it writes its output to.

This post is mostly about data files, because we need to learn how to read the data in a file into memory and how to write the data in a program out to a file. The input and output covered in earlier chapters all targeted the terminal: data was typed in on the keyboard and results were shown on the screen. Sometimes, though, we want to write information to disk and read it back into memory later when it is needed. In that case we are working with files on disk.

3. File name

A file must have a unique file identifier so that users can identify and refer to it.
The file identifier consists of 3 parts: file path + file name stem + file suffix
For example: c:\code\test.txt
For convenience, the file identifier is often simply called the file name.

3. File opening and closing

1. File pointer

In a buffered file system, the key concept is the "file type pointer", referred to as the **"file pointer"**.
Each file in use has a corresponding file information area in memory, which stores information about the file (such as its name, its status, and the current position in the file). This information is kept in a structure variable whose type is declared by the system and named FILE.

For example, the stdio.h header file provided by the VS2013 compilation environment has the following file type declaration:

struct _iobuf {
        char *_ptr;
        int   _cnt;
        char *_base;
        int   _flag;
        int   _file;
        int   _charbuf;
        int   _bufsiz;
        char *_tmpfname;
};
typedef struct _iobuf FILE;

The contents of the FILE type differ between C compilers, but they are broadly similar.
Whenever a file is opened, the system automatically creates a variable of the FILE structure for it and fills in the information; the user does not need to care about the details.
The FILE structure is normally accessed through a FILE pointer, which is more convenient to use.
Below we create a FILE* pointer variable:

FILE* pf;//file pointer variable

This defines pf as a pointer variable that points to FILE type data. pf can be made to point to the file information area (a structure variable) of a particular file, and the file can then be accessed through the information in that area. In other words, the file pointer variable leads to the file associated with it.

2. File opening and closing

Files should be opened before reading or writing and closed after use.
When a program opens a file, a FILE* pointer is returned that points to the file's information area, which establishes the relationship between the pointer and the file.
ANSI C specifies that fopen is used to open a file and fclose is used to close it.
Remember to close the file once it has been opened and the data has been processed; otherwise buffered data may be lost.

//open a file
FILE * fopen ( const char * filename, const char * mode );
//close a file
int fclose ( FILE * stream );

The most commonly used file open modes are listed below.

| Mode | Meaning | If the specified file does not exist |
| --- | --- | --- |
| "r" (read-only) | Open an existing text file for input | Error |
| "w" (write-only) | Open a text file for output | A new file is created |
| "a" (append) | Add data to the end of a text file | A new file is created |
| "rb" (read-only) | Open a binary file for input | Error |
| "wb" (write-only) | Open a binary file for output | A new file is created |

Example code:

/* fopen fclose example */
#include <stdio.h>
int main ()
{
  FILE * pFile;
  //open the file
  pFile = fopen ("myfile.txt","w");//open the file for output (writing)
  //file operations
  if (pFile!=NULL)
  {
    fputs ("fopen example",pFile);//write the text as a string
    //close the file
    fclose (pFile);
  }
  return 0;
}

4. Sequential reading and writing of files

Output (writing) means sending data from memory to the file, and input (reading) means bringing the contents of the file into memory.


You need to master the following functions for reading and writing files:

| Purpose | Function name | Applies to |
| --- | --- | --- |
| Character input | fgetc | All input streams |
| Character output | fputc | All output streams |
| Text line input | fgets | All input streams |
| Text line output | fputs | All output streams |
| Formatted input | fscanf | All input streams |
| Formatted output | fprintf | All output streams |
| Binary input | fread | Files |
| Binary output | fwrite | Files |

5. fseek function

fseek sets the file position indicator (the place in the file where the next read or write happens) from a reference position and an offset. Its prototype is:
int fseek ( FILE * stream, long int offset, int origin );
The first parameter is the file pointer (stream), the second parameter is the offset in bytes, and the third parameter is the reference position the offset is measured from, which must be one of the three options defined for fseek:
SEEK_CUR: the offset is measured from the current position in the file.
SEEK_END: the offset is measured from the end of the file, so it has to be negative to reach the existing content of the file.
SEEK_SET: the offset is measured from the beginning of the file.
For example:

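#include <stdio.h>
int main ()
{
  FILE * pFile;
  pFile = fopen ( "example.txt" , "wb" );
  if (pFile == NULL)
  {
    perror ("Error opening file");
    return 1;
  }
  fputs ( "This is an apple." , pFile );
  fseek ( pFile , 9 , SEEK_SET );  // move to offset 9 from the start of the file
  fputs ( " sam" , pFile );        // overwrite 4 bytes starting at offset 9
  fclose ( pFile );
  return 0;
}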

Why does the file show This is a sample. when it is opened in Notepad? The first fputs writes This is an apple. to the file. The call to fseek then moves the file position to 9 bytes after the beginning of the file (SEEK_SET measures from the start, and spaces count as bytes too), which is the n of "an". The second fputs starts writing at that position, so the 4 bytes of " sam" overwrite "n ap", and the file ends up containing:
This is a sample.

6. ftell function

ftell returns the offset of the current file position relative to the beginning of the file. Its prototype is:
long int ftell ( FILE * stream );
This function is relatively simple: the parameter is the file pointer (stream), and the return value, of type long (not int), is the current offset from the beginning of the file, or -1L on failure.

#include <stdio.h>
int main ()
{
  FILE * pFile;
  long size;
  pFile = fopen ("myfile.txt","rb");
  if (pFile==NULL) perror ("Error opening file");
  else
  {
    fseek (pFile, 0, SEEK_END);   // non-portable
    size=ftell (pFile);
    fclose (pFile);
    printf ("Size of myfile.txt: %ld bytes.\n",size);
  }
  return 0;
}

Because the file position has been moved to the end of the file, ftell returns the offset of the end relative to the beginning, which is the size of the file. If myfile.txt holds 17 bytes of text, such as This is a sample., the result is 17 and the program prints:
Size of myfile.txt: 17 bytes.

7. rewind function

rewind moves the file position back to the beginning of the file. Its prototype is:
void rewind ( FILE * stream );
The return type of rewind is void, and the only parameter it needs is the file pointer (stream). This function is relatively simple, so let's go straight to an example.

#include <stdio.h>
int main ()
{
  int n;
  FILE * pFile;
  char buffer [27];
  pFile = fopen ("myfile.txt","w+");
  if (pFile == NULL)
  {
    perror ("Error opening file");
    return 1;
  }
  for ( n='A' ; n<='Z' ; n++)
    fputc ( n, pFile);
  rewind (pFile);              // back to the start before reading
  fread (buffer,1,26,pFile);
  fclose (pFile);
  buffer[26]='\0';
  puts (buffer);
  return 0;
}

Running the code prints:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
The myfile.txt created in the program's folder contains the same 26 letters.

8. Text files and binary files

Depending on how the data is organized, data files are called text files or binary files.

Data is stored in binary form in memory, and if it is output to external storage without conversion, it is a binary file.

If the data has to be stored on external storage as ASCII characters, it must be converted before it is stored. A file stored in the form of ASCII characters is a text file. (For example, if the integer 10000 is written to disk as ASCII, the characters 1, 0, 0, 0, 0 are what is stored.)

Written as ASCII, the integer 10000 occupies 5 bytes on disk (one byte per character); written in binary form, it occupies only 4 bytes (tested with VS2013).

Let's use the integer 10000 as an example. If it is written to disk in binary form, the bytes on disk are the same as the bytes in memory. When we open such a file in a text editor, the content appears garbled and unreadable (although the machine can interpret it). We can instead add the file to the project in the IDE (VS2019), which has a binary editor that displays the raw bytes as hexadecimal numbers. The detailed steps are as follows:

code:

#include <stdio.h>
int main()
{
 int a = 10000;
 FILE* pf = fopen("test.txt", "wb");
 if (pf == NULL)
 {
  perror("fopen");
  return 1;
 }
 fwrite(&a, sizeof(a), 1, pf);//write to the file in binary form (4 bytes here)
 fclose(pf);
 pf = NULL;
 return 0;
}

Opening test.txt in a text editor shows only garbled characters.
Add the file to the project in the IDE and open it with the binary editor.
The binary editor then displays the content of the file as:
10 27 00 00
Why does 10000 stored in binary form become 10 27 00 00? First write out the binary sequence of 10000: 00000000 00000000 00100111 00010000. Every four bits correspond to one hexadecimal digit, which gives 00 00 27 10. Our machine, however, stores integers in little-endian order: the low-order bytes of the data are stored at the low addresses of memory and the high-order bytes at the high addresses. The stored form is therefore 10 27 00 00.

9. Determining the end of file reading

1. Incorrect use of feof function

While a file is being read, the return value of the feof function must not be used directly to decide whether the end of the file has been reached.
Instead, feof is used after reading has stopped, to determine whether it stopped because a read failed or because the end of the file was encountered. (feof reports why reading ended; it does not detect that reading is about to end.)

1. To determine whether reading a text file has finished, check whether the return value is EOF (fgetc) or NULL (fgets).
For example:
fgetc: check whether the return value is EOF.
fgets: check whether the return value is NULL.
2. To determine whether reading a binary file has finished, check whether the return value is less than the number of items requested.
For example:
fread: check whether the return value is less than the number of items requested.

Example of proper use of the feof function with a text file:

#include <stdio.h>
#include <stdlib.h>
int main(void) {
    int c; // note: int, not char, is required to handle EOF
    FILE* fp = fopen("test.txt", "r");
    if(!fp) {
        perror("File opening failed");
        return EXIT_FAILURE;
    }
    // fgetc returns EOF both when a read fails and when the end of the file is reached
    while ((c = fgetc(fp)) != EOF) // standard C I/O file reading loop
    {
        putchar(c);
    }
    // determine why reading stopped
    if (ferror(fp))
        puts("I/O error when reading");
    else if (feof(fp))
        puts("End of file reached successfully");
    fclose(fp);
}

Example of proper use of the feof function in a binary file:

#include <stdio.h>
enum { SIZE = 5 };
int main(void) {
    double a[SIZE] = {1.,2.,3.,4.,5.};
    FILE *fp = fopen("test.bin", "wb"); // must use binary mode
    if (!fp) {
        perror("File opening failed");
        return 1;
    }
    fwrite(a, sizeof *a, SIZE, fp); // write the array of doubles
    fclose(fp);
    double b[SIZE];
    fp = fopen("test.bin","rb");
    if (!fp) {
        perror("File opening failed");
        return 1;
    }
    size_t ret_code = fread(b, sizeof *b, SIZE, fp); // read the array of doubles
    if(ret_code == SIZE) {
        puts("Array read successfully, contents: ");
        for(int n = 0; n < SIZE; ++n) printf("%f ", b[n]);
        putchar('\n');
    } else { // error handling
        if (feof(fp))
            printf("Error reading test.bin: unexpected end of file\n");
        else if (ferror(fp)) {
            perror("Error reading test.bin");
        }
    }
    fclose(fp);
}

10. File buffer

When it comes to the file buffer, we naturally think of the input buffer: a character typed on the keyboard is not passed on directly, but is first placed in the input buffer, and the buffered characters are passed on together later rather than one at a time.

The same is true for file buffers. Data written from memory to disk is first sent to a buffer in memory and is written to disk in one go once the buffer is full. When data is read from disk, the data from the disk file first fills the memory buffer and is then delivered from the buffer to the program's data area (program variables and so on) piece by piece. The size of the buffer is determined by the C implementation.

Test code:

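#include <stdio.h>
#include <windows.h>
//tested with VS2013 on Windows 10
int main()
{
 FILE*pf = fopen("test.txt", "w");
 if (pf == NULL)
 {
  perror("fopen");
  return 1;
 }
 fputs("abcdef", pf);//the data first goes into the output buffer
 printf("Sleeping 10 seconds - the data has been written; open test.txt and you will find it is empty\n");
 Sleep(10000);
 printf("Flushing the buffer\n");
 fflush(pf);//only when the buffer is flushed is the data in the output buffer written to the file (disk)
 //note: this is for output streams; fflush(stdin) no longer clears the input buffer on newer versions of VS
 printf("Sleeping another 10 seconds - open test.txt again and the file now has content\n");
 Sleep(10000);
 fclose(pf);
 //note: fclose also flushes the buffer when it closes the file
 pf = NULL;
 return 0;
}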

We can test this code. As soon as the program has executed the fputs call, open test.txt: it has no content yet. Open test.txt again after fflush has flushed the file buffer, and the written content is there. This confirms that a file buffer really exists.

Because of the buffer, a C program that operates on files needs to flush the buffer or close the file when it has finished with the file. Failing to do so can cause problems when reading and writing files.



Origin blog.csdn.net/ZJRUIII/article/details/120552735