Standard I/O library summary

Preface

Without accumulating small steps, one cannot travel a thousand miles; without gathering small streams, there is no river. Small problems and hidden bugs encountered at work are often caused by an incomplete grasp of small points of knowledge. It is therefore worth summarizing the commonly used points so that they can be consulted later to improve work efficiency.

 

1. Concept:

1. The standard I/O library is implemented on many operating systems and is specified by the ISO C standard. It provides buffered I/O and handles many details for the user, such as allocating buffers and performing I/O in optimally sized blocks, so users do not have to worry about choosing the correct block length. This makes the library easy to use, but if you do not understand in depth how the I/O library functions work, it can also cause problems.

      File I/O (the system-call interface) works on file descriptors: when a file is opened, a file descriptor is returned and is then used for subsequent I/O operations. The standard I/O library instead works on streams: when we open or create a file with the standard I/O library, we associate a stream with the file.

      When a stream is opened, the standard I/O function fopen returns a pointer to a FILE object (type FILE *, called a file pointer). The object is usually a structure that contains all the information the standard I/O library needs to manage the stream, including the file descriptor used for the actual I/O, a pointer to the buffer for the stream, the length of the buffer, the number of characters currently in the buffer, and error flags.

2. Standard input, standard output and standard error:
       Three streams are predefined for every process and are available to it automatically: standard input, standard output and standard error. These three standard I/O streams are referenced through the predefined file pointers stdin, stdout and stderr, which are defined in the header file <stdio.h>. (The corresponding file descriptors used by file I/O, STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO, are defined in the header file <unistd.h>.)
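The relationship between the three file pointers and the three file descriptors can be checked with a small sketch. It assumes a POSIX system, because fileno is a POSIX function rather than part of ISO C, and the helper name std_streams_match_descriptors is made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX system, needed for fileno */
#include <stdio.h>
#include <unistd.h>

/*
 * Returns 1 if stdin, stdout and stderr are associated with the
 * descriptors STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO, else 0.
 */
int std_streams_match_descriptors(void)
{
    return fileno(stdin) == STDIN_FILENO
        && fileno(stdout) == STDOUT_FILENO
        && fileno(stderr) == STDERR_FILENO;
}
```

In a normally started process the function is expected to return 1, since the predefined streams are layered on top of descriptors 0, 1 and 2.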

3. Buffering: (the chosen buffer length affects the amount of CPU time required to perform I/O)
      The purpose of the buffering provided by the standard I/O library is to minimize the number of read and write calls. The library also buffers each I/O stream automatically, so the application does not have to manage this itself.

      The standard I/O library provides three types of buffers:

      1) Full buffering: The actual I/O operation is performed only after the standard I/O buffer has been filled. For files residing on disk (such as regular files), the standard I/O library usually uses full buffering.

       2) Line buffering: The standard I/O library performs the actual I/O operation when a newline character is encountered on input or output. Line buffering is usually used for streams that refer to terminal devices, such as standard input and standard output.
      Note: The buffer that the standard I/O library uses to collect each line has a fixed length, so once the buffer is full the I/O operation is performed even if no newline character has been written yet.

      3) Unbuffered: The standard I/O library does not buffer the characters. For example, if you use the standard I/O function fputs to write 15 characters to an unbuffered stream, the function will probably call the write system call to write these characters to the associated open file immediately.
      The standard error stream stderr is usually unbuffered, so that error messages are displayed as soon as possible, whether or not they contain a newline character.
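The difference between full buffering and no buffering can be observed with the sketch below. The original text does not cover setvbuf; it is the ISO C function for choosing a buffering mode and is used here only to force each mode. fileno and fstat are POSIX assumptions, and the helper names kernel_size and bytes_visible_before_flush are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX system, for fileno/fstat */
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file behind fp as the kernel currently sees it, or -1. */
static long kernel_size(FILE *fp)
{
    struct stat st;

    if (fstat(fileno(fp), &st) == -1)
        return -1;
    return (long)st.st_size;
}

/*
 * Writes 15 characters to a temporary file using the given buffering
 * mode (_IOFBF, _IOLBF or _IONBF) and returns how many bytes have
 * reached the file before any flush or close. Returns -1 on failure.
 */
long bytes_visible_before_flush(int mode)
{
    static char buf[BUFSIZ];
    FILE *fp = tmpfile();
    long size;

    if (fp == NULL)
        return -1;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, mode == _IONBF ? NULL : buf, mode, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }
    fputs("fifteen chars!!", fp);   /* exactly 15 characters, no newline */
    size = kernel_size(fp);
    fclose(fp);
    return size;
}
```

With _IOFBF the 15 characters are expected to stay in the library buffer until the stream is flushed or closed, while with _IONBF they are expected to be in the file as soon as fputs returns.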

 

2. Standard I/O function analysis:

1. Open and close the stream:

      1.1 Open the stream fopen:

      Header file: #include <stdio.h>
      Function prototype: FILE *fopen(const char *restrict pathname, const char *restrict type); 
      freopen and fdopen will not be discussed.
      Return value: Return the file pointer if successful, or NULL if an error occurs.
      Function: Open a specified file.
      Parameters:
      pathname: file path
      type: the mode in which the file is opened. The values are as follows:
      r rb Open for reading only; the file must already exist.
      w wb Open for writing only; create the file if it does not exist, and truncate it to 0 bytes if it already exists, so that the original content is replaced.
      a ab Open for writing only, appending at the end of the file; create the file if it does not exist. Data can only be added at the end of the file.
      r+ rb+ Open for reading and writing; the file must already exist.
      w+ wb+ Open for reading and writing; create the file if it does not exist, and truncate it to 0 bytes if it already exists.
      a+ ab+ Open for reading and for appending at the end of the file; create the file if it does not exist.
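The effect of the w, a and r modes can be tried with a minimal sketch like the following. The helper names write_text and read_text are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Opens path with the given fopen mode and writes text; 0 on success. */
int write_text(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* fclose flushes the buffer, so its result matters for writes. */
    return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Reads at most size-1 bytes of the file into out and null-terminates
 * the result. Returns the number of bytes read, or -1 if the file
 * cannot be opened.
 */
long read_text(const char *path, char *out, size_t size)
{
    FILE *fp;
    size_t n;

    if (size == 0)
        return -1;
    fp = fopen(path, "r");       /* "r" fails if the file does not exist */
    if (fp == NULL)
        return -1;
    n = fread(out, 1, size - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return (long)n;
}
```

Calling write_text with "w" and then with "a" on the same path should leave both texts in the file, while a second call with "w" should replace everything written before.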

      1.2 Close the stream fclose:

      Header file: #include <stdio.h>
      Function prototype: int fclose(FILE *fp); 
      Return value: Return 0 if successful, and EOF if error occurs.
      Description: Before the file is closed, any buffered output data is flushed and any buffered input data is discarded. If the standard I/O library automatically allocated a buffer for the stream, that buffer is released. When a process terminates normally (by calling exit directly or by returning from the main function), all standard I/O streams with unwritten buffered data are flushed, and all open standard I/O streams are closed.

    1.3 Error code errno and the error-printing functions perror and strerror:
    On Linux, when a system call fails, its error code is stored in errno. errno holds only the most recent error: the next error code overwrites the previous one, and a successful call does not reset errno to 0. Each kind of error has its own error code. errno is declared in <errno.h> and has type int; all error codes are positive integers.
    Printing errno directly only shows an integer value, which does not tell you what went wrong. Use the perror or strerror function to turn errno into a readable string and print it.

    Header file: #include <stdio.h>
    Function prototype: void perror(const char *s);
    Function: Print system error information.
    Parameter: s represents the string prompt
    Return value: no return value
    Description: perror prints the error information to standard error. It first prints the string pointed to by s, then a colon and a space, then the error message that corresponds to the current value of errno, followed by a newline. Output format:
    s: strerror(errno)
    where s is the prompt string and strerror(errno) is the reason for the system error.

    Header file: #include <string.h>
    Function prototype: char *strerror(int errnum);
    Function: According to the error number, it returns the error reason string.
    Parameter errnum: error code errno.
    Return value: the string corresponding to the error code errnum.
    Note: Some functions do not store their error code in errno but return it as the return value. strerror can still be used on that value, for example: fputs(strerror(n), stderr).
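A small sketch of both reporting styles follows. It assumes a POSIX system, where a failed fopen sets errno (ISO C alone does not promise this), and the helper name report_open_error is made up for this illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Tries to open path for reading. On failure, reports the reason on
 * standard error in two equivalent ways and returns the saved errno
 * value. Returns 0 if the file could be opened.
 */
int report_open_error(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        int saved = errno;      /* save it: later calls may overwrite errno */

        perror(path);                                       /* "path: reason" */
        fprintf(stderr, "%s: %s\n", path, strerror(saved)); /* same message   */
        return saved;
    }
    fclose(fp);
    return 0;
}
```

For a path that does not exist, both lines printed on standard error should carry the same message, and the returned value should be ENOENT.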

2. Read and write streams:
    2.1 Once a stream is open, you can read and write it with three different types of unformatted I/O:
    1) Character-at-a-time I/O. These functions read or write one character at a time. Read one character: getc, fgetc, getchar; write one character: putc, fputc, putchar.
    2) Line-at-a-time I/O. These functions read or write one line at a time. Read one line: fgets; write one line: fputs. Each line is terminated by a newline character. When calling fgets, you must specify the maximum line length that can be handled.
    3) Direct I/O (also known as binary I/O). The fread and fwrite functions support this type of I/O. Each I/O operation reads or writes a certain number of objects, and each object has a specified length. These two functions are often used to read or write one structure at a time from or to a binary file.
    4) Supplement: formatted I/O functions, such as printf and scanf.
   
    2.2.1 The following three functions read one character at a time:
    Header file: #include <stdio.h>
    Function prototype:
    int getc(FILE *fp);
    int fgetc(FILE *fp);
    int getchar(void);
    Return value of the three functions: the next character if successful, or EOF if the end of the file has been reached or an error occurs.
    1) fgetc() and getc() are used in the same way; the difference is that getc() may be implemented as a macro, while fgetc() is guaranteed to be a function.
    2) getchar() reads one character from standard input stdin and is equivalent to getc(stdin). When standard input is a terminal, you can type several characters, and the program continues only after you press Enter; each call to getchar still returns only one character.
    3) All three functions return the next character in the stream as an unsigned char converted to an int. The result should therefore be stored in an int, not a char, so that EOF can be told apart from every valid character.
    4) EOF is a symbolic constant defined in <stdio.h>. It is a negative value, usually -1. EOF is not a character, so it cannot be displayed on the screen.
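Why the return value belongs in an int can be seen in a short sketch. The helper name count_chars is hypothetical; the point of the example is that a byte with the value 0xFF is still counted as data, because it is returned as 255 and not as EOF.

```c
#include <stdio.h>

/*
 * Counts how many times target occurs in fp, reading one character at
 * a time. Returns the count, or -1 if a read error occurred.
 */
long count_chars(FILE *fp, int target)
{
    long count = 0;
    int ch;     /* int, not char: must hold EOF as well as 0..255 */

    while ((ch = getc(fp)) != EOF) {
        if (ch == target)
            count++;
    }
    /* getc returns EOF for both end of file and errors. */
    return ferror(fp) ? -1 : count;
}
```

If ch were declared as char, a 0xFF byte could compare equal to EOF on platforms where char is signed, and the loop would stop early.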
    
    2.2.2 The following three functions are used to output one character at a time:
    Header file: #include <stdio.h>
    Function prototype:
    int putc(int c, FILE *fp);
    int fputc(int c, FILE *fp);
    int putchar(int c);
    Return value of the three functions: c if successful, or EOF if an error occurs.
    
    2.3.1 The following two functions read one line at a time:
    Header file: #include <stdio.h>
    Function prototype:
    char *fgets(char *restrict buf, int n, FILE *restrict fp);
    char *gets(char *buf);
    Return value: buf if successful, or NULL if the end of the file has been reached or an error occurs.
    1) Both functions take the address of the buffer into which the line is read. gets reads from standard input, while fgets reads from the specified stream.
    2) For fgets, the buffer length n must be specified. fgets stops after reading n-1 characters, after reading a newline character, or at the end of the file. The characters read are stored in the buffer, which is always terminated with a null character. If the line, including its terminating newline, is longer than n-1 characters, fgets returns only part of the line; the next call to fgets continues reading the rest of the line.
    3) gets is deprecated and was removed from the C standard in C11; it should not be used. The problem is that the caller cannot specify the length of the buffer, so a line longer than the buffer overflows it and overwrites the memory that follows, with unpredictable consequences.
    4) Another difference between gets and fgets is that gets does not store the newline character in the buffer, while fgets does.
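The behaviour of fgets on long lines can be sketched as follows. The helper name read_line and its return convention are assumptions made for this illustration only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Reads the next piece of a line into buf (capacity n) and strips the
 * trailing newline if there is one.
 * Returns  1 if the end of a line was reached,
 *          0 if buf was filled before a newline was seen, so the rest
 *            of the line is still waiting in the stream,
 *         -1 at end of file or on error.
 */
int read_line(FILE *fp, char *buf, int n)
{
    size_t len;

    if (fgets(buf, n, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* fgets keeps the newline; drop it */
        return 1;
    }
    /* No newline: either a last line without one, or a partial line. */
    return feof(fp) ? 1 : 0;
}
```

With an 8-byte buffer and the input line "hello world", the first call is expected to deliver "hello w" and the second call the remaining "orld".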
       
    2.3.2 The following two functions output one line at a time:
    Header file: #include <stdio.h>
    Function prototype:
    int fputs(const char *restrict str, FILE *restrict fp);
    int puts(const char *str);
    Return value: a non-negative value if successful, or EOF if an error occurs.
    1) fputs writes a null-terminated string to the specified stream; the terminating null character is not written. The string usually has a newline as its last character before the null, but this is not required, and fputs does not add one.
    2) puts writes a null-terminated string to standard output without writing the terminating null character, and then writes a newline character to standard output.
   
    2.4.1 Direct I/O (Binary I/O)
    Sometimes we want to read or write an entire structure at once. With getc/fgetc or putc/fputc we would have to loop over the whole structure and process one byte per call, which is cumbersome and slow. fputs and fgets do not work either: on output, fputs stops at the first null character, and a structure may contain null characters; on input, fgets does not work correctly if the data contains null characters or newline characters. The following two functions are therefore needed for binary I/O.
    Header file: #include <stdio.h>
    Function prototype:
    size_t fread(void *restrict ptr, size_t size, size_t nobj, FILE *restrict fp);
    Function: Read the contents of the file into the buffer.
    size_t fwrite(const void *restrict ptr, size_t size, size_t nobj, FILE *restrict fp);
    Function: Write the contents of the buffer to the file.
    The return value of the two functions: the number of objects read or written;
   
    Two common usages:
    1) Read or write a binary array. For example, to write elements 2 through 5 (zero-based indices) of a floating-point array to a file, you can write:
    float data[10];
    
    if(fwrite(&data[2], sizeof(float), 4, fp) != 4)
        err_sys("fwrite error");
    Among them, the specified size is the length of each array element, and nobj is the number of elements to be written;
    
    2) Read or write a structure. For example,
    struct {
        short count;
        long total;
        char name[NAMESIZE];
    } item;

    if(fwrite(&item, sizeof(item), 1, fp) != 1)
        err_sys("fwrite error");
    Here size is sizeof(item), the length of the structure, and nobj is 1, the number of objects to be written.

    Description: 

    1) For fread, the return value can be less than nobj if an error occurs or the end of the file is reached. In that case, call ferror or feof to determine which of the two happened.

    2) The return value of fread has type size_t, which is unsigned, so it is never negative. A count smaller than nobj is the only indication of an error or of the end of the file.

    3) For fwrite, a return value less than the requested nobj means that an error occurred. Compare the return value of fwrite with nobj to determine whether the call failed.
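A round trip of several structures through fwrite and fread might look like the sketch below. The value of NAMESIZE, the structure tag item and the helper names save_items and load_items are assumptions for this illustration; the original text does not define them.

```c
#include <stdio.h>

#define NAMESIZE 16     /* assumed value, chosen only for this sketch */

struct item {
    short count;
    long  total;
    char  name[NAMESIZE];
};

/* Writes nobj records to fp; returns 0 only if all were written. */
int save_items(FILE *fp, const struct item *items, size_t nobj)
{
    return fwrite(items, sizeof items[0], nobj, fp) == nobj ? 0 : -1;
}

/*
 * Reads up to max records from fp. Returns the number of complete
 * records read (possibly fewer than max at end of file), or -1 if the
 * short count was caused by a read error.
 */
long load_items(FILE *fp, struct item *items, size_t max)
{
    size_t n = fread(items, sizeof items[0], max, fp);

    if (n < max && ferror(fp))
        return -1;
    return (long)n;
}
```

Files written this way are only reliably readable by a program built with the same compiler and options on the same architecture, because member offsets, padding and byte order can differ between systems.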

    2.4.2 feof and ferror
    Header file: #include <stdio.h>
    Function prototype: int feof(FILE *stream);
    Related functions: fopen, fgetc, fgets, fread.
    Function: Check whether the end-of-file indicator of the stream is set.
    Function description: feof() is used to detect whether a read has hit the end of the file (EOF). The parameter stream is the file pointer returned by fopen(). feof() returns a non-zero value if the end-of-file indicator is set and 0 otherwise. The indicator is set only after a read operation has tried to read past the end of the file, and a read error does not set it, so feof() should be checked after a read function reports failure rather than used as the loop condition.
    Usage:
    while((ch = fgetc(pf)) != EOF)
    {
        putchar(ch);
        fputc(ch, pf2);
    }
    if(feof(pf))
    {
        printf("end of file reached\n");
    }

    Header file: #include <stdio.h>
    Function prototype: int ferror(FILE *stream);
    Related functions: fopen, fgetc, fgets, fread.
    Function: Test the error indicator of the given stream.
    Return value: If the error indicator associated with the stream is set, the function returns a non-zero value; otherwise it returns zero.
    Usage:
    c = fgetc(fp);
    if(ferror(fp))
    {
        printf("read file error!\n");
    }
    clearerr(fp);
    For example, trying to read from a file that was opened write-only causes an error. clearerr(fp) clears the error and end-of-file indicators of the stream.
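The two indicators can be told apart after a read loop has finished, as in this minimal sketch. The helper name drain is hypothetical. The error case relies on the behaviour described above, where reading from a write-only stream sets the error indicator; this is what common Linux C libraries do, but treat it as an assumption on other systems.

```c
#include <stdio.h>

/*
 * Reads fp to the end, storing the number of characters read in *nread.
 * Returns 0 if reading stopped at end of file, -1 if it stopped
 * because of an error.
 */
int drain(FILE *fp, long *nread)
{
    int ch;

    *nread = 0;
    while ((ch = fgetc(fp)) != EOF)
        (*nread)++;

    /* fgetc returned EOF; only the stream indicators say why. */
    if (ferror(fp)) {
        clearerr(fp);   /* reset both indicators so the stream can be reused */
        return -1;
    }
    return 0;           /* feof(fp) is non-zero at this point */
}
```

Checking ferror first and treating everything else as end of file avoids the common mistake of looping on !feof(fp), which never terminates if the stream fails before reaching the end.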



    2.5 Special instructions

    The fgets function reads a line ending with '\n' (including the '\n') from the file pointed to by stream, stores it in the buffer s, and appends a '\0' to form a complete string.

    For fgets(), '\n' is a special character, but '\0' is not: if a '\0' is read, it is stored like any other character. If the file contains a '\0' character (a 0x00 byte), then after calling fgets() it is impossible to tell whether a '\0' in the buffer was read from the file or is the terminator added by fgets(). Therefore fgets() is only suitable for reading text files, not binary files, and the text file must not contain '\0' characters.
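The ambiguity can be made visible with a short sketch. The helper name line_length_seen_by_strlen is made up for this illustration, which feeds fgets a line that contains a 0x00 byte in the middle.

```c
#include <stdio.h>
#include <string.h>

/*
 * Reads one line with fgets and returns the length that strlen reports
 * for it, or -1 at end of file or on error.
 */
long line_length_seen_by_strlen(FILE *fp, char *buf, int n)
{
    if (fgets(buf, n, fp) == NULL)
        return -1;
    /* strlen stops at the first '\0', wherever it came from. */
    return (long)strlen(buf);
}
```

For the six bytes 'a', 'b', 0x00, 'c', 'd', '\n', fgets consumes the whole line, yet the function is expected to report a length of 2, because string functions cannot see the data stored after the embedded null character.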

 

3. Reading and writing binary files and text files, as examples to reinforce the theory above.

    The examples focus on the fopen, fread, and fwrite functions and on the fgetc and fputc functions.

    3.1 Use the fopen, fread, and fwrite functions to copy binary files and text files. Set different block lengths COUNT to check the correctness and efficiency of reading and writing the file.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h> 

#define COUNT 1024

/*
 * Copy a file
 */
int file_copy(char *str, char *dstr)
{

	FILE *sp = NULL;
	FILE *fp = NULL;
	char ch[COUNT] = {0};
	int i = 0;
	int count = 0;
	
	if((str == NULL) || (dstr == NULL))
	{
		printf("input parameter is NULL!\n");
		return -1;

	}

	// Open binary files with "rb" and text files with "r"
	sp = fopen(str, "rb");
	if(sp == NULL)
	{
		printf("open file %s failed!\n", str);
		return -1;

	}

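	// "ab" appends: if dstr already exists, the copy is added after its current contents
	fp = fopen(dstr, "ab");
	if(fp == NULL)
	{
		printf("open file %s failed!\n", dstr);
		fclose(sp);	// do not leak the source stream
		return -1;
	}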

	struct stat f_stat;
	if (stat(str, &f_stat) == -1)
	{
		fclose(sp);
		fclose(fp);
		return -1;

	} 
	
	// Number of blocks needed, rounded up so that a partial last block is counted
	count = (f_stat.st_size + COUNT - 1) / COUNT;
	printf("count = %d , f_stat.st_size = %lld \n", count, (long long)f_stat.st_size);
	
	for(i = 0; i < count; i++)
	{
		// fread returns the number of objects actually read
		size_t n = fread(ch, sizeof(char), COUNT, sp);

		if(n < COUNT)
		{
			// A short count means either an error or end of file; check for the error first
			if(ferror(sp))
			{
				printf("read file %s error!\n", str);

				fclose(sp);
				fclose(fp);
				return -1;
			}
			printf("remainder = %zu	\n", n);
		}

		// Write exactly the bytes that were read; the last block may be short
		if(fwrite(ch, sizeof(char), n, fp) != n)
		{
			printf("write file %s error!\n", dstr);

			fclose(sp);
			fclose(fp);
			return -1;
		}

		if(n < COUNT)
		{
			break;	// end of file reached
		}
	}
	
	fclose(sp);
	fclose(fp);

	return 0;
}


int main(int argc, char *argv[])
{

	int ret = 0;

	printf("%s %d argc:%d\r\n", __FUNCTION__, __LINE__, argc );

	// Check argc first so that argv is never indexed out of range
	if((argc < 3) || (argv[1] == NULL) || (argv[2] == NULL))
	{
		printf("input parameter is NULL!\n");
		return -1;

	}

	printf("argv0 = %s\r\n", argv[0]);
	printf("argv1 = %s\r\n", argv[1]);
	printf("argv1 = %s\r\n", argv[2]);


	ret = file_copy( argv[1], argv[2]);
	
	if(ret != 0)
	{
		printf("file_copy error!\n");
		return -1;	// report the failure to the caller
	}


	return 0;
}

     Note the relationship between the block size COUNT used for each read and write and the number of read/write loop iterations count:

    COUNT = 1
    $ ./file_copy test.bin file_copy_result.bin
    main 109 argc:3
    argv0 = ./file_copy
    argv1 = test.bin
    argv1 = file_copy_result.bin
    count = 6127992 , f_stat.st_size = 6127992

    A binary comparison with a file-comparison tool shows that the copied binary file is identical to the original file.

    COUNT = 256
    $ ./file_copy test.bin file_copy_result.bin
    main 126 argc:3
    argv0 = ./file_copy
    argv1 = test.bin
    argv1 = file_copy_result.bin
    count = 23938 , f_stat.st_size = 6127992
    remainder = 120

    COUNT = 512
    $ ./file_copy test.bin file_copy_result.bin
    main 125 argc:3
    argv0 = ./file_copy
    argv1 = test.bin
    argv1 = file_copy_result.bin
    count = 11969 , f_stat.st_size = 6127992
    remainder = 376

    COUNT = 1024
    $ ./file_copy test.bin file_copy_result.bin
    main 125 argc:3
    argv0 = ./file_copy
    argv1 = test.bin
    argv1 = file_copy_result.bin
    count = 5985 , f_stat.st_size = 6127992
    remainder = 376

    To copy the MD5 file, which is a text file, change the mode strings passed to fopen (on POSIX systems the b flag has no effect, so "r" and "rb" behave the same there):

	sp = fopen(str, "r");
	if(sp == NULL)
	{
		printf("open file %s failed!\n", str);
		return -1;
	}

	fp = fopen(dstr, "a");
	if(fp == NULL)
	{
		printf("open file %s failed!\n", dstr);
		return -1;
	}

    COUNT = 1024
    $ ./file_copy testcfg.md5 file_copy_result.log
    main 125 argc:3
    argv0 = ./file_copy
    argv1 = testcfg.md5
    argv1 = file_copy_result.log
    count = 1 , f_stat.st_size = 215
    remainder = 215

    Read the log file:

    COUNT = 1024
    $ ./file_copy ifconfig file_copy_result.log
    main 125 argc:3
    argv0 = ./file_copy
    argv1 = ifconfig
    argv1 = file_copy_result.log
    count = 1 , f_stat.st_size = 952
    remainder = 952

 

3.2 Use the fopen, fgetc, and fputc functions to copy binary files and text files.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h> 

/*
 * Copy a file
 */
int file_copy(char *str, char *dstr)
{

	FILE *sp = NULL;
	FILE *fp = NULL;
	int ret = 0;
	int ch;
	
	if((str == NULL) || (dstr == NULL))
	{
		printf("input parameter is NULL!\n");
		return -1;

	}

	// Open binary files with "rb" and text files with "r"
	sp = fopen(str, "rb");
	if(sp == NULL)
	{
		printf("open file %s failed!\n", str);
		return -1;

	}

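	// "ab" appends: if dstr already exists, the copy is added after its current contents
	fp = fopen(dstr, "ab");
	if(fp == NULL)
	{
		printf("open file %s failed!\n", dstr);
		fclose(sp);	// do not leak the source stream
		return -1;
	}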

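	// fgetc returns EOF both at end of file and on a read error, so test its return value directly
	while((ch = fgetc(sp)) != EOF)
	{
		ret = fputc(ch, fp);
		if(ret == EOF)
		{
			printf("write file %s error!\n", dstr);
			
			fclose(sp);
			fclose(fp);
			return -1;
		}
	}

	// Distinguish a read error from a normal end of file
	if(ferror(sp))
	{
		printf("read file %s error!\n", str);

		fclose(sp);
		fclose(fp);
		return -1;
	}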


	fclose(sp);
	fclose(fp);

	return 0;
}


int main(int argc, char *argv[])
{

	int ret = 0;

	printf("%s %d argc:%d\r\n", __FUNCTION__, __LINE__, argc );

	// Check argc first so that argv is never indexed out of range
	if((argc < 3) || (argv[1] == NULL) || (argv[2] == NULL))
	{
		printf("input parameter is NULL!\n");
		return -1;

	}

	printf("argv0 = %s\r\n", argv[0]);
	printf("argv1 = %s\r\n", argv[1]);
	printf("argv1 = %s\r\n", argv[2]);


	ret = file_copy( argv[1], argv[2]);
	
	if(ret != 0)
	{
		printf("file_copy error!\n");
		return -1;	// report the failure to the caller
	}


	return 0;
}

    Read binary files:

    $ ./file_copy_fgetc test.bin  file_copy.bin
    main 73 argc:3
    argv0 = ./file_copy_fgetc
    argv1 = test.bin
    argv1 = file_copy.bin
    $

    Read the MD5 file:

    $ ./file_copy_fgetc testcfg.md5 file_copy_result.log
    main 71 argc:3
    argv0 = ./file_copy_fgetc
    argv1 = testcfg.md5
    argv1 = file_copy_result.log
    $

    Read the log file:

    $ ./file_copy_fgetc ifconfig file_copy_result.log
    main 71 argc:3
    argv0 = ./file_copy_fgetc
    argv1 = ifconfig
    argv1 = file_copy_result.log
    $

 

4 Format IO 

4.1 Format the output
    Header file: #include <stdio.h>
    Function prototype:
    int printf(const char *restrict format, ...);
    int fprintf(FILE *restrict fp, const char *restrict format, ...);
    Return value of the two functions: the number of characters output if successful, or a negative value if an output error occurs.
    Question to consider: what does it mean when printf returns 0?
    
    int sprintf(char *restrict buf, const char *restrict format, ...);
    int snprintf(char *restrict buf, size_t n, const char *restrict format, ...);
    Return value of the two functions: the number of characters stored in the array if successful, or a negative value if an encoding error occurs.
    
    1) printf writes formatted data to standard output.
    2) fprintf writes data to the specified stream.
    3) sprintf stores the formatted characters in the array buf. sprintf automatically appends a null character at the end of the array, but this character is not included in the return value.
    The sprintf function can overflow the buffer pointed to by buf. The caller must ensure that the buffer is large enough.
    4) The snprintf function solves the buffer overflow problem of sprintf. In this function the buffer length is an explicit parameter, and any characters that would be written beyond the end of the buffer are discarded. snprintf returns the number of characters that would have been written to the buffer had it been large enough. Like sprintf, it appends a null character at the end of the array, and this character is not included in the return value. If snprintf returns a non-negative value less than the buffer length n, the output was not truncated.
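The truncation rule for snprintf can be turned into a simple check, sketched below. The helper name format_pair and its return convention are assumptions for this illustration.

```c
#include <stdio.h>

/*
 * Formats "name=value" into buf, which has room for n bytes.
 * Returns  0 if the whole text fit,
 *          1 if the output was truncated,
 *         -1 if snprintf reported an encoding error.
 */
int format_pair(char *buf, size_t n, const char *name, int value)
{
    /* needed = length the full output would have, without the '\0' */
    int needed = snprintf(buf, n, "%s=%d", name, value);

    if (needed < 0)
        return -1;
    return (size_t)needed >= n ? 1 : 0;
}
```

With a 6-byte buffer, formatting the pair count and 42 is expected to leave "count" in the buffer and report truncation, because the full text "count=42" needs 9 bytes including the terminating null character.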
    
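    5) Format description control:
    %[flags][fldwidth][precision][lenmodifier]convtype
    flags modify the conversion:
    -        left-justify the output in the field
    +        always display the sign of a signed conversion
    (space)  prefix a space if no sign is generated
    #        convert using the alternative form (for example, a 0x prefix for hexadecimal format)
    0        pad with leading zeros instead of spaces
    fldwidth specifies the minimum field width for the conversion. If the converted value has fewer characters, it is padded with spaces. The field width is a non-negative decimal number or an asterisk (*).
    precision specifies the minimum number of digits for integer conversions, the minimum number of digits after the decimal point for floating-point conversions, and the maximum number of characters for string conversions. The precision is a period (.) followed by an optional non-negative decimal integer or an asterisk (*).
    lenmodifier indicates the size of the argument. Possible values are hh (signed or unsigned char), h (signed or unsigned short), l (signed or unsigned long, or a wide character), ll (signed or unsigned long long), j (intmax_t or uintmax_t), z (size_t), t (ptrdiff_t) and L (long double).
    convtype controls how the argument is interpreted. Common values are d and i (signed decimal), o (unsigned octal), u (unsigned decimal), x and X (unsigned hexadecimal), f and F (floating-point number), e and E (floating-point number in exponential format), g and G (f or e format depending on the value), c (character), s (string), p (pointer to void) and % (a literal % character).
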

    4.2 Format the input
    header file: #include <stdio.h>
    Function prototype:
    int scanf(const char *restrict format, ...);
    int fscanf(FILE *restrict fp, const char *restrict format, ...);
    int sscanf(const char *restrict buf, const char *restrict format, ...);
    Return value of the three functions: the number of input items assigned, or EOF if an input error occurs or the end of the file is reached before any conversion.
    
    1) The scanf family of functions parses an input string and converts character sequences into variables of the specified types. Each argument after the format contains the address of a variable that receives the result of a conversion.
    2) Format description control:
    %[*][fldwidth][lenmodifier]convtype
    The optional leading asterisk (*) suppresses assignment: the input is converted as specified, but the result is not stored in an argument.
    fldwidth indicates the maximum field width for the conversion.
    lenmodifier indicates the size of the argument that receives the conversion result.
    convtype controls how the input is interpreted. It is similar to the conversion type used by the printf family (for example d, i, u, x, f, c and s).
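Checking the return value of the scanf family is the only way to know how many conversions succeeded, as the sketch below illustrates. The structure reading, the line layout "name value scale" and the helper name parse_reading are assumptions invented for this example.

```c
#include <stdio.h>

struct reading {
    char   name[16];
    int    value;
    double scale;
};

/*
 * Parses a line of the form "name value scale" into *r.
 * Returns 0 if all three fields were converted, -1 otherwise.
 */
int parse_reading(const char *line, struct reading *r)
{
    /*
     * %15s : field width 15 leaves room for the '\0' in name[16]
     * %d   : int
     * %lf  : double (the l length modifier is required for scanf)
     */
    int n = sscanf(line, "%15s %d %lf", r->name, &r->value, &r->scale);

    return n == 3 ? 0 : -1;
}
```

The field width on %s plays the same protective role for sscanf that the length argument plays for fgets and snprintf: without it, a long word could overflow name.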

Reference: For summaries of the formatted I/O functions, see the following blog posts, which cover them well.
       https://blog.csdn.net/zhangna20151015/article/details/52794982       Linux C string input functions gets(), fgets(), scanf(): detailed explanation
       https://www.cnblogs.com/52php/p/5724372.html       scanf, sscanf, fscanf

 

This article draws on the following material; sincere thanks to the authors for sharing:
       https://blog.csdn.net/zhangna20151015/article/details/52794982
       https://www.cnblogs.com/52php/p/5724372.html
       Advanced Programming in the UNIX Environment

 

Origin blog.csdn.net/the_wan/article/details/108294340