Understanding the buffer: this one article is enough

  Whether you are studying operating systems or learning a programming language, you have probably heard of the buffer. So what is a buffer, and where does it live? This article walks through it step by step. ❄️❄️❄️ Before we begin, congratulations to EDG on winning the championship! ❄️❄️❄️

1. Introducing the buffer

1.1 “\n”

  What output does each of the following two programs produce, and when?

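#include <stdio.h>
#include <unistd.h>

// Code 1: the string ends with "\n"
int main()
{
	printf("Hello, buffer!\n");
	sleep(3);
	return 0;
}

#include <stdio.h>
#include <unistd.h>

// Code 2: no "\n" at the end of the string
int main()
{
	printf("Hello, buffer!");
	sleep(3);
	return 0;
}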

  Both programs print "Hello, buffer!" and then sleep for 3 s. The only difference is that code 1 ends the string with a "\n" newline and code 2 does not. How can such a small character change the result?
Code 1: the string is printed first, then the program sleeps for 3 s.

Code 2: the program sleeps for 3 s first, and the string appears only afterwards.
  Does this mean the two programs execute their statements in a different order? No. Statements always run sequentially from top to bottom, so printf runs before sleep in both programs. Without the \n, however, "Hello, buffer!" is first stored in a buffer rather than shown immediately, and it only reaches the display when the program exits. That makes it look as if sleep ran first, but it did not. The fact that printf has already run while its output is not yet visible is exactly what leads to the concept of the buffer.

1.2 fflush

  This section assumes familiarity with file descriptors and redirection; readers who are new to them can refer to the author's other article, "File Descriptors". Consider the following code. When we run it directly, the text appears neither on the display nor in log.txt:

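#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main()
{
    close(1);                       // free descriptor 1 (standard output)
    umask(0);
    // open() returns the lowest free descriptor, so log.txt becomes fd 1
    int fd = open("log.txt",O_WRONLY|O_CREAT, 0666);
    if(fd<0){
      perror("open");
      return 1;
    }
    printf("Hello, buffer!\n");     // stored in the stdout buffer, not written yet
    close(fd);                      // fd 1 is closed before the buffer is flushed
    return 0;
}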
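  Why is this? Because stdout now refers to a file, the text is still sitting in the buffer when close(fd) runs, and once the descriptor is closed there is nowhere left to flush it to when the program exits. We must call fflush before close to flush the data to log.txt: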
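#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main()
{
    close(1);                       // free descriptor 1 (standard output)
    umask(0);
    // open() returns the lowest free descriptor, so log.txt becomes fd 1
    int fd = open("log.txt",O_WRONLY|O_CREAT, 0666);
    if(fd<0){
      perror("open");
      return 1;
    }
    printf("Hello, buffer!\n");     // stored in the stdout buffer
    fflush(stdout);                 // push the buffered text to fd 1 (log.txt)
    close(fd);
    return 0;
}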
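
  The effect of fflush can also be observed without closing descriptor 1. The sketch below is not from the original article: the helper names file_size and sizes_around_fflush are made up for illustration, and it assumes a POSIX system where a stream opened on a regular file is fully buffered by default. It writes one short line through stdio and measures the file size on disk before and after the flush.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file at `path` in bytes, or -1 if stat() fails. */
static long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/*
 * Hypothetical helper: writes one short line through stdio and records
 * the on-disk size before and after fflush().
 * Returns 0 on success, -1 on error.
 */
int sizes_around_fflush(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");    /* regular file: fully buffered by default */
    if (fp == NULL)
        return -1;

    fputs("Hello, buffer!\n", fp);  /* lands in the stdio buffer only */
    *before = file_size(path);      /* the file itself is still empty */

    fflush(fp);                     /* hand the buffered bytes to the OS */
    *after = file_size(path);       /* now the 15 bytes are in the file */

    fclose(fp);
    return 0;
}
```

  Calling sizes_around_fflush from a main function and printing the two sizes should show 0 before the flush and 15 after it, even though the string ends with \n.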


2. Buffering schemes

  In general, there are three buffering (flushing) schemes: a. no buffering; b. line buffering (typically used when output goes to the display); c. full buffering (used when writing to files; "file" here means a disk file).

  Put simply, a buffer is a region of memory into which data is written continuously. With line buffering, the contents are flushed when the buffer fills up or when a \n is written, in which case everything up to and including the \n is flushed. If there is no \n but you still want the output to appear, call fflush(stdout) to flush the buffer explicitly. With full buffering, data is flushed to the disk only once the buffer is full (or when the stream is flushed explicitly or closed).
  So why is the display line buffered and file output fully buffered? Computers follow the von Neumann architecture, in which keyboards, disks and displays are all peripherals, and files live on peripherals. Flushing data to a disk or a display therefore means writing to a peripheral. Writing to a peripheral is slow, so we accumulate data in a memory buffer and flush it once enough has accumulated, which improves efficiency.
  By this reasoning, is full buffering the most efficient? Yes, in theory it is. Then why does the display use line buffering? Nobody reads a disk file while it is being written, but a display is different: when a program writes to the display, the user wants to see the output as soon as possible. No buffering would be very inefficient, and with full buffering the user would not see messages in time, so the display is line buffered. Line buffering is a compromise between efficiency and usability.
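
  The three schemes can be compared directly with the standard setvbuf function, which selects a stream's buffering mode (_IONBF, _IOLBF or _IOFBF). The following is a minimal sketch rather than code from the original article: probe_mode and file_size are hypothetical helper names, and the sizes in the comments assume a typical POSIX C library such as glibc.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file at `path` in bytes, or -1 if stat() fails. */
static long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/*
 * Hypothetical helper: opens `path`, selects a buffering mode, writes
 * "abc" and then "\n", and records the on-disk size after each write.
 * `mode` is _IONBF, _IOLBF or _IOFBF. Returns 0 on success, -1 on error.
 */
int probe_mode(const char *path, int mode, long *after_text, long *after_newline)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    /* setvbuf() must be called before any other I/O on the stream */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }

    fputs("abc", fp);                  /* no newline yet */
    *after_text = file_size(path);     /* _IONBF: 3, _IOLBF: 0, _IOFBF: 0 */

    fputs("\n", fp);                   /* the newline completes the line */
    *after_newline = file_size(path);  /* _IONBF: 4, _IOLBF: 4, _IOFBF: 0 */

    fclose(fp);                        /* closing flushes whatever is left */
    return 0;
}
```

  Only the unbuffered stream writes "abc" immediately, the line-buffered stream waits for the \n, and the fully buffered stream writes nothing until it is closed.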

3. Who provides the buffer

3.1 Observing the phenomenon through code

  To get to the bottom of this, here is a program that shows a strange phenomenon. It prints three lines, two through C library functions and one through a system call. Printed to the display, the output is as expected. Redirected to a file with ./test > log.txt, it is also as expected.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main()
{
	// C library functions
	printf("hello printf!\n");
	fprintf(stdout, "hello fprintf!\n");
	// system call
	const char* str = "hello write!\n";
	write(1, str, strlen(str));
	return 0;
}
  Now modify the code by adding a fork at the end. The fork comes after all three printing statements, so it should not affect them. Printed to the display, the output is still as expected, but something goes wrong when we redirect it to log.txt.
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main()
{
	// C library functions
	printf("hello printf!\n");
	fprintf(stdout, "hello fprintf!\n");
	// system call
	const char* str = "hello write!\n";
	write(1, str, strlen(str));
	fork(); // added: creates a child process
	return 0;
}

3.2 Result analysis and cause analysis

Analysis of the results: when printing to the display, which is line buffered, the output is 3 lines. When redirected to a file, the buffering mode changes to full buffering, and in the file hello printf and hello fprintf each appear twice while hello write appears only once. From this we conclude: (1) whether or not output is redirected changes the buffering mode of the process; (2) output from the C library functions is printed twice, while output from the system call is printed once.

Reason analysis: the fork is at the end of the program, and since a program executes from top to bottom, all three printing functions have already run by the time fork executes. But has their output been flushed and actually shown? Not necessarily. When printing to the display, output is line buffered and every string ends with \n, so all three calls have both printed and flushed before the fork. When the program is redirected to log.txt, the mode becomes full buffering: the two C library functions have only written into the buffer, so hello printf and hello fprintf are still sitting there, unflushed, when fork runs. This buffer is a region of memory in the parent process that holds the process's own data. After the fork, both parent and child flush the buffer, because the buffer is flushed when a program exits. Processes are independent, so whichever of them flushes first, flushing modifies the buffer data and triggers copy-on-write, leaving each process with its own copy of the buffered text to write out. The C library output therefore appears twice, for two reasons: a. the buffer exists; b. copy-on-write occurs.
  Then why is the output of the system call write not printed twice? Because write has no such buffer: its data had already been handed to the operating system before the fork.
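
  The duplication can be reproduced in a form that is easy to measure. The sketch below is an illustration under assumptions, not the article's program: size_after_fork and file_size are hypothetical names, the output goes to a file opened with fopen instead of a redirected stdout, and the child flushes its copy explicitly with fclose and then calls _exit, which stands in for the flush that a normal exit would perform.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Size of the file at `path` in bytes, or -1 if stat() fails. */
static long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/*
 * Hypothetical helper: writes `line` to `path` once, forks, lets both
 * processes finish with the stream, and returns the final file size
 * (or -1 on error).
 *   use_stdio != 0: write with fputs() (buffered in user space)
 *   use_stdio == 0: write with write() (no user-space buffer)
 */
long size_after_fork(const char *path, const char *line, int use_stdio)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (use_stdio) {
        fputs(line, fp);                 /* stays in fp's buffer for now */
    } else if (write(fileno(fp), line, strlen(line)) < 0) {
        fclose(fp);
        return -1;
    }

    pid_t pid = fork();                  /* the child gets a copy of the buffer */
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        fclose(fp);                      /* child flushes its own copy */
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    fclose(fp);                          /* parent flushes its own copy */
    return file_size(path);
}
```

  With the 6-byte line "hello\n", the stdio variant should leave 12 bytes in the file (the line twice), while the write variant leaves 6 bytes, matching the behavior described above.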

3.3 Conclusion

  If this buffer were provided by the OS, all three interfaces would share it and every line would be printed twice. Since only the C library output is duplicated, the buffer must come with the C language, that is, with its standard library. So what we usually call "the buffer" is the one provided by the language. All buffers live in memory; the questions are who allocates that memory, and whether it sits in the kernel area or the user area. For this buffer, the C library allocates it, in the user area.
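
  To make the idea of a language-level buffer concrete, here is a heavily simplified, hypothetical stand-in for FILE. It is not how any real C library is implemented; the names MyFile, my_write, my_flush and MYBUF_SIZE are invented for this sketch. It only shows the principle: data is copied into a user-space array first, and the system call write is issued only on a flush or when the array fills up.

```c
#include <string.h>
#include <unistd.h>

#define MYBUF_SIZE 1024   /* arbitrary size chosen for this sketch */

/* Hypothetical, heavily simplified stand-in for FILE. */
typedef struct {
    int    fd;                /* underlying file descriptor */
    size_t used;              /* bytes currently held in buf */
    char   buf[MYBUF_SIZE];   /* the user-space buffer */
} MyFile;

/* Pushes everything held in the buffer to the OS with write(). */
int my_flush(MyFile *f)
{
    size_t off = 0;
    while (off < f->used) {
        ssize_t n = write(f->fd, f->buf + off, f->used - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    f->used = 0;
    return 0;
}

/* Copies data into the buffer; write() is only called when it fills up. */
int my_write(MyFile *f, const char *data, size_t len)
{
    while (len > 0) {
        size_t room  = MYBUF_SIZE - f->used;
        size_t chunk = len < room ? len : room;
        memcpy(f->buf + f->used, data, chunk);
        f->used += chunk;
        data    += chunk;
        len     -= chunk;
        if (f->used == MYBUF_SIZE && my_flush(f) != 0)
            return -1;
    }
    return 0;
}
```

  Because the array lives inside the process, a fork copies it along with everything else, which is why buffered text can be written out by both parent and child.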

4. Kernel buffer

  So far we have looked at the buffer provided by the C language, but it is not the only one. Memory is divided into a user area and a kernel area, and the operating system is the manager of software and hardware resources. The contents of the buffer inside FILE cannot be flushed directly to the disk or display; they must go through the operating system. The kernel area therefore has its own OS-level buffer with its own flushing mechanism, which writes the buffered contents out to peripherals such as the disk and display. In other words, the user-level buffer is flushed into the kernel buffer, and the kernel buffer is flushed to the device. As application programmers, we do not need to care about the kernel's flushing policy.
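
  For the rare cases where a program does want to push data through both layers, POSIX provides fsync. The following is a small sketch, assuming a POSIX system; flush_all_the_way is a hypothetical helper name, not something from the original article.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper that pushes data through both layers:
 *   1. fflush(): C library buffer (user area) -> kernel buffer
 *   2. fsync():  kernel buffer -> storage device
 * Returns 0 on success, -1 on error.
 */
int flush_all_the_way(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;
    if (fsync(fileno(fp)) != 0)
        return -1;
    return 0;
}
```

  Most programs only ever need the first step; the second is left to the kernel's own policy, as described above.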

Origin blog.csdn.net/weixin_43202123/article/details/121206929