Linux file read and write mechanism

01. Basic concepts

Cache

A cache is a component that reduces the average time a fast device needs to access a slow one. File reading and writing involves both memory and disk, and memory operations are far faster than disk operations. If every read or write call went directly to the disk, throughput would be limited and the disk's lifespan would be shortened, so the operating system caches data for both read and write operations to disk.

Page Cache

The page cache is a buffer between memory and files; in practice it is a region of memory. All file I/O (including network filesystems) interacts with the page cache first. The operating system uses a series of data structures, such as inode, address_space, and struct page, to map a file onto pages. To a large extent, optimizing file reads and writes means optimizing how the page cache is used.
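As an illustration, a process can ask the kernel which pages of a file are currently resident in the page cache. The following is a minimal sketch (my own example, not part of the original article) that uses mmap plus mincore to report residency for a file given on the command line.

// pagecache_residency.c - a sketch: report how many pages of a file are
// currently resident in the page cache, using mmap() + mincore().
// Build: gcc -o pagecache_residency pagecache_residency.c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    if (st.st_size == 0) { fprintf(stderr, "empty file\n"); return 1; }

    /* Map the file; the mapping is backed directly by the page cache. */
    void *addr = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    long page_size = sysconf(_SC_PAGESIZE);
    size_t pages = (st.st_size + page_size - 1) / page_size;
    unsigned char *vec = malloc(pages);

    /* mincore() sets bit 0 of vec[i] if page i is resident in memory. */
    if (mincore(addr, st.st_size, vec) < 0) { perror("mincore"); return 1; }

    size_t resident = 0;
    for (size_t i = 0; i < pages; i++)
        if (vec[i] & 1)
            resident++;

    printf("%zu of %zu pages resident in page cache\n", resident, pages);

    free(vec);
    munmap(addr, st.st_size);
    close(fd);
    return 0;
}

Reading the file first (for example with cat) and then rerunning the program shows the residency count climbing as pages are pulled into the cache.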

Dirty Page

Each page in the page cache corresponds to a region of a file. If the content of a cached page differs from the corresponding region on disk, that page is called a dirty page. Modifying a cached page, or creating a new one, produces dirty pages until they are flushed to disk.
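To make this concrete, here is a minimal sketch of my own (the file name demo.dat is hypothetical) that dirties page-cache pages by writing through a shared mapping and then flushes them with msync; between the write and the flush, the modified pages show up in the Dirty counter of /proc/meminfo.

// dirty_pages.c - a sketch: writing through a MAP_SHARED mapping modifies
// pages in the page cache, marking them dirty until they are written back.
// The file name "demo.dat" is just an example.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    const size_t len = 4 * 1024 * 1024;          /* 4 MiB */

    int fd = open("demo.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, len) < 0) { perror("ftruncate"); return 1; }

    /* MAP_SHARED maps the file's page-cache pages into this process. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Writing through the mapping dirties the underlying cache pages;
       "grep Dirty /proc/meminfo" should grow by roughly 4 MiB until the
       kernel's periodic writeback kicks in. */
    memset(p, 'x', len);
    puts("pages dirtied - check /proc/meminfo, then press Enter");
    getchar();

    /* msync(MS_SYNC) forces the dirty pages to be written back now. */
    if (msync(p, len, MS_SYNC) < 0) { perror("msync"); return 1; }
    puts("dirty pages written back");

    munmap(p, len);
    close(fd);
    return 0;
}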

The current Cached and Dirty values can be checked via /proc/meminfo:

[root@ufo130 ~]# cat /proc/meminfo | grep -E '^Cached|^Dirty'


Several kernel parameters influence how the operating system writes dirty pages back to disk:

[root@ufo130 ~]# sysctl -a 2>/dev/null | grep dirty
vm.dirty_background_ratio = 5
vm.dirty_background_bytes = 0
vm.dirty_ratio = 10
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000
  • vm.dirty_background_ratio is the percentage of memory that dirty pages may occupy before the kernel's background writeback starts flushing them to disk (vm.dirty_background_bytes is the same limit expressed as a byte count rather than a percentage)

  • vm.dirty_ratio is a hard limit: the percentage of memory occupied by dirty data may not exceed this value. Once it is reached, new I/O requests are blocked until some dirty data has been written to disk (vm.dirty_bytes is the byte-count equivalent)

  • vm.dirty_writeback_centisecs specifies how often dirty data is written back, in hundredths of a second; the value 500 above means every 5 seconds

  • vm.dirty_expire_centisecs specifies how long dirty data may remain in memory, in hundredths of a second. The value 3000 above corresponds to 30 seconds: when writeback runs, any dirty data that has been in memory longer than 30 seconds is written back to disk

  • These parameters can be changed with commands such as sudo sysctl -w vm.dirty_background_ratio=5, which requires root privileges; equivalently, as root, echo 5 > /proc/sys/vm/dirty_background_ratio

02. File read and write process

With the concepts of page cache and dirty pages in place, let's walk through how files are read and written.

Read file

  • The user initiates a read operation
  • The operating system looks up the page cache
  • On a miss, a page fault is raised; the kernel creates a new page-cache page and reads the corresponding data from disk to fill it
  • On a hit, the requested content is returned directly from the page cache
  • The user's read call completes
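The following sketch (the 64 KiB buffer size is my own choice) performs an ordinary buffered read along this path: the first run over a file is served from disk through the page cache, while repeated runs are typically served from the cache alone.

// buffered_read.c - a sketch of the read path above: read() always goes
// through the page cache; a second run over the same file is usually
// served entirely from memory.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char buf[64 * 1024];            /* 64 KiB per system call */
    long long total = 0;
    ssize_t n;

    /* Each read() copies data from the page cache into buf; on a cache
       miss the kernel first fills the cache from disk. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;

    if (n < 0) perror("read");
    printf("read %lld bytes\n", total);

    close(fd);
    return 0;
}

One way to see the cache effect is to time two consecutive runs, or to drop the caches in between (echo 3 > /proc/sys/vm/drop_caches as root) and compare.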

Write file

  • The user initiates a write operation
  • The operating system looks up the page cache
  • On a miss, a page fault is raised; the kernel creates a new page-cache page and writes the user-supplied data into it
  • On a hit, the user-supplied data is written directly into the existing page-cache page
  • The user's write call completes
  • Pages become dirty once modified; the operating system has two mechanisms for writing dirty pages back to disk
  • The user explicitly calls fsync() (see the sketch after this list)
  • The kernel's writeback threads (pdflush in older kernels) periodically flush dirty pages to disk
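As an illustration of the first mechanism, the sketch below (the file name out.log is hypothetical) writes data through the page cache and then forces it to disk with fsync; without the fsync, the data would sit in dirty pages until the periodic writeback picks it up.

// write_fsync.c - a sketch of the write path above: write() copies data
// into the page cache and returns; fsync() forces the resulting dirty
// pages (and file metadata) out to disk.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("out.log", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char *msg = "hello page cache\n";

    /* Returns as soon as the data is in the page cache - the page is now
       dirty, but nothing has necessarily reached the disk yet. */
    if (write(fd, msg, strlen(msg)) < 0) { perror("write"); return 1; }

    /* Block until the dirty pages for this file are written to disk.
       fdatasync() is a cheaper alternative when file metadata such as
       the modification time does not need to be persisted. */
    if (fsync(fd) < 0) { perror("fsync"); return 1; }

    close(fd);
    return 0;
}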

There is a mapping between page-cache pages and on-disk file data, and this mapping is maintained by the operating system. Reads and writes of the page cache happen in kernel mode and are transparent to user programs.

03. Optimization ideas for file reading and writing

Different optimizations suit different usage scenarios, depending on factors such as file size and how often data is read or written. Tuning system parameters is not considered here: changing them always trades one property for another, and the right balance depends heavily on the business, for example whether strong data consistency is required or whether some data loss can be tolerated. The optimization ideas fall into the following two categories.

  • Maximize use of the page cache

  • Reduce the number of system calls

The first point is straightforward: the more I/O operations that hit the page cache, the faster they are compared with going to disk. The system calls in the second point are mainly read and write; every system call switches from user mode to kernel mode, and some also copy data between user and kernel memory, so reducing the number of calls can improve performance in some scenarios.
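As one example of the second idea, buffering writes in user space turns many small write() calls into a few large ones; this is the kind of pattern that stdio's fwrite implements internally. The sketch below (buffer size, record format, and the file name records.log are my own choices) shows it directly.

// buffered_writes.c - a sketch of reducing system calls: collect many
// small records in a user-space buffer and issue one write() per 64 KiB
// instead of one write() per record.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define BUF_SIZE (64 * 1024)

static char buf[BUF_SIZE];
static size_t used;

/* Flush the user-space buffer with as few write() calls as possible. */
static int flush_buf(int fd)
{
    size_t off = 0;
    while (off < used) {
        ssize_t n = write(fd, buf + off, used - off);
        if (n < 0) { perror("write"); return -1; }
        off += n;
    }
    used = 0;
    return 0;
}

/* Append a record, flushing only when the buffer is full. */
static int append(int fd, const char *rec, size_t len)
{
    if (used + len > BUF_SIZE && flush_buf(fd) < 0)
        return -1;
    memcpy(buf + used, rec, len);
    used += len;
    return 0;
}

int main(void)
{
    int fd = open("records.log", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    char rec[64];
    for (int i = 0; i < 100000; i++) {
        int len = snprintf(rec, sizeof(rec), "record %d\n", i);
        if (append(fd, rec, (size_t)len) < 0) return 1;
    }

    if (flush_buf(fd) < 0) return 1;   /* flush the remaining tail */
    close(fd);
    return 0;
}

Compared with calling write() once per record (100,000 system calls), this issues only a handful, which can be confirmed with strace -c.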


Source: blog.csdn.net/qq_42226855/article/details/113060602