Linux: ensuring data is safely written to disk

In many IO scenarios we need to be sure that data has really been written to disk, so that it can still be read back after a crash or reboot. But the Linux IO path is complex and divided into many layers, and each layer may keep its own buffer or cache to speed up reads and writes. Applications and user-space library functions may also have their own buffers, which makes the path even more complex. Clearly, calling write or fwrite is not enough to guarantee that data has reached the disk.
So how do we do it? Most people can name several tools: fflush(), fsync(), fdatasync(), sync(), or open() with the O_DIRECT or O_SYNC flags. These tools (or some combination of them) can indeed guarantee that data is persisted, but how do they differ? What is the difference between fflush() and fsync()? What does O_DIRECT mean, and does it guarantee persistence? How do O_DIRECT and O_SYNC differ? How does O_SYNC relate to fsync()? Can fsync() do the job of msync()? This article tries to explain what each of these does and how they differ.
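To make the layering concrete, here is a minimal sketch in C of pushing one record through each buffer in turn. The helper name append_record_durably is hypothetical and used only for illustration; the calls themselves (fputs, fflush, fileno, fsync) are the standard C and POSIX ones.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the article): append a record through stdio
 * and push it down to stable storage. Returns 0 on success, -1 on failure. */
int append_record_durably(const char *path, const char *record)
{
    assert(path != NULL && record != NULL);

    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    /* 1. fputs() only copies the record into the stdio buffer, which lives
     *    inside this process. */
    if (fputs(record, fp) == EOF)
        goto fail;

    /* 2. fflush() hands the stdio buffer to the kernel with write(). The data
     *    is now in the page cache, but not necessarily on disk. */
    if (fflush(fp) != 0)
        goto fail;

    /* 3. fsync() asks the kernel to write the file's dirty pages and metadata
     *    to the storage device and waits until that has completed. */
    if (fsync(fileno(fp)) != 0)
        goto fail;

    return fclose(fp) == 0 ? 0 : -1;

fail:
    fclose(fp);
    return -1;
}
```

A call such as append_record_durably("/tmp/demo.log", "hello\n") returns 0 only if all three steps succeeded. Skipping step 2 would leave the record in the process's own buffer, so the fsync() in step 3 would have nothing new to flush.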

Linux IO

As the saying goes, a picture is worth a thousand words. To make the distinctions between these concepts clear, I drew a diagram. Look at it carefully and you should be able to see the role of each mechanism and how they differ.
[Figure: diagram of the Linux IO path]
Let us focus on O_DIRECT and O_SYNC. First, be clear that O_DIRECT only means the data does not go through the page cache (it is typically used by applications that manage their own buffers in user space) and is submitted directly to the block device layer. It does not mean the call waits until the data is safely on disk before returning; the block layer or the disk itself may still hold the data in a cache. With O_SYNC, on the other hand, the data is still written to the page cache, but with a write-through policy: the call returns only after the data has been safely written to disk. If you use O_DIRECT and O_SYNC together, the data bypasses the page cache and the call waits until it is safely on disk before returning. Naturally, IO performance in this mode is very low.
Because O_DIRECT bypasses the page cache, another process that reads the same file in the ordinary buffered way may see inconsistent data. This also needs attention.
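As an illustration of combining the two flags, the following C sketch writes a single block with O_DIRECT | O_SYNC. The helper name write_block_direct_sync and the 4096-byte alignment are assumptions made for this example: O_DIRECT generally requires the buffer, offset, and length to be aligned, but the exact requirement depends on the device and filesystem, and some filesystems reject O_DIRECT altogether (open() then fails with EINVAL).

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment and transfer size for this sketch. */
#define DIO_ALIGN 4096

/* Hypothetical helper: write one zero-padded block that bypasses the page
 * cache (O_DIRECT) and returns only once the data is on disk (O_SYNC).
 * Returns 0 on success or a negative errno value. */
int write_block_direct_sync(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    /* O_DIRECT needs an aligned user buffer, so malloc() is not enough. */
    void *buf = NULL;
    int rc = posix_memalign(&buf, DIO_ALIGN, DIO_ALIGN);
    if (rc != 0)
        return -rc;
    memset(buf, 0, DIO_ALIGN);
    strncpy(buf, text, DIO_ALIGN - 1);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT | O_SYNC, 0644);
    if (fd < 0) {
        rc = -errno;           /* e.g. -EINVAL if O_DIRECT is unsupported */
        free(buf);
        return rc;
    }

    /* The length is a multiple of the alignment, as O_DIRECT expects. */
    ssize_t n = write(fd, buf, DIO_ALIGN);
    if (n == DIO_ALIGN)
        rc = 0;
    else
        rc = (n < 0) ? -errno : -EIO;

    free(buf);
    if (close(fd) != 0 && rc == 0)
        rc = -errno;
    return rc;
}
```

Dropping O_SYNC from the open() flags would still bypass the page cache, but the write() could then return before the data is durable, which is the distinction the text draws.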
To help with further reading, here are some of the materials I consulted while exploring this. The first is the man page for the open system call, which describes the relevant flags:
http://man7.org/linux/man-pages/man2/open.2.html
Material related to InnoDB:
https://lwn.net/Articles/457667/
The difference between fsync and fdatasync:
http://man7.org/linux/man-pages/man2/fsync.2.html
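According to the fsync(2) man page, fdatasync() is like fsync() but does not flush modified metadata unless that metadata is needed to read the data back correctly (a changed file size is flushed, a changed timestamp need not be). The C sketch below shows the two calls side by side. The helper name write_and_flush and its data_only parameter are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes at the given offset, then flush.
 *   data_only != 0: fdatasync(), flushes the data plus only the metadata
 *                   needed to read it back (for example a new file size).
 *   data_only == 0: fsync(), also flushes the remaining metadata such as
 *                   timestamps.
 * Returns 0 on success, -1 on failure. */
int write_and_flush(const char *path, const void *buf, size_t len,
                    off_t offset, int data_only)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* pwrite() puts the data in the page cache at an explicit offset. */
    ssize_t n = pwrite(fd, buf, len, offset);
    int rc = (n == (ssize_t)len) ? 0 : -1;

    if (rc == 0)
        rc = data_only ? fdatasync(fd) : fsync(fd);

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

The saving from fdatasync() is most likely to show up when overwriting a region of a file whose size does not change, since then no size update has to be written out with the data.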
msync:
http://man7.org/linux/man-pages/man2/msync.2.html
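For files modified through a shared memory mapping instead of write(), msync() is the call that flushes the changes. A minimal C sketch, assuming a hypothetical helper named mmap_write_durably that maps a freshly sized file, copies a string into it, and flushes with MS_SYNC:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: store text in a file through a shared mapping and
 * flush the mapping with msync(). Returns 0 on success, -1 on failure. */
int mmap_write_durably(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    size_t len = strlen(text);
    if (len == 0)
        return -1;                      /* mmap() rejects zero-length maps */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int rc = -1;
    /* The file must be at least as large as the region we map and touch. */
    if (ftruncate(fd, (off_t)len) == 0) {
        void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* Stores into the mapping dirty pages in the page cache. */
            memcpy(map, text, len);

            /* MS_SYNC blocks until those dirty pages have been written back. */
            rc = msync(map, len, MS_SYNC);

            if (munmap(map, len) != 0)
                rc = -1;
        }
    }

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

MAP_SHARED matters here: with MAP_PRIVATE the stores would stay in a private copy and never reach the file.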

DAX

There is in fact one more IO mode, DAX (Direct Access), which looks rather similar to O_DIRECT. This mode requires support from both the filesystem and the block device driver. It is mainly used with non-volatile memory, and in essence it also bypasses the page cache and operates on the device directly. This article does not discuss DAX in depth. Later I will write a ramdisk block device driver that supports DAX, format it as an ext4 filesystem, mount it with -o dax, and then study the DAX IO path in detail.
Finally, here is a trace of the Linux IO path under common scenarios:
https://my.oschina.net/fileoptions/blog/3061822



Origin blog.51cto.com/14414295/2425793