在学校里上操作系统课的时候学到过这个，但现在除了“页面缓存”这个概念其他什么的都不记得了。这次重新学习，想要搞清楚这到底是个什么东东，还是源于在看MetaQ的文档里提到了Page Cache。

这里我就特别好奇于“PAGECACHE的作用”，于是就做了一番深入学习。

于是先在谷歌上搜了一下，关于这方面的中文资料真的不多。

大概是这么说的：

1.Page cache是vfs文件系统层的cache，例如对于一个ext3文件系统而言，每个文件都会有一棵radix树管理文件的缓存页，这些被管理的缓存页被称之为page cache。所以，page cache是针对文件系统而言的。例如，ext3文件系统的页缓存就是page cache。Buffer cache是针对设备的，每个设备都会有一棵radix树管理数据缓存块，这些缓存块被称之为buffer cache。通常对于ext3文件系统而言，page cache的大小为4KB，所以ext3每次操作的数据块大小都是4KB的整数倍。Buffer cache的缓存块大小通常由块设备的大小来决定，取值范围在512B~4KB之间，取块设备大小的最大公约数。

简单来说：page cache是对文件数据的缓存；buffer cache是对设备数据的缓存。

2.磁盘的操作有逻辑级（文件系统）和物理级（磁盘块），这两种Cache就是分别缓存逻辑和物理级数据的。假设我们通过文件系统操作文件，那么文件将被缓存到Page Cache，如果需要刷新文件的时候，Page Cache将交给Buffer Cache去完成，因为Buffer Cache就是缓存磁盘块的。也就是说，直接去操作文件，那就是Page Cache区缓存，用dd等命令直接操作磁盘块，就是Buffer Cache缓存的东西。

扫描二维码关注公众号，回复： 1132129 查看本文章

之后看了英文的文档，发现上面的说法有点不恰当或者过时了。为了更好的帮助自己记忆理解，简单翻译下：

Linux Page Cache Basics

Under Linux, the Page Cache accelerates many accesses to files on non volatile storage. This happens because, when it first reads from or writes to data media like hard drives, Linux also stores data in unused areas of memory, which acts as a cache. If this data is read again later, it can be quickly read from this cache in memory. This article will supply valuable background information about this page cache.

在Linux下，Page Cache可以加速在稳定存储器上（volatile-不稳定的）对文件的访问。这是因为，当第一次从数据媒体如硬盘驱动器读或者写数据时，Linux也会在空闲的内存区域存储该数据，充当一个缓存。如果之后该数据被再次读取，它可以迅速地从内存中的这个缓存中被读取。这篇文章将会提供关于Page cache的背景信息。

Page Cache or Buffer Cache

The term, Buffer Cache, is often used for the Page Cache. Linux kernels up to version 2.2 had both a Page Cache as well as a Buffer Cache. As of the 2.4 kernel, these two caches have been combined. Today, there is only one cache, the Page Cache.^[1]

Buffer Cache这个术语通常用于Page Cache。2.2版本以上的Linu内核同时有Page Cache和Buffer Cache。到了2.4的kernel,这两个缓存被合并了。今天，只有一种缓存——Page Cache。

Functional Approach

Under Linux, the number of megabytes of main memory currently used for the page cache is indicated in the Cached column of the report produced by the free -m command.

在Linux中，通过free -m命令可以看到，物理主内存被Page Cache当前使用的容量，显示在报告中的Cached列。

[root@testserver ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         15976      15195        781          0        167       9153
-/+ buffers/cache:       5874      10102
Swap:         2000          0       1999
[root@testserver ~]#

Writing

If data is written, it is first written to the Page Cache and managed as one of its dirty pages. Dirty means that the data is stored in the Page Cache, but needs to be written to the underlying storage device first. The content of these dirty pages is periodically transferred (as well as with the system calls sync or fsync) to the underlying storage device. The system may, in this last instance, be a RAID controller or the hard disk directly.

如果数据被写入，这是第一次被写入Page Cache，并作为脏页面被管理。脏页面是指数据还存储在Page Cache，但是需要先被刷到底层设备。这些脏页面的内容将被定期转移（随着系统调用sync或者fsync函数）到底层设备。这个底层设备可能是磁盘阵列或者直接硬盘。

The following example will show the creation of a 10-megabyte file, which will first be written to the Page Cache. The number of dirty pages will be increased by doing so, until they have been written to the underlying solid-state drive (SSD), in this case manually in response to the sync command.

下面的例子将会展示10M大小的文件的创建，会被首先写入Page Cache。这样脏页面的数量就会增加，直到它们已经被写入底层的SSD。在这个例子里，手动调用sync命令来反映。

wfischer@pc:~$ dd if=/dev/zero of=testfile.txt bs=1M count=10

10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0,0121043 s, 866 MB/s
wfischer@pc:~$ cat /proc/meminfo | grep Dirty
Dirty:             10260 kB
wfischer@pc:~$ sync
wfischer@pc:~$ cat /proc/meminfo | grep Dirty
Dirty:                 0 kB

自己在虚拟机上测试了一下：

测试1

注：dd是Linux/UNIX 下的一个非常有用的命令，作用是用指定大小的块拷贝一个文件，并在拷贝的同时进行指定的转换。

因为之前已经敲过这条命令了，所以在第二次输了这条命令后，Page Cache的大小没有改变（应该是相同的文件不重复缓存），之后改变了一下文件名，Page Cache的大小又增加了10M。

测试2

这里就比较奇怪了，脏页面的大小怎么这么小呢？？！！个人猜测应该是Linux系统设置的原因——定期刷盘的频率太高。（这里留一个问题下次解决吧。）

注：cat /proc/meminfo这条命令可以查看所有的内存信息

Up to Version 2.6.31 of the Kernel: pdflush

Up to and including the 2.6.31 version of the Linux kernel, the pdflush threads ensured that dirty pages were periodically written to the underlying storage device.

As of Version 2.6.32: per-backing-device based writeback

Since pdflush had several performance disadvantages, Jens Axboe developed a new, more effective writeback mechanism for Linux Kernel version 2.6.32. ^[2]

This approach provides threads for each device, as the following example of a computer with an SSD (/dev/sda) and a hard disk (/dev/sdb) shows.

root@pc:~# ls -l /dev/sda
brw-rw---- 1 root disk 8, 0 2011-09-01 10:36 /dev/sda
root@pc:~# ls -l /dev/sdb
brw-rw---- 1 root disk 8, 16 2011-09-01 10:36 /dev/sdb
root@pc:~# ps -eaf | grep -i flush
root       935     2  0 10:36 ?        00:00:00 [flush-8:0]
root       936     2  0 10:36 ?        00:00:00 [flush-8:16]

这里懒得翻了。。。。

Reading

File blocks are written to the Page Cache not just during writing, but also when reading files. For example, when you read a 100-megabyte file twice, once after the other, the second access will be quicker, because the file blocks come directly from the Page Cache in memory and do not have to be read from the hard disk again. The following example shows that the size of the Page Cache has increased after a good, 200-megabytes video has been played.

文件块被写入Page Cache，不止在写文件时，也在读文件时。例如，当你两次读一个100M的文件时，

一次接着一次，第二次会更快，因为文件块直接来自内存中的Page Cache，不需要再次从硬盘中读取了。接下来的实例将会演示Page Cache的大小在一个200M的视频播放之后增加了。

user@adminpc:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          3884       1812       2071          0         60       1328
-/+ buffers/cache:        424       3459
Swap:         1956          0       1956
user@adminpc:~$ vlc video.avi
[...]
user@adminpc:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          3884       2056       1827          0         60       1566
-/+ buffers/cache:        429       3454
Swap:         1956          0       1956
user@adminpc:~$

If Linux needs more memory for normal applications than is currently available, areas of the Page Cache that are no longer in use will be automatically deleted.

如果Linux需要分配更多比目前可用的内存给普通的应用程序，那些不再使用的Page Cache的内存区域

将会被自动删除。

Optimizing the Page Cache

Automatically storing file blocks in the Page Cache is generally quite advantageous. Some data, such as log files or MySQL dump files, are often no longer needed once they have been written. Therefore, such file blocks often use space in the Page Cache unnecessarily. Other file blocks, whose storage would be more advantageous, might be prematurely deleted from the Page Cache by newer log files or MySQL.^[3]

Periodic execution of logrotate with gzip compression can help with log files, somewhat. When a 500-megabyte log file is compressed into 10 megabytes by logrotate and gzip, the original log file becomes invalid along with its cache space. 490 megabytes in the Page Cache will then become available by doing so. The danger of a continuously growing log file displacing more useful file blocks from the Page Cache is reduced thereby.

Therefore, it is completely reasonable, if some applications would normally not store certain files and file blocks in the cache. There is already such a patch available for rsync.^[4]

太晚了，明天还要上班，优化页面缓存这块内容有空再来翻译吧。

补充：

调用系统函数write时有写延迟，write负责把东西写到缓存区上sync负责把缓存区上的东西排到写队列中，在由守护进程负责把队列里的东西写到磁盘上，而sync函数在把缓存区上的东西排到写队列后不管写队列中的内容是否写到磁盘上都立即返回。 fsync函数则是对指定文件的操作，而且必须等到写队列中的内容都写到磁盘后才返回，并且更新文件inode结点里的内容。

参考：http://www.thomas-krenn.com/en/wiki/Linux_Page_Cache_Basics#cite_note-1

Page Cache学习

Linux Page Cache Basics

Page Cache or Buffer Cache

Functional Approach

Up to Version 2.6.31 of the Kernel: pdflush

As of Version 2.6.32: per-backing-device based writeback

猜你喜欢