Operating system memory states and PostgreSQL memory settings

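Adjusting the database configuration as the workload changes is a basic skill every database administrator needs, and a routine part of maintenance work.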

On FreeBSD, top reports memory in six states:

Wired: Wired pages are locked in memory and cannot be paged out. Typically, these pages are being used by the kernel or the physical-memory pager, or they have been locked down with mlock. In addition, all the pages being used to hold the user structure and thread stacks of loaded (i.e., not swapped-out) processes are also wired. Wired pages cannot be paged out.

Active: Active pages are being used by one or more regions of virtual memory. Although the kernel can page them out, doing so is likely to cause an active process to fault them back again.

Inactive: Inactive pages have contents that are still known, but they are not usually part of any active region. If the contents of the page are dirty, the contents must be written to backing store before the page can be reused. Once the page has been cleaned, it is moved to the cache list. If the system becomes short of memory, the pageout daemon may try to move active pages to the inactive list in the hopes of finding pages that are not really in use. The selection criteria that are used by the pageout daemon to select pages to move from the active list to the inactive list are described later in this section. When the free-memory and cache lists drop too low, the pageout daemon traverses the inactive list to create more cache and free pages.

Cache: Cache pages have contents that are still known, but they are not usually part of any active region. If they are mapped into an active region, they must be marked read-only so that any write attempt will cause them to be moved off the cache list. They are similar to inactive pages except that they are not dirty, either because they are unmodified, since they were paged in, or because they have been written to their backing store. They can be moved to the free list when needed.

Free: Free pages have no useful contents and will be used to fulfill new page-fault requests. The idle process attempts to keep about 75 percent of the pages on the free list zeroed so that they do not have to be zeroed while servicing an anonymous-region page fault. Pages with unknown contents are placed at the front of the free list. Zeroed pages are placed at the end of the free list. The idle process takes pages from the front of the free list, zeros them, marks them as having been zeroed, and puts them on the end of the free list. Page faults that will be filling a page take one from the front of the free list. Page faults that need a zero-filled page take one from the end of the free list. If marked as zeroed, it does not have to be zeroed during the fault service.

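Wired: used by the kernel; never evicted from physical memory.
Active: the data in this memory is in use or was used very recently.
Inactive: the data in this memory is still valid but has not been used recently.
Cache: clean pages, either already written back to disk or freshly paged in and unmodified; they can become free at any time.
Buf: the number of pages used for BIO-level disk buffering (dedicated to caching disk files).
Free: the data in this memory is invalid; the space can be used by programs at any time.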

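When free memory drops below a threshold (determined by the size of physical memory), the system reclaims inactive memory in the following order. First, if inactive data has been accessed recently, the system changes its state back to active and places it after the existing active memory. If inactive data has not been used recently but has been modified and not yet written to its corresponding backing store on disk, the system writes it out and then releases the physical memory as free for programs to use. If inactive data has not been modified since it was mapped to disk, it is released as free directly. Finally, if active memory goes unused for some time, its state is temporarily changed to inactive.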
　　
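So if your system shows a small amount of free memory and a large amount of inactive memory, you have enough memory and the system is running in its best state: it will use that memory whenever it is needed, so there is nothing to worry about. Conversely, if both free and inactive memory are low while active memory is high, you are short of memory. Right after boot, of course, most memory is free, and the system is actually not in its best state then, because much of the data still has to be read from disk, which is slower.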
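The rule of thumb above can be sketched in Python. This is only an illustration: the function names, the 10% and 50% thresholds, and the sample `Mem:` line are assumptions made up for the example, not values from FreeBSD or from the measurements in this post. Buf is left out of the total on the assumption that it overlaps the other states.

```python
import re

# Multipliers for the size suffixes that top prints (K, M, G, T).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def parse_top_mem(line):
    """Parse a top 'Mem:' line into a dict of bytes keyed by lowercase state name."""
    stats = {}
    for value, unit, name in re.findall(r"(\d+(?:\.\d+)?)([KMGT]?)\s+(\w+)", line):
        stats[name.lower()] = int(float(value) * UNITS.get(unit, 1))
    return stats

def memory_verdict(stats, low=0.10, high=0.50):
    """Apply the heuristic: little free+inactive and lots of active means a shortage.

    low and high are hypothetical thresholds chosen for illustration only.
    """
    # Buf is excluded here (assumption: it overlaps the other states).
    total = sum(stats.get(k, 0) for k in ("active", "inact", "wired", "cache", "free"))
    if total == 0:
        raise ValueError("no memory figures found")
    reclaimable = (stats.get("free", 0) + stats.get("inact", 0)) / total
    active = stats.get("active", 0) / total
    if reclaimable < low and active > high:
        return "short of memory"
    return "enough memory"

# Made-up sample line in the format top uses for its memory summary.
sample = "Mem: 2044M Active, 3724M Inact, 1456M Wired, 98M Cache, 826M Buf, 231M Free"
print(memory_verdict(parse_top_mem(sample)))
```

A single reading says little; the verdict is more meaningful when the same check is applied to samples collected over several hours.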

PostgreSQL:

shared_buffers
If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system. If you have less RAM you'll have to account more carefully for how much RAM the OS is taking up; closer to 15% is more typical there.

On Windows (and on PostgreSQL versions before 8.1), large values for shared_buffers aren't as effective, and you may find better results keeping it relatively low and using the OS cache more instead. On Windows the useful range is 64MB to 512MB.
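As a rough illustration of the starting values quoted above, the following Python sketch turns them into a small helper. The function name and the `windows` flag are hypothetical, and treating the 64MB to 512MB Windows range as a hard clamp is an assumption of this example; the result is a starting point to tune from, not a final setting.

```python
MB = 1024**2
GB = 1024**3

def suggest_shared_buffers(total_ram_bytes, windows=False):
    """Return a starting value for shared_buffers in bytes.

    1/4 of RAM when the machine has 1GB or more, about 15% otherwise.
    On Windows the value is clamped to 64MB..512MB (assumption: the
    quoted "useful range" is applied as a hard limit).
    """
    if total_ram_bytes >= GB:
        value = total_ram_bytes // 4
    else:
        value = total_ram_bytes * 15 // 100
    if windows:
        value = max(64 * MB, min(value, 512 * MB))
    return value

# Example: the 8GB server discussed in this post.
print(suggest_shared_buffers(8 * GB) // MB, "MB")
```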

effective_cache_size
Setting effective_cache_size to 1/2 of total memory would be a normal conservative setting, and 3/4 of memory is a more aggressive but still reasonable amount. A reload (SIGHUP) is enough to apply a change. Note: use half of total memory.

This is the effective amount of caching between the actual postgres buffers, and the OS buffers. If you are dedicating this machine to postgres, I would set it to something like 3.5G. If it is a mixed machine, then you have to think about it.

This does not change how postgres uses RAM; it changes how postgres estimates whether an Index scan will be cheaper than a Sequential scan, based on the likelihood that the data you want will already be cached in RAM.

A good value for effective_cache_size would be total memory minus what the OS and others need minus what private memory the PostgreSQL backends need. The latter can be estimated as work_mem times max_connections. (That is: total memory - memory needed by the operating system - work_mem × max_connections.)
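The formula above is easy to work through in Python. The function name and the figures in the example (1GB for the OS and other programs, work_mem of 4MB, max_connections of 100) are assumptions for illustration only; substitute the values from your own system and postgresql.conf.

```python
MB = 1024**2
GB = 1024**3

def estimate_effective_cache_size(total, os_and_others, work_mem, max_connections):
    """total memory - OS and other needs - (work_mem * max_connections), in bytes."""
    backends = work_mem * max_connections
    # Never return a negative size if the inputs exceed total memory.
    return max(0, total - os_and_others - backends)

# Hypothetical figures for an 8GB machine.
size = estimate_effective_cache_size(
    total=8 * GB,
    os_and_others=1 * GB,
    work_mem=4 * MB,
    max_connections=100,
)
print(f"effective_cache_size = {size // MB}MB")
```

With these example inputs the backends account for 400MB, leaving 8192 - 1024 - 400 = 6768MB.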

Some scripts

List the tables in the buffer cache, along with the modified ('dirty') buffers (requires the pg_buffercache extension)

-- Per-relation buffer usage in the current database, with dirty-buffer counts.
SELECT datname, relname, buffers, dirty_buffers, percent,
       dirty_buffers > 0 AS has_dirty
FROM (
    SELECT d.datname,
           c.relname,
           count(*) AS buffers,
           sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty_buffers,
           -- share of this relation's cached buffers that are dirty, in percent
           round(sum(CASE WHEN b.isdirty THEN 1.0 ELSE 0.0 END) * 100 / count(*), 1) AS percent
    FROM pg_buffercache b
    JOIN pg_class c
      ON b.relfilenode = pg_relation_filenode(c.oid)
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
    LEFT JOIN pg_database d
      ON b.reldatabase = d.oid
    -- WHERE c.relname LIKE '%examuser%'
    WHERE c.relname NOT LIKE 'pg\_%'   -- skip system catalogs; "_" is escaped because it is a LIKE wildcard
    GROUP BY 1, 2
    -- HAVING count(*) > 10
) s
-- relations with dirty buffers first, then by combined size and dirty share
ORDER BY percent > 0 DESC,
         buffers * buffers + percent * percent DESC
-- LIMIT 25
;

Manually load an entire table into the buffer cache (requires the pg_prewarm extension)

psql  -c "select pg_prewarm('tableXXX')" postgres

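Below is the memory usage of one of my servers with 8GB of RAM, sampled every 2 minutes with Python and, after 5 hours, plotted on a web page with the JSXGraph library: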

[Figure: operating system memory states and PostgreSQL memory settings]
