Oracle: Huge Pages and Transparent Huge Pages

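Preface

Linux has two kinds of large pages: Huge Pages (standard huge pages) and Transparent Huge Pages.

Memory is managed in blocks called pages. On most current systems the default page size is 4096 bytes (4 KB), so 1 MB of memory is 256 pages and 1 GB of memory is 262,144 pages.

The CPU has a built-in memory management unit (MMU) that holds a list of these pages, each of which is referenced through a page table entry. As memory grows, the cost of managing these pages grows with it, and that affects operating system performance.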
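To make the page counts above concrete, here is a minimal sketch of the arithmetic. The helper name pages_for_bytes is hypothetical and exists only for this illustration; getconf PAGESIZE is the standard way to ask a system for its base page size.

```shell
# pages_for_bytes BYTES PAGE_BYTES
# Hypothetical helper: number of pages needed to cover BYTES, rounded up.
pages_for_bytes() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Base page size of the machine running this snippet, in bytes.
getconf PAGESIZE

pages_for_bytes $((1024 * 1024)) 4096                      # 1 MB in 4 KB pages: 256
pages_for_bytes $((1024 * 1024 * 1024)) 4096               # 1 GB in 4 KB pages: 262144
pages_for_bytes $((1024 * 1024 * 1024)) $((2048 * 1024))   # 1 GB in 2 MB pages: 512
```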

 

Huge Pages

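Huge pages were introduced in the Linux 2.6 kernel. They replace the traditional 4 KB pages with much larger ones, to keep up with ever-growing system memory and to let the operating system use the large-page capability of modern hardware architectures.

Huge pages come in two sizes, 2 MB and 1 GB. 2 MB pages suit memory measured in gigabytes, and 1 GB pages suit terabyte-scale memory; 2 MB is the default huge page size.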

 

Transparent Huge Pages

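Transparent Huge Pages, abbreviated THP, is a feature introduced in RHEL 6, and on these 6.x releases it is enabled by default.

Because huge pages are hard to manage by hand and often require significant code changes to be used effectively, RHEL 6 introduced Transparent Huge Pages (THP), an abstraction layer that automatically creates, manages, and uses huge pages.

THP hides much of the complexity of using huge pages from system administrators and developers. Because the goal of THP is to improve performance, its developers (from both the community and Red Hat) have tested and optimized THP across a wide range of systems, configurations, applications, and workloads. This lets the default THP settings improve performance on most system configurations. However, THP is not recommended for database workloads.

The biggest difference between the two is that standard huge pages are preallocated, whereas transparent huge pages are allocated dynamically.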

 

 

 

Page size of standard huge pages

[oracle@we2db1 ~]$ grep Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB
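The same value can be picked up in a script. The sketch below is only an illustration: hugepage_kb is a hypothetical helper name, and the loop assumes a kernel that lists its supported huge page sizes under /sys/kernel/mm/hugepages, which may not be present everywhere.

```shell
# hugepage_kb [FILE]
# Hypothetical helper: print the default huge page size in kB, read from a
# meminfo-style file (defaults to /proc/meminfo).
hugepage_kb() {
    awk '/^Hugepagesize:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Report the default size if this machine exposes /proc/meminfo.
if [ -r /proc/meminfo ]; then
    echo "default huge page size: $(hugepage_kb) kB"
fi

# Kernels that support several huge page sizes expose one directory per size.
for dir in /sys/kernel/mm/hugepages/hugepages-*kB; do
    if [ -d "$dir" ]; then
        echo "supported size: ${dir##*/hugepages-}"
    fi
done
```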

 

Note: THP currently maps only anonymous memory regions, such as heap and stack space.

 

Oracle officially recommends Huge Pages, which offer the following benefits:

 

Larger Page Size and Less # of Pages: Default page size is 4K whereas the HugeTLB size is 2048K. That means the system would need to handle 512 times less pages.

Reduced Page Table Walking: Since a HugePage covers greater contiguous virtual address range than a regular sized page, a probability of getting a TLB hit per TLB entry with HugePages are higher than with regular pages. This reduces the number of times page tables are walked to obtain physical address from a virtual address.

Less Overhead for Memory Operations: On virtual memory systems (any modern OS) each memory operation is actually two abstract memory operations. With HugePages, since there are less number of pages to work on, the possible bottleneck on page table access is clearly avoided.

Less Memory Usage: From the Oracle Database perspective, with HugePages, the Linux kernel will use less memory to create pagetables to maintain virtual to physical mappings for SGA address range, in comparison to regular size pages. This makes more memory to be available for process-private computations or PGA usage.

No Swapping: We must avoid swapping to happen on Linux OS at all (Document 1295478.1). HugePages are not swappable (whereas regular pages are). Therefore there is no page replacement mechanism overhead. HugePages are universally regarded as pinned.

No 'kswapd' Operations: kswapd will get very busy if there is a very large area to be paged (i.e. 13 million page table entries for 50GB memory) and will use an incredible amount of CPU resource. When HugePages are used, kswapd is not involved in managing them. See also Document 361670.1.
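The "13 million page table entries for 50GB" figure and the memory saving can be reproduced with a rough estimate. This is a sketch under stated assumptions, not a measurement: it counts only leaf page table entries, assumes 8-byte entries as on x86_64, and the process count of 100 is a made-up example. Both function names are hypothetical.

```shell
# pte_count SGA_GB PAGE_KB
# Hypothetical helper: number of leaf page table entries needed to map the SGA.
pte_count() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

# pagetable_kb SGA_GB PAGE_KB NPROC
# Rough page table size in kB, assuming 8-byte entries and that each of the
# NPROC processes attached to the SGA keeps its own copy of the mappings.
pagetable_kb() {
    entries=$(pte_count "$1" "$2")
    echo $(( entries * 8 / 1024 * $3 ))
}

pte_count 50 4             # 4 KB pages: 13107200 entries
pte_count 50 2048          # 2 MB pages: 25600 entries
pagetable_kb 50 4 100      # 10240000 kB with 100 processes
pagetable_kb 50 2048 100   # 20000 kB with 100 processes
```

The ratio between the two page sizes is the factor of 512 mentioned in the first benefit.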

 

Huge Pages also have some drawbacks:

First, enabling them requires extra configuration.

Second, Huge Pages are incompatible with AMM (Automatic Memory Management), a new feature in Oracle 11g; ASMM (Automatic Shared Memory Management) can still be used.

 

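Although Oracle recommends Huge Pages, it advises disabling Transparent Huge Pages because of the following problems:

1. In RAC environments, Transparent Huge Pages can cause unexpected node reboots and performance problems.

2. In single-instance environments, Transparent Huge Pages can also cause unexpected performance problems.

Note: Transparent Huge Pages are not supported on 32-bit RHEL 6.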

 

 

Disabling Transparent Huge Pages

Check whether transparent huge pages are enabled:

[root@we2db2 ~]# cat /etc/issue
Red Hat Enterprise Linux Server release 6.9 (Santiago)
Kernel \r on an \m

[root@we2db2 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

The value in square brackets is the active setting: [always] means THP is enabled, [never] means it is disabled, and [madvise] means THP is used only in VMAs marked with the MADV_HUGEPAGE flag.
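For scripted checks, the active setting can be pulled out of that line by extracting the bracketed word. A minimal sketch, where thp_mode is a hypothetical helper name:

```shell
# thp_mode
# Hypothetical helper: read the contents of the THP "enabled" file on stdin
# and print only the word enclosed in square brackets.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

echo "always madvise [never]" | thp_mode    # prints: never

# On a live system, feed it the real control file if it exists.
if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
fi
```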

In this environment THP is already disabled. If it is not, disable it by adding transparent_hugepage=never to the kernel line in /etc/grub.conf and rebooting:

[root@we2db2 ~]# more /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition.  This means that
#         all kernel and initrd paths are relative to /boot/, eg.
#         root (hd0,0)
#         kernel /vmlinuz-version ro root=/dev/mapper/rootvg-lv_root
#         initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux 6 (2.6.32-696.20.1.el6.x86_64)
       root (hd0,0)
       kernel /vmlinuz-2.6.32-696.20.1.el6.x86_64 ro root=/dev/mapper/rootvg-lv_root transparent_hugepage=never rd_NO_LUKS.UTF-8 rd_LVM_LV=rootvg/lv_swap rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_LVM_LV=rootvg/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
       initrd /initramfs-2.6.32-696.20.1.el6.x86_64.img
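The boot parameter is the persistent fix. As a stopgap until the next reboot, THP can usually also be switched off at runtime by writing never to the sysfs control file as root. The sketch below is an assumption-laden illustration: disable_thp is a hypothetical helper, and it checks two directory names because some Red Hat 6 kernels are reported to expose the control under a redhat_ prefix. The demonstration runs against a throwaway directory rather than the real /sys tree.

```shell
# disable_thp [ROOT]
# Hypothetical helper: write "never" to every THP "enabled" control file found
# under ROOT (defaults to /sys/kernel/mm). Needs root on a real system.
disable_thp() {
    root=${1:-/sys/kernel/mm}
    for name in transparent_hugepage redhat_transparent_hugepage; do
        ctl="$root/$name/enabled"
        if [ -w "$ctl" ]; then
            echo never > "$ctl"
        fi
    done
}

# Demonstration against a mock directory tree, so nothing on the host changes.
demo=$(mktemp -d)
mkdir -p "$demo/transparent_hugepage"
echo "[always] madvise never" > "$demo/transparent_hugepage/enabled"
disable_thp "$demo"
cat "$demo/transparent_hugepage/enabled"    # the mock file now contains: never
rm -rf "$demo"
```

On a real system the call would simply be disable_thp with no argument, and reading the file back would show always madvise [never].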

 

 

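Configuring Huge Pages

1. Add the following parameters to /etc/security/limits.conf

*  soft   memlock    8193024
*  hard   memlock    8193024

The test environment here has 8 GB of memory. The memlock value (in KB) only needs to be slightly smaller than physical memory; setting it higher than the SGA requires does no harm.

[root@we2db1 ~]# cat /etc/security/limits.conf

*  soft   memlock    8193024
*  hard   memlock    8193024

2. Log in to the server again and verify the setting

[root@we2db1 ~]# ulimit -l
8193024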
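What matters for the memlock limit is that it is at least as large as the huge page pool the instance has to lock. A minimal sketch of that check follows; memlock_covers_pool is a hypothetical helper, and the figures of 1204 pages of 2048 kB are the ones this walkthrough arrives at in the later steps.

```shell
# memlock_covers_pool LIMIT_KB NR_HUGEPAGES HUGEPAGE_KB
# Hypothetical helper: succeed when the memlock limit (as printed by
# `ulimit -l`, in kB or "unlimited") is at least the size of the pool.
memlock_covers_pool() {
    limit_kb=$1
    pool_kb=$(( $2 * $3 ))
    if [ "$limit_kb" = "unlimited" ]; then
        return 0
    fi
    [ "$limit_kb" -ge "$pool_kb" ]
}

# 1204 pages * 2048 kB = 2465792 kB, well below the 8193024 kB limit.
if memlock_covers_pool 8193024 1204 2048; then
    echo "memlock is large enough"
else
    echo "memlock is too small"
fi
```

In a login session of the oracle user, the first argument would be "$(ulimit -l)" instead of a literal.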

3. In Oracle 11g, disable Automatic Memory Management (AMM), i.e. MEMORY_TARGET and MEMORY_MAX_TARGET must both be 0

SQL> show parameter MEMORY_TARGET

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
memory_target                        big integer 0

SQL> show parameter MEMORY_MAX_TARGET

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
memory_max_target                    big integer 0

AMM is not enabled in this test environment. If it is enabled, disable it with the following statements:

alter system set MEMORY_TARGET=0 scope=spfile;
alter system set MEMORY_MAX_TARGET=0 scope=spfile;

The change takes effect after the instance is restarted.

 

 

4. Make sure every instance that needs HugePages is running (including ASM instances), then run hugepages_settings.sh (the script is taken from Doc ID 401749.1)

The content of Oracle Doc ID 401749.1 follows.

APPLIES TO:

Oracle Database - Enterprise Edition
Linux OS - Version Oracle Linux 4.4 to Oracle Linux 7.5 with Unbreakable Enterprise Kernel [4.14.35] [Release OL4U4 to OL7U5]
Generic Linux

PURPOSE

This script is intended to compute values for the recommended HugePages/HugeTLB configuration for the current shared memory segments on Oracle Linux systems.

It does calculation for all shared memory segments available when the script is run, no matter it is an Oracle RDBMS shared memory segment or not.

For general information about HugePages / HugeTLB, please see Note 361323.1

REQUIREMENTS

  • Oracle Database instance(s) are up and running
  • Oracle Database 11g Automatic Memory Management (AMM) is not setup  (See Note 749851.1)
  • The shared memory segments can be listed by command "ipcs -m"
  • Oracle Linux
  • Package 'bc' installed

CONFIGURING

  1. Create a text file named hugepages_settings.sh
  2. Copy the contents below in the file
  3. Run:
    $ chmod +x hugepages_settings.sh

INSTRUCTIONS

  1. Be sure that all applications that are meant to use HugePage / HugeTLB are running at the time the script is to be run. This includes the Oracle RDBMS instances and ASM instances in addition to other applications.
  2. Be sure that you have /bin and /usr/bin in $PATH
  3. Run:
    $ ./hugepages_settings.sh

CAUTION

This sample code is provided for educational purposes only, and is not supported by Oracle Support. It has been tested internally, however, we do not guarantee that it will work for you. Ensure that you run it in your test environment before using.

SCRIPT

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support 
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support 
(http://support.oracle.com) where it is intended to compute values for 
the recommended HugePages/HugeTLB configuration for the current shared 
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and 
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size, 
   as the new SGA will not fit in the previous HugePages configuration, 
   it had better disable the whole HugePages, 
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup 
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for 
HugePages configuration. HugePages can only be used for shared memory segments 
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running 
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
esac

# End

SAMPLE OUTPUT

For 2.4 kernel systems:

$ ./hugepages_settings.sh
...
Recommended setting: vm.hugetlb_pool = 764

For 2.6 and later kernel systems:

$ ./hugepages_settings.sh
...
Recommended setting: vm.nr_hugepages = 67

Please see Document 361323.1 about how to do that setting.
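The core of the script is a short piece of arithmetic: each shared memory segment is divided by the huge page size, segments smaller than one huge page are skipped, and one extra page is added per remaining segment. The sketch below restates just that calculation with shell arithmetic instead of bc. The function name hugepages_needed and the segment sizes are hypothetical; they are chosen only so that the result matches the 1204 pages seen in this walkthrough.

```shell
# hugepages_needed HUGEPAGE_KB SEG_BYTES...
# Hypothetical helper that mirrors the loop in hugepages_settings.sh:
# floor(segment / huge page size) + 1 per segment, skipping small segments.
hugepages_needed() {
    hpg_kb=$1
    shift
    total=0
    for seg_bytes in "$@"; do
        min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
        if [ "$min_pg" -gt 0 ]; then
            total=$(( total + min_pg + 1 ))
        fi
    done
    echo "$total"
}

# One segment of 1203 huge pages plus one tiny 4 KB segment that is ignored.
hugepages_needed 2048 2522873856 4096    # prints: 1204
```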

[root@we2db1 ~]# chmod +x hugepages_settings.sh
[root@we2db1 ~]# ./hugepages_settings.sh

This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed...

Recommended setting: vm.nr_hugepages = 1204

 

5. Add the vm.nr_hugepages parameter to /etc/sysctl.conf

[root@we2db1 ~]# cat /etc/sysctl.conf | grep nr_hugepages
vm.nr_hugepages = 1204
[root@we2db2 ~]# cat /etc/sysctl.conf | grep nr_hugepages
vm.nr_hugepages = 1204
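Editing the file by hand is fine for one server. For several nodes, a small idempotent helper avoids duplicate entries. This is a sketch only: set_nr_hugepages is a hypothetical function, and the demonstration writes to a temporary file rather than the real /etc/sysctl.conf.

```shell
# set_nr_hugepages FILE PAGES
# Hypothetical helper: make FILE contain exactly one vm.nr_hugepages line
# with the given value, replacing any existing setting.
set_nr_hugepages() {
    conf=$1
    pages=$2
    tmp=$(mktemp)
    # Keep every line except an existing vm.nr_hugepages setting.
    grep -v '^[[:space:]]*vm\.nr_hugepages[[:space:]]*=' "$conf" > "$tmp"
    echo "vm.nr_hugepages = $pages" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demonstration on a scratch copy.
demo=$(mktemp)
printf 'kernel.shmmax = 4398046511104\nvm.nr_hugepages = 512\n' > "$demo"
set_nr_hugepages "$demo" 1204
grep nr_hugepages "$demo"    # prints: vm.nr_hugepages = 1204
rm -f "$demo"
```

Running sysctl -p as root loads the file without a reboot, but the kernel may not find enough contiguous free memory for the whole pool on a system that has been up for a while, which is one reason to reboot as the next step does.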

 

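6. Shut down all database instances and reboot the server

7. Verify the configuration, as shown below:

[root@we2db2 ~]# grep HugePages /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:    1204
HugePages_Free:      739
HugePages_Rsvd:      736
HugePages_Surp:        0

To confirm that the HugePages configuration is effective, HugePages_Free should be smaller than HugePages_Total, and HugePages_Rsvd should be non-zero.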

 

Also there should be some HugePages_Rsvd if PRE_PAGE_SGA is 'false' for all the Oracle database instances.

HugePages_Rsvd counts free pages that are reserved for use (requested for an SGA, but not touched/mapped yet).

PRE_PAGE_SGA determines if the all SGA pages are read-in when the instance starts up.

If parameter is set to 'true' then the OS page table entries are prebuilt for each page of the SGA, leading to HugePages reservation of those pages.

For Oracle database versions before 12.1 the default value for PRE_PAGE_SGA is 'false'. So the HugePages_Rsvd would be higher than 0. But since 12c the PRE_PAGE_SGA defaults to 'true' which would cause the HugePages_Rsvd to be 0.

The sum of Hugepages_Free and HugePages_Rsvd may be smaller than your total combined SGA as instances allocate pages dynamically and proactively as needed.
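The checks described above can be scripted. The sketch below is one possible reading of them, not an Oracle-supplied tool: check_hugepages is a hypothetical helper that reads meminfo-style text on stdin, prints how many pages are already in use, reserved, and uncommitted, and succeeds only when the pool exists and at least one page has been taken from it.

```shell
# check_hugepages
# Hypothetical helper: summarise the HugePages_* counters read from stdin and
# return success when HugePages_Total > 0 and HugePages_Free < HugePages_Total.
check_hugepages() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            printf "in use: %d, reserved: %d, uncommitted: %d\n",
                   total - free, rsvd, free - rsvd
            exit !(total > 0 && free < total)
        }'
}

# The counters from the verification step above.
printf '%s\n' \
    'AnonHugePages:         0 kB' \
    'HugePages_Total:    1204' \
    'HugePages_Free:      739' \
    'HugePages_Rsvd:      736' \
    'HugePages_Surp:        0' | check_hugepages
# prints: in use: 465, reserved: 736, uncommitted: 3
```

On a live server the input would be /proc/meminfo itself, for example check_hugepages < /proc/meminfo.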

 

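HugePages configuration is now complete.

Although Oracle recommends Huge Pages, whether to use them still depends on your actual situation. If your system frequently suffers performance problems caused by swapping, consider enabling HugePages. Systems with very large amounts of memory are also good candidates. How large the memory must be before HugePages become necessary, however, has no definitive answer.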

 

References:

HugePages on Oracle Linux 64-bit (Doc ID 361468.1)

HugePages and Oracle Database 11g Automatic Memory Management (AMM) on Linux (Doc ID 749851.1)

HugePages on Linux: What It Is... and What It Is Not... (Doc ID 361323.1)

Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)

 

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-transhuge#s-memory-configure_hugepages

 

 


Reposted from blog.csdn.net/qq_34556414/article/details/82978993