Oracle 11g Diagnostic New Feature: Automatic Diagnostic Repository (ADR)

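ADR (Automatic Diagnostic Repository) is a file-based repository for database diagnostic data such as trace files, incident dumps, IPS packages, the alert log, Health Monitor reports, core dumps, and other diagnostic information. The root directory of ADR is called the ADR base, and its location is set by the DIAGNOSTIC_DEST parameter. ADR has a unified directory structure and stores diagnostic data for multiple products and instances outside the database, so problems can still be diagnosed while the database is down.
Starting with Oracle 11gR1, ADR stores diagnostic data for the database, ASM, CRS, and other products or components (such as the listener). Each instance of each product has its own ADR home. For example, in a RAC environment, the ASM instances and the database instances each have a separate ADR home.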

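Terminology
A critical error is an Oracle internal error that produces a trace file. Oracle groups critical errors into categories: internal errors (ORA-600), system access violations (ORA-7445, ORA-3113), lock-related errors, block corruption (ORA-1578), out-of-memory errors (ORA-4030/4031), and so on.
An incident is a single occurrence of a critical error. Every occurrence of a critical error creates one incident. ADR tracks each incident and assigns it a unique incident ID.
A problem is a group of critical errors that share a common set of attributes. ADR tracks each problem and assigns it a unique problem ID.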
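
The relationship between incidents and problems can be illustrated with a small sketch. It assumes that the shared attribute is a problem key string (a PROBLEM_KEY field does appear in the ADRCI help output later in this article); the `Incident` class, the function name, and the sample key values are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    incident_id: int  # unique per occurrence of a critical error
    problem_key: str  # hypothetical sample values, used only for grouping


def group_into_problems(incidents):
    """Group incident IDs by the problem key they share."""
    problems = defaultdict(list)
    for inc in incidents:
        problems[inc.problem_key].append(inc.incident_id)
    return dict(problems)


# Three incidents, two of which are occurrences of the same problem.
incidents = [
    Incident(101, "ORA 600 [example_arg]"),
    Incident(102, "ORA 4031"),
    Incident(103, "ORA 600 [example_arg]"),
]
problems = group_into_problems(incidents)
for key, ids in problems.items():
    print(key, ids)
```

Each incident keeps its own ID, while incidents with the same key collapse into one problem.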

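Changes
Starting with 11gR1, all diagnostic data is stored in ADR. ADR is an external, miniature XML database.
The 1:1 mapping between trace files and processes no longer holds. Process trace files still exist, but for an incident they only record the locations of that incident's trc and trm files. Oracle produces a pair of files for each incident: a trc file that holds the diagnostic data and a trm file that holds the metadata.
The internal structure of the files has also changed. For example, an XML-format alert log was introduced and is stored under the alert directory of the ADR home (ADR_BASE/diag/rdbms/<db_name>/<SID>/alert/log.xml). A trc file consists of a number of tagged XML records, each of which is ordered hierarchically. This makes it easier to locate the parts of a file that are of interest.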
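
As a minimal sketch of the directory layout just described, the helper below assembles the path of the XML alert log from an ADR base, a database name, and a SID. The function name is hypothetical; the layout is the one quoted in the text (ADR_BASE/diag/rdbms/<db_name>/<SID>/alert/log.xml), and the sample values match the orcl instance used elsewhere in this article.

```python
from pathlib import PurePosixPath


def xml_alert_log_path(adr_base, db_name, sid):
    """Build ADR_BASE/diag/rdbms/<db_name>/<SID>/alert/log.xml."""
    adr_home = PurePosixPath(adr_base) / "diag" / "rdbms" / db_name / sid
    return adr_home / "alert" / "log.xml"


print(xml_alert_log_path("/u01/app/oracle", "orcl", "orcl"))
```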

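Starting with Oracle Database 11g R1, the traditional …_DUMP_DEST initialization parameters are ignored. The ADR root directory, also called the ADR base, is located by the DIAGNOSTIC_DEST initialization parameter. If this parameter is omitted or left empty, the database sets DIAGNOSTIC_DEST at startup as follows:

If the environment variable ORACLE_BASE is set, DIAGNOSTIC_DEST is set to $ORACLE_BASE.

If the environment variable ORACLE_BASE is not set, DIAGNOSTIC_DEST is set to $ORACLE_HOME/log.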
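
The two fallback rules above can be sketched as a small function. This only mirrors the rule as stated in the text; it is not how the database itself is implemented, and the function name and the error raised when neither variable is set are assumptions made for the example.

```python
import os


def default_diagnostic_dest(env=None):
    """Return the default DIAGNOSTIC_DEST according to the two rules above."""
    env = os.environ if env is None else env
    oracle_base = env.get("ORACLE_BASE")
    if oracle_base:
        # Rule 1: ORACLE_BASE is set, so it becomes the ADR base.
        return oracle_base
    oracle_home = env.get("ORACLE_HOME")
    if not oracle_home:
        # Not covered by the text; raising here is an assumption.
        raise ValueError("neither ORACLE_BASE nor ORACLE_HOME is set")
    # Rule 2: fall back to $ORACLE_HOME/log.
    return oracle_home.rstrip("/") + "/log"


print(default_diagnostic_dest({"ORACLE_BASE": "/u01/app/oracle"}))
print(default_diagnostic_dest({"ORACLE_HOME": "/u01/app/oracle/product/11.2.0/db_1"}))
```

The second call uses a made-up ORACLE_HOME purely to show the fallback branch.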

The locations of the ADR directories can be queried through the V$DIAG_INFO view:

SQL>  select * from v$diag_info;

INST_ID NAME                      VALUE
------- ------------------------- --------------------------------------------------------------------------------
      1 Diag Enabled              TRUE
      1 ADR Base                  /u01/app/oracle
      1 ADR Home                  /u01/app/oracle/diag/rdbms/orcl/orcl
      1 Diag Trace                /u01/app/oracle/diag/rdbms/orcl/orcl/trace
      1 Diag Alert                /u01/app/oracle/diag/rdbms/orcl/orcl/alert
      1 Diag Incident             /u01/app/oracle/diag/rdbms/orcl/orcl/incident
      1 Diag Cdump                /u01/app/oracle/diag/rdbms/orcl/orcl/cdump
      1 Health Monitor            /u01/app/oracle/diag/rdbms/orcl/orcl/hm
      1 Default Trace File        /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_12710.trc
      1 Active Problem Count      0
      1 Active Incident Count     0

11 rows selected.

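(1)  ADR Base: the path of the ADR base directory

(2)  ADR Home: the path of the ADR home directory of the current database instance

(3)  Diag Trace: the location of the text alert log and of the background and foreground process trace files

(4)  Diag Alert: the location of the XML version of the alert log

(5)  …

(6)  Default Trace File: the path of the current session's trace file. SQL trace files are written here.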
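
When the V$DIAG_INFO output has been captured as text (for example from a spool file), it can be turned into a lookup table. The sketch below assumes the SQL*Plus column layout shown above, with at least two spaces between the NAME and VALUE columns; the sample is an abridged copy of that output and the function name is hypothetical.

```python
import re

# Abridged copy of the V$DIAG_INFO output shown above.
SAMPLE = """\
INST_ID NAME                      VALUE
------- ------------------------- ----------------------------------------
      1 Diag Enabled              TRUE
      1 ADR Base                  /u01/app/oracle
      1 ADR Home                  /u01/app/oracle/diag/rdbms/orcl/orcl
      1 Diag Trace                /u01/app/oracle/diag/rdbms/orcl/orcl/trace
      1 Active Incident Count     0

11 rows selected.
"""

# INST_ID, then NAME (may contain single spaces), then VALUE.
ROW = re.compile(r"^(\d+)\s+(.+?)\s{2,}(\S.*)$")


def parse_diag_info(text):
    """Map each NAME to its VALUE, skipping headers and footers."""
    info = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            info[match.group(2)] = match.group(3)
    return info


diag = parse_diag_info(SAMPLE)
print(diag["ADR Home"])
print(diag["Diag Trace"])
```

The header, the separator line, and the "rows selected" footer do not match the row pattern and are ignored.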

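In Oracle Database 11g there is no distinction between foreground and background trace files: both types are written to the $ADR_HOME/trace directory. All non-incident traces are stored in the TRACE subdirectory. The main difference from earlier releases is that those releases dumped critical error information into the corresponding process trace file rather than into an incident dump. Starting with Oracle Database 11g, incident dumps are written to files that are separate from the normal process trace files.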

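The ADRCI Tool

ADR command-line tool

ADRCI is a command-line tool that is part of the fault diagnosability infrastructure introduced in Oracle Database 11g. With ADRCI you can:

(1)  View diagnostic data in the Automatic Diagnostic Repository (ADR).

(2)  Package incident and problem information into a zip file for transmission to Oracle Support. This is done with a service called the Incident Packaging Service (IPS).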

[oracle@qht131 diag]$ adrci

ADRCI: Release 11.2.0.3.0 - Production on Thu Jul 19 09:25:09 2018

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle"
adrci> help

 HELP [topic]
   Available Topics:
        CREATE REPORT
        ECHO
        EXIT
        HELP
        HOST
        IPS
        PURGE
        RUN
        SET BASE
        SET BROWSER
        SET CONTROL
        SET ECHO
        SET EDITOR
        SET HOMES | HOME | HOMEPATH
        SET TERMOUT
        SHOW ALERT
        SHOW BASE
        SHOW CONTROL
        SHOW HM_RUN
        SHOW HOMES | HOME | HOMEPATH
        SHOW INCDIR
        SHOW INCIDENT
        SHOW PROBLEM
        SHOW REPORT
        SHOW TRACEFILE
        SPOOL

 There are other commands intended to be used directly by Oracle, type
 "HELP EXTENDED" to see the list

The most common use is viewing the alert log:

-- Running show alert with no options opens the log in editor mode.

adrci> show alert

Choose the alert log from the following homes to view:

1: diag/rdbms/orcl/orcl
2: diag/clients/user_oracle/host_2012117373_80
3: diag/tnslsnr/qht131/listener
Q: to quit

Please select option: 1

ADRCI prompts for a homepath here. A useful trick is to set the homepath in advance so that you do not have to choose one every time.

adrci> set homepath diag/rdbms/orcl/orcl
adrci> show alert

ADR Home = /u01/app/oracle/diag/rdbms/orcl/orcl:

If you do not want editor mode, you can use -tail to display only the last N entries:

adrci>  show alert -tail 15
2018-07-18 09:39:31.036000 +08:00
DW00 started with pid=37, OS id=2777, wid=1, job SYS.SYS_EXPORT_TABLE_01
2018-07-18 09:39:47.386000 +08:00
XDB installed.
2018-07-18 09:39:48.647000 +08:00
XDB initialized.
2018-07-18 13:06:33.050000 +08:00
Thread 1 advanced to log sequence 2 (LGWR switch)
  Current log# 2 seq# 2 mem# 0: /u01/oradata/orcl/redo02.log
2018-07-18 13:06:41.746000 +08:00
Archived Log entry 5 added for thread 1 sequence 1 ID 0x59f9d93d dest 1:
2018-07-18 22:00:00.322000 +08:00
Setting Resource Manager plan SCHEDULER[0x318B]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Starting background process VKRM
VKRM started with pid=29, OS id=7867
2018-07-18 22:00:08.948000 +08:00
Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
2018-07-18 22:02:23.220000 +08:00
End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
2018-07-18 22:02:34.054000 +08:00
Thread 1 advanced to log sequence 3 (LGWR switch)
  Current log# 3 seq# 3 mem# 0: /u01/oradata/orcl/redo03.log
2018-07-18 22:02:37.612000 +08:00
Archived Log entry 6 added for thread 1 sequence 2 ID 0x59f9d93d dest 1:
2018-07-19 02:00:00.171000 +08:00
Closing scheduler window
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter
2018-07-19 02:05:19.851000 +08:00
Thread 1 advanced to log sequence 4 (LGWR switch)
  Current log# 1 seq# 4 mem# 0: /u01/oradata/orcl/redo01.log
2018-07-19 02:05:21.259000 +08:00
Archived Log entry 7 added for thread 1 sequence 3 ID 0x59f9d93d dest 1:

You can also use show alert -tail -f to monitor the log in real time; -tail works much like the tail command.
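
If the -tail output is captured as text, the messages can be grouped under the timestamp line that precedes them. This is a minimal sketch that assumes the timestamp format seen in the output above (date, time with six fractional digits, UTC offset); the function name is hypothetical and the sample lines are copied from that output.

```python
import re

# Matches lines such as "2018-07-18 13:06:33.050000 +08:00".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6} [+-]\d{2}:\d{2}$")

SAMPLE = """\
2018-07-18 13:06:33.050000 +08:00
Thread 1 advanced to log sequence 2 (LGWR switch)
  Current log# 2 seq# 2 mem# 0: /u01/oradata/orcl/redo02.log
2018-07-18 13:06:41.746000 +08:00
Archived Log entry 5 added for thread 1 sequence 1 ID 0x59f9d93d dest 1:
"""


def group_alert_entries(text):
    """Return a list of (timestamp, [message lines]) tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            entries.append((line, []))
        elif line and entries:
            # Message lines belong to the most recent timestamp.
            entries[-1][1].append(line)
    return entries


entries = group_alert_entries(SAMPLE)
for stamp, messages in entries:
    print(stamp, len(messages))
```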

To look for specific ORA- errors, use the -p option to supply a search predicate:

adrci> help show alert

  Usage: SHOW ALERT [-p <predicate_string>]  [-term]
                    [ [-tail [num] [-f]] | [-file <alert_file_name>] ]
  Purpose: Show alert messages.

  Options:
    [-p <predicate_string>]: The predicate string must be double-quoted.
    The fields in the predicate are the fields:
        ORIGINATING_TIMESTAMP         timestamp
        NORMALIZED_TIMESTAMP          timestamp
        ORGANIZATION_ID               text(65)
        COMPONENT_ID                  text(65)
        HOST_ID                       text(65)
        HOST_ADDRESS                  text(17)
        MESSAGE_TYPE                  number
        MESSAGE_LEVEL                 number
        MESSAGE_ID                    text(65)
        MESSAGE_GROUP                 text(65)
        CLIENT_ID                     text(65)
        MODULE_ID                     text(65)
        PROCESS_ID                    text(33)
        THREAD_ID                     text(65)
        USER_ID                       text(65)
        INSTANCE_ID                   text(65)
        DETAILED_LOCATION             text(161)
        UPSTREAM_COMP_ID              text(101)
        DOWNSTREAM_COMP_ID            text(101)
        EXECUTION_CONTEXT_ID          text(101)
        EXECUTION_CONTEXT_SEQUENCE    number
        ERROR_INSTANCE_ID             number
        ERROR_INSTANCE_SEQUENCE       number
        MESSAGE_TEXT                  text(2049)
        MESSAGE_ARGUMENTS             text(129)
        SUPPLEMENTAL_ATTRIBUTES       text(129)
        SUPPLEMENTAL_DETAILS          text(129)
        PROBLEM_KEY                   text(65)

For example, to find ORA-27037 errors, use the following command:

adrci> show alert -p "message_text LIKE '%ORA-27037%'"
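
The same search can be scripted. The sketch below only builds the argument list for a non-interactive adrci call and does not run it. It assumes that adrci accepts a semicolon-separated command list through its exec= command-line argument and that -term (listed in the usage text above) sends the result to the terminal instead of the editor; verify both against your release before relying on them. The function name is hypothetical.

```python
def adrci_alert_search_argv(homepath, ora_number):
    """Build argv for: adrci exec="set homepath ...; show alert -p ... -term"."""
    # Format the error number the way it appears in messages, e.g. ORA-00600.
    code = f"ORA-{int(ora_number):05d}"
    predicate = f"message_text like '%{code}%'"
    script = f'set homepath {homepath}; show alert -p "{predicate}" -term'
    return ["adrci", f"exec={script}"]


argv = adrci_alert_search_argv("diag/rdbms/orcl/orcl", 27037)
print(argv)
# To execute on a host with adrci on PATH (assumption, not run here):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing the command as a list keeps the double-quoted predicate intact without involving shell quoting.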

You can also look up trace files. With ADRCI you can list all trace files under ADR and filter them so that only the ones you care about are shown.

adrci> show tracefile;
     diag/rdbms/orcl/orcl/trace/orcl_arc1_2024.trc
     diag/rdbms/orcl/orcl/trace/alert_orcl.log
     diag/rdbms/orcl/orcl/trace/orcl_mmon_2006.trc
     diag/rdbms/orcl/orcl/trace/orcl_arc3_2028.trc
     diag/rdbms/orcl/orcl/trace/orcl_arc0_2022.trc
     diag/rdbms/orcl/orcl/trace/orcl_dw00_2239.trc
     diag/rdbms/orcl/orcl/trace/orcl_dw00_2777.trc
adrci> show tracefile %arc%
     diag/rdbms/orcl/orcl/trace/orcl_arc1_2024.trc
     diag/rdbms/orcl/orcl/trace/orcl_arc3_2028.trc
     diag/rdbms/orcl/orcl/trace/orcl_arc0_2022.trc
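
The % wildcard in show tracefile %arc% behaves like a SQL LIKE pattern. The sketch below emulates that filtering on a list of paths; it is an illustration only, and it assumes case-sensitive matching against the file name, which may differ from what ADRCI actually does. The function names are hypothetical and the paths are the ones listed above.

```python
import re
from pathlib import PurePosixPath


def like_to_regex(pattern):
    """Translate a pattern using % (any run of characters) into a regex."""
    literal_parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile("^" + ".*".join(literal_parts) + "$")


def filter_tracefiles(paths, pattern):
    """Keep the paths whose file name matches the % pattern."""
    regex = like_to_regex(pattern)
    return [p for p in paths if regex.match(PurePosixPath(p).name)]


TRACEFILES = [
    "diag/rdbms/orcl/orcl/trace/orcl_arc1_2024.trc",
    "diag/rdbms/orcl/orcl/trace/alert_orcl.log",
    "diag/rdbms/orcl/orcl/trace/orcl_mmon_2006.trc",
    "diag/rdbms/orcl/orcl/trace/orcl_arc3_2028.trc",
    "diag/rdbms/orcl/orcl/trace/orcl_arc0_2022.trc",
    "diag/rdbms/orcl/orcl/trace/orcl_dw00_2239.trc",
    "diag/rdbms/orcl/orcl/trace/orcl_dw00_2777.trc",
]

arc_files = filter_tracefiles(TRACEFILES, "%arc%")
for path in arc_files:
    print(path)
```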

Besides show alert, you can view incidents with show incident. Incidents record critical errors, and the syntax is similar to that of show alert.

adrci> help show incident

  Usage: SHOW INCIDENT [-p <predicate_string>]
                       [-mode BASIC|BRIEF|DETAIL]
                       [-last <num> | -all]
                       [-orderby (field1, field2, ...) [ASC|DSC]]

  Purpose: Show the incident information. By default, this command will
           only show the last 50 incidents which are not flood controlled.

  Options:
    [-p <predicate_string>]: The predicate string must be double-quoted.

    [-mode BASIC|BRIEF|DETAIL]: The different modes of showing incidents.
    BASIC will show the basic information of non-flooded controlled
    incidents, which is the default mode. In this mode, only the following
    fields can be used in the predicate clause:
        INCIDENT_ID                   number
        PROBLEM_KEY                   text(550)
        CREATE_TIME                   timestamp
    BRIEF will display incident information from the incident relation.
    In this mode, the fields can appear in the predicate are:
        INCIDENT_ID                   number
        PROBLEM_ID                    number
        CREATE_TIME                   timestamp
        CLOSE_TIME                    timestamp
        STATUS                        number
        FLAGS                         number
        FLOOD_CONTROLLED              number
        ERROR_FACILITY                text(10)
        ERROR_NUMBER                  number
        ERROR_ARG1                    text(64)
        ERROR_ARG2                    text(64)
        ERROR_ARG3                    text(64)
        ERROR_ARG4                    text(64)
        ERROR_ARG5                    text(64)
        ERROR_ARG6                    text(64)
        ERROR_ARG7                    text(64)
        ERROR_ARG8                    text(64)
        SIGNALLING_COMPONENT          text(64)
        SIGNALLING_SUBCOMPONENT       text(64)
        SUSPECT_COMPONENT             text(64)
        SUSPECT_SUBCOMPONENT          text(64)
        ECID                          text(64)
        IMPACT                        number

    DETAIL will display all incident-related information, such as incident
    files. The fields can appear in the predicate is the same as the ones
    in the brief mode.

    [-last <num> | -all]: This option allows users to either select
    the last <num> of qualified incidents to show or to show all the
    qualified incidents. If this option is not specified, this command
    will only show 50 incidents.

    [-orderby (field1, field2, ...) [ASC|DSC]]: If specified, the results
    will be ordered by the specified fields' values. By default, it will be
    in the ascending order unless "DSC" is specified. Note that the field
    names that can be specified here are from the "INCIDENT" relation.

  Examples:
    show incident
    show incident -mode detail
    show incident -mode detail -p "incident_id=123"

You can also delete diagnostic data with the purge command:

adrci> help purge

  Usage: PURGE [[-i <id1> | <id1> <id2>] |
               [-age <mins> [-type ALERT|INCIDENT|TRACE|CDUMP|HM|UTSCDMP]]]:

  Purpose: Purge the diagnostic data in the current ADR home. If no
           option is specified, the default purging policy will be used.

  Options:
    [-i id1 | id1 id2]: Users can input a single incident ID, or a
    range of incidents to purge.

    [-age <mins>]: Users can specify the purging policy either to all
    the diagnostic data or the specified type. The data older than <mins>
    ago will be purged

    [-type ALERT|INCIDENT|TRACE|CDUMP|HM|UTSCDMP]: Users can specify what type of
    data to be purged.

  Examples:
    purge
    purge -i 123 456
    purge -age 60 -type incident

For example, to delete alert data older than 60 minutes:

adrci> purge -age 60 -type ALERT
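
Because -age is given in minutes, longer retention periods are easy to get wrong by hand. The helper below is a small sketch that converts days and hours to minutes and assembles the purge command text, accepting only the type names listed in the help output above. The function name and its parameters are hypothetical, and it only produces a string; it does not run adrci.

```python
VALID_TYPES = {"ALERT", "INCIDENT", "TRACE", "CDUMP", "HM", "UTSCDMP"}


def purge_command(days=0, hours=0, minutes=0, data_type=None):
    """Build a 'purge -age <mins> [-type <TYPE>]' command string."""
    age_minutes = (days * 24 + hours) * 60 + minutes
    if age_minutes <= 0:
        raise ValueError("age must be a positive number of minutes")
    command = f"purge -age {age_minutes}"
    if data_type is not None:
        type_name = data_type.upper()
        if type_name not in VALID_TYPES:
            raise ValueError(f"unknown purge type: {data_type}")
        command += f" -type {type_name}"
    return command


print(purge_command(minutes=60, data_type="alert"))
print(purge_command(days=7, data_type="trace"))
```

The first call reproduces the command shown above; the second asks for trace data older than seven days.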

References:

https://blog.csdn.net/tianlesoftware/article/details/8222724 

https://blogs.oracle.com/database4cn/oracle-11g-adr

Reposted from blog.csdn.net/jolly10/article/details/81109143