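###Problem description
Logging in through PLSQL fails, and ordinary users hang when logging in with sqlplus; only the sys user can log in.
###I. Analysis
####1. Session analysis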
set linesize 600;
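-- SID is a NUMBER column: a character format such as a500 makes SQL*Plus print ########## (as in the output captured below), so use a numeric format
col sid format 999999;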
col username format a20;
col event format a60;
SQL> select sid,username,event,p1,p2,p3 from v$session where status='ACTIVE' and username is not null;
SID USERNAME EVENT P1 P2 P3
---------- -------------------- ------------------------------------------------------------ ---------- ---------- ----------
########## MASK buffer busy waits 44 3578 1
########## DBSNMP buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## SYSMAN log file switch (checkpoint incomplete) 0 0 0
########## SYSMAN log file switch (checkpoint incomplete) 0 0 0
########## MASK buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## MASK buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## SYSTEM buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
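The dominant wait can be read off by tallying the rows above. The following is a minimal Python sketch that assumes the output has been pasted as text; `parse_row` and `tally_events` are hypothetical helper names, and the reading of P1/P2/P3 as file#, block# and class# applies to the buffer busy waits event.

```python
from collections import Counter

# Rows copied verbatim from the v$session output above.
ROWS = """\
########## MASK buffer busy waits 44 3578 1
########## DBSNMP buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## SYSMAN log file switch (checkpoint incomplete) 0 0 0
########## SYSMAN log file switch (checkpoint incomplete) 0 0 0
########## MASK buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## MASK buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
########## SYSTEM buffer busy waits 44 3578 1
########## SYSMAN buffer busy waits 44 3578 1
"""


def parse_row(line):
    """Split one row into (username, event, p1, p2, p3)."""
    parts = line.split()
    # Layout: SID, USERNAME, EVENT (may contain spaces), P1, P2, P3
    username = parts[1]
    event = " ".join(parts[2:-3])
    p1, p2, p3 = (int(x) for x in parts[-3:])
    return username, event, p1, p2, p3


def tally_events(text):
    """Count how many active sessions are waiting on each event."""
    rows = [parse_row(line) for line in text.splitlines() if line.strip()]
    return Counter(row[1] for row in rows)


counts = tally_events(ROWS)
for event, n in counts.most_common():
    print(n, event)

# For buffer busy waits, (P1, P2, P3) is (file#, block#, class#).
hot_blocks = {
    row[2:]
    for row in map(parse_row, ROWS.splitlines())
    if row[1] == "buffer busy waits"
}
print(hot_blocks)
```

Every buffer busy waits row points at the same file# and block# (44, 3578), so the sessions are queuing on a single block, while two sessions are stuck on the log file switch itself.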
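####2. Introduction to hang analyze
Besides a systemstate dump, it is usually best to generate a hang analyze dump as well, to get a direct view of the wait relationships between database processes.
#####2.1 Single instance
$sqlplus / as sysdba
or
$sqlplus -prelim / as sysdba <==use this when the database is already very slow or hung so badly that a normal connection is impossible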
SQL>alter session set tracefile_identifier='mytrace';
SQL>oradebug setmypid
SQL>oradebug unlimit;
SQL>oradebug dump hanganalyze 3
Wait 1~2 minutes
SQL>oradebug dump hanganalyze 3
Wait 1~2 minutes
SQL>oradebug dump hanganalyze 3
SQL>oradebug tracefile_name;
SQL>oradebug close_trace
SQL> alter session set tracefile_identifier='mytrace';
Session altered.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug dump hanganalyze 3
Statement processed.
SQL> oradebug dump hanganalyze 3
Statement processed.
SQL> oradebug dump hanganalyze 3
Statement processed.
SQL> oradebug tracefile_name;
/u01/app/oracle/diag/rdbms/s***2/s***2/trace/s***2_ora_23481_mytrace.trc
SQL> oradebug close_trace
Statement processed.
SQL> exit
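Screenshot:
[2275]/1/2276/13241/0xb0149aa98/13053/LEAF_NW/
This line shows that the session with sid 2276 and serial# 13241 holds the lock. The fields are, in order: node number, database instance number, session sid, session serial#, session address (saddr), operating-system process ID of the session, state (whether the session is waiting), and the adjacency list (if present, the node number of the session blocking this one; otherwise empty).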
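The slash-separated node line can be split into named fields mechanically. The sketch below is a minimal Python illustration, not part of any Oracle tooling; `Node` and `parse_node` are hypothetical names, and the field names are taken from the trace header `([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist])`.

```python
import re
from typing import List, NamedTuple


class Node(NamedTuple):
    nodenum: int        # node number inside the hang analyze graph
    cnode: int          # instance number
    sid: int            # session sid
    sess_srno: int      # session serial#
    session: str        # session address (saddr), kept as hex text
    ospid: int          # operating-system process ID
    state: str          # e.g. NLEAF, LEAF, LEAF_NW
    adjlist: List[int]  # adjacent (blocking) node numbers, may be empty


def parse_node(line: str) -> Node:
    """Parse one line of the 'State of ALL nodes' section."""
    fields = [f.strip() for f in line.strip().split("/")]
    # The trailing field is the adjacency list; it is empty for blockers.
    adj_text = fields[7] if len(fields) > 7 else ""
    return Node(
        nodenum=int(fields[0].strip("[]")),
        cnode=int(fields[1]),
        sid=int(fields[2]),
        sess_srno=int(fields[3]),
        session=fields[4],
        ospid=int(fields[5]),
        state=fields[6],
        adjlist=[int(x) for x in re.findall(r"\d+", adj_text)],
    )


node = parse_node("[2275]/1/2276/13241/0xb0149aa98/13053/LEAF_NW/")
print(node.sid, node.sess_srno, node.state, node.adjlist)
```

The `sid` and `sess_srno` values are the pair to look up in `v$session`, and `ospid` identifies the server process at the operating-system level.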
Reference for trace file analysis: http://blog.itpub.net/9240380/viewspace-1823479/
####3. Inspect the sessions
#####3.1 Look up the session by sid and serial#
select status from v$session where sid=2276 and serial#=13241;
#####3.2 Find the SQL statements that have been holding a latch for a long time
SELECT s.sql_hash_value,s.sql_id, l.name
FROM V$SESSION s, V$LATCHHOLDER l
WHERE s.sid = l.sid;
#####3.3 Show the SQL text for a given sql_id
select sql_fulltext from v$sqlarea where sql_id='4ducr8st4ruas';
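#####3.4 Check the redo log groups
select * from v$log;
The figure below shows that every log group not in CURRENT status is ACTIVE, so the redo log files are suspected to be too small. An ACTIVE group cannot be reused until its checkpoint completes, which matches the log file switch (checkpoint incomplete) waits seen earlier.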
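Whether the redo logs are too small can be estimated from how often they switch. The sketch below is a rough Python calculation under stated assumptions: the rule of thumb of one switch roughly every 20 minutes at peak is a common guideline rather than something taken from this case, `suggested_redo_size_mb` is a hypothetical helper, and the inputs (50 MB logs switching 60 times per hour) are made-up illustration values, not measurements from this database.

```python
import math


def suggested_redo_size_mb(current_size_mb, switches_per_hour,
                           target_minutes=20, round_to_mb=256):
    """Estimate a redo log size that yields one switch per target interval.

    Assumes every switch happens because the log filled up, so the redo
    rate is at most current_size_mb * switches_per_hour.
    """
    redo_mb_per_hour = current_size_mb * switches_per_hour
    needed_mb = redo_mb_per_hour * target_minutes / 60
    rounded = math.ceil(needed_mb / round_to_mb) * round_to_mb
    # Never suggest shrinking below the current size.
    return max(current_size_mb, rounded)


# Hypothetical inputs for illustration only.
size_mb = suggested_redo_size_mb(current_size_mb=50, switches_per_hour=60)
print(size_mb)
```

The actual switch rate would come from the database itself (for example from the switch history), so the result is only a starting point for sizing.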
###II. Handling
####1. Add log groups
ALTER DATABASE ADD LOGFILE GROUP 4('/u01/app/oracle/oradata/s***2/redo04.log') SIZE 1024M;
ALTER DATABASE ADD LOGFILE GROUP 5('/u01/app/oracle/oradata/s***2/redo05.log') SIZE 1024M;
ALTER DATABASE ADD LOGFILE GROUP 6('/u01/app/oracle/oradata/s***2/redo06.log') SIZE 1024M;
--Switch log groups
alter system switch logfile;
####2. Check auditing
show parameter audit_trail
SQL> show parameter audit_trail
NAME TYPE VALUE
------------------------------------ --------------------------------- ------------------------------
audit_trail string DB
--Disable auditing
SQL> alter system set audit_trail=FALSE scope=spfile;
System altered.
--Restart the database
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 4.6900E+10 bytes
Fixed Size 2239016 bytes
Variable Size 3.2883E+10 bytes
Database Buffers 1.3959E+10 bytes
Redo Buffers 55869440 bytes
Database mounted.
Database opened.
SQL> show parameter audit_trail
NAME TYPE VALUE
------------------------------------ --------------------------------- ------------------------------
audit_trail string FALSE
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
####3. Log in and verify that everything is back to normal
$sqlplus m**k/******
SQL*Plus: Release 11.2.0.3.0 Production on Mon Feb 26 17:09:00 2018
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select status from v$session where sid=2276 and serial#=13241;
no rows selected
###III. Supplementary notes
####1.hang analyze
The trace file produced by oradebug hanganalyze 3 consists of three parts.
Part 1:
Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'rdbms ipc message'<='log file switch (checkpoint incomplete)'<='buffer busy waits'
Chain 1 Signature Hash: 0x4110dae0
[b] Chain 2 Signature: <not in a wait><='latch: cache buffers chains'
Chain 2 Signature Hash: 0xccebf225
[c] Chain 3 Signature: 'rdbms ipc message'<='enq: CR - block range reuse ckpt'<='buffer busy waits'
Chain 3 Signature Hash: 0x11bdab12
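Each chain signature reads from the final blocker on the left to the outermost waiter on the right, with `<=` meaning "is waited on by". The following minimal Python sketch splits a signature into that ordered list; `parse_signature` is a hypothetical helper written for this note.

```python
def parse_signature(signature):
    """Return the wait states of a chain, final blocker first."""
    return [part.strip().strip("'") for part in signature.split("<=")]


chain1 = parse_signature(
    "'rdbms ipc message'<='log file switch (checkpoint incomplete)'"
    "<='buffer busy waits'"
)
chain2 = parse_signature("<not in a wait><='latch: cache buffers chains'")

for chain in (chain1, chain2):
    # Print blocker first, then each layer of waiters behind it.
    print(" <- ".join(chain))
```

Read this way, Chain 1 says that the sessions in buffer busy waits are queued behind a session waiting on log file switch (checkpoint incomplete), which in turn waits on a process sitting in rdbms ipc message.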
Part 2 (further divided into intersecting chains and non-intersecting chains)
Part 3:
Extra information that will be dumped at higher levels:
[level 4] : 3 node dumps -- [LEAF] [LEAF_NW]
[level 5] : 66 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[1]/1/2/33209/0xb0913ba18/14107/NLEAF/[1421]
[2]/1/3/56281/0xb110bda68/21150/NLEAF/[3552]
[143]/1/144/589/0xb01111318/2007/NLEAF/[1421]
[285]/1/286/3181/0xaf112d4d0/13267/NLEAF/[1988]
[286]/1/287/2703/0xaf912f5c8/21152/NLEAF/[3552]
[427]/1/428/437/0xb11173570/2120/NLEAF/[1421]
[428]/1/429/1867/0xae917a538/26749/NLEAF/[1988]
[569]/1/570/7907/0xb512010f8/2126/NLEAF/[1988]
[571]/1/572/16555/0xb111aee90/26751/NLEAF/[428]
[711]/1/712/4255/0xaf91e50d0/2122/NLEAF/[1988]
[714]/1/715/25733/0xb09268760/20010/NLEAF/[1421]
[853]/1/854/5489/0xae9230040/2124/NLEAF/[1421]
...
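The adjacency list at the end of each node line makes it possible to follow a waiter to the node that blocks it. The sketch below is a minimal Python illustration using only the lines shown above; `blocker_map` and `walk_chain` are hypothetical helpers, and because the excerpt is truncated, a walk stops as soon as it reaches a node that is not listed.

```python
import re
from collections import Counter

# Node lines copied from the "State of ALL nodes" excerpt above.
STATE_LINES = """\
[1]/1/2/33209/0xb0913ba18/14107/NLEAF/[1421]
[2]/1/3/56281/0xb110bda68/21150/NLEAF/[3552]
[143]/1/144/589/0xb01111318/2007/NLEAF/[1421]
[285]/1/286/3181/0xaf112d4d0/13267/NLEAF/[1988]
[286]/1/287/2703/0xaf912f5c8/21152/NLEAF/[3552]
[427]/1/428/437/0xb11173570/2120/NLEAF/[1421]
[428]/1/429/1867/0xae917a538/26749/NLEAF/[1988]
[569]/1/570/7907/0xb512010f8/2126/NLEAF/[1988]
[571]/1/572/16555/0xb111aee90/26751/NLEAF/[428]
[711]/1/712/4255/0xaf91e50d0/2122/NLEAF/[1988]
[714]/1/715/25733/0xb09268760/20010/NLEAF/[1421]
[853]/1/854/5489/0xae9230040/2124/NLEAF/[1421]
"""


def blocker_map(text):
    """Map each node number to the node number in its adjacency list."""
    edges = {}
    for line in text.splitlines():
        m = re.match(r"\[(\d+)\]/.*/\[(\d+)\]\s*$", line.strip())
        if m:
            edges[int(m.group(1))] = int(m.group(2))
    return edges


def walk_chain(edges, start):
    """Follow blockers from start until a node with no known blocker."""
    chain = [start]
    seen = {start}
    while chain[-1] in edges:
        nxt = edges[chain[-1]]
        chain.append(nxt)
        if nxt in seen:  # a repeated node would mean a cycle
            break
        seen.add(nxt)
    return chain


edges = blocker_map(STATE_LINES)
print(walk_chain(edges, 571))
# Count how many listed nodes point directly at each blocker.
print(Counter(edges.values()).most_common())
```

Nodes that many waiters point at are the first candidates to look up in the full trace.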
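#####1.1 Description of each part
● Meaning of each column
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
node number / database instance number / session sid / session serial# / session address (saddr) / operating-system process ID of the session / state (whether the session is waiting) / if present, the node number of the blocking session, otherwise empty
[163] / 1 / 164 / 10419 / 0xdc9ea7e0 / 4046 / NLEAF / [301]
● A non-intersecting chain is one whose blocking session does not belong to any other chain.
In an intersecting chain, the blocking session belongs to another chain as that chain's lock holder; in other words, the chain is contained in another chain.
● Each chain includes details of every session and of its blocking session: SID, PID, SPID, the SQL currently running, the call stack, and the recent wait history.
● The values of the state column in ([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]) mean the following; the list is not complete (further testing is still needed)
○ NLEAF: a waiting session
○ SINGLE_NODE: a session that neither waits for nor blocks any other session
○ LEAF: a lock-holding session, that is, it does not wait for any other session
○ LEAF_NW: also a lock-holding session; the NW suffix indicates that it is not in a wait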
####2. Redo log group and member operations
View the paths of the log group members
select * from v$logfile;
#####2.1 Add a log group
ALTER DATABASE ADD LOGFILE GROUP 4('/u01/app/oracle/oradata/s***2/redo04.log') SIZE 1024M;
#####2.2 Switch log groups
alter system switch logfile;
#####2.3 Drop a log group member
A log group member can be dropped when its group is in INACTIVE status, as follows:
alter database drop logfile member '/u01/app/oracle/oradata/s***2/redo01.log';
####3. Auditing check
#####3.1 Check the auditing setting
show parameter audit_trail
SQL> show parameter audit_trail
NAME TYPE VALUE
------------------------------------ --------------------------------- ------------------------------
audit_trail string DB
#####3.2 Disable auditing
SQL> alter system set audit_trail=FALSE scope=spfile;
System altered.
#####3.3 Restart the database
SQL> shutdown immediate;
SQL> startup
#####3.4 Check the auditing setting again
SQL> show parameter audit_trail
NAME TYPE VALUE
------------------------------------ --------------------------------- ------------------------------
audit_trail string FALSE