[Fault] EXP-00056: ORACLE error 1466 encountered


EXP-00056: ORACLE error 1466 encountered

ORA-01466: unable to read data - table definition has changed

  • Scenario: In April 2016, the following errors appeared in the log of an EXP table export from the Chuangjin Hexin Fund database:
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
. . exporting table TO32_T0
. . exporting table TO32_T0_DETAIL 314 rows exported
. . exporting table TO32_T0_JYKY
. . exporting table TO32_T0_TMP 155 rows exported
. . exporting table TO32_T0_ZLZY
. . exporting table TO32_T1
. . exporting table TO32_T1_DETAIL
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
. . exporting table TO32_T1_JYKY
. . exporting table TO32_T1_TMP
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
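  • When the log is long, it can help to list which tables were being exported when ORA-01466 was raised. The following is a minimal sketch, assuming the log layout shown above, in which the error lines follow the "exporting table" line they belong to. The function name and the "<unknown>" placeholder are illustrative, and the sample is an abridged copy of the log above.

```python
import re

# Matches EXP table lines such as ". . exporting table TO32_T0_DETAIL 314 rows exported".
TABLE_LINE = re.compile(r"^\. \. exporting table\s+(\S+)")


def tables_with_ora_1466(log_text):
    """Attribute each ORA-01466 line to the most recent 'exporting table' line."""
    failed = []
    current = "<unknown>"  # placeholder used until the first table line is seen
    for raw in log_text.splitlines():
        line = raw.strip()
        match = TABLE_LINE.match(line)
        if match:
            current = match.group(1)
        elif line.startswith("ORA-01466"):
            failed.append(current)
    return failed


# Abridged copy of the export log shown above.
SAMPLE_LOG = """\
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
. . exporting table TO32_T0
. . exporting table TO32_T0_DETAIL 314 rows exported
. . exporting table TO32_T1_DETAIL
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
. . exporting table TO32_T1_JYKY
. . exporting table TO32_T1_TMP
EXP-00056: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
"""

if __name__ == "__main__":
    for name in tables_with_ora_1466(SAMPLE_LOG):
        print(name)
```

  • The first error in the excerpt precedes any table line, so the sketch reports it as "<unknown>" instead of guessing a table name.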
  • Why?
  • The SMON process maintains the SMON_SCN_TIME data dictionary base table, so that base table can be modified while EXP is exporting data. During an export, the export utility first reads some data dictionary base tables to work out what has to be exported. It is quite likely that SMON will maintain SMON_SCN_TIME at some point during the export, changing a base table that the export program has read and resulting in the Oracle ORA-01466 error.
  • What puzzles operators and some DBAs is that this error appears from time to time but not on any regular schedule. The reason is that SMON does not maintain data dictionary base tables such as SMON_SCN_TIME at fixed times; maintenance depends on both a certain frequency and the state (SCN) of the database. Only when the frequency and the database state both match does SMON start maintaining the base table. A DBA who looks only for DDL against the tables named in the EXP export log is therefore looking in the wrong place.
  • The Chuangjin Hexin database runs Release 11.2.0.4.0 Production. The Oracle error lookup utility, oerr, gives a first indication of the cause of the failure:
01466, 00000, "unable to read data - table definition has changed"

  • The oerr output confirms the guess we made from reading the EXP log files :)

SQL*Plus: Release 11.2.0.4.0 Production on Thu Mar 24 05:39:19 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn /as sysdba
Connected.
SQL> !oerr ora 1466
01466, 00000, "unable to read data - table definition has changed"
// *Cause: Query parsed after tbl (or index) change, and executed
// w/old snapshot
// *Action: commit (or rollback) transaction, and re-execute
  • A lesser-known role of the SMON (system monitor) background process is to maintain the SMON_SCN_TIME data dictionary base table.
  • The SMON_SCN_TIME base table records the mapping between SCNs (system change numbers) and timestamps over a past period. Because Oracle records this mapping by sampling, SMON_SCN_TIME can only locate the time of a given SCN roughly (imprecisely). The table itself is a cluster table.
  • The main use of the SMON_SCN_TIME mapping table is to let flashback-type queries map a time to an SCN.
  • The Metalink document <Error ORA-01466 while executing a flashback query. [ID 281510.1]> describes the rules for SMON to update SMON_SCN_TIME:
  1. In version 10g, SMON_SCN_TIME is updated every 6 seconds (In Oracle Database 10g, smon_scn_time is updated every 6 seconds hence that is the minimum time that the flashback query time needs to be behind the timestamp of the first change to the table.)
  2. In version 9.2, SMON_SCN_TIME is updated every 5 minutes (In Oracle Database 9.2, smon_scn_time is updated every 5 minutes hence the required delay between the flashback time and table properties change is at least 5 minutes.)
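  • Why a sampled mapping forces this delay can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the sample data and the function name are hypothetical, and Oracle's real lookup and the internal format of tim_scn_map are not modeled. Every time that falls between two samples resolves to the SCN of the earlier sample, so a change made inside the sampling interval cannot be told apart from the sample before it.

```python
import bisect


def scn_for_time(samples, when):
    """Return the SCN of the latest sample taken at or before `when`.

    `samples` is a list of (time_in_seconds, scn) pairs sorted by time.
    Returns None when `when` is older than every sample.
    """
    times = [t for t, _ in samples]
    index = bisect.bisect_right(times, when)
    if index == 0:
        return None
    return samples[index - 1][1]


# Hypothetical mapping sampled every 300 seconds (5 minutes, as in 9.2).
SAMPLES = [(0, 1000), (300, 1800), (600, 2500)]

if __name__ == "__main__":
    for when in (10, 299, 300, 601):
        print(when, scn_for_time(SAMPLES, when))
```

  • In this toy data, the times 10 and 299 both resolve to the same SCN even though they are almost five minutes apart.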
  • In addition, starting with 10g, SMON also purges records from SMON_SCN_TIME. The SMON background process wakes up every 5 minutes and checks the total number of mapping records in SMON_SCN_TIME on disk. If the total exceeds 144000, it deletes the oldest record (the one with the minimum time_mp) with the statement below. If deleting one record does not free enough space, SMON executes the DELETE statement repeatedly:
delete from smon_scn_time
where thread = 0
and time_mp = (select min(time_mp) from smon_scn_time where thread = 0)
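  • The purge loop described above can be tried out on a throwaway table. The sketch below uses an in-memory SQLite database as a stand-in for Oracle, with a simplified three-column table and a small row limit instead of 144000; the function names and the demo data are hypothetical. Only the DELETE statement is taken from the text.

```python
import sqlite3

# The DELETE statement quoted above, unchanged.
DELETE_OLDEST = """
delete from smon_scn_time
where thread = 0
and time_mp = (select min(time_mp) from smon_scn_time where thread = 0)
"""

COUNT_ROWS = "select count(*) from smon_scn_time where thread = 0"


def purge_oldest(conn, max_rows):
    """Delete the oldest mapping row repeatedly until at most max_rows remain."""
    deleted = 0
    while conn.execute(COUNT_ROWS).fetchone()[0] > max_rows:
        cursor = conn.execute(DELETE_OLDEST)
        if cursor.rowcount == 0:
            break  # nothing was deleted, so stop instead of looping forever
        deleted += cursor.rowcount
    return deleted


def demo():
    """Build a 10-row stand-in table and purge it down to 7 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table smon_scn_time (thread integer, time_mp integer, scn integer)"
    )
    conn.executemany(
        "insert into smon_scn_time values (0, ?, ?)",
        [(t, 1000 + t) for t in range(10)],
    )
    deleted = purge_oldest(conn, max_rows=7)
    remaining = [
        row[0]
        for row in conn.execute("select time_mp from smon_scn_time order by time_mp")
    ]
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

  • The guard on rowcount matters for the situation discussed later in this article: if a DELETE removes nothing, a loop without that guard would keep executing the same statement.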
  • What triggers SMON to maintain SMON_SCN_TIME
  • Although the Metalink document <Error ORA-01466 while executing a flashback query. [ID 281510.1]> states that in 10g SMON updates the SMON_SCN_TIME base table every 6 seconds, actual observation shows that the update frequency is related to the growth rate of the SCN. On a busy instance, where the SCN rises quickly, SMON may update at the minimum interval of 6 seconds; on an idle instance, where the SCN grows slowly, it updates only every 5 or 10 minutes. For example:
[oracle@datar ~]$ ps -ef|grep smon|grep -v grep
oracle 3922 1 0 Mar23 ? 00:00:09 ora_smon_ora11g
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
ORA11G
SQL> oradebug setospid 3922
Oracle pid: 13, Unix process pid: 3922, image: oracle@datar (SMON)
SQL> oradebug event 10500 trace name context forever,level 10 : 10046 trace name context forever,level 12;
Statement processed.
SQL> oradebug tracefile_name;
/home/oracle/product/diag/rdbms/ora11g/ora11g/trace/ora11g_smon_3922.trc
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
  • The ora11g_smon_3922.trc trace file records SMON inserting data into the SMON_SCN_TIME mapping table as follows, and from these records the update frequency can be determined:
PARSING IN CURSOR #140375269773744 len=141 dep=1 uid=0 oct=2 lid=0 tim=1458763578296250 hv=973751600 ad='a9899260' sqlid='9wncfacx0nj9h'
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas, num_mappings, tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)
END OF STMT
PARSE #140375269773744:c=999,e=847,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=1458763578296045
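  • The TIME_DP bind variable values of the INSERT statement above show the pattern in which SMON_SCN_TIME is updated: generally once every 5 or 10 minutes. This shows that the update frequency of SMON_SCN_TIME depends on the load of the database instance or, more precisely, on the rate at which SCNs are generated. The shortest interval is 6 seconds and the longest is 10 minutes. See the attached trace file for details.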
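  • Another way to estimate the interval is to compare the tim= values on the EXEC lines of the insert cursor. The sketch below assumes that the trace contains one EXEC line per execution and that tim is in microseconds; the trace excerpt is shortened and hypothetical (made-up cursor number and timings), and the function names are illustrative.

```python
import re

PARSING = re.compile(r"^PARSING IN CURSOR #(\d+) ")
EXEC = re.compile(r"^EXEC #(\d+):.*\btim=(\d+)")


def insert_exec_times(trace_text):
    """Collect the tim= value of every EXEC of 'insert into smon_scn_time'."""
    lines = trace_text.splitlines()
    insert_cursors = set()
    tims = []
    for i, line in enumerate(lines):
        parsing = PARSING.match(line)
        if parsing:
            statement = lines[i + 1] if i + 1 < len(lines) else ""
            cursor = parsing.group(1)
            if statement.lstrip().lower().startswith("insert into smon_scn_time"):
                insert_cursors.add(cursor)
            else:
                insert_cursors.discard(cursor)  # cursor number reused elsewhere
            continue
        executed = EXEC.match(line)
        if executed and executed.group(1) in insert_cursors:
            tims.append(int(executed.group(2)))
    return tims


def intervals_seconds(tims):
    """Convert consecutive tim values (assumed microseconds) to gaps in seconds."""
    return [(later - earlier) / 1_000_000 for earlier, later in zip(tims, tims[1:])]


# Hypothetical, shortened trace excerpt; not taken from the real trace file.
SAMPLE_TRACE = """\
PARSING IN CURSOR #1 len=141 dep=1 uid=0 oct=2 lid=0 tim=1000000 hv=1 ad='a' sqlid='x'
insert into smon_scn_time (thread, time_mp) values (0, :1)
END OF STMT
PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=1000000
EXEC #1:c=0,e=0,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=4,plh=0,tim=1000000
EXEC #1:c=0,e=0,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=4,plh=0,tim=301000000
EXEC #1:c=0,e=0,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=4,plh=0,tim=901000000
"""

if __name__ == "__main__":
    print(intervals_seconds(insert_exec_times(SAMPLE_TRACE)))
```

  • With the made-up timings above, the gaps come out as 300 and 600 seconds, matching the 5 and 10 minute pattern described in the text.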
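  • The call stack when SMON maintains SMON_SCN_TIME: ktf_scn_time is the main function that updates SMON_SCN_TIME. The SMON trace file from 11g release 11.2.0.4 contains no record of this function, but with a high-level 10046 trace event the calls of the handling functions do show up in the trace file.
  • According to My Oracle Support documents, there are classic cases of database errors caused by SMON_SCN_TIME updates. The following two come from official MOS notes:
  • The update frequency of SMON_SCN_TIME can cause ORA-01466 errors; see: Error ORA-01466 while executing a flashback query. [ID 281510.1]
  • Inconsistent SMON_SCN_TIME data can cause ORA-00600[6711] or frequent execution of the "delete from smon_scn_time" statement; see: an ORA-00600[6711] case, and High Executions Of Statement "delete from smon_scn_time…" [ID 375401.1]
  • SMON may also use the following SQL statements to maintain the SMON_SCN_TIME dictionary base table: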
select smontabv.cnt,
         smontab.time_mp,
         smontab.scn,
         smontab.num_mappings,
         smontab.tim_scn_map,
         smontab.orig_thread
from smon_scn_time smontab,
        (select max(scn) scnmax,
                  count(*) + sum(NVL2(TIM_SCN_MAP, NUM_MAPPINGS, 0)) cnt
                  from smon_scn_time
        where thread = 0) smontabv
where smontab.scn = smontabv.scnmax
   and thread = 0
insert into smon_scn_time (thread,
                                        time_mp,
                                        time_dp,
                                        scn,
                                        scn_wrp,
                                        scn_bas,
                                        num_mappings,
                                        tim_scn_map)
                              values (0, :1, :2, :3, :4, :5, :6, :7)
  • The statistics of the mapping table can be queried as follows:

 

update smon_scn_time
      set orig_thread = 0,
           time_mp = :1,
           time_dp = :2,
           scn = :3,
           scn_wrp = :4,
           scn_bas = :5,
           num_mappings = :6,
           tim_scn_map = :7
where thread = 0
   and scn = (select min(scn) from smon_scn_time where thread = 0)
delete from smon_scn_time
        where thread = 0
           and scn = (select min(scn) from smon_scn_time where thread = 0)
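  • The statements above carry an scn column next to scn_wrp and scn_bas. The sketch below shows how a wrap/base pair relates to a single number, assuming the conventional split in which the base is the low 32 bits and the wrap counts the overflows of the base. The function names are illustrative.

```python
BASE_BITS = 32
BASE_MASK = (1 << BASE_BITS) - 1


def compose_scn(scn_wrp, scn_bas):
    """Combine a wrap/base pair into one number: wrap * 2**32 + base."""
    if scn_wrp < 0 or not 0 <= scn_bas <= BASE_MASK:
        raise ValueError("scn_wrp must be >= 0 and scn_bas must fit in 32 bits")
    return (scn_wrp << BASE_BITS) | scn_bas


def split_scn(scn):
    """Split one number back into its (scn_wrp, scn_bas) pair."""
    if scn < 0:
        raise ValueError("scn must be >= 0")
    return scn >> BASE_BITS, scn & BASE_MASK


if __name__ == "__main__":
    scn = compose_scn(1, 5)
    print(scn, split_scn(scn))
```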
  • How to stop SMON from updating the SMON_SCN_TIME base table
  • Setting the diagnostic event events='12500 trace name context forever, level 10' stops SMON from updating the SMON_SCN_TIME base table (Setting the 12500 event at system level should stop SMON from updating the SMON_SCN_TIME table.):
SQL> alter system set events '12500 trace name context forever,level 10';
System altered.
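  • We generally do not recommend stopping SMON from updating SMON_SCN_TIME, because doing so affects the normal use of Flashback Query. In some abnormal recovery scenarios, however, corrupted SMON_SCN_TIME data can crash the instance, and the 12500 event above can then be used to keep SMON_SCN_TIME from being updated.
  • How to clear SMON_SCN_TIME data manually
  • Because SMON_SCN_TIME is not a bootstrap core object, a DBA can manually update the data in the table and rebuild its indexes. Note in particular that inconsistency between the data in the SMON_SCN_TIME table and the data in its indexes can have fairly serious consequences, so index consistency is a key thing to check. One example is a DELETE statement that cannot remove records from the table; this can be resolved by recreating the indexes: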
SQL> drop index smon_scn_time_bak_tim_idx;
Index dropped.
SQL> drop index smon_scn_time_bak_scn__idx;
Index dropped.
SQL> create unique index smon_scn_time_bak_tim_idx on smon_scn_time_bak(time_mp);
Index created.
SQL> create unique index smon_scn_time_bak_scn_idx on smon_scn_time_bak(scn);
Index created.
SQL> analyze table smon_scn_time_bak validate structure cascade;
Table analyzed.
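  • After the 12500 event has been set, the records in SMON_SCN_TIME can be deleted manually; after the instance is restarted, SMON resumes updating SMON_SCN_TIME normally. There is no need to clear SMON_SCN_TIME manually unless inconsistency between the records in the table and those in the indexes smon_scn_time_tim_idx or smon_scn_time_scn_idx prevents the DELETE statement from removing records, a problem described in the document <LOCK ON SYS.SMON_SCN_TIME [ID 747745.1]>.
  • The specific steps are as follows: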
-- set Oracle trace event 12500
SQL> alter system set events '12500 trace name context forever,level 10';
-- delete records in smon_scn_time
SQL> delete from smon_scn_time;
SQL> alter system set events '12500 trace name context forever off';
SQL> commit;
SQL> shutdown immediate;
SQL> startup;

 

 
