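Recovery has two phases, restore and recover. Restore copies files back from a backup; recover applies redo to those files to roll them forward to the moment just before the failure. RMAN automates both.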
PERFORM COMPLETE AND INCOMPLETE RECOVERY
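Archivelog mode must be enabled.
Introduction to Restore and Recovery
Complete recovery loses no data; incomplete recovery does. Most recoveries are complete recoveries.
Complete recovery takes four steps:
- Take the damaged files offline
- Restore
- Recover
- Bring the files back online
Incomplete recovery also takes four steps:
- Mount the database
- Restore all data files
- Recover the database to a point in time
- Open the database with the RESETLOGS option
During complete recovery the database can usually stay open. During incomplete recovery the database must be mounted, and opening it with RESETLOGS re-creates the online redo logs.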
Using the RMAN RESTORE and RECOVER Commands
The general procedure is:
SQL> shutdown immediate;
SQL> startup mount;
$ rman target / catalog rman@rcat
RMAN> restore database;
RMAN> recover database;
RMAN> sql 'alter database open';
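Non-critical data files can be recovered online: the database stays open and only the affected tablespace is offline.
Online Recovery of Non-Critical Data Files
Critical data files are those that belong to the SYSTEM and UNDO tablespaces; all other data files are non-critical.
The database can stay open while a non-critical data file is recovered.
The procedure is:
- Take the tablespace that contains the non-critical data file offline
- Restore the data file with RMAN RESTORE
- Recover the data file with RMAN RECOVER
- Bring the tablespace back online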
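The four steps above can be sketched as a small shell helper that prints the matching RMAN commands. This is a minimal sketch under stated assumptions rather than a tested procedure: the function name `gen_tablespace_recovery` is hypothetical, and it assumes RMAN 12c or later (which accepts SQL statements such as `alter tablespace` directly) connected to the PDB that owns the tablespace.

```shell
#!/bin/sh
# Print an RMAN command file that recovers one non-critical tablespace
# while the database stays open.
# Assumption: RMAN is connected directly to the PDB that owns the tablespace.
gen_tablespace_recovery() {
  ts="$1"   # tablespace name, for example users
  cat <<EOF
alter tablespace ${ts} offline immediate;
restore tablespace ${ts};
recover tablespace ${ts};
alter tablespace ${ts} online;
EOF
}

# Preview the commands; nothing is sent to RMAN here.
gen_tablespace_recovery users
```

To run the commands, redirect the output to a file such as recover_users.rman and pass it to RMAN, for example `rman target sys@orclpdb1 @recover_users.rman`. The file name and connect string are only illustrations.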
SQL> select table_name,tablespace_name from all_tables where table_name = 'TEST';
TABLE_NAME TABLESPACE_NAME
---------------- ------------------------------
TEST USERS
SQL> select file_name, tablespace_name from dba_data_files where tablespace_name ='USERS';
FILE_NAME TABLESPACE_NAME
------------------------------------------------------------ ------------------------------
/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf USERS
SQL> select * from appuser1.test;
A
----------
1
SQL> ! rm /opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf
Connecting at this point fails:
$ rlwrap sqlplus appuser1/oracle@orclpdb1
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 9 20:06:19 2019
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
ERROR:
ORA-01109: database not open
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 12 - see DBWR trace file
ORA-01110: data file 12: '/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf'
Recovery procedure:
$ rman target / catalog rcat_owner/Welcome1@rcat
RMAN> restore pluggable database orclpdb1;
RMAN> recover pluggable database orclpdb1;
Verify that the recovery succeeded:
SQL> alter database open;
Database altered.
SQL> select * from appuser1.test;
A
----------
1
SQL> show con_name;
CON_NAME
------------------------------
ORCLPDB1
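Complete Recovery of Critical Data Files
The procedure is:
- Shut down the database with SHUTDOWN ABORT
- STARTUP MOUNT
- RMAN RESTORE
- RMAN RECOVER
- ALTER DATABASE OPEN
Incomplete Recovery Using RMAN
Incomplete recovery, also called point-in-time recovery, is mainly used to undo logical errors. A restore point can serve as the target of an incomplete recovery or of a Flashback operation.
Creating Restore Points
A restore point can be tied to a specific time or SCN. If neither is given, it defaults to the current SCN.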
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
4143033
Create a restore point:
create restore point good_for_now;
create restore point good_for_now as of scn 4143000;
How long a normal restore point is retained is governed by the CONTROL_FILE_RECORD_KEEP_TIME parameter:
SQL> show parameter CONTROL_FILE_RECORD_KEEP_TIME
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time integer 30
To keep a restore point until it is explicitly dropped, use PRESERVE:
SQL> create restore point good_for_now preserve;
Restore point created.
SQL> drop restore point good_for_now;
Restore point dropped.
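Performing Server-Managed Incomplete Recovery
The procedure is:
- Determine the point in time to recover to
- For time-based recovery, set the NLS environment variables at the OS level
- Shut down the database and start it in MOUNT state
- In an RMAN RUN block, run SET UNTIL, RESTORE, and RECOVER
- (Optional) Open the database read-only and verify the recovery point
- Open the database with RESETLOGS
The NLS environment variables must be set correctly so that RMAN parses the time format as intended, for example:
export NLS_LANG=american_america.us7ascii
export NLS_DATE_FORMAT="Mon DD YYYY HH24:MI:SS"
Example: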
SQL> create restore point before_disaster;
Restore point created.
SQL> drop table hr.job_history;
Table dropped.
SQL> shutdown immediate
Pluggable Database closed.
Because this is a PDB, shutting it down leaves it in MOUNTED state:
SQL> show pdbs;
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
3 ORCLPDB1 MOUNTED
Recover:
$ rman target / catalog rcat_owner/Welcome1@rcat
run {
set until restore point before_disaster;
restore pluggable database orclpdb1;
recover pluggable database orclpdb1;
}
Open the database:
SQL> alter pluggable database orclpdb1 open resetlogs;
Pluggable database altered.
Confirm that the table is back:
SQL> alter session set container=orclpdb1;
Session altered.
SQL> select count(*) from hr.job_history;
COUNT(*)
----------
10
After recovery to a point in time, every change made after that point is lost.
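The restore-point example above has a time-based counterpart, sketched here as a shell helper that prints the RMAN commands. It is an illustration under stated assumptions: the function name `gen_pdb_pitr` and the timestamp are hypothetical, the PDB is assumed to be closed already, and the backups and archived logs are assumed to cover the target time. Writing the format mask explicitly in `to_date` is meant to keep the command independent of the NLS_DATE_FORMAT setting.

```shell
#!/bin/sh
# Print RMAN commands for a time-based point-in-time recovery of one PDB.
# Assumptions: the PDB is already closed; the timestamp uses the
# YYYY-MM-DD HH24:MI:SS mask that is spelled out in the to_date call.
gen_pdb_pitr() {
  pdb="$1"      # PDB name, for example orclpdb1
  target="$2"   # target time, for example '2019-12-09 20:00:00'
  cat <<EOF
run {
  set until time "to_date('${target}', 'YYYY-MM-DD HH24:MI:SS')";
  restore pluggable database ${pdb};
  recover pluggable database ${pdb};
}
alter pluggable database ${pdb} open resetlogs;
EOF
}

# Preview only; the timestamp is a made-up example.
gen_pdb_pitr orclpdb1 '2019-12-09 20:00:00'
```

As with the restore-point variant, everything changed in the PDB after the target time is discarded once it is opened with RESETLOGS.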
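Recovering with Incrementally Updated Backups
An image copy keeps the data file's native format, so no conversion is needed at restore time, which speeds up recovery. Image copies can also be rolled forward with incremental backups, which shortens recovery further.
Recovering Image Copies
After an image copy has been updated from incremental backups, recovery only needs to apply the archived and online redo generated since the last incremental backup.
Implementing an Image Copy Strategy
Example strategy: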
run {
recover copy of database with tag 'inc_upd';
backup incremental level 1 for recovery of copy with tag 'inc_upd' database;
}
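The first run creates the level 0 backup, which is the image copy itself.
The second run creates a level 1 incremental backup.
From the third run on, the image copy is first updated with the previously created incremental backup, and then a new level 1 incremental is taken.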
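Using Image Copies for Fast Recovery
An image copy can be used in place, so there is no RESTORE step, only RECOVER.
In RMAN this can be done with SET NEWNAME.
Fast Switch to Image Copies
If you have an image copy plus all archived and online redo generated since it was taken, you can recover from it.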
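The procedure is:
- Take the damaged data file offline; V$RECOVER_FILE, V$DATAFILE_HEADER, or V$TABLESPACE can be queried to determine which files need recovery
- Use RMAN SWITCH . . . TO COPY to point the database at the image copy of the damaged data file
- Recover the data file
- Bring the data file online
RMAN SWITCH is equivalent to the SQL statement ALTER DATABASE RENAME FILE.
Example. First, create an image copy: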
RMAN> backup as copy database;
Delete the data file of one tablespace at the operating system level:
rm /opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf
Identify the damaged file:
SQL> show con_name;
CON_NAME
------------------------------
ORCLPDB1
SQL> select file#, status, error, recover, tablespace_name, name from v$datafile_header
where recover = 'YES'
or (recover is null and error is not null);
FILE# STATUS ERROR REC TABLESPACE_NAME NAME
---------- ------- -------------------------------- --- ------------------------------ --------------------
12 ONLINE CANNOT OPEN FILE
SQL> select file_name, file_id, tablespace_name from dba_data_files where file_id = 12;
FILE_NAME FILE_ID TABLESPACE_NAME
---------------------------------------------------------------- ---------- ------------------------------
/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf 12 USERS
Take the data file offline:
SQL> alter database datafile 12 offline;
Database altered.
Recover:
$ rman target / catalog rcat_owner/Welcome1@rcat
Recovery Manager: Release 19.0.0.0.0 - Production on Mon Dec 9 21:28:09 2019
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCLCDB (DBID=2795391422)
connected to recovery catalog database
RMAN> switch datafile 12 to copy;
datafile 12 switched to datafile copy "/u02/fra/backups/ORCLCDB_20191209_26uj1642"
starting full resync of recovery catalog
full resync complete
RMAN> recover datafile 12;
Starting recover at 09-DEC-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=49 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=60 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=280 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=68 device type=DISK
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 09-DEC-19
Bring the data file online:
SQL> alter database datafile 12 online;
Database altered.
Recovery is now complete, but the data file has moved to a new location:
SQL> select file_name, file_id, tablespace_name from dba_data_files where file_id=12;
FILE_NAME FILE_ID TABLESPACE_NAME
---------------------------------------------------------------- ---------- ------------------------------
/u02/fra/backups/ORCLCDB_20191209_26uj1642 12 USERS
To move it back to its original location, first create an image copy of it there:
$ ls /opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf
ls: cannot access /opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf: No such file or directory
$ rman target / catalog rcat_owner/Welcome1@rcat
RMAN> backup as copy datafile 12 format '/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf';
Take the file offline:
$ rman target=sys/Welcome1@orclpdb1 catalog rcat_owner/Welcome1@rcat
RMAN> alter database datafile 12 offline;
Statement processed
Switch to the copy in the original location and recover it:
$ rman target / catalog rcat_owner/Welcome1@rcat
Recovery Manager: Release 19.0.0.0.0 - Production on Mon Dec 9 21:45:00 2019
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCLCDB (DBID=2795391422)
connected to recovery catalog database
RMAN> switch datafile 12 to copy;
datafile 12 switched to datafile copy "/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf"
starting full resync of recovery catalog
full resync complete
RMAN> recover datafile 12;
Starting recover at 09-DEC-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=60 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=49 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=280 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=68 device type=DISK
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 09-DEC-19
Bring the data file online:
$ rman target=sys/Welcome1@orclpdb1 catalog rcat_owner/Welcome1@rcat
RMAN> alter database datafile 12 online;
Statement processed
Verify that the file is back in its original location:
SQL> select file_name, file_id, tablespace_name from dba_data_files where file_id=12;
FILE_NAME FILE_ID TABLESPACE_NAME
---------------------------------------------------------------- ---------- ------------------------------
/opt/oracle/oradata/ORCLCDB/ORCLPDB1/users01.dbf 12 USERS
Finally, take a fresh image copy so that the next recovery can switch to it:
RMAN> backup as copy datafile 12;
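Recovering SPFILEs, Control Files, and Online Redo Logs
An RMAN control file autobackup contains both the SPFILE and the control file. The control file is normally multiplexed across several locations, and the recovery catalog holds a copy of the metadata stored in the control file. Online redo logs are protected by multiplexing; RMAN does not back them up.
Restoring the SPFILE from Autobackup
The steps are:
- set dbid ########
- startup force nomount
- restore spfile from autobackup;
- startup force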
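The steps above can be collected into an RMAN command file. The helper below is a minimal sketch that only prints the commands; the function name `gen_spfile_restore` is hypothetical, and the DBID in the example call is the one shown in the RMAN connection banners elsewhere in these notes. It assumes the autobackup sits in a location RMAN searches by default, such as the fast recovery area.

```shell
#!/bin/sh
# Print RMAN commands that restore the SPFILE from a control file autobackup.
# Assumption: the autobackup is in a default location RMAN can search.
gen_spfile_restore() {
  dbid="$1"   # DBID of the target database
  cat <<EOF
set dbid ${dbid};
startup force nomount;
restore spfile from autobackup;
startup force;
EOF
}

# Preview only; nothing is sent to RMAN here.
gen_spfile_restore 2795391422
```

The final `startup force` restarts the instance so that it picks up the restored SPFILE.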
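Restoring the Control File
If the control file is multiplexed, recovery simply means copying one of the surviving copies over the lost one. If all copies are lost, the procedure is:
- startup nomount
- restore
- mount database
- recover
- open resetlogs
In step 1, the instance can only be started in NOMOUNT because there is no control file.
In step 2, because the database cannot be mounted, RMAN does not know which backup contains the control file. There are three ways around this: restore from autobackup, restore using the recovery catalog, or restore from a named backup piece (if you saved the output of the backup job).
In step 3, after mounting, the database must still be recovered, because the restored control file is not current. Step 5 is needed for the same reason.
Example: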
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup force mount
ORACLE instance started.
Total System Global Area 1207955552 bytes
Fixed Size 9134176 bytes
Variable Size 452984832 bytes
Database Buffers 738197504 bytes
Redo Buffers 7639040 bytes
ORA-00205: error in identifying control file, check alert log for more info
$ rman target / catalog rcat_owner/Welcome1@rcat
Recovery Manager: Release 19.0.0.0.0 - Production on Tue Dec 10 22:55:56 2019
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCLCDB (not mounted)
connected to recovery catalog database
RMAN> restore controlfile from autobackup;
Starting restore at 10-DEC-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=255 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=256 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=22 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=257 device type=DISK
recovery area destination: /u02/fra
database name (or database unique name) used for search: ORCLCDB
channel ORA_DISK_1: AUTOBACKUP /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp found in the recovery area
channel ORA_DISK_1: looking for AUTOBACKUP on day: 20191210
recovery area destination: /u02/fra
database name (or database unique name) used for search: ORCLCDB
channel ORA_DISK_2: AUTOBACKUP /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp found in the recovery area
channel ORA_DISK_2: looking for AUTOBACKUP on day: 20191209
recovery area destination: /u02/fra
database name (or database unique name) used for search: ORCLCDB
channel ORA_DISK_3: AUTOBACKUP /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp found in the recovery area
recovery area destination: /u02/fra
database name (or database unique name) used for search: ORCLCDB
channel ORA_DISK_4: AUTOBACKUP /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp found in the recovery area
channel ORA_DISK_4: skipped, AUTOBACKUP already found
channel ORA_DISK_2: skipped, AUTOBACKUP already found
channel ORA_DISK_1: skipped, AUTOBACKUP already found
channel ORA_DISK_3: restoring control file from AUTOBACKUP /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp
channel ORA_DISK_3: control file restore from AUTOBACKUP complete
output file name=/opt/oracle/oradata/ORCLCDB/control01.ctl
output file name=/opt/oracle/oradata/ORCLCDB/control02.ctl
Finished restore at 10-DEC-19
RMAN> alter database mount
2> ;
released channel: ORA_DISK_1
released channel: ORA_DISK_2
released channel: ORA_DISK_3
released channel: ORA_DISK_4
Statement processed
RMAN> recover database;
Starting recover at 10-DEC-19
Starting implicit crosscheck backup at 10-DEC-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=259 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=25 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=260 device type=DISK
Crosschecked 14 objects
Crosschecked 13 objects
Crosschecked 15 objects
Crosschecked 15 objects
Finished implicit crosscheck backup at 10-DEC-19
Starting implicit crosscheck copy at 10-DEC-19
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
Crosschecked 2 objects
Crosschecked 2 objects
Crosschecked 3 objects
Crosschecked 3 objects
Finished implicit crosscheck copy at 10-DEC-19
searching for all files in the recovery area
cataloging files...
cataloging done
List of Cataloged Files
=======================
File Name: /u02/fra/ORCLCDB/archivelog/2019_12_10/o1_mf_1_32_gyz40zvk_.arc
File Name: /u02/fra/ORCLCDB/autobackup/2019_12_09/o1_mf_s_1026597056_gywnb0wk_.bkp
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4
starting media recovery
archived log for thread 1 with sequence 32 is already on disk as file /u02/fra/ORCLCDB/archivelog/2019_12_10/o1_mf_1_32_gyz40zvk_.arc
archived log for thread 1 with sequence 33 is already on disk as file /opt/oracle/oradata/ORCLCDB/redo03.log
archived log file name=/u02/fra/ORCLCDB/archivelog/2019_12_10/o1_mf_1_32_gyz40zvk_.arc thread=1 sequence=32
archived log file name=/opt/oracle/oradata/ORCLCDB/redo03.log thread=1 sequence=33
media recovery complete, elapsed time: 00:00:01
Finished recover at 10-DEC-19
## This step takes a long time
RMAN> alter database open resetlogs;
Statement processed
new incarnation of database registered in recovery catalog
starting full resync of recovery catalog
full resync complete
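Recovering from a Lost Redo Log Group
Whether losing a log group or a log group member costs any data depends on whether the whole group or only some members are lost, and on the current status of the group.
The status of each log group can be queried from v$log. The possible values are:
- CURRENT: the group is being written to and is needed for instance recovery
- ACTIVE: the group is not being written to but is still needed for instance recovery
- INACTIVE: the group is not needed for instance recovery; it may or may not have been archived
- UNUSED: the group has never been written to
- CLEARING: the group is being cleared by alter database clear logfile; once that finishes the status becomes UNUSED
- CLEARING_CURRENT: an error occurred while alter database clear logfile was clearing the group
The first three are the ones you will see most often.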
SQL> select group#, sequence#, archived, status from v$log;
GROUP# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 1 YES INACTIVE
2 2 YES INACTIVE
3 3 NO CURRENT
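Note that the two INACTIVE groups above have already been archived.
Recovering from the Failure of a Log Group Member
If one member of a log group fails, LGWR keeps writing to the remaining members and no data is lost, but the failed member should be repaired promptly.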
SQL> select group#, status, member from v$logfile;
GROUP# STATUS MEMBER
---------- ------- ----------------------------------------------------------------
3 /opt/oracle/oradata/ORCLCDB/redo03.log
2 /opt/oracle/oradata/ORCLCDB/redo02.log
1 /opt/oracle/oradata/ORCLCDB/redo01.log
1 /opt/oracle/oradata/ORCLCDB/redo01_2.log
2 /opt/oracle/oradata/ORCLCDB/redo02_2.log
3 /opt/oracle/oradata/ORCLCDB/redo03_2.log
6 rows selected.
SQL> !rm /opt/oracle/oradata/ORCLCDB/redo02_2.log
If the member's status is INVALID, drop it and add it back:
alter database drop logfile member '/opt/oracle/oradata/ORCLCDB/redo02_2.log';
alter database add logfile member '/opt/oracle/oradata/ORCLCDB/redo02_2.log' to group 2;
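Recovering from the Loss of an Entire INACTIVE Log Group
An INACTIVE log group is not needed for instance recovery, so losing the whole group has little impact.
It can be re-created with alter database clear logfile.
The exact command depends on whether the group has been archived, which the ARCHIVED column of v$log shows.
If the group has been archived, no data is lost. Clearing it re-creates the log files:
alter database clear logfile group 1;
If the group has not been archived, no committed transactions are lost, but a full backup must be taken right after the clear. Otherwise the gap in the archived log sequence limits any later recovery from an older backup to incomplete recovery. The command is:
alter database clear unarchived logfile group 1;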
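Recovering from the Loss of an ACTIVE Log Group
ACTIVE means the group is not currently being written to but is still needed for instance recovery.
First run alter system checkpoint to flush dirty buffers to the data files, then clear the log group as in the examples above.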
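The choice of statement for a lost INACTIVE or ACTIVE group follows directly from the STATUS and ARCHIVED columns of v$log, which the following sketch makes explicit. The function name `gen_clear_logfile` is hypothetical; it only prints SQL and never connects to a database.

```shell
#!/bin/sh
# Print the SQL needed to clear a lost redo log group, based on the
# STATUS and ARCHIVED values reported by v$log for that group.
gen_clear_logfile() {
  group="$1"      # GROUP# from v$log
  status="$2"     # STATUS from v$log: INACTIVE, ACTIVE, CURRENT, ...
  archived="$3"   # ARCHIVED from v$log: YES or NO
  case "$status" in
    CURRENT)
      # A lost CURRENT group cannot be cleared; it needs incomplete recovery.
      echo "group ${group} is CURRENT, clearing does not apply" >&2
      return 1
      ;;
    ACTIVE)
      # Checkpoint first so the group is no longer needed for instance recovery.
      echo "alter system checkpoint;"
      ;;
  esac
  if [ "$archived" = "YES" ]; then
    echo "alter database clear logfile group ${group};"
  else
    echo "alter database clear unarchived logfile group ${group};"
  fi
}

# Preview the statements for two sample situations.
gen_clear_logfile 1 INACTIVE YES
gen_clear_logfile 2 ACTIVE NO
```

Whenever the unarchived form is printed, the full backup described above still has to follow the clear.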
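Recovering from the Loss of the CURRENT Redo Log Group
This is the only case in which data is lost. The instance crashes, and only incomplete recovery is possible.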
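One way to express that incomplete recovery in RMAN is SET UNTIL SEQUENCE, which stops recovery just before the named log sequence. The helper below is a hedged sketch: the function name `gen_until_sequence_recovery` is hypothetical, the database is assumed to be mounted, and the sequence number passed in should be the SEQUENCE# of the lost CURRENT group as reported by v$log (3 in the listing earlier in this section).

```shell
#!/bin/sh
# Print RMAN commands for incomplete recovery that stops just before
# the given log sequence number.
# Assumption: the database is mounted and backups plus archived logs
# up to the previous sequence are available.
gen_until_sequence_recovery() {
  seq="$1"            # sequence# of the lost CURRENT log group
  thread="${2:-1}"    # redo thread, 1 for a single-instance database
  cat <<EOF
run {
  set until sequence ${seq} thread ${thread};
  restore database;
  recover database;
}
alter database open resetlogs;
EOF
}

# Preview only; 3 is the CURRENT sequence from the sample v\$log output.
gen_until_sequence_recovery 3
```

Any transactions whose redo existed only in the lost group are gone after the database is opened with RESETLOGS.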
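Backing Up and Recovering the Password File
The password file lives in ORACLE_HOME/dbs and cannot be managed by RMAN; it has to be backed up with operating system commands, for example as part of a backup of the Oracle home directory. If the password file is stored in ASM, however, it must be copied to the file system with ASMCMD. A damaged password file can simply be copied back from backup or re-created with orapwd.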
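An operating-system-level backup of the password file can be as small as the following sketch. The function name `backup_pwfile` and the backup directory are hypothetical, and the script assumes the default Linux/UNIX file name orapw followed by the instance name; it does not cover password files stored in ASM.

```shell
#!/bin/sh
# Copy the password file to a backup directory with a date suffix.
# Assumption: default Linux/UNIX naming, <home>/dbs/orapw<SID>.
backup_pwfile() {
  home="$1"   # Oracle home directory
  sid="$2"    # instance name (ORACLE_SID)
  dest="$3"   # backup directory, created if missing
  src="${home}/dbs/orapw${sid}"
  if [ ! -f "$src" ]; then
    echo "password file not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  # -p keeps the original ownership, mode, and timestamps where possible.
  cp -p "$src" "${dest}/orapw${sid}.$(date +%Y%m%d)"
}

# Example call (the backup path is hypothetical):
#   backup_pwfile "$ORACLE_HOME" "$ORACLE_SID" /u02/backups/pwfile
```

Restoring is the reverse copy back to the dbs directory under the original file name.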
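Recovering from a Lost Temp File
Temp files are also data files, but they belong to temporary tablespaces, so losing one has little impact and it can be replaced online.
Losing a Temp File
A lost temp file can be re-created with alter tablespace.
Example:
-- Show the location of the existing temp file
select file#, name from v$tempfile;
-- Add a new temp file
alter tablespace temp add tempfile '/u01/app/oracle/oradata/orclpdb1/temp02.dbf' size 25m;
-- Drop the original temp file
alter tablespace temp drop tempfile '/u01/app/oracle/oradata/orclpdb1/temp01.dbf';
select file#, name from v$tempfile;
Starting Up Without Temp Files
The database re-creates missing temp files in their original location at startup. If that location is unavailable, use the procedure described above.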
References
- https://oracle-base.com/articles/12c/multitenant-rman-backup-recovery-cdb-and-pdb-12cr1#pdb-backup