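Recovering from a Lost Current Redo Log in Oracle with the RMAN Advisor
This article shows how to use the RMAN Advisor (Data Recovery Advisor) to recover from the loss of the current online redo log.
Environment: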
DB:Oracle 11.2.0.4.0
OS:Red Hat Enterprise Linux Server release 7.5 (Maipo)
Preparing the environment:
Check the archive log mode
SQL> archive log list;
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 12
Current log sequence 14
Enable archiving
[root@cjc-db-01 ~]# mkdir /arch
[root@cjc-db-01 ~]# chown oracle:oinstall /arch
sqlplus / as sysdba
alter system set log_archive_dest_1='location=/arch';
alter system set log_archive_format = "cjc_%t_%s_%r.arc" scope=spfile;
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;
archive log list;
alter system switch logfile;
Create test data
SQL> conn cjc/***
SQL>
create table t1(id number,time varchar2(100));
insert into t1 values (1, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
insert into t1 values (2, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
insert into t1 values (3, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
commit;
col time for a25
select * from t1;
ID TIME
---------- -------------------------
1 2024-03-28 09:24:31
2 2024-03-28 09:24:39
3 2024-03-28 09:24:45
Take a full RMAN backup
rman target /
run
{
allocate channel c1 type disk;
allocate channel c2 type disk;
backup incremental level = 0 format '/back/rman/rman_level0_%d_%T_%U.bak' database;
sql 'alter system archive log current';
backup archivelog all format '/back/rman/rman_arch_%d_%T_%U.bak';
backup current controlfile format '/back/rman/rman_con_%d_%T_%U.bak';
release channel c1;
release channel c2;
}
Insert new rows after the backup
insert into t1 values (4, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
insert into t1 values (5, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
commit;
Check the redo log groups
SQL> col member for a50
SQL> select a.GROUP#,a.STATUS,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
---------- ---------------- --------------------------------------------------
3 CURRENT /oracle/app/oracle/oradata/cjc/redo03.log
2 INACTIVE /oracle/app/oracle/oradata/cjc/redo02.log
1 INACTIVE /oracle/app/oracle/oradata/cjc/redo01.log
Rename the current redo log file to simulate the failure
[oracle@cjc-db-01 cjc]$ cd /oracle/app/oracle/oradata/cjc
[oracle@cjc-db-01 cjc]$ mv redo03.log redo03.log.1
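Continue inserting rows; transactions still commit normally, because the running instance keeps the renamed file open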
insert into t1 values (6, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
insert into t1 values (7, to_char(sysdate, 'yyyy-mm-dd hh24:mi:ss'));
commit;
Kill the pmon process manually to simulate an instance crash
[oracle@cjc-db-01 cjc]$ ps -ef|grep pmon
oracle 17623 1 0 12:27 ? 00:00:00 ora_pmon_cjc
oracle 18287 11168 0 12:35 pts/3 00:00:00 grep --color=auto pmon
[oracle@cjc-db-01 cjc]$ kill -9 17623
Restart the database; startup fails
SQL> startup
ORACLE instance started.
Total System Global Area 2071076864 bytes
Fixed Size 2254784 bytes
Variable Size 536873024 bytes
Database Buffers 1526726656 bytes
Redo Buffers 5222400 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/oracle/app/oracle/oradata/cjc/redo03.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Confirm that the missing file belongs to the current redo log group
SQL> col member for a50
SQL> select a.GROUP#,a.STATUS,b.member from v$log a,v$logfile b where a.group#=b.group#;
GROUP# STATUS MEMBER
---------- ---------------- --------------------------------------------------
1 INACTIVE /oracle/app/oracle/oradata/cjc/redo01.log
3 CURRENT /oracle/app/oracle/oradata/cjc/redo03.log
2 INACTIVE /oracle/app/oracle/oradata/cjc/redo02.log
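Once the member paths are known, a quick file-system check shows which of them are actually gone. The following is a minimal sketch, not part of the original procedure: the function name missing_redo_members is hypothetical, and the paths are passed in by hand rather than read from v$logfile.

```shell
#!/bin/sh
# Hypothetical helper: print every path in the argument list that does not
# exist as a regular file. The member paths are assumed to have been copied
# from the v$logfile query output.
missing_redo_members() {
    for member in "$@"; do
        if [ ! -f "$member" ]; then
            printf '%s\n' "$member"
        fi
    done
}

# Example call with the three members from this article:
# missing_redo_members \
#     /oracle/app/oracle/oradata/cjc/redo01.log \
#     /oracle/app/oracle/oradata/cjc/redo02.log \
#     /oracle/app/oracle/oradata/cjc/redo03.log
```

In the scenario described here, only the redo03.log path would be expected in the output.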
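Group 3 is lost, so in theory the database can be recovered at most up to SCN 1187045, which is the NEXT_CHANGE# of group 2 (and the FIRST_CHANGE# of group 3)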
SQL> select GROUP#,SEQUENCE#,FIRST_CHANGE#,NEXT_CHANGE# from v$log order by 1;
GROUP# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 7 1186324 1187037
2 8 1187037 1187045
3 9 1187045 2.8147E+14
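The reasoning above can be sketched as a small filter over the v$log rows: the group with the highest SEQUENCE# is the current one, and its FIRST_CHANGE# is the recovery limit. This is only an illustration, assuming a single redo thread (non-RAC) and input laid out as in the query output above; the function name recovery_limit_scn is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "GROUP# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#" rows
# on stdin and print the FIRST_CHANGE# of the row with the highest SEQUENCE#.
# Header and separator lines are skipped because their second field is not
# a plain integer. Assumes a single redo thread.
recovery_limit_scn() {
    awk 'NF >= 4 && $2 ~ /^[0-9]+$/ && ($2 + 0) > max { max = $2 + 0; scn = $3 }
         END { if (scn != "") print scn }'
}

# Example: paste the three data rows from the query output.
# printf '%s\n' '1 7 1186324 1187037' \
#               '2 8 1187037 1187045' \
#               '3 9 1187045 2.8147E+14' | recovery_limit_scn
```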
Recover with the RMAN Advisor
List the failures; the advisor detects that redo log group 3 is missing
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
1483 CRITICAL OPEN 28-MAR-24 Redo log group 3 is unavailable
1486 HIGH OPEN 28-MAR-24 Redo log file /oracle/app/oracle/oradata/cjc/redo03.log is missing
Ask for advice, which generates a repair script
RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
1483 CRITICAL OPEN 28-MAR-24 Redo log group 3 is unavailable
1486 HIGH OPEN 28-MAR-24 Redo log file /oracle/app/oracle/oradata/cjc/redo03.log is missing
analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
1. If file /oracle/app/oracle/oradata/cjc/redo03.log was unintentionally renamed or moved, restore it
Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Perform incomplete database recovery to SCN 1187045
Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /oracle/app/oracle/diag/rdbms/cjc/cjc/hm/reco_3624337172.hm
Inspect the repair script: it performs an incomplete recovery to SCN 1187045, matching the value found earlier
[oracle@cjc-db-01 cjc]$ cat /oracle/app/oracle/diag/rdbms/cjc/cjc/hm/reco_3624337172.hm
# database point-in-time recovery
restore database until scn 1187045;
recover database until scn 1187045;
alter database open resetlogs;
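Before running a generated script, it can be worth checking mechanically that every "until scn" clause in it names the SCN you expect. A minimal sketch follows, assuming GNU grep (for the -o option) and a script laid out like the one above; the function name repair_script_scns is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct SCN values that appear in
# "until scn N" clauses of the repair script given as the first argument.
repair_script_scns() {
    grep -io 'until scn [0-9][0-9]*' "$1" | awk '{ print $3 }' | sort -u
}

# Example with the script path reported by "advise failure":
# repair_script_scns /oracle/app/oracle/diag/rdbms/cjc/cjc/hm/reco_3624337172.hm
```

A single line of output means the restore and recover steps agree on the target SCN.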
Run the recovery as advised
You can run the commands manually, or run repair failure to have RMAN execute the script
RMAN> restore database until scn 1187045;
RMAN> recover database until scn 1187045;
RMAN> alter database open resetlogs;
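For the manual route, the three commands can also be written to a command file and handed to RMAN in one go. The sketch below only generates the text; the function name pitr_commands and the /tmp file names are assumptions for illustration, and the SCN has to come from your own v$log or advise failure output.

```shell
#!/bin/sh
# Hypothetical helper: emit the three point-in-time recovery commands for the
# SCN given as the first argument. Anything other than a plain non-empty
# string of digits is rejected, so a typo cannot end up in the RMAN script.
pitr_commands() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: pitr_commands <scn>" >&2; return 1 ;;
    esac
    printf 'restore database until scn %s;\n' "$1"
    printf 'recover database until scn %s;\n' "$1"
    printf 'alter database open resetlogs;\n'
}

# Example (file names are placeholders):
# pitr_commands 1187045 > /tmp/pitr.rman
# rman target / cmdfile=/tmp/pitr.rman log=/tmp/pitr.log
```

Generating the file first leaves a record of exactly what was run, which helps when the recovery needs to be reviewed later.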
Query the data
SQL> col time for a50
SQL> select * from cjc.t1;
ID TIME
---------- -----------------
1 2024-03-28 12:30:14
2 2024-03-28 12:30:14
3 2024-03-28 12:30:14
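The 4 rows added after the full backup (IDs 4 through 7) are lost, since their redo existed only in the lost current log group.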
###chenjuchao 20240328###