一大早就收到了告警信息,提示一个主库下两个dataguard备库延时过大。猜测同步可能出现异常或者归档量过大,导致无法及时应用,那到底是什么原因呢,就需要详细排查了,以下是具体排查过程:
1、检查两个dg备库的alert日志
查看了alert日志,发现确实只有在接收归档日志,并没有在实时应用
继续往前查看alert日志,发现了报错内容,报错信息如下:
Wed Jun 12 18:20:40 2024
WARNING: File being created with same name as in Primary
Existing file may be overwritten
Errors in file /u01/app/oracle/diag/rdbms/sbhsdbn/hsdb1/trace/hsdb1_pr00_27888.trc:
ORA-01119: error in creating database file '+ORADATA/hsdbn/tsp_yaopin_01.dbf'
ORA-17502: ksfdcre:4 Failed to create file +ORADATA/hsdbn/tsp_yaopin_01.dbf
ORA-15173: entry 'hsdbn' does not exist in directory '/'
File #292 added to control file as 'UNNAMED00292'.
Originally created as:
'+ORADATA/hsdbn/tsp_yaopin_01.dbf'
Recovery was unable to create the file as:
'+ORADATA/hsdbn/tsp_yaopin_01.dbf'
MRP0: Background Media Recovery terminated with error 1274
Errors in file /u01/app/oracle/diag/rdbms/sbhsdbn/hsdb1/trace/hsdb1_pr00_27888.trc:
ORA-01274: cannot add datafile '+ORADATA/hsdbn/tsp_yaopin_01.dbf' - file could not be created
Wed Jun 12 18:20:40 2024
从上面提示看到,在创建292号文件出错了,我们看下备库控制文件记录的292号文件再哪里?
SQL> select name from v$datafile where file#=292;
NAME
--------------------------------------------------------------------------------
/u01/app/oracle/product/11.2.0/db_1/dbs/UNNAMED00292
由于这个文件没有创建成功,导致它记录的是默认位置$ORACLE_HOME/dbs下,正常是要放到+ORADATA/sbhsdbn/datafile下面才对。
SQL> select name from v$datafile where file#=291;
NAME
--------------------------------------------------------------------------------
+ORADATA/sbhisdbn/datafile/tsp_menzhen_04.dbf
出现该问题的原因,可能是db_file_name_convert这个参数没有配置数据文件转换目录映射。
2、检查备库的参数
SQL> show parameter db_create_file_dest;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_create_file_dest string
SQL>
SQL>
SQL>
SQL> show parameter convert;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert string +ORADATA/hsdbn/datafile, +ORA
DATA/sbhsdbn/datafile, +ORADA
TA/hsdbn/tempfile, +ORADATA/s
bhsdbn/tempfile
log_file_name_convert string +ORADATA/hsdbn, +ORADATA/sbh
sdbn
可以看到db_file_name_convert配置以下两个转换目录:
‘+ORADATA/hsdbn/datafile’ 转换到 ‘+ORADATA/sbhsdbn/datafile’
‘+ORADATA/hsdbn/tempfile’ 转换到 ‘+ORADATA/sbhsdbn/tempfile’
我们从alert日志报错信息,发现是有问题的路径是 ‘+ORADATA/hsdbn/tsp_yaopin_01.dbf’,仔细看这个路径少了一个datafile,因此我们需要给db_file_name_convert在添加一个新的目录映射:
‘+ORADATA/hsdbn/’ 映射到 ‘+ORADATA/sbhsdbn/datafile’
3、解决方法
以下操作在备库执行:
--备库修改db_file_name_convert参数
alter system set db_file_name_convert='+ORADATA/hsdbn/datafile','+ORADATA/sbhsdbn/datafile','+ORADATA/hsdbn/tempfile','+ORADATA/sbhsdbn/tempfile','+ORADATA/hsdbn/','+ORADATA/sbhsdbn/datafile' scope=spfile sid='*';
--备库重启生效
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1.2827E+11 bytes
Fixed Size 2269072 bytes
Variable Size 1.7985E+10 bytes
Database Buffers 1.1006E+11 bytes
Redo Buffers 227807232 bytes
Database mounted.
--于出问题后的归档的日志都存在,可以采用下面的方法进行处理有问题的文件;
--如果出问题后,时间比较久才发现,备库的归档被删除了,那就比较麻烦,备库需要做增量恢复。
SQL> ALTER SYSTEM SET standby_file_management=MANUAL SCOPE=BOTH;
SQL> alter database create datafile '/u01/app/oracle/product/11.2.0/db_1/dbs/UNNAMED00292' as '+ORADATA/sbhsdbn/datafile/tsp_yaopin_01.dbf' ;
SQL> select name from v$datafile where file#=292;
NAME
--------------------------------------------------------------------------------
+ORADATA/sbhsdbn/datafile/tsp_yaopin_01.dbf
SQL> ALTER SYSTEM SET standby_file_management=AUTO SCOPE=BOTH;
--开启同步
SQL> alter database recover managed standby database using current logfile disconnect;
同步开启后,查看alert,发现在应用归档了,等归档全部应用完成,检查同步情况:
SQL> select 'thread#:' || a.thread# || ', primary:' || b.max_available || ', standby:' ||
a.max_applied , b.max_available - a.max_applied log_gap
from (select thread#, max(sequence#) max_applied
from gv$archived_log
where applied = 2 'YES'
group by thread#) a,
(select thread#, max(sequence#) max_available
from gv$archived_log
group by thread#) b
where a.thread# = b.thread#;
'THREAD#:'||A.THREAD#||',PRIMARY:'||B.MAX_AVAILABLE||',STANDBY:'||A.MAX_APPLIED LOG_GAP
---------------------------------------------------------------------------------------------------------------------------------------------------- ----------
thread#:1, primary:124228, standby:124228 0
thread#:2, primary:169252, standby:169251 1
另外一个备库也是类似的问题,采用同样的方法进行处理,不在赘述。
4、思考
针对该问题,后续运维该注意哪些呢?
(1)主库添加数据文件时,文件路径要注意,尽量选择备库(db_file_name_convert)已经配置了目录映射的路径。
(2)主库添加完数据文件,要检查一下备库的同步情况,有问题的时及时处理。