Monday, November 2, 2009

Recovery from missing or corrupted redo log group

Case 1: A multiplexed copy of the missing log is available.

If a redo log is missing, it should be restored from a multiplexed copy, if possible. This is the only way to recover without any losses. Here's an example, where I attempt to startup from SQLPlus when a redo log is missing:

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 939526732 bytes
Database Buffers 369098752 bytes
Redo Buffers 7135232 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO03A.LOG'

To fix this we simply copy REDO03A.LOG from its multiplexed location on E: to the above location on E:. After copying the file, we issue an "alter database open" from the above SQLPlus session:

SQL> alter database open;



Case 2: All members of a log group lost.

In this case an incomplete recovery is the best we can do. We will lose all transactions from the missing log and all subsequent logs. We illustrate using the same example as above. The error message indicates that members of log group 3 are missing. We don't have a copy of this file, so we know that an incomplete recovery is required. The first step is to determine how much can be recovered. In order to do this, we query the V$LOG view (when in the mount state) to find the system change number (SCN) that we can recover to (Reminder: the SCN is a monotonically increasing number that is incremented whenever a commit is issued):

--The database should be in the mount state for v$log access

SQL> select first_change# from v$log where group#=3 ;

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 956303948 bytes
Database Buffers 352321536 bytes
Redo Buffers 7135232 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO03.LOG'


SQL> select first_change# from V$log where group#=3;

FIRST_CHANGE#
-------------
3240563848


The FIRST_CHANGE# is the first SCN stamped in the missing log. This implies that the last SCN stamped in the previous log is 370254 (FIRST_CHANGE#-1). This is the highest SCN that we can recover to. In order to do the recovery we must first restore ALL datafiles to this SCN, followed by recovery (also up to this SCN). This is an incomplete recovery, so we must open the database resetlogs after we're done. Here's a transcript of the recovery session (typed commands in bold, comments in italics, all other lines are RMAN feedback):

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

C:\>rman target=sys@emsts catalog=rman@emsts

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Nov 2 18:11:07 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

target database Password:
connected to target database: EMSTS (DBID=849865825, not open)
recovery catalog database Password:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00554: initialization of internal recovery manager package failed
RMAN-04004: error from recovery catalog database: ORA-01033: ORACLE initialization or shutdown in progress

C:\>rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Nov 2 18:11:26 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: EMSTS (DBID=849865825, not open)

RMAN> restore database until scn 3240563847;

Starting restore at 02-NOV-09
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=151 devtype=DISK

channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSTEM01.DBF
restoring datafile 00002 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\UNDOTBS01.DBF
restoring datafile 00003 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSAUX01.DBF
restoring datafile 00004 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\USERS01.DBF
restoring datafile 00005 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\EXAMPLE01.DBF
restoring datafile 00006 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDDAT01.DBF
restoring datafile 00007 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDTAB01.DBF
restoring datafile 00008 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SMTTAB01.DBF
restoring datafile 00009 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\RMAN01.DBF
channel ORA_DISK_1: reading from backup piece C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_04KT1A8E_4_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_04KT1A8E_4_1 tag=TAG20091029T162341
channel ORA_DISK_1: restore complete, elapsed time: 00:03:18
Finished restore at 02-NOV-09

RMAN> recover database until scn 3240563847;

Starting recover at 02-NOV-09
using channel ORA_DISK_1

starting media recovery

archive log thread 1 sequence 109 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_10_30\O1_MF_1_109_5GNOC06T_.ARC
archive log thread 1 sequence 110 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_11_02\O1_MF_1_110_5GWOVN4W_.ARC
archive log thread 1 sequence 111 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_11_02\O1_MF_1_111_5GX91FBQ_.ARC
channel ORA_DISK_1: starting archive log restore to default destination
channel ORA_DISK_1: restoring archive log
archive log thread=1 sequence=108
channel ORA_DISK_1: reading from backup piece C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_05KT1AH4_5_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_05KT1AH4_5_1 tag=TAG20091029T162820
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_02\O1_MF_1_108_5GXDFQV
C_.ARC thread=1 sequence=108
channel default: deleting archive log(s)
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_02\O1_MF_1_108_5GXDFQV
C_.ARC recid=110 stamp=701893047
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_10_30\O1_MF_1_109_5GNOC06
T_.ARC thread=1 sequence=109
media recovery complete, elapsed time: 00:00:18
Finished recover at 02-NOV-09

RMAN> alter database open resetlogs;

database opened

RMAN>

The following points should be noted:
  1. The entire database must be restored to the SCN that has been determined by querying v$log.
  2. All changes beyond that SCN are lost. This method of recovery should be used only if you are sure that you cannot do better. Be sure to multiplex your redo logs, and (space permitting) your archived logs!
  3. The database must be opened with RESETLOGS, as a required log has not been applied. This resets the log sequence to zero, thereby rendering all prior backups worthless. Therefore, the first step after opening a database RESETLOGS is to take a fresh backup. Note that the RESETLOGS option must be used for any incomplete recovery.


No comments:

Post a Comment