Sunday, November 22, 2009

Lock Identification

/* This script shows all users who are waiting for locked resources */

SELECT b.username, b.serial#, c.sid, c.owner object_owner, c.object, d.id1, a.sql_text SQL FROM v$sqltext a,v$session b, v$access c, v$lock d
WHERE a.address = b.sql_address AND a.hash_value = b.sql_hash_value AND b.sid = c.sid AND b.lockwait = d.kaddr AND c.owner NOT IN ('SYS','SYSTEM');

/* This script shows the users who are locking the above waiting resources. This script will also show the SQL being executed by the user. NOTE: If the locking user is no longer executing any SQL, the rows returned will be 0. The user, however, is still locking the above users since he/she has not yet committed his/her transaction -- maybe they are away from their desk?? You should call them and ask them to either commit or rollback. */

SELECT x.serial#, x.sid, x.username, y.id1, z.sql_text SQL FROM v$session x, v$lock y, v$sqltext z
WHERE y.id1 IN (SELECT distinct b.id1 FROM v$session a, v$lock b WHERE b.kaddr = a.lockwait) AND x.sid = y.sid AND z.hash_value = x.sql_hash_value AND y.request = 0


/* This script is identical to the above script EXCEPT that it does not return the SQL text being executed by the locking user. You can use it to quickly identify the locking user's serial#, sid and username if you get to a point where you will need to kill the user's session. */

SELECT x.serial#, x.sid, x.username, y.id1 FROM v$session x, v$lock y WHERE y.id1 IN (SELECT distinct b.id1 FROM v$session a, v$lock b WHERE b.kaddr = a.lockwait) AND x.sid = y.sid AND y.request = 0;

To Check the Database Version

Here is the query to check what is the Database version

Connect as sys/pwd@dbname as sysdba

select username,account_status,default_tablespace,temporary_tablespace,created
from dba_users


Sample Output

PRODUCT VERSION STATUS
NLSRTL 10.2.0.1.0 Production
Oracle Database 10g Enterprise Edition 10.2.0.1.0 Prod
PL/SQL 10.2.0.1.0 Production
TNS for Linux: 10.2.0.1.0 Production

PL/SQL Last Compiled Date/Time

To check when was the pl/sql procedure compiled.

Connect to the schema which you want and give the query as

Eg:

select to_char(LAST_DDL_TIME,'YYYY-MM-DD HH24:MI:SS') from user_objects where object_name = upper('dcs_dev');

Friday, November 20, 2009

High-availability cluster

The most common size for an HA cluster is a two-node cluster, since that's the minimum required to provide redundancy, but many clusters consist of many more, sometimes dozens of nodes. Such configurations can sometimes be categorized into one of the following models:

* Active/Active — Traffic intended for the failed node is either passed onto an existing node or load balanced across the remaining nodes. This is usually only possible when the nodes utilize a homogeneous software configuration.
* Active/Passive — Provides a fully redundant instance of each node, which is only brought online when its associated primary node fails. This configuration typically requires the most amount of extra hardware.
* N+1 — Provides a single extra node that is brought online to take over the role of the node that has failed. In the case of heterogeneous software configuration on each primary node, the extra node must be universally capable of assuming any of the roles of the primary nodes it is responsible for. This normally refers to clusters which have multiple services running simultaneously; in the single service case, this degenerates to Active/Passive.
* N+M — In cases where a single cluster is managing many services, having only one dedicated failover node may not offer sufficient redundancy. In such cases, more than one (M) standby servers are included and available. The number of standby servers is a tradeoff between cost and reliability requirements.
* N-to-1 — Allows the failover standby node to become the active one temporarily, until the original node can be restored or brought back online, at which point the services or instances must be failed-back to it in order to restore High Availability.
* N-to-N — A combination of Active/Active and N+M clusters, N to N clusters redistribute the services or instances from the failed node among the remaining active nodes, thus eliminating (as with Active/Active) the need for a 'standby' node, but introducing a need for extra capacity on all active nodes.

Thursday, November 19, 2009

Database Replay - Ora 11g

“Simulating production load is not possible” , you might have heard these word.

In one project, where last 2 year management want to migrate from UNIX system to Linux system ( RAC ) , but they still testing because they are not sure where this Linux Boxes where bale to handle load or not. They have put lot of efforts and time in load testing and functional testing etc, but still not le gain confidence.
After using these feature of 11g , they will gain confidence and will able to migrate to Linux with full confidence and will know how there system will behave after migration/upgrade.


As per datasheet given on OTN

Database Replay workload capture of external clients is performed at the database server level. Therefore, Database Replay can be used to assess the impact of any system changes below the database tier level such as below:



* Database upgrades, patches, parameter, schema changes, etc.
* Configuration changes such as conversion from a single instance to RAC etc.
* Storage, network, interconnect changes
* Operating system, hardware migrations, patches, upgrades, parameter changes





DB replay does this by capturing a workload on the production system with negligible performance overhead( My observation is 2-5% more CPU usage ) and replaying it on a test system with the exact timing, concurrency, and transaction characteristics of the original workload. This makes possible complete assessment of the impact of the change including undesired results; new contentions points or performance regressions. Extensive analysis and reporting ( AWR , ADDM report and DB replay report) is provided to help identify any potential problems, such as new errors encountered and performance divergences. The ability to accurately capture the production workload results in significant cost and timesaving since it completely eliminates the need to develop simulation workloads or scripts. As a result, realistic testing of even complex applications using load simulation tools/scripts that previously took several months now can be accomplished at most in a few days with Database Replay and with minimal effort. Thus using Database Replay, businesses can incur much lower costs and yet have a high degree of confidence in the overall success of the system change and significantly reduce production deployment


Steps for Database Replay



1.
Workload Capture



Database are tracked and stored in binary files, called capture files, on the file system. These files contain all relevant information about the call needed for replay such as SQL text, bind values, wall clock time, SCN, etc.



1) Backup production Database #

2) Add/remove filter ( if any you want )
By default, all user sessions are recorded during workload capture. You can use workload filters to specify which user sessions to include in or exclude from the workload. Inclusion filters enable you to specify user sessions that will be captured in the workload. This is useful if you want to capture only a subset of the database workload.
For example , we don't want to capture load for SCOTT user



BEGIN
DBMS_WORKLOAD_CAPTURE.ADD_FILTER (
fname => 'user_scott',
fattribute => 'USER',
fvalue => 'SCOTT');
END;

Here filter name is "user_scott" ( user define name)

3) Create directory make sure enough space is there





CREATE OR REPLACE DIRECTORY db_replay_dir
AS '/u04/oraout/test/db-replay-capture';





4) Capture workload

BEGIN
DBMS_WORKLOAD_CAPTURE.start_capture (
name => capture_testing',dir=>'DB_REPLAY_DIR',
duration => NULL );
END;

Duration => NULL mean , it will capture load till we stop with below mentioned manual SQL command. Duration is optional input to specify the duration (in seconds) , default is NULL

5) Finish capture



BEGIN
DBMS_WORKLOAD_CAPTURE.finish_capture;
END;





# Take backup of production before Load capture, so we can restore database on test environment and will run replay on same SCN level of database to minimize data divergence


Note as per Oracle datasheet



The workload that has been captured on Oracle Database release 10.2.0.4 and higher can also be replayed on Oracle Database 11g release.So , I think , It simply mean NEW patch set 10.2.0.4 will support capture processes. Is it mean Current patch set (10.2.0.3) not support load capture ??????



2. Workload Processing

Once the workload has been captured, the information in the capture files has to be processed preferably on the test system because it is very resource intensive job. This processing transforms the captured data and creates all necessary metadata needed for replaying the workload.

exec DBMS_WORKLOAD_REPLAY.process_capture('DB_REPLAY_DIR');





3.
Workload Replay

1) Restore database backup taken step one to test system and start Database

2) Initialize

BEGIN
DBMS_WORKLOAD_REPLAY.initialize_replay (
replay_name => 'TEST_REPLAY',
replay_dir => 'DB_REPLAY_DIR');
END;



3) Prepare



exec DBMS_WORKLOAD_REPLAY.prepare_replay(synchronization => TRUE)



4) Start clients



$ wrc mode=calibrate replaydir=/u03/oradata/test/db-replay-capture

Workload Replay Client: Release 11.1.0.6.0 - Production on Wed Dec 26 00:31:41 2007

Copyright (c) 1982, 2007, Oracle. All rights reserved.


Report for Workload in: /u03/oradata/test/db-replay-capture
-----------------------

Recommendation:
Consider using at least 1 clients divided among 1 CPU(s).

Workload Characteristics:
- max concurrency: 1 sessions
- total number of sessions: 7

Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client process per CPU
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE




$ wrc system/pass mode=replay replaydir=/u03/oradata/test/db-replay-capture

Workload Replay Client: Release 11.1.0.6.0 - Production on Wed Dec 26 00:31:52 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.

Wait for the replay to start (00:31:52)



5) Start Replay

BEGIN
DBMS_WORKLOAD_REPLAY.start_replay;
END;
/





$ wrc system/pass mode=replay replaydir=/u03/oradata/test/db-replay-capture

Workload Replay Client: Release 11.1.0.6.0 - Production on Wed Dec 26 00:31:52 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.

Wait for the replay to start (00:31:52)
Replay started (00:33:32)
Replay finished (00:42:52)



4.
Analysis and Reporting

Generate AWR , ADDM and DB reply report and compare with data gathered on production for same time period when load was captured on Production database. For Database Replay Report run following command





SQL> COLUMN name FORMAT A20
SQL> SELECT id, name FROM dba_workload_replays;

ID NAME
---------- --------------------
1 TEST_REPLAY



DECLARE
v_report CLOB;
BEGIN
v_report := DBMS_WORKLOAD_replay.report(
replay_id => 1,
format=>DBMS_WORKLOAD_CAPTURE.TYPE_HTML
);
dbms_output.put_line(l_report);
END;
/

Friday, November 13, 2009

Listener

when attempting to rename the listener.log file while the TNS listener process is running. The Oracle TNS listener process holds an open handle to the file. Under UNIX, you CAN remove the file, but Oracle will not re-create the file when it attempts to write to it again. The TNS listener will need to be stopped and restarted in order for it to create the new listener.log file.

Well, here is a solution for renaming (or removing) the listener.log file without having to stop and start the TNS listener process under either Windows or UNIX:


Windows

C:\> cd \oracle\ora92\network\log
C:\oracle\ora92\network\log> lsnrctl set log_status off
C:\oracle\ora92\network\log> rename listener.log listener.old
C:\oracle\ora92\network\log> lsnrctl set log_status on


UNIX

% cd $ORACLE_HOME/network/log
% lsnrctl set log_status off
% mv listener.log listener.old
% lsnrctl set log_status on

On the Other hand even if you were able to rename the listener.log file it will not create a new log file like alert log. Instead it keeps writing to the renamed log file.

For Eg,
When the Listener is running if we give the command as
% cd $ORACLE_HOME/network/log
% mv listener.log listenerOld.log

After we move to the listenerOld.log file, it will continue writing log files in the file listenerOld.log. In order to make sure we create a new log file we must use the following commands.

% cd $ORACLE_HOME/network/log
% lsnrctl set log_status off
% mv listener.log listener.old
% lsnrctl set log_status on

It is advisable to move/rename the listener.log file atleast once in a month, we can create a script file to do this.

Shell Script for moving the Listener Log file
Save the Following contents in a file eg.,lsnrlogmv.sh file
export ORACLE_SID=opubp
export ORACLE_HOME=/opt/oracle/u01/oracle/product/10.2.0/db_1
$ORACLE_HOME/bin/lsnrctl set log_status off
mv $ORACLE_HOME/network/log/listener.log $ORACLE_HOME/network/log/listener_$(date +"%d-%m-%Y").log
$ORACLE_HOME/bin/lsnrctl set log_status on

Edit Crontab by Crontab -e
00 15 * * 2 /opt/oracle/lsnrlogMv.sh
#mi hh dd mm dow filename

Hope this helps.

Sunday, November 8, 2009

Memory Size in ASMM

Here is How you know memory component size when use automatic storage memory managenent(ASMM)

Oracle Database 10g includes the Automatic Shared Memory Management feature which simplifies the SGA memory management significantly.

When automatic SGA memory management is enabled, the sizes of the different SGA components are flexible and can adapt to the needs of a workload without requiring any additional configuration. The database automatically distributes the available memory among the various components as required, allowing the system to maximize the use of all available SGA memory.

DBA can check all memory component size with dynamic view V$SGA_DYNAMIC_COMPONENT

Dataguar Switchover Steps

1. Initiate the switchover operation on the primary database (On the original primary database)

- Verify database role

SQL> SELECT DATABASE_ROLE FROM V$DATABASE;

DATABASE_ROLE
----------------
PRIMARY

- Switch database role from primary to standby database

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN WAIT;

Database altered.

- Shutdown database

SQL> SHUTDOWN IMMEDIATE

- Startup database

SQL> STARTUP

- Verify database role

SQL> SELECT DATABASE_ROLE FROM V$DATABASE;

DATABASE_ROLE
----------------
PHYSICAL STANDBY

- Start apply log process for standby database

SQL> ALTER DATABASE RECOVER MANAGE STANDBY DATABASE DISCONNECT;

2. Switch standby database to primary database (On the original standby database)
- Verify database role

SQL> SELECT DATABASE_ROLE FROM V$DATABASE;

DATABASE_ROLE
----------------
PHYSICAL STANDBY

- Switch database role from standby to primary database

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

Database altered.

- Shutdown database

SQL> SHUTDOWN IMMEDIATE

- Startup database

SQL> STARTUP

- Verify database role

SQL> SELECT DATABASE_ROLE FROM V$DATABASE;

DATABASE_ROLE
----------------
PRIMARY

Creating a Function based Index

-- syntax for creating a function-based index:

CREATE [UNIQUE] INDEX index_name
ON table_name (function1, function2, . function_n)
[ COMPUTE STATISTICS ];


-- for example:

CREATE INDEX supplier_metrics
ON supplier (UPPER(supplier_city));

-- to be sure that the Oracle optimizer uses this index
-- when executing your SQL statements, be sure that
-- UPPER(supplier_city) does not evaluate to a NULL value.
-- To ensure this, add UPPER(supplier_city) IS NOT NULL to
-- your WHERE clause as follows:

SELECT supplier_id, supplier_city, UPPER(supplier_city)
FROM supplier
WHERE UPPER(supplier_city) IS NOT NULL
ORDER BY UPPER(supplier_city);

Monday, November 2, 2009

Recovery from missing or corrupted control file:

Case 1: A multiplexed copy of the control file is available.
--------------------------------------------------------------------

On startup Oracle must read the control file in order to find out where the datafiles and online logs are located. Oracle expects to find control files at locations specified in the CONTROL_FILE initialisation parameter. The instance will fail to mount the database if any one of the control files are missing or corrupt. A brief error message will be displayed, with further details recorded in the alert log. The exact error message will vary depending on what has gone wrong. Here's an example:

C:\>sqlplus /nolog

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Nov 3 14:07:33 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 973081164 bytes
Database Buffers 335544320 bytes
Redo Buffers 7135232 bytes
ORA-00205: error in identifying control file, check alert log for more info


On checking the alert log, as suggested, we find the following:

ALTER DATABASE MOUNT
Tue Nov 03 14:07:42 2009
ORA-00202: control file: 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\CONTROL02.CTL'

Tue Nov 03 14:07:45 2009
ORA-205 signalled during: ALTER DATABASE MOUNT...

The above corruption was introduced by manually editing the control file when the database was closed.

The solution is simple, provided you have at least one uncorrupted control file - replace the corrupted control file with a copy using operating system commands. Remember to rename the copied file. The database should now start up without any problems.

SQL> shutdown immediate;
ORA-01507: database not mounted


ORACLE instance shut down.
SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 973081164 bytes
Database Buffers 335544320 bytes
Redo Buffers 7135232 bytes
Database mounted.
Database opened.
SQL>


Case 2: All control files lost
------------------------------------

What if you lose all your control files? In that case you have no option but to use a backup control file. The recovery needs to be performed from within RMAN, and requires that all logs (archived and current online logs) since the last backup are available. The logs are required because all datafiles must also be restored from backup. The database will then have to be recovered up to the time the control files went missing. This can only be done if all intervening logs are available. Here's an annotated transcript of a recovery session (as usual, lines in bold are commands to be typed, lines in italics are explanatory comments, other lines are RMAN feedback):


C:\>sqlplus /nolog

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Nov 3 14:39:34 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 981469772 bytes
Database Buffers 327155712 bytes
Redo Buffers 7135232 bytes
ORA-00205: error in identifying control file, check alert log for more info


SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options


C:\>rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Tue Nov 3 14:41:46 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: emsts (not mounted)

RMAN> set dbid 849865825

executing command: SET DBID

--restore controlfile from autobackup. The backup is not at the default
--location so the path must be specified

RMAN> restore controlfile from 'C:\OraBkup\emsts\C-849865825-20091102-01';

Starting restore at 03-NOV-09
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=157 devtype=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:06
output filename=D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\CONTROL01.CTL
output filename=D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\CONTROL02.CTL
output filename=D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\CONTROL03.CTL
Finished restore at 03-NOV-09

-- Now that control files have been restored, the instance can mount the
-- database.

RMAN> mount database;

database mounted
released channel: ORA_DISK_1

RMAN> restore database;

Starting restore at 03-NOV-09
Starting implicit crosscheck backup at 03-NOV-09
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
Crosschecked 11 objects
Finished implicit crosscheck backup at 03-NOV-09

Starting implicit crosscheck copy at 03-NOV-09
using channel ORA_DISK_1
Finished implicit crosscheck copy at 03-NOV-09

searching for all files in the recovery area
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_03\O1_MF_1_4_5GZKBQMK_.ARC

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSTEM01.DBF
restoring datafile 00002 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\UNDOTBS01.DBF
restoring datafile 00003 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSAUX01.DBF
restoring datafile 00004 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\USERS01.DBF
restoring datafile 00005 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\EXAMPLE01.DBF
restoring datafile 00006 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDDAT01.DBF
restoring datafile 00007 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDTAB01.DBF
restoring datafile 00008 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SMTTAB01.DBF
restoring datafile 00009 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\RMAN01.DBF
channel ORA_DISK_1: reading from backup piece C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_0AKTC3D8_10_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_0AKTC3D8_10_1 tag=TAG20091102T183415
channel ORA_DISK_1: restore complete, elapsed time: 00:03:26
Finished restore at 03-NOV-09

--Database must be recovered because all datafiles have been restored from
-- backup otherwise the following error message will be given out


RMAN> open resetlogs database;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 11/03/2009 14:54:40
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSTEM01.DBF'

RMAN> alter database open resetlogs;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of alter db command at 11/03/2009 14:56:56
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSTEM01.DBF'

RMAN> recover database;

Starting recover at 03-NOV-09
using channel ORA_DISK_1

starting media recovery

archive log thread 1 sequence 3 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO02.LOG
archive log thread 1 sequence 4 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARC
HIVELOG\2009_11_03\O1_MF_1_4_5GZKBQMK_.ARC
archive log thread 1 sequence 5 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO01.LOG
archive log filename=D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO02.LOG thread=1 sequence=3
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_03\O1_MF_1_4_5GZKBQMK_
.ARC thread=1 sequence=4
archive log filename=D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO01.LOG thread=1 sequence=5
media recovery complete, elapsed time: 00:00:08
Finished recover at 03-NOV-09

RMAN> alter database open resetlogs;

database opened

RMAN>


Several points are worth emphasising.

1. Recovery using a backup controlfile should be done only if a current control file is unavailable.
2. All datafiles must be restored from backup. This means the database will need to be recovered using archived and online redo logs. These MUST be available for recovery until the time of failure.
3. As with any database recovery involving RESETLOGS, take a fresh backup immediately.
4. Technically the above is an example of complete recovery - since all committed transactions were recovered. However, some references consider this to be incomplete recovery because the database log sequence had to be reset.

Recovery from missing or corrupted redo log group

Case 1: A multiplexed copy of the missing log is available.

If a redo log is missing, it should be restored from a multiplexed copy, if possible. This is the only way to recover without any losses. Here's an example, where I attempt to startup from SQLPlus when a redo log is missing:

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 939526732 bytes
Database Buffers 369098752 bytes
Redo Buffers 7135232 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO03A.LOG'

To fix this we simply copy REDO03A.LOG from its multiplexed location on E: to the above location on E:. After copying the file, we issue an "alter database open" from the above SQLPlus session:

SQL> alter database open;



Case 2: All members of a log group lost.

In this case an incomplete recovery is the best we can do. We will lose all transactions from the missing log and all subsequent logs. We illustrate using the same example as above. The error message indicates that members of log group 3 are missing. We don't have a copy of this file, so we know that an incomplete recovery is required. The first step is to determine how much can be recovered. In order to do this, we query the V$LOG view (when in the mount state) to find the system change number (SCN) that we can recover to (Reminder: the SCN is a monotonically increasing number that is incremented whenever a commit is issued):

--The database should be in the mount state for v$log access

SQL> select first_change# from v$log where group#=3 ;

SQL> connect / as sysdba;
Connected to an idle instance.
SQL> startup open;
ORACLE instance started.

Total System Global Area 1317011456 bytes
Fixed Size 1250740 bytes
Variable Size 956303948 bytes
Database Buffers 352321536 bytes
Redo Buffers 7135232 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\REDO03.LOG'


SQL> select first_change# from V$log where group#=3;

FIRST_CHANGE#
-------------
3240563848


The FIRST_CHANGE# is the first SCN stamped in the missing log. This implies that the last SCN stamped in the previous log is 370254 (FIRST_CHANGE#-1). This is the highest SCN that we can recover to. In order to do the recovery we must first restore ALL datafiles to this SCN, followed by recovery (also up to this SCN). This is an incomplete recovery, so we must open the database resetlogs after we're done. Here's a transcript of the recovery session (typed commands in bold, comments in italics, all other lines are RMAN feedback):

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

C:\>rman target=sys@emsts catalog=rman@emsts

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Nov 2 18:11:07 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

target database Password:
connected to target database: EMSTS (DBID=849865825, not open)
recovery catalog database Password:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00554: initialization of internal recovery manager package failed
RMAN-04004: error from recovery catalog database: ORA-01033: ORACLE initialization or shutdown in progress

C:\>rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Nov 2 18:11:26 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: EMSTS (DBID=849865825, not open)

RMAN> restore database until scn 3240563847;

Starting restore at 02-NOV-09
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=151 devtype=DISK

channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSTEM01.DBF
restoring datafile 00002 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\UNDOTBS01.DBF
restoring datafile 00003 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SYSAUX01.DBF
restoring datafile 00004 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\USERS01.DBF
restoring datafile 00005 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\EXAMPLE01.DBF
restoring datafile 00006 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDDAT01.DBF
restoring datafile 00007 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\ODDTAB01.DBF
restoring datafile 00008 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\SMTTAB01.DBF
restoring datafile 00009 to D:\ORACLE\PRODUCT\10.2.0\ORADATA\EMSTS\RMAN01.DBF
channel ORA_DISK_1: reading from backup piece C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_04KT1A8E_4_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_04KT1A8E_4_1 tag=TAG20091029T162341
channel ORA_DISK_1: restore complete, elapsed time: 00:03:18
Finished restore at 02-NOV-09

RMAN> recover database until scn 3240563847;

Starting recover at 02-NOV-09
using channel ORA_DISK_1

starting media recovery

archive log thread 1 sequence 109 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_10_30\O1_MF_1_109_5GNOC06T_.ARC
archive log thread 1 sequence 110 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_11_02\O1_MF_1_110_5GWOVN4W_.ARC
archive log thread 1 sequence 111 is already on disk as file D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\A
RCHIVELOG\2009_11_02\O1_MF_1_111_5GX91FBQ_.ARC
channel ORA_DISK_1: starting archive log restore to default destination
channel ORA_DISK_1: restoring archive log
archive log thread=1 sequence=108
channel ORA_DISK_1: reading from backup piece C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_05KT1AH4_5_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\ORABKUP\EMSTS\BACKUPEMSTS_DB_05KT1AH4_5_1 tag=TAG20091029T162820
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_02\O1_MF_1_108_5GXDFQV
C_.ARC thread=1 sequence=108
channel default: deleting archive log(s)
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_11_02\O1_MF_1_108_5GXDFQV
C_.ARC recid=110 stamp=701893047
archive log filename=D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\EMSTS\ARCHIVELOG\2009_10_30\O1_MF_1_109_5GNOC06
T_.ARC thread=1 sequence=109
media recovery complete, elapsed time: 00:00:18
Finished recover at 02-NOV-09

RMAN> alter database open resetlogs;

database opened

RMAN>

The following points should be noted:
  1. The entire database must be restored to the SCN that has been determined by querying v$log.
  2. All changes beyond that SCN are lost. This method of recovery should be used only if you are sure that you cannot do better. Be sure to multiplex your redo logs, and (space permitting) your archived logs!
  3. The database must be opened with RESETLOGS, as a required log has not been applied. This resets the log sequence to zero, thereby rendering all prior backups worthless. Therefore, the first step after opening a database RESETLOGS is to take a fresh backup. Note that the RESETLOGS option must be used for any incomplete recovery.


dbverify utility syntax

The primary tool for checking for corruption in an Oracle database is DBVERIFY. It can be used to perform a physical data structure integrity check on data files whether the database is online or offline. The big benefit of this is that DBVERIFY can be used to check backup data files without adding load to the database server. You invoke DBVERIFY from the operating system command line like this:
$ dbv file=dbfPATH/data01.dbf logfile=logfilepath/verify.log blocksize=8192 feedback=100
In this example data01.dbf is the data file to check, and the tablespace this file belongs to has a block size of 8192 bytes. The feedback parameter tells DBVERIFY to draw a period on the screen after every 100 pages (blocks) of the file are verified.
In the log file you’ll see output like this:

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Nov 2 16:06:38 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.


DBVERIFY - Verification starting : FILE = d:\oracle\product\10.2.0\oradata\emsts\smttab01.dbf


DBVERIFY - Verification complete

Total Pages Examined : 262144
Total Pages Processed (Data) : 58669
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 50674
Total Pages Failing (Index): 0
Total Pages Processed (Other): 2588
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 150213
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 3240352395 (0.3240352395)

Sunday, November 1, 2009

expdp utility corruption

sometimes, the expdp utility may be corrupted, in this case either the expdp file which is inside the $ORACLE_HOME/bin folder may be missing or the filesize is 0 bytes,

Easiest method to solve this issue is to copy the expdp from a working server to the server which is not working - make sure that we need to copy from the same version of the database example if the server which is having trouble is running in the Database ver 10.2.03 then, the file which is copied should also run in the version 10.2.03.

After this it will show the error message as /opt/oracle/product/10gR2/bin/expdp: error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file: No such file or directory when trying to run the expdp file.

Soltion for this is we need to set the environment variable as follows.

$export LD_LIBRARY_PATH=$ORACLE_HOME/lib
Make sure that you have set the ORACLE_HOME variable before.
Recovery from missing or corrupted datafile(s):

Case 1: Recovery from corrupted or missing datafile

This scenario deals with a situation where a datafile has gone missing, or is corrupted beyond repair. For concreteness, we look at a case where a datafile is missing. Below is a transcript of an SQL Plus session that attempts to open a database with a missing datafile (typed commands in bold, lines in italics are my comments, all other lines are feedback from SQL Plus):

The error message tells us that file# 4 is missing. Note that although the startup command has failed, the database is in the mount state. Thus, the database control file, which is also the RMAN repository can be accessed by the instance and by RMAN. We now recover the missing file using RMAN. The transcript of the recovery session is reproduced below (bold lines are typed commands, comments in italics, the rest is feedback from RMAN):

--logon to RMAN

C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1507972899)

--restore missing datafile

RMAN> restore datafile 4;

Starting restore at 26/JAN/05

using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=14 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=15 devtype=DISK
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00004 to D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF
channel ORA_DISK_1: restored backup piece 1
piece handle=D:\BACKUP\0QGB0UEC_1_1.BAK tag=TAG20050124T152708 params=NULL
channel ORA_DISK_1: restore complete
Finished restore at 26/JAN/05

--recover restored datafile - RMAN applies all logs automatically

RMAN> recover datafile 4;

Starting recover at 26/JAN/05 using channel ORA_DISK_1
using channel ORA_DISK_2

starting media recovery

archive log thread 1 sequence 4 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_4.ARC
archive log thread 1 sequence 5 is already on disk as file C:\ORACLE_ARCHIVE\ORCL\1_5.ARC
archive log thread 1 sequence 6 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_6.ARC
archive log thread 1 sequence 7 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_7.ARC
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_4.ARC thread=1 sequence=4
archive log filename=C:\ORACLE_ARCHIVE\ORCL\1_5.ARC thread=1 sequence=5
media recovery complete
Finished recover at 26/JAN/05

--open database for general use

RMAN> alter database open;

database opened

RMAN>

In the above scenario, the database is already in the mount state before the RMAN session is initiated. If the database is not mounted, you should issue a "startup mount" command before attempting to restore the missing datafile. The database must be mounted before any datafile recovery can be done.