Blog | Nov 11, 2014

Some Uncommon Recovery Scenarios Explained – Tips for DBAs-Part 2

Here are some more useful tips for the DBAs:

Case 4: How to perform a block recovery when the datafile header is corrupted?

Recovery approach of corrupt data block on a datafile is a simple procedure. We need to identify the corrupt block, make all valid backups available, and recover the block using RMAN. If the first block on a datafile which has the header information, becomes corrupt, Oracle is not able to perform any recovery on that file.

Let’s start with the traditional approach and see how and why it fails to recover the corrupt block, and then we will find our way out. Issue validate command on the corrupt datafile will fail dictionary views due to the missing datafile header information.

RMAN> backup validate datafile 10;

Starting backup at 06-11-2014 23:07:11
using channel ORA_DISK_1
RMAN-06169: could not read file header for datafile 10 error reason 7
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 06/11/2014 23:07:11
RMAN-06056: could not access datafile 10

The V$DATABASE_BLOCK_CORRUPTION view won’t show any rows due to the above error.

SQL> select * from v$database_block_corruption;

no rows selected

Now let’s see if it still manages to recover the corrupt block.

RMAN> recover datafile 10 block 1;

Starting recover at 11-06-2014 23:08:09
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 11-06-2014 23:08:10

It appears as if the recovery is done. Actually no it is not. Try to issue the validate command once more on the same datafile.

RMAN> backup validate datafile 10;

Starting backup at 06-11-2014 23:07:11
using channel ORA_DISK_1
RMAN-06169: could not read file header for datafile 10 error reason 7
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 06/11/2014 23:07:11
RMAN-06056: could not access datafile 10

We saw that the traditional approach did not help. Now let’s see how to accomplish this task.

First, put the datafile offline.

RMAN> sql 'alter database datafile 10 offline immediate';

sql statement: alter database datafile 10 offline immediate

Now manually remove the datafile from disk drive.

[oracle@localhost trace]$ rm /home/oracle/app/oracle/oradata/orcl/marcov01.dbf

Restore the datafile and recover it using RMAN and take it back online.

RMAN> restore datafile 10;
Starting restore at 11-06-2014 23:20:29
using channel ORA_DISK_1
RMAN> recover datafile 10;
RMAN> sql 'alter databse datafile 10 online';

sql statement: alter databse datafile 10 online

After this you may run the validate command once more on the datafile to find out if any other datablock is having physical corruption and use RMAN to recover the datablock if found any.

Case 5: How to make a database operational when Undo segment is corrupted?

Consider a scenario when one of the undo segments becomes corrupt that has active segments in it. You won’t be able to open the database until you undo the corrupt.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01172: recovery of thread 1 stuck at block 25 of file 16
ORA-01151: use media recovery to recover block, restore backup if needed

Follow the steps to fix it.
First mount the database.
SQL> startup mount
ORACLE instance started.

Total System Global Area  536870912 bytes
Fixed Size                  2022144 bytes
Variable Size             109053184 bytes
Database Buffers          423624704 bytes
Redo Buffers                2170880 bytes
Database mounted.

Now if you try to drop the undo segment having the corrupt block in it, you won’t be able to do that if undo management is automatic.
SQL> drop tablespace undo2;
drop tablespace undo2
*
ERROR at line 1:
ORA-01548: active rollback segment '_SYSSMU21_2 734716338$' found, terminate
dropping tablespace

Next step is to put the database in manual undo management, and drop the datafiles of undo tablespace

SQL> alter system set undo_management=manual scope=both;
System altered.
SQL> alter database datafile '/test01/dbs/undotbs0.dbf' offline drop;
Database altered.

Now re-create a new undo tablespace
SQL> create undo tablespace undotbs1 datafile '/test01/dbs/undotbs1.dbf' size 1G;
Tablespace created.
SQL> alter system set undo_tablespace=undotbs1;
System altered.

Finally open the database.
SQL>alter database open;
Database altered.

What happens if you try to drop the old undo tablespace?
SQL> drop tablespace UNDO_TBS;
drop tablespace UNDO_TBS
*
ERROR at line 1:
ORA-01548: active rollback segment '_SYSSMU52$' found, terminate dropping tablespace

It means that the data dictionary has the information that the old undo tablespace still contains the active undo segments.

To fix this, please edit pfile and below parameters.
_offline_rollback_segments=_SYSSMU52$
Startup the database using the pfile and drop the old undo tablespace.
SQL> Startup pfile='fullpath of pfile' ;
SQL> drop tablespace UNDO_TBS

Case 6: How to restore your database to a point in time in past when you don’t have backup available?

This is perhaps the easiest of all. To restore a 10g database to a past point in time doesn’t require RMAN backup pieces. All we need to ensure is that the flashback is enabled and we haven’t yet crossed over the undo retention period. Let us simulate this scenario.
We have discussed how to flashback a table in our first case. Now we will discuss how to flashback the whole database. To begin with, let’s assume that we need to restore the production database until 3 hours from now.

First thing to check is whether we are within db_flashback_retention_target period to ensure a successful flashback.

SQL> show parameter db_flashback_retention_target
NAME                                        TYPE          VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target                       integer       14400
SQL>

To undo retention period is four hours and retention type is guaranteed. That means no unexpired undo segment will be overwritten, and we still have consistent data of whole database for another one hour, in order to flashback the database till 3 hours from now.

You have options to flashback the database to either to a SCN number, or timestamp or any restore point if created.

Let’s flashback to timestamp. For this you need to mount database in the exclusive mode. Lastly open the database with resetlogs options.

SQL>shut immediate
SQL>startup mount exclusive
SQL> flashback database to timestamp to_timestamp('2014-06-13 16:00:00', 'YYYY-MM-DD HH24:MI:SS');
SQL> alter database open resetlogs;

I hope you find these tips for uncommon recovery scenarios helpful. Please let me know your feedback.