Blog | Jul 18, 2014

Considerations For The Switchovers of Oracle Databases using Data Guard (DG) Broker

In this blog we’ll discuss all of the considerations as well as detail the steps to perform the planned switchover between an Oracle 11g R2 Primary & the standby databases.

The following case study will help garner information regarding the above request we got from one of our valued client.

The Challenge:
Perform the switchover & switchback using DG broker within 30 minutes.

The Architecture:
Primary DB setup: - 2 Node RAC setup on 11g R2 Oracle Enterprise
Standby DB setup: - Single node setup on 11g R2 Oracle Enterprise

The DG Broker was already setup between the Primary and the Standby Databases.

Pre-requisites to be considered before performing the below switchover/switchback process:

Note: - This document is valid for 11g R2 RAC Primary & 11G R2 Single Node Standby

1.) Make sure that telnet for all the involved Nodes is working fine on all the scan-names,scan-IPs & VIPs with port for which each DB instance is configured to listen.

2.) Make sure each DB instance is remotely reachable using the service name configured on the listener.

3.) Make sure that the following DB parameters are configured properly on each DB instance involved in the switchover/switchback process:


fal_server
fal_client
local_listener
remote_listener
log_archive_config
log_archive_dest_<n>
log_archive_dest_<n>_state
standby_archive_dest
dg_broker_start
log_file_name_convert

4.) Validate the DG broker configuration using the dgmgrl for each DB instance involved in the switchover/switchback process.

5.) Make sure the service "<DB_UNIQUE_NAME>_DGMGRL.domain_name" is statically registered with the local listener for each DB instance involved in the switchover/switchback process.

Once those steps are in place, here is the process to statically register this service with the local listener:

On the whole Primary & Standby Environment:
#####################################

PRIMDB_DGMGRL.domain.com ==> With each RAC Instance

SBYDB_DGMGRL.domain.com

#####################################

Process to register the services in local listeners:
######################################

On Primary RAC Nodes: ========================
On host RAC1 :
Update the listener.ora from the Grid_Home as a grid user:
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=PRIMDB_DGMGRL.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=PRIMDB1)
)
(SID_DESC=
(GLOBAL_DBNAME=PRIMDB.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=PRIMDB1)
)
)

Add the following TNS-ENTRY to tnsnames.ora from ORACLE_HOME as an Oracle user:

PRIMDB_DGMGRL.domain.com =

(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =rac1-vip)(PORT = 1532))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = PRIMDB_DGMGRL.domain.com)
(INSTANCE_NAME = PRIMDB1)
)
)
Reload the listener using a grid user "lsnrctl reload LISTENER"
On the Host RAC2:
Update the listener.ora from the Grid_Home as a grid user:
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=PRIMDB_DGMGRL.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=PRIMDB2)
)
(SID_DESC=
(GLOBAL_DBNAME=PRIMDB.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=PRIMDB2)
)
)
Add the following TNS-ENTRY to tnsnames.ora from ORACLE_HOME as the Oracle user:
PRIMDB_DGMGRL.domain.com =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =rac2-vip)(PORT = 1532))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = PRIMDB_DGMGRL.domain.com)
(INSTANCE_NAME = PRIMDB2)
)
)
Reload the listener using grid user "lsnrctl reload LISTENER"
=================================

2.) On the host RAC1 :

Connect to dgmgrl using "sys/***@PRIMDB"

Update the Instance specific (PRIMDB1) property "StaticConnectIdentifier" from "PRIMDB1" to the following as:

edit instance 'PRIMDB1' set property StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.250.25)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=PRIMDB.domain.com)(INSTANCE_NAME=PRIMDB1)(SERVER=DEDICATED)))';
Update the Instance specific (PRIMDB2) property "StaticConnectIdentifier" from "PRIMDB2" to following as:
edit instance 'PRIMDB2' set property StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.250.26)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=PRIMDB.domain.com)(INSTANCE_NAME=PRIMDB2)(SERVER=DEDICATED)))';

ON the STANDBY Server "SBY":
=====================================
1.) Update the listener.ora from the Grid_Home as a grid user:
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=SBYDB_DGMGRL.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=SBYDB)
)
(SID_DESC=
(GLOBAL_DBNAME=SBYDB.domain.com)
(ORACLE_HOME=/app/Oracle/product/11.2.0.3/dbhome1)
(SID_NAME=SBYDB)
)
)
2.) Add the following TNS-ENTRY to tnsnames.ora from the ORACLE_HOME as an Oracle user:
SBYDB_DGMGRL.domain.com =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = sby)(PORT = 1532))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = SBYDB_DGMGRL.domain.com)
)
)
3.) Reload the listener using a grid user "lsnrctl reload LISTENER"

4.) Update the STANDBY INSTANCE specific (SBYDB) property "StaticConnectIdentifier" using dgmgrl

===========================================================================
##################################################################
Below you will find the detailed process carried out for this switchover & switchback in 30 Minutes:
The dgmgrl should be connected with TNS - Entry , you should not connect it using "/ as sysdba or sys/***"
To make it faster open a minimum of two Putty sessions for Node1 Primary (RAC1) , 1 Putty session for Node 2 Primary (RAC2)
& 2 Putty sessions for standby (SBY)
Please follow the below plan for a switchover & a switchback:
################################################

1.) Get the SEQ. Status :- --> on host RAC1
===========================

select thread#, max(sequence#) "Last Primary Seq Generated"  from v$archived_logval, v$databasevdb
whereval.resetlogs_change# = vdb.resetlogs_change#
group by thread# order by 1;

SELECT THREAD#, SEQUENCE# FROM V$THREAD;
2.) Get the SEQ. Applied on Standby: - on host SBY
==============================
SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG WHERE APPLIED = 'YES'
AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE# FROM V$DATABASE_INCARNATION WHERE STATUS ='CURRENT')
GROUP BY THREAD#;

3.) SELECT DATABASE_ROLE,SWITCHOVER_STATUS FROM V$DATABASE; --> on host RAC1
4.) --> on host RAC1
dgmgrl sys/***@PRIMDB
show configuration verbose; ==> This Should not have any errors
switchover to 'SBYDB'; 
5.) --> on host RAC1
Check the status from sqlplus as:
selectinst_id,status,instance_name,host_name from gv$instance;
6.)  Make sure that both the RAC Instances on hosts (RAC1/2) are in the MOUNTED State, if not then issue following commands:
srvctl start database -d PRIMDB -o mount

OR
If instance 1 is down then:
srvctl start instance -d PRIMDB -iPRIMDB1 -o mount
If instance 2 is down then:
srvctl start instance -d PRIMDB -iPRIMDB2 -o mount
7.) After the switchover is complete with all ofthe above six steps, validate the environment as:
==================================
On node "RAC1":
dgmgrl sys/***@PRIMDB
show configuration verbose; ==> Note the roles changed here & no errors should be reported
show database verbose "SBYDB";
show database verbose "PRIMDB";
show instance verbose "PRIMDB1";
show instance verbose "PRIMDB2";
8.) Verify that the archive logs are getting shipped from New Primary (SBYDB - on host SBY) & getting applied to the New Standby (PRIMDB - on hosts RAC1/2)
=============================================
On RAC1/2 :
SQL> SELECT DATABASE_ROLE,SWITCHOVER_STATUS FROM V$DATABASE;
set pages 999 lines 300
col OPEN_MODE for a10
col HOST_NAME for a12
select name,INSTANCE_NAME,HOST_NAME,OPEN_MODE,DATABASE_STATUS,DATABASE_ROLE,PROTECTION_LEVEL,CONTROLFILE_TYPE,LOGINS,
to_char(STARTUP_TIME,'DD-MON-YYYY HH24:MI:SS') "UP TIME"fromv$database,gv$instance; 
SQL> select name,open_mode,database_role from v$database;
SQL> select thread#, max(sequence#) "Last Standby Seq Received"
fromv$archived_logval, v$databasevdb
whereval.resetlogs_change# = vdb.resetlogs_change#
group by thread# order by 1; 
SQL>  select thread#, max(sequence#) "Last Standby Seq Applied"
fromv$archived_logval, v$databasevdb
whereval.resetlogs_change# = vdb.resetlogs_change#
andval.applied='YES'
group by thread# order by 1;
SQL>  SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
#############################################################
SwitchBack :
############################################################

1.) Before starting a switch back, verify the following:
============================================
On the host "RAC1"
Make sure that both of the RAC Instances are in the MOUNTED State, if not then issue the following commands:
srvctl start database -d PRIMDB -o mount
OR
If instance 1 is down then:
srvctl start instance -d PRIMDB -iPRIMDB1 -o mount
If instance 2 is down then:
srvctl start instance -d PRIMDB -iPRIMDB2 -o mount
SELECT DATABASE_ROLE,SWITCHOVER_STATUS FROM V$DATABASE;
dgmgrl sys/***@PRIMDB
show configuration verbose; ==> This should not have any errors
2.) To perform the switch back, connect to dgmgrl as shown in previous step & execute the following command:
=================================
switchover to 'PRIMDB';


3.) Once the step 2.) is complete, check the status from the sqlplus as:-
On the host SBY:
==============================
SQL> SELECT DATABASE_ROLE,SWITCHOVER_STATUS FROM V$DATABASE;
set pages 999 lines 300
col OPEN_MODE for a10
col HOST_NAME for a12
select name,INSTANCE_NAME,HOST_NAME,OPEN_MODE,DATABASE_STATUS,DATABASE_ROLE,PROTECTION_LEVEL,CONTROLFILE_TYPE,LOGINS,
to_char(STARTUP_TIME,'DD-MON-YYYY HH24:MI:SS') "UP TIME"fromv$database,gv$instance;
SQL> select name,open_mode,database_role from v$database;
Check the archivelogs application on to standby:
SQL> select thread#, max(sequence#) "Last Standby Seq Received"
fromv$archived_logval, v$databasevdb
whereval.resetlogs_change# = vdb.resetlogs_change#
group by thread# order by 1; 
SQL>  select thread#, max(sequence#) "Last Standby Seq Applied"
fromv$archived_logval, v$databasevdb
whereval.resetlogs_change# = vdb.resetlogs_change#
andval.applied='YES'
group by thread# order by 1;
SQL>  SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';

4.) Verify the Primary:
===============================
On the hostsRAC1/2:
Check the status of both the RAC Instances in "OPEN READ/WRITE" Mode by:
selectinst_id,status,host_name,instance_name;
If the status is not open then open the respective instances as:
In case instance 1 is down:
srvctl start instance -d PRIMDB -iPRIMDB1
In case Instance 2 is down:-
srvctl start instance -d PRIMDB -iPRIMDB2

5.) dgmgrl sys/***@PRIMDB ==> from node "RAC1”
show configuration verbose ; ==> This should not have any errors
show database verbose "PRIMDB";
show instance verbose "PRIMDB1";
show instance verbose "PRIMDB2";
show database verbose "SBYDB";

6.) Once the switchover & switchback is fully complete, verify the DG Broker status in the OEM as well.
#############################################

References: - OracleMetalink Note ID -11.2 Data Guard Physical Standby Switchover Best Practices using the Broker (Doc ID 1305019.1)

Conclusion: The process outlined above utilizing switchover/switchback with DGMGRL enables TriCore to help clients reduce downtime for production and address the issue within 30 minutes.

Learn more about TriCore's Data Warehouse and Database Managed Services.

About TriCore Solutions
TriCore Solutions, the application management experts, provides a full suite of scalable and reliable managed application, cloud, infrastructure hosting, and consulting services to enterprise organizations. The company delivers its services and the TriCore Trusted Promise to more than 250 companies worldwide to reduce costs, raise service levels, improve customer experience, increase business agility, and accelerate innovation, unlocking the business value from their IT investments. TriCore Solutions is headquartered in Boston, MA, with offices in India and throughout North America.