Articles
Architect: High Availability
Guide to Oracle Data Guard Fast-Start Failoverby John Smiley Published March 2009 This document will guide you through configuring Oracle Data Guard Fast-Start Failover (FSFO) using a physical standby database. FSFO can provide substantial gains in high availability and disaster recovery preparedness for all environments, from inexpensive Cloud-based systems to global distributed data centers. The information in this guide is based on practical experience gained from deploying FSFO in a global corporate production environment. The guide makes few assumptions about your existing environment and includes examples for creating a physical standby database and Data Guard Broker configuration. To get started, all you'll need is Oracle Database Enterprise Edition Release 10.2 or later, a database, and three hosts: two for the databases and a small host for the FSFO observer. The guide attempts to be operating system agnostic; however, some examples may contain platform specific elements such as path and file naming conventions.
Major Components of an FSFO Environment
FSFO builds upon a number of other Oracle technologies and features such as Data Guard,
Flashback Database, and Data Guard Broker.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Database name |
db1 |
| Database unique names |
db1_a db1_b |
|
Domain name |
demo.org |
| Hostnames |
dbhost-a dbhost-b
|
| Data Guard listener name | LISTENER_DG |
| TNS aliases
|
db1_a db1_b |
It's good practice to use separate listeners for application connections and Data Guard connections. This allows Data Guard to remain functional during maintenance periods when the application listeners are down. Be sure to include the Data Guard listener in the
local_listeners database parameter.
Most of the network services used in a FSFO environment may use dynamic registration, but to enable Broker to restart instances during role transitions or during reinstatement after a failover, you must define a static service named
db_unique_name_dgmgrl.
db_domain
. (Note: 11.1.0.7 adds the
StaticConnectIdentifier Broker database property to allow you to specify a different service name.) If you will be using RMAN to create the standby database, it also needs a static service to restart the database being created. In order to maintain separation of Broker and non-Broker activity, a second static service is recommended.
listener.ora configuration for host "a":
LISTENER_DG =
(DESCRIPTION=
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=dbhost-a)(PORT=1522))
)
)
SID_LIST_LISTENER_DG=
(SID_LIST=
(SID_DESC=
(SID_NAME=db1)
(SDU=32767)
(ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
(GLOBAL_DBNAME=db1_a_static.demo.org)
)
(SID_DESC=
(SID_NAME=db1)
(SDU=32767)
(ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
(GLOBAL_DBNAME=db1_a_dgmgrl.demo.org)
)
)
listener.ora configuration for host "b":
LISTENER_DG =
(DESCRIPTION=
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=dbhost-b)(PORT=1522))
)
)
SID_LIST_LISTENER_DG=
(SID_LIST=
(SID_DESC=
(SID_NAME=db1)
(SDU=32767)
(ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
(GLOBAL_DBNAME=db1_b_static.demo.org)
)
(SID_DESC=
(SID_NAME=db1)
(SDU=32767)
(ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
(GLOBAL_DBNAME=db1_b_dgmgrl.demo.org)
)
)
Example tnsnames.ora entries:
db1_a=
(description=
(SDU=32767)
(address_list=
(address=(protocol=tcp)(host=dbhost-a)(port=1522))
)
(connect_data=
(service_name=db1_a.demo.org)
(server=dedicated)
)
)
db1_b=
(description=
(SDU=32767)
(address_list=
(address=(protocol=tcp)(host=dbhost-b)(port=1522))
)
(connect_data=
(service_name=db1_b.demo.org)
(server=dedicated)
)
)
db1_a_static=
(description=
(SDU=32767)
(address_list=
(address=(protocol=tcp)(host=dbhost-a)(port=1522))
)
(connect_data=
(service_name=db1_a_static.demo.org)
(server=dedicated)
)
)
db1_b_static=
(description=
(SDU=32767)
(address_list=
(address=(protocol=tcp)(host=dbhost-b)(port=1522))
)
(connect_data=
(service_name=db1_b_static.demo.org)
(server=dedicated)
)
)
lsnrctl start LISTENER_DG
tnsping db1_a
Attempting to contact (description= (SDU=32767) (address_list= (address=(protocol=tcp)(host=dbhost-a)(port=1522)))
(connect_data= (service_name=db1_a.demo.org) (server=dedicated)))
OK (0 msec)
tnsping db1_b
Attempting to contact (description= (SDU=32767) (address_list= (address=(protocol=tcp)(host=dbhost-b)(port=1522)))
(connect_data= (service_name=db1_b.demo.org) (server=dedicated)))
OK (0 msec)
Steps that require the primary to be in a mounted (not open) state are grouped together in the section below entitled Steps Requiring a Bounce of the Primary.
alter system set local_listener='(address=(host=dbhost-a)(port=1522)(protocol=tcp))';
alter system register;
show parameter local_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
local_listener string (address=(host=dbhost-a)(port=
1522)(protocol=tcp))
lsnrctl status listener_dg
...
Services Summary...
Service "DB1_A" has 1 instance(s).
Instance "db1", status READY, has 1 handler(s) for this service...
Service "DB1_A_XPT" has 1 instance(s).
Instance "db1", status READY, has 1 handler(s) for this service...
Service "db1XDB" has 1 instance(s).
Instance "db1", status READY, has 1 handler(s) for this service...
Service "db1_a_dgmgrl.demo.org" has 1 instance(s).
Instance "db1", status UNKNOWN, has 1 handler(s) for this service...
Service "db1_a_static.demo.org" has 1 instance(s).
Instance "db1", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
alter database force logging;
select force_logging from v$database;
FOR
---
YES
create spfile='?/dbs/spfile${ORACLE_SID}.ora' from pfile='?/dbs/init${ORACLE_SID}.ora';
alter system set spfile='?/dbs/spfiledb1.ora';
show parameter spfile;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string ?/dbs/spfiledb1.ora
orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID
select * from v$pwfile_users;
USERNAME SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS TRUE TRUE FALSE
alter system set remote_login_passwordfile=exclusive scope=spfile;
show parameter remote_login_passwordfile
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_login_passwordfile string EXCLUSIVE
alter system set db_unique_name = db1_a scope=spfile;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string db1_a
show parameter db_unique_name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_unique_name string db1_a
alter system set db_recovery_file_dest_size = 20g scope=both;
alter system set db_recovery_file_dest = '/u01/fra' scope=both;
show parameter db_recovery_file
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string /u02/flash_recovery_area
db_recovery_file_dest_size big integer 2G
alter system set standby_file_management=auto;
show parameter standby_file_management
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
standby_file_management string AUTO
alter system set log_archive_config='DG_CONFIG=(db1_b)' scope=both;
show parameter log_archive_config
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_config string DG_CONFIG=(db1_b)
select bytes, count(group#) from v$log group by bytes;
BYTES COUNT(GROUP#)
---------- -------------
52428800 3
Find the highest group#
select max(group#) from v$log;
MAX(GROUP#)
-----------
3
Add the SRLs. Unlike ORLs, SRLs should be created with only one member per group. There is no need to multiplex SRLs in order to protect redo as with ORLs (the redo is already protected in the ORLs of the primary). Multiplexing SRLs merely adds unnecessary IO and can increase commit latency. For systems with multiple RAID controllers, consider creating SRLs such that their IO is balanced across the controllers.
In this example, there are 3 ORLs with a max group# of 3. We will create 4 SRLs starting with group# 11. Starting with 11 is purely cosmetic - it allows new ORL groups to be added later while keeping their group# in the same sequence as the existing ORLs.
alter database add standby logfile group 11 '/u02/oradata/db1/stby-t01-g11-m1.log' size 52428800;
alter database add standby logfile group 12 '/u02/oradata/db1/stby-t01-g12-m1.log' size 52428800;
alter database add standby logfile group 13 '/u02/oradata/db1/stby-t01-g13-m1.log' size 52428800;
alter database add standby logfile group 14 '/u02/oradata/db1/stby-t01-g14-m1.log' size 52428800;
select group#, type, member from v$logfile where type = 'STANDBY';
GROUP# TYPE MEMBER
---------- ------- ----------------------------------------
11 STANDBY /u02/oradata/db1/stby-t01-g11-m1.log
12 STANDBY /u02/oradata/db1/stby-t01-g12-m1.log
13 STANDBY /u02/oradata/db1/stby-t01-g13-m1.log
14 STANDBY /u02/oradata/db1/stby-t01-g14-m1.log
alter database archivelog;
select log_mode from v$database;
LOG_MODE
------------
ARCHIVELOG
alter database flashback on;
alter system set db_flashback_retention_target = 60 scope=both;
select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
show parameter db_flashback_retention_target
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target integer 60
alter database set standby database to maximize availability;
select protection_mode from v$database;
PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY
db1:/u01/app/oracle/product/11.1.0/db_1:Y
db_name = db1
startup nomount
rman target sys/password@db1_a_static auxiliary sys/password@db1_b_static
connected to target database: DB1 (DBID=1234567890)
connected to auxiliary database: X (not mounted)
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
DUPLICATE TARGET DATABASE
FOR STANDBY
FROM ACTIVE DATABASE
DORECOVER
SPFILE
SET db_unique_name='db1_b'
SET log_archive_config=''
SET log_file_name_convert= ' ',' '
SET local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dbhost-b)(PORT=1522)))'
NOFILENAMECHECK;
alter database flashback on;
alter system set db_flashback_retention_target = 60 scope=both;
select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
show parameter db_flashback_retention_target
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target integer 60
alter system set dg_broker_config_file1='/u01/app/oracle/admin/db1/dgbroker/dg1db1.dat';
alter system set dg_broker_config_file2='/u02/app/oracle/admin/db1/dgbroker/dg2db1.dat';
show parameter dg_broker_config_file
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1 string /u01/app/oracle/admin/db1/dgbr
oker/dg1db1.dat
dg_broker_config_file2 string /u02/app/oracle/admin/db1/dgbr
oker/dg2db1.dat
alter system set dg_broker_start=true;
show parameter dg_broker_start
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
dg_broker_start boolean TRUE
dgmgrl sys/password@db1_a
create configuration 'FSF' as
primary database is db1_a
connect identifier is db1_a;
Configuration "FSF" created with primary database "db1_a"
add database db1_b as
connect identifier is db1_b
maintained as physical;
Database "db1_b" added
show configuration
Configuration
Name: FSF
Enabled: NO
Protection Mode: MaxAvailability
Databases:
db1_a - Primary database
db1_b - Physical standby database
Fast-Start Failover: DISABLED
Current status for "FSF":
DISABLED
DGConnectIdentifier in Oracle Database 11g,
InitialConnectIdentifier
in Oracle Database 10g).
ObserverConnectIdentifier allows you to specify different connect identifiers for the observer to use. You might, for instance, use this to allow the observer to monitor the databases using the same connect identifiers as the client applications.
edit database db1_a set property LogXptMode='SYNC';
edit database db1_a set property NetTimeout=10;
edit database db1_b set property NetTimeout=10;
enable configuration;
show configuration
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_a - Primary database
db1_b - Physical standby database
Fast-Start Failover: DISABLED
Current status for "FSF":
SUCCESS
enable fast_start failover;
Enabled.
With FSFO enabled, Broker expects to find an observer, which we haven't started yet, so if you verify the configuration at this point with 'show configuration', Broker will report a warning (if it doesn't, give it a minute to discover that the observer isn't there).
show configuration
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_a - Primary database
db1_b - Physical standby database
- Fast-Start Failover target
Fast-Start Failover:
ENABLED
Current status for "FSF":
Warning: ORA-16608: one or more databases have warnings
Running a StatusReport on the primary should verify that the error is due to a missing observer.
show database db1_a statusreport
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
* ERROR ORA-16819: fast-start failover observer not started
tnsping db1_a
Attempting to contact (description= (SDU=32767) (address_list= (address=(protocol=tcp)(host=dbhost-a)(port=1522)))
(connect_data= (service_name=db1_a.demo.org) (server=dedicated)))
OK (0 msec)
tnsping db1_b
Attempting to contact (description= (SDU=32767) (address_list= (address=(protocol=tcp)(host=dbhost-b)(port=1522)))
(connect_data= (service_name=db1_b.demo.org) (server=dedicated)))
OK (0 msec)
dgmgrl sys/password@db1_a
start observer;
observer started
The terminal session will appear to hang at this point.
dgmgrl sys/password@db1_a
show configuration verbose
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_a - Primary database
db1_b - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: db1_b
Observer: observer.demo.org
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "FSF":
SUCCESS
Use the 'show fast_start failover' command to see which user configurable FSFO failover conditions are in effect.
show fast_start failover;
Fast-Start Failover: ENABLED
Threshold: 30 seconds
Target: db1_b
Observer: observer.demo.org
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
dgmgrl sys/password@db1_a
switchover to db1_b
Performing switchover NOW, please wait...
New primary database "db1_b" is opening...
Operation requires shutdown of instance "db1" on database "db1_a"
Shutting down instance "db1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "db1" on database "db1_a"
Starting instance "db1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "db1_b"
show configuration
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_b - Primary database
db1_a - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Current status for "FSF":
SUCCESS
Now let's test switchover in the other direction.
switchover to db1_a
Performing switchover NOW, please wait...
New primary database "db1_a" is opening...
Operation requires shutdown of instance "db1" on database "db1_b"
Shutting down instance "db1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "db1" on database "db1_b"
Starting instance "db1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "db1_a"
show configuration
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_a - Primary database
db1_b - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Current status for "FSF":
SUCCESS
select (sysdate - oldest_flashback_time)*24*60 as history from v$flashback_database_log;
HISTORY
----------
140.35
Don't initiate failover unless at least 30 minutes of history is available.
-- Note that DDL statements automatically commit
create table x as select * from all_objects;
Table created.
select count(*) from x;
COUNT(*)
----------
68855
shutdown abort
observer log:
Initiating Fast-Start Failover to database "db1_b"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "db1_b"
Examine the Broker configuration by logging into dgmgrl on the new primary. Notice that the former primary is now disabled.
dgmgrl sys/password@db1_b
show configuration
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_b - Primary database
db1_a - Physical standby database
(disabled)
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Current status for "FSF":
Warning: ORA-16608: one or more databases have warnings
select count(*) from x;
COUNT(*)
----------
68855
startup mount
observer log:
Initiating reinstatement for database "db1_a"...
Reinstating database "db1_a", please wait...
Operation requires shutdown of instance "db1" on database "db1_a"
Shutting down instance "db1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "db1" on database "db1_a"
Starting instance "db1"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "db1_a" ...
Reinstatement of database "db1_a" succeeded
dgmgrl status:
Configuration
Name: FSF
Enabled: YES
Protection Mode: MaxAvailability
Databases:
db1_b - Primary database
db1_a - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Current status for "FSF":
SUCCESS
mkdir -p /u01/app/oracle/admin/wallet
Create a wallet and set the default username and password to the database's SYSDBA credentials (usually SYS).
mkstore -wrl /u01/app/oracle/admin/wallet -createEntry oracle.security.client.default_username SYS
mkstore -wrl /u01/app/oracle/admin/wallet -createEntry oracle.security.client.default_password <sys password>
Add the wallet location and override to sqlnet.ora.
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /u01/app/oracle/admin/wallet)
)
)
SQLNET.WALLET_OVERRIDE = TRUE
mkdir -p /u01/app/oracle/admin/db1/observer
nohup dgmgrl /@db1 "start observer file='/u01/app/oracle/admin/db1/observer/fsfo.dat'" \
>> /u01/app/oracle/admin/db1/observer/dgmgrl.log &