The DBA’s Guide to Setting Up Oracle RAC One Node and Oracle Data Guard, Part 2 (Advanced)

by Martin Bach
Published July 2011

In this installment, learn the manual, more advanced approach to this configuration.

In Part 1 of this article, we covered a Oracle Data Guard Broker configuration for Oracle RAC One Node and Oracle Data Guard. This configuration is well suited for less experienced DBAs who don’t want or need to be in control of each step during the role transitions. Furthermore, the broker is ideal in situations where automation is key or the staff is less experienced. In this installment, we will examine a manual configuration that gives the DBA more fine-grained control.

Unlike the broker approach, the manual Oracle Data Guard configuration requires the user to understand the process better and how Oracle Data Guard works, and requires the user to exercise more attention to detail, ensuring the configuration is identical across all standby databases. With the increase of standby database configurations to 30 in Oracle 11.2, this becomes even more important.

Creating a Manual Oracle Data Guard Configuration

Surprisingly there is no difference between how a Oracle RAC One Node standby is configured compared to how this task is done with an Oracle RAC database. In generic terms, you will have to perform the configuration steps on each database, primary and standby. It is important to prepare the primary database for a standby role, including the creation of standby redo logs. The question is not if the role change is going to arrive as a task on your desk, it’s rather when that is going to happen. And remember that the database doesn’t have to be the faulty component in the stack-a problem with the middle tier can equally cause a massive outage.

The main initialization parameter to modify with regards to standby databases is the log_archive_dest_n parameter which defines where to ship an archived log to, and in which role that should happen. One of the real improvements to Oracle Data Guard came with Oracle Database 10g, which allowed configuring log shipping based on database role. In other words, you could set up the log shipping on the primary, instructing it to send redo only if it was in a standby role.

Let’s have a look at how this is done in real life. Start by configuring the primary database.

Copy




        SQL alter system set log_archive_dest_2=
          2> 'service=RONDG valid_for=(online_logfiles,primary_role) db_unique_name=RONDG';
    System altered.
    SQL> alter system set fal_server=RONDG;
    System altered.
    SQL> alter system set log_archive_config='dg_config=(RON,RONDG)';
    System altered.
    SQL> alter system set standby_file_management='auto';
    System altered.

These steps are the minimum necessary to create a configuration. The log_archive_dest_2 parameter specifies that the database RON should submit redo to the service “RONDG” when it is in primary role. The Oracle Data Guard Concepts and Administration guide has all the options, the more interesting ones probably being around the compression option (might require the Advanced Compression option). With Oracle 11.1 traffic caused by gap resolution (FAL – fetch archive log) could be compressed to save bandwidth, and now with 11.2 you can even compress ordinary redo log shipping.

The necessary configuration steps on the standby database are very similar, just the inverse to the previous ones:

Copy



        SQL alter system set log_archive_dest_2=
          2> 'service=RON valid_for=(online_logfiles,primary_role) db_unique_name=RON';
    System altered.
    SQL> alter system set fal_server=RON;
    System altered.
    SQL> alter system set log_archive_config='dg_config=(RON,RONDG)';
    System altered.
    SQL> alter system set standby_file_management='auto';
    System altered.

Optionally create standby redo log files for your standby (and primary database!) at this stage for real time apply. If you are planning to, you should add standby redo logs for each thread, of the same size as the online redo logs. One member is enough; standby redo logs do not need to be multiplexed. Also, you should create one more standby redo log group than existing groups per thread.

In our primary database we have two threads (instance RON_1 is currently up and running which explains why thread 2 is closed):




RON SQL> select thread#,status,enabled,groups from v$thread;

   THREAD# STATUS ENABLED      GROUPS
---------- ------ -------- ----------
         1 OPEN   PUBLIC            2
         2 CLOSED PUBLIC            2

RON SQL>

You can furthermore query v$log to check about size and groups per thread you have, columns bytes, group# and thread# reveal important information.

Adding standby redo logs then is simple. For each thread, add group# + 1 standby redo logs. For my database with 5 groups per thread, the command to use could be as follows:

Copy



BEGIN
  FOR THREAD IN 1..2 LOOP
    FOR LOGFILE IN 1..6 LOOP
      EXECUTE IMMEDIATE'ALTER DATABASE ADD LOGFILE THREAD ' || THREAD || ' SIZE 150M';
    END LOOP;
  END LOOP;
END;
/

That will create six standby redo logs per thread, enough for real time apply. Now let’s enable managed recovery to complete the configuration.

Copy



        SQL> alter database recover managed standby database using current logfile
  2  disconnect from session;

You should see now that media recovery is taking place – the alert log should emit lines similar to these:



        Media Recovery Log +FRA/rondg/archivelog/2011_03_29/thread_1_seq_265.270.747054265
           Media Recovery Log +FRA/rondg/archivelog/2011_03_29/thread_1_seq_266.271.747054265
           Media Recovery Log +FRA/rondg/archivelog/2011_03_29/thread_1_seq_267.261.747048749

Well done, you are on track! Before closing the ticket used to implement the Oracle Data Guard configuration I usually check if log shipping is working.

Testing Data Guard Log Shipping

The first step after a successful configuration is to test if log shipping works. Here’s an example: I am querying the current log sequence number from the primary database before performing a log switch:

Copy




        SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     297
Next log sequence to archive   300
Current log sequence           300
SQL> alter system switch logfile

System altered.

The standby database was current up to sequence# 299 before the log switch. After the log switch it reported the new log applied:



SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
           299

-- log switch on primary

SQL> /

MAX(SEQUENCE#)
--------------
           300

In an Oracle RAC database you’d have to take the individual threads into account in the above query, but with Oracle RAC One Node and only one active instance, this is not needed. This is good news for developers of custom monitoring scripts.

Flashback Database

Just to be on the safe side, I usually enable Flashback Database on all involved databases before a switchover -- this includes the standby database(s) as well! Checking if it has been enabled is simple, as is the correction if it’s not. Consider this example of database RONDG where Flashback was not enabled:

Copy




        SQL> select db_unique_name,database_role,flashback_on from v$database
            2 /
    DB_UNIQUE_NAME DATABASE_ROLE FLASHBACK_ON
       ------------------------------ ---------------- ------------------
       RONDG PHYSICAL STANDBY NO
    SQL> alter database recover managed standby database cancel;
    Database altered.
    SQL> alter database flashback on
    Database altered.
    SQL> alter database recover managed standby database using current logfile
        2 disconnect from session;
    Database altered
    SQL> select db_unique_name,database_role,flashback_on from v$database;
    DB_UNIQUE_NAME DATABASE_ROLE FLASHBACK_ON
       ------------------------------ ---------------- ------------------
       RONDG PHYSICAL STANDBY YES

Testing a Graceful Switchover

No Oracle Data Guard setup is complete without regular switchover tests. A lot of experienced instructors from Oracle and DBAs from the field also suggest that a failover should be exercised regularly, but most sites I have visited refuse this kind of stringent testing. The failover situation is shown in the next section.

Before beginning any switchover operation it is prudent to ensure you are logged on to the correct database and that flashback is enabled on all involved databases. In this scenario, it is:

Copy



        SQL> select db_unique_name,database_role, flashback_on from v$database;
    DB_UNIQUE_NAME DATABASE_ROLE FLASHBACK_ON
       ------------------------------ ---------------- ------------------
     RON PRIMARY YES
    SQL> select db_unique_name,database_role, flashback_on from v$database;
    DB_UNIQUE_NAME DATABASE_ROLE FLASHBACK_ON
       ------------------------------ ---------------- ------------------
       RONDG PHYSICAL STANDBY YES

Initiate the switchover command on the primary, as shown below:

Copy




SQL> alter database commit to switchover to physical standby with session shutdown;

Database altered.

SQL> shutdown immediate
ORA-01507: database not mounted

ORACLE instance shut down.

[oracle@node1 ~]$ srvctl start database -d RON -o mount
[oracle@node1 ~]$ srvctl status database -d RON
Instance RON_1 is running on node node1
Online relocation: INACTIVE

Connected to the database we see that the role has indeed changed:



SQL> select db_unique_name,database_role from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE
------------------------------ ----------------
RON                            PHYSICAL STANDBY

Now all that remains to be done is to enable managed recovery:

Copy



        SQL> alter database recover managed standby database
  2  using current logfile
  3  disconnect from session;   

Database altered.

You should also use srvctl modify database to update the database’s role to physical standby to avoid violating your license (unless you have Oracle Active Data Guard of course).

On the standby, issue these commands after the alter database commit to switchover... completed to turn it into a primary:

Copy



    SQL> alter database commit to switchover to primary;
    Database altered.
    SQL> alter database open;
    Database altered.
    SQL> select db_unique_name,database_role,flashback_on from v$database;
    DB_UNIQUE_NAME DATABASE_ROLE FLASHBACK_ON
       ------------------------------ ---------------- ------------------
          RONDG PRIMARY YES

It’s important to check that log shipping is working, using the same technique as before. First query the primary database:

Copy



SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 304
Next log sequence to archive 304
Current log sequence 304
SQL> alter system switch logfile;
System altered.

Next check the standby database:

Copy



        SQL> select max(sequence#) from v$archived_log where applied='YES' and 
            2 resetlogs_change#=(select max(resetlogs_change#) from v$archived_log);

    MAX(SEQUENCE#)
       --------------
        305

Alternatively you could show alert –tail –f option in ADRCI to follow Oracle as it ships, archives and applies the log. The view V$MANAGED_STANDBY is also very useful.

Finally start the service RON_APP.example.com and update the configuration of RONDG to reflect that it is now a primary database using srvctl modify database.

Testing a Failover

Taking the scenario one step further is the test of a failover. Previous to Oracle Database 10g, a failover meant that the failed database had to be restored from backup. Luckily this has changed! In the next section I am going to explore how to make the best use of the Flashback Database feature to minimize the required effort to get the database back.

Following the previous example of graceful switchover to RONDG, let’s start by simulating a failure on the primary database, RONDG:


    
    SQL> select db_unique_name,database_role,flashback_on from v$database;

    DB_UNIQUE_NAME                 DATABASE_ROLE    FLASHBACK_ON
    ------------------------------ ---------------- ------------------
    RONDG                          PRIMARY           YES
    [oracle@drnode1 ~]$ srvctl stop database -d RONDG -o abort

Assume that the primary database cannot be started again within the agreed service level, and the standby has to be activated. The following example shows these steps on the standby database:

Copy



    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
    Database altered.
    SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
    Database altered.
    SQL> select db_unique_name,database_role,flashback_on from v$database;

    DB_UNIQUE_NAME                 DATABASE_ROLE    FLASHBACK_ON
    ------------------------------ ---------------- ------------------
   RON                             PRIMARY          YES

With this done, the service has been restored, and the application teams busily re-point their application servers to the new database. Again, don’t forget to reflect the configuration change in the Oracle Cluster Registry. If not started already, ensure that the service RON_APP.example.com is started on RONDG before handing the system over to the application server team.

Reinstate the Old Primary as Physical Standby

As we said earlier, activating the standby database meant a lot of work in Oracle9i Database and before, but luckily this can be avoided with Flashback Database. (But only if all the required logs are still there!). The new primary database keeps a record of the System Change Number at which it has been activated:



    SQL> select standby_became_primary_scn , database_role from v$database
    STANDBY_BECAME_PRIMARY_SCN DATABASE_ROLE
    -------------------------- ----------------    
                       1397880 PRIMARY

The old primary, RONDG, has to be flashed back to this SCN and converted back to a physical standby. After the problem has been rectified, the database can be brought up again:



    [oracle@drnode1 ~]$ sqlplus / as sysdba

    SQL*Plus: Release 11.2.0.2.0 Production on Fri Dec 31 14:50:19 2010

    Copyright (c) 1982, 2010, Oracle. All rights reserved.

    Connected to an idle instance.

    SQL> startup mount
    ORACLE instance started.

    Total System Global Area 4275781632 bytes
    Fixed Size 2233336 bytes
    Variable Size 2348813320 bytes
    Database Buffers 1912602624 bytes
    Redo Buffers 12132352 bytes
    Database mounted.
    SQL> flashback database to scn 1397880;

    Flashback complete.

    SQL> alter database convert to physical standby;

    Database altered.

    SQL> shutdown immediate
    sORA-01507: database not mounted 

    ORACLE instance shut down.
    SQL> startup mount
    ORACLE instance started.

    Total System Global Area 4275781632 bytes
    Fixed Size 2233336 bytes
    Variable Size 2348813320 bytes
    Database Buffers 1912602624 bytes
    Redo Buffers 12132352 bytes
    Database mounted.

    SQL> select db_unique_name,database_role from v$database;

    DB_UNIQUE_NAME DATABASE_ROLE
    ------------------------------ ----------------
    RONDG                          PHYSICAL STANDBY

    SQL> alter database recover managed standby database disconnect from session
           2 /

Log shipping and application should be validated again before the system can be declared functional. At a later stage a graceful switchover can restore things to how they were.

Summary

In this two-part article, you have learned how to set up a physical standby database for an Oracle RAC One Node primary using ACFS for the diagnostic destination and other administrative files. The standby database has been registered with the Oracle Cluster Registry, and manual as well as broker configurations have been created and tested. The scenario could be taken further by trying Fast Start Failover for lights-out management of a RON environment, but this remains an exercise for the reader.

Martin Bach is a freelance Oracle consultant in the UK and is the co-author of Pro Oracle Database 11g RAC on Linux (Apress).