Easy Disaster Proof Production

By Porus Homi Havewala
Published January 2010

DBA: Grid Management

Using Oracle Enterprise Manager Grid Control for Data Guard setup, management (including switchover or failover), and monitoring can save a lot of time and resources

The cost of unplanned downtime can be prohibitive, especially in today’s economic climate. Thousands of dollars per hour in employee productivity can be lost, along with the potential loss of business – resulting in the competition moving in swiftly to fill the gap. Corporate prestige, and the associated share price, is also at stake.

To protect against server failures, the catchword has always been High Availability (HA). For many years, active-passive clustering was used to implement this concept. This typically used the idea of switchable LUNs ( Logical Unit Number, in storage parlance) failed-over from one server to another. Database files on the LUN were made accessible to the passive server, and the Oracle instance restarted and recovered there.

Download

Oracle Database

This kind of dated technology was soon superseded by the new concept of active-active database clusters – via Oracle Real Application Clusters (RAC) – in which multiple servers can access the same database on shared storage. This approach protects against any server failures, affords HA, and promotes optimum use of all servers via load balancing and horizontal scale-out capabilities.

But for site failures – natural or man-made disasters threatening entire computer sites – HA is not sufficient. Rather, a true disaster recovery (DR) solution is required, in which a database "standby" is located at a different site far enough away to be insulated from any primary site disaster.

In the early days, establishing DR for Oracle databases involved setting up a manual standby database. This technique was developed by Oracle Consulting in the pre-7.3 days – it involved installing Oracle software on a standby server, copying a backup of the primary database across, starting its recovery, and setting up manually written shell scripts to transport the newly generated archive logs from the primary to the standby. Other scripts applied these logs onto the standby database in continuous recovery mode. The application of logs was also monitored by scripts.

At times, gaps needed to be resolved manually if the standby fell behind and log files did not come across or were deleted by some other process. Similar manual standby techniques are in use even today by certain companies using Oracle Database Standard Edition, in which this approach is the only option.

Officially, Oracle started to support standby databases in Oracle 7.3, with improvements made in each database version. Oracle8i (8.1.7) allowed the use of Oracle SqlNet 8 to transport the logs instead of scripted OS-level file transport. Managed Database Recovery, which was the automatic application of archived logs on the standby was also introduced. This removed a number of scripts in the process and increased automation.

With the release of Oracle Data Guard in Oracle 8.1.7, the story really came together. Available as a separate install and unzipped under the Oracle Home, it contained shell scripts to deploy Data Guard and a Data Guard Command line control interface through which the Data Guard agents could be started up on both servers. Finally, it allowed easy commands to perform a switchover or a failover.

In Oracle9i Database, Oracle Data Guard was enhanced by making it part of the kernel, with the redo transport and apply handled by server background processes. Also introduced was the Data Guard Broker and its background process (DMON).

Enhancements continue to be made with each new release. However, with all the new features, Data Guard setup naturally became more and more complex and this increased the risks of human error. Fortunately, Oracle created a powerful tool to aid in the setup of Data Guard: Oracle Enterprise Manager.

In particular, with its brand-new interface and architecture, Oracle Enterprise Manager Grid Control 10g streamlined and automated many day-to-day DBA tasks – including performance management (diagnosis and tuning), creation and execution of scheduled database backups, execution of scheduled scripts at the OS and database levels, download and application of database patches, controlled configuration management of the server, the OS and the database; and also the setup, management, and day-to-day monitoring of Oracle Data Guard Standby Databases – the subject of this article.

Today, Grid Control allows you to setup, manage, and monitor Oracle Data Guard in all database versions since Oracle9i, which it makes it a very powerful tool at corporate sites which often have multiple database versions. (It is obviously necessary that a Grid Control site should be properly architected to perform these varied tasks. For an overview of a recommended scalable architecture for Grid Control, see my published article on the Oracle Technology Network, " Grid Control Architecture for Very Large Sites".)

Grid Control offers a streamlined, error-free step-by-step way to set up Oracle Data Guard on any server running the Grid Control agent. In the case of one major corporation that had standardized on the use of Grid Control, multiple DBA teams in physical sites across an entire continent were using the tool to manage and monitor multiple versions of Oracle databases for different projects. Data Guard setup, management (including switchover or failover), and monitoring were all performed using Grid Control, resulting in considerable time savings, elimination of customized scripts, and reduction in human error .

Any corporate site can follow this example and achieve considerable cost savings in the long run, besides making life a little easier for its DBAs.

First Steps: Creating the Standby Oracle Home

Consider an Oracle Enterprise Manager Grid Control 10g Installation, patched to Release 5 (10.2.0.5.0). The Home Page of the console appears as in Figure 1.

Figure 1 Grid Control Home Page

Grid Control requires an agent to be installed on each server target you wish to monitor and manage. In your scenario, there is a primary database FINPRD1 on a server on which the agent has been installed, and this database is managed via Grid Control.

You are now considering another server for use as the standby database server. It has the same OS as the primary. The agent has been installed on this server in its own agent Home.

Since the Oracle Home does not exist, first install the Oracle Database Software on that server and create a Home. You can do this by selecting the Clone Oracle Home utility under the “Cloning” section of the Deployments Tab in Grid Control, as can be seen in Figure 2.

Figure 2 Deployments Tab

Any production use of the Clone Oracle Home utility requires a license for the Oracle Enterprise Manager Provisioning Pack . This versatile pack also allows a myriad of other automations, such as the cloning of Oracle databases and patching them via deployment procedures.

The first page of the Cloning utility that appears (Figure 3) allows you to select the Oracle Home that you wish to clone: the Source Home.

Figure 3 Clone Oracle Home: Source Home

The “View Source Type” dropbox on this page signifies that the Source Home can be selected from any installed Oracle Home visible as a target in Oracle Grid Control, or it can be selected from the Oracle Grid Control Software Libraries.

The latter option allows you to store a “Gold Copy” of a standardized Oracle Home in the corporate site, and use the cloning utility to create new Oracle software installations as and when required – a great benefit of the Provisioning Pack, along with the cloning of standardizedgold-copyOracle databases.

Select the Home on the Primary server. Click Next. The following page (Figure 4) allows you to specify the working directory during the copy, and the files to exclude.

Figure 4 Clone Oracle Home: Source Settings

By default, certain files in the exclude list will not be copied to the cloned Home. This makes sense since these files are closely related to the source Oracle Home and may not be applicable to the destination Home:

*.log,*.dbf,*.aud,*.trc,EMStagedPatches,sqlnet.ora,tnsnames.ora,listener.ora,oratab

The next page (Figure 5) allows you to select from a list of destination hosts. Only the hosts with the same operating system will be shown on the page. Select the host on which you intend to install the standby database – the Oracle Home will be copied to the Home location specified, which is:

D:\app\porushh\product\11.1.0\db_1

The new Oracle Home name is specified as “OraDb11g_home_stby” and the logon credentials are supplied for the standby host. Note that it is possible to add multiple destinations on this page, meaning that multiple cloned homes can be created on different destinations using this “Clone Oracle Home” utility from one source Oracle Home or Gold Copy.

Figure 5 Clone Oracle Home: Destinations

The next step allows you to specify pre-cloning and post-cloning scripts. This helps in customizing the cloning operation. You can select a host command or a script. Click Next. You are asked to specify the hostname and the Oracle Base for the destination Home. Specify the Oracle Base as D:\app\porushh instead of the original C:\app\porushh.

On the following page (Figure 6), you are asked to schedule the cloning operation. A job is created for the cloning which you can schedule either immediately or at a future time. The Enterprise Manager job scheduler starts the job at the chosen time.

Figure 6 Clone Oracle Home: Schedule

The Summary page appears. Confirm the cloning operation. Click on View Job to see the progress of the job. If the schedule was set to immediate, the job will proceed at once (Figure 7) with its operation of cloning the Oracle Home.

Figure 7 Job Execution: Clone Oracle Home

In the step that is displayed on the page – "Prepare Source Home" ,which is one of the initial steps in the "Clone Oracle Home" job – you can see that a large zip file is created in a temporary directory with the listed files excluded:



C:\OracleHomes\agent10g\bin\zip -r -S -9 
"C:\temp\oemchsED724047ACDB4DBCB7C2EE14AC386371\homeArchive" * -x "" "*.log" 
"*.log/*" "*.dbf" "*.dbf/*" "*.aud" "*.aud/*" "*.trc" "*.trc/*" 
"EMStagedPatches" "*/EMStagedPatches" "EMStagedPatches/*" "*/EMStagedPatches/*" 
"sqlnet.ora" "*/sqlnet.ora" "sqlnet.ora/*" "*/sqlnet.ora/*" "tnsnames.ora" 
"*/tnsnames.ora" "tnsnames.ora/*" "*/tnsnames.ora/*" "listener.ora" 
"*/listener.ora" "listener.ora/*" "*/listener.ora/*" "oratab" "*/oratab" 
"oratab/*" "*/oratab/*" 1> NUL

This zip file is then extracted to the destination directory. Other steps of the Home creation follow, like automatically discovering it in Oracle Enterprise Manager Grid Control so it is now available for future use. When the job is completed and the Home created, you can now proceed with the next step of creating the standby database.

Creating the Standby Database using Grid Control

The FINPRD1 database is the primary (production) database. From the Oracle Enterprise Manager Grid Control Console, select Targets..Databases and click on this database to bring up its Home Page. Click on the Availability Tab (Figure 8).

Figure 8 Availability Tab

Under the “Data Guard” section on this page, select Add Standby Database. You are now asked to login to the FINPRD1 database, if you have not already done so. Connect as SYSDBA and proceed. Since this is the first time Data Guard is being used for this database, select Add Standby Database to configure the environment (Figure 9).

Figure 9 Add Standby Database

Select the type of standby database you wish to create (Figure 10). A new physical standby database or a new logical standby database are the main options, or you can add an existing primary-standby database duo (that may have been set up manually) so that it is managed by Data Broker and can be monitored by Grid Control.

Figure 10 Select Type of Standby Database

A physical standby database is normally the best choice for DR, since it is an exact copy of the primary database and is the fastest type of standby. Select this option and continue.

The next step is the selection of the method that will be used to create the physical standby (Figure 11).

Usually an online backup of the primary database is performed, using Oracle Recovery Manager (RMAN). This provides the latest copy of the primary database to be used in setting up the standby, and the files are copied directly by RMAN.

There is also another possibility: if the database is very large, you can use a previously created whole RMAN database backup.

Figure 11 Add Standby Database: Backup Type

You decide to “Perform an online backup of the Primary database” along with “Use Recovery Manager (RMAN) to copy database files”. A staging area is not required in this case since the files will be directly copied by RMAN.

The other option in the online backup is to copy the database files via a staging area on both the servers, which you do not select.

The “Add Standby Database: Backup Options” page appears (Figure 12).

Figure 12 Add Standby Database: Backup Options

On this page, specify the number of concurrent RMAN processes that will be used to copy the online backup to the standby. If there is sufficient bandwidth, you can increase the number of processes (default 2) so that the creation of the standby is faster. Also specify the primary server credentials.

Under the section “Primary Database Standby Redo Log Files”, you should add standby redo log files to the Primary database if they are not already there. Standby redo log files are Oracle best practice for standby databases; they allow the primary database to receive redo log data when it transitions to the standby role. The synchronous and asynchronous redo transport modes also require that the redo transport destination have a standby redo log. So these logs are quite important. You can decide to use Oracle-managed files (OMF) for the standby redo files to keep the setup as simple as possible.

The next page (Figure 13) is where you specify the standby database attributes, such as the instance name (which must be unique on the standby host). Name the standby database as FINSTBY1. Specify that the standby database is setup using a “File System”. If an Oracle Automatic Storage Management (ASM) instance has been set up on the standby server, then ASM can be specified as the standby database storage.

Figure 13 Add Standby Database: Database Location

On the same page, you select the standby host and the Oracle Home where the standby will be created. Oracle Enterprise Manager Grid Control allows you to select from a list of host targets on which the Grid Control Agent has already been installed.

As a pre-requisite of Oracle Data Guard the operating system of the primary and standby hosts should be the same, and the list reflects that requirement. Note that there are specific cases where there can be differences in Oracle binaries between primary and standby systems in a Data Guard configuration due to differences in word sizes (operating system and/or database), Linux distributions, operating systems, or hardware architectures. (For current information on supported Data Guard configurations, please see MetaLink Note 413484.1 .) The Data Guard standby creation wizard does not allow you to create standby databases on these supported mixed platforms and these kinds of standby databases must be created manually and imported into Grid Control.

The Oracle Home should exist on the standby host and must also match the version of the primary Oracle Home. Since you have already created the new Oracle Home on the standby host using the “Clone Oracle Home” capability of the Oracle Enterprise Manager Provisioning Pack, proceed to the other steps.

The next page (Figure 14) allows you to specify the standby database file locations. Customize the file locations of the standby database, or accept the Optimal Flexibile Architecture (OFA) structure suggested by Oracle.

Figure 14 Add Standby Database: File Locations

On this page, a separate section “Backup File Access” will be displayed only when the standby is being created on a host other than the primary database, and RMAN is not being used to copy the database files of an 11g database. This section is not visible in Figure 14 because RMAN has been chosen. This demonstrates the intuitive nature of the Grid Control pages, that also differ across database versions as per the features available in each version.

When the Backup File Access section is displayed, choose a method to make the backup files of the primary database available to the standby host. One option is to transfer files from the primary host working directory to a standby host directory – simply specify a temporary location on the standby host for the backup files copied from the primary.

As the file transfer mechanism use FTP, the fastest method. An alternative is the Enterprise Manager HTTP Server option if FTP is not supported or not allowed by corporate policy. Configure Enterprise Manager to use HTTPS, so files are transferred in a secure manner.

In the real world, another commonly used method of accessing the backup files is via NFS – this is specifically useful if the Primary database is large in size and the file transfers would take a lot of time. If the working directory in the primary host can be accessed via NFS, a share, or some other network method from the standby host, then you can select “Directly Access the Primary host working directory location from the Standby host using a network pathname”. This is recommended since it would save time and disk space, since the backup files would not need to be transferred to the standby host.

On the same page (Figure 14), the “Network Configuration File Location” is used to specify the location of the directory where the target system’s OracleNet files reside so that Data Guard can add configuration information for the new Standby database, to the listener.ora and tnsnames.ora files. It is set by default to the new ORACLE HOME’s network/admin subdirectory.

Click on the Customize button to examine the file locations of the standby database. You find that the OFA suggestions would place the standby database files in locations such as:

D:\app\porushh\product\11.1.0\db_1\oradata\FINSTBY1

You decide to change these locations. You can set the locations for all datafiles together, as also for tempfiles, logfiles, control files, directory objects and external files. These are visible in various sections. Use the “Set location for all files” and the “Go” button for each type of file, to set the locations to the following:

C:\ORADATA\FINSTBY1

When the changes are complete, the page appears as in Figure 15. Note that the “Log Files” section shows four of the redo logs as Oracle-managed files. These are the same as the standby redo logs that have just been added to the primary database.

Figure 15 Customize Destination Options

Click on OK. Grid Control displays a warning that the directories specified are not present, and will be created automatically. Click Continue.

The next page (Figure 16) is the Standby Database Configuration page. This is an important page where the standby database parameters are set.

Figure 16 Add Standby Database: Configuration

The first parameter is the unique name of the standby database in the enterprise: the DB_UNIQUE_NAME parameter. You set this to FINSTBY1 since you are confident this name is unique in your corporate.

The target name is the display name used by Enterprise Manager. It is recommended to be the same as the unique name, so you set this to FINSTBY1 as well.

The Standby Archive location is where archived redo logs received from the primary database will be placed. It is suggested as:

D:\app\porushh\product\11.1.0\db_1\oradata\FINSTBY1\arc

Since you also want to use this as a Flash Recovery area, change this to:

D:\flash_recovery_area\FINSTBY1

The size of the Flash Recovery Area (FRA) is dependent on what information you plan on keeping in the FRA; flashback database logs, archive logs, a full copy of the database, etc. This can be changed as per your requirements or per corporate policy. Here we will use 4000MB.

On the same page, specify the Enterprise Manager monitoring credentials for Data Guard. These can be non-SYSDBA, such as the “dbsnmp” user, if this is dictated by corporate policy. The disadvantage of this is that it will not be possible for Enterprise Manager to monitor a mounted physical standby database – since only a SYSDBA login can connect to a mounted database. Specify SYSDBA monitoring credentials to cover this scenario.

In the Data Broker section, select to use the Data Guard Broker to manage the Data Guard configuration (recommended). At the bottom of the page (not completely shown in Figure 16) also select to use the Enterprise Manager connect descriptors as the Data Guard Connect Identifiers for both the primary and standby databases. Or you can choose to provide your own TNS identifiers to use existing service names.

The Review page is displayed (Figure 17). Expand the Standby Database Storage section, make sure all the details are correct, and click Finish.

Figure 17 Add Standby Database: Review

The Enterprise Manager job for creation of the standby database is submitted after completion of a few preliminary steps, such as creating the Data Guard configuration and the preparation of the job.

The initial steps are seen in Figure 18. We should note here that Cancel is allowed prior to the point of submission of the standby database creation job.

Figure 18 Processing: Add Standby Database (Cancel Allowed)

The final stages of the preliminary steps appear as in Figure 19, and the process can no longer be cancelled since the standby database creation job has been submitted.

Figure 19 Processing: Add Standby Database (Cancel Not Possible)

The standby database creation job is now submitted and proceeds to build the standby as per the RMAN or other methods chosen in the preceding pages. Once the job is submitted you will be returned to the Data Guard home page.

When the creation job completes, the Primary-Standby database duo is visible in the “Targets..Databases” list of Enterprise Manager Grid Control (Figure 20), as a Database Instance: Primary and a Database Instance: Physical Standby.

Both databases are seen as Status – Up.

Figure 20 Targets..Databases

Select the primary database FINPRD1 and from its database home page, go to the Availability tab. In the “Data Guard” section, a number of new options are now visible (Figure 21) since the standby database for this primary is now available.

These options are “Performance”, “Verify Configuration”, and “Add Standby Database”, besides the previous “Setup and Manage”.

Figure 21 New Data Guard Options displayed

Select Setup and Manage; this is the main Data Guard Management page (Figure 22) for your Primary-Standby Data Guard Configuration.

You can observe at a glance the Data Guard Status (normal), the Protection Mode used (Maximum Performance by default) and whether Fast-Start Failover is enabled or not (by default, disabled). This is a new feature as of Data Guard 10g, which if enabled, allows the standby database to assume primary status without human intervention – in the case of a primary failure.

Figure 22 Data Guard Management Page

The “Standby Progress Summary ” is prominently displayed on this page and is an indicator of the Transport lag and the Apply lag in a graphical format. This allows you to understand how far behind the standby is from the primary in terms of the transport and application of logs.

Both the primary and standby databases are at “Normal” status. The current log of the primary, the last received and applied log on the standby, and the estimated failover time (less than 1 second) are displayed. From this page, you can edit the Data Guard properties of the primary or the standby, or even add additional standby databases (up to nine can be added to a Data Guard Configuration).

Most important, from this page you can perform a “switchover” or a “failover” to the selected standby database. Normally a switchover is used for planned downtime, whereas a failover is used when there is a real disaster (unplanned downtime).

Under the section “Additional Administration”, it is recommended to click on Verify Configuration to go through an automatic series of checks on your Primary-Standby Duo. Grid Control verifies various standby database settings (Figure 23).

The current log on the primary database is switched, the protection mode is verified, the standby redo log files are checked, and the log switch is verified. A detailed report is produced on the health of your Data Guard setup. This helps in testing your setup proactively and finding any existing issues.

Figure 23 Processing: Verify Configuration

The Data Guard properties of the primary or standby databases can easily be edited via the main Data Guard Management page, seen in Figure 22.

Click on Properties..Edit under the Primary Database section, or the Edit button under the Standby Databases section after first selecting the Standby Database you wish to edit.

This brings up a series of property tabs. In the case of the Primary database, the first such General properties tab (Figure 24) is where you can turn the Redo transport on or off – this starts or stops the shipping of redo data to all the standby databases.

This is useful if you were bringing the primary database down for some scheduled work. The standby database keeps on running.

Figure 24 Edit Primary Database Properties: General Tab

The second tab displays the “Standby Role Properties” (Figure 25).

This allows you to change important Data Guard properties such as the “Redo Transport Mode” (ASYNC by default), the “Net timeout”, or the “Apply Delay” – using this, you purposely leave a gap of time between the standby and the primary, so as to protect against any manual errors committed by users on the primary database.

Figure 25 Edit Primary Database Properties: Standby Role Properties

As an example, if 30 minutes are specified as the “Apply Delay”, then logs are not applied on the standby database till 30 minutes have passed. The DBA has a chance to stop any user error from going across to the standby in this case.

The third tab (“Common Properties”) shows the Data Guard connect identifier which is in use, the number of log archive processes, and the log archive trace level – which can be set to a higher value to aid in debugging. This is seen in Figure 26.

Figure 26 Edit Primary Database Properties: Common Properties

Conclusion and Additional Information on Grid Control

You have managed to easily “disaster proof” your production using Oracle Enterprise Manager Grid Control, which clearly offers a highly sophisticated apparatus to set up Data Guard standby configurations.

Grid Control is also very useful for the day-to-day monitoring and management of Data Guard configurations, including switchovers or failovers in the case of planned or unplanned downtime. It employs a common Data Guard interface for Oracle9i, 10g, and 11g databases, and the new features in each database version or release are automatically offered for use during Data Guard setup, thus making Grid Control a very versatile management system for implementing Data Guard throughout the corporate site.

Additionally, Grid Control can be used efficiently for the setup and scheduling of Oracle RMAN backups for corporate databases. This is explained in my article “ Oracle RMAN Backups: Pushing the Easy Button”, which demonstrates how a large corporate moved to managing all its Oracle Database backups using Oracle Enterprise Manager Grid Control.

And if you want to learn how to automate the patching of all your Oracle RAC or non-RAC databases, ASM instances and Clusterware in your environment, read another recent article of mine titled “ Patch a Thousand Databases, Using Oracle Enterprise Manager Grid Control”. Enjoy the power of Grid Control.

Porus Homi Havewala, an Oracle ACE director (Database), is a Principal Consultant at S & I Systems Pte Ltd, an Oracle Platinum Partner in Singapore. He has extensive experience in Oracle technology since 1994 and has served as a production DBA, senior consultant, e-business technical DBA and systems administrator, development DBA, and database designer/ modeler (using Oracle Designer, of course). He was involved as the lead database architect and technical team leader for the first production Oracle Enterprise Manager Grid Control site in the world, a large telecommunications giant in Australia. He is an avid Grid Control evangelist and has written several OTN articles on managing the Grid..