Oracle Data Guard and Remote Mirroring Solutions

Remote Mirroring solutions conceptually appear to offer simple and complete data protection. However, for data resident in Oracle databases, Oracle Data Guard, with its in-built zero data loss capability, is more efficient, less expensive and better optimized for data protection and disaster recovery than traditional remote mirroring solutions. 

Remote mirroring solutions do add value by protecting data that does not reside in an Oracle database (e.g. file system data, or data in a non-Oracle database). However, customers do not need to buy or integrate any remote mirroring solution with Oracle Data Guard to get any special data protection benefit for their Oracle databases. 

The following sections detail the relative advantages of Oracle Data Guard over remote mirroring solutions for data protection and disaster recovery of Oracle databases. A table summarizing these benefits appears at the end of this document.

 

  • Better Network Efficiency - With Oracle Data Guard, only the redo data need to be sent to the remote site. However, if a remote mirroring solution is used for data protection, typically the database files, the online logs, the archive logs and the control file must be mirrored. If the flash recovery area is on the source volume that is remote-mirrored, the flashback logs would also be remotely mirrored. This means that compared to Data Guard, a remote mirroring solution will send each change many more times to the remote site.

Further, database writes happen a lot more often than log writes because each log write typically contains many changes (this capability is known as group commit). This means that the network bandwidth needed for a database redo shipping solution such as Data Guard is considerably less than that of a remote mirroring solution. Even more importantly, this means far fewer network round trips.
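The effect of group commit on the number of remote writes can be illustrated with a back-of-the-envelope model. The Python sketch below is illustrative only: the function name, the workload figures, and the assumption that each change causes a fixed number of additional mirrored writes are hypothetical and are not measurements.

```python
import math


def remote_writes(changes, changes_per_log_write, mirrored_writes_per_change):
    """Return (redo_shipping_writes, mirroring_writes) for a toy workload.

    changes: number of changes generated by the workload
    changes_per_log_write: changes batched into one log write (group commit)
    mirrored_writes_per_change: extra writes a mirror forwards per change
        (data file, archive log, ...); a hypothetical factor
    """
    # Redo shipping forwards one remote write per batched log write.
    redo_writes = math.ceil(changes / changes_per_log_write)
    # A mirror forwards the same log writes plus the other writes each change causes.
    mirror_writes = redo_writes + changes * mirrored_writes_per_change
    return redo_writes, mirror_writes


# Hypothetical workload: 10,000 changes, 20 changes per log write.
redo, mirror = remote_writes(changes=10_000, changes_per_log_write=20,
                             mirrored_writes_per_change=1)
print(redo, mirror)
```

Under these assumed figures the mirror forwards many more writes than redo shipping does; the real ratio depends entirely on the workload and storage layout.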

In an internal analysis of Oracle's corporate e-mail systems, as shown in the following graph, it was demonstrated that 7 times more data was transmitted over the network and 27 times more I/O operations were performed using a remote mirroring solution, compared to using Data Guard.

 

 

  • Better Performance - Both Data Guard and remote mirroring solutions offer the option of zero data loss protection in the event of primary site failure. This requires complete synchronization between primary and standby databases; meaning that applications cannot proceed to the next transaction until the data from the current committed transaction is written to disk at the remote location. Application performance is thus affected by the time it takes to transmit data from the primary site to the remote location (i.e. network latency), write it to disk (disk I/O), and receive return acknowledgement from the standby site (network latency) that the data has been received.
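The three components of the synchronous wait described above can be written down as a simple sum. The following Python sketch is a simplified model, not a description of how either product schedules I/O: the function name and the sample disk write time are assumptions, and the round-trip time is split evenly between the outbound and return legs.

```python
def sync_commit_wait_ms(rtt_ms, remote_disk_write_ms):
    """Time a synchronous commit waits on the remote site, in milliseconds."""
    one_way_ms = rtt_ms / 2.0
    # Outbound transmission + remote disk write + return acknowledgement.
    return one_way_ms + remote_disk_write_ms + one_way_ms


# Hypothetical remote disk write time of 2 ms at several round-trip times.
for rtt in (0, 10, 20):
    print(rtt, sync_commit_wait_ms(rtt, remote_disk_write_ms=2.0))
```

In this model the wait grows linearly with round-trip time, so every write that is made synchronous pays the full network latency.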

As noted above, Data Guard transmits only writes to the redo logs of the primary database, whereas remote mirroring solutions must transmit these writes as well as every write to data files, additional members of online log file groups, archived log files, and control files. The Oracle process that writes to data files is called Database Writer (DBWR). Data Guard is designed never to affect DBWR on the primary database, since anything that slows down DBWR can have a significant impact on database performance. Remote mirroring solutions, however, do impact DBWR performance because they subject all DBWR writes to the network and disk I/O delays inherent in synchronous, zero-data-loss configurations.

This was demonstrated independently by a company that is a leading provider of intelligent infrastructure services that enable and protect interactions over voice and data networks. The objective of this user's testing was to determine the impact of network latency on the primary database in a Data Guard configuration compared to a standard, off-the-shelf remote mirroring solution. The tests used an OLTP workload that generated approximately 3 MB/sec of redo data. A synchronous zero data loss configuration was used and various degrees of network latency were tested, producing the following results.


RTT (network latency, ms)   Latency impact on primary database
                            Data Guard     Remote mirroring
0                           4%             3%
10                          4%             26%
15                          no test run    39%
20                          10%            no test run


The test results are dramatic. At 10 ms RTT, the remote mirroring solution imposed more than six times the performance overhead of Data Guard (26% versus 4%). Since larger inter-site distances typically imply higher network latencies, this shows the difficulty of using remote mirroring solutions for zero data loss protection when reasonable geographic separation is an important disaster recovery (DR) objective. Data Guard showed much better performance, making it practical to achieve far greater geographic separation, resulting in better data protection and higher availability than is possible with remote mirroring. 
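The comparison can be reproduced directly from the published figures. The Python sketch below uses only the percentages reported in the test results above; the dictionary layout and the function name are illustrative choices, and a ratio is computed only where both solutions were actually tested.

```python
# Overhead percentages from the test results above; None marks "no test run".
results = {
    0: {"data_guard": 4, "remote_mirroring": 3},
    10: {"data_guard": 4, "remote_mirroring": 26},
    15: {"data_guard": None, "remote_mirroring": 39},
    20: {"data_guard": 10, "remote_mirroring": None},
}


def overhead_ratio(rtt_ms):
    """Remote mirroring overhead divided by Data Guard overhead, or None."""
    row = results[rtt_ms]
    dg, rm = row["data_guard"], row["remote_mirroring"]
    if dg is None or rm is None:
        return None  # no like-for-like comparison at this latency
    return rm / dg


for rtt in sorted(results):
    print(rtt, overhead_ratio(rtt))
```

Only the 0 ms and 10 ms rows allow a like-for-like comparison; at 10 ms the ratio is 26 / 4 = 6.5.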

  • Better suited for WANs - Remote mirroring solutions based on storage systems often have a distance limitation due to the underlying communication technology (Fibre Channel, ESCON) used by the storage systems. In a typical example, the maximum distance between two storage systems connected point-to-point and running synchronously is only a few tens of km. Using specialized devices this can be extended to approximately 100 km. However, for the DR data center to be located beyond this distance, a series of repeaters and converters from third party vendors has to be used. These devices convert ESCON/Fibre Channel to the appropriate IP, ATM or SONET networks. 

This approach adds to the overall cost, because, besides the remote mirroring solution, there is also a cost associated with the converters. Secondly, this may introduce latency in the system, impacting the production database performance and making such a configuration especially unsuitable for synchronous transport (necessary for the zero data loss capability) for any distance greater than a LAN. This problem may be mitigated by introducing intermediate storage boxes in the communication path, but that only adds to the overall cost. The other solution is to use variations of synchronous transmission - however, depending on the remote mirroring solution, anything other than synchronous transmission of data may not preserve write ordering across all mirrored volumes that the database resides on. This means such configurations cannot guarantee data consistency at all times, making them unsuitable as a data protection / disaster recovery solution for OLTP data.

Since Data Guard transmits only redo data to the standby sites, using a standard IP network, and preserves transactional consistency across all the protection modes (i.e. whether using synchronous or asynchronous mode of transport), and does not need expensive interim storage boxes, it is a much better DR and data protection solution for a WAN.

 

  • Better resilience and data protection - Oracle Data Guard ensures much better data protection and data resilience than remote mirroring solutions, since corruptions introduced on the production database are likely to be simply mirrored by remote mirroring solutions to the standby site, but are eliminated by Data Guard. For example, if a stray write occurs to a disk, or there is a corruption in the file system, or the Host Bus Adaptor corrupts a block as it is written to disk, then a remote mirroring solution will most likely propagate this corruption to the DR site. Because Data Guard validates redo blocks before propagating to and applying them on the standby database, all such external corruptions are eliminated by Data Guard.
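The difference between copying blocks blindly and validating them before they are propagated can be shown with a toy model. The Python sketch below is not Oracle's redo format or validation logic: the CRC32 checksum, the block layout, and all function names are assumptions chosen only to illustrate the idea.

```python
import zlib


def make_block(payload: bytes) -> dict:
    """Build a block carrying a checksum computed when it was written."""
    return {"payload": payload, "checksum": zlib.crc32(payload)}


def is_valid(block: dict) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    return zlib.crc32(block["payload"]) == block["checksum"]


def mirror(block: dict) -> dict:
    """Byte-for-byte copy with no validation, as a storage mirror would do."""
    return dict(block)


def ship_validated(block: dict) -> dict:
    """Copy the block only if it passes validation; otherwise reject it."""
    if not is_valid(block):
        raise ValueError("corrupt block rejected before reaching the standby")
    return dict(block)


good = make_block(b"update accounts set balance = 100")
# Simulate a stray write that damages the payload after the checksum was set.
corrupt = dict(good, payload=b"update accounts set balance = 1\x00\x00")

print(is_valid(mirror(corrupt)))  # the mirror has copied the damaged block
try:
    ship_validated(corrupt)
except ValueError as err:
    print(err)
```

In this sketch the mirror faithfully reproduces the damaged block at the remote site, while the validating path refuses to propagate it.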

Beyond validating redo data before it is applied, Data Guard keeps the standby database available for read-only access while redo data is applied. This is a significant difference from remote mirroring, where the target volumes are not even mounted while data is mirrored. This makes a standby database maintained by remote mirroring a "cold, lights-out" database whose state cannot be predictably determined until the mirroring process is turned off and the database is mounted and started. This Data Guard advantage was described candidly in the following way during a conversation with IT managers at a global logistics services provider:

"With remote mirroring it's scary. You don't know what you have is working, unless you bring it up at failover time. So at failover time, I can be the hero if it all works, or I can lose my job if it doesn't. With Data Guard, you always know it's working, so that's great."

In addition, Oracle Data Guard is integrated with the Flashback Database feature, and allows application of changes to be delayed as well. These capabilities prevent many human errors and data corruptions from propagating, and/or affecting the standby database. Remote mirroring does not have that advantage - any inadvertent drop of a critical table will be instantly propagated to the remote copy of the database files.
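The protection offered by delaying the application of changes can be sketched as a simple time window. The Python example below is a toy model, not Data Guard's implementation: the record layout, the function name, and the sample times are hypothetical.

```python
def split_by_delay(received, now_s, delay_s):
    """Split received redo records into (ready_to_apply, still_held)."""
    ready = [r for r in received if now_s - r["received_at_s"] >= delay_s]
    held = [r for r in received if now_s - r["received_at_s"] < delay_s]
    return ready, held


# Hypothetical timeline: a mistaken drop arrives 10 minutes before "now".
received = [
    {"change": "insert order 1001", "received_at_s": 0},
    {"change": "drop table orders (mistake)", "received_at_s": 3000},
]

# With a 30-minute delay, the mistaken change has not yet been applied.
ready, held = split_by_delay(received, now_s=3600, delay_s=1800)
print([r["change"] for r in ready])
print([r["change"] for r in held])
```

While the mistaken change is still held, an administrator has time to notice the error before it reaches the standby's data; a mirror with no such window would already have copied it.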

 

  • Higher Flexibility - Data Guard is implemented on top of pure commodity hardware. It only requires a standard TCP/IP-based network link between the two computers. There is no fancy or expensive hardware required. It also allows the storage to be laid out in a different fashion from the primary. For example, customers can put the files on different disks, volumes, file systems, etc. 

In contrast, many remote mirroring solutions are proprietary and can be used only with identically configured storage systems from the same vendor that makes the remote mirroring solution. This restriction is an important point to bear in mind as customers choose the configurations of their DR sites. One Data Guard customer, a large insurance company that has its primary site storage from one vendor, chose another vendor for the storage at its DR site (for business reasons). It could not have done that with a remote mirroring solution.

 

  • Better Functionality - Data Guard, with its full suite of data protection features - e.g. Redo Apply (physical standby databases), SQL Apply (logical standby databases), multiple protection modes, push-button automated switchover/failover capabilities, automatic gap detection and resolution, GUI-driven management and monitoring framework, cascaded redo log destinations, etc., is a much more comprehensive and effective solution optimized for data protection and disaster recovery, than remote mirroring solutions. 

 

  • Higher ROI - Businesses have to ensure that they are getting as much value as possible from their IT investments, and that no IT infrastructure is sitting idle. Data Guard has been architected to allow businesses to get something useful out of their expensive investment in a DR site. This is typically not possible with remote mirroring solutions.

Using Oracle Data Guard, the physical standby database (using the Active Data Guard option in Oracle Database 11g) can be open read-only for reporting and queries while changes are being applied to the standby databases. This extends the functionality of Data Guard to beyond DR - e.g. customers can offload resource intensive reporting operations to their standby databases, thereby improving the performance of their production database. Remote-mirroring solutions can't provide such capabilities. In those cases, the remote databases are just idling away and the DR site resources are not effectively utilized, reducing the ROI of such systems.

Data Guard is well-integrated with Recovery Manager (RMAN), allowing fast incremental backups to be off-loaded to the physical standby database, saving critical system resources at the primary site and enabling efficient utilization of system resources on the standby site. Typically, customers will not have that flexibility if they were to use a remote mirroring solution. 

Data Guard in Oracle Database 11g offers the Snapshot Standby database, which allows the physical standby database to be used for testing, cloning, and read-write reporting while still providing disaster protection, because it continues to save the changes received from the production database. This is one of the many ways that Data Guard allows customers to extract tremendous value out of their DR investment. That is not possible with remote mirroring solutions.

Data Guard offers an integrated switchover capability to address planned maintenance, e.g. hardware or OS upgrades, minimizing the cost of downtime associated with such scheduled outages. This typically cannot be achieved with a remote mirroring solution.

Data Guard also supports an automatic failover capability through the Fast-Start Failover feature. In the event of an outage at the production site, this allows a standby database to become a primary database very quickly, without requiring any manual intervention - assuring high availability on top of disaster recovery. Using supported interfaces, applications can also automatically connect to the new primary database.



Finally, Oracle Data Guard is available as an out-of-the-box feature of the core Oracle database. It is also pre-integrated with other HA solutions from Oracle, e.g. Real Application Clusters (RAC), RMAN, etc. Remote mirroring solutions, by contrast, are extra-cost purchases and require complex integration with the database. In addition, remote mirroring solutions may involve capacity pricing on top of the basic licensing for the product, i.e. remote mirroring vendors may charge for the size of the data replicated (e.g. in units of 50 GB of replicated data), which adds significantly to the cost for large OLTP databases.

 

  • Other issues to consider - There are various other issues to consider when evaluating Data Guard and a remote mirroring solution for disaster recovery purposes. For example, what facility does the remote mirroring solution provide to instantiate the remote database? Note that for very large databases, instantiating a remote database over the network, as may be required by some remote mirroring solutions, may take a very long time. With Data Guard, the standby database may simply be instantiated from a tape backup of the primary, cutting down on the initialization time, resources and costs.
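A rough estimate shows why instantiating a very large database over the network can take a long time. The Python sketch below is simple arithmetic under stated assumptions: the function name, the database size, the link speed, and the efficiency factor are all hypothetical, and sizes use decimal units (1 GB = 10^9 bytes).

```python
def transfer_hours(database_gb, link_mbit_per_s, efficiency=1.0):
    """Hours needed to copy a database over a network link.

    efficiency: fraction of the nominal link speed actually achieved
        (an assumed factor; 1.0 means the full nominal speed)
    """
    bits = database_gb * 8 * 1000**3
    seconds = bits / (link_mbit_per_s * 1_000_000 * efficiency)
    return seconds / 3600


# Hypothetical case: a 10 TB database over a 100 Mbit/s link.
print(round(transfer_hours(10_000, 100), 1))
```

Under these assumed figures the copy takes more than nine days even at full link speed, which is why instantiating the standby from an existing backup can be attractive.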

Another issue to consider is the handling of network connectivity problems. Data Guard allows the Maximum Availability mode, under which zero data loss is enabled through synchronous redo transmission to the standby site, but if there is a network problem and the standby site is unreachable, processing continues on the primary database. When network connectivity is restored, re-synchronization of the standby databases is automatically and gracefully handled by Data Guard. Customers considering a remote mirroring solution should investigate how it allows such dynamic switching from one variation on synchronous transmission to the other. Similarly, customers should also determine how it recovers from potential network communication problems.
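The behaviour described for Maximum Availability mode can be summarized as a small state model. The Python sketch below is a toy illustration of the described behaviour only, not Oracle's implementation: the class name, the attributes, and the status strings are hypothetical.

```python
class MaxAvailabilityModel:
    """Toy model: synchronous redo shipping that degrades instead of blocking."""

    def __init__(self):
        self.standby_reachable = True
        self.primary_log = []  # every redo record committed on the primary
        self.standby_log = []  # records received by the standby

    def commit(self, record):
        """Commit on the primary and ship synchronously when possible."""
        self.primary_log.append(record)
        if self.standby_reachable:
            self.standby_log.append(record)  # acknowledged before returning
            return "synchronized"
        return "unsynchronized"  # the primary keeps processing

    def restore_network(self):
        """Resynchronize by shipping the records the standby missed."""
        self.standby_reachable = True
        gap = self.primary_log[len(self.standby_log):]
        self.standby_log.extend(gap)
        return len(gap)


model = MaxAvailabilityModel()
print(model.commit("t1"))
model.standby_reachable = False  # simulate a network outage
print(model.commit("t2"), model.commit("t3"))
print(model.restore_network())
```

In this model the primary never stalls during the outage, and the standby catches up once connectivity returns. A remote mirroring solution under evaluation can be checked against the same two questions.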

 

Clearly, Oracle Data Guard provides a compelling set of technical and business reasons that justify its adoption as the disaster recovery and data protection technology of choice, over traditional remote mirroring solutions. The following table summarizes these justifications.

 

Data Guard Benefits Over Remote Mirroring

  Oracle Data Guard
Better network efficiency  
Transmit only redo data Yes
Transmit less data Yes
Use fewer network I/Os Yes
   
Better Performance  
Minimal latency impact Yes
Support large distances in synchronous mode for zero data loss Yes
   
Better suited for WANs  
No distance limitation Yes
Network transmission based on standard TCP/IP Yes
No protocol conversion Yes
No additional latency Yes
Preserve write-ordering and data consistency Yes
   
Better data resilience/protection  
Zero data loss capability Yes
Protect from logical corruptions and human errors Yes
Prevent propagating physical data corruptions through redo-block consistency checks at the standby Yes
Standby database always open with validated, consistent data Yes
Skip tables on the standby site Yes
   
Higher flexibility  
Use commodity hardware Yes
Primary/standby storage systems do not have to be identically configured Yes
No vendor lock-in for primary/standby storage Yes
   
Better functionality  
Support both Redo Apply and SQL Apply functionality in the same configuration Yes
Multiple data protection modes to balance data availability and system performance Yes
Automatic failover and push-button automated switchover/failover capabilities Yes
Automatic gap detection and resolution Yes
Graceful handling of network connectivity problems Yes
GUI-based management and monitoring framework Yes
Cascaded redo log destinations Yes
Standby database initialization with tape backups Yes
   
Higher ROI  
Standby database can be opened read-only Yes
Allow backups to be offloaded to the standby database Yes
Reporting/queries/testing using the standby database Yes
Offload data reorganization Yes
Integrated with other HA features from Oracle Yes
Integrated natively with Oracle Yes
No extra cost Yes

 

 

 

 
