Oracle Data Guard and Remote Mirroring Solutions
Remote Mirroring solutions conceptually appear to
offer simple and complete data protection. However, for data resident
in Oracle databases,
Oracle Data Guard, with its built-in zero data loss capability, is
more efficient, less expensive and better optimized for data protection
and disaster recovery than traditional remote mirroring solutions.
Remote mirroring solutions do add value by
protecting data that does not reside in an Oracle database (e.g. filesystem
data, or data in a non-Oracle database). However, customers do not need to buy or
integrate any remote mirroring solution with Oracle Data Guard to get
any special data protection benefit for their Oracle databases.
The following sections provide details on the
relative advantages of using Oracle Data Guard compared to using remote
mirroring solutions for data protection and disaster recovery scenarios
for Oracle databases. A table at the end of this article
summarizes these benefits.
- Better Network Efficiency - With Oracle Data Guard, only the redo data
needs to be sent to the remote site. However, if a remote mirroring
solution is used for data protection, typically the database files, the
online logs, the archive logs and the control file must be mirrored.
If the flash recovery area is on the source volume that is
remote-mirrored, the flashback logs would also be remotely mirrored.
This means that compared to Data Guard, a remote mirroring solution
will send each change many more
times to the remote site.
Further, database writes happen a
lot more often than log writes because each log write typically
contains many changes (this capability is known as group commit).
This
means that the
network bandwidth needed for a database redo shipping solution such as
Data Guard is
considerably less than that of a remote mirroring solution. Even more
importantly, this means far fewer network round trips.
In an internal analysis of
Oracle's corporate e-mail systems, as shown in the following graph, it
was demonstrated that 7 times more data was transmitted over the
network and 27 times more I/O operations were performed using a remote
mirroring solution, compared to using Data Guard.
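The difference in transmitted volume can be made concrete with a rough model. The sketch below is illustrative only: the `Workload` fields, the 8 KB block size, the two log members per group and the sample figures are all assumptions, not measurements from the analysis described above, and control file writes are ignored.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload figures; none of these come from the article."""
    redo_bytes: int                   # redo generated by the changes
    dirty_blocks: int                 # data blocks DBWR eventually writes
    block_size: int = 8192            # assumed database block size in bytes
    log_members: int = 2              # assumed members per online log group
    flashback_mirrored: bool = False  # flash recovery area on a mirrored volume?


def data_guard_bytes(w: Workload) -> int:
    """Data Guard ships only the redo."""
    return w.redo_bytes


def mirroring_bytes(w: Workload) -> int:
    """Remote mirroring ships every write to every mirrored file."""
    online_logs = w.redo_bytes * w.log_members
    archived_logs = w.redo_bytes
    data_files = w.dirty_blocks * w.block_size
    # Crude assumption: one flashback block image per dirty block.
    flashback = data_files if w.flashback_mirrored else 0
    return online_logs + archived_logs + data_files + flashback


sample = Workload(redo_bytes=1_000_000, dirty_blocks=300)
ratio = mirroring_bytes(sample) / data_guard_bytes(sample)
print(f"mirroring sends {ratio:.1f}x the bytes Data Guard sends")
```

The model only counts bytes. It does not capture the number of separate I/O operations, which the text above identifies as the more important difference.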
- Better Performance - Both Data Guard and remote mirroring solutions offer
the option of zero data loss protection in the event of primary site
failure. This requires complete synchronization between primary and
standby databases, meaning that applications cannot proceed to the next
transaction until the data from the current committed transaction is
written to disk at the remote location. Application performance is thus
affected by the time it takes to transmit data from the primary site to
the remote location (i.e. network latency), write it to disk (disk
I/O), and receive return acknowledgement from the standby site (network
latency)
that the data has been received.
Remember from above that Data
Guard only transmits writes to the redo logs of the primary
database, whereas remote mirroring solutions must transmit these
writes as
well as every write to data files, additional members of online log
file groups, archived log files, and control files. The Oracle
process that writes to data files is called Database Writer
(DBWR). Data Guard is designed never to
affect DBWR on the primary database, since anything that slows down
DBWR can have a significant impact on database performance.
Remote mirroring solutions, however, do impact DBWR performance because
they
subject all DBWR writes to network and disk I/O induced delays
inherent to synchronous, zero-data-loss configurations.
This fact was proven
independently by a company that is a leading provider of intelligent
infrastructure services that enable and protect interactions over voice
and data networks. The objective of this user's testing was to
determine the impact of network latency on the primary database in a Data
Guard configuration compared to a standard, off-the-shelf
remote mirroring solution. The tests
utilized an OLTP workload that generated approximately 3 MB/sec of
redo data. A synchronous zero data loss configuration was
used and various degrees of network latency were tested, producing the
following results.
| RTT (Network Latency) in milliseconds | Latency Impact on Primary Database with Data Guard | Latency Impact on Primary Database with Remote Mirroring |
| 0 | 4% | 3% |
| 10 | 4% | 26% |
| 15 | no test run | 39% |
| 20 | 10% | no test run |
The test results are dramatic. The remote mirroring solution
inflicted more than six times the performance overhead of Data Guard at
10 ms RTT (26% versus 4%).
Since larger inter-site distances typically imply higher network
latencies, this shows the difficulty of using remote mirroring
solutions for zero data loss protection when a reasonable geographic
separation is an important disaster recovery (DR) objective. Data Guard
showed much better performance, making it practical to achieve far greater
geographic separation, resulting in better data protection and higher
availability than is possible to achieve using remote mirroring.
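The "more than six times" figure follows directly from the test results. The short sketch below simply redoes that arithmetic from the table values; the dictionary layout and the function name `overhead_ratio` are illustrative choices, and rows where either product was not tested yield no ratio.

```python
# Overhead percentages copied from the test results table; None = "no test run".
RESULTS = {
    0: {"data_guard": 4, "mirroring": 3},
    10: {"data_guard": 4, "mirroring": 26},
    15: {"data_guard": None, "mirroring": 39},
    20: {"data_guard": 10, "mirroring": None},
}


def overhead_ratio(rtt_ms):
    """Return mirroring overhead divided by Data Guard overhead, or None."""
    row = RESULTS[rtt_ms]
    dg, rm = row["data_guard"], row["mirroring"]
    if dg is None or rm is None:
        return None  # no like-for-like comparison at this latency
    return rm / dg


for rtt in sorted(RESULTS):
    print(rtt, overhead_ratio(rtt))
```

Only the 0 ms and 10 ms rows allow a direct comparison, so the ratio at higher latencies cannot be derived from these tests.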
- Better suited for WANs - Remote-mirroring solutions based on storage
systems often have a distance limitation due to the underlying
communication technology (Fibre Channel, ESCON) used by the storage
systems. In a typical example, the maximum distance between two
storage systems connected in a point-to-point fashion and running synchronously
can be only a few tens of km. Using specialized devices this can be extended to
approximately 100 km. However, for the DR data center to be located
beyond this distance, a
series of repeaters and converters from third-party vendors has to be
used. These devices convert ESCON/Fibre Channel to the appropriate IP,
ATM or SONET networks.
This approach adds to the overall
cost because, besides the remote mirroring solution, there is also a
cost associated with the converters. It may also introduce
latency in the system, impacting the production database performance
and making such a configuration especially unsuitable for synchronous
transport (necessary for the zero data loss capability) over any
distance greater than that of a LAN. This problem may be mitigated by
introducing intermediate storage boxes in the communication path, but
that only adds to the overall cost. The other solution is to use
variations of synchronous transmission - however, depending on the
remote mirroring solution, anything other than synchronous transmission
of data may not preserve write ordering across all mirrored volumes
that the database resides on. This means such configurations cannot
guarantee data consistency at all times, making them unsuitable as a
data protection / disaster recovery solution for OLTP data.
Since Data Guard transmits only
redo data to the standby sites, using a standard IP network, and
preserves transactional consistency across all the protection modes
(i.e. whether using synchronous or asynchronous mode of transport), and
does not need expensive interim storage boxes, it is a much better DR
and data protection solution for a WAN.
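To relate the distances discussed in this item to the round-trip times that matter for synchronous transport, a back-of-the-envelope estimate can help. The sketch below assumes that light travels roughly 200 km per millisecond in optical fiber (an approximation) and ignores every other source of delay, such as switches, protocol converters and indirect cable routes, so it gives a lower bound only. The function names are hypothetical.

```python
# Assumption: signal speed in optical fiber is roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for a given one-way fiber distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_distance_km(rtt_budget_ms: float) -> float:
    """Largest one-way fiber distance that fits within an RTT budget."""
    return rtt_budget_ms * FIBER_KM_PER_MS / 2


for km in (10, 100, 1000):
    print(f"{km} km -> at least {min_rtt_ms(km)} ms RTT")
```

Under these assumptions, an RTT in the 10 ms range corresponds to sites separated by up to about 1000 km of fiber; real networks will usually reach that latency at shorter distances.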
- Better resilience and data protection - Oracle Data Guard ensures much
better data protection and data resilience than remote mirroring
solutions, since corruptions introduced on the production database are
likely to be simply mirrored by remote mirroring solutions to the
standby site, but are eliminated by Data Guard. For example, if a stray
write occurs to a disk, or there is a corruption in the file system, or
the Host Bus Adaptor corrupts a block as it is written to disk, then a
remote mirroring solution will most likely propagate this corruption to
the DR site. Because Data Guard validates redo blocks before propagating
to and applying them on the standby database, all such external corruptions
are eliminated by Data Guard.
Implicit in the fact that Data Guard
validates redo data before it is applied is a second fact: the standby
database is available for read-only access while redo data is applied.
This is a significant difference from remote mirroring, where the target volumes
are not even mounted while data is mirrored. This makes a standby database
maintained by remote mirroring a "cold, lights-out" database whose state cannot
be predictably determined until the mirroring process is turned off and
the database is mounted and started. This Data Guard advantage was described
candidly in the following way during a conversation with IT managers at a
global logistics services provider:
"With remote mirroring it's scary. You
don't know what you have is working, unless you bring it up at failover time.
So at failover time, I can be the hero if it all works, or I can lose my job
if it doesn't. With Data Guard, you always know it's working, so that's great."
In addition, Oracle Data Guard is
integrated with the
Flashback Database feature, and allows application of changes to be
delayed as well. These capabilities prevent many human errors and data
corruptions from propagating, and/or affecting the standby database.
Remote mirroring does not have that advantage - any inadvertent drop of
a critical table will be instantly propagated to the remote copy of the
database files.
Data Guard SQL Apply further improves the
resiliency of the standby by translating the redo back to SQL. This
inherently validates the changes. Also, it makes it possible to
potentially skip certain changes. None of these are possible with
remote mirroring solutions.
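The contrast drawn earlier in this item, between copying bytes blindly and validating them before they are applied, can be illustrated with a toy example. This is not Oracle's redo format or its actual validation logic: the 4-byte CRC32 prefix and every function name below are assumptions made purely for illustration.

```python
import zlib


def make_block(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 (hypothetical block layout)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def is_valid(block: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    stored = int.from_bytes(block[:4], "big")
    return zlib.crc32(block[4:]) == stored


def mirror(blocks):
    """Byte-level mirroring: every block is copied, corrupt or not."""
    return list(blocks)


def validated_apply(blocks):
    """Apply only blocks that pass validation; report the rest."""
    applied, rejected = [], []
    for index, block in enumerate(blocks):
        (applied if is_valid(block) else rejected).append(index)
    return applied, rejected


good = make_block(b"update accounts set balance = 100")
corrupt = bytearray(good)
corrupt[-1] ^= 0x01  # simulate a stray write flipping one bit
stream = [good, bytes(corrupt)]

print(len(mirror(stream)), validated_apply(stream))
```

In this sketch the mirror ends up holding both blocks, including the damaged one, while the validating receiver applies only the first and flags the second.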
- Higher Flexibility - Data Guard is implemented on top of pure commodity
hardware. It only requires a standard TCP/IP-based network link between
the two computers. There is no fancy or expensive hardware required. It
also allows the storage to be laid out in a different fashion from the
primary. For example, customers can put the files on different disks,
volumes, file systems, etc.
In contrast, remote mirroring
solutions are restrictive: many of them
are proprietary and can be used only with
identically configured storage systems from the same vendor that
manufactures the remote mirroring solution. This restriction is an
important point to bear in mind as customers choose the configurations
of their DR sites. One Data Guard customer, a large insurance company
with primary site storage from one vendor, chose another vendor
for the storage at its DR site (for business reasons). It could not
have done that if it were using a remote mirroring solution.
- Better Functionality - Data Guard, with its full suite of data protection
features - e.g. Redo Apply (physical standby databases), SQL Apply
(logical standby databases), multiple protection modes, push-button
automated switchover/failover capabilities, automatic gap detection and
resolution, GUI-driven management and monitoring framework, cascaded
redo log destinations, etc., is a much more comprehensive and effective
solution optimized for data protection and disaster recovery, than
remote mirroring solutions.
- Higher ROI - Businesses have to ensure that they are getting as much value as
possible from their IT investments, and that no IT infrastructure is sitting
idle. Data Guard has been architected to allow businesses to get something
useful out of their expensive investment in a DR site. This is
typically not possible with remote mirroring solutions.
Using Oracle Data Guard, the
physical standby database (using the
Active Data Guard option in Oracle Database 11g)
or logical standby database can be open read-only
for reporting and queries while changes are being applied to the standby
databases. A logical standby database can even be open read-write. This
extends the functionality of Data Guard to beyond DR - e.g. customers can
offload resource intensive reporting operations to their standby databases,
thereby improving the performance of their production database. Remote-mirroring
solutions can't provide such capabilities. In those cases, the
remote databases are just idling away and the DR site resources are not
effectively utilized, reducing the ROI of such systems.
Data Guard is well-integrated
with Recovery
Manager (RMAN), allowing fast incremental backups to be off-loaded to the physical
standby database, saving critical system resources at the primary site
and enabling efficient utilization of system resources on the standby
site. Typically, customers will not have that flexibility if they were
to use a remote mirroring solution.
Data Guard in Oracle Database 11g offers the Snapshot Standby database, which
allows the physical standby database to be used for testing, cloning, and read-write
reporting while it continues to provide disaster protection by saving the changes
received from the production database. This is one of the many ways that Data Guard
allows customers to extract tremendous value out of their DR investment. That is
not possible with remote-mirroring solutions.
Data Guard offers an integrated
switchover capability to address planned maintenance, e.g. hardware or
OS upgrades, minimizing the cost of downtime associated with such
scheduled outages. This typically cannot be achieved with a remote
mirroring solution.
Data Guard also supports an automatic
failover capability through the Fast-Start Failover feature. In the event of
an outage at the production site, this allows a standby database to become a
primary database
very quickly, without requiring any manual intervention - assuring high
availability on top of disaster recovery. Using supported interfaces, applications
can also automatically connect to the new primary database.
With SQL Apply, administrators
can create additional indexes and materialized views to better suit
their reporting requirements. This cannot be done with mirroring.
SQL Apply can also be used to
offload the work required to do a reorganization. For example, if the
administrator wants to partition a table, then they can stop the apply
process on the logical standby, partition the table, and then restart
the apply. Again, such advanced functionality is not possible with
remote databases in a remote mirroring configuration.
Finally, Oracle Data Guard is available as an out-of-the-box feature of
the core Oracle database. It is also pre-integrated with other HA
solutions from Oracle, e.g. Real Application
Clusters (RAC), RMAN, etc. Remote mirroring solutions, by contrast, are
extra-cost purchases and require complex integration with the database.
In addition, using remote mirroring solutions may involve
capacity pricing on top of the basic licensing for the product,
i.e. remote mirroring vendors may charge for the size of the data
replicated (e.g. units of 50 GB of replicated data), which
adds significantly to the cost for large OLTP databases.
- Other issues to consider - There are various other issues one should consider while evaluating
Data Guard and a remote mirroring solution for disaster recovery
purposes. For example, what facility does the remote mirroring solution
provide to instantiate the remote database? Note that for very large
databases, instantiating a remote database over the network, as may be
required by some remote mirroring solutions, may take a very
long time. With Data Guard, the standby database may simply be
instantiated from a tape backup of the primary, cutting down on the
initialization time, resources and costs.
Another issue to consider is the
handling of network connectivity problems. Data Guard allows the
Maximum Availability mode, under which zero data loss is enabled
through synchronous redo transmission to the standby site, but if there
is a network problem and the standby site is unreachable, processing
continues on the primary database. When network connectivity is
restored, re-synchronization of the standby databases is automatically
and gracefully handled by Data Guard. Customers considering a remote
mirroring solution should investigate how it allows such dynamic
switching from one variation on synchronous transmission to the other.
Similarly, customers should also determine how it recovers from
potential network communication problems.
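The Maximum Availability behavior described above (synchronous shipping while the standby is reachable, continued processing on the primary when it is not, and automatic catch-up afterwards) can be pictured with a small simulation. This is a toy model under stated assumptions, not Data Guard's implementation: the class, its methods and the list-based "redo" are all hypothetical.

```python
class PrimaryInMaxAvailability:
    """Toy model of a primary that prefers synchronous redo shipping."""

    def __init__(self):
        self.primary_redo = []   # changes committed on the primary
        self.standby_redo = []   # changes received by the standby
        self.standby_reachable = True

    def _ship_missing_redo(self):
        # Send, in order, whatever the standby has not yet received.
        missing = self.primary_redo[len(self.standby_redo):]
        self.standby_redo.extend(missing)

    def commit(self, change: str) -> str:
        self.primary_redo.append(change)
        if self.standby_reachable:
            self._ship_missing_redo()
            return "synchronized"
        # Standby unreachable: keep processing instead of stalling.
        return "unsynchronized"

    def network_lost(self):
        self.standby_reachable = False

    def network_restored(self):
        self.standby_reachable = True
        self._ship_missing_redo()  # automatic catch-up of the gap

    def unprotected_changes(self) -> int:
        return len(self.primary_redo) - len(self.standby_redo)


db = PrimaryInMaxAvailability()
db.commit("txn-1")
db.network_lost()
db.commit("txn-2")
db.commit("txn-3")
print("exposed while disconnected:", db.unprotected_changes())
db.network_restored()
print("exposed after resync:", db.unprotected_changes())
```

The questions posed above for a remote mirroring product map onto the two methods `network_lost` and `network_restored`: what happens to application writes in the first case, and how the gap is closed in the second.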
Clearly, Oracle Data Guard provides a compelling
set of technical and business reasons that justify its adoption as the
disaster recovery and data protection technology of choice, over
traditional remote mirroring solutions. The following table summarizes
these justifications.
Data Guard Benefits Over Remote Mirroring
| | Oracle Data Guard |
| Better network efficiency | |
| Transmit only redo data | Yes |
| Transmit less data | Yes |
| Use fewer network I/Os | Yes |
| Better performance | |
| Minimal latency impact | Yes |
| Support large distances in synchronous mode for zero data loss | Yes |
| Better suited for WANs | |
| No distance limitation | Yes |
| Network transmission based on standard TCP/IP | Yes |
| No protocol conversion | Yes |
| No additional latency | Yes |
| Preserve write-ordering and data consistency | Yes |
| Better data resilience/protection | |
| Zero data loss capability | Yes |
| Protect from logical corruptions and human errors | Yes |
| Prevent propagating physical data corruptions through redo-block consistency checks at the standby | Yes |
| Standby database always open with validated, consistent data | Yes |
| Skip tables on the standby site | Yes |
| Higher flexibility | |
| Use commodity hardware | Yes |
| Primary/standby storage systems do not have to be identically configured | Yes |
| No vendor lock-in for primary/standby storage | Yes |
| Better functionality | |
| Support both Redo Apply and SQL Apply functionality in the same configuration | Yes |
| Multiple data protection modes to balance data availability and system performance | Yes |
| Automatic failover and push-button automated switchover/failover capabilities | Yes |
| Automatic gap detection and resolution | Yes |
| Graceful handling of network connectivity problems | Yes |
| GUI-based management and monitoring framework | Yes |
| Cascaded redo log destinations | Yes |
| Standby database initialization with tape backups | Yes |
| Higher ROI | |
| Standby database can be opened read-only or read-write | Yes |
| Allow backups to be offloaded on the standby database | Yes |
| Reporting/queries/testing using the standby database | Yes |
| Offload data reorganization | Yes |
| Integrated with other HA features from Oracle | Yes |
| Integrated natively with Oracle | Yes |
| No extra cost | Yes |