by Martin Petersen and Sonny Singh, Best Practices from Emulex and Oracle
Published February 2013
Data loss and data corruption can be catastrophic, but a new standards-based, end-to-end data integrity solution—which is the result of a joint effort by EMC, Emulex, and Oracle—mitigates episodes of silent data corruption by supporting the T10 Protection Information (T10 PI) standard. The T10 PI standard provides for end-to-end advanced data integrity. This article describes the perils associated with silent data corruption and explains how to implement the data integrity solution.
One of the most common areas in which data corruption occurs is writing to disk drives. There are two basic kinds of disk drive corruption:
With virtualized servers and multicore processors, the probability that a faulty memory cell will cause an error increases. An error that occurs without the knowledge of the application or the data center staff is called silent data corruption. Although silent data corruption is relatively rare, it can go undetected for long periods and result in costly downtime for business-critical functions.
Common perpetrators of silent data corruption include the following:
Data integrity protection is not new. ECC and CRC are available on most, if not all, servers, storage arrays, and Fibre Channel host bus adapters (HBAs). But these checks protect the data only temporarily within a single component. They do not ensure that the data you intended to write does not become corrupt as it travels down the data path from the application running in the server to the HBA, the switch, the storage array, and then the physical disk drive. When data corruption occurs, most applications are unaware that the data that was stored on the disk is not the data that was intended to be stored.
Over the last several years, EMC, Emulex, and Oracle have worked together to drive and implement the Protection Information additions to the T10 SBC standard, which enables the validation of data as it moves through the data path to ensure that silent data corruption does not occur.
The ultimate goal is to provide protection against silent data corruption from application to disk by creating integrity metadata, also known as protection information, coincident with data creation, and then validating the metadata throughout the data path and directing errors to the application for remediation, as shown in Figure 1.
The following steps occur when data is written:
The steps are done in reverse when data is read.
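To make the idea of protection information concrete, here is a minimal Python sketch of the 8-byte tuple that T10 PI attaches to each sector: a 16-bit guard tag (a CRC over the sector data), a 16-bit application tag, and a 32-bit reference tag (for Type 1 protection, the low 32 bits of the target block address). The function names `make_pi` and `verify_pi`, the 512-byte sector size, and the error strings are assumptions chosen for illustration. This is not the code that Oracle Automatic Storage Management, the kernel, the HBA, or the array actually runs.

```python
import struct

SECTOR_SIZE = 512  # assumed: bytes of data covered by one protection tuple


def crc16_t10dif(data: bytes) -> int:
    """CRC-16 with polynomial 0x8BB7, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_pi(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Build the tuple: guard tag (2 bytes), application tag (2), reference tag (4)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one full sector")
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)


def verify_pi(sector: bytes, pi: bytes, expected_lba: int) -> list:
    """Return the problems found; an empty list means the sector checks out."""
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    errors = []
    if guard != crc16_t10dif(sector):
        errors.append("guard tag mismatch (data changed)")
    if ref_tag != (expected_lba & 0xFFFFFFFF):
        errors.append("reference tag mismatch (wrong block)")
    return errors


if __name__ == "__main__":
    data = bytes(range(256)) * 2             # one 512-byte sector of sample data
    pi = make_pi(data, lba=1000)             # created together with the data

    print(verify_pi(data, pi, expected_lba=1000))     # intact sector
    flipped = bytes([data[0] ^ 0x01]) + data[1:]
    print(verify_pi(flipped, pi, expected_lba=1000))  # one bit flipped in flight
    print(verify_pi(data, pi, expected_lba=1001))     # I/O sent to the wrong block
```

Any component in the data path can repeat the same check, which is what allows an error to be caught where it occurs instead of being discovered later when the data is read back.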
The key to implementing this solution is making sure that you are using the correct software releases and hardware equipment that support the data integrity enhancements and have been fully tested.
End-to-end data integrity is supported when each component meets the minimum version requirements listed in Table 1.
Table 1
|Application/Operating System Layer|
|Application||Any Oracle Database application|
|Database||Oracle Database 11g with Oracle Automatic Storage Management|
|Operating System||Oracle Linux 5.x or 6.x with Unbreakable Enterprise Kernel versions 2.6.39-200.24.1 or later; Oracle ASMLib 2.0.8 or later|
|HBA Models (driver/firmware)||Emulex LPe12000-E or LPe12002-E with firmware 2.01a10 or later and driver 22.214.171.124.6p or later; Emulex LightPulse LPe16000 or LPe16000B with firmware 1.1.21 or later and driver 126.96.36.199.6p or later. Note: Equivalent OEM HBA (-E) models are also supported.|
|Array||EMC Symmetrix VMAX Series with Enginuity 5876.82.57|
The Emulex LightPulse LPe16000B 16G FC HBAs support protection information offload, which improves overall performance by 30%. This data integrity solution lets you protect your data and resources while still meeting service-level agreements. The latest Emulex LightPulse LPe16000B 16G FC HBAs provide the industry's highest level of data integrity with full line-rate performance and no system overhead via vEngine CPU offload technology. In addition, Emulex's BlockGuard data integrity feature makes storage area network (SAN) deployments that use Oracle products operate faster and better.
The data integrity solution is EMC E-Lab certified, which helps ensure complete data integrity with the Oracle stack—from the application through the enterprise storage array—for all I/O operations. EMC is the first enterprise storage array vendor to join with Emulex and Oracle in implementing this end-to-end data integrity solution.
This data integrity solution embodies a truly seamless architecture. It requires no additional configuration. If the relevant storage, HBA, firmware, driver, Oracle Automatic Storage Management, and Oracle ASMLib components are in place, data protection is automatically enabled. This means you can conduct day-to-day system administration tasks without having to worry about cumbersome tuning of parameters.
You can verify that data protection is enabled by using the oracleasm-discover command, which will show the data integrity profile associated with each Oracle Automatic Storage Management disk, as shown in the example below. This command takes no arguments and simply prints a list of discovered devices.
# oracleasm-discover
Using ASMLib from /opt/oracle/extapi/64/asm/orcl/1/libasm.so
[ASM Library - Generic Linux, version 2.0.8 (KABI_V2)]
Discovered disk: ORCL:P00 [20971520 blocks (10737418240 bytes), maxio 512, integrity DIX1-512/512-IP]
Discovered disk: ORCL:P01 [20971520 blocks (10737418240 bytes), maxio 512, integrity DIX1-512/512-IP]
[...]
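If you want to check many disks at once, the output can be scanned with a short script. The following Python sketch extracts the integrity profile reported for each discovered disk. The regular expressions are derived only from the sample output above, so other Oracle ASMLib versions may format these lines differently. The function name `integrity_profiles` is hypothetical, and treating a missing `integrity` field as "no protection reported" is an assumption.

```python
import re

# Patterns based solely on the sample output shown in this article.
DISK_LINE = re.compile(r"Discovered disk:\s+(?P<name>\S+)\s+\[(?P<attrs>[^\]]*)\]")
INTEGRITY = re.compile(r"integrity\s+(?P<profile>\S+)")


def integrity_profiles(output: str) -> dict:
    """Map each discovered disk name to its integrity profile, or None if absent."""
    profiles = {}
    for match in DISK_LINE.finditer(output):
        found = INTEGRITY.search(match.group("attrs"))
        profiles[match.group("name")] = found.group("profile") if found else None
    return profiles


if __name__ == "__main__":
    # The first three lines are copied from the article's example. The ORCL:P02
    # line is hypothetical, added to show a disk that reports no profile.
    sample = """\
Using ASMLib from /opt/oracle/extapi/64/asm/orcl/1/libasm.so
[ASM Library - Generic Linux, version 2.0.8 (KABI_V2)]
Discovered disk: ORCL:P00 [20971520 blocks (10737418240 bytes), maxio 512, integrity DIX1-512/512-IP]
Discovered disk: ORCL:P02 [20971520 blocks (10737418240 bytes), maxio 512]
"""
    for disk, profile in integrity_profiles(sample).items():
        status = profile if profile else "no integrity profile reported"
        print(f"{disk}: {status}")
```

On a live system, the text passed to `integrity_profiles` would be the captured output of the oracleasm-discover command instead of the embedded sample.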
T10 PI prevents several common silent data corruption scenarios. Steps are taken by the HBA, the storage device, and the operating system to make sure that corrupted data is not written.
If a data integrity error is encountered, Oracle ASMLib will attempt to retry the I/O command. A data integrity error message and a retry message are recorded in the Oracle Automatic Storage Management trace log.
The end-to-end Oracle Linux–based data integrity solution has come to fruition through the joint efforts of Emulex, EMC, and Oracle and is a result of many years of development and collaboration.
The data protection information generated by Oracle Automatic Storage Management is validated by the Oracle Linux operating system, and then passed on to the Emulex HBA and the EMC VMAX array, thus enabling protection throughout the I/O stack.
Martin Petersen has been involved in Linux kernel development since the early 1990s. He works in Oracle's Linux Engineering group where he focuses on future I/O and storage technologies.
Sonny Singh has been with Emulex's Marketing group for three years. He is responsible for the inbound management and co-marketing of all Emulex solutions sold through Oracle and focuses on branding, go-to-market strategy, and development of co-branded solutions.
Revision 1.0, 02/15/2013