Enterprises have used their information technology (IT) infrastructure to provide competitive advantage, increase productivity, and empower users to make faster and more informed decisions. However, with these benefits has come an increasing dependence on that infrastructure. Should a critical application, server, or data become unavailable, the entire business can be placed in jeopardy. Revenue and customers can be lost, penalties can be owed, and bad press can have a lasting effect on customers and a company's reputation. Building a high availability IT infrastructure is critical to the success and well-being of all enterprises in today's fast-moving economy.
Trends in computing technology are also enabling the deployment of a new IT architecture, referred to as Grid computing, that effectively pools large numbers of servers and storage into a flexible, on-demand computing resource for all enterprise computing needs. Technology innovations like low-cost blade servers, small and inexpensive multiprocessor servers, modular storage technologies, and open source operating systems such as Linux provide the raw material for the Grid. By harnessing these technologies, and leveraging the Grid technology available in the Oracle Database, enterprises can deliver extremely high quality of service to their users while vastly reducing their expenditures on IT. The Oracle Database enables you to capture the cost advantages of Grid enterprise computing without sacrificing performance, scalability, security, manageability, functionality, or system availability.
CAUSES OF DOWNTIME
One of the challenges in designing a highly available IT Grid infrastructure is examining and addressing all the possible causes of downtime. Figure 1 shows a taxonomy of downtime that classifies it into two primary categories: unplanned and planned. It is important to consider the causes of both when designing a fault-tolerant and resilient IT infrastructure.
Figure 1: Causes of Downtime
Unplanned downtime is primarily the result of computer failures or data failures. Planned downtime is primarily due to data changes or system changes that must be applied to the production system.
HIGH AVAILABILITY IN ORACLE DATABASE 11g
Oracle Database has been widely acknowledged for its technical and thought leadership in high availability (HA), with a broad suite of capabilities that help businesses maintain continuous operations during both unexpected failures and scheduled maintenance activities. With Oracle Database 11g, Oracle has extended that leadership with a suite of new features that provide significant value for the customer, as described in the following outline:
- While solutions such as RAC and Data Guard have addressed downtime in prior releases, Oracle Database 11g introduces new capabilities that eliminate or minimize it even further. One such capability is Online Patching. Initially available for Linux, it allows certain diagnostic patches to be installed completely online, that is, without requiring the database to be brought down or applications to be disconnected. This capability further minimizes downtime, leading to faster diagnostic analysis and greater customer satisfaction.
Offload Processing & Utilize Resources
- Traditional methods of implementing HA involve idle servers and offline storage that cannot be used for productive work. In contrast, Oracle Database 11g provides capabilities that allow customers to offload processing from the production server to standby servers, improving both the performance of the production server and the utilization of the standby servers. For example, the Oracle Active Data Guard Option enables real-time read-only access to a physical standby database, offloading queries, sorting, reporting, web-based access, and similar work from the production database while the standby continuously applies changes received from production. Similarly, the new Snapshot Standby capability allows a physical standby to be opened read/write for testing or reporting while it continues to accept redo data from the primary, so it provides DR protection at the same time.
Scale for Growth
- Oracle Database's scale-out architecture supports dynamic addition and removal of servers (through RAC) and storage disks (through ASM) in a grid model, giving customers an easy way to expand their architecture as their business grows. Oracle Database 11g lets the new capabilities be used in the same scale-out fashion. For example, the previously discussed Active Data Guard now allows physical standbys to be used in a Reader Farm configuration, where multiple physical standbys offer real-time read access in a highly scalable manner. An online music catalog provider, for instance, can have multiple physical standbys that scale website read access (e.g. catalog browsing). At peak holiday periods, when website traffic is expected to increase, the customer simply adds more physical standbys to support the additional workload, without incurring any downtime of the production database.
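The Reader Farm idea can be illustrated from the application side with a minimal Python sketch. It is not Oracle code and does not use any Oracle API: the class name ReaderFarmRouter, its methods, and the server names are hypothetical, and round-robin selection is only an assumed policy for spreading reads. It shows read-only work being distributed across physical standbys, writes going to the primary, and a standby being added at peak time without touching the primary.

```python
class ReaderFarmRouter:
    """Hypothetical router: writes go to the primary, reads rotate over standbys."""

    def __init__(self, primary, standbys):
        self.primary = primary
        self.standbys = list(standbys)
        self._next = 0  # running counter used for round-robin selection

    def add_standby(self, name):
        # Growing the farm only changes the routing list; the primary is untouched.
        self.standbys.append(name)

    def route(self, is_read_only):
        # Read/write work, or any work when no standby exists, uses the primary.
        if not is_read_only or not self.standbys:
            return self.primary
        target = self.standbys[self._next % len(self.standbys)]
        self._next += 1
        return target


if __name__ == "__main__":
    router = ReaderFarmRouter("primary", ["standby1", "standby2"])
    print([router.route(True) for _ in range(4)])   # catalog browsing (reads)
    print(router.route(False))                      # an order update (write)
    router.add_standby("standby3")                  # holiday peak: add capacity
    print([router.route(True) for _ in range(3)])
```

In a real deployment this routing would more likely be handled by connection services or a load balancer than by application code; the sketch only makes the division of work concrete.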
- Because Oracle Database's HA technologies are all integrated and Oracle-aware, they can provide tremendous value-added service. For example, by optimizing the direct integration between Oracle Secure Backup and RMAN, Oracle Secure Backup 10.2 is considerably faster than competitive products. Another notable solution in this area is Data Recovery Advisor, a tool (accessible through Enterprise Manager or the RMAN CLI) that has the Oracle kernel intelligence to automatically diagnose data failures in the database, present possible repair options, and execute the desired repairs at the user's request. With such smart integration throughout the Oracle Database HA solution set, many error-prone manual operations are removed from day-to-day database administration, improving the overall availability of the database.
Operational best practices are key to the successful implementation of IT infrastructure. Technology alone is not enough.
Oracle Maximum Availability Architecture (MAA) is a fully integrated and proven blueprint for building highly available systems. Enterprises that have based their system architecture on MAA find they can quickly and efficiently design and deploy applications that meet their business requirements for system availability. MAA encompasses specific design and configuration recommendations, which have been extensively reviewed and tested to ensure optimum system availability and reliability. The MAA blueprints examine and detail the combined use of key Oracle Database features for high availability including Real Application Clusters, Data Guard, Streams, Recovery Manager, Enterprise Manager, etc. They also address the configuration and integration of other critical components of highly available systems including servers, storage, networking, and the application server.