TECHNOLOGY: Business Intelligence
Oracle Warehouse Builder 11g Release 2 and Heterogeneous Databases
By Mark Rittman
Use code template mappings to transform non-Oracle database data at the source.
Developers using Oracle Warehouse Builder have always been able to leverage the features and scalability of Oracle Database to load, transform, and store their data warehouse data. Oracle Warehouse Builder provides a graphical point-and-click environment that helps you easily map data between sources and targets with a set of rich SQL and PL/SQL transformations and manage the full data warehouse project lifecycle, from capture of the data model to deployment across multiple environments.
Oracle Warehouse Builder 11g Release 2 extends this capability to non-Oracle sources and targets by leveraging technology from another Oracle product, Oracle Data Integrator. Code templates based on Oracle Data Integrator knowledge modules implement best practices for connecting to, extracting, validating, and integrating data from a wide range of databases and middleware sources. Code templates use a mix of languages and substitution variables to leverage the native capabilities of various datasources. You can also write new code templates for features or datasources not built into Oracle Warehouse Builder.
Code templates that handle various aspects of the extract, transform, and load (ETL) process come in six categories:
Code templates provide the technology foundation for three key new Oracle Warehouse Builder features: heterogeneous database support, support for CDC, and support for code template mappings.
Heterogeneous database support. Oracle Warehouse Builder 11g Release 2 supports access to a range of Oracle Database instances and non-Oracle databases through the code template framework. Source platforms that work out of the box with code templates include Oracle Database, files, IBM DB2 UDB, and Microsoft SQL Server, and support for other platforms such as Oracle Essbase and MySQL can be added through Oracle metabase scripting.
Developers can still choose to access non-Oracle databases through gateways or Open Database Connectivity (ODBC), but the new code template framework makes setting up connections to other databases easier and does not require configuration steps outside of Oracle Warehouse Builder.
Support for CDC. Oracle Database has had native capabilities for the identification and transportation of new and changed data from datasources since Oracle9i Database. CDC code templates remove the complexity of identifying such data, through techniques such as mining the source database redo log or applying triggers to capture changes prior to a database commit. The code templates that ship with Oracle Warehouse Builder 11g Release 2 provide this functionality for several Oracle and non-Oracle database datasources. Select the tables for which to track changes when defining your database module, and Oracle Warehouse Builder will handle the technical details in the background.
Code template mappings. Probably the most significant new feature for Oracle Warehouse Builder developers, though, is support for code template mappings. Code template mappings combine the familiarity and flexibility of traditional Oracle Warehouse Builder mappings (now called database-resident mappings) with the integration possibilities provided by code templates.
For example, with code template mappings, you can extract data natively from a non-Oracle database; perform a JOIN between tables directly on the source database platform; and then load the result, using a MERGE statement, into an Oracle Database instance, thereby following SQL best practices at both the source and the target. The remainder of this article steps through a simple example.
Code Template Mapping Example: Cross-Platform Data Integration
Let’s take two Microsoft SQL Server 2005 tables containing customer and address data, join them, and then incrementally load data from them into an Oracle Database 11g table.
To work through this example, you need Oracle Database 11g Release 2 (which comes with Oracle Warehouse Builder 11g Release 2 already installed) or you can install the Oracle Warehouse Builder 11g Release 2 standalone software in an Oracle Database 10g Release 1 (or higher) instance. To use the default Oracle Warehouse Builder schema in an Oracle Database 11g Release 2 installation, unlock the OWBSYS and OWBSYS_AUDIT accounts. Among other things, the OWBSYS account (or another schema you might have created for the same purpose) consists of the repository that houses the workspaces and workspace users. In turn, each workspace contains one or more projects. To get started, run the Repository Assistant to create a workspace and a project to use for this example. See the Oracle Warehouse Builder Installation and Administration Guide for details.
Preliminary setup. In addition to Oracle Warehouse Builder 11g Release 2, you need the source Microsoft SQL Server database and Microsoft’s JDBC driver, which enables Oracle Warehouse Builder 11g Release 2 to connect natively to Microsoft SQL Server 2005. Download Microsoft SQL Server 2005 JDBC Driver 1.0 from the Microsoft Website. Uncompress the archive, and copy the sqljdbc.jar file from the package to the $OWBHOME/owb/lib/ext directory.
The built-in code templates that access non-Oracle database data consist of Java code that executes in a Java 2 Platform, Enterprise Edition (J2EE) container on the Control Center Agent (also new in this release), the J2EE runtime included with Oracle Warehouse Builder 11g Release 2. Code template mappings are deployed to the Control Center Agent. A module that ties connection and process to the agent is defined during the course of this example, so go ahead and start the Control Center Agent before continuing.
To start the Control Center Agent, execute the ccastart script within the $OWBHOME/owb/bin/unix or $OWBHOME/owb/bin/windows directory:
oracle@ora11g cd /u01/app/oracle/product/11.2.0/owb11gR2/owb/bin/unixoracle@ora11g ./ccastart
The first time the Control Center Agent is started, you must enter a password for the Oracle Containers for J2EE (OC4J) Administrator account. You will need the password later in this example, so make a note of it.
With the server components in place, we can turn to the Oracle Warehouse Builder client application, Design Center, to get started creating the modules—the containers that identify a location and define the load, transform, or other actions—we’ll need for this integration.
Step 1: Create the SQL Server source module. To begin, do the following:
Step 2: Create the Oracle Database target module. The Oracle Database target module identifies the location of the Oracle Database instance and the table for the transformed data. For this example, because we are using a MERGE transformation to incrementally load the data, the table requires a primary key. To create this module in your project
Step 3: Create the template mapping module. To take advantage of the new heterogeneous connectivity features in Oracle Warehouse Builder, you must create your data mapping as a code template mapping. Code template mappings are created in the Template Mappings area of a project. Before you can create a code template mapping, however, you must define a template mapping module that connects to a Control Center Agent, as follows:
With the source, target, and container modules defined, you can now create the code template mapping.
From the Projects pane, right-click the DEFAULT_AGENT template mapping module created in the previous steps and select New Mapping .
The logical view is complete, so we can now define the execution view.
Step 4: Create the execution view for the code template mapping. Execution units are new in this release of Oracle Warehouse Builder. They are used to associate related mapping operations with a specific code template. This particular code template mapping requires two execution units: one that executes on the SQL Server source and another that executes on the Oracle Database target.
After defining the execution units, you will use two code templates to perform the data mapping:
To create these two execution units and assign their code templates, return to the Design Center. With your mapping open and with Execution View selected, click the Default Execution Units button in the top left corner. The two execution units needed for this project are created automatically.
Step 5: Deploy the code template mapping. The mapping is now ready to deploy to the Control Center Agent, where it will run. There are a couple of different ways to do this; the simpler one is to open the Control Center Manager from the Tools menu of Design Center and then do the following in the Control Center navigation tree:
Check for any warnings or errors that may be generated during deployment, but you can safely ignore any VLD_2753 and VLD-2752 warnings on the source execution unit—they will be suppressed in a future patch release.
Finally, run your code template mapping and check that data has been transferred as expected into your Oracle Database instance. As with database-resident mappings, you can start and then monitor the progress of your mapping, using the Control Center Manager, or you can start it directly from within the Project Navigator. For this example, we’ll run the template mapping from the Project Navigator and then display the results of the execution in the log.
Step 6: Execute the code template mapping. In the final step, do the following:
The Best of Both Worlds
As you have seen in this example, the new code template mapping capability enables you to leverage the strengths of non-Oracle source databases without losing the benefits of classic Oracle Warehouse Builder integration, load, and control code mappings. Oracle Database-specific transformation operators—such as dimension operators, cube operators, and match-merge operators—can be used with the new code templates to support both heterogeneous and Oracle Database-specific data transformations.
By mixing and matching database-resident and code template mappings in your process flows, you can create data integration processes that leverage the capabilities of both types of mapping. Check out the Oracle Warehouse Builder 11g Release 2 documentation on Oracle Technology Network for more details.
Code templates bring the heterogeneous data connectivity and data integration capabilities of Oracle Data Integrator into the Oracle Warehouse Builder toolset, giving you the ability to load, validate, and integrate data across a wider range of datasources. Best of all, because code templates are written with a well-documented template language and a set of substitution variables, you can extend them or write your own to suit your own particular circumstances.
Mark Rittman is an Oracle ACE director and cofounder of Rittman Mead Consulting, a U.K.-based Oracle partner providing specialized business intelligence, data warehousing, and performance management solutions.