Making the Move from Oracle Warehouse Builder to Oracle Data Integrator 12c
by Stewart Bryson
A detailed guide to a phased migration to Oracle Data Integrator 12c
Oracle customers have deployed solutions using Oracle Warehouse Builder (OWB) since the 2.0.4 version was released in 2000. Because of its close integration to the Oracle Database and its reliance on Oracle PL/SQL as a deployment mechanism, it's been a popular choice for data warehousing and BI shops that lean toward backend solutions as opposed to more heterogeneous tool solutions.
With its 12c release, Oracle Data Integrator (ODI) now appeals to both the database and tool-oriented crowds. Besides delivering an impressive new release with major advancements in data integration functionality, ODI 12c has also added capabilities for OWB customers looking to make the switch to ODI. First, we have a new feature called OWB Runtime Integration that allows ODI agents to orchestrate and execute OWB processes. Additionally, we have a new OWB to ODI Migration Utility that allows us to convert a subset of our OWB development artifacts into comparable ODI artifacts.
In this article, we'll cover these two approaches to making the move from OWB to ODI 12c: the integration route, and the migration route. I'll start by walking through a fairly standard ODI installation, one that includes the configuration of JEE agents, and finish with a description of a phased migration, which makes use of both the integration and migration capabilities in ODI 12c.
Our environment is Oracle Linux 6.4, which we installed using the Red Hat Kernel instead of the Unbreakable Linux Kernel. Because it also has the Oracle Database installed, we included several additional packages specific to that install, but those packages are not required for ODI. The first component needed for both installing and running ODI 12c is a Java Development Kit (JDK). ODI 12c supports only version 7 of the JDK, so I installed JDK 1.7.0_21 (available from http://java.oracle.com). Our Linux environment has several different Java installations, so we add the following to the .bash_profile to ensure we always use the Oracle JDK:
ODI Universal Installer
Making sure our JAVA_HOME is set successfully, we launch the Universal Installer for ODI by executing the jar file executable:
==> java -jar odi_121200.jar
The noteworthy steps in the ODI installation are identified below:
Step 1: Welcome
On the welcome screen, we see the message:
"If you plan to install the JEE Agent, then ensure that you have installed Oracle Fusion Middleware Infrastructure 12c."
Ignore this message; the Enterprise Installation of ODI will install WebLogic Server (WLS) and Fusion Middleware (FMW) infrastructure libraries behind the scenes.
Step 2: Installation Location
/app/oracle/product/odi_1 as the location for our ODI ORACLE_HOME.
Step 3: Installation Type
We'll want to run either JEE agents or standalone collocated agents in our new Enterprise Installation of ODI 12c. It's worth noting that pure standalone agents-which don't require WLS at all-are still supported and can be installed using the Standalone Installation option. JEE agents are the de facto standard for production environments because they provide all of the failover and scalability that WLS provides; I would need a really good reason not to use a JEE agent. So we select the Enterprise Installation option, but we'll investigate configuring a JEE agent after the installation is complete. As mentioned earlier, WLS 12.1.2 and FMW Infrastructure libraries 12.1.2 are also part of the install, which we can see by looking at the Internal Features in Figure 1. The remaining screens in the install are informational only and require no input.
Along with the ODI 12c install, the Universal Installer also creates a zip archive called
odi_1212_opatch.zip, which, when extracted, includes three patches: 16926420, 17053768 and 17170540. We need to unzip this file, navigate into the directory for each of these patches, and apply the patch. The application of patch 17170540 is demonstrated below:
==> cd 17170540/ ==> $ORACLE_HOME/OPatch/opatch apply Oracle Interim Patch Installer version 18.104.22.168.0 Copyright (c) 2013, Oracle Corporation. All rights reserved. Oracle Home : /app/oracle/product/odi_1 Central Inventory : /app/oraInventory from : /app/oracle/product/odi_1/oraInst.loc OPatch version : 22.214.171.124.0 OUI version : 126.96.36.199.0 Log file location : /app/oracle/product/odi_1/cfgtoollogs/opatch/17170540_Jan_02_2014_15_48_01/ apply2014-01-02_15-47-56PM_1.log OPatch detects the Middleware Home as "/app/oracle/product/odi_1" Applying interim patch '17170540' to OH '/app/oracle/product/odi_1' Verifying environment and performing prerequisite checks... All checks passed. Please shutdown Oracle instances running out of this ORACLE_HOME on the local system. (Oracle Home = '/app/oracle/product/odi_1') Is the local system ready for patching? [y|n] y User Responded with: Y Backing up files... Patching component oracle.as.common.clone, 188.8.131.52.0... Verifying the update... Patch 17170540 successfully applied Log file location: /app/oracle/product/odi_1/cfgtoollogs/opatch/17170540_Jan_02_2014_15_48_01/ apply2014-01-02_15-47-56PM_1.log OPatch succeeded. ==>
Repository Creation Utility
Unlike Oracle Business Intelligence Enterprise Edition (OBIEE) and some other FMW products, which require the Repository Creation Utility (RCU) to be run prior to the software installation, ODI actually installs the RCU as part of the base product install. The RCU executable is located in
$ORACLE_HOME/oracle_common/bin, and is launched as indicated below, with the noteworthy steps described after
Step 1: Create Repository
We definitely want to create a new repository, so we start by checking the primary option Create Repository. We want to create the database objects during this step (as opposed to generating scripts to be run later), so we choose the secondary option System Load and Product Load.
Step 2: Select Components
As is standard with FMW RCUs, we select a schema prefix to delineate our particular FMW schema from others that may exist in the same database. I choose the default DEV, and then select the sub-component Master and Work Repository under the Oracle Data Integrator component. This will automatically pre-select all required options to support ODI, as indicated in Figure 2.
Step 3: Custom Variables
We provide a password for the SUPERVISOR account, which is the built-in administrator account in ODI. We also need to specify a Work Repository Type, which we specify as (D) Development, the default, and a Work Repository Name, which we set to WORKREP, also the default.
Step 4: Map Tablespaces
In this step, we define the default tablespace and temp tablespace for each of the schemas the RCU is creating. As it's unnecessary to have a separate tablespace for each schema, I create a single tablespace REPO (repository) to be used for all FMW schemas, along with the default temp tablespace TEMP, as demonstrated in Figure 3.
JEE Agent Configuration
Once we have ODI installed and the database repository created, it's time to configure our JEE agent, which entails first configuring a FMW domain. The configuration script is located in
$ORACLE_HOME/oracle_common/common/bin, and is executed as indicated below, with the noteworthy steps identified after.
Page 1: Create Domain
We call our new Domain
odi_domain and create it in the default location of
$ORACLE_HOME/user_projects/domains. The full path value that we provide to the universal installer is:
Page 2: Templates
To make it easy to deploy standard functionality for different FMW products, the configuration utility contains templates for configuring and deploying different applications in the domain. The first template we select is Oracle Enterprise Manager Plugin for ODI - 184.108.40.206 [em], which also selects the following additional templates:
- Oracle Enterprise Manager - 220.127.116.11 [em]
- Oracle JRF - 18.104.22.168 [oracle_common]
- WebLogic Coherence Cluster Extension - 22.214.171.124 [wlserver]
Next, we select the template called Oracle Data Integrator - Agent - 126.96.36.199 [odi], which automatically selects the following templates:
- Oracle Data Integrator - Agent Libraries - 188.8.131.52 [odi]
- Oracle Data Integrator SDK Shared Library Template - 184.108.40.206 [odi]
Even though we are configuring a JEE Agent in this article, we also include the template Oracle Data Integrator - Standalone Collocated Agent - 220.127.116.11 [odi], so we could also choose to deploy a standalone collocated agent. We also include Oracle Data Integrator - Console - 18.104.22.168 [odi].
Page 3: Application Location
The Configuration Utility will recommend a default Application location-the location of our .ear file-based on the options we selected for our Domain location on Page 1. We simply accept the defaults.
Page 5: Domain Mode and JDK
We want the
boot.properties file, so we'll create a Development domain. However, a
boot.properties file is easy enough to create, so a Production domain would be a sensible choice as well.
Notice that the Configuration Utility is aware of our JDK location based on the value of the JAVA_HOME environment variable. That is indeed the JDK we want to use.
Page 6: Database Configuration Type
We choose the RCU Data option to use the Service Table schema—in our case, DEV_STB—to automatically configure the datasources necessary for our particular templates. We would need a really good reason not to use this.
Page 9: Credentials
We use this screen to configure two entries in the FMW Credential Store. The first key is already partially populated: the SUPERVISOR key is looking for our SUPERVISOR username and password. The second key we have to add by clicking the Add button, and then specifying the domain name, along with the administration username and password for the domain, as illustrated in Figure 4.
Page 10: Advanced Configuration
On this screen, we select these options: Administration Server, Node Manager and Managed Servers, Clusters and Coherence. The other options are not required.
Page 11: Administration Server
We need an Administration Server for our JEE agent deployment. We keep all the defaults, except that we need to provide a specific listen address. We can use the hostname or IP; we choose the hostname
Page 13: Managed Servers
Our JEE Agent also needs a Managed Server to provide the JEE container for our agent. We keep all the defaults for this, specifying the following values as demonstrated in Figure 5.
|Listen Port:||15101 (default Managed Server port)|
|SSL Listen Port:||Disabled|
Page 14: Clusters
We choose a unique name for our FMW Cluster, in our case we use
ODI_cluster1. We leave the Cluster Address attribute blank.
Page 15: Assign Servers to Clusters
Page 16: Coherence Clusters
Leave the default values of
defaultCoherenceCluster and 0.
Page 17: Machines
We need to configure our FMW Machine. Since our environment is Linux, we'll choose the Unix Machine tab with the following options:
|Name:||oracle (any name unique in the cluster)|
|Enable Post Bind GID:||Unchecked|
|Post Bind GID:||nobody|
|Enable Post Bind UID:||Unchecked|
|Post Bind UID:||nobody|
|Node Manager Listen Address:||
|Node Manager Listen Port:||5556 (default Node Manager port)|
Page 18: Assign Servers to Machines
We move both
ODI_server1 under the Oracle Unix Machine.
Clicking through the rest of the informational screens should give us a completed domain configuration and an application deployed to the managed server, which we can use for our JEE agent. To make things easier in the future, we add the
DOMAIN_HOME environment variable to our
.bash_profile file, as demonstrated below:
Agent Topology Creation
We're now ready to configure our JEE agent in ODI Studio to use our new deployed FMW Application. We open ODI Studio using the command line call below:
Using the ODI Topology Navigator, under Physical Architecture, we right-click Agents and then select New Agent. We specify the following parameter values as demonstrated in Figure 6.
|Port:||15101 (default Managed Server port)|
As part of the domain configuration above, the Node Manager is likely already running. In case it isn't, and for future reference, we can start the Node Manager using the following command:
==> nohup $DOMAIN_HOME/bin/startNodeManager.sh > nm.out&  31642
Once we ensure that the Node Manager is running, we are then able to start the Administration Server, using the following command:
==> nohup $DOMAIN_HOME/bin/startWebLogic.sh > admin.out&  32359
With the Administration Server in RUNNING mode, we are now able to start the Managed Server. This can be done from the command-line similar to how we started the Node Manager and Administration Server, or using Fusion Middleware Control (FMC), using the URL pattern below:
For our environment, this equates to:
For demonstration purposes, we use FMC. Under the WebLogic Domain tree, we chose odi_domain, followed by ODI_cluster1, and finally ODI_server1. On the resulting page, we can click the Start Up button:
Once we have the Managed Server running successfully, our JEE agent should now be available to us. Back in the Topology Navigator in ODI Studio, we right-click the OracleDIAgent in the Physical Architecture pane and choose the option Test, which should give use feedback that the agent is available. The final task is just to create an agent in the Logical Architecture pane that maps to our physical agent through a context; in our case, using the Global Context.
OWB Runtime Integration
With ODI 12c installed and operable and a JEE Agent configured correctly, we can now investigate the first integration point with OWB: Runtime Integration. This works by configuring an existing OWB workspace as an element in ODI's Topology, and having ODI execute OWB processes using ODI packages. Before we get into the technical details, let's have a look at our existing OWB project from a high-level, and see how we can incorporate it into ODI.
Sales Subject Area
We are loading a reasonably standard data warehouse around sales data, with a fact table, called SALES_FACT and a series of dimension tables: STAFF_DIM, STORE_DIM, PRODUCT_DIM, and CUSTOMER_DIM. We have mappings loading each of these tables, and the orchestration of the load has been constructed using process flows. There is one process flow module called COMMON_WF that has a single process flow package called SBATCH, which is our naming standard for "Standard Batch." Inside this process flow package are three individual process flows: LOAD_DIMS, LOAD_FACTS and MAIN_LOAD, as demonstrated in Figures 8-10. MAIN_LOAD is simply a high-level process flow executing the dimension and fact loads in order.
The separation of elements into separate process flows in OWB is a common design approach: it provides flexibility in orchestrating our overall batch load, as well as allowing us to execute granular pieces of the complete batch load for unit testing purposes.
We now need to configure the integration with our existing OWB workspace. Like any other element in the ODI topology, we begin by adding a data server in the Physical Architecture pane of the Topology Navigator. There is a new technology in ODI 12c called OWB Runtime Repository. We right-click on that technology and choose New Data Server, which opens the Data Server Definition tab, as demonstrated in Figure 11. We provide a name for the data server, and under Connection we provide the username and password of the OWB workspace owner. We name the data server ORCL after the Oracle database instance where it resides (any unique name would suffice), and we provide the OWBREP workspace owner and the associated password.
On the JDBC tab, we use the oracle.jdbc.OracleDriver driver and provide the connection details for the Oracle database holding the OWB workspace. We click the Save button, and then right-click our new Data Server in the Physical Architecture pane and choose New Physical Schema, which will open the Definition tab. We choose the correct value for Workspace (Schema) from the dropdown list-for us this is OWBREP.OWBREP-and then accept the remainder of the defaults as demonstrated in Figure 12.
Once our data server and physical schema have been created, we need to create a logical schema and assign the physical schema to it via a context. We choose the Logical Architecture pane in the Topology Navigator, and we again right-click the OWB Runtime Repository Technology, and choose New Logical Schema, which opens the Definition tab. We're only working with the Global Context, so we map the new logical schema-which for us is called OWBREP-to our previously created physical schema, which completes the topology configuration.
Package Tool: OdiStartOwbJob
With the Topology in place, we can now use ODI Packages to execute any of our OWB processes, which includes either OWB mappings or process flows: anything that can be executed by the OWB Control Center. Since we already have our entire load process orchestrated with OWB process flows, the easiest thing for us to configure is a package to simply execute our MAIN_LOAD Process Flow. In ODI Designer, right-click on Packages and select New Package. In the Package Editor, we have a new Tool option called OdiStartOwbJob, which we add as a Package Step to the Package Editor palette. We then configure the Package Step General tab with the following information, as demonstrated in Figure 13.
|Location Name:||OWF_MGR (The Oracle Workflow owner for our Process Flow location)|
Click the Command tab in the Package Step to see the ODI Tool command generated by the values we provided:
OdiStartOwbJob "-WORKSPACE=OWBREP" "-LOCATION=OWF_MGR" "-OBJECT_NAME=SBATCH/MAIN_LOAD" "-OBJECT_TYPE=PROCESSFLOW"
Once the package has been completed, it's easy to create an ODI Scenario and execute it using our JEE agent. We can see in Figure 14 that the ODI and OWB auditing is completely integrated now, with ODI being aware of all the processes and child processes executed by the OWB Control Center, as well as the execution results, including execution times and the number of records affected.
The Side-by-Side Approach
It's easy to grasp the value of this functionality: our legacy ETL processes in OWB can be managed and integrated with newer ODI processes, with the two working together in a coherent data integration strategy. We'll call this the side-by-side approach: continuing to run our legacy OWB processes as an integrated element of our overall Oracle data integration solution. The side-by-side approach is really viable only when our OWB processes are true legacy: we can leave them in OWB (because, let's face it, they just work) and live with that fact. The business has no enhancements planned where we would have to "touch" any of those processes, because we have new business intelligence initiatives from our stakeholders that we can develop completely in ODI. In this way, our OWB code is like almost any other legacy code-whether it be PL/SQL, Perl, etc.-except that Oracle has produced a complete integration platform. But what if our OWB code isn't pure legacy? What if the development around it is vibrant and ongoing? That's where the OWB to ODI Migration Utility comes into play.
The OWB to ODI Migration Utility
To support the new Migration Utility, we have a few more patches that need to be installed, again using the opatch method above. The first is patch 17053768 to apply to our ODI ORACLE_HOME; the second is patch 17830453, which needs to be installed for OWB. The OWB patch is either applied to the ORACLE_HOME associated with the Oracle database, or to a stand-alone OWB ORACLE_HOME if applicable, and works only on top of OWB version 22.214.171.124. The application of the OWB patch is demonstrated below:
==> cd 17547241/ ==> opatch apply Oracle Interim Patch Installer version 126.96.36.199.4 Copyright (c) 2012, Oracle Corporation. All rights reserved. Oracle Home : /app/oracle/product/11.2.0/dbhome_1 Central Inventory : /app/oraInventory from : /app/oracle/product/11.2.0/dbhome_1/oraInst.loc OPatch version : 188.8.131.52.4 OUI version : 184.108.40.206.0 Log file location : /app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/ 17547241_Dec_25_2013_13_30_17/apply2013-12-25_13-30-17PM_1.log Applying interim patch '17547241' to OH '/app/oracle/product/11.2.0/dbhome_1' Verifying environment and performing prerequisite checks... All checks passed. Backing up files... Apply mode Patching component oracle.owb.rsf, 220.127.116.11.0... Verifying the update... Apply mode Patch 17547241 successfully applied Log file location: /app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/ 17547241_Dec_25_2013_13_30_17/apply2013-12-25_13-30-17PM_1.log OPatch succeeded. ==>
As our environment is Linux, patch 17547241 has installed a new command-line utility in the OWB ORACLE_HOME called migration.sh. The exact location is $OWB_HOME/bin/unix, which for our environment is listed below:
Performing the Migration
This command-line utility uses command-line options combined with a configuration file to specify the objects we want to migrate, as well as specific configuration options associated with those objects. It doesn't migrate everything. The full list is specified in the documentation listed here: http://docs.oracle.com/middleware/1212/odi/ODIMG/understanding.htm#CHDCIDDG. Some of the noteworthy items that are not migrated include:
- Dimensional modeling metadata, and any mappings that use that metadata, including dimensions and cubes
- Process flows
- Data quality, data profiles and data auditors
The third item in this list, process flows, should be noted specifically. If we choose to migrate an entire project to ODI, we will only have the mappings and the metadata associated with them when the migration is complete. So it's worth noting that we will need to use ODI packages, load plans or both to develop a new orchestration strategy post-migration. In many cases, this is no minor piece of work. We'll investigate some of the strategies relevant to this later in this article.
The Migration Utility has three run modes:
- 1. FAST_CHECK: Performs a read-only check of the OWB repository and reports back the items than can and cannot be migrated.
- 2. DRY_RUN: Performs a migration to ODI using the ODI 12c SDK, but does not perform a commit at the end of the process.
- 3. RUN (Default): Executes the migration and commits migrated objects to the target ODI 12c repository.
The run mode is specified in the migration configuration parameter. The configuration parameter can be called anything and placed anywhere, since the file location is specified when executing the utility. The installation of the Migration Utility patch also creates a sample configuration file called
migration.config located in the
$OWB_HOME/bin/admin directory. Some of the specific driver properties that we have in our configuration file are listed below:
ODI_MASTER_USER=DEV_ODI_REPO ODI_MASTER_URL=jdbc:oracle:thin:@oracle.localdomain:1521:orcl ODI_MASTER_DRIVER=oracle.jdbc.OracleDriver ODI_USERNAME=SUPERVISOR ODI_WORK_REPOSITORY_NAME=WORKREP OWB_WORKSPACE_OWNER=OWBREP OWB_URL=oracle.localdomain:1521:orcl OWB_WORKSPACE_NAME=OWBREP MIGRATION_LOG_FILE=/app/oracle/product/11.2.0/dbhome_1/owb/bin/unix/migration.log MIGRATION_MODE=RUN MIGRATE_DEPENDENCIES=true MIGRATION_OBJECTS=PROJECT.GCBC_SALES.MODULE.ETL;
Notice the last two migration properties: MIGRATE_DEPENDENCIES and MIGRATION_OBJECTS. We can use these two parameters to specify the content we want to migrate. MIGRATION_OBJECTS uses dot-notation to specify which OWB project we want to migrate, and more granularly, which individual object or bulk object we want to migrate. We can see which parameters are supported in the documentation: http://docs.oracle.com/middleware/1212/odi/ODIMG/migrating.htm#CHDIDBDH.
We're migrating the specific OWB module called ETL. In our workspace, the OWB mappings for the entire project exist in this single module, so all of them will be migrated. However, our specification for MIGRATE_DEPENDENCIES will instruct the utility that we also want to migrate dependent objects, such as table metadata, sequence metadata, locations, etc.
The Migration Utility accepts a few command-line options as well-primarily for passwords, so they aren't explicitly written in the configuration file-along with the path to the configuration file. The command-line options and order are specified below:
- 1. ODI master repository password: the password for the ODI_MASTER_USER specified in the configuration file
- 2. ODI user password: the password for the ODI_USERNAME specified in the configuration file
- 3. OWB workspace password: the password for the OWB_WORKSPACE_OWNER specified in the configuration file
- 4. Migration configuration file: the path for the configuration file created above.
We execute the Migration Utility by entering the following at the command-line:
==> ./migration.sh welcome1 welcome1 welcome1 gcbc_sales.config
When we specified the location of the migration log file using the parameter MIGRATION_LOG_FILE, this will also specify the name and location of our migration report file, which based on our configuration file will be called
migration.report. The top of the report shows us the aggregated results from the Migration Utility, with the specific objects migrated listed further down in the report (omitted here for brevity):
Statistics ------------ Total Projects Migrated: 2 ******************************************************************************** PROJECT: PUBLIC_PROJECT Object Types Migrated Not-Migrated --------------- --------- ------------- LOCATION: 1 0 ******************************************************************************** PROJECT: GCBC_SALES Object Types Migrated Not-Migrated --------------- --------- ------------- SEQUENCE: 4 0 TABLE: 17 0 MAPPING_MODULE: 4 0 MODULE: 4 0 MAPPING: 5 0
We have a relatively small project here, but all the mappings in the ETL module were migrated successfully. The workspace metadata for tables and sequences was also migrated, becoming content in models in the ODI Designer.
We can now see the results of the migration process by viewing the ODI Topology and seeing that our OWB locations and modules were migrated as demonstrated in Figure 15. ODI data servers, physical schemas and logical schemas were created for all the modules represented in our OWB mappings: GCBC_CRM, GCBC_EDW, GCBC_POS and GCBC_STAGING.
One point to notice: the Migration Utility creates separate ODI data servers for each OWB location, even though many of the OWB locations exist in the same physical Oracle database. Although this may seem troubling at first, the situation is easily corrected with the power of the ODI Topology. We can create new data servers with multiple physical schemas if desired, remapping those to logical schemas, which allows ODI to generate more efficient code by understanding that the schemas exist in the same database. Use cases like this one is the main reason the ODI Topology abstracts our physical and logical schemas, and also why we shouldn't be too concerned that the Migration Utility handles it in this way.
Leaving our physical architecture the way the Migration Utility created, we still need to make a few small changes. The Migration Utility doesn't bring over the passwords from OWB locations, so we need to provide them in the Connection section of our data server configurations, as demonstrated in Figure 16. We also have to set the work schema for any of our physical schemas that are involved with mappings, as demonstrated in Figure 17.
Investigating a Single Mapping: MAP_SALES_FACT
Let's have a look at an individual OWB mapping and see in detail what the Migration Utility gave us on the other end. In Figure 18 we see one of our source OWB mappings, called MAP_SALES_FACT.
This mapping pulls data from two tables in the GCBC_POS schema, which are joined together using a joiner component. The resulting data then maps to another Joiner along with the dimension tables from the GCBC_EDW schema-the reporting target schema-to do the lookup for the surrogate keys before finally loading the data into the SALES_FACT table. All the dimension tables are joined to the source data set using a single joiner called SURROGATE_PIPELINE. We have two separate expression components in the mapping as well: one called simply EXPRESSION, the other called GET_DATE_KEY. We also have pre-mapping process and post-mapping process components, which are very common in OWB mappings. Without the functionality of the Knowledge Module (KM) architecture that we have in ODI, OWB developers often resigned themselves to adding custom, repeatable processes to their OWB mappings with these intra-mapping components. The migrated ODI Mapping MAP_SALES_FACT is shown in Figure 19, which we will discuss in the next section.
Components and Component-Style KMs
One of the reasons that a Migration Utility exists now is because it was easier to build in ODI 12c. Both tools use a flow-based design based on components on a palette. We have many of the same or similar components in ODI that we had in OWB. The list is smaller in ODI, and is demonstrated in Figure 20. We can develop with a smaller number of components because other elements in the ODI architecture—including the topology and the KM framework—reduce the number of individual components required.
In looking at the migrated mapping in Figure 19, notice the single joiner from OWB called SURROGATE_PIPELINE that joined all the dimension tables to the source data is converted to four distinct Join components in ODI. The ODI join component does support having more than two incoming connections, but the Migration Utility does not generate the designer metadata in that way. This shouldn't concern us much: the new component-style KMs in ODI 12c generate code comparable to OWB using the four distinct join components, and that same code is nearly identical to what ODI generates when a single join component is used.
Expressions are a new feature in ODI 12c, and they retain similar functionality to expressions in OWB. The strictly declarative design paradigm using interfaces in ODI 11g had no place for components, much less an expression component. But ODI 12c mixes the declarative and flow-based designs. Each attribute in the target component has a built-in location for configuring declarative transformation logic, but the use of expression components is available to make transformation logic explicit and reusable, and also assists the Migration Utility in converting OWB mappings.
In ODI, KMs are pluggable templates that use the ODI Substitution API to control the generated code for both source and targets. Component-style KMs are a new, complimentary code-generation tool encapsulating modular, reusable pieces of logic specific to particular components. This allows the use of template-based KMs only where we need them: not for the entire mapping, as was the case in ODI 11g. To see or modify the assignment of the loading KM (LKM) for a specific target in a mapping, we switch over to the physical view and click the access point for that target. This is the "touch-point" for the arrow that extends from the source to the target. For MAP_SALES_FACT, our access point is the GET_EFFECTIVE_DATE expression, which is demonstrated in Figure 21.
The Migration Utility chooses a single component-style LKM for all our converted mappings: LKM Oracle to Oracle Pull (DB Link). This new KM is a sensible choice because OWB always uses a database link to pull data from a remote server. A slight modification we made to our migrated mappings was to hard-code a database link into the SOURCE_ACCESS_DB_LINK option in the KM to eliminate the process of repeatedly dropping and creating the database link with each run, also demonstrated in Figure 21.
If we aren't happy with the database link approach (though I'm not sure why we wouldn't be), we could use a different KM that resembles loading techniques common in other ETL tools using separate connections to the source and target databases, combined with array-based processing. Figure 22 shows the selection of LKM SQL to SQL (Built-In). Keep in mind that the database link will typically outperform array-based processing, and is a major value-add when deploying ODI instead of other ETL tools.
At first sight, it may seem like our pre- and post-mapping process components were lost in the migration. They certainly aren't visible on the logical mapping palette, and they aren't an option in the list of components shown in Figure 20. Further investigation shows that the logical mapping view is the wrong place to look. Instead, we should again be looking at the physical mapping view and some of the options available in the new component-style KM. To see the post-mapping process, we click our target table SALES_FACT, navigate to the Extract Options section in the Properties pane, and have a look at the BEGIN_MAPPING_SQL option:
The original PL/SQL call from our OWB post-mapping process is listed below:
BEGIN "TRANS_ETL"."END_MAPPING"(REPLACE(GET_MODEL_NAME, '"', NULL) , ); END
This PL/SQL procedure executed a few custom post-load options for the table, and it used the individual mapping name to pull options from a configuration table to decide what processes to run. With the customization power of ODI, it's unlikely we would elect to code post-load options in this way, but these sorts of changes aren't made overnight, so it's important that we are able to initially execute the PL/SQL procedure in a similar way. The GET_MODEL_NAME constant (demonstrated above) in an OWB mapping always returns the mapping name wrapped in double-quotes, so we could use that to eliminate hard-coded values. We need an equivalent in ODI, which we have using the Substitution API. The modified call is listed below:
BEGIN TRANS_ETL.END_MAPPING('<%=odiRef.getPop("POP_NAME")%>'); END
Finding the pre-mapping process was a little more difficult, and seemingly arbitrary. Although we have several dimension tables that are joined in on the target side, the custom PL/SQL call was placed in the Extract Options for the CUSTOMER_DIM table, which we can only assume was chosen alphabetically. Regardless, a similar modification to the PL/SQL call gives us the desired result. With our small tweaks in place, the execution of the mapping from the ODI Operator is show in Figure 24.
The Big-Bang Approach: One Fell Swoop
The Migration Utility is an impressive piece of functionality. Although we had to polish the mappings a bit post-migration, all the complex source-to-target logic survived intact, and when it's all said and done, that's what we care about the most. The Migration Utility gives us a tremendous jumpstart, but it isn't a complete migration solution. Honestly, it's daunting to consider how many lines of code would be needed to furnish a complete, start-to-finish Migration Utility.
When considering the list of OWB content that isn't supported by the Migration Utility, the most noteworthy exclusion is process flows. Even though Oracle made the correct choice in focusing first on mappings, we are left with the requirement to re-orchestrate our load process or processes using some combination of load plans and packages. Depending on the scope and complexity of the data integration processes currently in OWB, this could run the gamut from uncomfortable to excruciating, especially when considering formal QA and regression testing processes driven by data validation.
Considering what a complete migration from OWB would entail, below is a list of required steps and optional steps that would factor in to such a migration:
Required Migration Steps
- 1. Configure and execute the Migration Utility.
- 2. Work through the topology and make data server and physical schema changes.
- 3. Tweak the component-style KM parameters in each mapping for appropriate database links, pre- and post-mapping processes, etc.
- 4. Design an orchestration process using packages and load plans to replace process flows.
- 5. Unit test select mappings representative of certain design patterns and ensure successful completion.
- 6. Regression test complete load processes, including data reconciliation to ensure that ODI is producing the exact same results as OWB.
Optional Migration Steps
- 1. Replace custom PL/SQL processes with more robust ODI-based functionality, including custom KMs, ODI packages and procedures, and load plan functionality.
- 2. Replace any custom TCL processes that function as part of the development process (auto-generating certain kinds of mappings or process flows) with Groovy scripting.
For certain OWB implementations, it might make sense to take the plunge and invest in the Big Bang, performing all of the required steps and possibly some of the optional steps. Organizations that sell and support a complete data warehouse solution would likely consider producing a new version of that solution running solely on ODI. An example of such a migration is Oracle's Business Intelligence Applications (OBIA), although that migration was not from OWB but from Informatica. Keeping portions of the load routines in Informatica and other portions in ODI wouldn't sit well with customers trying to maximize the return-on-investment (ROI) of purchasing enterprise software.
Similarly, organizations that utilize a Business Intelligence Competency Center (BICC) recognize tremendous value from standardization and a decreased footprint. Introducing some kind of hybrid solution, where legacy portions of the solution still run on OWB while non-legacy portions are implemented with ODI could possibly reduce the overall value of the solution by introducing increased support and maintenance costs.
The Phased Approach: Adding Value
So the Big Bang approach is attractive to producers of enterprise software and organizations with a BICC. What about the rest of us? Organizations are already delivering more with less, and struggling to meet the daily needs of their stakeholders with current roadmaps. Giving the green light to a new tool migration project seems like a million miles away. On the other side of the spectrum, we have the side-by-side approach that relegates our OWB processes to "legacy" status and puts them out to pasture until they die.
Maybe we should consider a third option: an approach that utilizes runtime integration and the Migration Utility, blending the long-term goal of migrating to ODI 12c with the short-term focus on providing value to the business. We'll call this the "phased approach," and its mission statement is simple:
"Any task undertaken to migrate content from OWB to ODI will add immediate value to our BI stakeholders."
Perhaps this mission statement seems antithetical. Is it possible to add value to a process of migrating functionality from one platform to another? Obviously, when the OWB process we are migrating needs an enhancement, we see the immediate value-add. Can we extend the reach of the phased approach even further, finding opportunities for migration even when those features aren't due for enhancements?
Let's investigate what the phased approach would look like. Here are the actions we take to deliver on this methodology:
- 1. We start by developing ODI scenarios to execute our OWB process flows inside of load plans in a reasonably granular fashion.
- 2. We put our OWB workspace in "maintenance mode," allowing development only when emergency "hot fixes" require it. All new, BI roadmap development would occur using ODI Designer content, including mappings, procedures, packages, load plans and custom KM development. Any auto-generation tasks that would have previously been written using TCL would instead use Groovy. At this point, we haven't deviated too far from the side-by-side approach.
- 3. Current OWB content slated for enhancement would be migrated to ODI using the Migration Utility, either one at a time or in batch, depending on how many OWB processes figure into our new enhancement. We would tweak the migrated content to optimal execution-similar to the tweaks made above when working with the Migration Utility-followed by developing the enhancements requested by the stakeholders.
- 4. We make any additional enhancements to our process-including re-architecting portions of our OWB processes using ODI features-whenever we feel we can bring immediate value to the business.
Granular Execution of Process Flows
In Figure 13 we demonstrated an ODI package and scenario to execute MAIN_LOAD, which is our high-level, entry-point process flow used to initiate a complete load.
Action #1 in our methodology prescribes building scenarios for our process flows, but building them as granularly as possible. For instance, we would execute the LOAD_DIMS and LOAD_FACTS Process Flows from Figures 9 and 10 as different steps in the same load plan as opposed to executing MAIN_LOAD as a single step, as demonstrated in Figure 25. The reason for this will hopefully be clear a bit later on.
New ODI Development
Action #2 in our methodology recommends putting our OWB workspace in "maintenance mode," eliminating any new development except in emergency circumstances. In this example, we'll create an ODI mapping to load a new fact table in our BI system called SURVEYS_FACT, as demonstrated in Figure 26, using a new feature called reusable mappings as a way to encapsulate core pieces of business logic.
Once this is complete, we need to orchestrate MAP_SURVEYS_FACT into our batch load process (Action #3), which we do by generating a scenario for the mapping adding a new step in our load plan, as demonstrated in Figure 27. When generating scenarios, I've used the prefixes ODI_ and OWB_ to distinguish where the mappings exist.
But considering Action #4, we also take this opportunity to add value above and beyond what has been requested in the new feature request. We go ahead and generate packages and corresponding scenarios for all the OWB mappings in this subject area, and add them as individual steps in the load plan instead of having a single step execute an OWB process flow.
Why do we want to do this? Because ODI load plans provide an important feature that OWB process flows don't have: restartability. The "restart from failed children" setting allows us to gracefully recover from failed executions. If one of the dimension loads fails, we would be able to correct the issues with that single mapping and then restart the load plan knowing that any of the mappings that succeeded would not run again during the restart. This adds immediate value to stakeholders by minimizing the time and effort associated with recovering from failed runs, and holds true to our mission statement.
Piecemeal Migrations: "Just in Time" Development
Now we find ourselves needing to modify the logic in the OWB mapping MAP_SALES_FACT. The business has requested an enhancement to the way discounts are calculated, so we need new logic for the DISCOUNT measure in the SALES_FACT table. It's time to follow Action 3) above in our methodology, using the Migration Utility to get an ODI version of this mapping so we can change the specific calculations. We tweak the value of MIGRATION_OBJECTS in our Migration Utility configuration file, as shown below, to execute the migration for a single mapping:
We start with the migrated ODI mapping, demonstrated above in Figure 19. We encapsulate the new discount calculation logic in a reusable mapping, as demonstrated in Figure 28, so that any subsequent processes that need to pull this source data will be able to use the join logic as corresponding discount calculations in a single, unified process.
Alone, 12c provides a tremendous amount of new functionality, beginning with the new declarative flow-based design paradigm that will feel familiar to developers of other ETL tools, especially OWB. This paradigm provides a much easier transition for OWB developers than was perhaps possible with previous versions of ODI. Combined with Runtime Integration and the Migration Utility, OWB customers are now secure in the years of investment placed in OWB. There's never been a better time to migrate to ODI, and this article has outlined three different approaches organizations can take.
Using a geographic metaphor, if the Big Bang approach is represented as one coast and the side-by-side approach as the other coast, then the phased approach is like the "flyover states" and will find an audience within diverse types of organizations. These approaches all have distinct value and provide ROI to customers in different ways. Whether our OWB investment is strictly legacy in nature, or alive and flourishing, we can now move forward to a future in ODI that reduces migration risk and adds value specific to an organization's needs.
About the Author
Stewart Bryson, Chief Innovation Officer at Rittman Mead, is an Oracle ACE specializing in business intelligence. He is a prolific writer and a frequent speaker at Oracle community events including Oracle Open World, ODTUG Kaleidoscope, IOUG Collaborate, RMOUG Training Days, and UKOUG Conference and Exhibition.