TECHNOLOGY: Business Analytics
Integrate and Analyze
By Mark Rittman
Combine structured and unstructured data for analysis and new insights.
Oracle Endeca Information Discovery applications enable organizations to create rich, interactive data discovery applications that consume data from all types of datasources, from the more traditional “structured” data sets found in Oracle Database and Oracle E-Business Suite to the unstructured data of documents and social media feeds.
Oracle Endeca Information Discovery Release 3.0 extends this capability by integrating with Oracle Business Intelligence Enterprise Edition 11g, providing the ability to build data discovery applications around the facts, dimensions, hierarchies, and integrated data sets of the enterprise semantic model.
In this article, I take a look at this new integration by creating an Oracle Endeca Information Discovery Studio application that uses the sample application (SampleApp) for Oracle Business Intelligence Enterprise Edition as its datasource; uses Oracle Endeca Information Discovery Integrator to load data from the Flight Delays subject area provided by the SampleApp; and uses the SampleApp subject area table metadata to create an Oracle Endeca Server data domain. I then use Oracle Endeca Information Discovery Studio to create an initial data discovery web application, which you can then extend later with additional, unstructured datasources and data visualization components.
Prerequisites for the Sample
If you want to create this article’s sample application yourself, you will need to download the following products from the Oracle Software Delivery Cloud website (edelivery.oracle.com), using either your full license or a trial license. Product versions are available for Microsoft Windows x64 and Linux x86-64.
The SampleApp for Oracle Business Intelligence Enterprise Edition 11.1.1.6.2 BP1 (V207), which I will use as the datasource, can be downloaded, preinstalled and preconfigured, as an Oracle VM VirtualBox image from the Oracle Technology Network website.
This SampleApp (V207) Oracle VM VirtualBox image comes with several demonstration subject areas, including “X – Airline Delay,” which I will use for this example. Using Oracle Endeca Information Discovery Integrator 3.0, I will connect to the business intelligence server component and then create an Oracle Endeca Server data domain, using a selection of the subject area’s tables as a datasource. Then, once the data domain is loaded and ready for use, I will use Oracle Endeca Information Discovery Studio 3.0 to create a web application for exploring the data set.
In addition to the SampleApp Oracle VM VirtualBox image, you will need a Microsoft Windows–based environment in order to use the administration tool (which can also be the environment you use to run Oracle Endeca Information Discovery 3.0). For details on how to download and configure the administration tool to work in a separate Linux-based business intelligence server environment, see “4.5 Admintool access to SampleApp RPD” in the “SampleApp V207 - Virtual Machine Image Deployment Guide” document.
To create an Oracle Endeca Server data domain that gets its data from an Oracle Business Intelligence repository, you use a wizard within Oracle Endeca Information Discovery Integrator. The wizard connects to the repository, lets you select a subject area and a set of tables to use as the data domain datasource, and then automatically creates data domain attributes based on the names of the selected tables and columns. However, business model table and column names within the SampleApp repository are prefixed with numbers to aid in referencing them within sample dashboards, and Oracle Endeca Server data domain attribute names cannot start with a number. You will therefore need to create a version of these tables without the number prefixes.
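The naming restriction can be illustrated with a short Python sketch. This is not part of the Oracle tooling, and the renaming itself is done by hand in the administration tool. The example names, the prefix pattern, and the helper functions `strip_numeric_prefix` and `starts_with_digit` are assumptions made for illustration, because the actual SampleApp names are not reproduced here.

```python
import re

# Assumed prefix shape: a run of digits followed by optional separators
# (spaces, hyphens, dots, or underscores). The real SampleApp prefixes may differ.
_NUMERIC_PREFIX = re.compile(r"^\d+[\s\-._]*")


def strip_numeric_prefix(name: str) -> str:
    """Remove one leading numeric prefix (digits plus trailing separators)."""
    cleaned = _NUMERIC_PREFIX.sub("", name).strip()
    if not cleaned:
        raise ValueError(f"{name!r} has nothing left after removing its prefix")
    return cleaned


def starts_with_digit(name: str) -> bool:
    """Return True if the name would violate the 'cannot start with a number' rule."""
    return name[:1].isdigit()


# Illustrative names only, not taken from the SampleApp repository.
example_names = ["01 Carrier", "2-Origin Airport", "Flight Count"]
renamed = {name: strip_numeric_prefix(name) for name in example_names}

for original, new in renamed.items():
    print(f"{original!r} -> {new!r}")
```

A check such as `starts_with_digit` can be run over a list of intended names before you rename anything, to confirm that none of them would be rejected as an attribute name.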
To create a version of the “X – Airline Delay” subject area and underlying business model that complies with this naming restriction but does not affect any other SampleApp reports that use the original object names, follow these steps:
Figure 1: Renaming the duplicate business model’s tables to remove numeric table name prefixes
To save this updated repository and make it available for the next stage in this process, select File -> Check In Changes, and for the Do you wish to check global consistency? question, choose No to avoid checking all the other objects in the SampleApp repository that are not relevant to this example. Finally, select File -> Save and then File -> Close to save the updated repository back to the server and close the administration tool’s connection to it.
Creating the Data Domain
Now that you have prepared the Oracle Business Intelligence repository for use with Oracle Endeca Information Discovery Integrator, you can connect to the Oracle Business Intelligence Enterprise Edition repository and create the first cut of your Oracle Endeca Server data domain. To do this, follow these steps, again based on a Microsoft Windows x86-64 development environment.
Viewing Flight Delay Data
To take an initial look at what’s in the Oracle Endeca Server data domain you’ve just created with Oracle Endeca Information Discovery Integrator, you can quickly create an Oracle Endeca Information Discovery Studio application that enables you to explore the data set and that you can extend afterward to try out more of Oracle Endeca Information Discovery Studio’s features. To create this “first cut” Oracle Endeca Information Discovery Studio application, follow this final set of steps:
Now click Create Application and then Go to Application to view the application in your web browser, as shown in Figure 3.
Figure 3: The initial Oracle Endeca Information Discovery Studio application
You can use this data discovery application example to navigate and search through the attributes loaded from Oracle Business Intelligence Enterprise Edition into Oracle Endeca Server and see how Oracle Endeca Server’s “faceted search” facility enables you to search and refine your target data set.
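To make the idea of faceted refinement concrete, here is a minimal Python sketch of the concept. It does not use any Oracle Endeca Server API. The records, attribute names, and the `refine` and `facet_counts` helpers are hypothetical and stand in for the attributes loaded into the data domain.

```python
from collections import Counter

# Hypothetical flight-delay records; attribute names and values are illustrative.
records = [
    {"Carrier": "AA", "Origin": "JFK", "DelayMinutes": 35},
    {"Carrier": "AA", "Origin": "ORD", "DelayMinutes": 0},
    {"Carrier": "DL", "Origin": "JFK", "DelayMinutes": 12},
    {"Carrier": "UA", "Origin": "ORD", "DelayMinutes": 48},
]


def refine(rows, **selections):
    """Keep only the rows whose attributes match every selected facet value."""
    return [
        row for row in rows
        if all(row.get(attr) == value for attr, value in selections.items())
    ]


def facet_counts(rows, attribute):
    """Count how many rows carry each value of the given attribute."""
    return Counter(row[attribute] for row in rows)


# Selecting a facet value narrows the record set, and the counts for the
# remaining facets are recomputed over the narrowed set.
jfk = refine(records, Origin="JFK")
print(facet_counts(records, "Carrier"))
print(facet_counts(jfk, "Carrier"))
```

Each refinement narrows the set of matching records, and the counts shown for every other attribute are recalculated against that narrower set. This is the behavior you see when you select attribute values in the Studio application.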
Oracle Endeca Information Discovery applications extend the capabilities of Oracle’s business intelligence platform to encompass unstructured and semistructured datasources, enabling you to use the unique capabilities of Oracle Endeca Server to search, analyze, and aggregate data from any source.
With the ability to now use data from the Oracle Business Intelligence repository alongside database, file, and other datasources, you can quickly create web-based data discovery applications that build on the work you’ve already done to model the business data within your organization, ensuring a “single version of the truth” while dramatically reducing the time it takes to bring the core structured data together.