As Published In
Oracle Magazine
November/December 2013

TECHNOLOGY: Business Analytics

  

Integrate and Analyze

By Mark Rittman, Oracle ACE Director

 

Combine structured and unstructured data for analysis and new insights.

Oracle Endeca Information Discovery applications enable organizations to create rich, interactive data discovery applications that consume data from all types of datasources, from the more traditional “structured” data sets found in Oracle Database and Oracle E-Business Suite to the unstructured data of documents and social media feeds.

Oracle Endeca Information Discovery Release 3.0 extends this capability by integrating with Oracle Business Intelligence Enterprise Edition 11g, providing the ability to build data discovery applications around the facts, dimensions, hierarchies, and integrated data sets of the enterprise semantic model.

In this article, I take a look at this new integration by creating an Oracle Endeca Information Discovery Studio application that uses the sample application (SampleApp) for Oracle Business Intelligence Enterprise Edition as its datasource; uses Oracle Endeca Information Discovery Integrator to load data from the Flight Delays subject area provided by the SampleApp; and uses the SampleApp subject area table metadata to create an Oracle Endeca Server data domain. I then use Oracle Endeca Information Discovery Studio to create an initial data discovery web application, which you can extend later with additional, unstructured datasources and data visualization components.

Prerequisites for the Sample

If you want to create this article’s sample application yourself, you will need to download the following products from the Oracle Software Delivery Cloud website (edelivery.oracle.com), using either your full license or a trial license. Product versions are available for Microsoft Windows x64 and Linux x86-64. 

  • Oracle Endeca Server (7.5.1.1), with Oracle WebLogic Server 10.3.6, a separate download

  • Oracle Endeca Information Discovery Integrator (3.0)

  • Oracle Endeca Information Discovery Studio (3.0)

The SampleApp for Oracle Business Intelligence Enterprise Edition 11.1.1.6.2 BP1 (V207), which I will use as the datasource, can be downloaded—preinstalled and preconfigured—as an Oracle VM VirtualBox image, from the Oracle Technology Network website.

This SampleApp (V207) Oracle VM VirtualBox image comes with several demonstration subject areas, including “X – Airline Delay,” which I will use for this example. Using Oracle Endeca Information Discovery Integrator 3.0, I will connect to the business intelligence server component and then create an Oracle Endeca Server data domain, using a selection of the subject area’s tables as a datasource. Then, once the data domain is loaded and ready for use, I will use Oracle Endeca Information Discovery Studio 3.0 to create a web application for exploring the data set.

In addition to the SampleApp Oracle VM VirtualBox image, you will need a Microsoft Windows–based environment in order to use the administration tool (which can also be the environment you use to run Oracle Endeca Information Discovery 3.0). For details on how to download and configure the administration tool to work in a separate Linux-based business intelligence server environment, see “4.5 Admintool access to SampleApp RPD” in the “SampleApp V207 - Virtual Machine Image Deployment Guide” document.

Required Amendments

To create an Oracle Endeca Server data domain that gets its data from an Oracle Business Intelligence repository, a wizard within Oracle Endeca Information Discovery Integrator connects to the repository to enable you to select a particular subject area and set of tables to use as the data domain datasource, and then the wizard automatically creates data domain attributes based on the names of the selected tables and columns. However, business model table and column names within the SampleApp repository are prefixed with numbers to aid in referencing them within sample dashboards, so you will need to create a version of these tables without the number prefixes—the Oracle Endeca Server data domain attribute names cannot start with a number.

To create a version of the “X – Airline Delay” subject area and underlying business model that complies with this naming restriction but does not affect any other SampleApp reports that use the original object names, follow these steps: 

  1. In the Microsoft Windows–based environment into which you have previously downloaded and installed Oracle Business Intelligence’s administration tool, select Start -> Oracle Business Intelligence Enterprise Edition Client -> Administration. When the administration tool opens, select File -> Open Online, select the connection to SampleApp, and enter the login credentials for the repository. For example:

    Repository Password: Admin123
    User: weblogic
    Password: Admin123
    ODBC DSN: <>

  2. With the administration tool open and the SampleApp repository’s three layers of repository metadata ready for editing, navigate to the Presentation layer on the far left and, within it, right-click the X – Airlines Delay subject area and select Duplicate with Business Model from the contextual menu. In the Copy Business Model and Subject Area dialog box, click X0 – Airlines / X – Airlines Delay to select it, and then type in the following values to provide names for the duplicate subject area and the underlying business model:

    New business model name: OEID Source BM – Airline Delay
    New subject area name: OEID Source – Airline Delay

    After you’ve entered the new names, click OK to duplicate the repository metadata and close the dialog box.

  3. Now, still using the administration tool, but this time working within the Business Model and Mapping metadata layer window, locate the OEID Source BM business model you created in the previous step and rename all the objects to remove the numbers at the start of the object names. Start with the 00 Time logical dimension: right-click it, select Rename, and then remove the numbers from the start of the name. Repeat for the other objects, so that, for example, 00 Time becomes Time, 11 Origin becomes Origin, and so on, until none of the table names start with numbers. When you have finished renaming these objects, the business model should look like Figure 1.

Figure 1: Renaming the duplicate business model’s tables to remove numeric table name prefixes

To save this updated repository and make it available for the next stage in this process, select File -> Check In Changes. When asked Do you wish to check global consistency?, choose No to skip the consistency check on all the other objects in the SampleApp repository that are not relevant to this example. Finally, select File -> Save and then File -> Close to save the updated repository back to the server and close the administration tool’s connection to it.
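
The renaming itself is done by hand in the administration tool, but the rule behind it is simple enough to sketch. The Python snippet below is an illustration only and not part of any Oracle tooling: the helper names and the regular expression are my own assumptions, and the sample names are the two mentioned in step 3.

```python
import re

# Hypothetical helpers (not an Oracle API). The pattern matches a leading run
# of digits followed by whitespace, such as the "00 " in "00 Time".
_NUMERIC_PREFIX = re.compile(r"^\d+\s+")


def strip_numeric_prefix(name: str) -> str:
    """Return the name without a leading numeric prefix."""
    return _NUMERIC_PREFIX.sub("", name)


def starts_with_digit(name: str) -> bool:
    """True if the name breaks the 'cannot start with a number' rule."""
    return name[:1].isdigit()


if __name__ == "__main__":
    for original in ["00 Time", "11 Origin"]:
        renamed = strip_numeric_prefix(original)
        print(f"{original!r} -> {renamed!r}")
```

A check like starts_with_digit could be run over a list of exported table and column names to confirm that nothing was missed before moving on to the Integrator wizard.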

Creating the Data Domain

Now that you have prepared the Oracle Business Intelligence repository for use with Oracle Endeca Information Discovery Integrator, you can connect to the Oracle Business Intelligence Enterprise Edition repository and create the first cut of your Oracle Endeca Server data domain. To do this, follow these steps, again based on a Microsoft Windows x86-64 development environment.

  1. In Windows, select Start -> All Programs -> Oracle Endeca Information Discovery 3.0.0 -> Integrator.
  2. When the Oracle Endeca Information Discovery Integrator application opens, locate the Navigator panel in the top left corner of the application window and select New -> Project. When the New Project dialog box appears, locate the Load Data from OBI Server option within the Endeca Information Discovery folder, click it, and click Next to proceed.
  3. The Load Data from OBI Server wizard appears. On the first page of the wizard, select Create a new project and type in a name for the project—such as Flight Delays from OBIEE—and click Next again to continue.
  4. Next, when prompted, enter the connection details for your instance of Oracle Endeca Server and provide a name for the new data domain. For example:

    Endeca Server host: oeid30.mycompany.com
    <>
    Endeca Server port: 7001
    Data domain name: flight_delays
    <>

    Click Next to continue.

  5. On the next page, enter the User, Password, OBI Server host, and OBI Server port values to connect to the SampleApp repository and server, replacing the OBI Server host value with the hostname or IP address of your SampleApp Oracle VM VirtualBox virtual machine.

    User: weblogic
    Password: Admin123
    OBI Server host: obieesampleapp.mycompany.com
    <>
    OBI Server port: 9703

    Once you’ve entered these details, click Connect to OBI Server to tell Oracle Endeca Information Discovery Integrator to retrieve the list of subject areas provided by the SampleApp repository, and click Next to continue.

  6. On the next page of the wizard, select the OEID Source – Airline Delay subject area you created in the previous set of steps, and then select the following tables to use as the datasource for your data domain, as shown in Figure 2.

    Arrival Delay
    Carrier
    Date
    Destination
    Flight Facts
    Origin
    Route
    Schedule

    Once you’ve selected these tables, click Next to continue.

    Figure 2: Selecting subject area tables for import into the Oracle Endeca Server data domain

  7. The next page in the wizard displays the list of Oracle Endeca Server data domain attributes that will be created to contain the imported data. On this wizard page, you can fine-tune the configuration of these attributes. In the Search Interface Name column on this wizard page, type in the following values for the attribute keys listed below to create two initial search interfaces, one for all attributes relating to a flight’s origin and another for the flight’s destination.

    Attribute Key                          Search Interface Name
    Orig_Airport_Map_Orig_US_State_Name    Origin
    Orig_Airport_Orig_Region_Name          Origin
    Orig_Airport_Orig_Division_Name        Origin
    Orig_Airport_Orig_City_Name            Origin
    Orig_Airport_Orig_Airport              Origin
    Dest_Airport_Map_Orig_US_State_Name    Destination
    Dest_Airport_Orig_Region_Name          Destination
    Dest_Airport_Orig_Division_Name        Destination
    Dest_Airport_Orig_City_Name            Destination
    Dest_Airport_Orig_Airport              Destination

    After entering the key values, click Edit Finished and then Finish to close the wizard and return to the Oracle Endeca Information Discovery Integrator main window. If you look in the Navigator panel now and locate the project you created, you will see that Oracle Endeca Information Discovery Integrator has created a complete project that reads data in from Oracle Business Intelligence Enterprise Edition and uses it to create an Oracle Endeca Server data domain.

  8. The flight delays data set in SampleApp contains details of more than 6 million flights, so to keep the resultant data domain manageably small—at least for this initial load—you can add a filter to the extraction SQL query generated by Oracle Endeca Information Discovery Integrator so that only flights from Q1 2010 are extracted and loaded. To add this filter, locate the data-in folder within the Navigator view, open it, right-click the QueryStatement.sql object within it, and then select Open With -> Text Editor. Add the following WHERE clause to the automatically generated SQL statement to restrict the extraction to flights in Q1 2010:

    where "OEID Source BM - Airline Delay"."Time"."Dep Qtr" = '2010 Q1'
    


    Then use File -> Save from the Oracle Endeca Information Discovery Integrator menu to save the changes you’ve made to this file.

  9. Finally, load the SampleApp data into Oracle Endeca Server. Within the Navigator panel, locate and open the graph folder, and then double-click the Baseline.grf graph within the set of graphs to open it in the Oracle Endeca Information Discovery Integrator main window. Click anywhere within the large gray box in the graph to select it, and then select Run -> Run from the Integrator menu to start the graph execution and load data from Oracle Business Intelligence Enterprise Edition into your data domain.
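
The filter from step 8 can also be applied with a small script instead of a text editor, which helps if you regenerate the project often. The Python sketch below rests on assumptions: the helper names are hypothetical, and it assumes the generated statement has no WHERE clause of its own and nothing after its FROM clause. Check your own QueryStatement.sql before relying on it.

```python
# Column reference taken from the WHERE clause shown in step 8.
QUARTER_COLUMN = '"OEID Source BM - Airline Delay"."Time"."Dep Qtr"'


def build_where_clause(quarter: str = "2010 Q1") -> str:
    """Build the filter clause for one departure quarter."""
    return f"where {QUARTER_COLUMN} = '{quarter}'"


def add_quarter_filter(sql: str, quarter: str = "2010 Q1") -> str:
    """Append the quarter filter to a query text, unless it is already there.

    Assumption: the statement has no existing WHERE clause and nothing
    (such as ORDER BY) that would have to come after the filter.
    """
    statement = sql.rstrip().rstrip(";").rstrip()
    clause = build_where_clause(quarter)
    if clause in statement:
        return statement
    return statement + "\n" + clause
```

Calling add_quarter_filter on the text of the file and writing the result back has the same effect as the manual edit, and running it a second time leaves the statement unchanged.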

Viewing Flight Delay Data

To take an initial look at what’s in the Oracle Endeca Server data domain you’ve just created with Oracle Endeca Information Discovery Integrator, you can quickly create an Oracle Endeca Information Discovery Studio application that enables you to explore the data set and that you can extend afterward to try out more of Oracle Endeca Information Discovery Studio’s features. To create this “first cut” Oracle Endeca Information Discovery Studio application, follow this final set of steps: 

  1. In your web browser, navigate to your Oracle Endeca Information Discovery Studio website—http://localhost:7002, for example—and enter the login credentials—admin@oracle.com/welcome1, for example—for an Oracle Endeca Information Discovery Studio administrative user.

  2. When the Discovery Applications web page appears, navigate to the menu at the top right of the page, click the down arrow, and select Control Panel. Then, when the Control Panel page appears, click Information Discovery -> Data Source to define the datasource connection that Oracle Endeca Information Discovery Studio will use to connect to the data domain you created in the previous steps.

  3. On the Data Source page that appears, click New Data Source, and for Data Source Definition, enter Flight_Delays for Data Source ID and enter the following datasource JSON file definition, replacing the server parameter value with the name of the server that hosts your Oracle Endeca Server instance:

    {
      "dataDomainName": "flight_delays",
      "name": "Flight_Delays",
      "port": "7001",
      "server": "oeid30.mycompany.com"
    }


    Click Validate to test the connection and then Save when you are done. When you return to the main application, click Back to Home in the top right corner to return to the Discovery Applications page.

  4. Staying on this page, click New Application to create your application, and enter the following details when prompted:

    Application Name: Flight Delays Explorer (Oracle Magazine)
    Data Source: Flight_Delays

Now click Create Application and then Go to Application to view the application in your web browser, as shown in Figure 3.

Figure 3: The initial Oracle Endeca Information Discovery Studio application

You can use this data discovery application example to navigate and search through the attributes loaded from Oracle Business Intelligence Enterprise Edition into Oracle Endeca Server and see how Oracle Endeca Server’s “faceted search” facility enables you to search and refine your target data set.
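
If you script your environment setup, the datasource definition entered in step 3 can be generated and sanity-checked before you paste it into Studio. The Python sketch below is only an illustration: the function names are hypothetical, and the only keys it knows about are the four used in this article. Studio's Validate button remains the real test of the connection.

```python
import json

# The four keys used in the datasource definition in step 3.
EXPECTED_KEYS = ("dataDomainName", "name", "port", "server")


def build_data_source(server, port="7001",
                      data_domain="flight_delays", name="Flight_Delays"):
    """Return the datasource definition as JSON text."""
    definition = {
        "dataDomainName": data_domain,
        "name": name,
        "port": port,
        # Strip stray spaces, which are easy to pick up when copying hostnames.
        "server": server.strip(),
    }
    return json.dumps(definition, indent=2)


def check_data_source(text):
    """Return a list of problems this sketch can detect (empty if none)."""
    problems = []
    definition = json.loads(text)
    for key in EXPECTED_KEYS:
        value = definition.get(key)
        if value is None:
            problems.append(f"missing key: {key}")
        elif isinstance(value, str) and value != value.strip():
            problems.append(f"stray whitespace in {key}")
    return problems


if __name__ == "__main__":
    text = build_data_source("oeid30.mycompany.com")
    print(text)
    print(check_data_source(text))
```

Replace the server argument with the name of the host that runs your Oracle Endeca Server instance, as in step 3.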

Conclusion

Oracle Endeca Information Discovery applications extend the capabilities of Oracle’s business intelligence platform to encompass unstructured and semistructured datasources, enabling you to use the unique capabilities of Oracle Endeca Server to search, analyze, and aggregate data from any source.

With the ability to now use data from the Oracle Business Intelligence repository alongside database, file, and other datasources, you can quickly create web-based data discovery applications that build on the work you’ve already done to model the business data within your organization, ensuring a “single version of the truth” while dramatically reducing the time it takes to bring the core structured data together.



Mark Rittman is an Oracle ACE Director, cofounder of Rittman Mead, and author of the Oracle Press book Oracle Business Intelligence 11g Developers Guide. He writes for the Rittman Mead blog at rittmanmead.com/blog.
