Building OLAP Cubes

In this tutorial, you use the Analytic Workspace Manager (AWM) tool to build an OLAP cube. You:

• Create a cube and its dependent components, including:
• Dimensions
• Measures (stored and calculated)
• Map the OLAP model to source data
• Enable MV rewrite to the cube
• Load data into the dimensions and measures
• View the OLAP data

NOTE: This tutorial requires either Oracle Database 11.2 or 12.1.

Approximately 60 minutes

Topics

This tutorial covers the following topics:

 Overview Scenario Prerequisites Creating an Analytic Workspace Defining Dimensions and Levels Defining Hierarchies Defining and Reviewing Attributes Mapping Dimensions to Relational Sources Using Templates to Define Data Objects Defining Cubes Creating Measures Mapping Cubes to Relational Sources Enabling Query Rewrite to Cube MVs (optional) Loading and Viewing Cube Data More Information

Place the cursor over this icon to load and view all the screenshots for this tutorial. (Caution: Because this action loads all screenshots simultaneously, response time may be slow depending on your Internet connection.)

Note: Alternatively, you can place the cursor over each individual icon in the following steps to load and view only the screenshot associated with that step.

Overview

AWM is a tool for creating, developing, and managing multidimensional data in an Oracle data warehouse. With this easy-to-use GUI tool, you create the container for OLAP data, an analytic workspace (AW), and then add OLAP dimensions and cubes.

In Oracle OLAP, a Cube provides a convenient way of collecting stored and calculated measures with similar characteristics, including dimensionality, aggregation rules, and so on. A particular AW may contain more than one cube, and each cube may describe a different dimensional shape. Multiple cubes in the same AW may share one or more dimensions. Therefore, a cube is simply a logical object that helps an administrator to build and maintain data in an AW.

After creating cubes, measures, and dimensions, you map the dimensions and stored measures to existing star, snowflake, and normalized relational sources and then load the data. OLAP data can then be queried with simple SQL.

Scenario

The source data for this tutorial the OLAPTRAIN schema. OLAPTRAIN is a star schema that was sourced from a base transactional system which contains data for a fictional electronics store. The star schema contains “dimension” tables, which describe the relationships in the data, and “fact” tables, which contain the metrics used to measure performance.

The following are the tables, intended for analysis, that previously have gone through the ETL (Extraction, Transformation, and Loading of heterogeneous data) process:

 Table Description CHANNELS Table containing distribution channels for customers purchases. CUSTOMERS Table that show who purchased products, and where products are sold for the Geography dimension. PRODUCTS Table containing products that are sold by the company. TIMES Table containing time periods when products were sold. SALES_FACT Stores purchases in dollars, quantity, and price, by channel of distribution, product item, day, and customer.

Designing a Logical Data Model

After examining the relational tables, the available levels, hierarchies, and attributes for each dimension are identified. In addition, the required stored and calculated measures are identified as part of the business requirements definition process. The resulting logical model becomes the design for the OLAP data model.

Identifying Dimensions

Using the source data tables as the primary input, the following dimensions have been identified as requirements for the OLAP data model:

• Channel
• Geography
• Product
• Time

Identifying Levels

When designing your OLAP model, you also determine the level of summarization that you want to load into your cube. You may not necessarily want to replicate the data in your source as a cube. You can always query the detail data (since all of the data is in the Oracle database), by joining the cube to the fact table.

Your business requirements for summary management and analysis purposes should define the lowest level of detail for each dimension in the OLAP cube. You can load data into the cube at any level. After performing a business requirements analysis, the following the levels of summarization within each dimension have been identified as part of the OLAP data model:

 Channel dimension has two classes of distribution channels: Direct and Indirect. The children of these two values are the lowest level of detail and will be grouped in the Channel level. From the order of highest level of summarization to the lowest level of detail, levels will be: All Channels, Class, and Channel. Geography dimension reflects how company performs customer and geographic analysis along regions. Although the CUSTOMERS dimension table contains the following levels of detail: Region > Country > State-Province > City > Customer, the levels of summarization required for geographic analysis in the OLAP system will be (highest to lowest): All Regions, Region, Country, and State-Province. Product dimension will have six levels. These levels reflect the same levels of detail in the source data. From highest to lowest, the OLAP levels are: All Products, Department, Category, Type, Subtype, and Item. Time dimension will have four levels (highest to lowest): All Years, Calendar Year, Calendar Quarter, and Month. Data is available for the years 2005–2007.

Within each dimension, notice that an "All" (Total) level is added as the highest level of summarization. Adding this highest level provides additional flexibility as application users analyze OLAP data.

Identifying Hierarchies

Hierarchies organize the levels within each dimension. To identify hierarchies, you group the levels in the correct order of summarization and in a way that supports the identified types of analysis. You can orgainize levels into any number of hierarchies for each dimension.

In this OLAP data model, only one hierarchy is required for each dimension. The hierachy levels are designed as shown in the table above.

Identifying Measures

Analysis requirements include both stored and calculated measures. Two of the measures are acquired from the fact table, and the remaining measures are created and managed as OLAP calculations:

 Stored Measures Sales Quantity Calculated Measures Sales Year-to-Date Sales Year-to-Date Prior Year Sales Year-to-Date Prior Year % Change Sales Prior Year % Change Sales Prior Period Sales Prior Period % Change Sales Rank in Product Level Sales Rank in Product Parent Share of Product Sales within Parent Share of Product Sales within Total

Before you perform this tutorial, you should:

 1 Install Oracle Database 11.2 or 12.1 with the OLAP Option. 2 Download Analytic Workspace Manager from the Oracle OLAP page on OTN, and unzip it into any directory on your Windows machine. 3 Download and install the OLAPTRAIN schema following the instructions in Installing the Oracle OLAP 11g Sample Schema. Notes: The Sample Schema installation package includes two parts: 1. Installing the base OLAPTRAIN schema 2. Installing the SALESTRACK analytic workspace in the OLAPTRAIN schema Only complete the first part -- installing the base OLAPTRAIN schema. 4 a. Download olaptrain_templates.zip to a location on the machine where AWM is installed. b. Unzip the files. The resulting directory structure is: \templates\calcs The files in the ...\templates directory, and the ...\templates\calcs directory are used later in this tutorial.

Creating an Analytic Workspace

An analytic workspace is a container for multidimensional data objects and procedures written in OLAP DML. It is created using the AWM tool. Perform the following steps:

Defining Dimensions and Levels

Dimensions are lists of unique members that identify and categorize data. They form the edges of a cube, and thus the measures within the cube. Dimensions may contain levels, hierarchies, and attributes. You may define levels at the same time that create a dimension, or you may define the levels later.

You can define dimensions either as 'User' or as 'Time' dimension type. Business analysis is performed on historical data, so fully defined time periods are vital. For a Time type dimension, your source data must have columns for period end dates and time span. These required attributes support OLAP time-series analysis, such as comparisons with earlier time periods. If this information is not available, then you can define Time as a normal dimension, but it does not support time-based analysis.

Defining Hierarchies

For business analysis, data is typically summarized at various levels. For example, your database may contain daily snapshots of a transactional database. Days are thus the base level. However, you might summarize this data at the monthy, quarterly, and yearly levels.

A hierarchy is a logical structure that uses ordered levels as a means of organizing data. It can be used to define data aggregation; for example, in a time dimension, a hierarchy might be used to aggregate data from the month level to the quarter level to the year level. A hierarchy can be used to define a navigational drill path, regardless of whether the levels in the hierarchy represent aggregated totals.

Dimensions can have one or more hierarchies. If you define multiple hierarchies, one of them must be defined as the default hierarchy.

Defining and Reviewing Attributes

Attributes provide information about the individual members of a dimension. They are used for labeling data displays and selecting data. All dimensions are created with long and short description attributes. Time dimensions also have time-span and end-date attributes. In addition, you can create your own user attributes

In this section, you create a CHANNEL_TYPE attribute, and also review the description attributes for the CHANNEL dimension you just created. Perform the following steps:

1. Rick-click the Attributes folder, then select Create Attribute.   2. In the Create Attribute dialog, select or enter the following: a. Name = CHANNEL_TYPE b. Attribute Type = User c. In the “Apply Attributes to” box: Drill on the Channel dimension. Deselect the Channel dimension check box. Select only the CHANNEL level check box (the lowest level). Note: The Channel Type attribute only applies to the lowest level in the Sales Channel hierarchy. The Create Attribute dialog box should look like this: Click Create.   3. Expand the Attributes folder to view the Channel dimension attributes.   4. Select the LONG_DESCRIPTION attribute. In the right-hand pane, notice that description attributes are defined for all levels in the hierarchy, in contrast to the user attribute that you just created..

After creating OLAP data objects, you map them to tables and views in Oracle Database. You map the key column in the dimension table to the Member attribute in the OLAP dimension. In addition, you map the appropriate attribute columns in the dimension table to the associated OLAP dimension attributes.

Afterward, you can load data into your analytic workspace using the Maintain Analytic Workspace wizard.

1. Expand the CHANNEL dimension and click Mappings. Result: Two panes appear to the right: of the navigator -- the Schemas pane, and the Mapping pane. In the Table Mapping view (right-hand pane), the Source Column fields are initially blank, as shown here:   2. Ensure that Star Schema is selected as the Type of Dimension Table, as shown in the previous step.   3. In the source schema pane, expand OLAPTRAIN > Tables > CHANNELS. Then, drag the source columns from the Schema pane to the Mapping pane for the CHANNEL and CLASS levels as shown in the picture below. The ALL_CHANNELS level in the hierarchy does not contain a source data column. For "All/Total" hierarchy levels, you can enter constants or single row SQL functions. Enter the following constants for the ALL_CHANNELS level (single quotes are required for text literals): Member = ‘ALL_CHANNELS’ Description attributes = ‘All Channels’ The resulting mapping should look like this: Note: The "All/Total" value ensures that there is a single node at the top of the hierarchy that will be the summary of the data for that dimension.   4. In the lower right corner of the mapping pane, click Apply. Result: The Channel dimension is ready to have data loaded. Although you could load the dimension data now, you will perform this step later in the tutorial.

The template feature in Analytic Workspace Manager saves the definition of the OLAP data objects as an XML file. Using a saved template, you can create a new analytic workspace, dimension, cube, and measure exactly like an existing object, with or without mappings. Templates do not include the data, only the definition of the object.

Templates allow you to:

 Share analytic workspace designs with other users. Transfer object definitions to other schema or instances. Persist object definitions outside database. Place object definitions in source control.

In this section, you create three dimensions, GEOGRAPHY, PRODUCT and TIME from previously saved templates. Perform the following steps:

1. Right-click the Dimensions folder, then select Create Dimension From Template.   2. At the Create Dimensions From Template dialog box, locate the ...\templates directory, where you installed the olaptrain template files. For example: c:<your_path>\templates Then, select GEOGRAPHY.XML in the templates directory and click Create.   3. The new GEOGRAPHY dimension appears under the Dimensions folder. Drill on Levels, Hierarchies, and Attributes to view its elements.   4. To create the PRODUCT dimension, right-click the Dimensions folder, then select Create Dimension From Template, as you did previously in step 1. 5. At the Create Dimensions From Template dialog box, locate the PRODUCT.XML file and click Create.   6. The new PRODUCT dimension appears under the Dimensions folder.   7. Now, create the TIME dimension by right-clicking the Dimensions folder and selecting Create Dimension From Template. 8. At the Create Dimensions From Template dialog box, locate the TIME.XML file and click Create. Result: The new TIME dimension appears under the Dimensions folder.   9. In the navigator, drill on TIME > Attributes. Notice that there are two special attributes -- END_DATE and TIME_SPAN -- have been created for the TIME dimension. Since this dimension was defined as a "Time" type, these attributes are automatically created. They must be mapped to apppropriate source data columns for certain OLAP time series analysis features to be enabled. You will leverage these attributes when you create time series calculations later in this tutorial.   10. The Geography, Product and Time templates all included mappings. To view the mappings for the Time dimension, click Mappings under TIME in the navigator. As with the other the "All/Total" level in the hierarchy is mapped to either constants or single-row SQL functions. You can examine the mappings for any of these dimensions by clicking on the Mappings tab under the dimension node in the navigator.

In Oracle OLAP, a Cube provides a convenient way of collecting measures of the same dimensionality. Therefore, a cube is simply an object that helps an administrator to build and maintain an AW.

Cubes aid in the definition of measures with common characteristics, including the following:

 The edges of a Cube are defined by its dimensions. If multiple measures have the same dimensionality, it is likely that they will be defined in the same cube.. Measures that share sparsity patterns and aggregation rules are commonly defined in the same Cube. Measures in the same Cube have the same relationships to other logical objects and can easily be analyzed and displayed together. A particular AW may contain more than one Cube, and each cube may describe a different dimensional shape. Multiple Cubes in the same AW may share one or more dimensions.

For example, sales data can be organized into a cube, whose edges contain values from the channel, geography, product, and time dimensions and whose body contains measures that might include dollar sales, unit sales, and a range calculated measures based on sales and quantity sold.

Perform the following steps to create a cube that will be used to organize a variety of sales measures:

1. Right click the Cubes folder, then click Create Cube.   2. In the General tab of the Create Cube window, specify the following: a. Name: SALES_CUBE b. Use the Add tool (>) to selected dimensions in the following order: CHANNEL TIME GEOGRAPHY PRODUCT Result: the Create Cube window should look like this: Notes: The order in which the dimensions are listed in a cube may affect performance because it determines the way the data is stored on disk. In general, when you dimension a cube, the first dimension in a cube has the fewest number of dimension members, and the last dimension has the largest number of dimension members. This is the case in the OLAPTRAIN schema.   3. Select the Storage tab. Accept the default option to Use compression, and then enable the Sparse option for all dimensions, as shown here Notes: What is Sparsity? When there are a large number of empty cells in a cube, the cube is said to be “sparse.” This is very common in dimensional data models. Most commonly, all dimensions are marked as sparse. When one or more dimensions as marked as sparse, OLAP creates a special index for the cube that automatically manages sparsity. The Compression feature can be used to significantly reduce the size of cubes and improve performance of both data loads and queries. Since most dimensional data is sparse, the Compression option is selected as a default.   4. Select the Aggregation tab. Then, in the Precompute sub-tab, specify a value of 30 for Cost-based aggregation, as shown here: Notes: Cost-based aggregation enables you to select compression for your cube. Specify a percentage value and the database will precompute and store the most costly aggregate values based on your input. Using a setting of 30 causes a larger percentage of cube data to be aggregated and stored than the default setting of 20.   5. At the bottom of the Create Cube dialog box, click Create. Result: the SALES_CUBE node appears in the navigator under Cubes.

You can create two types of measures in a cube: Stored (or Base) measures, and Calculated measures. Every measure that belongs to a particular cube shares the characteristics that were defined for the cube.

Stored Measures

Base measures store the facts collected about your business. When you create base measures in your OLAP data model, you will map them to source data just as you have done with dimensions.

Calculated Measures

One of the powerful features of the Oracle OLAP technology is the ability to efficiently and easily generate business calculations of data held in the database. In any OLAP implementation, the number of calculated measures greatly exceeds the number of stored measures.

OLAP calcuated measures are derived from base measures or other calculated measures.These calculations are computed dynamically as users query the data. Calculations are automatically exposed as columns in a cube view – making it very easy for users to leverage the rich analytic functionality through very simple SQL.

AWM makes it very easy to define calculated measures using a graphical Calculation Builder. The Calculation Builder contains pre-defined examples for many common business calculation types. You select the calculation type you want, and then modify the example to create exactly the calculation that you need.

In this section, you will create two stored measures and ten calculated measures. Three of the calculated measures are created using the Calculation Builder, and seven are created using XML template files.

After creating an OLAP cube, you map it to relational data sources in Oracle Database. When mapping the cube, drag the appropriate source data column to the associated field for the OLAP cube element.

You map the following fields:

 The stored measures that are defined within the cube. The lowest level of detail for each dimension hierarchy. The Join Condition field. This field associates the foreign key (fk) column from the fact table to the primary key (pk) column from the dimension table.

Afterward, you can load data into your analytic workspace using the Maintain Analytic Workspace wizard.

Enabling Query Rewrite to Cube MVs (optional)

In an extension of the Materialized View capabilities for Oracle Database, OLAP cubes can be represented as a cube-organized materialized views (Cube MVs). The query optimizer automatically recognizes when an existing Cube MV can and should be used to satisfy a SQL summary request. A Cube MV represents a significant summary space, and benefits include both ease of manageability and improved query performance.

If your OLAP system requirements do not include a need for summary management of exiting SQL-based BI applications, then you can skip this optional task.

Notes:

 If you chose to enable query rewrite, supporting cube MV objects are automatically created and managed by the Oracle Database. Before you can enable materialized views for the cube, you must first map the cube.

To enable query rewrite and MV refresh for your OLAP cube, peform the following steps.

1. In the navigator, click SALES_CUBE.   2. In the right pane, click the Materialized Views tab and select the following options: Enable Materialized View Refresh of the Cube Enable Query Rewrite Notes: MV Refresh If you select Enable MV Refresh, you also specify the refresh method and mode for the cube. Cube MV refresh methods include, Complete, Force, and Fast. The default Refresh Mode is On Demand. Query Rewrite If you select Enable Query Rewrite, supporting cube MV objects are automatically created by the database when you click Apply. When a Cube is enabled for query rewrite, the associated Dimensions are automatically enabled for MV refresh as well.   3. Accept the default settings for all other options, and then click Apply. Result: the following information box appears: When the information box closes, cube MVs are enabled and ready for use by the Materialized View subsystem. Note: For more information on enabling and troubleshooting Query Rewrite to Cube MVs, see this white paper.

The Maintenance Wizard loads and aggregates the data in a single step. You can load all mapped objects in the analytic workspace, or individual dimensions and measures. You can also choose to run the job immediately, enter it in the Oracle Job Queue, or save it as a SQL script.

By default, when you load data to a cube, the dimensions of that cube are also processed. If you have already loaded dimension data, you can specify only to load measure data.

In the following steps, you load all data for the cube and run the job immediately. Then you view the data in AWM.