In this tutorial, you create an analytic workspace with logical dimensions and cube objects using Analytic Workspace Manager (AWM).
Approximately 45 minutes
This tutorial covers the following topics:
Place the cursor over this icon to load and view all
the screenshots for this tutorial. (Caution: Because this action loads all screenshots
simultaneously, response time may be slow depending on your Internet connection.)
Note: Alternatively, you can place the cursor over each individual icon in the following steps to load and view only the screenshot associated with that step.
Analytic Workspace Manager 10g is a tool for creating, developing, and managing multidimensional analytic workspaces in an Oracle OLAP data warehouse. With this easy-to-use GUI tool, you create an analytic workspace with logical dimension and cube objects. Afterwards, you map these objects with existing star, snowflake, and normalized relational sources and then load the data. In addition, throughout the tutorial you use templates, and make data storage decisions.
The starting point is various existing star, snowflake, and normalized schemas of the fictional Global Computing Company, which distributes hardware and software components on a worldwide basis. The following are the tables intended for analysis that previously have gone through the ETL (Extraction, Transformation, and Loading of heterogeneous data) process:
| Table | Description |
| CHANNEL_DIM | Star style dimension table for Channel dimension |
| CUSTOMER_DIM | Star style dimension table for Customer dimension |
| TIME_DIM | Star style dimension table for Time dimension |
| TIME_YEAR_DIM | Snowflake style dimension table for Year level of Time dimension |
| TIME_QUARTER_DIM | Snowflake style dimension table for Quarter level of Time dimension |
| TIME_MONTH_DIM | Snowflake style dimension table for Month level of Time dimension |
| PRODUCT_TOTAL_PRODUCT_MEMBER | Normalized style table with members of the Product dimension at the Total Product level |
| PRODUCT_CLASS_MEMBER | Normalized style table with members of the Product dimension at the Class level |
| PRODUCT_FAMILY_MEMBER | Normalized style table with members of the Product dimension at the Family level |
| PRODUCT_ITEM_MEMBER | Normalized style table with members of the Product dimension at the Item level |
| PRODUCT_TOTAL_PRODUCT_DSC | Normalized style table with descriptions of members of the Product dimension at the Total Product level |
| PRODUCT_CLASS_DSC | Normalized style table with descriptions of members of the Product dimension at the Class level |
| PRODUCT_FAMILY_DSC | Normalized style table with descriptions of members of the Product dimension at the Family level |
| PRODUCT_ITEM_DSC | Normalized style table with descriptions of members of the Product dimension at the Item level |
| PRODUCT_ITEM_BUYER | Normalized style table with values of Buyer attribute for members of the Product dimension at the Items level |
| PRODUCT_ITEM_MARKETING_MANAGER | Normalized style table with values of Marketing Manager attribute for members of the Product dimension at the Items level |
| PRODUCT_ITEM_PACKAGE | Normalized style table with values of Package attribute for members of the Product dimension at the Items level |
| UNITS_HISTORY_FACT | Historical fact data measures of the Sales Cube |
| UNITS_UPDATE_FACT | Latest month fact data for measures of the Sales Cube |
| PRICE_AND_COST_HIST_FACT | Historical fact data for measures of the Price and Cost cube |
| PRICE_AND_COST_UPD_FACT | Last month fact data for measures of the Price and Cost cube |
Designing a Logical Data Model
After examining these relational tables, which support analysis
at Global Computing, the dimensions, levels, hierarchies, and attributes in
the logical data model are identified. Furthermore, the relationships within
each dimension are identified. The resulting data model is used to design
the Global analytic workspace.
Identifying Dimensions
Four dimensions are used to organize the facts in the database.
| Channel | Shows how data varies according to each distribution channel. | |
| Customer | Shows how data varies by customer or geographic area. | |
| Product | Shows how data varies by product. | |
| Time | Shows how data varies over time. | |
Identifying Levels
Now that you have identified dimensions, you can identify the levels of summarization within each dimension. Analysis requirements at Global Computing reveal that:
| There are three distribution channels: Sales, Catalog, and Internet. These three values are the lowest level of detail in the data warehouse and are grouped in the Channel level. From the order of highest level of summarization to the lowest level of detail, levels are Total Channel and Channel. | ||
| Global performs customer and geographic analysis along the line of shipments to customers and by market segmentation. In each case, the lowest level of detail in the data model is the Ship To location. | ||
| Shipments, the levels of summarization is (highest to lowest): Total Customers, Region, Warehouse, and Ship To. | ||
| Market Segmentation, the levels of summarization is (highest to lowest): Total Market, Market Segment, Account, and Ship To. | ||
| Product dimension is four levels (highest to lowest): Total, Class, Family, and Item. | ||
| Time dimension is three levels (highest to lowest): Year, Quarter, and Month. | ||
Within the Channel, Customer, and Product dimensions, you added a Total level as the highest level of summarization. Adding this highest level provides additional flexibility as application users analyze data.
Identifying Hierarchies
Hierarchies organize the levels within each dimension. To identify hierarchies, you group the levels in the correct order of summarization and in a way that supports the identified types of analysis.
For the Channel, Product, and Time dimensions, Global Computing requires only one hierarchy for each dimension. For the Customer dimension, however, Global Computing requires two hierarchies. Analysis within the Customer dimension tends to be either by geographic area or market segment. Therefore, you organize levels into two hierarchies, Shipments and Market Segment.
Identifying Stored Measures
Four facts are acquired from the transactional database:
| Sales | ||
| Units | ||
| Unit Price | ||
| Unit Cost | ||
All of the other facts can be derived from these basic facts. The derived facts can be calculated in the analytic workspace on demand. If experience shows that some of these derived facts are being used heavily and the calculations are putting a noticeable load on the system, then some of these facts can be calculated and stored in the analytic workspace as a data maintenance procedure.
Before you perform this tutorial, you should:
| 1. |
Perform the Installing Oracle Database 10g on Windows OBE. Note: you can also perform this tutorial connecting to a non-windows database. |
|
| 2. | Download Analytic Workspace Manager from OTN and unzip it into any directory on your Windows machine. |
|
| 3. | Download and unzip awm.zip into your working directory (i.e. d:\wkdir) | |
Perform the following steps to setup the database for use in this tutorial.
| 1. |
You first need to create a tablespace and data files. Open SQL*Plus and execute the following script: @c:\wkdir\awm_setup01The awm_setup01.sql script contains the following statements: sqlplus / as sysdba
CREATE TABLESPACE GLOBAL
DATAFILE '<oracle_base>\oradata\orcl\GLOBAL.DBF'
SIZE 90M AUTOEXTEND ON NEXT 5M;
CREATE TEMPORARY TABLESPACE GLOBAL_TEMP
TEMPFILE '<oracle_base>\oradata\orcl\GLOBAL_TEMP.DBF'
SIZE 90M AUTOEXTEND ON NEXT 5M
UNIFORM SIZE 256K;
<oracle_base> is the directory where your oracle_home is located. Modify the script to contain your directory.
|
| 2. |
Now you want to create a user and grant the required role. Open SQL*Plus and execute the following script: @c:\wkdir\awm_setup02The awm_setup02.sql script contains the following statements: CREATE USER GLOBAL
IDENTIFIED BY GLOBAL DEFAULT TABLESPACE GLOBAL
TEMPORARY TABLESPACE GLOBAL_TEMP
QUOTA UNLIMITED ON GLOBAL;
GRANT OLAP_USER TO GLOBAL;
|
| 3. |
At this point you can import your data from the GLOBAL_TABLES.DMP into the user you just created. Execute the following commands from a command prompt: cd \wkdir
IMP GLOBAL/GLOBAL
Import file: EXPDAT.DMP > GLOBAL_TABLES.DMP
Enter insert buffer size (minimum is 8192) 30720> 30720
List contents of import file only (yes/no): no > no
Ignore create error due to object existence (yes/no): no > no
Import grants (yes/no): yes > yes
Import table data (yes/no): yes > yes
Import entire export file (yes/no): no > yes
|
The analytic workspace, which is multidimensional data objects and procedures written in OLAP DML, is created using the Analytic Workspace Manager. Perform the following steps:
| 1. | Open Windows Explorer and double-click <your_path>\awm\bin\awm.exe.
|
| 2. | Right-click Databases and select Add Database to tree.
|
| 3. | Enter MyAW in the Description field and <hostname>:1521:<SID> in the Connection Information field and click Create.
|
| 4. | Expand MyAW.
|
| 5. | Enter global as the Username and Password and click OK.
|
| 6. | Expand Schema > GLOBAL. Right-click Analytic Workspaces and select Create Analytic Workspace.
|
| 7. | Enter GLOBAL as the Name and click Create.
|
| 8. | Your Analytic Workspace has been created.
|
Dimensions are lists of unique values that identify and categorize data. They form the edges of a logical cube, and thus the measures (facts) within the cube. Dimensions are the parents of levels, hierarchies, and attributes in the logical model. You define these supporting objects, in addition to the dimension itself, in order to have a fully functional dimension.
You can define dimensions either as 'User' or as 'Time' dimensions. Business analysis is performed on historical data, so fully defined time periods are vital. Your time dimension source must have columns for period end dates and time span. These required attributes support time-series analysis, such as comparisons with earlier time periods. If this information is not available, then you can define Time as a normal dimension, but it does not support time-based analysis.
| 1. |
Right-click the Dimensions folder and select Create Dimension.
|
| 2. |
At the default General tab in the Create Dimension dialog box, enter CHANNEL as the name and select User Dimension as dimension type. Then click Create.
|
| 4. |
Your dimension has been created.
|
For business analysis, data is typically summarized at various levels. For example, your database may contain daily snapshots of a transactional database. Days are thus the base level. You might summarize this data at the weekly, quarterly, and yearly levels.
Levels have parent-child or one-to-many relationships, which form a hierarchy. For example, each week summarizes seven days, each quarter summarizes 13 weeks, and each year summarizes four quarters. This hierarchical structure enables analysts to detect trends at the higher levels, then drill down to the lower levels to identify factors that contributed to a trend.
For each level that you define, you must identify a data source for dimension members at that level. Members at all levels are stored in the same dimension. In the previous example, the Time dimension contains members for days, weeks, quarters, and years.
| 1. |
Expand the CHANNEL node and right-click the Levels folder, then select Create Level.
|
| 2. |
At the Create Level dialog box, enter TOTAL_CHANNEL as the Name and click Create.
|
| 3. |
The new TOTAL_CHANNEL level appears as an item in the Levels folder. You need to create one more Level for this dimension. Right-click again on the Levels folder and select Create Level.
|
| 4. |
Enter CHANNEL as the Name and click Create.
|
| 5. |
The levels TOTAL_CHANNEL and CHANNEL have been created for the CHANNEL dimension.
|
A hierarchy is a logical structure that uses ordered levels as a means of organizing data. It can be used to define data aggregation; for example, in a time dimension, a hierarchy might be used to aggregate data from the month level to the quarter level to the year level. A hierarchy can be used to define a navigational drill path, regardless of whether the levels in the hierarchy represent aggregated totals.
Dimensions can have one or more hierarchies. If you define multiple hierarchies, be sure to define one of them as the default hierarchy.
| 1. |
Right-click the Hierarchies folder, then select Create Hierarchy.
|
| 2. |
At the Create Hierarchy dialog box, enter Primary as the name and accept as the default hierarchy. Click >> to select all the levels and click Create.
|
| 3. |
The new PRIMARY hierarchy appears as an item in the Hierarchies folder.
|
Attributes provide information about the individual members of a dimension. They are used for labeling cross-tabular and graphical data displays, selecting data, organizing dimension members, and so forth. All dimensions are created with long and short description attributes. Time dimensions also have time-span and end-date attributes.
In this section, you review the attributes for the CHANNEL dimension you just created. Perform the following steps:
| 1. |
Expand the Attributes folder and highlight LONG_DESCRIPTION. Expand CHANNEL to review the levels. The levels must be selected in order for you to map the attributes to the data sources. This is done later in the tutorial.
|
| 2. |
Under the Attributes folder, highlight SHORT_DESCRIPTION. Review its contents.
|
The template feature in Analytic Workspace Manager saves the definition of the logical objects outside the confines of the database. Using a saved template, you can create a new analytic workspace, dimension, cube and custom measures exactly like an existing object with or without mappings. Templates do not include the data. Templates allow you to:
| Share analytic workspace designs with other users. | ||
| Transfer object definitions to other schema or instances. | ||
| Persist object definitions outside database. | ||
| Place object definitions in source control. | ||
In this section, you create three dimensions, CUSTOMER, PRODUCT and TIME from previously saved templates. Perform the following steps:
| 1. |
Right-click the Dimensions folder, then select Create Dimension From Template.
|
| 2. |
At the Create Dimensions From Template dialog box, locate the Customer Star.XML file in the <wkdir> directory and click Create.
|
| 3. |
The new CUSTOMER dimension appears under the Dimensions folder.
|
| 4. |
Now you can create the PRODUCT dimension. Right-click the Dimensions folder, then select Create Dimension From Template.
|
| 5. |
At the Create Dimensions From Template dialog box, locate the Product Other.XML file and click Create.
|
| 6. |
The new PRODUCT dimension appears under the Dimensions folder.
|
| 7. |
Now you can create the TIME dimension. Right-click the Dimensions folder, then select Create Dimension From Template.
|
| 8. |
At the Create Dimensions From Template dialog box, locate the TIME Snowflake.XML file and click Create.
|
| 9. |
The new TIME dimension appears under the Dimensions folder.
|
Cubes are logical representations of multidimensional data. The edges of the cube contain dimension members and the body of the cube contains data values. For example, sales data can be organized into a cube, whose edges contain values from the channel, customer, product, and time dimensions and whose body contains volume sales and dollar sales data.
Cubes are the parents of measures (facts). They are informational objects that identify measures with the exact same dimensions, and are candidates for being processed together at all stages: data loading, aggregation, storage, and querying.
In this section, you create two cubes: SALES_CUBE and PRICE_AND_COST_CUBE.
Perform the following steps:
| 1. |
Right click the Cubes folder, then click Create Cube.
|
| 2. |
At the default General tab in the Create Cube dialog box, enter SALES_CUBE as the name. Select >> to include all the dimensions and click the Implementation Details tab.
|
| 3. |
The order in which the dimensions are listed in a cube affects performance because it determines the way the data is stored on disk. The first dimension in a cube is the fastest-varying dimension, and the last dimension is the slowest-varying dimension. Move the dimensions in the following order: TIME Make sure the Sparse check box is selected for all dimensions except Time. Deselect the Use compression check box. Then click Create.
|
| 4. |
Your SALES_CUBE cube has been created. Right click the Cubes folder again, then click Create Cube.
|
| 5. |
At the default General tab in the Create Cube dialog box, enter PRICE_AND_COST_CUBE as the name. Select the TIME and PRODUCT dimensions and click the Rules tab.
|
| 6. |
Select Nonadditive (Do not summarize) for the Operator for both Dimensions TIME and PRODUCT. Then click Create.
|
| 7. |
Your cubes have been created.
|
Base measures store the facts collected about your business. Each measure belongs
to a particular cube, and thus shares particular characteristics with other
measures in the cube, such as the same dimensions.
| 1. |
Expand the SALES_CUBE node and right-click the Measures folder, then select Create Measure.
|
| 2. |
At the Create Measure dialog box, enter SALES as the name and click Create.
|
| 3. |
The new SALES measure appears under the Measures folder. Now you create the Units measure. Right-click the Measures folder again, then select Create Measure.
|
| 4. |
At the Create Measure dialog box, enter UNITS as the name and click Create.
|
| 5. |
Expand the PRICE_AND_COST_CUBE node and right-click the Measures folder, then select Create Measure.
|
| 6. |
At the Create Measure dialog box, enter UNIT_PRICE as the name and click Create.
|
| 7. |
Right-click the Measures folder again, then select Create Measure.
|
| 8. |
At the Create Measure dialog box, enter UNIT_COST as the name and click Create.
|
| 9. |
All your measures have been created.
|
After creating logical objects, you can map them to relational data sources
in Oracle Database. Afterward, you can load data into your analytic workspace
using the Maintain Analytic Workspace wizard.
| 1. |
Expand the CHANNEL dimension and click Mappings.
|
|||||||||||||||||||||
| 2. |
The Mapping Window is displayed in the right pane. Expand the GLOBAL schema then Tables.
|
|||||||||||||||||||||
| 3. |
Select the CHANNEL_DIM table and drag-and-drop it on the mapping canvas.
|
|||||||||||||||||||||
| 4. |
The table appears on the mapping canvas. You can move the objects around a bit so that you can see them in the same window.
|
|||||||||||||||||||||
| 5. |
Draw lines from the CHANNEL_DIM source columns to the CHANNEL target objects. Map the columns as follows:
To draw a line, click the output connecter of the source column and drag it to the input connector of the target object. Once finished, click Apply.
The results are shown. The diagram is auto arranged.
|
|||||||||||||||||||||
| 6. |
Alternatively, you can review the mappings in the Table Mapping View. Click the Table Mapping View icon on the toolbar.
|
|||||||||||||||||||||
| 7. |
The mapping is now shown in a table format.
|
|||||||||||||||||||||
| 8. |
All four dimensions are now mapped. The Customer, Product and Time templates included mappings. However, you may want to examine the Product and Time mappings as they differ from star schemas. Product uses normalized tables and Time uses snowflake tables.
|
|||||||||||||||||||||
| 9. | Now you can map your cubes. Expand the SALES_CUBE node. You receive a messaging asking you if you want to save your changes. Click Yes.
|
|||||||||||||||||||||
| 9. |
Click Mappings.
|
|||||||||||||||||||||
| 10. |
Expand GLOBAL > Tables. Select the UNITS_HISTORY_FACT table and drag-and-drop it on the mapping canvas.
|
|||||||||||||||||||||
| 11. |
Draw lines from the UNITS_HISTORY_FACT source columns to the SALES_CUBE target objects. Map the columns as follows:
To draw a line, click the output connecter of the source column and drag it to the input connector of the target object. Once finished, click Apply.
The results are shown.
|
|||||||||||||||||||||
| 12. |
Expand the PRICE_AND_COST_CUBE node and click Mappings.
|
|||||||||||||||||||||
| 13. |
Select the PRICE_AND_COST_HIST_FACT table and drag-and-drop it on the mapping canvas.
|
|||||||||||||||||||||
| 14. |
Draw lines from the PRICE_AND_COST_HIST_FACT source columns to the PRICE_AND_COST_CUBE target objects. Map the columns as follows:
To draw a line, click the output connecter of the source column and drag it to the input connector of the target object. Once finished, click Apply.
|
|||||||||||||||||||||
The Maintenance Wizard loads and aggregates the data as a single job. You can
load all mapped objects in the analytic workspace, or individual dimensions
and measures. You can also choose to run the job immediately, enter it in the
Oracle Job Queue, or save it as a SQL script.
| 1. |
Right-click the GLOBAL analytic workspace, then select Maintain Analytic Workspace GLOBAL.
|
| 2. |
Select SALES_CUBE and PRICE_AND_COST_CUBE and click > to select the cubes as well as its dimensions. Click Next.
|
| 3. |
Accept the defaults for data processing options and click Next.
|
| 4. |
Click Finish to create and run the maintenance task to load the data into the GLOBAL analytic workspace.
|
| 5. |
The maintenance task has been completed. Review the Build Log. Scroll down to the bottom.
|
| 6. |
The data has been loaded successfully into the GLOBAL analytic workspace. Click Close.
|
| 7. |
Now the GLOBAL analytic workspace contains data as specified by the default aggregation plan. You can now perform ad-hoc, multidimensional analysis with Oracle Business Intelligence Beans applications such as Discoverer Plus OLAP and Spreadsheet Add-In!
|
| 8. | To view the data, right-click the SALES_CUBE.SALES Measure and select View Data SALES.
|
| 9. | The data is shown.
|
In this tutorial, you learned how to:
| Create an analytic workspace | ||
| Define dimensions, levels and hierarchies | ||
| Review your attributes | ||
| Use templates | ||
| Define cubes | ||
| Create measures | ||
| Create a mapping to relational sources | ||
| Load data in the analytic workspace | ||