As Published In
Oracle Magazine
May/June 2008

TECHNOLOGY: Data Warehouse


Accelerating Data Warehouses

By Dan Vlamis

Oracle OLAP 11g brings high-performance data warehouse features to Oracle Database 11g.

An option of Oracle Database 11g Enterprise Edition, Oracle OLAP 11g is a full-featured online analytical processing engine. Because Oracle OLAP 11g is embedded in Oracle Database, it benefits from database scalability, security, and manageability features.

Key new features in Oracle OLAP 11g include database-managed relational views of a cube, a cube scan row source that is used by the SQL optimizer, and cube-organized materialized views. This article focuses on enhancements in Oracle OLAP 11g, how OLAP cubes fit into the Oracle data warehouse, and how they are used within SQL-based business intelligence applications to provide enriched content with high performance.

Why OLAP?

Users' queries are often unpredictable. On different days, the same users will perform trend analysis, drill down on specific product lines, and compare a week's sales against those of the same week last year. With standard relational systems, it is difficult to optimize data structures that provide consistently good query performance for such an unpredictable query pattern.

To address this need, DBAs and designers frequently create a system of summary tables or materialized views. OLAP cubes, which provide consistently fast query performance across an entire data model, often provide a better alternative to summary management. Sophisticated calculations can be easily embedded within the cube to enhance the analytic content of applications.

These calculations often rely on data from many rows and interrow calculations. For example, an OLAP cube might include a calculation that compares the current year's sales for each region and product line with those from the same period last year and two years ago. The cube structure is optimized to handle this kind of querying and analysis.

Why Oracle OLAP?

Oracle OLAP uses an analytic workspace in the database to perform OLAP analysis. Oracle OLAP stores data in the database as multidimensional cubes, which are designed for fast incremental update and query. Cubes are organized by dimensions, which act as keys to the fact data and define the basic structure of the cube. In many ways, a cube is similar to a star schema. The cube plays the role of the fact table, and an OLAP dimension plays the role of a dimension table. Dimensions can be simple lists of members, or they can be organized into levels and hierarchies. Hierarchical dimensions enable data to be aggregated from lower levels to higher levels of summarization. They support navigation such as drill-down and certain types of calculations such as Share to Parent, Share within Ancestor, and Rank within Parent. They also support many time-series calculations such as Year to Date. These types of calculations are easy to define within the analytic workspace manager (the administrative tool of Oracle OLAP 11g) and are efficiently computed within the cube at runtime.

Oracle OLAP 11g can significantly shorten query processing times for users of SQL-based business intelligence (BI) tools such as Oracle Business Intelligence Suite Enterprise Edition and other third-party tools. Oracle OLAP 11g also makes it easy to embed computations in an OLAP cube. Oracle OLAP 11g capabilities are all provided within Oracle Database, which allows for centralized control of data, business rules, and security.

Creating Cube-Organized Materialized Views for Summary Management

Cube-organized materialized views are a new capability in Oracle OLAP 11g that enables cubes to be used as a summary management solution, often replacing relational summary tables and table-based materialized views. In this scenario, an application queries the fact table by using aggregation functions (such as SUM) and a GROUP BY clause, and the query rewrite feature of the database automatically redirects the query to the cube-organized materialized view. The cube manages summary data transparently to the querying application, and users benefit from improved query performance.

To create a cube-organized materialized view, DBAs build a cube from the fact and dimension tables that applications currently query. They then enable the cube to allow materialized view refresh and query rewrite. The materialized view can be created via an API call or, more commonly, via certain options in the analytic workspace manager, as shown in Figure 1. The cube is self-indexing and manages the entire aggregate space. 

Build the GLOBAL sample cube. To create and use the GLOBAL sample OLAP cube and walk through the Oracle OLAP 11g capabilities this article describes, install and set up the products, schema, and workspace as follows:

1. Install the Oracle Database 11g database server and client. Note that the server and client are separate downloads, both available on Oracle Technology Network at otn.oracle.com. From this page, select Downloads -> Database , and then click the See All link next to the name of your operating system in the Oracle Database 11g download section. When installing the database server, you can use default options. When installing the client, make sure you select the Administrator option to install the extended set of tools, which includes the analytic workspace manager.

2. Download the GLOBAL schema (global_11g_schema.zip) from oracle.com/technetwork/products/bi/olap. From this page, click the 11.1 link next to the Global Schema selection in the Downloads section of this page. Unzip the file contents to a directory.

3. Download o38olap-132545.zip, and unzip the contents, the global_oramag.xml file, to a directory.

4. Follow the directions specified in the global_11g_readme.html documentation file, which is provided in the GLOBAL schema zip file downloaded in step 2. These directions enable you to create the GLOBAL schema with the tables necessary to build an analytic workspace.

5. Launch the analytic workspace manager. From the Windows Start menu, you can find it at {Oracle11g Client Home} -> Integrated Management Tools -> Oracle OLAP Analytic Workspace Manager and Worksheet .

6. Create a connection to your instance, by right-clicking Databases and selecting Add Database to tree. . . . Enter a description for the connection and the appropriate connection string for your instance (such as localhost:1521:orcl11). Then click Create.

7. Expand the Database tree until your new connection appears, and then click + (plus sign) next to your connection to connect to the instance. Enter GLOBAL for the username, and enter the password you assigned in step 4.

8. Expand the Schemas tree until you see Analytic Workspaces . Right-click Analytic Workspaces , and select Create Analytic Workspace from Template. . . .

9. Navigate the file chooser to find the global_oramag.xml template file you downloaded in step 3. Choose this file, and click Create . Wait for a few minutes while it creates the analytic workspace.

10. Right-click the new GLOBAL analytic workspace, and select Maintain Analytic Workspace GLOBAL.

11. Click >> (Add All), and then click Finish . This step loads data into the cube and aggregates it according to the settings specified in the template file. It may take a few minutes to complete. Once you're done, click Close to close the build log.

Run the OLAP query. With the GLOBAL cube loaded and the data aggregated, now enter the following—

 

SELECT t.calendar_year_id time,
          p.class_id product,
          c.region_id region,
          SUM(f.sales) sales
FROM time_dim t,
        product_dim p,
        customer_dim c,
        units_fact f
WHERE t.month_id = f.month_id
   AND p.item_id = f.item_id
   AND c.ship_to_id = f.ship_to_id
   AND c.region_id = 'AMER'
GROUP BY t.calendar_year_id,
              p.class_id,
              c.region_id
ORDER BY t.calendar_year_id,
              p.class_id;


—using Oracle SQL Developer (or another SQL tool of choice). This typical OLAP query summarizes the data in the UNITS_FACT fact table by year, class, and region for only those rows that have a REGION_ID value of AMER.

If there were no summary tables, this query would have to sift through the CUSTOMER_DIM (dimension) table to find all rows with REGION_ID='AMER' and then find rows in the fact table with those SHIP_TO_ID values. Finally, it would have to summarize the SALES column in the fact table, grouping the data by year, class, and region.

With Oracle OLAP 11g, using materialized views, these totals are immediately accessible. It simply fetches the rows of summarized data and returns them to the SQL processor. The result is much faster query response, with a simple summary management system that is easy to maintain.

If you run an explain plan against the query, you'll see that the Oracle optimizer uses the CUBE_SCAN operation instead of the much less efficient HASH JOIN operation, which is required when you don't have the materialized view. To see the difference, execute the commands 

alter materialized view cb$units_cube disable query rewrite

and 

alter materialized view cb$units_cube enable query rewrite


and view the resulting explain plans in each case. You will also see a huge difference in performance. In my tests on the small (300,000 lowest-level rows) GLOBAL fact table, I found that with the query rewrite feature enabled (using the cube), the query returned results two to five times as fast as without the cube. You will find larger differences in performance when using larger, more-realistically sized cubes.

The summary management system is also easy to maintain. It has one materialized view, rather than a separate materialized view for each combination of levels of each dimension. You can control the extent to which the data is precomputed, but the cube always presents itself as fully solved for every combination of the levels of the dimensions involved. A materialized view log keeps track of the rows that change in the base table, and it controls how much of the cube is incrementally updated and aggregated and when.

Additional Calculations Exposed Through Automatic Views

You can use cubes in Oracle OLAP 11g as a summary management solution, but Oracle OLAP 11g really shines when you use it to perform calculations. Oracle OLAP can perform many calculations that can be difficult to express in SQL and time-consuming to perform with the standard relational engine.

Year-to-date and share calculations are good examples of calculations that perform better with Oracle OLAP. Once defined in the analytic workspace as calculated measures , these measures are exposed as extra columns in a view that is automatically created for a cube. You can then query these measures just as you would query any base measure from the cube view.

When working with cube views, bear in mind that the view contains preaggregated data at multiple levels. As a result, you do not need to include any aggregation functions such as SUM, and you generally do not need GROUP BY clauses. But you must also specify the level of data you want to select via the appropriate WHERE clause.

Enter the query in Listing 1 into Oracle SQL Developer (or your SQL tool of choice) to look at quarterly sales and see how those sales compare on a year-to-date basis with the same periods last year. The query also shows how to return multiple levels of the time dimension in one query.

Code Listing 1: Quarterly sales and comparison query

 

SELECT t.long_description time,
          p.long_description product,
          cu.long_description region,
          ch.long_description channel,
          ROUND(u.sales) SALES,
          ROUND(u.sales_ytd) YTD,
          ROUND(u.sales_ytd_dif_yrago) DIFF,
          ROUND(u.sales_ytd_pct_dif_yrago, 6) PCT
FROM time_view t, product_view p, customer_view cu, channel_view ch, units_cube_view u
WHERE t.dim_key = u.time
   AND p.dim_key = u.product
   AND cu.dim_key = u.customer
   AND ch.dim_key = u.channel
   AND (t.level_name IN('CALENDAR_YEAR', 'CALENDAR_QUARTER') 
     OR t.time_calendar_quarter_id = 'CY1999.Q1')
   AND p.level_name IN('CLASS', 'TOTAL')
   AND cu.level_name = 'TOTAL'
   AND ch.level_name = 'TOTAL'
   AND p.dim_key = 'CLASS_SFT'
ORDER BY t.end_date, t.time_span;


Executing the query in Oracle SQL Developer returns the results shown in Figure 2. Note how rows are returned for years, quarters, and three specified months—all in one view. 

This example shows only the most-basic calculations Oracle OLAP 11g can perform. A new calculated measure wizard in the analytic workspace manager enables you to calculate shares, indexes, ranks, moving aggregates, cumulative aggregates, and user-defined expressions. You can even calculate models, forecasts, and regressions.

Refreshing Cubes, Using Materialized Views

As fact tables acquire new data, Oracle Database must maintain cubes and recalculate totals. You can simplify this procedure by taking advantage of the materialized view refresh capability in Oracle Database. By checking the Enable Materialized View Refresh of the Cube check box (shown in Figure 1), you can specify the refresh of an analytic workspace to be the same as the refresh of any other materialized view. For example, to refresh the GLOBAL analytic workspace, use the following command: 

dbms_mview.refresh('CB$UNIT_CUBE', 'F')


This command reloads the analytic workspace from the fact table and recalculates any aggregates that need updating. The mechanisms and syntax are exactly the same as with noncube materialized views, with methods for controlling aspects such as staleness and schedules. The materialized view logs identify which portions of the cube need reloading. Oracle OLAP then uses this information to reload and recalculate the appropriate portions of the cube.

Next Steps



READ more about
Oracle OLAP
"Building Analytic Workspaces"
Oracle OLAP 11g Release 1 User's Guide

DOWNLOAD
oracle.com/technetwork/products/bi/olap
o38olap-132545.zip

Aggregation

Oracle OLAP always returns data as fully solved, but you have control over the extent to which the data is preaggregated. In Oracle OLAP 10g, you could specify which levels of the cube to preaggregate. That capability is still supported in Oracle OLAP 11g, but now you can also specify the percentage of the cube you want to preaggregate. Oracle OLAP will then figure out which areas of the cube to preaggregate and which areas to aggregate on the fly. Oracle OLAP figures out how to balance runtime performance against load performance and the time needed to calculate certain aggregates. Figure 3 shows a cube creation dialog box that specifies 20 percent preaggregation.  

Presenting the Data

As with previous versions of Oracle OLAP, you can access Analytic Workspace cubes via the Java OLAP API. However, Oracle has positioned SQL as the language of choice for accessing data in Oracle OLAP cubes. With views automatically generated that return cube data via standard SQL, any application environment with a SQL interface can access data from cubes. As a result, Oracle OLAP data can be accessed by Oracle applications such as Oracle Business Intelligence Suite Enterprise Edition Plus, Oracle Discoverer, and Oracle Application Express. In addition, any other application development tool that uses SQL can also access Oracle OLAP data. As an example, Figure 4 shows an Oracle Business Intelligence Suite Enterprise Edition report running against the same Oracle OLAP cube data demonstrated in this article. Oracle OLAP provides all of the totals and computes the various calculations being displayed. 

Other New Features

figure 1
Figure 1: Analytic workspace manager
figure 2
Figure 2: Quarterly and comparison results
figure 3
Figure 3: Cube creation dialog box specifying 20 percent preaggregation
figure 4
Figure 4: Oracle Business Intelligence Suite Enterprise Edition accessing Oracle OLAP cube data

Other new features of Oracle OLAP 11g make it easier to work with analytic workspaces and further integrate OLAP into the database engine. There are new storage and partitioning advisors that simplify the process of designing cubes for optimal storage and maintenance. Furthermore, new data security policies make it easier to grant access to different portions of cubes to various users. A detailed discussion of these capabilities is beyond the scope of this article. For more information, see the Oracle OLAP 11g Release 1 (11.1) User's Guide.

Conclusion

With the enhanced capabilities in Oracle OLAP 11g, it is much easier to incorporate OLAP into the framework of a data warehouse. Key improvements such as materialized views, automatic view creation, and aggregation wizards make Oracle OLAP a compelling choice for accelerating query performance.  



Dan Vlamis
(dvlamis@vlamis.com) is president of Vlamis Software Solutions, Inc. He has been working with Oracle business intelligence products since 1986. He gives presentations regularly at major Oracle conferences and is a recognized Oracle OLAP and Oracle business intelligence expert.

Send us your comments