As Published In
Oracle Magazine
March/April 2005

TECHNOLOGY: Data Warehousing


Room with a Better View

By Arup Nanda Oracle ACE Director

Get a good, fast look at your data with materialized views.

Users at Acme Bank were complaining that typical and repeated queries were taking too long to execute. Because these queries are executed over and over by many users, any improvement in response time is bound to help.

The DBAs at Acme Bank have tuned the most commonly used queries, all the needed indexes are present, and no further SQL tuning is going to make any difference to the performance of these queries.

The Acme Bank DBAs' solution: Use materialized views (MVs).

This article discusses how to plan for MVs, how to set up and confirm different MV capabilities, how to automatically generate the scripts to create MVs, how to make query rewrite (QR) available, and how to make sure that QR gets used.

From Queries to Views

The following is a common query at Acme Bank: 

SELECT acc_type, SUM(cleared_bal) totbal
FROM accounts
GROUP BY acc_type;


And the following is an MV, mv_bal , for this query: 

CREATE OR REPLACE MATERIALIZED VIEW mv_bal
REFRESH ON DEMAND AS
SELECT acc_type, SUM(cleared_bal) totbal
FROM accounts
GROUP BY acc_type;


MVs are segments similar to tables, in which the output of queries is stored in the database.

Now suppose a user wants to get the total of all account balances for the account type 'C' and issues the following query: 

SELECT SUM(cleared_bal)
FROM accounts
WHERE acc_type = 'C';


Because the mv_bal MV already contains the totals by account type, the user could have gotten this information directly from the MV, by issuing the following: 

SELECT totbal
FROM mv_bal
WHERE acc_type = 'C';


This query against the mv_bal MV would have returned results much more quickly than the query against the accounts table. Running a query against the MV will be faster than running the original query, because querying the MV does not query the source tables.

But there is another difference between querying the source tables and the MV. When a user executes a query against the source tables, the results return the current data. Querying the MV, however, often returns the data as of the time the MV was created . And because the MV is not updated when the data in the source tables is, it is bound to get out of sync.

To keep the data in sync, the MV is refreshed from time to time, either manually or automatically. There are two ways to refresh data in MVs. In one of them, the MV is completely wiped clean and then repopulated with data from the source tables—a process known as complete refresh . In some cases, however, when the source tables may have changed very little, it is possible to refresh the MV only for changed records on the source tables—a process known as fast refresh . To use fast refresh, however, you must have created the MV as fast-refreshable. Because it updates only changed records, fast refresh is faster than complete refresh. (See the Oracle Database Data Warehousing Guide for more information on refreshing MVs.)

To make sure that users will query MVs at Acme Bank, even if they don't know anything about the MVs, the bank uses the QR feature. With QR, the database engine can rewrite the original query from the user to use an MV, automatically , so that the user need not be aware of all the MVs in place. In some cases, such as in an OLTP system, it may not be desirable to query from an MV that may not be current with the data in the source tables. QR can be disabled in such cases—either databasewide, for a specific session, or just for specific MVs.

The DBA can enable QR for a session or for the entire system, by setting the QUERY_REWRITE_INTEGRITY parameter to true. For any MV to be used in a QR, the MV must have been built to include the ENABLE QUERY REWRITE clause. For example, the following creates the acc_mgr_view MV with the QR feature from the complex Acme Bank query shown in Listing 1: 

CREATE OR REPLACE MATERIALIZED VIEW
acc_mgr_view
ENABLE QUERY REWRITE AS
<the query shown in Listing 1>
/


Code Listing 1: Original query 

select acc_mgr_id,
   acc_type_desc,
   decode
      (a.sub_acc_type,null,'?', sub_acc_type_desc)
                                                sub_acc_type_desc,
   sum(cleared_bal)                             tot_cleared_bal,
   sum(uncleared_bal)                           tot_uncleared_bal,
   avg(cleared_bal)                             avg_cleared_bal,
   avg(uncleared_bal)                           avg_uncleared_bal,
   sum(cleared_bal+uncleared_bal)       tot_total_bal,
   avg(cleared_bal+uncleared_bal)       avg_total_bal,
   min(cleared_bal+uncleared_bal)       min_total_bal
from balances b,
   accounts a,
   acc_types at,
   sub_acc_types sat
where a.acc_no = b.acc_no
and at.acc_type = a.acc_type
and sat.sub_acc_type (+) = a.sub_acc_type
group by  acc_mgr_id, acc_type_desc,
   decode
      (a.sub_acc_type,null,'?', sub_acc_type_desc)


Estimating Space

Converting the query in Listing 1 to an MV is easy, but there are several questions to answer before creating the MV, including how much space it will occupy and how many rows it will contain.

The DBMS_MVIEW package in Oracle Database 10g will help get these answers. To estimate the size of the proposed MV, the script in Listing 2 calls DBMS_MVIEW.ESTIMATE_MVIEW_SIZE.

Code Listing 2: Estimating MV size 

set serveroutput on size 999999
declare
   l_num_rows  number;
   l_num_bytes number;
   l_stmt      varchar2(2000);
begin
   l_stmt := 'select acc_mgr_id,
      <the query shown in Listing 1>
      (a.sub_acc_type,null,''?'', sub_acc_type_desc)';
   dbms_mview.estimate_mview_size
   (
      stmt_id       => 'Est1',
      select_clause => l_stmt,
      num_rows      => l_num_rows,
      num_bytes     => l_num_bytes
   );
   dbms_output.put_line('Number of rows = '||l_num_rows);
   dbms_output.put_line('Size (bytes) = '||l_num_bytes);
end;
/

Number of rows = 2829000
Size (bytes) = 667644000


The output in Listing 2 estimates that the MV will contain 2,829,000 rows and that it's going to be about 667MB in size. These are approximations made from optimizer statistics gathered on the source tables earlier, and the exact values may be different. But it helps plan for space and determine in which tablespace to place this MV.

Checking Capabilities

The DBMS_MVIEW.EXPLAIN_MVIEW procedure checks the features and capabilities of an MV before it is created and writes the results to a table named MV_CAPABILITIES_TABLE . First, the DBA creates this table, by running the utlxmv.sql script in the rdbms/admin directory under Oracle Home.

Listing 3 uses the DBMS_MVIEW.EXPLAIN_MVIEW procedure to see what type of operations the proposed MV can be used for.

After the first part of Listing 3 populates the results in MV_CAPABILITIES_TABLE , the second part (after the -- Now check the capabilities comment) goes on to select from that table, as shown in the SELECT ROWNUM, CAPABILITY_NAME, ... query. To aid in the explanation of the output, the query uses ROWNUM to denote line numbers.

Code Listing 3: Checking capabilities 

           declare
   l_stmt      varchar2(2000);
begin
   l_stmt := 'select acc_mgr_id,
       <the query shown in Listing 1>
      (a.sub_acc_type,null,''?'', sub_acc_type_desc)';
   dbms_mview.explain_mview
   (
      stmt_id => 'MV_Tune1',
      mv      => l_stmt
   );
end;
/
--
-- Now check the capabilities
--
SELECT ROWNUM, CAPABILITY_NAME, POSSIBLE, MSGTXT, RELATED_TEXT
FROM mv_capabilities_table
WHERE STATEMENT_ID = 'Est1'
AND CAPABILITY_NAME LIKE 'REFRESH%'
ORDER BY SEQ
/
-- Output
--  LN    CAPABILITY_NAME                 P     MSGTXT                   RELATED_TEXT
---    ------------------------------     --    -----------------------  --------------- 
  1    REFRESH_COMPLETE                   Y 
  2    REFRESH_FAST                       N 
  3    REFRESH_FAST_AFTER_INSERT          N     agg(expr) requires         AVG_CLEARED_BAL  
                                                correspondingCOUNT(expr)
                                                function                              
  4    REFRESH_FAST_AFTER_INSERT          N     one or more joins present 
                                                in mv                   
  5    REFRESH_FAST_AFTER_INSERT          N     GROUP BY clause in mv 
  6    REFRESH_FAST_AFTER_INSERT          N     aggregate function in mv 
  7    REFRESH_FAST_AFTER_INSERT          N     the detail table does not   ARUP.SUB_ACC_TYPES
                                                not have a materialized 
                                                view log            
  8    REFRESH_FAST_AFTER_ONETAB_DML      N     SUM(expr) without            TOT_TOTAL_BAL   
                                                COUNT(expr)
  9    REFRESH_FAST_AFTER_ONETAB_DML      N     SUM(expr) without            TOT_UNCLEARED_BAL 
                                                COUNT(expr)
 10    REFRESH_FAST_AFTER_ONETAB_DML      N     SUM(expr) without            TOT_CLEARED_BAL 
                                                COUNT(expr)
 11    REFRESH_FAST_AFTER_ONETAB_DML      N     see the reason why 
                                                REFRESH_FAST_AFTER_INSERT 
                                                is disabled 
 12   REFRESH_FAST_AFTER_ANY_DML          N     see the reason why 
                                                REFRESH_FAST_AFTER_ONETAB_DML 
                                                is disabled 
 13    REFRESH_FAST_PCT                   N     PCT is not possible on any 
                                                of the detail tables in the 
                                                materialized view 

                            

The results in Listing 3 tell the story. The CAPABILITY_NAME column lists the different refresh capabilities, and the column POSSIBLE (P) shows via a simple Y (yes) or N (no) whether that capability is possible in this MV. The MV is capable of being completely refreshed, as shown in line 1, but it's not capable of being fast-refreshed, as shown in line 2. Why not? The DBA goes down the list to see why the refresh-related features are not possible. Line 3 indicates that the fast refresh is not possible because the MV does not have a COUNT() expression in the query where aggregation functions such as SUM() are used. If an MV uses aggregation functions, then COUNT() must be in the query to make it fast-refreshable. The column RELATED_TEXT in the output shows which columns in the MV are being referred to in the MSGTXT column. The result agg(expr) —short for aggregation(expression)—in MSGTXT in line 3 refers to the column AVG_CLEARED_BAL in the MV. Because the AVG(CLEARED_BAL) clause is used in the query to build the MV, there must additionally be a COUNT(CLEARED_BAL) or COUNT(*) expression to make the MV fast-refreshable.

Additionally, fast refreshes require the creation of materialized view logs on the base table. These logs record the changes occurring in the base tables. During fast refresh, these logs are read by the refresh process to determine the changes to apply to the MV. In line 7, the MSGTXT column shows the detail table does not have a materialized view log and the RELATED_TEXT column shows the table on which MV logs are not present— ARUP.SUB_ACC_TYPES.

Generating Scripts

After the Acme DBA uses the information in Listing 3 to make the planned MV capable of fast refresh, it's time to generate the file that will generate the MV. The procedure in Listing 4 specifies a task name and passes it and the SQL query (in Listing 1) to the DBMS_ADVISOR.TUNE_MVIEW procedure to generate all necessary SQL statements to create the MV. After the DBA runs the script in Listing 4, running the following script generates the recommendations of the DBMS Advisor: 

CREATE DIRECTORY TMP_DIR AS '/tmp'
/

BEGIN
   DBMS_ADVISOR.CREATE_FILE (
      DBMS_ADVISOR.GET_TASK_SCRIPT 
         ('MV_Tune_Task1'),
   'TMP_DIR', 'mv_tune_task1.sql'
);
END;


Code Listing 4: Generating scripts 

declare
   l_stmt               varchar2(2000);
   l_task_name        varchar2(30);
begin
   l_task_name := 'MV_Tune_Task1';
   l_stmt        := 'create materialized view acc_mgr_view
                       enable query rewrite as
                       ... <the query shown in Listing 1>...';
   dbms_advisor.tune_mview
   (
      task_name        => l_task_name,
      mv_create_stmt => l_stmt
   );
end;
/


This creates a file named mv_tune_task1.sql in the /tmp directory that contains all the SQL statements needed to create the MV logs on all the necessary base tables and create the MV with all the appropriate parameters to make it fast-refreshable. Running the mv_tune_task1.sql file in SQL*Plus will create all the necessary objects. The mv_tune_task1.sql file is available as Listing 8.

Confirming Query Rewrite

Listing 5 contains another popular user query at Acme Bank; it uses column, table, and function information similar to that of the query in the acc_mgr_view MV. Listing 5 also uses the AUTOTRACE feature of SQL*Plus to provide information on whether QR occurs when the query runs. The AUTOTRACE output clearly shows that the query uses all the source tables, not the materialized view, acc_mgr_view . Why?

Code Listing 5: Confirming query rewrite 

                               
SQL> alter session set query_rewrite_enabled = true;
Session altered.

SQL> set autotrace traceonly explain
SQL> select acc_type_desc, sum(cleared_bal)
  2   from balances b, accounts a, acc_types at
  3     where a.acc_no = b.acc_no
  4     and at.acc_type = a.acc_type
  5   group by acc_type_desc;

Execution Plan
-------------------------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=533 Card=4 Bytes=112)
   1    0   SORT (GROUP BY) (Cost=533 Card=4 Bytes=112)
   2    1     HASH JOIN (Cost=523 Card=50000 Bytes=1400000)
   3    2       HASH JOIN (Cost=146 Card=50000 Bytes=850000)
   4    3       TABLE ACCESS (FULL) OF 'ACC_TYPES' (TABLE) (Cost=3 Card=4 Bytes=40)
   5    3       TABLE ACCESS (FULL) OF 'ACCOUNTS' (TABLE) (Cost=140 Card=100000 Bytes=700000)
   6    2     TABLE ACCESS (FULL) OF 'BALANCES' (TABLE) (Cost=106 Card=100000 Bytes=1100000)

                            

Running the utlxrw.sql script in $ORACLE_HOME/rdbms/admin sets up a table named REWRITE_TABLE , whose results will help determine why the optimizer did not use QR on the query in Listing 5. Running the DBMS_VIEW.EXPLAIN_REWRITE procedure, shown in Listing 6, populates the REWRITE_TABLE table with the findings. Querying the MESSAGE column in REWRITE_TABLE provides the reason why: 

SQL> SELECT MESSAGE FROM
REWRITE_TABLE;

QSM-01110: a lossy join in MV, 
ACC_MGR_VIEW, between tables, ACCOUNTS and SUB_ACC_TYPES, not found in query


Code Listing 6: Checking for query rewrite 

truncate table rewrite_table
/

declare
   l_stmt          varchar2(2000);
   l_task_name   varchar2(30);
begin
   l_stmt := 'select 
       acc_type_desc,
       sub_acc_type,
       sum(cleared_bal)  
from 
   balances b, 
   accounts a, 
   acc_types at
where 
   a.acc_no = b.acc_no
and 
   at.acc_type = a.acc_type
group by 
   acc_type_desc, sub_acc_type';

   dbms_mview.explain_rewrite
   (
      query           => l_stmt,
      mv              => 'ACC_MGR_VIEW',
      statement_id  => 'MV_Explain_RW1'
   );
end;
/
commit
/
select message from rewrite_table
/


The result shows that QR will not occur because the joins used in the MV and in the query are different. In the MV, four tables— ACCOUNTS, BALANCES, ACC_TYPES , and ACC_SUB_TYPES —are joined. However, the new query joins only three tables, leaving SUB_ACC_TYPES , which is reported in the MESSAGE column. The absence of this join would have meant inaccurate results if QR were used, so the optimizer decided not to use it.

Rewriting the query in Listing 5 to include another predicate, AND SAT.ACC_SUB_TYPE = A.ACC_SUB_TYPE , and place the table ACC_SUB_TYPE SAT in the FROM clause should make QR possible. This revised query is in Listing 7. After the DBA passes the revised query to the procedure in Listing 6, examining the REWRITE_TABLE provides the good news: 

SQL> SELECT MESSAGE 
         FROM REWRITE_TABLE;

QSM-01033: query rewritten with 
materialized view, ACC_MGR_VIEW


Code Listing 7: Rewrite of Listing 5, confirming query rewrite 

                               
SQL> set autotrace traceonly explain
SQL> select acc_type_desc,
  2           sum(cleared_bal)
  3   from balances b, 
  4          accounts a,
  5          acc_types at,
  6          sub_acc_types sat,
  7     where a.acc_no = b.acc_no
  8        and at.acc_type = a.acc_type
  9        and sat.sub_acc_type = a.sub_acc_type
 10   group by acc_type_desc

SQL> /

Execution Plan
----------------------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=6 Card=4 Bytes=64)
   1    0   SORT (GROUP BY) (Cost=6 Card=4 Bytes=64)
   2    1     MAT_VIEW REWRITE ACCESS (FULL) OF 'ACC_MGR_VIEW'
(MAT_VIEW REWRITE) (Cost=5 Card=1600 Bytes=25600)


Running the revised query in Listing 7 confirms that the optimizer rewrites the query and generates the execution plan, also included in Listing 7. This plan shows that the ACC_MGR_VIEW MV is used instead of the base tables, even though the user specified them in the query. QR is indeed happening.

Code Listing 8: Generated script for creating MV and MV logs 

                               
Rem  SQL Access Advisor: Version 10.1.0.1 - Production
Rem  
Rem  Username:        ARUP
Rem  Task:              MV_Tune_Task1
Rem  Execution date:   
Rem  

set feedback 1
set linesize 80
set trimspool on
set tab off
set pagesize 60

whenever sqlerror CONTINUE

CREATE MATERIALIZED VIEW LOG ON
    "ARUP"."ACCOUNTS"
    WITH ROWID, SEQUENCE("ACC_NO","ACC_TYPE","SUB_ACC_TYPE","ACC_MGR_ID")
    INCLUDING NEW VALUES;

ALTER MATERIALIZED VIEW LOG FORCE ON
    "ARUP"."ACCOUNTS"
    ADD ROWID, SEQUENCE("ACC_NO","ACC_TYPE","SUB_ACC_TYPE","ACC_MGR_ID")
    INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON
    "ARUP"."BALANCES"
    WITH ROWID, SEQUENCE("ACC_NO","CLEARED_BAL","UNCLEARED_BAL")
    INCLUDING NEW VALUES;

ALTER MATERIALIZED VIEW LOG FORCE ON
    "ARUP"."BALANCES"
    ADD ROWID, SEQUENCE("ACC_NO","CLEARED_BAL","UNCLEARED_BAL")
    INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON
    "ARUP"."ACC_TYPES"
    WITH ROWID, SEQUENCE("ACC_TYPE","ACC_TYPE_DESC")
    INCLUDING NEW VALUES;

ALTER MATERIALIZED VIEW LOG FORCE ON
    "ARUP"."ACC_TYPES"
    ADD ROWID, SEQUENCE("ACC_TYPE","ACC_TYPE_DESC")
    INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON
    "ARUP"."SUB_ACC_TYPES"
    WITH ROWID, SEQUENCE("SUB_ACC_TYPE","SUB_ACC_TYPE_DESC")
    INCLUDING NEW VALUES;

ALTER MATERIALIZED VIEW LOG FORCE ON
    "ARUP"."SUB_ACC_TYPES"
    ADD ROWID, SEQUENCE("SUB_ACC_TYPE","SUB_ACC_TYPE_DESC")
    INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW ARUP.ACC_MGR_VIEW
    REFRESH FAST WITH ROWID
    ENABLE QUERY REWRITE
    AS SELECT ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE_DESC C1, ARUP.ACC_TYPES.ACC_TYPE_DESC
C2, ARUP.ACCOUNTS.ACC_MGR_ID C3, ARUP.ACCOUNTS.SUB_ACC_TYPE C4, SUM("ARUP"."BALANCES"."CLEARED_BAL")
M1, COUNT("ARUP"."BALANCES"."CLEARED_BAL") M2, SUM("ARUP"."BALANCES"."UNCLEARED_BAL")
M3, COUNT("ARUP"."BALANCES"."UNCLEARED_BAL") M4, SUM(("ARUP"."BALANCES"."UNCLEARED_BAL"
+ "ARUP"."BALANCES"."CLEARED_BAL")) M5, COUNT(("ARUP"."BALANCES"."UNCLEARED_BAL"
+ "ARUP"."BALANCES"."CLEARED_BAL")) M6, COUNT(*) M7 FROM ARUP.SUB_ACC_TYPES,
ARUP.ACC_TYPES, ARUP.BALANCES, ARUP.ACCOUNTS WHERE ARUP.ACCOUNTS.SUB_ACC_TYPE
= ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE AND ARUP.ACCOUNTS.ACC_TYPE = ARUP.ACC_TYPES.ACC_TYPE
AND ARUP.ACCOUNTS.ACC_NO = ARUP.BALANCES.ACC_NO GROUP BY ARUP.SUB_ACC_TYPES.SUB_ACC_TYPE_DESC,
ARUP.ACC_TYPES.ACC_TYPE_DESC, ARUP.ACCOUNTS.ACC_MGR_ID, ARUP.ACCOUNTS.SUB_ACC_TYPE;

whenever sqlerror EXIT SQL.SQLCODE

begin
  dbms_advisor.mark_recommendation('MV_Tune_Task1',1,'IMPLEMENTED');
end;
/

                            

Preventing Runaway Queries

As demonstrated in Listing 5, if the user writes the query a little differently, the optimizer does not rewrite the query to select from the MV and the query selects from the base tables directly. The user may think that the query is rewritten and that the results are coming from the MV, but in fact, the query is not rewritten and the results are coming from the source tables, extending the response time significantly.

Next Steps


 READ more about DBMS_MVIEW

The REWRITE_OR_ERROR hint in the SQL query will make sure the query is rewritten to select from the MV or at least inform the user if that does not happen. The following adds the REWRITE_OR_ERROR hint to the SQL query in which QR did not occur (Listing 5): 

SELECT /*+ REWRITE_OR_ERROR */
       SELECT acc_type_desc, 
       sub_acc_type_desc,
...


The query returns the following error message if it does not rewrite: 

ORA-30393: a query block in the 
statement did not rewrite


Conclusion

Oracle provides tools for creating, managing, and tuning MVs in the DBMS_MVIEW package. Acme Bank is seeing increased query performance, using QR whenever possible and using error messages from queries that do not rewrite to further optimize its use of MVs and QR.


Arup Nanda (arup@proligence.com) is the manager of Database Systems at Starwood Hotels and Resorts in White Plains, New York. He is the recipient of Oracle Magazine's 2003 DBA of the Year award and the coauthor of Oracle Privacy Security Auditing, from Rampant Press (rampant-books.com), 2003.


Send us your comments