Using the SQL MODEL Clause to Define Inter-Row Calculations
In this tutorial you learn how to use the Oracle Database 10g SQL MODEL clause to perform inter-row calculations.
Approximately 30 minutes
This tutorial covers the following topics:
Place the cursor over this icon to load and view all the screenshots for this tutorial. (Caution: This action loads all screenshots simultaneously, so response time may be slow depending on your Internet connection.)
Note: Alternatively, you can place the cursor over an individual icon in the following steps to load and view only the screenshot associated with that step. You can hide an individual screenshot by clicking it.
With the SQL MODEL clause, you can define a multidimensional array on query results and then apply rules on the array to calculate new values. The rules can be sophisticated interdependent calculations. By integrating advanced calculations into the database, performance, scalability, and manageability are enhanced significantly compared to external solutions. Rather than copying data into separate applications or PC spreadsheets, users can keep their data within the Oracle environment.
The MODEL clause defines a multidimensional array by mapping the columns of a query into three groups: partitioning, dimension, and measure columns. These elements perform the following tasks:
Partitions define logical blocks of the result set in a way similar to the partitions of the analytical functions (described in the chapter titled "SQL for Analysis in Data Warehouses" in the Data Warehousing Guide). MODEL rules are applied to the cells of each partition. |
|
Dimensions identify each measure cell within a partition. These columns identify characteristics such as date, region, and product name. |
|
Measures are analogous to the measures of a fact table in a star schema. They typically contain numeric values such as sales units or cost. Each cell is accessed within its partition by specifying its full combination of dimensions. |
To create rules on these multidimensional arrays, you define computation rules expressed in terms of the dimension values. The rules are flexible and concise, and can use wild cards and FOR loops for maximum expressiveness. Calculations built with the MODEL clause improve on traditional spreadsheet calculations by integrating analyses into the database, improving readability with symbolic referencing, and providing scalability and much better manageability.
The figure below gives a conceptual overview of the model feature using a hypothetical sales table. The table has columns for country, product, year, and sales amount. The figure has three parts. The top segment shows the concept of dividing the table into partitioning, dimension, and measure columns. The middle segment shows two hypothetical rules that forecast sales for Prod1 and Prod2 as the calculated value of product sales from the two previous years. Finally, the third part shows the output of a query applying the rules to such a table with hypothetical data. The black output is data retrieved from the database, whereas the blue output shows rows calculated from rules. Note that the rules are applied within each partition.
Columns mapped to Partition, Dimension, and Measure
COUNTRY | PRODUCT | YEAR | SALES |
Partition | Dimension | Dimension | Measure |
Rules:
sales('prod1', 2002) = sales('prod1', 2000) + sales('prod1', 2001) sales('prod2', 2002) = sales('prod2', 2000) + sales('prod2', 2001) |
Output of the MODEL clause:
COUNTRY | PRODUCT | YEAR | SALES |
Partition | Dimension | Dimension | Measure |
A | prod1 | 2000 | 10 |
A | prod1 | 2001 | 15 |
A | prod2 | 2000 | 12 |
A | prod2 | 2001 | 16 |
B | prod1 | 2000 | 21 |
B | prod1 | 2001 | 23 |
B | prod2 | 2000 | 28 |
B | prod2 | 2001 | 29 |
A | prod1 | 2002 | 25 |
A | prod2 | 2002 | 28 |
B | prod1 | 2002 | 44 |
B | prod2 | 2002 | 57 |
Note that the MODEL clause does not update existing data in tables, nor does it insert new data into tablesto change values in a table, the model results must be supplied to an INSERT, UPDATE, or MERGE statement.
By using the MODEL clause, you can bring spreadsheet calculations into the database. You use the Sales History schema data and issue SELECT statements containing the new MODEL clause to perform spreadsheet like calculations. You identify rules to find out information such as sales forecasting.
Before starting this tutorial, you should:
1. |
Perform the Installing Oracle Database 10g on Windows tutorial. |
2. |
Download and unzip model_clause.zip into your working directory (i.e. c:\wkdir) |
Use the SH schema to create a view. This view provides annual sums for product sales in dollars and units, by country, and aggregated across all channels.
1. |
Start a SQL *Plus session. Select Start > Programs > Oracle-OraDB10g_home > Application Development > SQL Plus. (Note: This tutorial assumes you have an c:\wkdir folder. If you do not, you will need to create one and unzip the contents of model_clause.zip into this folder. While executing the scripts, paths are specified) |
2. |
Log in as the SH user. Enter SH as the User Name and SH as the Password. Then click OK. |
3. |
First, make sure that you have a clean environment. Run the cleanup.sql script from your SQL*Plus session. @c:\wkdir\cleanup.sql The cleanup.sqlscript contains the following: DROP VIEW sales_view;
|
4. |
Now you can create the SALES_VIEW view. From your SQL*Plus session, execute the following script: @c:\wkdir\sample_data.sql The sample_data.sqlscript contains the following: CREATE VIEW sales_view AS SELECT country_name country, prod_name prod, calendar_year year, SUM(amount_sold) sale, COUNT(amount_sold) cnt FROM sales, times, customers, countries, products WHERE sales.time_id = times.time_id AND sales.prod_id = products.prod_id AND sales.cust_id = customers.cust_id AND customers.country_id = countries.country_id GROUP BY country_name, prod_name, calendar_year /
|
5. |
Verify that the view is created correctly and that 3219 rows exist. From your SQL*Plus session, execute the following script: @c:\wkdir\sel_sv.sql The sel_sv.sqlscript contains the following: SELECT COUNT(*) FROM sales_view;
|
6. |
To maximize performance, your system should already have a materialized view built on the data that is used by the above view. The materialized view is created during the installation of the SH schema data. Oracle's summary management system will automatically rewrite any query using the above view so that it takes advantage of the materialized view. |
As an initial example of models, consider the following statement:
SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country IN ('Italy','Japan') MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000], sales['Y Box', 2002] = sales['Y Box', 2001], sales['2_Products', 2002] = sales['Bounce', 2002] + sales['Y Box', 2002]) ORDER BY country, prod, year;
The results are:
COUNTRY PROD YEAR SALES -------------------- --------------- ---------- ---------- Italy 2_Products 2002 90387.54 Italy Bounce 2002 9179.99 Italy Y Box 2002 81207.55 Japan 2_Products 2002 101071.96 Japan Bounce 2002 11437.13 Japan Y Box 2002 89634.83
This statement partitions by country, so the rules are applied to the data of one country at a time. Note that the data ends with 2001, so any rules defining values for 2002 or later will insert new cells. The first rule defines the sales of Bounce in 2002 as the sum of sales in 2000 and 2001. The second rule defines the sales for Y Box in 2002 as being the same value as they were for 2001. The third rule defines a category called 2_Products, which is the sum of adding the 2002 Bounce and Y Box values together. Note that the values for 2_Products are derived from the results of the two prior rules, so those rules must be executed before the 2_Products rule.
Note that the RETURN UPDATED ROWS clause following the MODEL keyword limits the results to just those rows that were created or updated in this query. Using this clause is a convenient way to limit result sets to just the newly calculated values. You will use the RETURN UPDATED ROWS clause throughout the examples. |
||
The RULES keyword, shown in the examples at the start of the rules, is optional, but recommended for easier reading. | ||
Many of the examples shown do not require ORDER BY on the COUNTRY column. It is included in the specification in case you want to modify the examples and add multiple countries. |
The following examples move through the major features of
the MODEL
clause, building from basic cell references to reference models and iterative
models.
1. |
You want to view the SALES value for the product Bounce in the year 2000, in Italy, and set it to 10. To do so, use a "positional cell reference." The value for the cell reference is matched to the appropriate dimension based on its position in the expression. The DIMENSION BY clause of the model determines the position assigned to each dimension—in this case, the first position is product (PROD) and the second position is YEAR. From your SQL*Plus session, execute the following script: @c:\wkdir\pos_cell1.sql The pos_cell1.sqlscript contains the following: COLUMN country FORMAT a20 COLUMN prod FORMAT a20 SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2000] = 10 ) ORDER BY country, prod, year /
|
2. |
You want to create a forecast value of SALES for the product Bounce in the year 2005, in Italy, and set it to 20. Use a rule in the SELECT statement that sets the year value to 2005 and thus create a new cell in the array. From your SQL*Plus session, execute the following script: @c:\wkdir\pos_cell2.sql The pos_cell2.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2005] = 20 ) ORDER BY country, prod, year / Note: If you want to create new cells, such as values for future years, you must use positional references or FOR loops (discussed later in this tutorial). That is, positional reference permits both updates and inserts into the array. This is called the UPSERT process and is handled with the Oracle SQL MERGE statement.
|
3. |
You want to update the SALES for the product Bounce in all years after 1999 where the values are recorded for Italy and set them to 10. To do so, use a "symbolic cell reference." The value for the cell reference is matched to the appropriate dimension using Boolean conditions. You can use all the normal operators such as <,>, IN, and BETWEEN. In this case the query looks for product value equal to Bounce and any year value greater than 1999. This shows how a single rule can access multiple cells. From your SQL*Plus session, execute the following script: @c:\wkdir\sym_cell1.sql The sym_cell1.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales[prod='Bounce', year>1999] = 10 ) ORDER BY country, prod, year / Note: Symbolic references are very powerful, but they are used solely for updating existing cells: they cannot create new cells such as sales projections in future years. |
4. |
You want a single query to update the sales for several products in several years for multiple countries, and you also want it to insert new cells. By placing several rules into one query, processing is more efficient because it reduces the number of times needed to access the data. It also allows for more concise SQL, which supports higher developer productivity. From your SQL*Plus session, execute the following script: @c:\wkdir\pos_sym.sql The pos_sym.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country IN ('Italy','Japan') MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2002] = sales['Bounce', year = 2001] , --positional notation: can insert new cell sales['Y Box', year>2000] = sales['Y Box', 1999], --symbolic notation: can update existing cell sales['2_Products', 2005] = sales['Bounce', 2001] + sales['Y Box', 2000] ) --positional notation: permits insert of new cells --for new product ORDER BY country, prod, year / The example data has no values beyond the year 2001, so any rule involving the year 2002 or later requires insertion of a new cell. The same applies to any new product name defined here. In the third rule, 2_Products is defined as a product with sales in 2005 which equal the sum of Bounce in 2001 and Y Box in 2000. The first rule, for Bounce in 2002, inserts new cells since it is positional notation. The second rule, for Y Box, uses symbolic notation, but since there are already values for Y Box in the year 2001, it updates those values. The third rule, for 2_Products in 2005, is positional, so it can insert new cells, and you will see them in the output.
|
1. |
You want to forecast the sales of Bounce in Italy for the year 2005 to be 100 more than the maximum sales in the period 1999 to 2001. To do so, you need to use the BETWEEN clause to specify multiple cells on the right side of the rule, and these are aggregated to a single value with the MAX() function. From your SQL*Plus session, execute the following script: @c:\wkdir\multi_c.sql The multi_c.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2005] = 100 + max(sales)['Bounce', year BETWEEN 1998 AND 2002] ) ORDER BY country, prod, year / Note that aggregate functions can appear only on the right side of rules. Arguments to the aggregate function can be constants, bind variables, measures of the MODEL clause, or expressions involving them. |
The CV() function is a very powerful tool that makes rule creation highly productive. CV() is used on the right side of rules to copy the current value of a dimension specified on the left side. It is helpful wherever the left side specifications refer to multiple cells. In terms of relational database concepts, it acts like a join operation.
CV() allows for very flexible expressions. For instance, by subtracting from the CV(year) value you can refer to other rows in the data set. If you have the expression 'CV(year) -2' in a cell reference, you can access data from two years earlier. CV() functions are most commonly used as part of a cell reference, but they can also be used outside a cell reference as freestanding elements of an expression.
1. |
You want to update the sales values for Bounce in Italy for multiple years, using a rule where each year's sales is the sum of Mouse Pad sales for that year, plus 20% of the Y Box sales for that year. From your SQL*Plus session, execute the following script: @c:\wkdir\cvf1.sql The cvf1.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', year BETWEEN 1995 AND 2002] = sales['Mouse Pad', cv(year)] + 0.2 * sales['Y Box', cv(year)]) ORDER BY country, prod, year / Note that, in the above results, you see values for only years 1999–2001 although any year in the range 1995–2002 is accepted. This is because the table has data for only those years. The CV() function provides the current value of a DIMENSION BY key of the cell currently referenced on the left side. When the left side of the rule above references the cell Bounce and 1999, the right side expression would resolve to: sales['Mouse Pad', 1999] + 0.2 * sales['Y Box', 1999] Similarly, when the left side references the cell Bounce and 2000, the right side expression evaluates to: sales['Mouse Pad', 2000] + 0.2 * sales['Y Box', 2000] The CV() function takes a dimension key as its argument. It is also possible to use CV() without any argument as in CV() and in which case, positional referencing is implied. The above rule can also be written as: s['Bounce', year BETWEEN 1995 AND 2002] = s['Mouse Pad', cv()] + 0.2 * s['Y Box', cv()] CV() functions can be used only in right side cell references.
|
2. | You want to calculate the year-over-year percent growth in sales for products Y Box, Bounce and Mouse Pad in Italy. From your SQL*Plus session, execute the following script:
@c:\wkdir\cvf2.sql The cvf2.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales, growth FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales, 0 growth) RULES ( growth[prod in ('Bounce','Y Box','Mouse Pad'), year between 1998 and 2001] = 100* (sales[cv(prod), cv(year)] - sales[cv(prod), cv(year) -1] ) / sales[cv(prod), cv(year) -1] ) ORDER BY country, prod, year / Note that the blank cells in the results are NULLs. The rule results in a NULL if there is no value for the product two years earlier. None of the products has a value for 1998, so in each case the 1999 growth calculation is NULL. |
3. | A wild card operator is very useful for cell specification, and you
can use the ANY keyword
for this purpose. You can use it with the previous example to replace
the specification year between 1998 and 2001 as shown below.
ANY can be used in cell references to include all dimension values including NULLs. In symbolic reference notation, use the phrase IS ANY. Note that the ANY wildcard prevents cell insertion when used with either positional or symbolic notation. From your SQL*Plus session, execute the following script: @c:\wkdir\any.sql The any.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales, growth FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales, 0 growth) RULES ( growth[prod in ('Bounce','Y Box','Mouse Pad'), ANY] = 100* (sales[cv(prod), cv(year)] - sales[cv(prod), cv(year) -1] ) / sales[cv(prod), cv(year) -1] ) ORDER BY country, prod, year / This query gives the same results as the previous query because the full data set ranges from 1998 to 2001, and that is the range that was specified in the previous query. |
The MODEL clause provides a FOR construct that can be used inside rules to express computations more concisely. The FOR construct is allowed on both sides of rules. For example, consider the following rules that estimate the sales of several products for year 2005 to be 30% higher than their sales for year 2001:
RULES
(
sales['Mouse Pad', 2005] = 1.3 * sales['Mouse Pad', 2001],
sales['Bounce', 2005] = 1.3 * sales['Bounce', 2001],
sales['Y Box', 2005] = 1.3 * sales['Y Box', 2001]
)
By using positional notation on the left side of the rules, you ensure that cells for these products in the year 2005 will be inserted if they are not previously present in the array. This is rather bulky because you may require as many rules as there are products. If you work with dozens of products, it becomes an unwieldy approach.
You can reword this computation so it is concise and has exactly the same behavior:
SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales[FOR prod in ('Mouse Pad', 'Bounce', 'Y Box'), 2005] = 1.3 * sales[cv(prod), 2001] ) ORDER BY country, prod, year;
This results in:
COUNTRY PROD YEAR SALES
-------------------- --------------- ---------- ----------
Italy Bounce 2005 6407.245
Italy Mouse Pad 2005 6402.63
Italy Y Box 2005 108308.304
If you write a specification similar to the one above, but without the FOR keyword, only cells that already exist would be updated, and no new cells would be inserted. In the SH data, that would mean no rows are returned. Following is that query:
SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales[prod in ('Mouse Pad', 'Bounce', 'Y Box'), 2005] = 1.3 * sales[cv(prod), 2001] ) ORDER BY country, prod, year;
no rows selected
You can view the FOR construct as generating multiple rules with positional references from a single rule, thus enabling creation of new cells (UPSERT behavior).
Note that the MODEL clause has limits on the number of rules supported in a single model, and FOR loops can create enough virtual rules to reach these limits. The limits and techniques to stay within them are described in the Oracle Data Warehousing Guide, chapter 22, SQL for Modeling."
If you know that the needed dimension values come from a sequence with regular intervals, you can use another form of the FOR construct:
FOR dimension FROM value1 TO value2 [INCREMENT | DECREMENT] value3
This specification results in values between value1 and value2 by starting from value1 and incrementing (or decrementing) by value3.
1. |
You want to specify projection sales values for Mouse Pad for the years 2005 to 2012 so that they are equal to 120% of the value in 2001. From your SQL*Plus session, execute the following script: @c:\wkdir\for.sql The for.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country='Italy' MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Mouse Pad', FOR year FROM 2005 TO 2012 INCREMENT 1] = 1.2 * sales[cv(prod), 2001] ) ORDER BY country, prod, year / This kind of FOR construct can be used for dimensions of numeric, date, and datetime data types. The increment/decrement expression value3 should be numeric for numeric dimensions and can be numeric or interval for dimensions of date or datetime types. There are other methods to use the FOR construct, and they are described in detail in the Data Warehousing Guide. The most important of these other methods is to use a SQL subquery as the argument for an IN operator. When using FOR constructs with subqueries, it is essential to examine the total number of rules that the FOR construct may generate and make sure that they will not exceed the 10,000 rule limit. |
By default, rules are evaluated in the order in which they appear in the MODEL clause. An optional keyword, SEQUENTIAL ORDER, can be specified in the MODEL clause to make such an evaluation order explicit. SQL models with a sequential rule order of evaluation are called Sequential Order models.
To have models calculated so that all rule dependencies
are considered and processed in correct order, use the
AUTOMATIC ORDER keywords. When a model has a large number of rules, it
may be more efficient to use the AUTOMATIC
ORDER option than to manually check that the rules are listed in a logically
correct sequence. This enables more productive development and maintenance of
models.
1. |
You can have a model with many rules, which creates new product values based on other products. To ensure that the rules will be executed in correct sequence so that no dependencies are missed, use the AUTOMATIC ORDER keywords. The example below contains three rules to illustrate this concept. From your SQL*Plus session, execute the following script: @c:\wkdir\s_o.sql The s_o.sqlscript contains the following: SELECT SUBSTR(country,1,20) country,
This query should not calculate the values for Bounce and Y Box before 2_Products, and 2_Products are assigned null values. |
Applications using SQL MODELs would not only have to deal with non-deterministic values for a cell measure in the form of stored NULL entries, but also with non-determinism in the form of missing cells. A cell, referenced by a single cell reference, which is missing in the query's data is called a missing cell. The MODEL clause provides a default treatment for nulls and missing cells and also provides options that applications can use to treat non-deterministic values according to their business logic. By default, NULL cell measure values are treated the same way as nulls are treated elsewhere in SQL. Missing cells are treated as cells with NULL measure values. For example, the following query yields a NULL for sales because the data set does not include 2004 values:
SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country='Italy'
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Mouse Pad', 2005] =
sales['Mouse Pad', 1999] + sales['Mouse Pad', 2004])
ORDER BY country, prod, year;
COUNTRY PROD YEAR SALES -------------------- --------------- ---------- ---------- Italy Mouse Pad 2005
Because NULL values cause many rules to return nulls, it may be more useful for you to treat nulls and missing values as 0 values. In this way, nulls will not be propagated through a set of calculations. You can use the IGNORE NAV option (NAV stands for Non-Available Values) to default nulls and missing cells to the following values:
0 for numeric data | |
Empty string for character/string data |
|
01-JAN-2001 for date type data | |
NULL for all other data types |
Note that the default behavior is KEEP NAV, which treats nulls in the standard manner and treats missing values as nulls. For more details, see the SQL MODEL chapter in the Data Warehousing Guide.
1. |
Convert the query shown above to return a numeric value for sales even though the value for 2004 is missing. From your SQL*Plus session, execute the following script: @c:\wkdir\i_n.sql The i_n.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, |
In addition to the multidimensional array on which rules operate, which is called the Main SQL MODEL, one or more read-only multidimensional arrays, called a Reference MODEL, can be created and referenced in the MODEL clause to act as look-up tables. Using Reference MODELs, you can relate objects of different dimensionality. Like a Main SQL MODEL, a Reference MODEL is defined over a query block and has the DIMENSION BY and MEASURE clauses to indicate its dimensions and measures respectively. A Reference MODEL is created by the following subclause of the MODEL clause:
REFERENCE model_name ON (query) DIMENSION BY (cols) MEASURES (cols) [reference options
Reference models can be used only in the right side of rules and the PARTITION clause is not available in reference models.
1. |
Convert projected sales figures of different countries, each in their own currency, into US currency and show both figures. You need to create a table with conversion ratios of local currencies to the US dollar. From your SQL*Plus session, execute the following script: @c:\wkdir\cre_dc.sql The cre_dc.sqlscript contains the following: CREATE TABLE dollar_conv(country VARCHAR2(30), exchange_rate NUMBER)
|
2. | Insert two rows into the DOLLAR_CONV
table. From your SQL*Plus session, execute the following script:
@c:\wkdir\ins_dc.sql The ins_dc.sqlscript contains the following: INSERT INTO dollar_conv VALUES('Canada', 0.75)
|
3. | Base the sales on the 2001 figures and project market growth by 2005 to be 22% in Canada and 34% in Brazil.
To convert the projected sales of Canada and Brazil for year 2005 to US dollars, you can use a Reference MODEL. From your SQL*Plus session, execute the following script: @c:\wkdir\rm.sql The rm.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, year, localsales, dollarsales FROM sales_view WHERE country IN ( 'Canada', 'Brazil') GROUP BY country, year MODEL RETURN UPDATED ROWS REFERENCE conv_refmodel ON ( SELECT country, exchange_rate AS er FROM dollar_conv) DIMENSION BY (country) MEASURES (er) IGNORE NAV MAIN main_model DIMENSION BY (country, year) MEASURES (SUM(sale) sales, 0 localsales, 0 dollarsales) IGNORE NAV RULES ( /* assuming that sales in Canada grow by 22% */ localsales['Canada', 2005] = sales[cv(country), 2001] * 1.22, dollarsales['Canada', 2005] = sales[cv(country), 2001] * 1.22 * conv_refmodel.er['Canada'], /* assuming that economy in Brazil grows by 34% */ localsales['Brazil', 2005] = sales[cv(country), 2001] * 1.34, dollarsales['Brazil', 2005] = sales['Brazil', 2001] * 1.34 * er['Brazil'] ) / Note the following:
Growth rates in this example are hard-coded in the rules: growth rate for Canada is 22% and that of Brazil is 34%. Your rules would be much more flexible if they could work with growth values looked up from a separate table of growth rates. Such a table could cover many years and countries. |
4. | Use both exchange rate and growth rate reference models to find the projected sales in local currency and US dollars for 2002. Create a table that stores the percentage growth by country and year. From your SQL*Plus session, execute the following script: @c:\wkdir\cre_gr.sql The cre_gr.sqlscript contains the following: CREATE TABLE growth_rate(country VARCHAR2(30), year NUMBER, growth_rate NUMBER) /
|
5. |
Insert rows into the GROWTH_RATE table. From your SQL*Plus session, execute the following script: @c:\wkdir\ins_gr.sql The ins_gr.sqlscript contains the following INSERT INTO growth_rate VALUES('Brazil', 2002, 2.5)
|
6. | Write a query that calculates sales for Brazil and Canada, applying the 2002 growth figures and converting the values to dollars. Use the reference model shown below in your query. From your SQL*Plus session, execute the following script: @c:\wkdir\rm2.sql The rm2.sqlscript contains the following: SELECT SUBSTR(country,1,20) country, year, localsales, dollarsales Note the following:
|
By using the ITERATE option of the MODEL clause, you can evaluate rules iteratively a specified number of times. The number of iterations is specified as an argument to the ITERATE clause. ITERATE can be specified only for SEQUENTIAL ORDER models. Use iterative models to calculate models where the rules are interdependent.
The syntax of the ITERATE clause is:
ITERATE (number_of_iterations) [ UNTIL (condition) ]
The number_of_iterations argument to ITERATE clause is a positive integer constant. Optionally, you can specify an early termination condition to stop rule evaluation before reaching the maximum iteration. This condition is specified in the UNTIL subclause of ITERATE and is checked at the end of an iteration. So, you will always have at least one iteration when ITERATE is specified.
Iterative evaluation will stop either after finishing the specified number of iterations or when the termination condition evaluates to TRUE, whichever comes first. In some cases you may want the termination condition to be based on the change, across iterations, in the value of a cell. Oracle Database 10g provides a mechanism to specify such conditions by allowing you to access cell values as they existed before and after the current iteration in the UNTIL condition. Use the PREVIOUS function which takes a single cell reference as argument and returns the measure value of the cell as it existed after the previous iteration. You can also access the current iteration number by using the ITERATION_NUMBER system variable. ITERATION_NUMBER starts at value 0 and is incremented after each iteration. By using PREVIOUS and ITERATION_NUMBER, you can construct complex termination conditions.
1. |
You want to do financial planning for a person who earns a salary of $100,000 and has a capital gain of $15,000. His net income will be calculated as salary minus interest payments minus taxes. He pays tax-deductible interest on a loan. He also pays taxes at two rates: 28% for the salary income after interest expense is deducted, and 38% on capital gains. This person would like his interest expense to represent exactly 30% of his income. How can you calculate the taxes, interest expense, and net income that will result? All values of this scenario are stored in a table called LEDGER. The table holds the labels for a financial item in one column and the value of the item in another. From your SQL*Plus session, execute the following script: @c:\wkdir\cre_led.sql The cre_led.sqlscript contains the following: CREATE TABLE ledger (account VARCHAR2(20), balance NUMBER(10,2) )
|
2. | Insert rows into the LEDGER
table. From your SQL*Plus session, execute the following script:
@c:\wkdir\ins_led.sql The ins_led.sqlscript contains the following: INSERT INTO ledger VALUES ('Salary', 100000) / INSERT INTO ledger VALUES ('Capital_gains', 15000) / INSERT INTO ledger VALUES ('Net', 0) / INSERT INTO ledger VALUES ('Tax', 0) / INSERT INTO ledger VALUES ('Interest', 0) /
|
3. |
To perform the calculations, use the ITERATE option to have the calculations repeated as many times as desired. The first pass will insert the values stored in the LEDGER table into the right side of the rules and create a new set of values for NET, TAX, and INTEREST. The second pass will calculate a new set of values for NET , TAX, and INTEREST using the TAX and INTEREST values calculated in the previous pass. This cycle will be repeated a total of 100 times. From your SQL*Plus session, execute the following script: @c:\wkdir\it1.sql The it1.sqlscript contains the following: SELECT b, account |
4. | Write a query to avoid unnecessary processing time in the previous example. Monitor the results after each loop is complete. If the value of certain results have stopped changing by a significant amount, then you can stop the cycles at that point. From your SQL*Plus session, execute the following script: @c:\wkdir\it2.sql The it2.sqlscript contains the following: SELECT b, account Note that:
|
An ordered rule is one that has ORDER BY specified on the left side. It accesses cells in the order prescribed by ORDER BY and applies the right side computation. This is an important issue because, when you have positional ANY and/or symbolic references on the left side of a rule, you might receive an error saying that the rule's results depend on the order in which cells are accessed and hence are non-deterministic. Consider the MODEL below:
SELECT year, sales FROM sales_view WHERE country='Italy' AND prod='Bounce' MODEL DIMENSION BY (year ) MEASURES (sale sales) RULES SEQUENTIAL ORDER ( sales[ANY] = sales[CV(year)-1] ) ORDER BY year;
This query returns an error message because the results are indeterminate: the values depend on the order of cell access. The query attempts to set, for all years, the sales value for a year to the sales value of the previous year. Unfortunately, the results of this rule depend on the order in which the cells are accessed. If cells are accessed in the ascending order of year, the result would be as shown in the third column of the table below. There is no 1998 value, so 1999 would have a NULL assigned to it. This NULL would be carried forward into all following assignments. If the cells are accessed in descending year order, the results would be as shown in the fourth column. 2000 has a valid value which can be assigned to 2001, and the same is true for 2000 and 2001. Therefore only 1999 is assigned a NULL (because there is no value for 1998) when access is in descending year order.
Current Yr Prior Yr Sales Year Sales if ascending if descending 1999 2472.13 NULL NULL 2000 4370.43 NULL 2472.13 2001 NULL 4370.43
1. |
Based on the above information, write a query that ensures the cells in this query are accessed in descending year order and returns non-NULL results. You will need to add the ORDER BY clause to the rule. From your SQL*Plus session, execute the following script: @c:\wkdir\of.sql The of.sqlscript contains the following: SELECT year, sales In general, you can use any ORDER BY specification if it produces a unique order among cells that qualify the left side cell reference. Expressions in the ORDER BY of a rule can involve constants, measures, and dimension keys, and you can specify the ordering options [ASC | DESC] [NULLS FIRST | NULLS LAST] to obtain the order that you want. |
In addition to the topics shown above, with Oracle Database 10g Release 2, more extensions to the SQL Model clause are available. These include:
Support for analytic SQL functions inside the Model clause | ||
Greater flexibility on the number of rules a model can contain | ||
An additional form of Upsert | ||
For more information, see the Oracle Database 10g Data Warehousing Guide.
In this tutorial, you learned how to:
Use positional and symbolic cell references | ||
Use the CV() function and the ANY wildcard | ||
Use FOR loops | ||
Use reference and Iterative models |
Place the cursor over this icon to hide all screenshots.