|
Oracle9i
introduces a family of new features to support more scalable
and efficient ETL (Extraction, Transformation, Load) processing
for data warehouses and business intelligence systems.
One of the most significant ETL additions
is the Multitable Insert feature. The multitable insert feature
allows the INSERT . . . SELECT statement to use multiple tables
as targets. In addition, it can distribute data among target
tables based on logical attributes of the new rows. Multitable
insert thus enables a single scan and transformation of source
data to insert data into multiple tables, sharply increasing
performance.
There are two frequent situations in ETL
processing where the multitable insert feature provides great
performance benefits:
- A process takes subsets of data from a
single source and inserts them into different target tables
based on logical attributes of the source data.
- A process "fans out" the full set of data
from a single source into several target tables. The fan
out may be conditional or unconditional.
Prior to Oracle9i,
such processing could be done in two different ways. It could
be done through SQL with an independent INSERT . . . SELECT
statement for each table. This approach meant processing the
same source data and the transformation workload once for
each target. If the SELECT statement itself used a major transformation
(which it often does, e.g., GROUP BY), then the transformed
data was either recomputed for each scan or materialized beforehand
in a temporary table to improve performance. Alternatively,
this type of work could be performed in procedural programs:
every row would be examined to determine how to handle the
insertion.
Both of these techniques had significant
drawbacks. If the insert was performed through SQL, multiple
scans and transforms of the source data would hurt performance.
Creating a temporary table to hold precomputed transformation
results could consume large amounts of disk space, and the
temporary table would still be scanned multiple times. The
procedural programming approach could not use the high-speed
access paths directly available in SQL, so it also faced performance
handicaps.
The multitable inserts feature of Oracle9i
provides a new, high efficiency approach for these kinds of
tasks. The feature can direct data into one or several targets,
depending on the business transformation rules. In addition
to the major performance improvements, multitable insert also
simplifies transformation programming, speeding up ETL process
development. The feature enables one SQL statement to replace
multiple statements and avoid complex procedural programming.
Example
The following example statement
inserts new customer information from the customers_new
table into two tables, customers
and customers_special.
If a customer has a credit limit greater than 4500, the row
is inserted into customers_special.
All customer rows are inserted into table customers.
INSERT FIRST
WHEN cust_credit_limit >=4500 THEN
INTO customers_special VALUES(cust_id,cust_credit_limit)
INTO customers
ELSE
INTO customers
SELECT * FROM customers_new;
The same operation in Oracle8i could be
implemented as follows:
INSERT
INTO customers_special (cust_id,cust_credit_limit)
SELECT cust_id, cust_credit_limit
FROM customers_new
WHERE cust_credit_limit >=4500;
INSERT
INTO customers
SELECT * FROM customers_new;
Oracle8i requires two separate SQL
commands for the same business task, accessing the source
table customers_new once
per each statement. Note that the performance gain with multitable
insert is directly proportional to the amount and complexity
of the source data involved, as well as the number of insertion
targets. For complex source data with many targets, Oracle9i
multitable insert can increase processing speed over 400%.
Multitable inserts are just one example of
the new ETL features of Oracle9i. The full set of new
ETL functionality creates a powerful framework for handling
all ETL tasks within the Oracle database.
More
Info
Oracle9i
Daily Features
|