|
Oracle9i
introduces a family of new features to support
more scalable and efficient ETL (Extraction, Transformation,
Load) processing for data warehouses and business intelligence
systems. One of the most exciting ETL additions is the Table
Functions feature. The table functions feature enables high
efficiency execution of ETL functions implemented in PL/SQL,
C or Java.
In an ETL process, the data extracted from
a source system passes through a sequence of transformations
before it is loaded into a data warehouse. Complex transformations
are implemented in procedural programs, either outside the
database or inside the database in PL/SQL or Java. Prior to
Oracle9i, developers of ETL processes faced two challenges:
- Data Staging - When the results of a transformation
grow too large to fit into memory, they must be staged:
the results are materialized into database tables or flat
files. The staged data is then read and processed as input
to the next transformation in the sequence. Staging data
complicates development and slows ETL processing.
- Transformation Parallelism - Adding parallelism
to transformations requires complex programming, reducing
developer productivity.
Oracle9i's
table functions eliminate the need to stage data for transformations
implemented in PL/SQL, C or Java. The data flows between
transformations without interruption, a process
called "pipelining." In addition, the table functions enable
transparent parallelism for the transformations. The figure
below shows a sample transformation flow performed with staging
and the same flow performed using the table functions:

In the "transformation with staging" process
of the figure, showing ETL prior to Oracle9i, external
source data is loaded into the database table "stage 1." This
step exposes the source data to the database for further usage.
Then the procedural transformation T1 reads this data, applies
the first transformation and stages the intermediate result
in table "stage 2." Transformation T2 then reads the data
from stage 2 and inserts it into the target table. Note that
any parallel execution of one of those transformations must
be coded manually.
In the "pipelined parallel transformation"
process of the figure, using Oracle9i's
table functions, the two staging tables disappear. The T1
transformation sends data directly to transformation T2 without
the complications and delay of staging. In addition, the transformations
are processed in parallel without any manual coding, as represented
by the multiple transformation symbols for T1 and T2. Note
that the transparent access of the external source data is
done with the External Table feature introduced in Oracle9i.
(To learn about external tables, see the Oracle9i
Database Daily Features Archives, listed below under "More
Info.")
Along with tremendous performance benefits,
table functions provide great ETL flexibility. Table functions
enable use of PL/SQL, C or Java seamlessly with SQL. For instance,
a transformation can SELECT from a table function or pass
results of a SQL query directly into a table function for
further processing. With table functions, transformation components
can be combined and reused for faster deployment for new or
changed data flows. Table functions thus enable you to remodel
your transformation process flow quickly and efficiently.
Table functions are just one example of the
new ETL features of Oracle9i.
The full set of new ETL functionality creates a powerful framework
for handling all ETL tasks within the Oracle database.
More
Info
Oracle9i
Database Daily Features
|
 |