| |
High-speed Data Loading and Rolling Window Operations
Module Objectives
Purpose
In this module, you will learn how to use Oracle9i
for high-speed data loading and leveraging Oracle Partitioning for a rolling
window operation.
Objectives
After completing this module, you should be able to:
 |
Load and transform data in parallel using External Tables,
introduced in Oracle9i Release 1 |
 |
Compare the usage of SQL*Loader to external
tables for a simple load and transformation process |
 |
Perform a Rolling Window Operation leveraging Oracle
Partitioning |
 |
As part of the Rolling Window Operation we will compress
the oldest partitions to save space in our data warehouse environment.
This is new Oracle9i Release 2 functionality |
Prerequisites
Before starting this lesson, you should have:
Reference Material
The following is a list of useful reference materials if you want additional
information about the topics in this module:
 |
Documentation: Data Warehousing Guide
|
Overview
Most of the time, OLTP (Source) systems, which feed the Data Warehouse,
are not directly connected to the data warehousing system for extracting
new data. Commonly, those OLTP systems send data feeds in the form of
external files. This data must be loaded into the Data Warehouse, preferable
in parallel, thus leveraging the existing resources.
Furthermore, due to the business needs of our sample company MyCompany,
only the data of the last 3 years are relevant for the analysis needs.
This means that with the insertion of new data, old data has to be purged.
The maintenance of this so-called rolling window operation is done with
Oracle Partitioning.
External Tables
To load the external files into their data warehouse, MyCompany uses
the Oracle9i
external table feature, which allows external data, such as flat files,
to be exposed within the database just like a regular database table.
External tables can be accessed that SQL, so that external files can be
queried directly and in parallel using the full power of SQL, PL/SQL,
and Java. External tables are often used in the Extraction, Transformation,
and Loading (ETL) process to combine data-transformations (through SQL)
with data-loading into a single step. External tables are a very powerful
feature with many possible applications in ETL and other database environments
where flat files are processed; it is an alternative for using SQL*Loader.
Parallel Execution
Parallel execution dramatically reduces response time for data-intensive
operations on large databases typically associated with decision support
systems (DSS) and data warehouses. You can also implement parallel execution
on certain types of online transaction processing (OLTP) and hybrid systems.
Simply expressed, parallelism is the idea of breaking down a task so that,
instead of one process doing all of the work in a query, many processes
do part of the work at the same time. An example of this is when four
processes handle four different quarters in a year instead of one process
handling all four quarters by itself.
Rolling Window Operations
A very important task in the backoffice of a data warehouse is to keep
it synchronized with the various changes which are taking place in the
OLTP (source) systems. In addition, the life span of the data from an
analysis perspective is very often limited, so that older data must be
purged from the target system while new data is loaded; this operation
is often called a rolling window operation. This should be done as fast
as possible without any implication for the concurrent online access of
the data warehousing system.
Loading Data by Using External Tables
In this section of the lesson, you will load data into the Data Warehouse
using External Tables, introduced in Oracle9i Release 1.
Loading data using external tables
To show you how external tables can be created and used, perform the
following steps:
| 1. |
Create the necessary directory objects.
|
| 2. |
Create the external table.
|
| 3. |
Change external file. |
| 4. |
Select from the external table. |
| 5. |
Parallel access of external tables. |
| 6. |
Oracle9is
new parallel insert capabilities. |
| 7. |
Perform parallel insert. |
1. Create the Necessary Directory Objects
Before you can create the external table, you will need to create a directory
object in the database that will point to the directory on the file system
where the data files will reside. Optionally, you can separate the location
for the log, bad and discard files from the location of the data files.
To create the directory, perform the following:
| 1. |
From a SQL*Plus session logged on to the SH schema, run create_directory.sql,
or copy the following SQL statement into your SQL*Plus session:
@create_directory.sql
DROP DIRECTORY data_dir;
DROP DIRECTORY log_dir;
CREATE DIRECTORY data_dir AS 'D:\HandsOn_STUDENTS\RDBMS\FILES';
CREATE DIRECTORY log_dir AS 'D:\TEMP';

|
2. Create the External Table
When creating an external table, you are defining two parts of information:
| 1. |
The metadata information for the table representation inside the
database
|
| 2. |
The HOW access parameter
definition to extract the data from the external file
|
After the creation of this meta information, the external data can be
accessed from within the database, without the necessity of an initial
load.
To create the external table, you perform the following steps:
| 1. |
From a SQL*Plus session logged on to the SH schema, run @create_external_table.sql.
The results are as follows:
@create_external_table
set echo on
Rem *****
Rem CREATE EXTERNAL TABLE
Rem *****
CREATE TABLE sales_delta_xt
(
prod_id NUMBER(6),
cust_id NUMBER,
time_id DATE,
channel_id CHAR(1),
promo_id NUMBER(6),
quantity_sold NUMBER(3),
amount_sold NUMBER(10,2)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY data_dir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE log_dir:'sh_sales_%p.bad'
LOGFILE log_dir:'sh_sales_%p.log_xt'
FIELDS TERMINATED BY "|" LDRTRIM
)
location
(
'salesDelta.dat'
)
)REJECT LIMIT UNLIMITED NOPARALLEL;

|
3. Change External Files
Because an external table points to external data files without storing
any actual data, the files can be changed without affecting the external
table. In our lesson, you will use the same external table definition
twice. You therefore must map the external file containing the data you
want to use with the file name used in the external table definition.
To copy the actual files in your working directory, perform the following
steps:
| 1. |
Make sure the target file does not exist. Open a DOS prompt and
navigate to the directory where your files are located and perform
the following:
dir salesDelta.dat
If it exists, delete it.
|
| 2. |
Copy the source file by performing the following:
copy salesDec00.dat salesDelta.dat
|
| 3. |
You should now see the file:
dir salesDelta.dat

|
4. Select from the External Table
You can now access the data in the external file without any further
action, as shown with the following SQL command:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following queries:
SELECT COUNT(*) FROM sales_delta_xt;
SELECT MAX(time_id) FROM sales_delta_xt;
If you copied the files correctly, the maximum TIME_ID is 30-DEC-00.

|
5. Parallel Access of External Tables
Unlike SQL*Loader, the access of external tables can be done in parallel
independent of the number of external files. SQL*Loader can only operate
on a per-file bases; so that you have to split large source files manually
if you want to parallelize. With external tables, the degree of parallelism
is controlled exactly the same way than for a normal table. In our case,
you defined the external table NOPARALLEL by default. The following section
will show you how to control the degree of parallelism on statement level
by using a hint.
| 1. |
The script parallel_select_from_ET.sql
contains the SQL statements for the next three steps.
Run the following query, from a SQL*Plus session logged on to the
SH schema, to see the current parallel session statistics:
SELECT *
FROM v$pq_sesstat
WHERE statistic in ('Queries Parallelized',
'Allocation Height');
|
| 2. |
Run the same query you did before to access the External Table
with a parallel degree of 4, controlled with a hint. The select
statement is:
select /*+ parallel(a,4) */ count(*)
from sales_delta_XT a;
Were selecting from our external table in parallel, although
the external table points only to one input source file.
|
| 3. |
Now rerun the session stats again to see
the differences. You will see the parallel session statistics have
changed; it shows that the last query was parallelized, and it also
show the degree of parallelism.
SELECT *
FROM v$pq_sesstat
WHERE statistic in
('Queries Parallelized', 'Allocation Height');

|
6. Oracle9is
New Parallel Insert Capabilities
Prior to Oracle9i,
the direct path INSERT command for partitioned tables was parallelized on
a per-partition base (Direct path insertion into non-partitioned tables
or a single partition could be parallelized unrestricted since Oracle8i).
Oracle9i lifts
this restriction, providing unlimited parallel direct path INSERT capabilities
within each partition.
Lets first look how a serial plan looks like. Since none of the
objects is defined in parallel, we will get serial execution unless we
(a) either change the default degree of parallelism of one of the objects
or (b) use a hint.
| 1. |
To show the execution plan for SERIAL INSERT behavior, run show_serial_exec_plan.sql,
or copy the following SQL statement into your SQL*Plus session:
@show_serial_exec_plan.sql
EXPLAIN PLAN FOR INSERT /*+ APPEND */ INTO sales
(
PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
QUANTITY_SOLD, AMOUNT_SOLD
)
SELECT PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
sum(QUANTITY_SOLD), sum(AMOUNT_SOLD)
FROM SALES_DELTA_XT
GROUP BY prod_id,time_id,cust_id,channel_id,promo_id;
set linesize 120
select * from table(dbms_xplan.display);

You can see that the 'LOAD AS SELECT' operation (shaded), representing
the INSERT is done in serial.
|
| 2. |
To show the PARALLEL INSERT execution plan you will need to run
show_parallel_exec_plan.sql
logged in to the SH schema. A parallel DML must always be the first
statement of a transaction. Furthermore, a DML operation cannot
run in the presence of primary keyforeign key constraints. We therefore
have to disable the constraints prior to the parallel DML operation:
@show_parallel_exec_plan.sql
ALTER TABLE sales DISABLE CONSTRAINT sales_product_fk;
ALTER TABLE sales DISABLE CONSTRAINT sales_customer_fk;
ALTER TABLE sales DISABLE CONSTRAINT sales_time_fk;
ALTER TABLE sales DISABLE CONSTRAINT sales_channel_fk;
ALTER TABLE sales DISABLE CONSTRAINT sales_promo_fk;
COMMIT;
alter session enable parallel dml;
EXPLAIN PLAN FOR INSERT /*+ APPEND PARALLEL(SALES,4) */ INTO sales
(
PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
QUANTITY_SOLD, AMOUNT_SOLD
)
SELECT /*+ parallel (sales_delta_XT,4) */
PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
sum(QUANTITY_SOLD), sum(AMOUNT_SOLD)
FROM SALES_DELTA_XT
GROUP BY prod_id,time_id,cust_id,channel_id,promo_id;
set linesize 120
select * from table(dbms_xplan.display);

Unlike in the previous plan, the 'LOAD AS SELECT' operation is
done in parallel.
|
7. Perform a Parallel Insert
In this step of the lesson you will do the parallel insert discussed
before. Note that you not only select the data from the external table
but also do an aggregate as part of the select prior to the insertion;
something that is impossible with SQL*Loader only.
| 1. |
You will now run the sql to perform a parallel insert. Run parallel_insert_file.sql
or copy the following statement into your SQL*Plus session set timing
on
@parallel_insert_file.sql
connect sh/sh@orcl.world
set timing on
set echo on
COLUMN statistic FORMAT a20
Rem *****
Rem PARALLEL INSERT
Rem *****
INSERT /*+ APPEND PARALLEL(SALES,4) */
INTO sales ( PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID,
PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD )
SELECT /*+ PARALLEL (sales_delta_XT,4) */ PROD_ID, CUST_ID,
TIME_ID, CHANNEL_ID, PROMO_ID, sum(QUANTITY_SOLD),
sum(AMOUNT_SOLD)
FROM SALES_DELTA_XT
GROUP BY prod_id,time_id,cust_id,channel_id,promo_id;
set timing off
Rem we can control the successful parallel DML
SELECT *
FROM v$pq_sesstat
WHERE statistic in ('DML Parallelized','Allocation Height');
Write down the execution time of this statement and compare it
to the total amount of time you will need with SQL*Loader and a
subsequent insertion. Note that we wont see the full benefit
of parallelizing the external table access and combining the transformation
with the loading, since were accessing a very small amount
of data in parallel on a single CPU machine with one disk.

|
| 2. |
Row rollback. In the next lesson the same data is inserted using
SQL*Loader. If you do not rollback now there will be duplicate rows
in the table and other lessons will not work. You can also choose
not to rollback and skip the SQL*Loader lesson.
Rollback;
|
| 3. |
if you plan to SKIP the SQL*Loader section, you should COMMIT your
transaction now:
COMMIT;
You can now continue with the section ROLLING WINDOW OPERATIONS
|
| 4. |
It is important that you either commit or rollback the
operation before you can access the table again; otherwise you will
get an Oracle error message. |
| 5. |
After issuing a commit or rollback, we will enable the constraints
again.
@enable_cons.sql
ALTER TABLE sales
MODIFY CONSTRAINT sales_product_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_customer_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_time_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_channel_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_promo_fk ENABLE NOVALIDATE;

|
You have just loaded and transformed the data as one step. To accomplish
the same functionality with SQL*Loader, you will need two process steps,
which means more work effort and slower performance for the complete process.
Compare SQL*Loader to the Simple Loading and Transformation Process
with External Tables
Before Oracle9i,
you did the same operation by using SQL*Loader. The external table method
you previously performed is the preferred method of data loading and transformation.
However, to demonstrate the benefit of using external tables, you will
perform the tasks to load and transform the data by using SQL*Loader.
To show how to load and transform data by using SQL*Loader, you will
perform the following steps:
| 1. |
Create a staging table.
|
| 2. |
Load the data into the staging table by using SQL*Loader.
|
| 3. |
Load the staging table into the target database. |
| 4. |
Drop the staging table. |
1. Create the Staging Table
You need a staging table to load the data into so that you can transform
it within the database in a second step.
| 1. |
You will now run the sql to create a staging table:
create table sales_dec00 as
select *
from sales
where 1=0;
|
2. Load Data into the Staging Table by Using SQL*Loader
You will now load the data file using sales_dec00.ctl into the staging
table by performing the following steps:
| 1. |
You perform this operation from the OS command line, in the D:\wkdir
directory..
sqlldr sh/sh@orcl.world control=sales_dec00.ctl direct=true
If your database alias is not orcl.world change the alias accordingly.
|
| 2. |
Note that you cannot parallelize this task. Check the SQL*Loader
log file, D:\wkdir\sales_dec00.log.
Write down the execution time for the loading process.

|
Unlike with an External Table, space is consumed in the database for
making the data accessible from within the database. The space consumed
by the staging table is linearly dependent on the amount of data to be
loaded for further transformation.
Also note that it is not reasonably possible to parallelize the loading
with SQL*Loader without having several external files. You can use the
SKIP option for several Loader processes, accessing the same file. However,
this enforces every SQL*Loader process to scan the whole external file;
this is detrimental to the overall performance.
3. Load the Staging Table into the Target Table
After loading the external data - making it accessible for the database
- you can perform your transformation.
| 1. |
You will now run the SQL to transform and insert the external data,
which is already staged in the database, into the sales fact table
by issuing
@load_stage_table.sql
@load_stage_table.sql
Rem Load from staging table into target
set timing on
INSERT /*+ APPEND PARALLEL(SALES,4) */ INTO sales
( PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
QUANTITY_SOLD, AMOUNT_SOLD
)
SELECT /*+ parallel (sales_dec00,4) */
PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
sum(QUANTITY_SOLD), sum(AMOUNT_SOLD)
FROM sales_dec00
GROUP BY prod_id, time_id, cust_id, channel_id, promo_id;
COMMIT;
Write down the execution time of this statement and add the time
for the loading with SQL*Loader. Compare the total amount of time
you will need with the time for the external table insertion. Note
that we wont see the full benefit of parallelizing the external
table access and combining the transformation with the loading,
since were accessing a very small amount of data in parallel
on a single CPU machine with one disk.

|
Rolling Window Operations
Many data warehouses maintain a rolling window of data. For example,
the data warehouse stores the most recent 12 months of sales data. Just
as a new partition can be added to the sales table, an old partition can
be quickly (and independently) removed from the sales
table. Partitioning provides the ideal framework for those operations.
The above two benefits (reduced resources utilization and minimal end
user impact) are just as pertinent to removing a partition as they are
to adding a partition.
Perform the Steps of the Rolling Windows Operation
To perform the steps of the rolling windows operation and learn about
enhancements in Oracle9i,
perform the following steps:
| 1.1 |
Rolling Window Part 1 |
|
 |
Prepare a standalone table with the new
data |
| |
|
| 1.2 |
Rolling Window Part 2 |
|
 |
Add the new data to the fact table |
| |
|
| 1.3 |
Rolling Window Part 3a |
|
 |
Delete old data from the fact table |
| |
|
| 1.4 |
Rolling Window Part 3b |
|
 |
Instead of purging old data from the fact
table you will use the table compression feature or Oracle9i
Release 2 Enterprise Edition to reduce the space consumption of the
old data |
| |
|
| 1.5 |
Oracle9i
Enhancements for Global Index Maintenance |
| |
 |
Utilize the Oracle9i
enhancement for Global Index Maintenance. |
1.1 Prepare a standalone table with the new data
To perform the rolling window operation, you need to create and load
a standalone table with the new data by performing the following steps.
Note that you are going to use the external table you already defined,
but pointing to a different external file:
| 1. |
Create a link to the sales Q1 data.
|
| 2. |
Create the table for the new sales Q1 data.
|
| 3. |
Load this table. |
| 4. |
Create bitmap indexes for this table. |
| 5. |
Create constraints for this table. |
1. Change External File to the Sales Q1 Data
In this section, you use the external table you already defined. However,
this time you will use a different external file, the sales Q1 data.
You rename the actual files by performing the following steps:
| 1. |
Make sure the target file does not exist. Open a DOS prompt and
navigate to the directory where your files are located (D:\wkdir)
and perform the following:
dir salesDelta.dat
If it exists, delete it.
|
| 2. |
Copy the source file by performing the following:
copy salesQ1_01.dat salesDelta.dat
|
| 3. |
You should now see the file:
dir salesDelta.dat

|
| 4. |
Lets query our external table again and compare the number
of records and MAX(time_id):
SELECT COUNT(*) FROM sales_delta_xt;
SELECT MAX(time_id) FROM sales_delta_xt;

Note that the number os rows as well as the maximum TIME_ID has
changed. If your external table file is correct, the maximum TIME_ID
is 30-MAR-2001.
|
2. Create the Table for the New Sales Q1 Data
You will create an empty table for the new sales Q1 data. This table
will be added to the already existing partitioned sales
table later.
| 1. |
Run the SQL to create the table:
create table sales_delta nologging as
select *
from sales
where 1=0;

|
3. Load This Table
To load this table, you perform the following step:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@load_stage_table2.sql
INSERT /*+ APPEND */ INTO sales_delta
SELECT /*+ PARALLEL (SALES_DELTA_XT,4) */
PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
sum(QUANTITY_SOLD) quantity_sold,
sum(AMOUNT_SOLD) amount_sold
FROM SALES_DELTA_XT
GROUP BY prod_id,time_id,cust_id,channel_id,promo_id;

|
After loading the table sales_delta, we should gather statistics for
this newly created table
| |
.From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@gather_stat_stage_table.sql
exec dbms_stats.gather_table_stats('SH', 'sales_delta',
estimate_percent=>20);

|
4. Create Bitmap Indexes for This Table
Because you are going to exchange this standalone table with an empty
partition of the sales
table at a later point in time, you have to build exactly the same index
structure as the existing SALES
table to keep the local index structures of this particular table in a
usable state after the exchange.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statements:
@create_static_bitmap_index.sql
CREATE BITMAP INDEX sales_prod_local_bix
ON sales_delta (prod_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_cust_local_bix
ON sales_delta (cust_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_time_local_bix
ON sales_delta (time_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_channel_local_bix
ON sales_delta (channel_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_promo_local_bix
ON sales_delta (promo_id)
NOLOGGING COMPUTE STATISTICS ;
Note that the statistics for those indexes will be created as part
of the index creation.

|
5. Create Constraints for This Table
The same is true for the existing constraints of the sales
table.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@create_constraints.sql
set echo on
Rem *****
Rem CREATE CONSTRAINTS
Rem *****
ALTER TABLE sales_delta
ADD ( CONSTRAINT sales_product_delta_fk
FOREIGN KEY (prod_id)
REFERENCES products RELY ENABLE NOVALIDATE
, CONSTRAINT sales_customer_delta_fk
FOREIGN KEY (cust_id)
REFERENCES customers RELY ENABLE NOVALIDATE
, CONSTRAINT sales_time_delta_fk
FOREIGN KEY (time_id)
REFERENCES times RELY ENABLE NOVALIDATE
, CONSTRAINT sales_channel_delta_fk
FOREIGN KEY (channel_id)
REFERENCES channels RELY ENABLE NOVALIDATE
, CONSTRAINT sales_promo_delta_fk
FOREIGN KEY (promo_id)
REFERENCES promotions RELY ENABLE NOVALIDATE
) ;

|
1.2 Add the New Data to the Fact Table
The next task in performing a rolling window operation is to add the
newly loaded and indexed data to the fact table. To do this, you perform
the following steps:
| 1. |
Create a new partition, if one does not already exist.
|
| 2. |
Exchange the partition. This a data dictionary operation only and
does not touch any data
|
| 3. |
Select from the partition to control the success. |
1. Create a New Partition
You need to create a new, empty partition. You can either create the
new partition with a distinct upper boundary or by choosing the keyword
MAXVALUE. The latter one ensures that records violating the potential
upper boundary condition wont be rejected and the INSERT operation
succeeds.
In our business scenario, we will issue a SPLIT PARTITION after the loading
operation to identify any potential violations. All records violating
our upper boundary will be separated into an extra partition.
You need to create a new, empty partition. To do this, you perform the
following step:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@create_partition_for_sales.sql
Rem *****
Rem current partitions
Rem *****
select partition_name, high_value
from user_tab_partitions
where table_name='SALES'
order by partition_position;
Rem *****
Rem create ADDITIONAL PARTITION on sales
Rem *****
ALTER TABLE sales
ADD PARTITION sales_q1_2001
VALUES LESS THAN (MAXVALUE);
Rem *****
Rem what is in the partition now?
Rem empty
Rem *****
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_2001);

|
2. Exchange the Partition
You are now adding the newly loaded and indexed data to the real sales
fact table by performing a PARTITION
EXCHANGE command. Note that this is only a DDL command, which doesn't
touch the actual data at all.. To do this, you perform the following step:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@exchange_partition_wo_gim.sql
set echo on
Rem *****
Rem EXCHANGE IT
Rem *****
ALTER TABLE sales EXCHANGE PARTITION sales_q1_2001
WITH TABLE sales_delta INCLUDING INDEXES;

|
3. Select from the Partition
Now you can select from the newly added and exchanged partition to experience
how fast you inserted thousands of rows...
Note that, the more data you have to add to your partitioned fact
table, the more time youre saving with this metadata-only operation,
and the more you will experience the benefit of minimal to zero user impact.
You need a logical partitioning such as RANGE Partitioning; HASH Partitioning
cannot be used for the very common Rolling Window Operation.
Note that all indexes of the sales table are maintained and usable.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following queries. This will show you the number of rows in the
exchanged partition and - now empty - standalone table:
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_2001);
SELECT COUNT(*)
FROM sales_delta;

|
| |
Note that all local indexes of table sales are valid.
@show_sales_idx_status.sql
SELECT ui.index_name,
DECODE(uip.status,null,ui.status,uip.status) status,
count(*) num_of_part
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name,
DECODE(uip.status,null,ui.status,uip.status);

|
You can also use the WITHOUT VALIDATION clause as part of
the PARTITION EXCHANGE command. Oracle will then suppress the validity
checking of the table which is going to be exchanged; otherwise Oracle
guarantees that all values of the partition key match within the partition
boundaries.
SPLIT THE MOST RECENT PARTITION TO ENSURE (BUSINESS) DATA INTEGRITY
As mentioned before, you decided to load the data into a partition with
no fix upper boundary to avoid any potential errors. To identify any potential
violation, you will the most recent partition, thus creating two partitions,
one with a fix upper boundary (reflecting our business rule).
Prior to Oracle9i
Release 2, the SPLIT PARTITION operation would have caused physical data
movement and the creation of two new segments. With Oracle9i
Release 2 introduces an enhanced fast split operation, where the RDBMS
detects whether or not one of the two new partitions will be empty. If
this is the case, Oracle will not create two new segments; it will only
create one empty one, reflecting the new empty partition.
This enhancement is absolutely transparent. It improves the runtime of
a SPLIT operation, saves system resources, and does not invalidate any
existing local index structures.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following queries, @fast_split_sales.sql:
@fast_split_sales.sql
ALTER TABLE sales
SPLIT PARTITION sales_q1_2001
AT (TO_DATE('01-APR-2001', 'DD-MON-YYYY'))
INTO (PARTITION sales_q1_2001,
PARTITION sales_beyond_q1_2001);
PROMPT you should recognize that is was fast.
PROMPT Let's control the count in the most recent partition. Empty
SELECT COUNT(*)
FROM sales PARTITION (sales_beyond_q1_2001);
PROMPT now you can drop it
ALTER TABLE sales DROP PARTITION sales_beyond_q1_2001;

|
| 2. |
Note that all local indexes of table sales are still valid.
@show_sales_idx_status.sql
SELECT ui.index_name,
DECODE(uip.status,null,ui.status,uip.status) status,
count(*) num_of_part
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name,
DECODE(uip.status,null,ui.status,uip.status);

|
1.3 Delete Old Data from the Fact Table
The next task to performing a Rolling Window Operation is to delete old
data from the fact table. You only want to analyze the most recent data
of the last 3 years. Therefore, since you added Q1-2001, you have to delete
the data of Q1-1998.
Without Range Partitioning, you have to perform a DML operation against
the table. With partitioning, you can leverage the PARTITION EXCHANGE
command again to remove the data from the fact table. Like for the adding
of new data, Hash Partitioning does not help you here either.
Note that you're not DELETING the data. Instead you are exchanging (logically
replacing) the partition containing this data from the Sales fact table
with an empty standalone table with the same logical structure. You can
then archive this data or drop the exchanged partition, depending on your
business needs. Alternatively, with Oracle9i Release 2, you can use the
new feature table compression to reduce the consumed space for older,
mostly read-only used, partitions. You will do this in the next step
| 1. |
Create an empty standalone table.
|
| 2. |
Create bitmap indexes for this table.
|
| 3. |
Create constraints for this table. |
| 4. |
Show the data in the partition before the exchange. |
| 5. |
Exchange the empty new table with the existing Q1-1998 partition. |
| 6. |
Show the data in the partition after the exchange. |
1. Create an Empty Standalone Table
You need to create an empty table in which to store the old 1998 data.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statements:
DROP TABLE sales_old_q1_1998;
create table sales_old_q1_1998 nologging as
select *
from sales
where 1=0;

|
2. Create Bitmap Indexes for This Table
Now create the local indexes.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@create_ndx.sql
set echo on
Rem *****
Rem CREATE LOCAL INDEXES
Rem *****
CREATE BITMAP INDEX sales_prod_old_bix
ON sales_old_q1_1998 (prod_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_cust_old_bix
ON sales_old_q1_1998 (cust_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_time_old_bix
ON sales_old_q1_1998 (time_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_channel_old_bix
ON sales_old_q1_1998 (channel_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_promo_old_bix
ON sales_old_q1_1998 (promo_id)
NOLOGGING COMPUTE STATISTICS ;

|
3. Create Constraints for This Table
Now create the constraints.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@create_constraints_old.sql
set echo on
Rem *****
Rem CREATE CONSTRAINTS
Rem *****
ALTER TABLE sales_old_q1_1998
ADD ( CONSTRAINT sales_product_old_fk
FOREIGN KEY (prod_id)
REFERENCES products RELY ENABLE NOVALIDATE
, CONSTRAINT sales_customer_old_fk
FOREIGN KEY (cust_id)
REFERENCES customers RELY ENABLE NOVALIDATE
, CONSTRAINT sales_time_old_fk
FOREIGN KEY (time_id)
REFERENCES times RELY ENABLE NOVALIDATE
, CONSTRAINT sales_channel_old_fk
FOREIGN KEY (channel_id)
REFERENCES channels RELY ENABLE NOVALIDATE
, CONSTRAINT sales_promo_old_fk
FOREIGN KEY (promo_id)
REFERENCES promotions RELY ENABLE NOVALIDATE
) ;

|
4. Show the Data in the Partition Before the Exchange
Before you perform the exchange, you will want to take a look at the
1998 Q1 data that will be aged out of the partition.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@show_partition.sql
set echo on
Rem *****
Rem show the actual content of the partition to be aged out
Rem BEFORE exchange
Rem *****
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_1998);

|
5. Exchange the Partition
You now need to exchange the empty table with the existing Q1-1998 partition.
To do this, you perform the following step:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following SQL statement:
@exchange_old_partition.sql
ALTER TABLE sales
EXCHANGE PARTITION sales_q1_1998
WITH TABLE sales_old_q1_1998
INCLUDING INDEXES;

Note that we could have used a DROP PARTITION instead. Table sales_old_q1_1998
now stores all the data of the first quarter of 1998. We could drop
this table to remove the data entirely from the system.
|
6. Show the Data in the Partition After the Exchange
After you perform the exchange, you will want to take a look at the data
in the partition.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following queries:
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_1998);
SELECT COUNT(*)
FROM sales_old_q1_1998;

Unlike before the EXCHANGE command, the standalone table now stores
thousands of rows whereas the equivalent partition of SALES is empty.
|
| 2. |
Local Indexes are not affected.
@show_sales_idx_status.sql
SELECT ui.index_name,
DECODE(uip.status,null,ui.status,uip.status) status,
count(*) num_of_part
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name,
DECODE(uip.status,null,ui.status,uip.status);

|
Exchange Back
Because you are going to perform the same partition maintenance operations
again, you need to reset the environment back to its original state. You
do this by simply issuing the same partition exchange commands again:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following query:
@exchange_back_to_start.sql
set echo on
Rem *****
Rem let's bring the data back in original state
Rem *****
Rem added partition data
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_2001);
ALTER TABLE sales
EXCHANGE PARTITION sales_q1_2001
WITH TABLE sales_delta INCLUDING INDEXES;
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_2001);
Rem deleted partition data
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_1998);
ALTER TABLE sales
EXCHANGE PARTITION sales_q1_1998
WITH TABLE sales_old_q1_1998
INCLUDING INDEXES;
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_1998);

|
1.4 SAVE SPACE AND COMPRESS OLD DATA IN THE FACT
TABLE INSTEAD OF DELETING IT
Data stored in relational databases keep growing as a result of businesses
requirements for more information. A big portion of the cost of keeping
large amounts of data is in the cost of disk systems, and the resources
utilized in managing that data. Oracle9i
Release 2 Enterprise Edition introduces a unique way to deal with this
cost by compressing data stored in relational tables with virtually no
negative impact on query time against that data, thereby enabling substantial
cost savings.
Commercially available relational database systems have not heavily utilized
compression techniques on data stored in relational tables. One reason
is that the trade-off between time and space for compression is not always
attractive for relational databases. A typical compression technique may
offer space savings, but only at a cost of much increased query time against
the data. Furthermore, many of the standard techniques do not even guarantee
that data size does not increase after compression.
Oracle9i Release
2 Enterprise Edition introduces a unique compression technique that is
very attractive for large data warehouses. It is unique in many ways.
Its reduction of disk space can be significantly higher than standard
compression algorithms, because it is optimized for relational data. It
has virtually no negative impact on the performance of queries against
compressed data; in fact, it may have a significant positive impact on
queries accessing large amounts of data, as well as on data management
operations like backup and recovery. It ensures that compressed data is
never larger than uncompressed data.
CREATE A STANDALONE TABLE USING THE TABLE COMPRESSION FEATURE
You will now run the SQL to create the table:
| 1. |
You will create a compressed table for the first six months of
the 1998 data. To set up the table including its constraints, you
can run create_comp_h1_1998_1.sql.
@create_comp_h1_1998_1.sql
DROP TABLE sales_1_1998_compress;
CREATE TABLE sales_1_1998_compress COMPRESS
AS SELECT *
FROM sales
WHERE time_id >= TO_DATE('01-JAN-1998','dd-mon-yyyy')
AND time_id < TO_DATE('01-JUN-1998','dd-mon-yyyy');
exec dbms_stats.gather_table_stats('SH','sales_1_1998_compress',
estimate_percent=>10);

|
Lets now compare the compression ratio for this newly created compressed
table:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statements:
@compression_rate1.sql
select segment_name||
decode(partition_name,
null,
': compressed table',
': partitions SALES_Q1_1998,SALES_Q2_1998') what,
sum(bytes)
from user_segments
where segment_name ='SALES_1_1998_COMPRESS'
or ( segment_name='SALES' and
partition_name in ('SALES_Q1_1998','SALES_Q2_1998'))
and segment_type in('TABLE','TABLE PARTITION')
group by segment_name||decode(partition_name,
null,
': compressed table',
': partitions SALES_Q1_1998,SALES_Q2_1998');

|
Most likely the compression ratio with real world data will be higher
than the one experienced with the sales history schema. The data in the
fact table SALES is artificially generated and does not show the typical
natural sorting you will find in any data warehouse environment
where the data was cleansed, consolidated, or even aggregated prior to
its INSERT.
CREATE CONSTRAINTS FOR THIS TABLE
You will now create the same constraints than the ones for table SALES.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@create_compress_constr.sql
ALTER TABLE sales_1_1998_compress
ADD ( CONSTRAINT sales_product_fk_compress
FOREIGN KEY (prod_id)
REFERENCES products RELY ENABLE NOVALIDATE
, CONSTRAINT sales_customer_fk_compress
FOREIGN KEY (cust_id)
REFERENCES customers RELY ENABLE NOVALIDATE
, CONSTRAINT sales_time_fk_compress
FOREIGN KEY (time_id)
REFERENCES times RELY ENABLE NOVALIDATE
, CONSTRAINT sales_channel_fk_compress
FOREIGN KEY (channel_id)
REFERENCES channels RELY ENABLE NOVALIDATE
, CONSTRAINT sales_promo_fk_compress
FOREIGN KEY (promo_id)
REFERENCES promotions RELY ENABLE NOVALIDATE
) ;

|
Since we merged two current partitions when creating the compressed table
sales_1_1998_compress, we have to drop one of them prior to the EXCHANGE
command to extend the range window for the existing partition to macth
the range window of the compressed standalone table.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@prep_exchange_comp1.sql
PROMPT CURRENT PARTITIONS
Rem *****
SELECT partition_name, high_value
FROM user_tab_partitions
WHERE table_name='SALES'
ORDER BY partition_position;
REM since we compressed data and merged two partitions, we have
REM to drop partition SALES_Q1_1998; this will bring down the
REM lower boundary of the 'new' sales_q2_1998 to the upper
REM boundary of sales_2_1997
ALTER TABLE sales DROP PARTITION sales_q1_1998;
PROMPT CURRENT PARTITIONS
SELECT partition_name, high_value
FROM user_tab_partitions
WHERE table_name='SALES'
ORDER BY partition_position;

|
PREPARE A PARTITIONED TABLE FOR GETTING ITS FIRST COMPRESSED PARTITION
When storing some or all partitions of a partitioned table in a compressed
format, you have to have at least one compressed partition prior to the
creation of bitmap index structures. If you have already existent bitmap
index structures, you have to mark them unusable or drop them -
prior to adding a compressed partition and to rebuild them afterwards.
This is only true for the very first time when a compressed partition
is added to an existing partitioned table that has bitmap indexes. If
you plan to store some or all partitions in a compressed manner, you should
create the partitioned table already with data compression enabled.
The steps for adding a compressed partition to our existing uncompressed
partitioned table with bitmap indexes are:
| 1. |
Create the new compressed standalone table containing data
|
| 2. |
Drop all existing bitmap indexes of the
target partitioned table |
| 3. |
Issue your partition maintenance operation
that involves a compressed partition |
| 1. |
Create the new compressed standalone table
containing data, this step has been completed, sales_data. |
| 2. |
Drop the existing bitmap index structures. From a SQL*Plus session
logged on to the SH schema, execute the following statement:
@drop_bix_for_sales.sql
Rem usable bitmap indexes on a partitioned table before we can
Rem add a compressed partition
Rem ORA-14646: Specified alter table operation involving
Rem compression cannot be performed in the presence of usable
Rem bitmap indexes
DROP INDEX sales_channel_bix ;
DROP INDEX sales_cust_bix ;
DROP INDEX sales_prod_bix ;
DROP INDEX sales_promo_bix ;
DROP INDEX sales_time_bix ;

|
| 3. |
You can now exchange the compressed table with the existing uncompressed
partition.
In this case, exchange the former uncompressed
partition with the compressed standalone table, having the same
logical structure. This could also be an ALTER TABLE MOVE PARTITION
command. We will demonstrate an ALTER TABLE MERGE command later
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@exchange_comp_part.sql
REM now exchange the newly created and compressed
REM partition back ...
ALTER TABLE sales
EXCHANGE PARTITION sales_q2_1998
WITH TABLE sales_1_1998_compress;
REM rename it
ALTER TABLE sales RENAME
PARTITION sales_q2_1998
TO sales_H1_1998_compress;
REM drop exchanged table
DROP TABLE sales_1_1998_compress;

|
The data dictionary shows the partitions having the COMPRESS attribute
set:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@show_comp_parts.sql
PROMPT control the compress attribute for the partitioned table
SELECT partition_name, compression
FROM user_tab_partitions
WHERE table_name='SALES'
ORDER BY partition_position;

|
Lets now recreate the bitmap indexes for Sales:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the
following statement:
@recreate_bix_for_sales.sql
PROMPT recreate indexes for table SALES
CREATE BITMAP INDEX sales_prod_bix
ON sales (prod_id) LOCAL
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_cust_bix
ON sales (cust_id) LOCAL
NOLOGGING COMPUTE STATISTICS ;
| |