High-Speed Data Loading and Rolling Window Operations with Partitioning
In this tutorial, you learn how to use Oracle10g for high-speed data loading and how to leverage Oracle Partitioning for a rolling window operation.
Approximately 2 hours
This tutorial covers the following topics:
Most of the time, OLTP (Source) systems, which feed the Data Warehouse, are not directly connected to the data warehousing system for extracting new data. Commonly, those OLTP systems send data feeds in the form of external files. This data must be loaded into the Data Warehouse, preferably in parallel, thus leveraging the existing resources.
For example, due to the business needs and disk space constraints of the sample company MyCompany, only the data of the last three years are relevant for the analysis needs. This means that with the insertion of new data, disk space has to be freed, either by purging the old data or by leveraging Oracle's table compression. The maintenance of this so-called rolling window operation is done with Oracle Partitioning.
Before starting this tutorial, you should:
1. Perform the Installing Oracle Database 10g on Windows tutorial.
2. Download and unzip etl.zip into your working directory (for example, c:\wkdir).
To load the external files into their data warehouse, MyCompany uses the Oracle10g external table feature, which allows external data, such as flat files, to be exposed within the database just like a regular database table. External tables can be accessed by SQL, so external files can be queried directly and in parallel using the full power of SQL, PL/SQL, and Java. External tables are often used in the Extraction, Transformation, and Loading (ETL) process to combine data transformations (through SQL) with data loading in a single step. They are a very powerful feature with many possible applications in ETL and other database environments where flat files are processed, and they are an alternative to using SQL*Loader.
Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Simply expressed, parallelism is the idea of breaking down a task so that, instead of one process doing all of the work in a query, many processes do parts of the work at the same time. For example, parallel execution can be used when four processes handle four different quarters in a year instead of one process handling all four quarters by itself.
A very important task in the back office of a data warehouse is to keep the data synchronized with the various changes that are taking place in the OLTP (source) systems. In addition, the life span of the data from an analysis perspective is very often limited, so that older data must be purged from the target system while new data is loaded; this operation is often called a rolling window operation. This should be done as fast as possible without any implication for the concurrent online access of the data warehousing system.
Before starting the tasks for this OBE, you need to implement some changes on the existing Sales History schema. Additional objects are necessary, and additional system privileges must be granted to the user SH. The SQL file for applying those changes is modifySH_10gR2.sql. To use the setup files for the Data Warehousing tutorials, perform the following steps:
1. Start a SQL*Plus session. Select Start > Programs > Oracle-OraDB10g_home > Application Development > SQL Plus. (Note: This tutorial assumes you have a c:\wkdir folder. If you do not, you need to create one and unzip the contents of etl.zip into it. Full paths are specified when executing the scripts.)
2. Log in as the SH user. Enter SH as the User Name and SH as the Password. Then click OK.
3. Run the modifySH_10gR2.sql script from your SQL*Plus session:
@c:\wkdir\modifySH_10gR2.sql
The bottom of your output should match the image below.
In this section of the tutorial, you load data into the Data Warehouse using External Tables.
To show you how external tables can be created and used, perform the following steps:
1. Create the necessary directory objects.
2. Create the external table.
3. Select from the external table.
4. Provide transparent high-speed parallel access of external tables.
5. Review Oracle's parallel insert capabilities.
6. Execute the parallel insert.
1. Create the Necessary Directory Objects
Before you can create the external table, you need to create a directory object in the database that points to the directory on the file system where the data files will reside. Optionally, you can separate the location for the log, bad and discard files from the location of the data files. To create the directory, perform the following steps:
From a SQL*Plus session logged on to the SH schema, run create_directory.sql, or copy the following SQL statements into your SQL*Plus session:
@create_directory.sql
DROP DIRECTORY data_dir;
DROP DIRECTORY log_dir;
CREATE DIRECTORY data_dir AS 'c:\wkdir';
Please note: The scripts are set up for a Windows system and assume that the Hands-On workshop was extracted on drive C:\.
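The create_directory.sql listing above is truncated after the data_dir creation. A sketch of the complete setup, assuming log_dir also points at the working directory (the log_dir location is an assumption, not shown in the listing), would be:

```sql
-- Drop and recreate the directory objects used by the external table.
DROP DIRECTORY data_dir;
DROP DIRECTORY log_dir;

CREATE DIRECTORY data_dir AS 'c:\wkdir';
-- Assumed: the log, bad, and discard files live in the same folder.
CREATE DIRECTORY log_dir  AS 'c:\wkdir';
```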
When creating an external table, you define two pieces of information:
1. The metadata for the table representation inside the database
2. The access parameters that define how to extract the data from the external file
After the creation of this meta information, the external data can be accessed from within the database, without the necessity of an initial load.
To create the external table, you perform the following steps:
From a SQL*Plus session logged on to the SH schema, run create_external_table.sql. The results are as follows:
@c:\wkdir\create_external_table
REM 10gR1
CREATE TABLE sales_delta_XT
(
PROD_ID NUMBER,
CUST_ID NUMBER,
TIME_ID DATE,
CHANNEL_ID CHAR(2),
PROMO_ID NUMBER,
QUANTITY_SOLD NUMBER(3),
AMOUNT_SOLD NUMBER(10,2)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY data_dir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE log_dir:'sh_sales.bad'
LOGFILE log_dir:'sh_sales.log_xt'
FIELDS TERMINATED BY "|" LDRTRIM
(prod_id, cust_id,
time_id CHAR(11) DATE_FORMAT DATE MASK "DD-MON-YYYY",
channel_id, promo_id, quantity_sold, amount_sold
)
)
location
(
'salesDec01.dat'
)
)
REJECT LIMIT UNLIMITED
NOPARALLEL;
You can view information about external tables through the following data dictionary views:
- [USER | ALL | DBA]_EXTERNAL_TABLES
- [ALL | DBA]_DIRECTORIES
- [USER | ALL | DBA]_EXTERNAL_LOCATIONS
|
3. Select From the External Table
You can now access the data in the external file without any further action, as shown with the following SQL command:
From a SQL*Plus session logged on to the SH schema, execute the following queries. (Note that you can run the select_et.sql script file.)
@c:\wkdir\select_et.sql
SELECT COUNT(*) FROM sales_delta_xt;
SELECT MAX(time_id) FROM sales_delta_xt;
If you copied the files correctly, the maximum TIME_ID is the last day of December, 2001.
4. Provide Transparent High-Speed Parallel Access of External Tables
Unlike SQL*Loader, external tables can be accessed in parallel, independent of the number of external files. SQL*Loader can only operate on a per-file basis, which means you have to split large source files manually if you want to parallelize. With external tables, the degree of parallelism is controlled in exactly the same way as for a normal table. In this case, the external table was defined NOPARALLEL, so the following section shows you how to control the degree of parallelism at the statement level by using a hint.
1. The script parallel_select_from_ET.sql contains the SQL statements for the next three steps. Run the following query, from a SQL*Plus session logged on to the SH schema, to see the current parallel session statistics:
@c:\wkdir\parallel_select_from_ET.sql
SELECT *
FROM v$pq_sesstat
WHERE statistic in ('Queries Parallelized',
'Allocation Height');
|
2. Run the same query you used before to access the external table with a parallel degree of 4, controlled with a hint. The select statement is:
@c:\wkdir\parallel_select_from_ET_2.sql
SELECT /*+ parallel(a,4) */ COUNT(*) FROM sales_delta_XT a;
You are selecting from the external table in parallel, although the external table points to only one input source file. Alternatively, you could change the PARALLEL property of the external table with an ALTER TABLE command:
rem ALTER TABLE sales_delta_XT PARALLEL 4;
|
3. View the session statistics again to see the differences. The parallel session statistics have changed: they show that the last query was parallelized, as well as the degree of parallelism.
@c:\wkdir\parallel_select_from_ET.sql
SELECT *
FROM v$pq_sesstat
WHERE statistic in
('Queries Parallelized', 'Allocation Height');
|
5. Review Oracle's Parallel Insert Capabilities
Oracle10g provides unlimited parallel direct path INSERT capabilities within each partition. The execution plan can be used to determine whether or not the INSERT will be done in parallel. Alternatively, you can retrieve the execution plan of a statement straight from the SQL cache after it has run, without the need for an EXPLAIN PLAN command at all.
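The cache-based alternative can be sketched with DBMS_XPLAN.DISPLAY_CURSOR, available since Oracle Database 10g (this snippet is an illustration, not part of the tutorial scripts):

```sql
-- Run the statement of interest first ...
SELECT /*+ parallel(a,4) */ COUNT(*) FROM sales_delta_xt a;

-- ... then fetch its plan straight from the SQL cache.
-- NULL, NULL means "the last cursor executed by this session".
SELECT * FROM TABLE(dbms_xplan.display_cursor(NULL, NULL));
```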
Below, examine what a serial plan looks like. Since none of the objects is defined in parallel, you automatically have serial execution unless you either change the default degree of parallelism of one of the objects or use a hint.
1. To show the execution plan for SERIAL INSERT behavior, run show_serial_exec_plan.sql, or copy the following SQL statement into your SQL*Plus session:
@c:\wkdir\show_serial_exec_plan.sql
Rem 10gR1
EXPLAIN PLAN FOR
INSERT /*+ APPEND */ INTO sales
(
PROD_ID,
CUST_ID,
TIME_ID,
CHANNEL_ID,
PROMO_ID,
QUANTITY_SOLD,
AMOUNT_SOLD
)
SELECT
PROD_ID,
CUST_ID,
TIME_ID,
case CHANNEL_ID
when 'S' then 3
when 'T' then 9
when 'C' then 5
when 'I' then 4
when 'P' then 2
else 99
end,
PROMO_ID,
sum(QUANTITY_SOLD),
sum(AMOUNT_SOLD)
FROM SALES_DELTA_XT
GROUP BY 1, prod_id,time_id,cust_id,channel_id,promo_id;
set linesize 140
set pagesize 40
SELECT * FROM TABLE(dbms_xplan.display);
2. To show the PARALLEL INSERT execution plan, you need to run show_parallel_exec_plan.sql logged in to the SH schema. A parallel DML statement must always be the first statement of a transaction. Furthermore, a parallel DML operation cannot run in the presence of enabled primary key-foreign key constraints, so you have to disable the constraints prior to the parallel DML operation:
@c:\wkdir\show_parallel_exec_plan.sql
Rem 10gR1
COMMIT;
ALTER SESSION ENABLE PARALLEL DML;
Rem *****
PROMPT SHOW PARALLEL EXECUTION PLAN
Rem *****
EXPLAIN PLAN FOR
INSERT /*+ APPEND PARALLEL(SALES,4) */ INTO sales
(
PROD_ID,
CUST_ID,
TIME_ID,
CHANNEL_ID,
PROMO_ID,
QUANTITY_SOLD,
AMOUNT_SOLD
)
SELECT /*+ parallel (sales_delta_XT,4) */
PROD_ID,
CUST_ID,
TIME_ID,
case CHANNEL_ID
when 'S' then 3
when 'T' then 9
when 'C' then 5
when 'I' then 4
when 'P' then 2
else 99
end,
PROMO_ID,
sum(QUANTITY_SOLD),
sum(AMOUNT_SOLD)
FROM SALES_DELTA_XT
GROUP BY 1, prod_id,time_id,cust_id,channel_id,promo_id;
set linesize 140
set pagesize 40
SELECT * FROM TABLE(dbms_xplan.display);
|
In this step of the tutorial, you execute the parallel insert discussed before. Note that you not only select the data from the external table but also perform an aggregation as part of the select prior to the insertion: you are combining a transformation with the actual loading process. This cannot be accomplished with SQL*Loader alone.
1. Run SQL to perform a parallel insert. Run parallel_insert_file.sql or copy the following statements into your SQL*Plus session:
set timing on
@c:\wkdir\parallel_insert_file.sql
Rem 10gR1
ALTER SESSION ENABLE PARALLEL DML;
Rem *****
Write down the execution time of this statement and compare it to the total amount of time you need with SQL*Loader and a subsequent insertion. Note that you do not see the full benefit of parallelizing the external table access and combining the transformation with the loading, because you are accessing a very small amount of data in parallel on a single-CPU machine using one disk.
|
2. Roll back the insert. In the next tutorial, the same data is inserted using SQL*Loader.
ROLLBACK;
3. After issuing the rollback, you need to enable the constraints again:
@c:\wkdir\enable_cons.sql
ALTER TABLE sales
MODIFY CONSTRAINT sales_product_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_customer_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_time_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_channel_fk ENABLE NOVALIDATE;
ALTER TABLE sales
MODIFY CONSTRAINT sales_promo_fk ENABLE NOVALIDATE;
|
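To confirm that the foreign key constraints were re-enabled, a quick check against the USER_CONSTRAINTS dictionary view can help (this query is a sketch, not one of the tutorial scripts):

```sql
-- Constraint type 'R' = referential (foreign key) constraints.
SELECT constraint_name, status, validated
FROM   user_constraints
WHERE  table_name = 'SALES'
AND    constraint_type = 'R';
```

After the ENABLE NOVALIDATE statements, each row should show STATUS = ENABLED and VALIDATED = NOT VALIDATED.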
Before Oracle9i, you did the same operation by using SQL*Loader. The external table method you previously performed is the preferred method of data loading and transformation. However, to demonstrate the benefit of using external tables, you can perform the tasks to load and transform the data by using SQL*Loader.
To show how to load and transform data by using SQL*Loader, perform the following steps:
1. Create the staging table.
2. Load the data into the staging table with SQL*Loader.
3. Load the staging table into the target database.
4. Drop the staging table.
You need a staging table to load the data into so that you can transform it within the database in a second step.
Run the SQL script create_stage.sql to create a staging table:
@c:\wkdir\create_stage.sql
CREATE TABLE sales_dec01 AS
Please note: The scripts are set up for a Windows system and assume that the Hands-On workshop was extracted on drive C:\.
Load the data file from the sales_dec01.ctl file into the staging table by performing the following steps:
1. Execute the following commands from the OS command line:
cd c:\wkdir
sqlldr sh/sh control=sales_dec01.ctl direct=true
Note: You may need to specify your database alias when connecting using SQL*Loader. In that case, start SQL*Loader with the statement:
sqlldr sh/sh@<database alias> control=sales_dec01.ctl direct=true
2. Note that you cannot parallelize this task. Check the SQL*Loader log file sales_dec01.log and write down the execution time for the loading process. You can check the sales_dec01.log file using any editor; the file is located in the c:\wkdir folder.
Unlike with an external table, space is consumed in the database to make the data accessible from within the database. The space consumed by the staging table is linearly dependent on the amount of data to be loaded for further transformation. Also note that it is not reasonably possible to parallelize the loading with SQL*Loader without having several external files. You can use the SKIP option for several SQL*Loader processes accessing the same file; however, this forces every SQL*Loader process to scan the whole external file, which is detrimental to overall performance. The information about the space usage of an object can be accessed through the following data dictionary views:
[USER | ALL | DBA]_SEGMENTS
[USER | ALL | DBA]_EXTENTS
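For example, the space allocated by the staging table can be checked with a query like the following (a sketch against USER_SEGMENTS; not one of the tutorial scripts):

```sql
-- Space allocated by the SQL*Loader staging table, in MB.
SELECT segment_name,
       bytes/1024/1024 AS size_mb,
       extents
FROM   user_segments
WHERE  segment_name = 'SALES_DEC01';
```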
After loading the external data, making it accessible to the database, you can perform your transformation.
|
Run SQL to transform and insert the external data, which is already staged in the database, into the SALES fact table by executing the load_stage_table.sql script:
@c:\wkdir\load_stage_table.sql
REM 10gR1
|
You can now drop or truncate the staging table to free its consumed space.
Drop the staging table by executing the drop_sales_dec01.sql script:
@c:\wkdir\drop_sales_dec01.sql
DROP TABLE sales_dec01;
Processing this simple loading and transformation process with external tables enabled you to combine the loading with the transformation, which simplifies and speeds up the process. Furthermore, an initial staging of the data in the database is not necessary with external tables, thus saving space. The larger the external data volume, the more you save in staging space and processing time with external tables compared to SQL*Loader.
After successfully loading the December data into the Q4 partition of your SALES fact table, this partition will encounter no or minimal further DML operations. This makes the partition an optimal candidate for being stored using Oracle's table compression functionality, which was introduced in Oracle9i Release 2. Data stored in relational databases keeps growing as businesses require more information. A big portion of the cost of keeping large amounts of data lies in the disk systems and in the resources utilized in managing that data. The Oracle database enables a unique way to deal with this cost by compressing data stored in relational tables with virtually no negative impact on query time against that data, which results in substantial cost savings.
Commercially available relational database systems have not heavily utilized compression techniques on data stored in relational tables. One reason is that the trade-off between time and space for compression is not always attractive for relational databases. A typical compression technique may offer space savings, but only at a cost of much increased query time against the data. Furthermore, many of the standard techniques do not even guarantee that data size does not increase after compression.
Oracle Database 10g Enterprise Edition offers a unique compression technique that is very attractive for large data warehouses. It is unique in many ways. Its reduction of disk space can be significantly higher than standard compression algorithms, because it is optimized for relational data. It has virtually no negative impact on the performance of queries against compressed data; in fact, it may have a significant positive impact on queries accessing large amounts of data, as well as on data management operations like backup and recovery. It ensures that compressed data is never larger than uncompressed data.
To measure the tremendous benefits of table compression, first verify that the most recent partition does not have compression enabled and determine how large it is.
1. Run the part_before_compression.sql script, or copy the following SQL into your SQL*Plus session:
@c:\wkdir\part_before_compression.sql
PROMPT Space consumption before compression
COLUMN partition_name FORMAT a50
2. Now compress the partition and transparently maintain all existing indexes. All local and global indexes will be maintained as part of this SQL statement; the functionality of online index maintenance for partition maintenance operations is discussed later in this tutorial. Note that compressing a partition is not an in-place compression: you create a new compressed segment and remove the old uncompressed one at the end of the operation. From a SQL*Plus session logged on to the SH schema, execute the following statements:
@c:\wkdir\compress_salesQ4_2001.sql
PROMPT now compress the partition including index maintenance (details later)
|
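The body of compress_salesQ4_2001.sql is not shown above. Based on the description, that a new compressed segment replaces the old one while all indexes are maintained, it presumably contains a statement along these lines (the partition name and the UPDATE INDEXES clause are inferred, not confirmed by the listing):

```sql
-- MOVE rewrites the partition into a new, compressed segment;
-- UPDATE INDEXES maintains local and global indexes in the same operation.
ALTER TABLE sales
  MOVE PARTITION sales_q4_2001
  COMPRESS
  UPDATE INDEXES;
```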
3. See how much space the new compressed partition is allocating and compare it to the size of the uncompressed partition:
@c:\wkdir\part_after_compression.sql
PROMPT Space consumption AFTER compression
Most likely, the compression ratio with real-world data will be higher than the one experienced with the Sales History schema. Compression ratios reported by customers average a factor of 2 to 5. This means that a table that formerly allocated 500 GB of disk space is reduced to 100 GB with a compression factor of five.
Many data warehouses maintain a rolling window of data. For example, the data warehouse stores the data from the most recent 12 months of sales. Just as a new partition can be added to the SALES table, an old partition can be quickly (and independently) removed from the SALES table. Partitioning provides the ideal framework for those operations. The two benefits (reduced resource utilization and minimal end-user impact) are just as pertinent to removing a partition as they are to adding one.
To perform the rolling window operation, perform the following steps:
1. Create and load a stand-alone table with the new data.
2. Add the newly loaded data to the fact table.
3. Delete old data from the fact table.
To perform the rolling window operation, you need to create and load a stand-alone table with the new data by performing the following steps. Note that you are going to use the external table you already defined, but point it to a different external file:
1.1 Modify the LOCATION attribute of the external table.
1.2 Create an empty stand-alone table.
1.3 Load this table.
1.4 Create the index structure on the stand-alone table.
1.5 Create the constraints on the stand-alone table.
In this section, you use the external table you already defined. However, this time you use a different external file, salesQ1.dat. So, you have to modify the LOCATION attribute of the external table to point to the new data file.
1. First, check the number of rows in the current external table. (Note that you can run the select_et.sql script file.)
@c:\wkdir\select_et.sql
SELECT COUNT(*) FROM sales_delta_xt;
SELECT MAX(time_id) FROM sales_delta_xt;
The current file contains all sales transactions for December 2001, so MAX(time_id) shows the last day of December, 2001. The number of rows and the MAX(time_id) will be different after changing the external file at the OS level.
|
2. Change the LOCATION attribute. Run alter_loc_attrib.sql to change the LOCATION attribute:
@c:\wkdir\alter_loc_attrib.sql
ALTER TABLE sales_delta_xt location ( 'salesQ1.dat' );
Then check the new data. Run select_et.sql:
@c:\wkdir\select_et.sql
SELECT COUNT(*) FROM sales_delta_xt;
SELECT MAX(time_id) FROM sales_delta_xt;
The number of rows as well as the maximum TIME_ID have changed.
|
You create an empty table for the new sales Q1 data. This table will be added to the already existing partitioned SALES table later.
Run the script create_stage_table.sql to create the table:
@c:\wkdir\create_stage_table.sql
To load this table, you perform the following steps:
1. From a SQL*Plus session logged on to the SH schema, run the script load_stage_table2.sql to load the table:
@c:\wkdir\load_stage_table2.sql
INSERT /*+ APPEND */ INTO sales_delta
After loading the SALES_DELTA table, gather statistics for this newly created table.
|
2. From a SQL*Plus session logged on to the SH schema, run the script gather_stat_stage_table.sql to gather statistics about the table:
@c:\wkdir\gather_stat_stage_table.sql
Rem gather statistics for the table
|
Because you are going to exchange this stand-alone table with an empty partition of the SALES table at a later point in time, you have to build exactly the same index structure as the existing SALES table to keep the local index structures of this particular table in a usable state after the exchange.
1. Before creating any bitmap indexes, you need to modify the newly created table into a compressed table without actually compressing any data. This operation is necessary to create bitmap indexes that are valid to be exchanged into the partitioned table, which already contains compressed partitions. Run the script alter_sales_delta.sql to modify the table:
@c:\wkdir\alter_sales_delta.sql
ALTER TABLE sales_delta COMPRESS;
ALTER TABLE sales_delta NOCOMPRESS;
2. From a SQL*Plus session logged on to the SH schema, run the script create_static_bitmap_index.sql to create the bitmap indexes on the SALES_DELTA table:
@c:\wkdir\create_static_bitmap_index.sql
CREATE BITMAP INDEX sales_prod_local_bix
ON sales_delta (prod_id)
NOLOGGING COMPUTE STATISTICS;
CREATE BITMAP INDEX sales_cust_local_bix
ON sales_delta (cust_id)
NOLOGGING COMPUTE STATISTICS;
CREATE BITMAP INDEX sales_time_local_bix
ON sales_delta (time_id)
NOLOGGING COMPUTE STATISTICS;
CREATE BITMAP INDEX sales_channel_local_bix
ON sales_delta (channel_id)
NOLOGGING COMPUTE STATISTICS;
CREATE BITMAP INDEX sales_promo_local_bix
ON sales_delta (promo_id)
NOLOGGING COMPUTE STATISTICS;
Note that the statistics for those indexes are created as part of the index creation.
|
The same is true for the existing constraints of the SALES table.
|
From a SQL*Plus session logged on to the SH schema, run the script create_constraints.sql to modify the constraints on the SALES table:
@c:\wkdir\create_constraints.sql
set echo on
Rem *****
ALTER TABLE channels MODIFY CONSTRAINT CHANNELS_PK RELY;
|
The next task in performing a rolling window operation is to add the newly loaded and indexed data to the fact table. To do this, you perform the following steps:
You need to create a new, empty partition. You can either create the new partition with a distinct upper boundary or by choosing the keyword MAXVALUE. The latter option ensures that records violating the potential upper boundary condition are not rejected and the INSERT operation succeeds.
In this business scenario, you issue a SPLIT PARTITION after the loading operation to identify any potential violations. All records violating the upper boundary will be "separated" into an extra partition.
You need to create a new, empty partition. To do this, perform the following step:
|
From a SQL*Plus session logged on to the SH schema, run the script create_partition_for_sales_etl.sql to add a partition to the SALES table. @c:\wkdir\create_partition_for_sales_etl.sql
Rem *****
Rem current partitions
Rem *****
COLUMN partition_name FORMAT a20
select partition_name, high_value
from user_tab_partitions
where table_name='SALES'
order by partition_position;
Rem *****
Rem create ADDITIONAL PARTITION on sales
Rem *****
ALTER TABLE sales
ADD PARTITION sales_q1_2002
VALUES LESS THAN (MAXVALUE);
Rem *****
Rem what is in the partition now?
Rem empty
Rem *****
SELECT COUNT(*)
FROM sales PARTITION (sales_q1_2002);
|
You are now ready to add the newly loaded and indexed data to the real SALES fact table by performing a PARTITION EXCHANGE command. Note that this is only a DDL command, which does not touch the actual data at all. To do this, you perform the following step:
|
From a SQL*Plus session logged on to the SH schema, run the script exchange_partition_wo_gim.sql to alter the SALES table, enabling the partition to be exchanged:
@c:\wkdir\exchange_partition_wo_gim.sql
Rem *****
Rem EXCHANGE IT
Rem *****
ALTER TABLE sales EXCHANGE PARTITION sales_q1_2002
|
Now you can select from the newly added and exchanged partition to experience how fast you inserted thousands of rows.
Note that the more data you have to add to your partitioned fact table, the more time you are saving with this metadata-only operation, and the more you will experience the benefit of minimal to zero user impact.
Note that a rolling window operation requires a logical partitioning method such as range partitioning; hash partitioning cannot be used for this very common operation.
All indexes of the SALES table are maintained and usable.
1. From a SQL*Plus session logged on to the SH schema, execute the following queries. This shows you the number of rows in the exchanged partition and in the stand-alone table (which is now empty). Run the select_count.sql script:
@c:\wkdir\select_count.sql
SELECT COUNT(*)
Have you ever inserted that many rows that quickly?
|
2. Note that all local indexes of the SALES table are valid. Run the script show_sales_idx_status.sql to view the status of the indexes:
@c:\wkdir\show_sales_idx_status.sql
PROMPT INDEXES ARE MAINTAINED
You can also use the WITHOUT VALIDATION clause as part of the PARTITION EXCHANGE command. This causes the Oracle database to suppress the validity checking of the table that is going to be exchanged. Otherwise, the Oracle database guarantees that all values of the partition key match within the partition boundaries. |
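A sketch of the clause in context (whether the tutorial's exchange script actually uses it is not shown above):

```sql
-- WITHOUT VALIDATION skips the check that every row of SALES_DELTA falls
-- within the bounds of partition SALES_Q1_2002; use it only when the data
-- is known to fit the partition boundaries.
ALTER TABLE sales
  EXCHANGE PARTITION sales_q1_2002
  WITH TABLE sales_delta
  INCLUDING INDEXES
  WITHOUT VALIDATION;
```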
As mentioned before, you decided to load the data into a partition with no fixed upper boundary to avoid any potential errors. To identify any potential violation, you split the most recent partition, thus creating two partitions, one with a fixed upper boundary.
Beginning with Oracle9i Release 2, Oracle uses an enhanced fast-split operation: the RDBMS detects whether one of the two new partitions resulting from a SPLIT operation will be empty. If this is the case, the Oracle database does not create two new segments; it creates only the one new segment for the empty partition and uses the existing segment as the new partition containing all the data.
This optimization is absolutely transparent. It improves the run time of a SPLIT operation, saves system resources, and does not require any index maintenance.
1. From a SQL*Plus session logged on to the SH schema, run the script fast_split_sales.sql to alter the SALES table and view the index status:
@c:\wkdir\fast_split_sales.sql
ALTER TABLE sales SPLIT PARTITION sales_q1_2002
PROMPT Let's control the count in the most recent partition. should be empty
SELECT COUNT(*) FROM sales PARTITION (sales_beyond_q1_2002);
PROMPT INDEXES ARE MAINTAINED
Rem since no data is moved with the SPLIT operation, the indexes are still valid
SELECT ui.index_name, DECODE(uip.status,null,ui.status,uip.status)
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name, DECODE(uip.status,null,ui.status,uip.status);
PROMPT Now you can drop the empty overflow partition
ALTER TABLE sales DROP PARTITION sales_beyond_q1_2002;
|
2. Note that all local indexes of the SALES table are still valid. Run the script show_sales_idx_status.sql to view the status of the local indexes on the SALES table:
@c:\wkdir\show_sales_idx_status.sql
|
The next task to perform in a Rolling Window Operation is to delete the old data from the fact table. You only want to analyze the most recent data of the last three years. Therefore, because you added Q1-2002, you have to delete the data of Q1-1998.
Without range partitioning, you would have to perform a DML operation against the table. With partitioning, you can leverage the PARTITION EXCHANGE command again to remove the data from the fact table. As with the adding of new data, hash partitioning does not help you here either.
Note that you are not deleting the data. Instead you are exchanging (logically replacing) the partition containing this data from the SALES fact table with an empty stand-alone table with the same logical structure. You can then archive this data or drop the exchanged partition, depending on your business needs.
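Conceptually, the age-out exchange is the mirror image of the load-side exchange. The statement presumably issued by exchange_old_partition.sql (whose listing is truncated later in this section) looks like this sketch:

```sql
-- A metadata-only operation: afterwards, SALES_OLD_Q1_1998 holds the
-- Q1-1998 rows and the SALES_Q1_1998 partition is empty.
ALTER TABLE sales
  EXCHANGE PARTITION sales_q1_1998
  WITH TABLE sales_old_q1_1998
  INCLUDING INDEXES;
```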
You need to create an empty table in which to store the old 1998 data.
|
From a SQL*Plus session logged on to the SH schema, run the script create_empty_sat.sql to create an empty table that will hold the old 1998 data:
@c:\wkdir\create_empty_sat.sql
DROP TABLE sales_old_q1_1998;
CREATE TABLE sales_old_q1_1998 NOLOGGING COMPRESS
|
Now create the local indexes.
|
From a SQL*Plus session logged on to the SH schema, run the script create_ndx.sql to create the indexes:
@c:\wkdir\create_ndx.sql
set echo on
Rem *****
|
Now create the constraints.
|
From a SQL*Plus session logged on to the SH schema, run the script create_constraints_old.sql to modify and create the constraints:
@c:\wkdir\create_constraints_old.sql
Rem ***10g***
|
Before you perform the exchange, look at the 1998 Q1 data that will be aged out of the partition.
|
From a SQL*Plus session logged on to the SH schema, run the script show_partition.sql to view the data that will be aged out of the partition:
@c:\wkdir\show_partition.sql
set echo on
Rem *****
Rem show the actual content of the partition to be aged out
Rem BEFORE exchange
Rem *****
SELECT COUNT(*) FROM sales PARTITION (sales_q1_1998);
|
You now need to exchange the empty table with the existing Q1-1998 partition. To do this, you perform the following step:
|
From a SQL*Plus session logged on to the SH schema, run the script exchange_old_partition.sql to exchange the partition:
@c:\wkdir\exchange_old_partition.sql
Rem *****
Note that you could have used a DROP PARTITION statement instead. The table SALES_OLD_Q1_1998 now stores all the data of the first quarter of 1998. You could drop this table to remove the data entirely from the system. |
After you perform the exchange, you will want to take a look at the data in the partition.
| 1. |
From a SQL*Plus session logged on to the SH schema, run the script count_sales.sql to view the data in the partition:
@c:\wkdir\count_sales.sql
PROMPT show the actual content of the partition to be aged out AFTER exchange
Unlike before the EXCHANGE command, the stand-alone table now stores thousands of rows whereas the equivalent partition of SALES is empty.
|
| 2. |
Local indexes are not affected. Run the script show_sales_idx_status.sql to view the index information:
@c:\wkdir\show_sales_idx_status.sql
PROMPT INDEXES ARE MAINTAINED
SELECT ui.index_name, DECODE(uip.status,null,ui.status,uip.status)
|
To learn about the local index maintenance enhancements in Oracle Database 10g, you will split the most recent quarter partition into monthly partitions with online local index maintenance, a new functionality in Oracle Database 10g. Then you will use the Global Index Maintenance feature, which was introduced in the Oracle9i release.
| 1. |
Utilize Oracle Database 10g Enhancements for Local Index Maintenance |
| 2. |
Enhancements for Local Index Maintenance:
Beginning with Oracle Database 10g, all partition maintenance operations can be executed without any impact on table availability. Local index maintenance keeps the local indexes of a partitioned table up to date as part of any atomic partition maintenance operation, so no index is left unusable and index usage is unaffected while the maintenance operation takes place.
Oracle extended the SQL syntax for partition maintenance operations to control the physical attributes, such as index placement, for all affected local index structures.
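As an illustration of that extended syntax, the following sketch splits a quarterly partition while explicitly placing the resulting partitions of one local index. The use of SALES_TIME_BIX and the EXAMPLE tablespace in the UPDATE INDEXES clause is an illustrative assumption, not taken from the tutorial scripts:

```sql
Rem Split one quarter into two partitions. The local index partitions
Rem stay usable throughout, and their placement is controlled explicitly
Rem via the extended UPDATE INDEXES clause.
ALTER TABLE sales
SPLIT PARTITION sales_q1_2002
AT (TO_DATE('01-MAR-2002','DD-MON-YYYY'))
INTO (PARTITION sales_1_2_2002, PARTITION sales_mar_2002)
UPDATE INDEXES
  (sales_time_bix (PARTITION sales_1_2_2002 TABLESPACE example,
                   PARTITION sales_mar_2002 TABLESPACE example));
```

If no placement is specified, as in the tutorial's own scripts, the new index partitions are colocated with their table partitions and inherit the partition names.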
Steps:
| 1.1 |
Split the most recent partition by using the default placement rules. |
| 1.2 |
Split a partition by using the extended SQL syntax for local index maintenance. |
| 1.3 | Clean up. |
Examine this scenario: After successfully loading the data for the first quarter of 2002, you recognize that due to changing business requirements the query pattern has changed. Instead of being focused mostly on quarterly analysis, many
business users have started to rely on monthly reporting and analysis.
To address this changed business requirement and to optimize query performance, you can leverage Oracle Partitioning to split the most recent quarter into monthly partitions. This must be accomplished without any impact on online availability.
The online availability for local index maintenance will not be demonstrated explicitly. Online availability is demonstrated for global index maintenance and works in exactly the same manner for local indexes.
You need to create a new, empty partition. You can either create the new partition with a distinct upper boundary or choose the keyword MAXVALUE. The latter option ensures that records violating the potential upper boundary condition are not rejected and the INSERT operation succeeds.
In this business scenario, a SPLIT PARTITION is issued after the loading operation to identify any potential violations. All records violating the upper boundary will be "separated" into an extra partition.
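That combination of an open-ended partition and a subsequent split can be sketched as follows. The split boundary and the name of the overflow partition are illustrative assumptions:

```sql
Rem Create the new quarter's partition open-ended, so rows beyond the
Rem expected upper boundary do not cause the load to fail.
ALTER TABLE sales
ADD PARTITION sales_q1_2002 VALUES LESS THAN (MAXVALUE);

Rem ... load the Q1-2002 data ...

Rem After loading, restore the intended boundary and separate any
Rem boundary-violating rows into their own partition for inspection.
ALTER TABLE sales
SPLIT PARTITION sales_q1_2002
AT (TO_DATE('01-APR-2002','DD-MON-YYYY'))
INTO (PARTITION sales_q1_2002, PARTITION sales_beyond_q1_2002)
UPDATE INDEXES;
```

Reusing the original partition name for the first resulting partition keeps the naming scheme stable while the overflow partition isolates the violations.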
To do this, perform the following steps:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the following SQL statement to split off the most recent month (March 2002) from the quarterly partition, including local index maintenance. You can run the script split1_10g.sql to accomplish this:
@c:\wkdir\split1_10g.sql
PROMPT Now leverage the new functionality and break down the existing
Rem note that an index location is not specified, so they will be colocated with the partitions
Rem the name will be inherited by the partition name
ALTER TABLE sales SPLIT PARTITION sales_q1_2002
AT (TO_DATE('01-MAR-2002','DD-MON-YYYY'))
INTO (PARTITION sales_1_2_2002 TABLESPACE example,
PARTITION sales_MAR_2002 TABLESPACE example NOCOMPRESS)
UPDATE INDEXES;
|
| 2. |
You can see that the new index partitions are colocated with the table partitions and that the index partition naming is inherited from the partition naming. Run the script see_split.sql to view the partition information:
@c:\wkdir\see_split.sql
COL segment_name format a25
COL partition_name format a25
COL tablespace_name format a25
Rem you will see that the newly created index segments
Rem Also note the automatically derived naming
SELECT segment_name, partition_name, tablespace_name
FROM user_segments
WHERE segment_type='INDEX PARTITION'
AND segment_name IN
(SELECT index_name
FROM user_indexes
WHERE table_name='SALES');
...
|
Split the remainder of the former quarter partition into a January and February partition. For demonstration purposes, create one of the new partitions in the SYSAUX tablespace and name some of the indexes explicitly.
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the following SQL statement to split the remainder partition, including local index maintenance. You can run the split2_10g.sql script to accomplish this:
@c:\wkdir\split2_10g.sql
ALTER TABLE sales SPLIT PARTITION sales_1_2_2002
|
| 2. |
You can see that the new index partitions are colocated with the table partitions and that the index partition naming is inherited from the partition naming. Run the see_split2.sql script to view the partition and segment information:
@c:\wkdir\see_split2.sql
SELECT segment_name, partition_name, tablespace_name
|
Perform cleanup operations. Move the partition out of the SYSAUX tablespace back into the EXAMPLE tablespace, and restore the standard naming conventions.
|
From a SQL*Plus session logged on to the SH schema, run the cleanup_split_10g.sql script to move the partition and update the indexes:
@c:\wkdir\cleanup_split_10g.sql
PROMPT bring them back in shape (into tablespace EXAMPLE)
PROMPT Let's move the last cowboy back to EXAMPLE ..
ALTER INDEX sales_time_bix REBUILD PARTITION feb_02 TABLESPACE example;
PROMPT no index structures outside EXAMPLE anymore
SELECT segment_name, partition_name, tablespace_name
FROM user_segments
WHERE segment_type='INDEX PARTITION'
AND segment_name IN (SELECT index_name
FROM user_indexes
WHERE table_name='SALES')
AND tablespace_name <> 'EXAMPLE';
|
Global Index Maintenance enables you to keep the global indexes of a partitioned table up to date as part of any atomic partition maintenance operation, so no global index is left unusable and index usage is unaffected while the maintenance operation takes place.
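The mechanism comes down to a single clause on the partition maintenance command. A minimal sketch using the table names from this tutorial (the tutorial's actual exchange script appears in the steps that follow):

```sql
Rem With UPDATE GLOBAL INDEXES, the global index SALES_PK is maintained
Rem as part of the atomic exchange and remains USABLE throughout.
Rem Without the clause, the exchange would mark it UNUSABLE.
ALTER TABLE sales
EXCHANGE PARTITION sales_mar_2002
WITH TABLE sales_mar_2002_temp
INCLUDING INDEXES
UPDATE GLOBAL INDEXES;
```

The trade-off is that the DDL takes longer, because index maintenance happens inside the same atomic operation rather than being deferred to a rebuild.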
Steps:
Exchange the March data into the partitioned table in the presence of a global index. First, you have to build the necessary infrastructure:
|
From a SQL*Plus session logged on to the SH schema, run the prep4_global_index.sql script to prepare for global index maintenance:
@c:\wkdir\prep4_global_index.sql
ALTER TABLE sales TRUNCATE PARTITION sales_MAR_2002;
Rem control ...
Rem is empty
SELECT COUNT(*)
FROM sales PARTITION (sales_MAR_2002);
ALTER TABLE sales_mar_2002_temp COMPRESS;
ALTER TABLE sales_mar_2002_temp NOCOMPRESS;
CREATE BITMAP INDEX sales_prod_mar_2002_bix
ON sales_mar_2002_temp (prod_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_cust_mar_2002_bix
ON sales_mar_2002_temp (cust_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_time_mar_2002_bix
ON sales_mar_2002_temp (time_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_channel_mar_2002_bix
ON sales_mar_2002_temp (channel_id)
NOLOGGING COMPUTE STATISTICS ;
CREATE BITMAP INDEX sales_promo_mar_2002_bix
ON sales_mar_2002_temp (promo_id)
NOLOGGING COMPUTE STATISTICS ;
|
To demonstrate the global index maintenance functionality, you first need to create a global index. To do this, perform the following steps:
| 1. |
From a SQL*Plus session logged on to the SH schema, run the create_global_index.sql script to create a concatenated unique index on the SALES table:
@c:\wkdir\create_global_index.sql
CREATE UNIQUE INDEX sales_pk
ON sales (prod_id, cust_id, promo_id, channel_id, time_id)
NOLOGGING COMPUTE STATISTICS;
This may take up to a minute.
|
| 2. |
Build a constraint leveraging this index by running the add_sales_pk.sql script: @c:\wkdir\add_sales_pk.sql ALTER TABLE sales ADD CONSTRAINT sales_pk PRIMARY KEY (prod_id, cust_id, promo_id, channel_id, time_id) USING INDEX; |
| 3. |
Note that if a constraint is defined using the global index, the same constraint must be defined for the table to be exchanged as well! Run the add_salestemp_pk.sql script to accomplish this: @c:\wkdir\add_salestemp_pk ALTER TABLE sales_mar_2002_temp |
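The script excerpt above is truncated; a matching constraint on the stand-alone table would presumably look like the following sketch. The constraint name is a hypothetical choice, and the column list mirrors the SALES_PK definition from the previous steps:

```sql
Rem Hypothetical constraint name; the key columns must match SALES_PK
Rem exactly, or the EXCHANGE PARTITION validation will fail.
ALTER TABLE sales_mar_2002_temp
ADD CONSTRAINT sales_mar_2002_temp_pk
PRIMARY KEY (prod_id, cust_id, promo_id, channel_id, time_id)
USING INDEX;
```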
To demonstrate the impact of a partition maintenance operation on concurrent online access, you need TWO sessions and therefore TWO windows. Please read the following section CAREFULLY before you begin.
In Window One, you perform the following steps:
|
From a SQL*Plus session logged on to the SH schema, execute the following query. You can run the use_global_index.sql script to create an explain plan and view the information:
@c:\wkdir\use_global_index.sql
Rem STATEMENT WHICH USES GLOBAL INDEX - Window 1
EXPLAIN PLAN FOR
SELECT /*+ INDEX(sales, sales_pk) */ count(*)
FROM sales
WHERE prod_id BETWEEN 100 AND 500;
set linesize 140
SELECT * FROM TABLE(dbms_xplan.display);
After verifying from the plan that you are using the global index, process this statement over and over again. You can use the SQL*Plus command "r" to rerun the last statement, or you can run the run_select.sql file:
@c:\wkdir\run_select.sql
SELECT /*+ INDEX(sales, sales_pk) */ count(*)
FROM sales
WHERE prod_id BETWEEN 100 AND 500;
While the query is running, perform the steps below in Window Two. You will see that there is no impact on concurrent query access using a global index when the partition maintenance operation takes place: the query will not fail. Note that the query result changes as soon as the partition exchange command succeeds. The Oracle database guarantees READ CONSISTENCY in this situation as well, providing the most efficient partition table and index maintenance operations without restricting online usage. |
In Window Two, perform the following steps:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the following query, or run the script exchange_partition_w_gim.sql:
@c:\wkdir\exchange_partition_w_gim.sql
set echo on
Rem *****
Rem EXCHANGE IT
Rem - with global index maintenance
Rem demonstrate influence of index on other queries in
Rem second window
Rem *****
ALTER TABLE sales EXCHANGE PARTITION sales_mar_2002
Although it is a DDL command, it might take some time because the global indexes are maintained as part of the atomic PARTITION EXCHANGE command.
|
| 2. | You will see that all indexes are still valid after the partition maintenance operation. Run the script show_sales_idx_status.sql: @c:\wkdir\show_sales_idx_status.sql
SELECT ui.index_name,
DECODE(uip.status,null,ui.status,uip.status) status,
count(*) num_of_part
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name,
DECODE(uip.status,null,ui.status,uip.status);
|
| 3. | View the information in the exchanged partition and the stand-alone table. Run the script count_mar_sales.sql:
@c:\wkdir\count_mar_sales.sql
SELECT COUNT(*) FROM sales PARTITION (sales_mar_2002);
SELECT COUNT(*) FROM sales_mar_2002_temp;
Thousands of rows were added to the partitioned table with this command and the stand-alone table is empty now. |
The new sales data for March 2002 is exchanged again. The global index was maintained as part of the PARTITION EXCHANGE command, so online usage was not affected.
Next, investigate the old behavior prior to Oracle9i without global index maintenance.
To demonstrate this functionality, you will need TWO windows. Please read the following section CAREFULLY before trying it yourself.
In Window One, you perform the following steps:
From a SQL*Plus session logged on to the SH schema, execute the following query. Run the script use_global_index.sql:
@c:\wkdir\use_global_index.sql
Rem STATEMENT WHICH USES GLOBAL INDEX - Window 1
explain plan for
SELECT /*+ INDEX(sales, sales_pk) */ count(*)
FROM sales
WHERE prod_id BETWEEN 100 AND 500;
set linesize 140
SELECT * FROM table(dbms_xplan.display);
SELECT /*+ INDEX(sales, sales_pk) */ count(*)
FROM sales
WHERE prod_id BETWEEN 100 AND 500;
Perform the steps in Window Two below and then execute the query above again to see the difference. It will fail as soon as the partition maintenance command is processed.
|
In Window Two, perform the following steps:
| 1. |
From a SQL*Plus session logged on to the SH schema, execute the following query. Run the script exchange_partition_wo_gim2.sql: @c:\wkdir\exchange_partition_wo_gim2.sql
Rem EXCHANGE IT
Rem - without global index maintenance
Rem demonstrate influence of index on other queries in
Rem second window
Rem *****
ALTER TABLE sales
EXCHANGE PARTITION sales_mar_2002
WITH TABLE sales_mar_2002_temp
|
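Once a global index has been invalidated by an exchange without UPDATE GLOBAL INDEXES, it must be rebuilt before index-driven queries can use it again. A minimal sketch, assuming the SALES_PK index created in the earlier steps:

```sql
Rem Confirm the index was marked UNUSABLE by the exchange ...
SELECT index_name, status
FROM user_indexes
WHERE table_name = 'SALES'
AND index_name = 'SALES_PK';

Rem ... and rebuild it to make it usable again.
ALTER INDEX sales_pk REBUILD NOLOGGING;
```

This rebuild is exactly the maintenance cost that the UPDATE GLOBAL INDEXES clause, demonstrated earlier, lets you avoid.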
| 2. | The global index is now marked unusable. Run the script show_sales_idx_status.sql to view this information. @c:\wkdir\show_sales_idx_status.sql
SELECT ui.index_name,
DECODE(uip.status,null,ui.status,uip.status) status,
count(*) num_of_part
FROM user_ind_partitions uip, user_indexes ui
WHERE ui.index_name=uip.index_name(+)
AND ui.table_name='SALES'
GROUP BY ui.index_name, DECODE(uip.status,null,ui.status,uip.status);
|
To clean up your environment, perform the following step:
|
From a SQL*Plus session logged on to the SH schema, execute the following statements to clean up the module-specific modifications. Run the script cleanup_mod1.sql:
@c:\wkdir\cleanup_mod1.sql
Rem drop unique index sales_pk. you do not need it, and it
Rem Cleanup - bring data back
Rem Necessary - that way you do not have to check for it at the end - top goal is no
Rem interference with the original SH schema ...
Rem - data for 1998
Rem - no data for 2001
ALTER TABLE sales EXCHANGE PARTITION sales_q1_1998 WITH TABLE sales_old_q1_1998 INCLUDING INDEXES;
ALTER TABLE sales DROP PARTITION sales_jan_2002;
ALTER TABLE sales DROP PARTITION sales_feb_2002;
ALTER TABLE sales DROP PARTITION sales_mar_2002;
DROP TABLE sales_mar_2002_temp;
DROP TABLE sales_delta;
DROP TABLE sales_old_q1_1998;
PROMPT original situation again
Rem SELECT COUNT(*) FROM sales PARTITION (sales_q1_2001);
Rem SELECT COUNT(*) FROM sales PARTITION (sales_q1_1998);
PROMPT just to be safe ... - cleanup module
set serveroutput on
exec dw_handsOn.cleanup_modules
PROMPT control whether the correction script was applied properly ..
SELECT * FROM TABLE(dw_handsOn.verify_env)
Please ensure that the environment is properly reset. An environment that is not "clean" may affect the usage or run-time behavior of other tutorials. |
In this tutorial, you learned how to:
| Load data using external tables |
| Compare the usage of SQL*Loader to external tables |
| Perform table compression to save disk space |
| Perform a rolling window operation using Oracle Partitioning |