Implementing Information Lifecycle Management
Using Oracle Database 10g
Purpose
The goal of this tutorial is to show you how to manage data
during its lifetime using the Information Lifecycle Management strategy with
Oracle Database 10g.
Place
the cursor over this icon to load and view all the screenshots for this tutorial.
(Caution: This action loads all screenshots simultaneously, so response time
may be slow depending on your Internet connection.)
Note: Alternatively, you can place the cursor over an individual icon in the following steps to load and view only the screenshot associated with that step. You can hide an individual screenshot by clicking it.
Overview
Information Lifecycle Management (ILM) is concerned with everything
that happens to data during its lifetime.
Although most organizations have long regarded their stores
of data as one of their most valuable corporate assets, how this data is managed
and maintained varies enormously. Originally, data was used to help achieve
operational goals, run the business and help identify the future direction and
success of the company. However new government regulations and guidelines, such
as Sarbanes-Oxley, HIPAA, DOD5015.2-STD in the US and the European Data Privacy
Directive in the European Union, are a key driving force in how and why data
is being retained, as they are now requiring organizations to retain and control
information for very long periods of time.
Information today comes in a wide variety of types, for example
it could be an email message, a photograph or an order in an Online Transaction
Processing System. The challenge now before all organizations, is to understand
how their data evolves, determine how it grows, monitor how its usage changes
over time, and decide how long it should survive
Today there are two additional objectives IT managers are
trying to satisfy: to store vast quantities of data for the lowest possible
cost; and to meet the new regulatory requirements for data retention and protection.
Oracle Database 10g is Ready for Business ILM
Oracle Database 10g is ready today for Business ILM.
It is capable of storing many different types of data, and storing all of your
data in an Oracle database means that it is much easier to manage, because the
data is in one place, instead of being stored on multiple databases using many
different formats. The Oracle Database is the ideal platform to implement an
Information Lifecycle Management policy because it has a number of unique features
that make it very easy to implement an ILM solution, such as:
Fine-grained:
Managing data at individual
row level
Application
Transparency:
Data classification is transparent
Enforceable
Compliance
Policies:
Enforces data retention,
immutability, privacy, auditing and expiration
Low Cost:
Uses low cost storage options
to reduce operating costs
Implement ILM in Four Easy Steps
ILM can be implemented using Oracle Database 10g by
following these four simple steps:
1.
Define the data classes
2.
Create storage tiers for the data classes
3.
Manage access and migration of data by class
4.
Define and enforce compliance policies
Data can be classified in a variety of ways, such as by frequency
of usage, and storage tiers can be created for the data classes you have identified.
During the lifetime of the data, it might migrate between various data classes
and access to the data controlled in different ways.
Oracle Database 10g is ideal for implementing ILM.
It is simple to use because there are no specialized data stores to manage and
it operates independent of any hardware. It has proven fast performance, meaning
information is quickly retrieved. Oracle Database 10g provides an intelligent
central location to store data and enforce compliance policies. The security
features in the database ensure that data is secure from unauthorized access
and data is always transactionally consistent. The Oracle Database provides
total flexibility and it can rapidly adapt to any change in requirements, an
extremely important consideration due to the continuing evolution of the emerging
regulations. When using Oracle Database 10g, a comprehensive ILM solution
can be implemented for the lowest possible cost.
This tutorial shows how a number of the features in the Oracle
Database can be used to implement an Information Lifecycle Management strategy.
During this tutorial, you create storage tiers for each of the data classes
you define. You migrate data between storage tiers, protect the data from changes
and control which data users are allowed to see.
In this tutorial, you are implementing ILM for a fictional
company that must retain all sales orders placed since 1995. Users are only
allowed to see data from the year 2000 onwards and only a special category of
user is allowed to view the historical data.
The first step in defining data classes is to analyze the
data to determine the most suitable data classes for that data. Perform the
following steps to see how the sales orders data is currently classified:
1.
Open a browser window and enter the following URL:
http://<your host name>:1158/em
Login as system/oracle and click Login.
If the Oracle Database 10g Licensing Information
page is displayed, scroll down to the bottom of the page and click I
agree.
2.
Click Administration tab.
3.
Scroll down to the Schema region.
4.
Under Database Objects, click Tables.
5.
Enter SH for the schema and SALES for
the Object Name and click Go.
6.
Make sure SALES is selected and click Edit.
7.
Select the Partitions tab.
8.
The partitioning method shows that the data has been
classified by date using the column TIME_ID. The partition names and ranges
show the data has been grouped by years, half years, or quarters, depending
on how old the data is. Classifying data by time is a very common method
and in this case is the most suitable method. Click the Database Instance
breadcrumb.
You next need to identify where all of your data is to be
stored. Typically this involves using different types of storage devices such
as high performance disks for frequently accessed data and low-cost devices
for infrequently used data and historical data. In this tutorial, you create
three storage tiers: High Performance, Low Cost and Online Archive. Because
you do not have three types of storage devices available, you create tablespaces
that map to the tiers instead.
On the Administration page, under Storage, select Tablespaces.
2.
Click Create to create a new tablespace.
3.
Enter HIGH_PERFORMANCE as the Name. In the Datafiles
section, click Add.
Note: For the HIGH_PERFORMANCE
tier the data files would typically be placed on a high performance storage
device or in ASM. Because those storage options are not available for
this tutorial, the data file is stored in the same location as the other
data files for the database.
4.
The data file storage defaults to the ORADATA\ORCL
directory. The exact path may be different on your machine.
Enter high_perf for the File Name and specify
3 MB for the File Size. Then click Continue.
5.
Click OK to finish creating the tablespace.
6.
Now you can create the next storage tier tablespace.
Click Create. You repeat the process to create the Low_Cost Storage
Tier.
7.
Enter LOW_COST as the Name. In the Datafiles
section, click Add.
Note: For the LOW_COST
storage tier the data files would typically be placed on low cost ATA
storage devices. Because those devices are not available in this example,
this tutorial stores the data files in the same location as the other
data files for the database.
8.
Enter low_cost for the File Name and specify
10 MB for the File Size. Under Storage, click Automatically
extend datafile when full (AUTOEXTEND) and enter 10KB for the
increment. Then click Continue.
9.
Click OK to finish creating the tablespace.
10.
The final tier is the ONLINE_ARCHIVE
tier where all of the orders prior to the year 2000 are held. Click Create
to create the tablespace for this storage tier.
11.
Enter ONLINE_ARCHIVE as the Name. In the Datafiles
section, click Add.
Note: For the Online Archive tier the data files would
typically be placed on low cost ATA storage devices. Because those devices
are not available in this example, this tutorial stores the data files
in the same location as the other data files for the database.
12.
Enter online_arch for the File Name and specify
10 MB for the File Size. Then click Continue.
13.
Click OK to finish creating the tablespace.
14.
You now see the three tablespaces which correspond to
each Storage Tier listed on the Tablespaces page. Notice that the tablespaces
are currently empty. In a real ILM implementation, the tablespaces would
not have to be named to correspond to the required storage tiers. The
tablespace approach has been used in this tutorial to illustrate the process
because three different types of actual storage devices are unavailable.
Click the Database Instance breadcrumb.
Once the Storage Tiers have been created, the next step is
to assign the partitions holding the data to the appropriate storage tiers.
1.
In the Schema region, under Database Objects, select
Tables.
2.
Enter SH for Schema and SALES for Object
Name. Then click Go.
3.
Select SALES and then click Edit.
4.
Select Partitions tab.
5.
Assign the
HIGH_PERFORMANCE partitions first. The High Performance storage
tier should only contain orders for Q3 to Q4 for 2003. Select the last
set of pages in the drop-down list box.
6.
Select the SALES_Q3_2003 partition, select Move
for the Action and click Go.
7.
Click the flash light icon next to the Tablespaces field.
8.
Select the HIGH_PERFORMANCE tablespace and click
Select.
9.
Click OK to assign the partition to the HIGH_PERFORMANCE
tablespace.
10.
Select the SALES_Q4_2003 partition, select Move
for the Action and click Go.
11.
Enter HIGH_PERFORMANCE in the Tablespace field
and click OK.
12.
Click Apply to commit your changes.
13.
Now you can assign the partitions from Q4 2000
to Q2 2003 to the LOW_COST tablespace. Start by selecting the
SALES_Q1_2002 partition. Then select the Action Move and
click Go.
14.
Enter LOW_COST in the Tablespace field and click
OK.
15.
Repeat steps 13 and 14 for the following partitions and assign all of
them to the LOW_COST
tablespace. After clicking OK for the
last partition, select Previous 10.
The third step in implementing your ILM strategy is to decide
who can access your data and what they can do. There are a variety of different
techniques that can be used to restrict access to the historical data so that
it is not always visible in routine reports. In addition, you can prevent changes
to the historical data and illustrate how to store all of this data using the
minimum storage requirements. Perform the following tasks:
Because all the data is stored in the database, it is necessary
to ensure that data is only viewed when it is really required. This reduces
query execution times and saves system resources.
There are 2 techniques that can be used to restrict access
to the historical data:
Using views
Using a Virtual Private Database (VPD) policy
This section focuses on using views. The use of VPD policies
is demonstrated in a later section.
In this section, you create two views, one that allows standard
users to access data from January 1, 2000 and another view that allows access
to the historical data. Perform the following steps:
1.
From the Administration tab, in the Schema region under
Databsae Objects, click Views.
2.
First you create a view called SALES_HISTORICAL
that only allows access to orders that were placed before December 31,
1999.
Click Create.
3.
Enter SALES_HISTORICAL as the Name, SH
as the Schema and in the Query Text field enter the following text:
SELECT * FROM sales WHERE time_id < '1-Jan-2000'
Click OK.
4.
Now create a second view named SALES_CURRENT
that allows access to all the sales data after January 1, 2000.
Click Create.
5.
Enter SALES_CURRENT as the Name, SH as
the Schema and in the Query Text field enter the following text:
SELECT * FROM sales WHERE time_id > '31-Dec-1999'
Click OK.
6.
The views have been created and are listed at the bottom
of the Views page. The SALES_CURRENT
view is for the recent data and the SALES_HISTORICAL
view is for the old orders.
Now you create the SH_USER
user who uses these views. Click the Database Instance breadcrumb.
To demonstrate that the view SALES_HISTORICAL
only displays data prior to the year 2000, query the SALES_HISTORICAL
view to report the most recent data it holds. This can be done by querying the
TIME_ID column. Perform
the following steps:
1.
Start SQL*Plus for Windows by clicking Start,
selecting All Programs, selecting your Oracle Home installation,
the Application Development group of programs, and then selecting
SQL*Plus.
Enter sh for the User Name and Password, then
click OK.
2.
Enter the following command:
@C:\wkdir\ilm_query01
The ilm_query01.sql
script contains the following text:
SELECT MAX(time_id)FROM sh.sales_historical;
3.
Because this view only contains data up to the end of
the year 1999, you can now easily report which customers purchased the
most products up to the year 2000. Execute the following command in SQL*Plus:
@C:\wkdir\ilm_query02
The ilm_query02.sql
script contains the following text:
COLUMN max_sales FORMAT $999999.99
SELECT cust_last_name, SUM(amount_sold) MAX_SALES
FROM sh.sales_historical h, sh.customers c
WHERE c.cust_id = h.cust_id
GROUP BY cust_last_name
ORDER BY max_sales DESC;
In order to comply with some new regulations, you need to
show that only authorized users can update the sales data. Therefore, you need
to create a new user called SH_USER
to be used by all applications that access the sales data. The SH_USER
user is only permitted to read the sales data. Perform the following steps:
1.
In the Enterprise Manager window, from the Administration
tab, scroll down to the Schema region.
2.
Under Users & Privileges, click Users.
3.
The first step is to create a new user SH_USER.
Click Create.
4.
Enter SH_USER for the Name and Password. Confirm
the password by entering SH_USER
again. Enter EXAMPLE for the Default Tablespace and TEMP
for the Temporary Tablespace, then click the Roles tab.
5.
Only limited resources are granted to new users by default.
Click Edit List to grant additonal privileges to the SH_USER.
6.
Select RESOURCE from the list of Available Roles
and click Move.
7.
Click OK.
8.
To ensure that this user can only read the sales and
customer data, they must be explictly granted the SELECT privilege on
those objects. Click the Object Privileges tab.
9.
Select Table from the Select Object Type drop-down list
and click Add.
10.
In the Select Table Objects field, enter SH.SALES,
SH.SALES_HISTORICAL, SH.CUSTOMERS. In the list of Available Privileges
click Select. Click Move to indicate you want to grant this
privilege, then click OK.
11.
The Create User page now shows the object privileges
that will be granted to the user. Click OK to create the user and
grant the privileges to the new user.
12.
The SH_USER
user has been created. Click the Database Instance breadcrumb.
Some new regulations require that historical data cannot be
changed or there may be a business requirement where records may not be altered
after a certain date. Using the security features within the Oracle Database,
user SH_USER does not have sufficient privilege to update the historical data.
Perform the following steps to test this functionality:
1.
Switch back to the SQL*Plus window and execute the following
SQL commands:
connect sh_user/sh_user
@C:\wkdir\ilm_update01
The ilm_update01.sql
script contains the following text:
UPDATE sh.sales SET prod_id=12 WHERE prod_id=13 AND time_id='10-Jan-98';
There will come a point during the lifetime of the data when
updates to the data will no longer be allowed. In the prior section, you saw
how database security can be used to prevent changes, but an even better approach
is to make the tablespace to read-only. Once a tablespace is read-only, absolutely
no one can update the data, even users with UPDATE
or DELETE privileges
on the sales tables. Perform the following steps:
1.
Switch back to Enterprise Manager Database Control.
From the Administration tab, under Storage, click Tablespaces.
2.
Select the ONLINE_ARCHIVE tablespace, select the Action
Make Readonly and click Go.
3.
The clipboard icon in the status column indicates that
this tablespace is now set to read-only. Click on ONLINE_ARCHIVE.
4.
On the View Tablespace page, the Status value confirms
that the tablespace is read-only. Click the Database
Instance breadcrumb.
There may come a time during the lifetime of the data that
it is no longer needed. Some organizations may prefer to archive the data rather
than actually delete it. You have decided to extract the data in XML so that
if there was ever a need to reload the data back into the database, the format
used at extraction time is described within the data itself. Oracle Database
10g provides a number of XML functions that can be embedded in a SQL statement
to extract data in XML format.
Perform the following steps:
1.
From your SQL*Plus window, execute the following script:
@C:\wkdir\ilm_archive_sales
The ilm_archive_sales.sql script contains the following text:
connect sh/sh SELECT XMLELEMENT("Sales", XMLElement ("timeid", time_id ), XMLElement ("prodid", prod_id ), XMLElement ("Customer", cust_id ), XMLElement ("Channelid", channel_id ), XMLElement ("Promotion", promo_id ), XMLElement ("amount", amount_sold) , XMLElement ("quantity", quantity_sold) ) AS "RESULT" FROM sales WHERE time_id > '1-Dec-01' AND time_id <'1-Jan-02';
Oracle provides an intelligent central location to store data
and enforce compliance policies. There are five elements to a compliance policy:
retention, immutability, privacy, auditing and expiration. In this section you
are shown how some of these elements can be implemented. Perform the following
tasks:
With so much data being retained in the database it is important
to ensure that users only view the data they are allowed to see. Although views
restrict what a user can see in the database, if a user does not use the designated
application to query the data and instead uses a tool such as SQL*Plus, the
user can bypass the view and query the table directly. A security policy can
be defined using Virtual Private Database (VPD) to avoid this potential breach
of security. This security policy defines exactly what data is visible to the
user performing the query regardless of how the data is accessed. In this section,
you use VPD to prevent access to the historical data.
Perform the following steps:
1.
First, count the number of rows visible in the SALES
table as the SH
user. Make a note of returned row count. The SH
user currently has ccess to all the data in the SALES
table. From your SQL*Plus window, run the following SQL script:
@C:\wkdir\ilm_count_sales
The ilm_count_sales.sql
contains the following:
connect sh/sh SELECT COUNT(*) FROM sales;
2.
Next, as the user SYSTEM, add the following VPD Policy
which states that a user can only see data from 1-Jan-2000. From your
SQL*Plus session, run the following SQL script:
@C:\wkdir\ilm_add_policy
The ilm_add_policy.sql
contains the following:
connect system/oracle CREATE OR REPLACE FUNCTION ilm_pf1 (oowner IN VARCHAR2, ojname IN VARCHAR2)
RETURN VARCHAR2 AS con VARCHAR2(200); BEGIN con :='time_id > ''31-DEC-1999'''; RETURN (con); END ilm_pf1; / BEGIN DBMS_RLS.ADD_POLICY (object_schema=>'SH',object_name=>'Sales', policy_name=>'ilm_current_data',function_schema=>'system', policy_function=>'ilm_pf1', sec_relevant_cols=>'time_id'); END; /
3.
Now execute the same the query aginst the SALES
table as the SH
user as you did in Step 1. Fewer rows are returned because now only data
from January, 1 2000 and onwards is available to this user.
From your SQL*Plus window, run the following SQL commands:
To comply with some of the new security regulations, certain
data must be encrypted. Oracle Database 10g Release 2 has the ability
to encrypt individual columns in a table. In this section, you add a new table
to store credit card details for a customer. However, the actual credit card
number is stored in an encrypted form in the database operating system files.
Before using Transparent Data Encryption, you must create
a wallet. Column encyrption uses Oracle wallets to safely store the master encryption
keys. For greater security, you create a new wallet for storing the data encryption
master keys rather than use the default database wallet. Perform the following
steps
1.
First, you need to update your sqlnet.ora
file to include an entry for ENCRYPTED_WALLET_LOCATION.
Click the Start button on your Windows
desktop, click Run... and in the Open
field type in notepad. Then click OK.
2.
In the Notepad application, choose File
from the menu, then Open.
Use the Look In drop-down menu to navigate to your Oracle Home directory.
Make note of this directory path. Then go to the NETWORK/ADMIN sub-directory.
Change the Files of Type selection to All Files, then select sqlnet.ora
from trom the list of displayed files and click Open.
Add the following text to this file, using the
path name of the Oracle Home directory on your system. If the file is
not empty, then add this entry to the end of the file:
ENCRYPTION_WALLET_LOCATION=
(SOURCE=(METHOD=FILE)(METHOD_DATA=
(DIRECTORY=<Oracle Home Directory Path>)))
Save your changes and quit the application.
3.
From the notepad menu, choose File
then Save. Then close the application
window.
4.
You need to make sure your listener is started. Click
the Start button on your Windows desktop,
click Run... and in the Open field type
in cmd. Then click OK.
When the cmd.exe
application window opens, type in the following command at the prompt:
lsnrctl status
The listener should be started, as shown above. If it
is not, execute the following command:
lsnrctl start
5.
Next, open the wallet and generate the master encryption
key. Because a wallet does not yet exist for this database, setting the
master key automatically creates a new wallet at the location specified
in the sqlnet.ora file.
Open a SQL*Plus window. Login in as the SYSTEM
user, using the password oracle. Then,
execute the following command:
@C:\wkdir\ilm_encrypt_setup
The ilm_encrypt_setup.sql script generates the database server master encryption key and sets the Oracle wallet password using the following commands:
connect / as sysdba
ALTER SYSTEM SET KEY IDENTIFIED BY "welcome1";
The master encryption key is
necessary because each table has its own encryption key. These column
keys are stored in the database. Since the wallet can only store a limited
number of keys, the column keys are encrypted with the master key. This
way, you can have as many column keys as needed, with only a small number
of master keys stored in the wallet. By default, the command shown above
generates a key using the Advanced Encryption Standard with 192 bits (AES192).
You only need to generate a master key once. You
should only generate a new master key if the existing one is comprised.
Once the wallet has been opened, it remains open
as long as the database instance is open. When you shutdown and restart
the database instance, instead of generating a new master key, you reopen
the Oracle wallet to access the stored master keys using the following
command:
ALTER SYSTEM SET WALLET OPEN IDENTIFIED BY "welcome1";
Note: Only users with the ALTER SYSTEM privilege can open the wallet; users without proper privileges are not able to use the key, because it is not available to them.
6.
Next you create a table that contains a column to store
encrypted credit card information. From your SQL*Plus session, execute
the following script:
Some of the regulations also require that you audit all changes
made to your data. Fine Grained Access control provides the capability to audit
data access at a very fine level of detail and invoke a procedure when a security
violation occurs. Perform the following steps:
1.
First create a policy that audits changes made to a
credit card number. In theSQL*Plus window, run the following script:
With the policy enabled, you can now insert a row into
the credit card table. An audit record is automatically written to the
dictionary view DBA_FGA_AUDIT_TRAIL.
From your SQL*Plus window, run the following script: