As Published In
Oracle Magazine
November/December 2013

TECHNOLOGY: Database Cloud

  

Automatic Data Optimization

By Arup Nanda Oracle ACE Director

 

Learn how to enable information lifecycle management to automatically move data to lower-cost storage tiers and compress it.

John, the principal database architect at Acme Bank, sips his coffee while listening to his visitors’ woes. The storage demand for databases at Acme is rising rapidly, due to the seemingly endless stream of data points. Because data—however old—can be good for marketing analytics, no one wants to purge old data, though. Although Acme does have an information lifecycle management (ILM) policy to put older and infrequently accessed data on lower-cost storage, the implementation has not been very effective, and unclear identification of “infrequently accessed” data is the primary reason. Everyone agrees on the time period during which the lack of access defines infrequently accessed data but not the exact mechanics of identifying it. The standard approach for identifying the data is to put a trigger on the tables to flag data that hasn’t changed, but Acme’s DBAs eschew such an approach, because it negatively affects performance. Additionally, everyone agrees that an automated system to move old, unchanged data to lower-cost storage would be ideal. Is there, all want to know, a better approach to achieving Acme’s ILM objectives?

There is, responds John: using the Automatic Data Optimization feature of Oracle Database 12c.

Setup

Acme deals with several currencies, including the British pound, the euro, and the Canadian dollar, in addition to its primary currency, the US dollar. It’s vital to record the exact exchange rate of the currencies. The rate can change every day, so the bank stores the rates along with specific dates. Because financial transactions can be made effective as of a different date, past records of exchange rates must be stored and available. The following is what the EXCHANGE_RATE table looks like: 

SQL> desc exchange_rate

NAME        TYPE
——————————  ————————————
FROM_CURR   VARCHAR2(3)
TO_CURR     VARCHAR2(3)
START_DATE  DATE
END_DATE    DATE
RATE        NUMBER


The FROM_CURR and TO_CURR columns show the codes of the source and target currencies, with the RATE column showing the exchange rate. Because the rate changes, the START_DATE and END_DATE columns show the dates during which that rate is effective. The table is partitioned by range on END_DATE, with one partition per month. The partitions are named in the format YyyMmm, where yy is the two-digit year and mm is the two-digit month. For example, the partition y13m12 holds all data for December 2013. The table and its data can be created by the SQL script in the Online Setup Listing. All the partitions of the table exist in the LATEST_TS tablespace. There is another tablespace—MIDTERM_TS—built on lower-cost storage.

Online Setup Listing

connect sys/oracle@localhost:1522/ANL2 as sysdba

create tablespace latest_ts datafile 
‘C:\ORACLE\ORADATA\ANL2\LATEST_TS_01.DBF’ size 111M autoextend on next 1m;

create tablespace midterm_ts datafile 
‘C:\ORACLE\ORADATA\ANL2\MIDTERM_TS_01.DBF’ size 111M autoextend on next 1m;

grant create session, create table to forex identified by forex
/
alter user forex quota unlimited on latest_ts
/
alter user forex quota unlimited on midterm_ts
/

connect forex/forex@localhost:1522/ANL2
create table exchange_rate (
   from_curr       varchar2(3),
   to_curr         varchar2(3),
   start_date      date,
   end_date        date,
   rate            number
)
tablespace latest_ts
/

declare
	l_rand	number;
begin
	for d in 1..365 loop
		l_rand := dbms_random.value(1,2);
		insert into exchange_rate values
		(‘USD’,’GBP’,sysdate-d-1,sysdate-d,l_rand);
		insert into exchange_rate values
		(‘USD’,’CAD’,sysdate-d-1,sysdate-d,l_rand);
		insert into exchange_rate values
		(‘USD’,’EUR’,sysdate-d-1,sysdate-d,l_rand);
	end loop;
	commit;
end;
/ 


Jill, the lead developer, has two objectives:

  1. When the free space in the LATEST_TS tablespace drops to a certain threshold, 15 percent of the total space allocated, the less-used partitions of the EXCHANGE_RATE table should be relocated to the MIDTERM_TS tablespace. It’s important, Jill cautions, that this relocation of partitions should be based on access, not on time. If an older partition is being accessed quite frequently, it should be left in LATEST_TS. Similarly, if a newer partition is accessed less frequently, it should be moved to MIDTERM_TS.

  2. The partitions that do not see their data modified should be compressed to save space. The choice of segments for compression, Jill further cautions, should be independent of the relocation to a different storage tier. For example, a partition can exist in the LATEST_TS tablespace but be compressed because the data there doesn’t get modified. Similarly, a partition may be moved to MIDTERM_TS because it is accessed less frequently, but because the access is mostly write activity, it may not be compressed.


Acme can satisfy both requirements by using Automatic Data Optimization, assures John. This feature, he cautions, is not available in a pluggable database in a multitenant environment, however, and it’s also not available in materialized views and tables with object types.

Heat Map

The first step in using Automatic Data Optimization, John explains, is to enable a new feature in Oracle Database 12c called Heat Map. Simply speaking, it shows the popularity or “hotness” of a segment such as a table or a partition, and it is used by Automatic Data Optimization to decide which segments are candidates for action. If the segment is accessed frequently, it’s considered hot and therefore may not be a candidate for some actions, such as relocation to lower-cost and less-efficient storage.

John enables the Heat Map feature by executing the following command as a DBA user: 

alter system set heat_map = on scope=both; 


It is a one-time activity. Using scope=both, John ensures that the parameter is set in SPFILE and therefore will be in effect when the database is restarted.

To check the heat map of a specific segment, he looks into the DBA_HEAT_MAP_SEGMENT view. The time stamps of the last activity on the table—UPDATE, SELECT, full table scan, and reference constraint lookup—are recorded in this view. He describes the columns of the view for his listeners:

Enabling Automatic Data Optimization

Before starting the Automatic Data Optimization demonstration, John checks a few things, including the presence of segments in the tablespaces, using the following SQL statement: 

select partition_name, 
       tablespace_name
from user_segments
where segment_name = 
  'EXCHANGE_RATE';

PARTITION   TABLESPACE
—————————   ——————————
Y13M07      LATEST_TS
… output truncated …
Y12M07      LATEST_TS 


John directs everyone’s attention to the output to confirm that all the partitions are in the LATEST_TS tablespace as expected. Then he checks the allocated and free space in the tablespace by running the script shown in Listing 1.

Code Listing 1: Checking for free space 

select t.tablespace_name, 100*sum(f.bytes)/sum(t.bytes) free_pct
from (select tablespace_name, sum(bytes) bytes 
         from dba_data_files group by tablespace_name) t,
     (select tablespace_name, sum(bytes) bytes 
         from dba_free_Space group by tablespace_name) f
where f.tablespace_name = t.tablespace_name
group by t.tablespace_name;

TABLESPACE  FREE_PCT
——————————— —————————
LATEST_TS      30.00
MIDTERM_TS     99.33 


The output in Listing 1 shows that only 30 percent of the LATEST_TS tablespace is free space, whereas the MIDTERM_TS tablespace—where no partitions are located—is mostly free, with 99.33 percent free space.

Next, John enables Automatic Data Optimization for the EXCHANGE_RATE table, by executing the following SQL command as the user FOREX (who owns the table): 

alter table exchange_rate ilm add policy tier to midterm_ts; 


In this command, John creates an ILM policy on the EXCHANGE_RATE table that tells Automatic Data Optimization that the second storage tier for the table is the MIDTERM_TS tablespace. Data should be relocated to this tablespace if the free space in the original tablespace drops below the default threshold level of 15 percent. Because the current tablespace is 30 percent free now, the ILM policy has no effect on the table—at least not yet—John explains.

As time passes, Acme DBAs will add more partitions to the table to hold new data, and Listing 2 shows the SQL statements they will use to add the partitions for October, November, and December 2013, named Y13M10, Y13M11, and Y13M12, respectively. To simulate the storage volume growth for that time, John inserts some rows that will go into these partitions, as shown in Listing 2.

Code Listing 2: Adding new partitions 

alter table exchange_rate add partition y13m10 
values less than (to_date('11/01/2013','mm/dd/yyyy')) tablespace latest_ts;
alter table exchange_rate add partition y13m11 
values less than (to_date('12/01/2013','mm/dd/yyyy')) tablespace latest_ts;
alter table exchange_rate add partition y13m12 
values less than (to_date('01/01/2014','mm/dd/yyyy')) tablespace latest_ts;
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,2), 
add_months(sysdate,2),1);
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,3), 
add_months(sysdate,3),1);
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,4), 
add_months(sysdate,4),1);
commit; 


With these additional partitions, John points out, the LATEST_TS tablespace is now more populated. He checks the free space again by executing the SQL shown in Listing 1 and reviewing the result: 

TABLESPACE  FREE_PCT
——————————  —————————
LATEST_TS       8.67
MIDTERM_TS     99.33 


The free space in LATEST_TS has now dropped to 8.67 percent—below the threshold of 15 percent—so the ILM policy should trigger the relocation of some segments from this tablespace to free up space. John waits for the result of the ILM policy and checks the location of the partitions. 

select partition_name, 
       tablespace_name
from user_segments
where segment_name = 
  'EXCHANGE_RATE';

PARTITION  TABLESPACE
—————————  ——————————
Y13M11     LATEST_TS
… output truncated …
Y12M08     LATEST_TS
Y13M07     MIDTERM_TS
Y13M06     MIDTERM_TS
Y12M12     MIDTERM_TS
Y12M07     MIDTERM_TS 


John directs everyone’s attention to the last four partitions—Y12M07, Y12M12, Y13M06, and Y13M07. Earlier, these partitions were located in the LATEST_TS tablespace, but now they are in MIDTERM_TS. This activity resulted in the freeing up of space in the LATEST_TS tablespace. This relocation of the partitions was done automatically without the intervention of the DBAs and only after the free space dropped below the threshold. Jill nods her approval. The DBAs appreciate that the relocation required nothing more than defining the ILM policy on the table. All in all, everyone is happy.

Checking the Heat Map

Jill observes that only a few partitions—not all—were relocated to the spillover tablespace. The segments for relocation, John explains, are based on the heat map of the segments, described earlier. The less recently the segment is accessed, the greater the chance that it will be relocated. John shows everyone the heat map of the EXCHANGE_RATE table, by using the SQL statement shown in Listing 3.

Code Listing 3: Checking the last access time for a table 

select
   subobject_name “Part Name”,
   to_char(segment_write_time,'mm/dd/yy hh24:mi:ss') write_time,
   to_char(segment_read_time,'mm/dd/yy hh24:mi:ss')  read_time,
   to_char(full_scan,'mm/dd/yy hh24:mi:ss')          fts_time
from dba_heat_map_segment
where owner = 'FOREX'
and object_name = 'EXCHANGE_RATE'
order by full_scan desc;

Part Name   WRITE_TIME         READ_TIME          FTS_TIME
—————————  ——————————————————  —————————————————  —————————————————
Y13M03      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y12M09      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M04      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y12M11      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M10      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y12M08      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M09      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M02      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M05      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M01      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y12M10      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M11      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 19:48:44
Y13M07      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 00:57:12
Y12M07      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 00:57:12
Y12M12      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 00:57:12
Y13M06      07/14/13 12:23:01  07/14/13 12:24:34  07/15/13 00:57:12 


Jill examines the output carefully and observes that for the partitions Y13M07, Y12M07, Y12M12, and Y13M06, the full table scan occurred at 07/15/13 00:57:12, compared to 07/15/13 19:48:44 for the rest of the partitions. These are exactly the same partitions relocated to the MIDTERM_TS tablespace by Automatic Data Optimization. “Was that why these partitions were candidates for relocation?” she asks.

Exactly, John confirms. When Automatic Data Optimization had to choose the least recently accessed segment in the LATEST_TS tablespace, it looked at the heat map of the objects in that tablespace and selected these partitions, because they were the least recently accessed.

Administration

The group—now suitably impressed—wants to know more about managing the ILM policies. “How can I check the threshold at which Automatic Data Optimization kicks in?” asks Jill. To see that and other parameters ILM uses, John looks at the DBA_ILMPARAMETERS view: 

select * from dba_ilmparameters;

NAME                  VALUE
———————————————————   —————
ENABLED                   1
JOB LIMIT                10
EXECUTION MODE            3
EXECUTION INTERVAL       15
TBS PERCENT USED         85
TBS PERCENT FREE         25 


The TBS PERCENT USED parameter specifies the percentage at which the tablespace is considered to be full. The value in this example is 85 percent, so the LATEST_TS tablespace was considered full when the free space earlier fell to 8.67 percent, John explains. This triggered Automatic Data Optimization’s relocation of segments to the next tier of storage. “When does it stop relocating?” asks Jill. This is where, John responds, another parameter—TBS PERCENT FREE—comes into play. The relocation of segments out of the tablespace continues until the free space percentage reaches this parameter value, 25 percent in this case. However, John cautions, this is just an estimate; Automatic Data Optimization does not guarantee that there will be that much free space.

“Suppose I want to change the 85 percent used to consider the tablespace full,” Jill muses. “Can I change it to, say, 90 percent?” Of course, John answers, and executes the following SQL to change the parameter to 90 percent: 

begin
 dbms_ilm_admin.customize_ilm (
   dbms_ilm_admin.tbs_percent_used,
   90);
end;
/ 


Likewise, John explains, all the properties listed in the DBA_ILMPARAMETERS view can be changed with the CUSTOMIZE_ILM procedure in the DBMS_ILM_ADMIN package.

Next Steps 


READ more about Automatic Data Optimization Oracle Database
 Oracle Database VLDB and Partitioning Guide 12c Release 1 (12.1)

 LEARN more about Automatic Data Optimization

Someone else has another question: “After a period of time, we would like to know what tables are under ILM policies and how the second tier of storage has been defined. Is there a way to get that information?” “Yes, there is,” John assures. He executes the SQL statements shown in Listing 4 to get that information. From the output, he demonstrates that the policy named P21 is defined on the table EXCHANGE_RATE, with the spillover tablespace being MIDTERM_TS. All the partitions of the table inherit the policy from the table (as shown in the INHERITED_FROM column). The last row in the result shows that the table itself is under the policy but is not inherited.

Code Listing 4: Identifying ILM data movement policies 

select policy_name, action_type, scope, tier_tablespace, condition_days
from user_ILMDataMovementPolicies;

POLICY_NAME  ACTION_TYPE  SCOPE    TIER_TABLESPACE 
———————————  ———————————  ——————   ———————————————
P21          STORAGE      SEGMENT  MIDTERM_TS          

select * from user_ILMObjects where object_name = 'EXCHANGE_RATE';

POLICY_NAME  SUBOBJECT_NAME  OBJECT_TYPE      INHERITED_FROM        ENA
———————————  ——————————————  ———————————————  ——————————————        ————
P21          Y12M01          TABLE PARTITION  TABLE                 YES
P21          Y12M02          TABLE PARTITION  TABLE                 YES
P21          Y12M03          TABLE PARTITION  TABLE                 YES
… output truncated …
P21                          TABLE            POLICY NOT INHERITED  YES 


“What actually moves the segments to a different tablespace?” Jill wants to know. Fair question, John concedes and explains that there is a job that moves the segments when the time comes. The job kicks in during the maintenance window in the database and calls the EXECUTE_ILM procedure in the DBMS_ILM package. To see details of this job, John queries DBA_ILMTASKS, as shown in Listing 5. It shows him the start and end time of the task. To get more details on the task, he executes the second query in Listing 5.

Code Listing 5: Checking ILM tasks 

select task_id, task_owner,
   to_char(start_time, 'mm/dd/yy hh24:mi:ss') start_time,
   to_char(completion_time, 'mm/dd/yy hh24:mi:ss') completion_time
from dba_ilmtasks
where task_owner != 'SYS'

   TASK_ID  TASK_OWNER  START_TIME         COMPLETION_TIME
——————————  ——————————  —————————————————  —————————————————
         3  FOREX       07/15/13 10:41:37  07/15/13 10:41:43

 -- Find the task details

select job_name, job_state,
   to_char(start_time, 'mm/dd/yy hh24:mi:ss') start_time,
   to_char(completion_time, 'mm/dd/yy hh24:mi:ss') completion_time
from dba_ilmresults
where task_id = 3;

JOB_NAME    JOB_STATE               START_TIME         COMPLETION_TIME
————————    ——————————————————————  —————————————————  —————————————————
ILMJOB18    COMPLETED SUCCESSFULLY  07/13/13 10:41:41  07/13/13 10:41:43 


Compression

Jill is happy to see her first requirement—partition relocation—successfully met. She asks about her second request—to compress unused rows in tables. Although the exchange rates set in the table are usually immutable, they may occasionally be changed due to a mistake in data entry or a miscommunication. However, changes to data more than seven days old are rare. Therefore, Jill wants to automatically compress rows in the EXCHANGE_RATE table that have not changed for the last seven days.

To accomplish that objective, John adds another ILM policy to the table: 

alter table exchange_rate
ilm add policy row store 
compress advanced segment
after 7 days 
of no modification; 


With this ILM policy in place, the segment is automatically compressed after a day passes without modification of any row in a segment. To check on the new policy, John uses the SQL query shown in Listing 6. The result shows a new policy—named P41—that has the COMPRESSION action type, which indicates that the policy acts by compressing data in the table. CONDITION_TYPE shows how Automatic Data Optimization should decide on the applicability of the policy (data not modified, in this case), and CONDITION_DAYS shows the number of days after which the policy should choose the object for compression (seven, in this case).

Code Listing 6: Checking ILM policies on user objects 

select policy_name, action_type, scope, compression_level, 
condition_type, condition_days
from user_ilmdatamovementpolicies
order by policy_name;

POLICY_NAME ACTION_TYPE SCOPE   CO...VEL CONDITION_TYPE         CONDITION_DAYS
——————————— ——————————— ——————— ———————— —————————————————————— ——————————————
P21         STORAGE     SEGMENT                                              0
P41         COMPRESSION SEGMENT ADVANCED LAST MODIFICATION TIME              7 


After the policy has been in place and seven days have passed, John could fire up the query shown in Listing 7 to check on the compression status of the partitions. The result would demonstrate how various partitions have been compressed, as shown by an ENABLED value in the COMPRESSION column. If the partition is not compressed, this column would report DISABLED.

Code Listing 7: Checking compression of partitions 

select partition_name, tablespace_name, compression, compress_for
from dba_tab_partitions
where table_owner = 'FOREX'
and table_name = 'EXCHANGE_RATE'
order by partition_name;

PARTITION_NAME TABLESPACE COMPRESSION COMPRESS_FOR
—————————————— —————————— ——————————— ————————————
Y12M01         MIDTERM_TS ENABLED     ADVANCED
Y12M02         MIDTERM_TS ENABLED     ADVANCED
Y12M03         MIDTERM_TS ENABLED     ADVANCED
Y12M04         MIDTERM_TS ENABLED     ADVANCED
Y12M05         LATEST_TS  DISABLED
Y12M06         LATEST_TS  DISABLED
Y12M07         MIDTERM_TS ENABLED     ADVANCED
Y12M08         LATEST_TS  ENABLED     ADVANCED
Y12M09         LATEST_TS  ENABLED     ADVANCED
Y12M10         LATEST_TS  ENABLED     ADVANCED
Y12M11         LATEST_TS  ENABLED     ADVANCED
Y12M12         MIDTERM_TS ENABLED     ADVANCED
Y13M01         LATEST_TS  ENABLED     ADVANCED
Y13M02         LATEST_TS  ENABLED     ADVANCED
Y13M03         LATEST_TS  ENABLED     ADVANCED
Y13M04         LATEST_TS  ENABLED     ADVANCED
Y13M05         LATEST_TS  ENABLED     ADVANCED
Y13M06         MIDTERM_TS ENABLED     ADVANCED
Y13M07         MIDTERM_TS ENABLED     ADVANCED
Y13M08         MIDTERM_TS ENABLED     ADVANCED
Y13M09         LATEST_TS  DISABLED
Y13M10         LATEST_TS  DISABLED
Y13M11         LATEST_TS  DISABLED
Y13M12         LATEST_TS  DISABLED 


Conclusion

Revisiting the original requirements, John shows how they have been addressed by the new Advanced Data Optimization feature, introduced in Oracle Database 12c. Under the storage-tier-based ILM policy, a segment is moved to a different tablespace (on lower-cost storage) when the free space in the original tablespace falls below a defined threshold. Under the compression-based ILM policy, segments are compressed when the data has not been modified for a specified number of days. And all this is done automatically without DBA intervention. Agreeing on the high value of this feature, the DBAs as well as the developers thank John and leave happy.


Arup Nanda Headshot


Arup Nanda
(arup@proligence.com) has been an Oracle DBA since 1993, handling all aspects of database administration, from performance tuning to security and disaster recovery. He was Oracle Magazine’s DBA of the Year in 2003 and received an Oracle Excellence Award for Technologist of the Year in 2012.

 

Send us your comments