Understand your partition choices, and choose the best partition and key for your data.
The partitioning feature of Oracle Database enables you to partition stored data segments such as tables and indexes for easier management and improved performance. Oracle Database provides many types of partitioning options, including range, list, hash, range/list, and range/hash. The real challenge for database designers is not creating the partitioned objects; rather, it's the decision that precedes it—what type of partitioning to use and what column to use as a partition key. This article describes how to choose the best partitioning strategy, including the best partition type and best partition key column.
Partitioning, in a nutshell, stores a data segment such as a table as multiple segments while retaining a logically monolithic structure. The most popular partitioning option is range partitioning, with which you define a range of values for each partition. For example, in a table of customer transactions called TRANS, you can range-partition it by using the TRANS_DT (transaction date) column as the partition key so that a first partition holds records in which the TRANS_DT value is between January 1 and March 31, 2005; the second partition holds records in which the TRANS_DT value is between April 1 and June 30; and so on. Listing 1 creates this table.
Code Listing 1: Script for creating the TRANS table with range partitions
create table trans ( trans_id number, trans_dt date, product_code number, store_id number, trans_amount number(12,2) ) partition by range (trans_dt) ( partition y05q1 values less than (to_date('04/01/2005','mm/dd/yyyy')) tablespace y05q1, partition y05q2 values less than (to_date('07/01/2005','mm/dd/yyyy')) tablespace y05q2, partition y05q3 values less than (to_date('10/01/2005','mm/dd/yyyy')) tablespace y05q3, partition y05q4 values less than (to_date('01/01/2006','mm/dd/yyyy')) tablespace y05q4, partition pmax values less than (maxvalue) tablespace users )
After you've created the TRANS table, you will see—if you query the DBA_TABLES dictionary view—that the PARTITIONED column value is YES. You can get further details about the partitioning, such as the type of partitioning scheme and the number of partitions, from the DBA_PART_TABLES view. You can get the details of each partition, such as the name and the upper boundary of the partition, from the DBA_TAB_PARTITIONS view, as shown in Listing 2. Note that this view shows the high value of a partition, which is actually the lower boundary of the next partition. A partition holds records up to but not including its high value. For instance, the upper boundary of the Y05Q2 partition is "2005-07-01 00:00:00", meaning this partition contains values up to "2005-06-30 23:59:59". Records with TRANS_DT values of 2005-07-01 00:00:00 and later will be stored in the next partition-—Y05Q3.
Code Listing 2: TRANS partition details
col partition_position format 999 head "Pos" col partition_name format a10 head "Name" col high_value format a50 head "High Value" select partition_position, partition_name, high_value from dba_tab_partitions where table_name = 'TRANS' order by 1; Pos Name High Value --- ----- ----------------------------------------------------------------------- 1 Y05Q1 TO_DATE(' 2005-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GRE... 2 Y05Q2 TO_DATE(' 2005-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GRE... 3 Y05Q3 TO_DATE(' 2005-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GRE... 4 Y05Q4 TO_DATE(' 2006-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GRE... 5 PMAX MAXVALUE
In the TRANS table example, you could find a column that can be grouped into ranges of values such as dates, but it may not be possible to group all columns that way. For example, a column holding names of U.S. states contains a finite and small number of values. This type of column calls for list partitioning, in which the partitions hold discrete values instead of ranges. Here is an example:
create table sales ( product_id number, trans_amt number, sales_dt date, state_code varchar2(2) ) partition by list (state_code) ( partition ct values ('CT'), partition ca values ('CA'), partition def values (default) );
Sometimes it may not be possible to define the ranges of—or a set of known values for—each partition. A typical example is a table, CUST, containing customers, with the CUST_ID column as the primary key. The CUST_ID value is an incrementally increasing but rather meaningless number, so a range partition based on these values may also be rather meaningless.
You can partition the CUST table by using hash , where a hash function is applied to the partition key of each row and, based on the output, the row is placed in an appropriate partition. All the hash partitions hold an equal number of rows. Here is how you create the CUST table with four hash partitions:
create table cust ( cust_id number, cust_name varchar2(20) ) partition by hash (cust_id) partitions 4;
You can create two types of indexes on partitioned tables:
Local. The index is partitioned in exactly the same way as the base table. For instance, you can create a local index on the TRANS_AMOUNT column of the TRANS table as follows:
create index in_trans_01 on trans (trans_amount) local;
This creates a range-partitioned index, on the TRANS_DT column, the same way the TRANS table is partitioned. All the index entries for a specific partition, such as Y05Q1, will exist only inside the corresponding partition of the index.
Global. The index can span all partitions in the base table. For example, suppose there is a primary key on TRANS, on the TRANS_ID column. The primary key can be anywhere inside the table, across all the partitions. In this case, the index entries of a partition of the table may exist outside the corresponding partition of the index. For such a situation, create a global index, as follows:
alter table trans add constraint pk_trans primary key (trans_id) using index global;
Now that you've seen the basics of partitioning, here comes the tough part: How do you decide on the type and columns used for partitioning? Your choice of partitioning has to address key partitioning objectives for manageability and performance. How you partition will be based on the priorities you assign to your objectives. I break down the objectives into the following categories:
Ease of administration
Data lifecycle management
Efficiency of backup
Performance. This, of course, is the primary objective of many partitioning schemes. Performance advantages come from partition pruning or partitionwise joins, so if your queries do a lot of full-table scans, partitioning will help immensely, because partitions will limit the scope of the search. Consider a scenario involving the TRANS table. Suppose everyone is interested in the aggregate sales figures but product managers (PMs) are interested in the sales figures of their respective products only, not in all the sales. So the PM of product code "8" issues this SQL statement several times an hour:
select trans_dt, sum(trans_amount), avg (trans_amount), max(trans_amount), min(trans_amount) from trans where product_code = 8 group by trans_dt;
You can improve the performance of this query by building an index on the PRODUCT_CODE column, but it will increase the execution time on INSERT statements and might also adversely influence the execution plans of other queries on the same table. Therefore, it might be judicious not to create the index but to list-partition the table on the PRODUCT_CODE column, with one partition per PRODUCT_CODE value. The PM's query will still do a full-table scan, but this time the scope of the "full table" is actually the partition , not the entire table. This enhances the performance immensely. Recall that list partitioning was preferred over range partitioning in this case because the PRODUCT_CODE is a discrete value.
If two rather large tables are often joined together in SQL statements, consider the potential benefits of partitionwise joins. Suppose you have another table called ADVERTISING, which contains detailed information on the company's advertising. Here is a description of the ADVERTISING table:
Name Null? Type --------- --------- ------------- AD_ID NUMBER AD_DT DATE PRODUCT_CODE NUMBER AD_COST NUMBER CHANNEL VARCHAR2(10)
Suppose that PMs often query the ADVERTISING and TRANS tables to see the revenue vis-à-vis various advertising amounts spent to check the effectiveness of the advertising campaigns. Here is the query they use to find out the revenue numbers and advertising expenditures for a specific period for each product:
select t.product_code, sum(trans_amount), sum(ad_cost) from trans t, advertising a where t.trans_dt = a.ad_dt and t.product_code = a.product_code and trans_dt between '1-aug-05' and '1-sep-05' group by t.product_code;
Here the tables are joined on the columns TRANS_DT and AD_DT, both date columns. To enhance performance, you might decide to partition both tables in the same way. In this case, range partitioning on the columns being joined and partitioning on the same ranges is the most effective. Listing 3 shows the script for creating the ADVERTISING table with this range partition.
Code Listing 3: Script for creating the ADVERTISING table
create table advertising ( ad_id number, ad_dt date, product_code number, ad_cost number, channel varchar2(10) ) partition by range (ad_dt) ( partition y05q1 values less than (to_date('04/01/2005','mm/dd/yyyy')) tablespace y05q1, partition y05q2 values less than (to_date('07/01/2005','mm/dd/yyyy')) tablespace y05q2, partition y05q3 values less than (to_date('10/01/2005','mm/dd/yyyy')) tablespace y05q3, partition y05q4 values less than (to_date('01/01/2006','mm/dd/yyyy')) tablespace y05q4, partition pmax values less than (maxvalue) tablespace y05q4 );
Similarly, if you often join these tables on the PRODUCT_CODE column, then that column should be a strong candidate for the partition key. And because product code values are in a specific set, the partitioning scheme should be list partitioning.
If the ADVERTISING and TRANS tables are not joined on some specific date or date fields but rather on some sequential number such as the AD_ID or TRANS_ID value, you should reevaluate the partitioning strategy. Because records have unique values in those columns, you can't choose a list-partitioning strategy (which requires discrete, finite values). A range partition would be ineffective, because the sequentially increasing values in these two columns would require numerous partitions to be created that might not have any specific meaning.
In such a case, the hash partition is useful. It makes sure the records are evenly spread over all the partitions. You can create the tables as follows:
create table trans ( trans_id number, trans_dt date, product_code number, store_id number, trans_amount number(12,2) ) partition by hash (trans_id) ( partition trans1 tablespace trans1, partition trans2 tablespace trans2, partition trans3 tablespace trans3, partition trans4 tablespace trans4 ) / create table advertising ( ad_id number, ad_dt date, product_code number, ad_cost number, channel varchar2(10) ) partition by hash (ad_id) ( partition ad1 tablespace ad1, partition ad2 tablespace ad2, partition ad3 tablespace ad3, partition ad4 tablespace ad4 ) /
After the tables are created, you can use the DBMS_XPLAN tool to check to make sure the partitions are used properly. First you have to generate a plan for the PM's product query:
explain plan for select t.product_code, sum(trans_amount), sum(ad_cost) from trans t, advertising a where t.trans_id = a.ad_id and t.trans_id = 101 and t.product_code = a.product_code group by t.product_code;
Then you can see what the plan looks like:
select * from table(dbms_xplan.display);
The output is shown in Listing 4. Note the PSTART and PSTOP columns, which indicate which table partitions were selected. For the line with Id=4, the values of these columns are 4 and 4, which indicates that the fourth partition of the TRANS table was selected, where the record with TRANS_ID=101 resides. Note the line with Id=6, which shows that the optimizer will search only partition 4 of the joined ADVERTISING table as well, not the entire table. In the absence of an index, this hash-partition strategy makes the PM's query much faster than a full-table scan on the entire table, for both the TRANS and ADVERTISING tables.
Ease of administration. Partitioning is always introduced as a technique for managing large objects. Although it's applicable to objects of any size, the advantages are more obvious in large tables. When you rebuild an index on a nonpartitioned table, your only option is to build the entire index in one statement. If the table is partitioned, however, you can rebuild partitions of local indexes one at a time. For instance, in the case of the TRANS table, you can rebuild the Y05Q1 partition of the IN_TRANS_01 local index:
alter index in_trans_01 rebuild partition y05q1;
as opposed to rebuilding the IN_TRANS_01 index as a whole. The advantage of this approach is even more pronounced when you load a specific partition and the index needs to be rebuilt only on that partition, not on the rest of the index. In addition to doing index rebuilds, you can also move tables across tablespaces, export tables, delete data, and so on—one partition at a time.
Data purge. Suppose that TRANS is a table in an OLTP system and you want to remove old records, such as the ones with TRANS_DT values in the first quarter of 2005. The conventional method is to delete the records, but deletion causes undo and redo information to be generated, which impacts database performance. This can also be a very time-consuming operation, depending on the volume of data being purged. An alternative to deleting records is to drop a partition. In the case of the TRANS table, if you decided to drop all the records from the first quarter of 2005, you'd issue
alter table trans drop partition y05q1;
That's it. After you execute this command, the partition disappears from the table. This statement does not actually delete anything from the table; it merely updates the data dictionary to indicate that the Y05Q1 partition no longer belongs to the table. Because no data moves (except a small amount of information in the data dictionary), undo and redo generation is minimal, making the operation extremely fast compared to the deletion approach and virtually impact-free in terms of performance.
Note that a regular DELETE statement deletes table rows from wherever they exist, leaving empty spaces in the blocks. Although the table data may be sparse, the overall size of the segment may still be large and have a very high high-water mark (HWM, the largest size the table has ever occupied). A high HWM slows full-table scans, because Oracle Database has to search up to the HWM, even if there are no records to be found. When a partition is dropped, the segment itself is gone and the HWM of other segments remains the same, so the query runs faster.
While the DELETE statement deletes the rows, the corresponding index entries are also adjusted, causing fragmentation, which may impact performance.
When a partition is dropped, the corresponding partition of any local index is also dropped. If there is a global index, however, that index can become unusable when you drop the partition. To prevent the index from becoming unusable, in Oracle9i Database and later, you can update the global index when you drop the partition. The following drops the Y05Q1 partition and updates the global index:
alter table trans drop partition y05q1 update global indexes;
If you decide not to do an automatic index update as part of dropping the partition, that global index will become unusable. You will have to rebuild the index to make it usable, but you can defer that task to later, after you've dropped other partitions or done some other operations.
You can manually rebuild the global index on TRANS, by using
alter index pk_trans rebuild;
Data archiving. If you decide to purge data without retaining it, you can use the technique described in the previous section. Data to purge without retention includes log or debug information, which can simply be dropped. Most information, however, probably needs to be archived for future access. In the TRANS table example, if you wanted to store the contents of the Y05Q1 partition before purging it, you could use the following partition exchange technique:
1. Create a table that is almost identical in structure to the TRANS table, except that it is not partitioned:
create table trans_y05q1 as select * from trans where 1=2;
2. This creates an empty unpartitioned table TRANS_Y05Q1, whose structure is identical to that of TRANS. The 1=2 clause returns false, so no rows of the TRANS table are transferred. Exchange the contents of the partition with this new table:
alter table trans exchange partition y05q1 with table trans_y05q1;
This operation makes the data inside the Y05Q1 partition appear inside the TRANS_Y05Q1 table and empties the partition. The data does not physically move from the partition to the new table. This exchange partition statement merely updates the data dictionary to reset a pointer from the partition to the table and vice versa. Because there is no physical movement of data, this exchange does not generate redo and undo, making it faster and far less likely to impact performance than traditional data-movement approaches such as INSERT.
After the EXCHANGE PARTITION command is executed, the partition is empty and you can drop it, as shown earlier. After the table is created, you can transport it out of the database and archive it for future use.
Data movement. One of the most challenging aspects of data warehouse administration is the development of ETL (extract, transform, and load) processes that load data from OLTP systems into data warehouse databases. Other instances in which large amounts of data are moved include replicating data from production to QA databases, populating data marts from main data warehouses and vice versa, and populating summary tables from transaction tables. Partitions may prove to be extremely valuable in these processes. Take, for instance, the TRANS table and another table—TRANS_SUMMARY—which records the total transaction amount for a particular date. The following describes the TRANS_SUMMARY table:
SQL> desc trans_summary Name Null? Type --------- --------- --------------------- PRODUCT_CODE NUMBER TOT_AMOUNT NUMBER(12,2)
A traditional approach to populating this data warehouse table every day might be as follows:
insert into trans_summary select product_code, sum(trans_amount) from trans group by product_code;
This approach works but is plagued by some problems:
1. The INSERT statement will generate a large amount of undo and redo, which will affect the performance of the database. You can reduce the undo and redo generation by introducing multiple commits in a loop while loading data, but the overall elapsed time will increase significantly. You can use the direct path INSERT (using the APPEND hint) with NOLOGGING, as follows:
insert /*+ APPEND NOLOGGING */ into trans_summary select product_code, sum(trans_amount) from trans group by product_code;
Data being loaded into the TRANS_SUMMARY table will remain unavailable, however, until the data load is complete.
2. A daily process will have to wipe the TRANS_SUMMARY table clean before loading fresh data; otherwise, it will load duplicates. This table is unavailable for general use from the time it is truncated until it is finally loaded. On some very large systems—depending on the complexity of the query, the size of the table, and the general load on both the source and target databases—this process can take hours, during which the users cannot see even the old data (which has been deleted prior to loading). If the INSERT statement fails, due to lack of space or data errors, the users will have to wait until the new data is loaded, which again can be hours.
The use of partitions eliminates or largely mitigates these two issues. Because the summary table is grouped by PRODUCT_CODE, you can create the TRANS_SUMMARY table list partitioned on the PRODUCT_CODE column:
create table trans_summary ( product_code number, tot_amount number(12,2) ) partition by list (product_code) ( partition p1 values (1), partition p2 values (2), partition p3 values (3), partition p4 values (4), partition p5 values (5), partition p6 values (6), partition p7 values (7), partition p8 values (8), partition p9 values (9), partition p10 values (10) )
During the loading process, you can load the data from TRANS into TRANS_SUMMARY partition by partition instead of loading the whole table. Here is how you would load the data for PRODUCT_CODE value=1:
1. First create a temporary table whose structure is identical to that of the TRANS table except that it's not partitioned:
create table trans_summary_temp nologging as select cast(1 as number) product_code, cast(sum(trans_amount) as number(12,2)) tot_amt from trans where product_code = 1 group by 1;
Note that the table is created with the NOLOGGING option, which significantly reduces the undo and redo generation. This is a dynamically created table, and while it is being created, the original TRANS and TRANS_SUMMARY tables are fully accessible to users.
2. Once the table has been created, you can exchange the TRANS_SUMMARY table with the p1 partition:
alter table trans_summary exchange partition p1 with table trans_summary_temp including indexes;
This operation instantly makes the data in the TRANS_SUMMARY_TEMP table available as the data in the p1 partition of the TRANS_SUMMARY table. This "switch" merely updates the pointers in the data dictionary; no data is actually moved. So this process is extremely fast (usually a few seconds), and the table is locked for that period only. The table data is available to users at all other times in the process. So this approach has four key advantages over loading the table directly:
1. The performance impact is low.
2. The main table is always available (except for a few seconds).
3. In case of data-loading failure, the old data is available.
4. You can refresh data selectively.
Data lifecycle management. Most business data follows a predictable lifecycle: In the beginning, it is accessed moderately; then it is updated heavily; and finally access slows down to almost nothing. Regulatory requirements may mandate retention in the database of even this least accessed data. However, because the older data is accessed with decreasing frequency, you can use partitioning to develop a disk access strategy that lowers the total cost of ownership without sacrificing performance. Put partitions with the most accessed and updated data on the fastest disk storage, and put older, less accessed data on slower—and cheaper—disk storage.
For instance, in the TRANS table, the records of more-recent transactions, such as in the current partition, Y05Q4, are accessed heavily, followed by those of the partition immediately preceding it—Y05Q3—and so on. Because the older partitions are not accessed that frequently, it might be worthwhile to put them on a storage tier that is slower than the one on which the current (and most-accessed) partitions reside. One way to save money by using inexpensive storage is to create a new tablespace on the cheaper storage and move the partition to this new tablespace:
alter table trans move partition y05q1 tablespace y05q1_inexpensive;
During the partition move, the partition will be available for using SELECTs but not for UPDATEs.
Efficiency of backup. Consider the example at the beginning of this article—the TRANS table. In that example, each partition is located in a different tablespace named for the partition. If you can say with certainty that the records with a TRANS_DT value earlier than today (or earlier than some specific date) will not change, then you can also consider the corresponding partition to be read-only. In that case, you can convert the tablespace the partition resides in to read-only, as follows:
alter tablespace y05q1 read only;
When you make the tablespace read-only, the Oracle Recovery Manager (RMAN) backup can exclude it while making backups, because the tablespace will not change over time; one backup is enough. The more tablespaces you can make read-only, the shorter the duration of the RMAN job, which also reduces the load the RMAN job puts on the database. This benefit is most visible in databases containing historical data, especially data warehouses, in which the total amount of data to be backed up is usually quite high and you can make numerous tablespaces read-only.
To make an informed partitioning decision, first prioritize the objectives for your own partitioning setup. For example, using some of this article's alternatives, suppose you decide on the following order of priority:
1. Data archiving
2. Data purge
3. Efficiency of backup
4. Ease of administration
You have deliberately chosen performance after the other objectives; in a different partitioning situation, however, performance may be the top priority.
Data archiving and data purge. For the data archiving and data purge priorities, you have to choose a differentiating column. Is it time-based, as in the case of TRANS_DT in the TRANS example table? If so, which column differentiates the records to be archived and purged? The TRANS table example uses the TRANS_DT column as the partition key in a range-partitioned table, but suppose you have a similar table with a column named EXPIRY_DT, which indicates the date after which the record will definitely not be updated. In that case, to better meet the data archiving and data purge priorities, make EXPIRY_DT the partitioning key in the range-partitioning scheme.
Suppose the differentiating column contains discrete values, such as CAMPAIGN_CODE (in the ADVERTISING table), denoting advertising campaigns. After each campaign is completed, its records are archived and purged. To meet the data archiving and data purge priorities in this situation, use list partitioning with the CAMPAIGN_CODE column as the partitioning key. A range scheme would be useless here, because you would need to drop partitions of a specific CAMPAIGN_CODE, not the range that contains it.
Now suppose that most queries against the TRANS table do not choose TRANS_DT in the WHERE clause. Choosing this column as a partitioning column does not help performance. Is it a good decision to choose a range-partition scheme, with TRANS_DT as the partition key? The answer lies in your prioritizing. In this example, you list performance after data archiving and data purge, so this scheme offers the best solution. If you had prioritized performance over data archiving and data purge, you would have chosen a different scheme.
Efficiency of backup. If efficient backup is your primary objective, ask yourself this question: Which column makes a record read-only? Suppose the answer is the TRANS_DT column. A record with a TRANS_DT value that is more than a month old will not be updated and should be considered read-only. In such a case, choose range partitioning on TRANS_DT, with each partition residing on a different tablespace, as shown in Listing 1. When the records on a partition will never be updated again, make the corresponding tablespace read-only. The RMAN backups will skip this tablespace during the backup, making the backup process faster.
Ease of administration. If ease of administration is the primary objective, your partition decision must be based on how the data is modified. Suppose you do a lot of index rebuilds on tables, due to data loads. You should note the column that determines which data is to be loaded. Is it a date—as in the TRANS_DT column in the TRANS table? If so, a range-partitioned table with TRANS_DT as the partitioning key is most appropriate. Is this column a discrete value, as in PRODUCT_CODE? If so, list partitioning will be useful. If no such pattern exists, you can use hash partitioning on the table and rebuild the indexes of the generated partitions.
READ more about partitioning
Performance. If performance is the highest priority, choose the column that is used mostly in WHERE conditions to filter rows and in joins with other tables. If it's a time-based column, then range partitioning is very useful, as in the case of TRANS_DT in the TRANS table. Even if it's not time-based, the presence of some logical range will help. For instance, suppose the PRODUCT_CODE column (a number) follows a pattern such as 1000 to 1999 for consumer products, 2000 to 2999 for industrial products, and so on. When users query the table, are they focused on only a specific type of product, such as the industrial category? If that is the case, range partitioning can be used on the PRODUCT_CODE column, with partitions in the ranges 1000-1999, 2000-2999, and so on.
Combining partitioning strategies. You can also combine several of your most important goals with Oracle Database's composite partitioning strategies. For example, you can create a range/list-partitioned ADVERTISING table, using range partitioning on the AD_DT column to satisfy the performance requirement and list partitioning on the PRODUCT_CODE column to satisfy the data purge requirement.
The most important decisions in partition design are choosing the partitioning scheme and the column(s) to partition on, and these decisions depend heavily on the priorities of potentially conflicting objectives. Once you are familiar with the advantages of each type of partitioning scheme, you will be able to choose the scheme and column(s) appropriate for your partitioning needs.