
Oracle Database 11g:
The Top Features for DBAs and Developers

by Arup Nanda

Manageability

Leave the management of all memory to Oracle; automatically grab available patches only for those features you use; collect optimizer statistics and examine them before making them public; let the optimizer know about the relationship between different columns; and more.


Memory Management

Deep down, an Oracle Database instance is a collection of processes, such as PMON and SMON, and memory areas, such as the System Global Area (SGA) and Program Global Area (PGA). Within the SGA there are several sub-areas, such as the database buffer cache, the large pool, and so on. How do you decide how big each area should be? Memory in the host is bounded at an upper limit, and some of it must go to the operating system. Deciding how much to allocate where can be tough.

In Oracle Database 10g, the issue is greatly simplified by setting a value for SGA_TARGET, which allocates a specified amount of memory for the overall SGA. The sub-areas, such as the buffer cache and the shared pool, are subsequently auto-tuned.

However, in 10g some components, such as db_keep_cache_size, are not auto-tuned; you still have to size them manually. And the PGA is entirely outside the SGA, so it is not touched by the Automatic SGA Management feature at all. So you still have to make a few decisions, such as the sizes of the SGA and the PGA.

What happens if you allocate too much to the PGA, starving the SGA? Well, you waste memory while making performance suffer due to an undersized SGA. But what if the boundary between the PGA and the SGA were fluid, and memory were allowed to flow freely between them as and when needed? That would be a very desirable feature indeed.

In Oracle Database 11g, that precise functionality is provided. Instead of setting SGA_TARGET and PGA_AGGREGATE_TARGET, you specify MEMORY_TARGET. From the Enterprise Manager Database homepage, choose Advisor Central > Memory Advisor:

Click Enable to enable Automatic Memory Management:

Click OK. You will need to restart the database, as the parameter for Max Memory Target is a static one.

You can also do this via the command line:

 
SQL> alter system set memory_max_target = 1G scope=spfile;
 
System altered.

SQL> alter system set memory_target = 1G scope = spfile;
 
System altered.
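
After you restart the instance, you can confirm the settings from SQL*Plus; this quick check is not part of the original steps, just a sanity test:

SQL> show parameter memory_target
SQL> show parameter memory_max_target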

On UNIX-based systems, the maximum memory target must be less than the size of the /dev/shm shared memory file system. This functionality has been implemented on Linux, Solaris, HP-UX, AIX, and Windows.


[oracle@oradba3 dbs]$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      36316784  19456684  15015264  57% /
/dev/hda1               101086      9632     86235  11% /boot
none                    517448    131448    386000  26% /dev/shm

This example shows that /dev/shm is only about 500MB, so you will use 404MB as the MEMORY_TARGET. After setting the value and recycling the database, you will notice that the screen changes as shown below:

This screen shows the breakdown between SGA and PGA over a period of time. The second half of the screen shows how the sub-pools in the SGA have been allocated and how their allocation has changed over time.

This allocation keeps changing as the application demand on the database changes. This relieves you from managing memory manually, which may be a best-guess exercise anyway.
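
If you are curious how a different overall memory size would affect performance, the view V$MEMORY_TARGET_ADVICE offers a quick, optional check (a simple query added here for illustration; it is not one of the screens above):

select memory_size, memory_size_factor, estd_db_time, estd_db_time_factor
from v$memory_target_advice
order by memory_size_factor;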

As is the case with the SGA components, the memory components are governed by a minimum rule if the individual parameters are also specified. For instance, suppose you specify the following:

memory_target = 10G
sga_target = 2G
pga_aggregate_target = 1G

In this case, memory_target is set to 10GB. Because you have specified the two other parameters, they serve as minimum values; so the minimum sizes of the SGA and the PGA will be 2GB and 1GB respectively. As the demand for memory goes up and down, the memory areas shrink and expand. You can check this in the view V$MEMORY_DYNAMIC_COMPONENTS.


select
        component,
        current_size,
        min_size,
        max_size,
        user_specified_size user_spec,
        oper_count,
        last_oper_type,
        last_oper_mode,
        last_oper_time
from v$memory_dynamic_components;

Here is the output:


COMPONENT                 CURRENT_SIZE   MIN_SIZE   MAX_SIZE  USER_SPEC OPER_COUNT LAST_OPER_TYP LAST_OPER LAST_OPER
------------------------- ------------ ---------- ---------- ---------- ---------- ------------- --------- ---------
shared pool                  117440512    4194304  117440512          0         27 GROW          IMMEDIATE 28-AUG-07
large pool                     4194304          0    4194304          0          1 GROW          IMMEDIATE 28-AUG-07
java pool                     12582912    4194304   12582912          0          4 GROW          IMMEDIATE 28-AUG-07
streams pool                         0          0          0          0          0 STATIC
SGA Target                   134217728   20971520  134217728          0        548 GROW          DEFERRED  29-AUG-07
DEFAULT buffer cache           4194304    4194304   16777216          0         12 SHRINK        IMMEDIATE 28-AUG-07
KEEP buffer cache                    0          0          0          0          0 STATIC
RECYCLE buffer cache                 0          0          0          0          0 STATIC
DEFAULT 2K buffer cache              0          0          0          0          0 STATIC
DEFAULT 4K buffer cache              0          0          0          0          0 STATIC
DEFAULT 8K buffer cache              0          0          0          0          0 STATIC
DEFAULT 16K buffer cache             0          0          0          0          0 STATIC
DEFAULT 32K buffer cache             0          0          0          0          0 STATIC
Shared IO Pool                       0          0          0          0          0 STATIC
PGA Target                           0          0  104857600  104857600         29 GROW          IMMEDIATE 28-AUG-07
ASM Buffer Cache                     0          0          0          0          0 STATIC

Another very useful view is v$memory_resize_ops, which stores the information about the resize operations that occurred.


select
start_time,
end_time,
status,
component,
oper_type Op,
oper_mode,
parameter,
initial_size,
target_size,
final_size
from v$memory_resize_ops
order by 1,2
/

START_TIM END_TIME  STATUS    COMPONENT            OP      OPER_MODE PARAMETER            INITIAL_SIZE TARGET_SIZE FINAL_SIZE
--------- --------- --------- -------------------- ------- --------- -------------------- ------------ ----------- ----------
28-AUG-07 28-AUG-07 COMPLETE  SGA Target           GROW    IMMEDIATE sga_target              125829120   130023424  130023424
28-AUG-07 28-AUG-07 ERROR     DEFAULT buffer cache SHRINK  IMMEDIATE db_cache_size            16777216    12582912   16777216
28-AUG-07 28-AUG-07 ERROR     shared pool          GROW    IMMEDIATE shared_pool_size         96468992   100663296   96468992
28-AUG-07 28-AUG-07 ERROR     DEFAULT buffer cache SHRINK  IMMEDIATE db_cache_size            16777216    12582912   16777216
28-AUG-07 28-AUG-07 ERROR     shared pool          GROW    IMMEDIATE shared_pool_size         96468992   100663296   96468992
28-AUG-07 28-AUG-07 PENDING   shared pool          GROW    IMMEDIATE shared_pool_size         96468992   100663296  100663296
28-AUG-07 28-AUG-07 COMPLETE  PGA Target           GROW    IMMEDIATE pga_aggregate_target            0           0          0
28-AUG-07 28-AUG-07 PENDING   shared pool          GROW    IMMEDIATE shared_pool_size        100663296   104857600  104857600
28-AUG-07 28-AUG-07 COMPLETE  SGA Target           GROW    IMMEDIATE sga_target              130023424   134217728  134217728

 

Adaptive Thresholds

Consider this scenario: the parse-to-hit ratio in your database is usually a certain value, say x. Now it has gone up by 50 percent, to 1.5x. After further investigation you see that a lot of DSS-type queries are causing the ratio to go up. You also learn that this behavior (running DSS-type queries at night) is perfectly acceptable. So 1.5x is a good number at that time; perhaps it could even go up to 2x. You therefore want the threshold to be somewhat time-aware: it should be 1.2x during the day, when activity is primarily OLTP-based, and 2x at night, when it is not. How do you accomplish such contradictory objectives?

In Oracle Database 11g, this task is fairly trivial thanks to a feature called Adaptive Thresholds. The thresholds adapt to a specific time of day, based on past behavior captured in the AWR snapshots calculated earlier. As with many features of Oracle Database 11g, this task can be done in two ways: via the command line or via Oracle Enterprise Manager. Let's see how it's done in the GUI.

First, you have to define an AWR baseline, which is similar to a preserved snapshot set in Oracle Database 10g. The baseline establishes a reference point for the performance metrics; you can use it to compare how the metrics develop over time.
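
If you prefer the command line, a baseline can also be created from two existing AWR snapshots with the supplied package dbms_workload_repository. This is a minimal sketch; the snapshot IDs and baseline name are placeholders only:

begin
   dbms_workload_repository.create_baseline (
      start_snap_id => 120,             -- placeholder: first snapshot of the interval
      end_snap_id   => 132,             -- placeholder: last snapshot of the interval
      baseline_name => 'WEEKDAY_PEAK'   -- placeholder baseline name
   );
end;
/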

From the main database home page, choose the Performance tab and scroll down to see Additional Monitoring Links:

From this page, click on AWR Baselines in the third column, which brings up a screen like the following:

This screen already contains a baseline, SYSTEM_MOVING_WINDOW, which is enabled out of the box in Oracle Database 11g. Click Edit to see or change any of the thresholds associated with that baseline. The screen shown below comes up:

This screen shows how the thresholds are defined for the metric Cumulative Logons per second. You can define the critical threshold as a percentage of the maximum observed in the baseline (defined here as 60%). Once this threshold is set, any time cumulative logons are more than 60% of their maximum for the same day and time as observed over the past week, and this happens at least twice, a critical alert will be triggered. (Sixty percent is a very low limit; I just used it for demonstration purposes.)

Over time, as the number of users grows, the alert thresholds will adapt automatically. You are not limited to percentages of the maximum; you may define specific values as thresholds too. Another parameter, Occurrences, triggers the alert only if the threshold is breached that many times.

The small figure on the right shows the metric over a period of time. Note the small square box; it's a zoom tool. You can move that square over any area of the figure and the details will be shown.

Pending Statistics

Consider this situation: you have carefully gathered statistics on all the tables, indexes, and so on, and all the queries have perfect execution plans. But now someone kicks off the stats collection job and suddenly the execution plans are all different. True, the plans could be better (but they could also be worse, albeit with lower probability).

One way to prevent such a mishap is to use SQL Plan Management, which allows you to create SQL plan baselines that lock in the plan for a SQL statement. But that is not exactly a universal solution; SQL statements that have not been seen will not have a SQL plan baseline and hence will not be under SQL Plan Management's protective umbrella.

Another issue is when to collect the stats. Stats collection is a fairly CPU- and I/O-intensive process; you may want to do it during relatively quiet periods, such as at night. But the moment the statistics are gathered, the plans are affected. You may not want to risk a new plan first thing in the morning; rather, you want the new stats to take effect in the evening. Paradoxically, you need the stats to be collected now but to become effective later.

This complex goal is achievable in Oracle Database 11g. You can mark the statistics for a table or index as pending, which means the freshly gathered statistics will not be published or used by the optimizer, giving you an opportunity to test the new statistics before you publish them. To accomplish this, set the PUBLISH property of the table's stats to FALSE through the procedure set_table_prefs in the supplied package dbms_stats.

Here is an example of how you can keep the stats of the table SALGRADE in the schema ARUP unpublished:

begin
  dbms_stats.set_table_prefs ( 
    ownname => 'ARUP', 
    tabname => 'SALGRADE', 
    pname   => 'PUBLISH', 
    pvalue  => 'FALSE' 
  );     
end;
/

Once you set the PUBLISH preference to FALSE, the stats collected from then on are marked pending. Here are the stats on the table now:


SQL> select num_rows, to_char(last_analyzed,'mm/dd/yy hh24:mi:ss') 
  2  from user_tables
  3  where table_name = 'SALGRADE';
 
  NUM_ROWS TO_CHAR(LAST_ANAL
---------- -----------------
         6 09/10/07 22:04:37


Today is 9/21/07; so the stats were collected a while ago. Let's collect the stats again:


begin
        dbms_stats.gather_table_stats (
                ownname         => 'ARUP',
                tabname         => 'SALGRADE',
                estimate_percent=> 100
);
end;
/
 

If you see the stats again:


  NUM_ROWS TO_CHAR(LAST_ANAL
---------- -----------------
         6 09/10/07 22:04:37

You can see that the number of rows is still the same, and LAST_ANALYZED still shows the old date. So what happened to the stats we just collected? They were marked pending. The pending stats are visible in the view USER_TAB_PENDING_STATS:


select num_rows, to_char(last_analyzed,'mm/dd/yy hh24:mi:ss')
from user_tab_pending_stats
where table_name = 'SALGRADE';


Here is the output:


  NUM_ROWS TO_CHAR(LAST_ANAL
---------- -----------------
         9 09/21/07 11:03:35

Now the table has nine rows, which show up in the pending stats. The time of collection is also shown. Suppose you want to make these pending stats visible to the optimizer; you can publish them:


begin
    dbms_stats.publish_pending_stats('ARUP', 'SALGRADE');
end;
/

If you check the view user_tab_pending_stats now, you will see it is empty. And if you check USER_TABLES, you will see the most up-to-date stats:


  NUM_ROWS TO_CHAR(LAST_ANAL
---------- -----------------
         9 09/21/07 11:03:35

This technique of decoupling stats collection from publication can also be used with partitioned tables. Suppose you are loading a table partition by partition. You don't want to feed partial information to the optimizer; rather, you want the stats of all partitions to be visible to the optimizer at the same time. But you also want to take advantage of the time right after each partition is loaded. So you can collect the stats on a partition right after it is loaded but not publish them. After all partitions are analyzed, you can publish them all at once, as sketched below.
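
Here is a minimal sketch of that approach; the table SALES and the partition SALES_Q1 are illustrative names only, not from the example above:

begin
   -- keep newly gathered stats pending for this table
   dbms_stats.set_table_prefs ('ARUP', 'SALES', 'PUBLISH', 'FALSE');

   -- gather stats on just the partition that was loaded
   dbms_stats.gather_table_stats (
      ownname     => 'ARUP',
      tabname     => 'SALES',
      partname    => 'SALES_Q1',
      granularity => 'PARTITION'
   );

   -- ... repeat the gather for each remaining partition as it is loaded ...

   -- once every partition has been analyzed, publish all the stats at once
   dbms_stats.publish_pending_stats ('ARUP', 'SALES');
end;
/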


Checking Pending Statistics Before Publishing

A great use of pending statistics is to collect them and then test your queries before publication. To make the optimizer use the pending statistics instead of the published ones, set the session parameter optimizer_use_pending_statistics to true.

Consider a table called RES and suppose there is an index on the column STATUS.

SQL> create index in_res_status on res (status);

Gather the stats for the table with the cascade option.


begin
        dbms_stats.gather_table_stats (
                ownname         => 'ARUP',
                tabname         => 'RES',
                estimate_percent=> 100,
                cascade         => true
);
end;
/

The values of the column STATUS are distributed as shown below:


STATUS    COUNT(1)
------- ----------
INVALID          6
VALID        68858

If you issue a query like select res_type from res where status = 'VALID'; and check the execution plan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 68858 |  1075K|   290   (3)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| RES  | 68858 |  1075K|   290   (3)| 00:00:04 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(STATUS='VALID')

It's doing a full table scan on table RES, as expected, because the overwhelming number of rows have that value. Suppose the composition changes in the future:


STATUS    COUNT(1)
------- ----------
CLOSED       68857
OK               6
VALID            1

Note that the number of rows containing VALID dropped to just one. In this case the query should not use full table scan; it should use the index. But if you run the query and check the execution plan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 68858 |  1075K|   290   (3)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| RES  | 68858 |  1075K|   290   (3)| 00:00:04 |
--------------------------------------------------------------------------

You will see that the query still does a full table scan. That happens because the statistics are stale. The optimizer still thinks that the rows containing the value VALID are in the majority and hence generates a full table scan plan. You could gather the statistics again, which would correct the optimizer's estimate so that it uses the index. But here is the big question: will it break some other query?

Instead of speculating, change the PUBLISH preference for the table RES to FALSE using the set_table_prefs procedure in the supplied package dbms_stats. Then gather the statistics on the table as usual. Since the statistics are now pending, the optimizer will not use them; yet you can still check their impact on the optimizer. A special session parameter makes the optimizer use pending stats instead of published stats.

SQL> alter session set optimizer_use_pending_statistics = true;

Now, if you check the plan:


---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |    16 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| RES           |     1 |    16 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IN_RES_STATUS |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

The optimizer chose the index, even though the published statistics say otherwise. The pending statistics made the plan better, not worse. Using this approach, you can check other queries for the impact of these pending statistics. You can even use the power of SQL Plan Management to create SQL plan baselines for your queries while the pending statistics are in effect, and rely on those plans later even if the cost-based plan changes, as sketched below. After you are completely satisfied, you can publish the pending statistics.
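
For example, once a query has been tested under the pending statistics and you like its plan, a sketch like the following loads that plan from the cursor cache into a SQL plan baseline; the SQL_ID is a placeholder you would replace with that of your tested query:

set serveroutput on

declare
   l_plans pls_integer;
begin
   l_plans := dbms_spm.load_plans_from_cursor_cache (
                 sql_id => 'abcd1234efgh5'   -- placeholder SQL_ID
              );
   dbms_output.put_line ('plans loaded: ' || l_plans);
end;
/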

If you decide to delete the pending stats, you can do so using the following:

begin
  dbms_stats.delete_pending_stats('ARUP','RES');
end;
/

Suppose you published those stats and they wreak havoc in some other cases; you wish you hadn't published them. Well, nothing is lost. You can reinstate the statistics as of a certain time in the past. Oracle Database 11g maintains a history of the statistics gathered, visible in the view DBA_TAB_STATS_HISTORY. By default the history is kept for 31 days.


SQL> select to_char(stats_update_time,'mm/dd/yy hh24:mi:ss')
  2  from dba_tab_stats_history
  3  where owner = 'ARUP'
  4  and table_name = 'RES';
 
TO_CHAR(STATS_UPD
-----------------
09/07/07 11:56:26
09/14/07 11:59:43
09/21/07 13:58:31

It shows that statistics were collected three times over the past few weeks. Suppose the last collection was bad and you want to reinstate the collection made on 9/14.


begin
   dbms_stats.restore_table_stats (
       ownname         => 'ARUP',
       tabname         => 'RES',
       -- an explicit format mask avoids dependence on session NLS settings
       as_of_timestamp => to_timestamp('14-SEP-07 11:59:00 AM','DD-MON-RR HH:MI:SS AM')
   );
end;
/

This replaces the current stats with the stats collected on 9/14.


Extended Statistics

Function-Based Statistics

Traditional statistics gathering collects and stores patterns of the data in columns. Consider a slightly different case: functions applied to the column values, e.g. the UPPER() function.

Suppose you have a column called CUST_NAME that stores the names of customers in regular mixed case style. Here is a sample of the customer names as they are stored:

McArthur
MCArthur
mcArthur
MCARTHUR
mcarthur

It's probably the same name, entered in five different variations. The cardinality of the column is high because each value is distinct as far as Oracle is concerned. If you use a predicate like where upper(cust_name) = 'MCARTHUR', all these values collapse into just one, MCARTHUR, and suddenly the column becomes a low-cardinality one. If there were just these five rows in the table, each value would be unique, so the selectivity would be 1/5. But applying the upper() function converts them all to the same value, and the selectivity becomes 1.

Selectivity plays an important role in the optimizer's execution plan selection, so the change in selectivity due to the upper function should be made clear to the optimizer. In Oracle Database 11g, you can create extended statistics on expressions of values, not just columns.

Here is an example. First collect the stats on the table CUSTOMERS:

begin
        dbms_stats.gather_table_stats (
                ownname         => 'ARUP',
                tabname         => 'CUSTOMERS',
                estimate_percent=> 100,
                method_opt      => 'FOR ALL COLUMNS SIZE AUTO',
                cascade         => true
);
end;
/

Then consider the following query:


select * from customers where upper(cust_name) = 'MCARTHUR'

Here is the execution plan:

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   689 | 65455 |   290   (3)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |   689 | 65455 |   290   (3)| 00:00:04 |
-------------------------------------------------------------------------------

Note the number of rows estimated to be retrieved by the query - 689. Let's see how many rows should be retrieved:


SQL> select count(1) from customers where upper(cust_name) = 'MCARTHUR'
  2  /
 
  COUNT(1)
----------
     29343

Clearly the optimizer underestimated the row count, as it should have estimated something close to 29343 and not 689 as it did. Next, re-gather statistics on the table and collect the extended statistics on the expression upper(cust_name).


begin
  dbms_stats.gather_table_stats (
     ownname    => 'ARUP',
     tabname    => 'CUSTOMERS',
     method_opt => 'for all columns size skewonly for columns (upper(cust_name))'
  );
end;
/

This creates extended statistics on the expression in the CUSTOMERS table. Now the plan looks like this:


-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 29570 |  3263K|   291   (3)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS | 29570 |  3263K|   291   (3)| 00:00:04 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(CUSTOMERS.SYS_STUI2GNVTNUK3O8Y#$YT73GA0J='MCARTHUR')

Note that the value under the "Rows" column (29570) is a much more accurate representation of how many rows will be returned. How did the optimizer get that number? From the extended statistics you just collected on the expression UPPER(CUST_NAME). At the bottom of the output, note that the predicate information is also recorded; it shows a filter based on the column SYS_STUI2GNVTNUK3O8Y#$YT73GA0J. This is the system-generated name for the expression UPPER(CUST_NAME). Since the optimizer has statistics on the expression, it can accurately predict how many rows will be returned.

Why is this important? Well, since the efficiency of the execution path generated by the optimizer depends on how accurately it can predict the estimated number of rows, this expression statistic helped tremendously.

You can extend this model to any table where columns could be correlated. For instance, comparing sales of snowshoes in Greenland vs. Africa is a perfect example of how data patterns can be correlated. Other patterns are more fact-based: Connecticut is a state in the US, not in Germany, so the optimizer should know that "country='Germany' and state='Connecticut'" will not match any rows and make its decisions accordingly.
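
Multi-column statistics, covered in the next section, are how you express that kind of relationship. As a quick preview, assuming the table also had COUNTRY and STATE columns (columns invented here purely for illustration), you could define a column group on them:

select dbms_stats.create_extended_stats ('ARUP', 'CUSTOMERS', '(COUNTRY, STATE)')
from dual;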

To find out what expressions have statistics gathered on them, check the view DBA_STAT_EXTENSIONS, as shown below:

SQL> select extension_name, extension
  2  from dba_stat_extensions
  3  where table_name='CUSTOMERS';

EXTENSION_NAME                 EXTENSION
------------------------------ --------------------
SYS_STUI2GNVTNUK3O8Y#$YT73GA0J (UPPER(CUST_NAME))

To drop an extended stat on an expression, use the drop_extended_stats procedure.


begin
   dbms_stats.drop_extended_stats (
      ownname => 'ARUP',
      tabname => 'CUSTOMERS',
      extension => '(UPPER(CUST_NAME))'
   );
end;
/


Multi-Column Statistics

Consider the database for a hotel company where a table called BOOKINGS holds the information on the room bookings made. Two particular columns in that table are of interest to us: HOTEL_ID (the unique identifier of the hotel) and RATE_CATEGORY (the code for the daily room rate). There is an index on these columns.

Let's look at the data distribution on these two columns:

  HOTEL_ID RATE_CATEGORY   COUNT(1)
---------- ------------- ----------
        10            11      19943
        10            12      39835
        10            13      20036
        20            21       5106
        20            22      10041
        20            23       5139
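
That distribution can be produced with a simple aggregate such as the following:

select hotel_id, rate_category, count(1)
from bookings
group by hotel_id, rate_category
order by hotel_id, rate_category;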


Examine the data carefully: hotel_id 10 has only the rate_category values 11, 12, and 13, while hotel_id 20 has only 21, 22, and 23 (and none of 11, 12, and 13). Why? The reason could be the star rating of the hotel. Hotel 20 is a higher-priced hotel and the rate categories 11, 12, and 13 are low rates, so they don't apply to a high-cost hotel. Similarly, 21, 22, and 23 are higher-cost rates and don't apply to hotel 10, an economy hotel. Hotel 10 also has many more bookings than hotel 20. Note that the combination hotel_id = 20 and rate_category = 21 has the fewest rows: 5106. So, what happens when you execute a query like this?


select min(booking_dt)
from bookings
where hotel_id = 20
and rate_category = 21;

It should use the index, not a full table scan. You can test that by setting autotrace as shown below:


SQL> set autot on
SQL> ... execute the query ...

Here is the execution plan:

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    14 |   142   (5)| 00:00:02 |
|   1 |  SORT AGGREGATE    |          |     1 |    14 |            |          |
|*  2 |   TABLE ACCESS FULL| BOOKINGS | 16667 |   227K|   142   (5)| 00:00:02 |
-------------------------------------------------------------------------------

What happened? It did a full table scan, ignoring the index. Note the value under "Rows" (16667), which is the number of rows the optimizer expects to retrieve. Clearly that is wrong; we should expect only 5106 rows, and no doubt the inflated number coaxed the optimizer toward the full table scan. The estimate is high because the optimizer evaluates each column, HOTEL_ID and RATE_CATEGORY, independently when calculating the expected number of rows; it assumes no relationship between the two columns. Had it considered both columns together, it would have recognized the pattern and arrived at a better plan.

How can we resolve this problem? In Oracle Database 11g there is a new kind of statistic called multi-column statistics, a type of extended statistic. Using this feature, you can create associations between different columns (a column group) to help the optimizer make better decisions. Let's see how we can use that feature in this example. You can define multi-column statistics in one of two ways:

  • First, define which columns should be in the column group; in this case we want HOTEL_ID and RATE_CATEGORY. The new function create_extended_stats in the supplied package dbms_stats does that job; it returns the name of the column group, which you may want to display. Once the column group has been created, the multi-column statistics will be gathered automatically the next time you gather statistics on the table.
    SQL> var ret varchar2(2000)
    SQL> exec :ret := dbms_stats.create_extended_stats('ARUP', 'BOOKINGS','(HOTEL_ID, RATE_CATEGORY)');
    SQL> print ret
    
  • Alternatively you can define the column group as part of the gather statistics command. You do that by placing these columns in the method_opt parameter of the gather_table_stats procedure in dbms_stats as shown below:
    
    begin
       dbms_stats.gather_table_stats (
          ownname          => 'ARUP',
          tabname          => 'BOOKINGS',
          estimate_percent => 100,
          method_opt       => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (HOTEL_ID,RATE_CATEGORY)',
          cascade          => true
       );
    end;
    /
    


After you have gathered the multi-column statistics, execute the query again and examine the autotrace output:


-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    14 |   132   (1)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                |     1 |    14 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOOKINGS       |  5106 | 71484 |   132   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IN_BOOKINGS_01 |  5106 |       |    13   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(HOTEL_ID=20 AND RATE_CATEGORY=21)

This output clearly shows that the index IN_BOOKINGS_01 was used. Why now? Note the value under the "Rows" column (5106); the optimizer correctly estimated the number of rows for the combination of values, not for each value in isolation.

Just to make sure that the optimizer still makes correct decisions, try the combination hotel_id = 10 and rate_category = 12, a combination that should do a full table scan:

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    14 |   143   (5)| 00:00:02 |
|   1 |  SORT AGGREGATE    |          |     1 |    14 |            |          |
|*  2 |   TABLE ACCESS FULL| BOOKINGS | 39835 |   544K|   143   (5)| 00:00:02 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(RATE_CATEGORY=12 AND HOTEL_ID=10)

As you can see, the optimizer accurately predicted the number of rows and hence chose the correct plan of using a full table scan. Multi-column statistics also help the optimizer when you enter a combination that does not exist (such as hotel_id=10 and rate_category=21).

You can see the extended stats defined in the database in the view DBA_STAT_EXTENSIONS:

SQL> select extension_name, extension
  2  from dba_stat_extensions
  3  where table_name='BOOKINGS';
 
EXTENSION_NAME                 EXTENSION
------------------------------ ----------------------------
SYS_STUW3MXAI1XLZHCHDYKJ9E4K90 (HOTEL_ID,RATE_CATEGORY)

To drop a multi-column stat, use the same drop_extended_stats approach shown above for expression statistics.
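
For instance, using the extension shown above:

begin
   dbms_stats.drop_extended_stats (
      ownname   => 'ARUP',
      tabname   => 'BOOKINGS',
      extension => '(HOTEL_ID,RATE_CATEGORY)'
   );
end;
/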

 

Online Patching

In today's business environment, there is increasing pressure to reduce planned as well as unplanned downtime (still the major variety) for IT systems. Downtime for applying patches and maintaining application objects can now be decreased further. In Oracle Database 10g, although fine-grained dependency tracking makes most changes pretty much online, patches still need to be applied while the database is down.

Oracle Database 11g offers a new feature in this area that is truly revolutionary. Some (but not all) patches in Oracle Database 11g, specifically diagnostic patches, can now be applied online without taking any downtime. When you apply them, they modify the code loaded into the host's memory on the fly, without requiring the instance to be shut down.

Features-Based Patching

In the near future, several new patches will be made available for Oracle Database 11g. If you use the features involved, such as Partitioning, these patches will be of great interest to you; otherwise they will be unnecessary. (In fact, applying unnecessary patches leads to expensive, unnecessary downtime. Note, however, that security patches should always be applied regardless of the features involved, because even idle bits are vulnerable to exploits.)

So, how can you determine if there are patches available only for the features you are using? And, more important, how can you download only those patches and then apply them at a convenient time?

Fortunately, Features-Based Patching in Oracle Database 11g allows you to do all those things. Let's see how it is done using Enterprise Manager Database Control.

First, you have to give EM your My Oracle Support credentials by clicking the Setup link at the top right of the homepage, which brings up the screen shown below:

Click on Patching Setup in the left-hand pane and enter your credentials (username and password) there. After you have set up these credentials, patches will be downloaded automatically during a patching session.

But what if you want Oracle to recommend the necessary patches? This is where Feature-Based Patching really shines. The Patch Advisor gathers information on available patches based on the features you are using. On the Software and Support page, click on the Patch Advisor link.

Note the drop-down list box View, which shows "All". If you would rather see only the patches for the features you are using, select Feature-Based Patches from the box and click Go. If any patches are found, they will be listed below.

Click on Patch Prerequisites to specify preferences for where the patch should be downloaded and applied.

Click the Add button to add a patch, which brings up the screen shown below:

From this screen you can search My Oracle Support for any available patches; they will be automatically downloaded and stored in the Patch Cache. You can then re-apply them from there in future manual patching operations if you wish.

Feature-based patching also works when EM is not connected to My Oracle Support. You can download the XML metadata from My Oracle Support and apply it in the Patching Setup screen shown above; choose the Offline Patching Setting tab.
