Automatic indexing with Oracle Database

Oracle Database’s Automatic Indexing feature can detect the need for indexes, create them, and drop them automatically—without DBA intervention.

By Arup Nanda | May 2021


Automatic indexing with Oracle Database

[We have updated this article, first published in 2019 in Oracle Magazine, for the latest release of Oracle Database. —Ed.]

Joan, the chief architect of Acme Travels, walks into the office of the CTO, Tanya, and can almost smell the frostiness in the air. Acme Travels has been operating for more than two decades, and during that time, the company’s data model has evolved—as any 20-year-old company’s data model would.

The company’s developers create new tables, alter the existing tables, write new SQL statements, and modify the existing SQL constantly. To improve performance, they create indexes on the columns proactively, but more often they add indexes later, only after a performance issue has been detected.

The DBA manager, Betty, points out that adding indexes to address performance issues is a double-edged sword. It may improve performance for some queries, she says, but indexes might slow down INSERT statements and may negatively impact UPDATE and DELETE statements, often enough to negate any perceived performance gains elsewhere.

Because of those concerns, Betty’s team doesn’t create indexes willy-nilly but only after a very careful analysis of the impact on all queries, sometimes with the use of the SQL Performance Analyzer. Even then, those indexes are almost always created after the SQL statements have been issued against the database.

When the SQL statements that use the new indexes cause performance issues, DBAs are blamed and DBAs, in turn, blame the developers for writing the poorly performing queries in the first place. Similarly, because of changing queries, indexes created earlier are sometimes no longer needed. The DBAs check for the usage of indexes and drop them if they’re unused, which is also a time-consuming—not to mention risky—exercise for the DBAs.

Debbie, the development manager, expects the DBAs to perform this analysis 24 hours a day, seven days a week, along with everything else they are doing.

“You’ve got to be kidding to suggest something that preposterous, especially with our lean staff!” says an enraged Betty.

“Well, we certainly can’t do it!” responds Debbie. “We follow the Agile methodology for development. We make small changes, perhaps 100 times a week. There is no way for us to slow down and check the impact of each change line by line. The analysis and changes must come from your team.”

The tension inside the office is so thick you could cut it with a knife. Obviously, Tanya doesn’t like it and wants to end it with a solution acceptable to everyone and beneficial to Acme. Tanya does agree that with agile development practices and the continuous integration/continuous deployment (CI/CD) philosophy, it’s important to maintain the velocity of development. She also understands that these indexing issues must be dealt with after development, not before.

At the same time, Tanya understands that expecting the DBAs to do the time-consuming analysis and make the risky changes 24/7 is impractical unless she quadruples the DBA staff, which she can’t do.

Tanya wonders out loud if the teams should slow down development considerably and include a thorough analysis before deploying applications.

“No way,” Debbie responds. “That would be catastrophic for application delivery.”

This time Betty agrees with her, saying that it would not solve the problem entirely, because there will still be indexes to be analyzed—just more in a bunch rather than spread out through the releases. So, although the original problem would still exist, Acme would lose the benefit of the CI/CD approach. It’s a lose-lose for everyone.

“Well, so much for solutions,” sighs Tanya. “That’s why I asked Joan to be here,” she announces. “I’m hoping she knows if there’s a way to get us out of this bind.”

All eyes turn to Joan. “Yes, there is,” smiles Joan, trying to warm the chill.

Prior indexing solutions

Joan, who until recently was the lead DBA at Acme, reminds everyone how indexes are being managed now. Most of them are created as a part of the performance issues captured and solutions recommended by the advisory tools included with Oracle Database. These advisors alert the DBA to the potential issues and advise if a new index would help. However, the responsibility for determining the overall impact and implementing the indexes—both onerous tasks—still lies with the DBAs. For example, the DBAs must determine

  • If the recommended index will help or hurt performance, based on multiple metrics such as data blocks retrieved, elapsed time savings, and so on
  • If the recommended index will affect the data manipulation language (DML) SQL statements negatively
  • If the indexes created earlier, once useful, are still useful
  • If a fresh analysis of the need for new indexes is required, due to data and structural changes

The last point is particularly tricky, Joan points out. The table and columns, and even the SQL statements, may not have changed, but if the data pattern has changed, once-useful indexes may not be useful anymore while also being a drag on the performance of DML statements. Similarly, some indexes not needed earlier could help after data pattern changes. Because data pattern changes are almost impossible to track down, the battle for the ideal indexes is usually a losing one.

Autoindexing to the rescue

This is where a feature introduced a few years ago in Oracle Database 19c, Automatic Indexing, comes to the rescue, explains Joan. Automatic Indexing acts like a DBA inside the database that evaluates the need for new indexes and the need for existing indexes, creates new ones if they’re needed, and drops them when they are no longer needed. Both single-column and concatenated indexes are considered by the feature, so it covers most of the indexes typically used in a database.

The best part, Joan says, is that Automatic Indexing is software: It works 24 hours a day, seven days a week; does not get tired; and does not take sick leave. It will do everything Tanya is asking for.

Everyone is ready for a demonstration.

Connected to Oracle Database, Joan starts by executing this SQL statement to enable Automatic Indexing.



begin
  dbms_auto_index.configure ('AUTO_INDEX_MODE', 'IMPLEMENT');
end;
/

Betty is a bit queasy about running a statement like this because it implements Automatic Indexing in the database without her knowing it and being able to assess its performance impact during normal operations.

“That’s understandable,” Joan agrees. She runs another SQL statement to prepare the database for the feature, but she specifies that SQL statements should not use the new automatically created indexes.



begin
   dbms_auto_index.configure ('AUTO_INDEX_MODE','REPORT ONLY');
end;
/

This sets in motion the process of identifying the possible indexes. Automatic Indexing, Joan explains, captures the SQL statements, identifies those from the list that may be helped by indexing, and then creates those indexes automatically. These autocreated indexes are named with the prefix SYS_AI to differentiate them from manually created indexes. A new column, named AUTO in the DBA_INDEXES view, shows a YES value for these indexes.

“Wait a minute,” interrupts Betty. “The database simply creates an index without checking in a representational test system to determine whether the index will actually help or hurt something else? That sounds like an incredibly stupid idea, especially for a machine.”

“Not quite,” assures Joan. Creating an index on a whim is not smart. That’s why the Automatic Indexing feature creates the index as invisible—it is not known to the database optimizer. She performs a simple demonstration to explain the concept of an invisible index to the audience.

She creates an index with the invisible keyword.



SQL> create index ix_region_id_01 on regions (region_name) invisible;

Index created.

Then she checks the plan of a very simple SQL statement against that table.



SQL> explain plan for
  2  select * from regions where region_name = 'Europe';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 3077898360

----------------------------------------------------------------------------
|Id | Operation         | Name    | Rows| Bytes | Cost (%CPU) | Time       |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT  |.        |  1  |	14    |	  3   (0)  | 00:00:01   |
|*1 |  TABLE ACCESS FULL| REGIONS |  1  |	14    |	  3   (0)  | 00:00:01   |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
   1 - filter("REGION_NAME"='Europe')

13 rows selected.

Joan calls the audience’s attention to the output and the fact that the SQL statement did not use the just-created index where it could have. The reason is simple: The index is marked as invisible to the optimizer.

She then alters the index to make it visible to the optimizer and checks the execution plan of the same SQL statement.



SQL> alter index ix_region_id_01 visible;

Index altered.

SQL> explain plan for select * from regions where region_name = 'Europe';

Explained.

SQL> select * from table(dbms_xplan.display());

Plan hash value: 3897602228

-----------------------------------------------------------------------------------------------
|Id| Operation.                          | Name           | Rows| Bytes| Cost (%CPU)| Time    |
-----------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT                    |                |    1|    14|     2 (0)  | 00:00:01|
| 1|  TABLE ACCESS BY INDEX ROWID BATCHED| REGIONS        |    1|    14|     2 (0  )| 00:00:01|
|*2|   INDEX RANGE SCAN                  | IX_REGION_ID_01|    1|      |     1 (0)  | 00:00:01|
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("REGION_NAME"='Europe')

Pointing to the output that shows that the IX_REGION_ID_01 index is used this time, Joan highlights that this happens because the optimizer now recognizes the presence of this index. Automatic Indexing autocreates indexes as invisible first and then tests the index impact against SQL statements. If the impact is positive—if the SQL statements perform better with an index—the index is made visible, as Joan showed; if not, that index is marked unusable. To demonstrate that concept, Joan marks the index unusable and checks the execution plan once again.



SQL> alter index IX_REGION_ID_01 unusable;

Index altered.

SQL> explain plan for select * from regions where region_name = 'Europe';

Explained.

SQL> select * from table(dbms_xplan.display());

Plan hash value: 3077898360

--------------------------------------------------------------------------
|Id | Operation         | Name    | Rows  | Bytes | Cost(%CPU)| Time     |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT  |         |     1 |    14 |    3   (0)| 00:00:01 |
|*1 |  TABLE ACCESS FULL| REGIONS |     1 |    14 |    3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Pointing to the output, Joan shows that the index is not used now, even though it’s visible. She confirms that by using the following SQL statement:



SQL> select VISIBILITY, STATUS from user_indexes where index_name = 'IX_REGION_ID_01';

VISIBILITY STATUS
---------- --------
VISIBLE    UNUSABLE

The Automatic Indexing feature, Joan explains, checks the SQL statements, and it evaluates whether they can perform better via new indexes. It then creates the indexes as invisible. After that, it checks the execution plans of all the SQL statements using that new index. If all the statements show improvement (or no impact), it will make the index visible. If all the SQL statements show degraded performance, the index will remain invisible.

Debbie seems unconvinced. A more likely scenario, she opines, is that some SQL statements will show improvements and some will show degradation due to this autocreated index. What will the fate of the index be then?

Joan agrees that scenario is most likely and assures Debbie that the feature handles such a situation quite well. In this case, the index will be made visible, but Automatic Indexing will create a SQL plan baseline to prevent a SQL statement that regressed in performance from using the index. The other SQL statements—statements that did not regress in performance—will continue to use the index. Debbie nods in satisfaction.

Oracle advisor jobs and index administration

“Which component of Oracle Database actually performs this autoindexing?” Tanya asks.

“Starting with Oracle Database 11g, the database automatically executes multiple tasks,” Joan answers. For example, Oracle’s advisors run as automatic tasks and the DBA_ADVISOR_TASKS view shows information about these tasks. Joan pulls up the view data.



select *
from dba_advisor_tasks
where owner='SYS'
order by task_id;

TASK_ID TASK_NAME                          ADVISOR_NAME
      2 SYS_AUTO_SPM_EVOLVE_TASK           SPM Evolve Advisor
      3 SYS_AI_SPM_EVOLVE_TASK             SPM Evolve Advisor
      4 SYS_AI_VERIFY_TASK                 SQL Performance Analyzer
      5 SYS_AUTO_INDEX_TASK                SQL Access Advisor
      6 AUTO_STATS_ADVISOR_TASK            Statistics Advisor
      7 INDIVIDUAL_STATS_ADVISOR_TASK      Statistics Advisor

Joan points out the SYS_AUTO_INDEX_TASK task and two tasks with _AI_ in their name. These are the tasks behind the Automatic Indexing feature.

Betty’s interest is piqued now. “How do the DBAs administer this feature?” she asks.

Joan reminds her that the DBAs do not need to implement anything; the system detects the need to create indexes automatically and creates them, and it even drops them when they are not needed. A DBA’s involvement is limited to

  • Configuring the properties and parameters
  • Getting reports on Automatic Indexing usage

To set the default tablespace of the autocreated indexes to USER_AI, Joan uses the following statement:



dbms_auto_index.configure ('AUTO_INDEX_DEFAULT_TABLESPACE', 'USER_AI');

Joan explains that autocreated indexes may overwhelm the tablespace. To prevent that, Joan executes the following to ensure that only 50% of the tablespace is used:



dbms_auto_index.configure ('AUTO_INDEX_SPACE_BUDGET', '50');

Sometimes third-party application vendors don’t allow index creation or deletion that’s outside of their control. “Will this feature break those applications?” asks one of the DBAs. “Not at all,” assures Joan. To exclude a specific schema, Joan uses



dbms_auto_index.configure ('AUTO_INDEX_EXCLUDE_SCHEMA', 'SCOTT');

“But I am not comfortable that all these important activities would be performed by the system under the covers without our knowing about it,” says Betty. Others chime in as well.

The actions are hardly opaque, Joan assures everyone. All the current and historical actions are available in several data dictionary views. In Table 1, she points out to her audience the types of information and the views containing them.

Table 1. Automatic indexing information that views can show

To get . . . Use this view
The history of Automatic Indexing task executions DBA_AUTO_INDEX_EXECUTIONS
Statistics related to automatic indexes DBA_AUTO_INDEX_STATISTICS
Actions performed on automatic indexes DBA_AUTO_INDEX_IND_ACTIONS
Actions performed on SQL to verify automatic indexes DBA_AUTO_INDEX_SQL_ACTIONS
The history of configuration settings related to automatic indexes DBA_AUTO_INDEX_CONFIG

The easiest way to monitor Automatic Indexing is with Oracle Database’s built-in report feature. The DBMS_AUTO_INDEX.REPORT_ACTIVITY function returns a character large object (CLOB) containing all the relevant details. Joan pulls the data for activities between April 20 and April 21.



declare
  report clob := null;
begin
  report := DBMS_AUTO_INDEX.REPORT_ACTIVITY (
              activity_start => TO_TIMESTAMP('2021-04-20', 'YYYY-MM-DD'),
              activity_end   => TO_TIMESTAMP('2021-04-21', 'YYYY-MM-DD'),
              type           => 'TEXT',
              section        => 'SUMMARY',
              level          => 'BASIC');
end;

This produces a report containing all the needed information. Here is part of the report.



-------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------
Activity start              : 20-APR-2021 00:00:00
Activity end                : 20-APR-2021 00:00:00
Executions completed.       : 27
Executions interrupted      : 2
Executions with fatal error : 0
--------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
--------------------------------------------------------------------------
Index candidates                             : 98
Indexes created (visible / invisible)        : 21/0
Space used (visible / invisible).            : 312.23 MB (312.23 MB / 0 MB)
Indexes dropped                              : 2
SQL statements verified                      : 312
SQL statements improved (improvement factor) : 115 (3x)
SQL statements disallowed from auto indexes  : 34
Overall improvement factor                   : 3x
--------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
--------------------------------------------------------------------------
Unused indexes (visible / invisible) : 9 (6 / 3)
Space used (visible / invisible)     : 281 MB (183 MB / 98 MB)
Unusable indexes                     : 0

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:*: invisible
-------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Owner | Table       | Index                | Key               | Type   | Properties |
----------------------------------------------------------------------------------------
| HR    | DEPARTMENTS | SYS_AI_0urcv8chmxu20 | LOCATION_ID       | B-TREE | NONE       |
| HR    | LOCATIONS   | SYS_AI_1hgrs7xdghs31 | CITY              | B-TREE | NONE       |
... report truncated to save space ...
-------------------------------------------------------------------------------
VERIFICATION DETAILS
----------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------
1. The performance of the following statements improved:
----------------------------------------------------------------
Schema Name         : HR
SQL ID              : 3dfa28psdfe73
SQL Text            : select * from regions where region_name = 'Europe';
Improvement Factor  : 2x

PLANS SECTION
---------------------------------------------------------------------------------
Original
-----------------------------
Plan hash value: 3077898360
---------------------------------------------------------------------------
| Id | Operation          | Name    | Rows| Bytes | Cost (%CPU) | Time    |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |	      |    1|    14 |.     3   (0)| 00:00:01|
| *1 |  TABLE ACCESS FULL | REGIONS |    1|    14 |      3   (0)| 00:00:01|
---------------------------------------------------------------------------
With Auto Indexes
-----------------------------
Plan hash value: 3897602228
-------------------------------------------------------------------------------------------
|Id | Operation                 | Name                | Rows| Bytes| Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |                     |    1|    13|     2(0)   | 00:00:01|
| 1 |  TABLE ACCESS BY INDEX ...| REGIONS             |    1|    13|     2(0)   | 00:00:01|
|*2 |   INDEX RANGE SCAN        | SYS_AI_8hjkdfss93kdf|    1|      |     1(0).  | 00:00:01|
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

   2 - access("REGION_NAME"='Europe')
... report truncated to save space ...

The report, Joan explains, provides vital information on the activities of the Automatic Indexing feature, such as when it ran, how many SQL statements it considered, how many automatic indexes were created, how much space they consumed, and so on. In the subsequent sections, it also shows the indexes automatically created on columns.

The last part of the report shows the SQL statements and the execution plans before and after the autocreated indexes, to confirm the improvements. This is the report both Debbie and Betty can examine to see the effectiveness of the Automatic Indexing feature at both a high and detail level.

Safety mechanisms

Debbie is still somewhat skeptical. During the next evaluation of SQL statements by this feature, she explains, an existing SQL statement will likely inspire the automatic creation of an index that has already been created but not used. This means Automatic Indexing will once again be forced to evaluate that index and again discard it as not useful. Isn’t this a vicious cycle of wasted resources, she asks, one which a human DBA would have avoided?

“Absolutely,” agrees Joan. “To keep re-creating indexes would be a vicious cycle.” To prevent it, she explains, the feature marks those SQL statements, and subsequent executions exclude these SQL statements from consideration and avoid getting into vicious cycles of repeated evaluation, creation, and invisibility of indexes.

Debbie is relieved but has another doubt. Sometimes the system simply won’t know the negative impact of an autocreated index as well as a human performance tuner would. In this case, the SQL statement using the automatic index will simply be detrimental. “Is there a way to suppress the use of automatic indexes for a query proactively?” she asks.

“Yes, there is,” replies Joan. If Debbie wants the optimizer not to use autocreated indexes for a specific SQL statement, she can simply include a hint.



select /*+ NO_USE_AUTO_INDEXES */ from regions where region_name = 'Europe';

Likewise, the USE_AUTO_INDEXES hint acts just the opposite way: It forces the SQL statement to use the autocreated indexes if they are available.

“That’s not all,” Joan continues. “Automatic Indexing includes several additional safety features to protect the database from damage.” These include the following:

  • The system ignores autocreated indexes for any SQL statements run for the first time. This prevents evaluation of one-off SQL statements that are never issued again and wouldn’t have been able to benefit from the indexes anyway.
  • DBAs can disable the autoindex job for specific periods of time, so as not to affect normal processing.
  • By using the Resource Manager feature of Oracle Database, DBAs can limit the job to a limited number of CPUs, to reduce any negative effect of autoindex jobs.
  • If an autoindex job is not completed by a certain time, the next run will be skipped. This prevents proliferation of runaway jobs.

Also, the indexes created automatically are deleted after a specific number of days, which defaults to 373. Betty can set the retention period of the unused autoindexes to, say, 100 days, as follows:



DBMS_AUTO_INDEX.CONFIGURE ('AUTO_INDEX_RETENTION_FOR_AUTO', '100')

However, the unused manually created indexes are never deleted by the automatic indexing process. They can be deleted automatically, if needed, but they are never deleted by default. Betty can set another property, AUTO_INDEX_RETENTION_FOR_MANUAL, to specify after how many days the unused manual indexes can be dropped.

This seems to allay everyone’s concern that something could go terribly out of control.

Conclusion

Joan points out the big difference between the advisors available earlier and the Automatic Indexing feature: The advisors identify the need for and suggest possible indexes, but the onus of deciding whether those indexes will help or not lies with the DBAs.

By contrast, Automatic Indexing takes that responsibility away; it implements needed indexes automatically and constantly checks for their usefulness. DBA tasks are limited to setting the configuration parameters such as the default tablespace or the number of days to retain the unused indexes and getting reports on automatic indexing activities. Everyone is impressed with this new feature, thanks Joan, and leaves the room a little warmer and with their usual smiles back on.

Dig deeper

Illustration: Wes Rowell

Arup Nanda

Arup Nanda

Arup Nanda has been an Oracle DBA since 1993, handling all aspects of database administration, from performance tuning to security and disaster recovery. He received an Oracle Excellence Award for Technologist of the Year in 2012.