Pre-Loading Oracle Text Indexes into Memory

Introduction

Oracle caches recently used database blocks in memory, so repeated searches against the same index tend to be fast. But what about the first search? Can we pre-load the database cache in order to make sure that most of what we need is already in memory before the first search is run?

This paper explores some techniques which make this possible.

Prerequisites


The Keep Pool

A table or index may be forced to reside in the keep cache using the commands

                               
ALTER TABLE <tablename> STORAGE (BUFFER_POOL KEEP);
ALTER INDEX <indexname> STORAGE (BUFFER_POOL KEEP);

                            

What to cache?

If we had unlimited memory available, we would want to put all of the following into the keep pool:

  • $I token table
  • $X index on $I table
  • $R table
  • The base table itself (assuming we select more than just ROWID, or sort on a real column).

If our index is locally partitioned, we need to cache these tables for each partition. Additionally, if our application uses functional invocation against the text table we need the $K table, and if we're using prefix indexing we need the $P table as well.
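Since the names of these internal objects are derived from the index name, it is easy to list what actually exists for a given index. A minimal sketch, assuming our index is called TEST_IDX (a local partitioned index uses DR# plus a partition number in the names; an unpartitioned one uses DR$):

                               
SELECT table_name
  FROM user_tables
 WHERE table_name LIKE 'DR%TEST_IDX%'
 ORDER BY table_name;

                            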

However, in real life it is very likely that all of these tables together will exceed the amount of memory we can afford to allocate to the keep pool. We therefore need to be selective. If we are using a local partitioned index, and our queries are normally sorted by the partition key, it might make sense to cache just the tables for the first few partitions.

Alternatively, if our table isn't partitioned, it might be better to cache certain tables and not others.

In decreasing order of importance, we could rank them as:

  • $X
  • $R
  • $I, $K and the base table (roughly equal)
  • $P (a distant last)

$I, the base table and $K are perhaps equally important, though their sizes are clearly not comparable, and different usage scenarios favor different choices. In the normal case (little or no functional invocation), we would suggest base table, $I, $K, in that order. If the index is badly fragmented, but we're still not using functional invocation, we'd swap the base table and $I. If a large amount of functional invocation is used, $K goes to the head of the list.
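Whether the $I table is fragmented need not be guesswork: CTX_REPORT can tell us. A quick check against our test index (note that INDEX_STATS scans the whole $I table, so it can take a while on a large index):

                               
SQL> set long 50000
SQL> select ctx_report.index_stats('test_idx') from dual;

                            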

Determining the size of an index

For testing purposes, I built a local partitioned index on a table called TEST with one million rows - each row consisting of just a unique key ID, and a filename pointing to an external XML file.

The output of CTX_REPORT for this table is as follows:

                               
SQL> set long 50000
SQL> select ctx_report.index_size('test_idx') from dual;

CTX_REPORT.INDEX_SIZE('TEST_IDX')
--------------------------------------------------------------------------------
===========================================================================
                     INDEX SIZE FOR TESTUSER.TEST_IDX
===========================================================================

...

TOTALS FOR INDEX TESTUSER.TEST_IDX
---------------------------------------------------------------------------

CTX_REPORT.INDEX_SIZE('TEST_IDX')
--------------------------------------------------------------------------------
TOTAL BLOCKS ALLOCATED:                                             206608
TOTAL BLOCKS USED:                                                  198235
TOTAL BYTES ALLOCATED:                         1,692,532,736 (1,614.13 MB)
TOTAL BYTES USED:                              1,623,941,120 (1,548.71 MB)

                            

Determining what is in the buffer pool

                               
SQL> connect sys as sysdba
SQL> alter system set db_cache_size = 1G;
SQL> shutdown
SQL> startup
SQL> connect testuser/testuser
SQL> select /*+ FIRST_ROWS */ * from (
        select id, filename from test
        where contains (filename, 'district and region') > 0
        order by id desc )
     where rownum <= 20;
 ...

                            

After running this query, we can find out what's in the buffer cache by running the following SQL as a DBA:

                               
COLUMN OBJECT_NAME FORMAT A30
COLUMN OBJECT_TYPE FORMAT A15
COLUMN OWNER FORMAT A15
COLUMN NUMBER_OF_BLOCKS FORMAT 999,999,999,999

SELECT o.OBJECT_NAME, o.OBJECT_TYPE, o.OWNER, COUNT(*) NUMBER_OF_BLOCKS
     FROM DBA_OBJECTS o, V$BH bh
    WHERE o.DATA_OBJECT_ID = bh.OBJD
      AND o.OWNER NOT IN 
            ('SYS', 'SYSTEM', 'SYSMAN', 'XDB', 'IX', 'WMSYS', 'CTXSYS')
    GROUP BY o.OBJECT_NAME, o.OWNER, o.OBJECT_TYPE
    ORDER BY COUNT(*);

                            

After running the previous CONTAINS query, this gives me:

                               
OBJECT_NAME                    OBJECT_TYPE     OWNER           NUMBER_OF_BLOCKS
------------------------------ --------------- --------------- ----------------
DR#TEST_IDX0009$R              TABLE           TESTUSER                       2
DR#TEST_IDX0010$R              TABLE           TESTUSER                       2
SYS_LOB0000056864C00002$$      LOB             TESTUSER                       4
DR#TEST_IDX0009$X              INDEX           TESTUSER                      12
DR#TEST_IDX0010$X              INDEX           TESTUSER                      12
SYS_LOB0000056876C00002$$      LOB             TESTUSER                      17
TEST                           TABLE PARTITION TESTUSER                      22
DR#TEST_IDX0010$I              TABLE           TESTUSER                     459
DR#TEST_IDX0009$I              TABLE           TESTUSER                     494

9 rows selected.

                            

Firstly, we can see from the last two lines that the majority of blocks are from two $I tables - those for the last two partitions of the table (since it has ten partitions). Then, moving up a bit, we see 22 blocks from the base table TEST. Note that this doesn't tell us which partitions of the table these blocks come from, but it's reasonable to assume that they come from the last two partitions as well. There are also a few blocks from the $R table.

We'll be covering the SYS_LOB... objects in a moment. For now, we're going to ignore them; tracking them down will be a useful exercise shortly.

Pre-Loading the tables

We need to load both the tables and the indexes on those tables. We can do this by running queries with appropriate hints. Rather than actually fetching every row - which would require a lot of traffic between the kernel and the client - we use aggregate functions, so that the rows are fetched only as far as the kernel:

                               
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT),  SUM(LENGTH(TOKEN_INFO)) FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT),  SUM(LENGTH(TOKEN_INFO)) FROM DR#TEST_IDX0009$I ITAB;

SELECT /*+ INDEX(ITAB) */  SUM(LENGTH(TOKEN_TEXT)) FROM DR#TEST_IDX0010$I  ITAB; 
SELECT /*+ INDEX(ITAB) */  SUM(LENGTH(TOKEN_TEXT)) FROM DR#TEST_IDX0009$I  ITAB; 

SELECT SUM(ROW_NO) FROM DR#TEST_IDX0010$R;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0009$R;

SELECT /*+ FULL(BTAB) */ SUM(ID) FROM TEST BTAB WHERE ID >= 900000;

                            
Note that the base-table query assumes we know something about the data distribution: restricting the scan to ID >= 900000 loads just the most recent rows, rather than full-scanning the entire million-row table.

Also note that the $R loading is incomplete. This is related to those system LOB objects we saw earlier - and we'll be coming back to that later.

Running these queries on my system took just over a minute. That's to load one-fifth of the partitions for a million-row table. Here's the output after rebooting the machine to avoid any effects of disk caching:

                               
SUM(TOKEN_COUNT) SUM(LENGTH(TOKEN_INFO))
---------------- -----------------------
         6963707                37472389

Elapsed: 00:00:27.31

SUM(TOKEN_COUNT) SUM(LENGTH(TOKEN_INFO))
---------------- -----------------------
         6587719                35349528

Elapsed: 00:00:27.77

SUM(LENGTH(TOKEN_TEXT))
-----------------------
                8994078

Elapsed: 00:00:03.74

SUM(LENGTH(TOKEN_TEXT))
-----------------------
                8855116

Elapsed: 00:00:03.21

   SUM(ID)
----------
9.5001E+10

Elapsed: 00:00:01.74

SUM(ROW_NO)
-----------
        231

Elapsed: 00:00:00.12

SUM(ROW_NO)
-----------
        231

                            
Now let's repeat our earlier query against V$BH:
                               
OBJECT_NAME                    OBJECT_TYPE     OWNER           NUMBER_OF_BLOCKS
------------------------------ --------------- --------------- ----------------
DR#TEST_IDX0009$R              TABLE           TESTUSER                       2
DR#TEST_IDX0010$R              TABLE           TESTUSER                       2
TEST                           TABLE PARTITION TESTUSER                     612
DR#TEST_IDX0009$X              INDEX           TESTUSER                   5,905
DR#TEST_IDX0010$X              INDEX           TESTUSER                   6,018
DR#TEST_IDX0009$I              TABLE           TESTUSER                  14,607
DR#TEST_IDX0010$I              TABLE           TESTUSER                  15,090

7 rows selected.

                            

We're going to try to put all of those objects into the keep pool, so we need to know how much space they occupy in order to size it. We can sum up the NUMBER_OF_BLOCKS column above, or use:

                               
SELECT COUNT(*) TOTAL_BLOCKS, COUNT(*)*8/1024 MEGABYTES  -- assumes an 8K block size
     FROM DBA_OBJECTS o, V$BH bh
    WHERE o.DATA_OBJECT_ID = bh.OBJD
      AND o.OWNER NOT IN
            ('SYS', 'SYSTEM', 'SYSMAN', 'XDB', 'IX', 'WMSYS', 'CTXSYS');

TOTAL_BLOCKS  MEGABYTES
------------ ----------
       49503 386.742188

                            
                               
ALTER SYSTEM SET DB_CACHE_SIZE = 500M SCOPE=SPFILE;
-- Down from 1G before
ALTER SYSTEM SET DB_KEEP_CACHE_SIZE = 400M SCOPE=SPFILE;

                            

Forcing Objects into the Keep Pool

                               
alter table DR#TEST_IDX0010$I storage (buffer_pool keep);
alter table DR#TEST_IDX0009$I storage (buffer_pool keep);

alter index DR#TEST_IDX0010$X storage (buffer_pool keep);
alter index DR#TEST_IDX0009$X storage (buffer_pool keep);

alter table DR#TEST_IDX0010$R storage (buffer_pool keep);
alter table DR#TEST_IDX0009$R storage (buffer_pool keep);


alter table TEST modify partition p10 storage (buffer_pool keep);
alter table TEST modify partition p9  storage (buffer_pool keep);

                            
Now we bounce the database and run the pre-loading queries again, so that the blocks are read into the keep pool.

Using TKPROF to monitor disk reads

                               
@?/rdbms/admin/utlxplan.sql  -- avoid tkprof plan table error

alter session set sql_trace=true;

-- Use a bind variable to cut down on recursive SQL from optimizer:

variable bnd varchar2(4000)

exec :bnd := 'district and region'

select /*+ FIRST_ROWS */ * from (
        select id, filename from test
        where contains (filename, :bnd) > 0
        order by id desc )
        where rownum <= 20;

                            
                               
tkprof mysid_ora_01234.trc traceout.txt sort=exeela table=testuser.plan_table

                            
                               
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.02       0.03          0        402          0           0
Fetch        3      0.00       0.11         34         65          0          20
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.02       0.15         34        467          0          20

                            

Now if we look at the end of the file, we will see:

                               
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      3      0.03       0.06          0        402          0           1
Fetch        3      0.00       0.11         34         65          0          20
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        8      0.03       0.17         34        467          0          21

Misses in library cache during parse: 2
Misses in library cache during execute: 2

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       41      0.02       0.01          0          0          0           0
Execute     66      0.08       0.08          0        543          0           7
Fetch       59      0.00       0.02          2        190          0          61
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      166      0.10       0.12          2        733          0          68

                            
                               
select /*+ rule */ bucket, endpoint, col#, epvalue
from
 histgrm$ where obj#=:1 and intcol#=:2 and row#=:3 order by bucket

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.01          2          6          0          11
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.00       0.01          2          6          0          11

                            

That leaves us with 32 unexplained disk reads from our main SQL statement.

To figure out where these come from, we need to dig a bit deeper.

We'll bounce the database again and pre-load the cache as before, but this time set an event flag before running our query:

                               
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8'; 

                            

To extract just the reads, we could do:

                               
grep WAIT sid_ora_1234.trc | grep read

                            
                               
WAIT #15: nam='db file sequential read' ela= 53 p1=1 p2=55741 p3=1
WAIT #15: nam='db file sequential read' ela= 29 p1=1 p2=55740 p3=1
WAIT #1: nam='db file sequential read' ela= 35 p1=7 p2=168890 p3=1
WAIT #1: nam='db file scattered read' ela= 63 p1=7 p2=191027 p3=2
WAIT #1: nam='db file scattered read' ela= 53 p1=7 p2=191030 p3=3
WAIT #1: nam='db file scattered read' ela= 68 p1=7 p2=191034 p3=4
WAIT #1: nam='db file scattered read' ela= 50 p1=7 p2=191038 p3=3
WAIT #1: nam='db file sequential read' ela= 7 p1=7 p2=191067 p3=1
WAIT #1: nam='db file sequential read' ela= 6 p1=7 p2=206475 p3=1
WAIT #1: nam='db file sequential read' ela= 23 p1=7 p2=206506 p3=1
WAIT #1: nam='db file scattered read' ela= 59 p1=7 p2=206552 p3=3
WAIT #1: nam='db file scattered read' ela= 104 p1=7 p2=206555 p3=6
WAIT #1: nam='db file sequential read' ela= 16 p1=7 p2=150156 p3=1
WAIT #1: nam='db file sequential read' ela= 8 p1=7 p2=150162 p3=1
WAIT #1: nam='db file sequential read' ela= 8 p1=7 p2=169084 p3=1
WAIT #1: nam='db file sequential read' ela= 9 p1=7 p2=184356 p3=1
WAIT #1: nam='db file sequential read' ela= 9 p1=7 p2=184461 p3=1
WAIT #1: nam='db file sequential read' ela= 23 p1=7 p2=77 p3=1
WAIT #1: nam='db file sequential read' ela= 21 p1=7 p2=5434 p3=1
WAIT #1: nam='db file sequential read' ela= 19 p1=7 p2=5545 p3=1
WAIT #1: nam='db file sequential read' ela= 20 p1=7 p2=5614 p3=1

                            
  • ela: time waited, in microseconds
  • p1: file number (join to FILE_ID in DBA_DATA_FILES or DBA_EXTENTS)
  • p2: block number (join to the extent ranges in DBA_EXTENTS)
  • p3: number of blocks read

In order to translate these file and block numbers into object names, we can query DBA_EXTENTS, like this:

                               
column owner format a30
column segment_name format a30
select owner, segment_name
from   dba_extents
where  &1 = file_id 
and    &2 between block_id and block_id + blocks - 1
/

                            
If we save this script as seg.sql, we can extract the file and block numbers from the trace file and feed each pair to it:
                               
cat $ORACLE_HOME/admin/$ORACLE_SID/udump/`ls -tr $ORACLE_HOME/admin/$ORACLE_SID/udump | tail -1` \
| sed -n -e 's/^WAIT.*read.* p1=\([0-9]*\) p2=\([0-9]*\) .*/\1 \2/p' \
| xargs -l1 sqlplus -s "/ as sysdba" @seg.sql

                            

The start of the output might look like this:

                               
OWNER      SEGMENT_NAME
---------- ------------------------------
SYS        C_OBJ#_INTCOL#

OWNER      SEGMENT_NAME
---------- ------------------------------
TESTUSER   SYS_IL0000056876C00002$$

OWNER      SEGMENT_NAME
---------- ------------------------------
TESTUSER   SYS_LOB0000056876C00002$$

                            
A quick query against DBA_SEGMENTS tells us what kind of segments these are:
                               
select segment_name, segment_type from dba_segments
where segment_name in ('SYS_IL0000056876C00002$$', 'SYS_LOB0000056876C00002$$');

SEGMENT_NAME                   SEGMENT_TYPE
------------------------------ ------------------
SYS_IL0000056876C00002$$       LOBINDEX
SYS_LOB0000056876C00002$$      LOBSEGMENT

                            
                               
select table_name from dba_indexes where owner = 'TESTUSER' and index_name = 'SYS_IL0000056876C00002$$';

TABLE_NAME
------------------------------
DR#TEST_IDX0010$R

select table_name from dba_lobs where owner = 'TESTUSER' and segment_name = 'SYS_LOB0000056876C00002$$';

TABLE_NAME
------------------------------
DR#TEST_IDX0010$R

                            

In order to understand that, we need to know a bit more about LOBs. A LOB can be stored either in-line - in the same blocks as the rest of the table row - or out-of-line, in its own segment. By default, LOBs are stored in-line. However, if a LOB value exceeds roughly 4K, it is automatically moved out-of-line. You can force all LOBs in a column to be stored out-of-line using the LOB storage clause "DISABLE STORAGE IN ROW".
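As an aside, this behavior can be specified when a table is created. A minimal sketch with a hypothetical table (not part of our test schema), forcing the LOB out-of-line from the first byte:

                               
CREATE TABLE lob_demo (
  id   NUMBER,
  data BLOB
)
LOB (data) STORE AS (DISABLE STORAGE IN ROW);

                            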

In the case of the $I tables, the LOB lengths are restricted by the Text code to 4K, to make sure they can always be stored in-line to minimize I/O requirements. Hence there should never be any requirement for separate LOB segments for the $I tables.

However, in the case of the $R table, things are different. The $R table consists of a BLOB (or BLOBs) containing a list of base-table ROWIDs, ordered by DOCID. For a small table with fewer than about 220 rows, this BLOB is under 4K and is stored in-line. As soon as it exceeds that size, it will be moved to out-of-line storage in a new segment.

To ensure that we get the $R LOB segments pre-cached in the keep pool, we need to do two things. First, we need to change our ALTER TABLE statements for the $R tables:

                               
alter table DR#TEST_IDX0010$R storage (buffer_pool keep) modify lob (data) (storage (buffer_pool keep));
alter table DR#TEST_IDX0009$R storage (buffer_pool keep) modify lob (data) (storage (buffer_pool keep));

                            

Secondly, we need to force the LOB blocks to be loaded, as well as the main table blocks for the $R table. This is a little more complicated. We might think we could do:

                               
SELECT SUM(ROW_NO), SUM(LENGTH(DATA)) FROM DR#TEST_IDX0010$R;  /* WRONG!! */

                            
But that doesn't work: getting the length of a LOB needs only the locator information, which is stored in-line with the table row, so the out-of-line LOB blocks are never read. To load them, we must read the LOB data itself with DBMS_LOB.READ. The following procedure does that for every $R table belonging to a given index, touching each 4K chunk of each LOB:
                               
set serveroutput on size 1000000

create or replace procedure loadAllDollarR (idx_name varchar2) is
  v_idx_name varchar2(30) := upper(idx_name);
  type c_type is ref cursor;
  c2 c_type;
  s varchar2(2000);
  b blob;
  buff raw(100);   -- must be RAW, not VARCHAR2, for DBMS_LOB.READ on a BLOB
  siz number;
  off number;
  cntr number;
begin
  for c1 in (select table_name t from user_tables
             where table_name like 'DR_'||v_idx_name||'%$R') loop
    dbms_output.put_line('loading from table '||c1.t);
    s := 'select data from '||c1.t;
    open c2 for s;
    loop
       fetch c2 into b;
       exit when c2%notfound;
       siz := 10;   -- reading just a few bytes per chunk is enough
       off := 1;
       cntr := 0;
       if dbms_lob.getlength(b) > 0 then
         begin
           loop
             -- step through the LOB 4K at a time; DBMS_LOB.READ raises
             -- NO_DATA_FOUND once the offset passes the end of the LOB
             dbms_lob.read(b, siz, off, buff);
             cntr := cntr + 1;
             off := off + 4096;
           end loop;
         exception when no_data_found then
           if cntr > 0 then
             dbms_output.put_line('4K chunks fetched: '||cntr);
           end if;
         end;
       end if;
    end loop;
  end loop;
end;
/

                            
                               
exec LoadAllDollarR('test_idx')

                            

Finishing Up

                               
-- This script preloads the last two partitions from a 10-way
-- partitioned table and its text index. The table is called TEST,
-- the index is called TEST_IDX. It is owned by user TESTUSER.

connect testuser/testuser

--------------------------------------------------------
-- First make sure all tables go into the keep buffer --
--------------------------------------------------------

alter table DR#TEST_IDX0010$I storage (buffer_pool keep);
alter table DR#TEST_IDX0009$I storage (buffer_pool keep);

alter index DR#TEST_IDX0010$X storage (buffer_pool keep);
alter index DR#TEST_IDX0009$X storage (buffer_pool keep);



alter table DR#TEST_IDX0010$R storage (buffer_pool keep)
      modify lob (data) (storage (buffer_pool keep));
alter table DR#TEST_IDX0009$R storage (buffer_pool keep)
      modify lob (data) (storage (buffer_pool keep));

alter table TEST modify partition p10 storage (buffer_pool keep);
alter table TEST modify partition p9  storage (buffer_pool keep);

set timing on

--------------------------------------------------------------
-- Then perform the necessary queries to preload the tables --
--------------------------------------------------------------

SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO))
       FROM DR#TEST_IDX0010$I ITAB;
SELECT /*+ FULL(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO))
       FROM DR#TEST_IDX0009$I ITAB;

SELECT /*+ INDEX(ITAB) */  SUM(LENGTH(TOKEN_TEXT))
       FROM DR#TEST_IDX0010$I  ITAB;
SELECT /*+ INDEX(ITAB) */  SUM(LENGTH(TOKEN_TEXT))
       FROM DR#TEST_IDX0009$I  ITAB;

SELECT /*+ FULL(BTAB) */ SUM(ID)
       FROM TEST BTAB WHERE ID >= 900000;

SELECT SUM(ROW_NO) FROM DR#TEST_IDX0010$R;
SELECT SUM(ROW_NO) FROM DR#TEST_IDX0009$R;

exec LoadAllDollarR('test_idx')

                            

Conclusion

The author welcomes feedback and corrections on this paper. Email him at roger.ford@oracle.com


Last modified: 31 August 2004 by Roger Ford




