Using CTX_REPORT to generate XML output of Index Information

CTX_REPORT was introduced in Oracle 9i in order to help users and Oracle Support find out information about an existing Oracle Text index.

In Oracle 10g, CTX_REPORT has been enhanced with the option to produce XML output for many of its procedures. This makes it much easier to incorporate information gained from CTX_REPORT into other applications - such as perhaps a custom system-monitoring application.

This paper walks through the process of generating the report, and processing the XML output using some of Oracle's XML capabilities.

The paper assumes a basic knowledge of Oracle Text and XML, but little or no knowledge of Oracle's XML capabilities. It is hoped that the paper will double as a primer on using some of Oracle's XML capabilities to convert from XML to SQL.

The reader is strongly encouraged to try all of the examples as they are presented. In general, the output of the examples is not shown in this document, in the assumption that the readers will be trying the examples for themselves. You will need an Oracle 10g database, and a SQL*Plus session - no extra software or products are necessary.

The examples are run on a Text index called "TI", owned by the user "TUSER". You could create a suitable test index by running the following SQL as a DBA user:

create user tuser identified by tuser default tablespace users
  temporary tablespace temp;
grant connect,resource,ctxapp to tuser;
connect tuser/tuser
create table testtab(text varchar2(2000));
create index ti on testtab(text) indextype is ctxsys.context;


Getting the Total Index Size

To find out the size of the index, we can use the function CTX_REPORT.INDEX_SIZE. This is a function returning a CLOB. Probably the simplest way to read it is in SQL*Plus using SELECT ... FROM DUAL, as follows:

SQL> select ctx_report.index_size('ti') from dual;

SQL> set long 50000

TOTAL BYTES USED:                                311,296 (304.00 KB)


So let's ask for an XML report instead:

SQL> select ctx_report.index_size('ti', null, 'XML') from dual;


This will give us a heap of XML output, including sizes for each object.

Here's a section of the output:



To do that bit by bit:

First, rather than directly selecting a CLOB, force it into an XMLTYPE value:

SQL> select xmltype(ctx_report.index_size('ti', null, 'XML')) from dual;


Next we'll use a simple XPATH expression to specify what we want to fetch:

SQL> select extract(xmltype(ctx_report.index_size('ti', null, 'XML')), '//SIZE_OBJECT') from dual;

tutorial at w3Schools

Notice that there are new-line characters around each element in the output. This is actually incorrect - it's a bug. We'll fix that now by using the SQL "replace" function to remove all line feeds (character 10). XMLType will automatically reformat our XML to look OK on output.

  extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 
from dual;

  extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 
from dual;

  extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 
from dual;


Using text():

  extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 
  as "Table Size"
from dual;

Using extractValue:

  extractValue(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 
  as "Table Size"
from dual;


We've now managed to extract a single value from the XML into a SQL environment. But what if we want to fetch ALL the sizes, perhaps to add them together to get a summary? To do that, we can use XMLSequence, which returns a collection of XMLType values. We can then use this function in a TABLE clause to unnest the collection values into multiple rows. Now that we're generating a "table", we no longer need to reference DUAL.

To get just the sizes:

select *
  from table(xmlsequence(
    extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 

select * 
  from table(xmlsequence(
    extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 


Set the column formatting for nice output

column "Name" format a33
column "Tablespace" format a30
column "Bytes" format a12

   extractValue(Column_Value, '/SIZE_OBJECT/SIZE_OBJECT_NAME') as "Name",
   extractValue(Column_Value, '/SIZE_OBJECT/SIZE_TABLESPACE_NAME') as "Tablespace",
   extractValue(Column_Value, '/SIZE_OBJECT/SIZE_BYTES_USED') as "Bytes"
   from table(xmlsequence(
    extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 

   sum(extractValue(Column_Value, '/SIZE_OBJECT/SIZE_BYTES_USED')) "Total"
   from table(xmlsequence(
    extract(xmltype(replace(ctx_report.index_size('ti', null, 'XML'),chr(10),'')), 



drop table output;
create table output(report XMLType);

  tlob clob;
       index_name=>'ti', report=>tlob, 
       list_size=>20, report_format=>'XML');
    insert into output values (xmlType (replace(tlob,chr(10),'')) );

select extractValue(report, 
         '//STAT_STATISTIC[@NAME="estimated row fragmentation"]')
       as "Fragmentation"
from output;

select extract(value(d), '//STAT_TOKEN_TEXT')
from output, table(xmlsequence(extract(report,
      '/CTXREPORT/INDEX_STATS/STAT_TOKEN_STATS/STAT_TOKEN_LIST[@NAME="most frequent tokens"]/STAT_TOKEN[position()<4]'))) d;