Concatenated Datastore - Performance
Improvements
Here are some real-world examples of the improvements that can be made using
the Concatenated Datastore Sample Code.
These queries were sent to me, together with the base data, by a
customer who was unhappy with the performance of "mixed" queries under
8.1.5. I indexed his data using the concatenated datastore, and re-ran
his queries against the new single index.
| Test Info |
| Table Size | 226135 rows |
| Hardware | 550MHz NT workstation, 512MB memory |
| Database block size | 2K (NT default - 4K recommended) |
| db_block_buffers | 8192 |
All queries were run with a "warm" cache - the database was restarted, then a
similar but unrelated query was run first, and then each of the
following queries was run once in turn. Pre-loading the cache would have
a significant effect.
| Query | Description | Hits | Old Time | New Time | Improvement |
| 1 | Prime cache | 10694 | 4.03s | 3.94s | N/A |
| 2 | Simple mixed query | 13253 | 24.5s | 2.64s | 9.28x |
| 3 | Two contains | 124031 | 24.6s | 1.22s | 20.1x |
| 4 | Mixed query with range | 192611 | 4m 56s | 7.71s | 38.4x |
| 5 | Contains and range | 13085 | 37.9s | 3.87s | 9.8x |
The Queries
Query 1 : seed cache - standard version
select count(*) from wcenter1 where contains(text, 'new york') > 0;
Query 1 : seed cache - concatenated datastore version
select count(*) from wcenter1 where contains(text, 'new york') > 0;
Query 2 : mixed query - standard version
select count(*) from wcenter1
where contains(text,'london') > 0 and type = 'W';
Query 2 : mixed query - concatenated datastore version
select count(*) from wcenter1
where contains (concat, 'london within text and w within type') > 0;
Query 3 : two contains - standard version
select count(*) from wcenter1
where contains(text,'sport') > 0 or contains(title,'sport') > 0;
Query 3 : two contains - concatenated datastore version
select count(*) from wcenter1
where contains (concat, 'sport within text or sport within title') > 0;
Query 4 : mixed query with range - standard version
select count(*) from wcenter1
where contains(text,'london') > 0 or time > 500 and type != 'D';
Query 4 : mixed query with range - concatenated datastore version
declare
result number;
qstring varchar2(4000);
begin
qstring := ctx_cd.int_contains ('hermes_cd', 'time', 501, null, 'G');
qstring := '( (london within text) or ( ( ' || qstring || ') not (d within type)) )';
dbms_output.put_line ('Query string is '||qstring);
select count(*) into result from wcenter1 where contains (concat, qstring) > 0;
dbms_output.put_line ('Count is ' || to_char(result));
end;
/
Query 5 : contains and range - standard version
select count(*) from wcenter1 where contains(text,'london') > 0
and time < 2000;
Query 5 : contains and range - concatenated datastore version
declare
result number;
qstring varchar2(4000);
begin
qstring := ctx_cd.int_contains ('hermes_cd', 'time', 1999, null, 'L');
qstring := '(london within text) and ' || qstring;
dbms_output.put_line('Query string is ' || qstring);
select count(*) into result from wcenter1 where contains (concat, qstring) > 0;
dbms_output.put_line ('Count is ' || to_char(result));
end;
/
Preloading the cache for better performance
On the other hand, there are a limited number of them. The index
entries all take the form 'H01', 'H02', etc or 'L01', 'LO2', continuing up to
the maximum set by the range in use.
If your system has enough memory, and the init.ora parameter DB_BLOCK_BUFFERS
is set high enough, it should be possible to get all of these entries cached
in the SGA.
This will make a significant improvement to the above figures.
I increased my DB_BLOCK_BUFFER setting from the default 8192 to 120000 (taking
240MB of memory with the default 2K block size). I then ran:
ALTER TABLE dr$hermes_cd$i CACHE;
This has the effect of NOT flushing the BLOB column of the table from the cache
at the first opportunity.
I then ran the following query (which took several minutes) in order
to load all the index data for range searches into the SGA. 0
SELECT COUNT(*) FROM wcenter1 WHERE CONTAINS (concat, 'l__ and h__') > 0;
Improvement
Query 5 above was run straight after this, and the time to execute reduced
from 3.87s to 0.94s - a major improvement!
Last updated Tuesday May 9th, 2000 by Roger
Ford
|