As Published In
Oracle Magazine
July/August 2003

TECHNOLOGY: Ask Tom


On Procedures, Flushes, and Writes

By Tom Kyte Oracle Employee ACE

Our expert compares triggers, procedures, flushing, and not flushing.

Suppose a piece of code can be implemented in either trigger or procedure. Which is better in terms of execution speed, and why?

The answer to the first part is to call a packaged procedure, either directly or in the trigger itself. The SQL you code in packages and procedures will be cached for your entire session, thus reducing the number of parse calls. The SQL inside a trigger is cached only for the duration of the triggering statement, causing additional soft parses whenever the trigger is executed. The bottom line here is to put the logic into packaged PL/SQL procedures and call them from the trigger.

I can show this behavior by setting up a small simulation. In the real world, I would use a PL/SQL package, but here, for demonstration purposes, I'll just use a small standalone procedure. I will also create a table and a trigger on this table. The trigger will execute a SQL query that I can easily identify in a TKPROF report, and the trigger will call the standalone procedure as well. The standalone procedure will execute SQL that is similar to that in the trigger, but again, the SQL will be identifiable in the TKPROF report as having come from the procedure. The example, therefore, begins by creating the table and the code for the procedure I'll be calling: 

SQL> create table t ( x int );
Table created.


SQL> create or replace procedure p
  2  as
  3     l_cnt int;
  4  begin
  5     select count(*) into l_cnt
  6       from dual IN_PROCEDURE;
  7  end;

  8  /
Procedure created.


Note how I "tagged" the SQL query by aliasing DUAL to be IN_PROCEDURE . That will help me identify it in the TKPROF report. Now for the trigger on this table: 

SQL> create or replace trigger t_trig
  2  before insert on t for each row
  3  declare
  4     l_cnt number;
  5  begin
  6     select count(*) into l_cnt 
  7       from dual IN_TRIGGER;
  8     p;
  9  end;

 10  /
Trigger created.


And now I am ready to test. All I do to test this is turn on SQL_TRACE and then perform some inserts into T . I'm showing only two of the inserts here for brevity, but I ran the first single-row insert six times and the second multirow insert once: 

SQL> alter session set sql_trace=true;

Session altered.

SQL> insert into t values ( 1 );
1 row created. Repeated 6 times

SQL> insert into t
  2  select rownum from all_users;
32 rows created.


Now, when I review the resulting TKPROF report, I observe the following: 

select count(*)
  from dual IN_TRIGGER;

call     count  ...
------  ---     ...
Parse   7       ...
Execute 38      ...
Fetch   38      ...
------  ---     ...
total   83      ...


select count(*)
  from dual IN_PROCEDURE;

call    count   ...
------  ---     ...
Parse   1       ...
Execute 38      ...
Fetch   38      ...
------  ---  ...
total   77      ...


Notice how the parse count for the query tagged IN_TRIGGER is 7, which is 6 parses for each of the single-row inserts and another parse for the multirow insert. Every call, every statement that fired that trigger, caused a soft parse of all the SQL inside that trigger. The SQL statement tagged IN_PROCEDURE , however, did not exhibit that behavior. It was parsed once for my session, and PL/SQL will keep it cached as long as possible for me, reusing that cached cursor over and over. Since excessive parsing (even soft parsing) not only consumes resources (such as CPU time) but also requires latches into the shared pool, it will lead to performance and scalability issues over time.

Flush the Cache

ALTER SYSTEM FLUSH SHARED_POOL flushes the shared pool entries, not the block buffers. Is there a way to flush them, too? This is important for a tuning exercise where several methods are tried one after the other, but we would like to reduce the effect of a preexisting block in the buffer without doing a database restart.

Actually, it is important that a tuning tool not do that. It is important to run the test, ignore the results, and then run it two or three times and average out those results. In the real world, the buffer cache will never be devoid of results. Never. When you tune, your goal is to reduce the logical I/O (LIO), because then the physical I/O (PIO) will take care of itself.

Consider this: Flushing the shared pool and buffer cache is even more artificial than not flushing them. Most people seem skeptical of this, I suspect, because it flies in the face of conventional wisdom. I'll show you how to do this, but not so you can use it for testing. Rather, I'll use it to demonstrate why it is an exercise in futility and totally artificial (and therefore leads to wrong assumptions). I've just started my PC, and I've run this query against a big table. I "flush" the buffer cache and run it again: 

SQL> set autotrace traceonly statistics
SQL> set timing on
SQL> select count(data_object_id)
  2    from big_table real_pio;
1 row selected.
Elapsed: 00:00:07.06

Statistics
--------------------------------
      13720  consistent gets
      13646  physical reads


So apparently, on this system, in order to perform 13,646 PIOs, it takes on the order of seven seconds. Or does it? I can have Oracle get rid of the cached blocks for a tablespace simply by taking that tablespace offline and putting it back online as follows: 

SQL> alter tablespace users offline;
Tablespace altered.

SQL> alter tablespace users online;
Tablespace altered.

SQL> select count(data_object_id)
  2    from big_table fake_pio;
1 row selected.
Elapsed: 00:00:01.06

Statistics
--------------------------------

      13656  consistent gets
      13646  physical reads


So, how do I reconcile that? All of a sudden, 13,646 PIOs on this system take one second. Even if I shut down and restart the database between runs, I would observe this strange behavior; I know because I tested that as well. What has happened here is that the operating system itself has a file system buffer cache, and the PIO that you believe you were doing was fake; it was being satisfied by the OS buffer cache itself! In order for me to get this query to take seven seconds again, I would have to either do something on my system to completely flush the OS file system buffer cache or reboot the computer-neither of which happens in the real world.

It is interesting how this particular issue, that of the OS file system cache, affects us. Here, it would give us a skewed view of our query performance. Some queries that did 100 PIOs would seem to be worse performers than some queries with 10,000 PIOs, depending on what is in this OS file system cache. Another place I see this come up is when a team decides to move all of its data files from regular file systems to raw (devices).

The motivation here is that everyone knows "raw is fast" and "raw is faster than regular file systems." People observe their own systems and see them doing a lot of PIO. At that point conventional wisdom says, "Since raw is faster than regular file systems, we should move to raw, and our systems will go faster."

Well, the day after making the move to raw, performance on many of these systems is horrible—much worse than before. What happened? Their systems were double-buffered. There was the Oracle cache and the OS file system cache. Maybe 90 percent of their PIO when using the cooked (nonraw) file system was satisfied via the OS buffer cache. So, only 10 percent of their PIO really went to disk. When they moved to raw, they took that OS buffer cache away, so now 100 percent of their PIO is true PIO; it really goes to disk. Performance goes down as a result.

So while you can flush the Oracle buffer cache in a performance test, you never would. Not unless you want to get misleading results, that is.

Insert and Update from a Record

I am using the Oracle9i Release 2 new feature that allows you to insert a record using a PL/SQL record type and update a row using the same. It is not allowing me to specify the column names in the INSERT/UPDATE statement. Is this normal?

This question refers to the new capability in Oracle9i Release 2 that lets you perform operations using a PL/SQL record instead of an explicit list of columns. For example, suppose you had a simple table: 

SQL> create table t
  2  ( x varchar2(5),
  3    y varchar2(5) );
Table created.



In Oracle9i Release 2, you can create and update records using this notation: 

SQL> declare
  2      l_rec t%rowtype;
  3  begin
  4      l_rec.x := 'x val';
  5      l_rec.y := 'y val';
  6
  7      insert into t
  8      values l_rec;
  9
 10      l_rec.x := 'X VAL';
 11      update t
 12         set row = l_rec;
 13  end;

 14  /
PL/SQL procedure successfully completed.

SQL> select * from t;

X       Y
-----   -----
X VAL   y val


In prior releases, you would have had to code INSERT INTO T (X,Y) VALUES ( L_REC.X, L_REC.Y ) , which for tables with many columns could be tedious. So this shortcut works nicely. If you wanted to explicitly list the columns in the INSERT (or UPDATE ) statement, you would use an inline view. I'll demonstrate this technique, but I'll also use it to demonstrate why you might not want to do this and point out an important fact about this new feature. That fact is that the INSERT and UPDATE are done positionally, not by name. If you explicitly list the columns in the INSERT/UPDATE statement, you could be setting yourself up for trouble later on. First, I'll show the syntax for doing this: 

SQL> declare
  2      l_rec t%rowtype;
  3  begin
  4      l_rec.x := 'x val';
  5      l_rec.y := 'y val';

  6
  7      insert into
  8      (select x, y from t)
  9      values l_rec;
 10
 11      l_rec.x := 'X VAL';
 12      update
 13      (select x, y from t)
 14         set row = l_rec;
 15  end;
 16  /
PL/SQL procedure successfully completed.

SQL> select * from t;

X       Y
------  -------
X VAL   y val


Notice how I am modifying an inline view. Here I can explicitly list out the columns in order. That last part is key: in order. Consider what happens if I reverse X and Y in the inline view above: 

SQL> declare
  2      l_rec t%rowtype;
  3  begin
  4      l_rec.x := 'x val';
  5      l_rec.y := 'y val';
  6
  7      insert into
  8      (select y, x from t)
  9      values l_rec;
 10
 11      l_rec.x := 'X VAL';
 12      update
 13      (select y, x from t)
 14         set row = l_rec;
 15  end;
 16  /
PL/SQL procedure successfully completed.

SQL> select * from t;


X       Y
-----   -----
y val   X VAL


The data for X went into Y and the data for Y went into X. This is by design; these records are applied to the table positionally, in the order of the attributes in the record. The reason I don't think you want to use this technique, even if you supply the column names in the "correct" order, is that if someone comes along later and drops T and rebuilds it as CREATE TABLE T (Y INT, X INT) , thus changing the order of the columns, your explicit code will end up doing the wrong thing. If, on the other hand, you just use INSERT INTO T VALUES L_REC , then when that table is dropped and rebuilt, the database will recompile your code and "fix everything" for you. X will go into column X and Y into column Y. So this is one time when being explicit is not a good idea.

This also points out that you always want to use TABLE_NAME%ROWTYPE for declaring records you use with this new feature. If you define your own record type and someone adds, removes, or moves columns in the table, your existing code will break—perhaps silently.

By silently, I mean that the code does not raise an error at runtime; it just silently puts the data into the wrong columns. This sort of error would be difficult at best to detect and find.

How to Write this Query

The following statement is a basic example of what I want to do: 

select ename from bonus 
  where sal = nvl(:v, sal);


Disregarding why I'd want to do this, would it be preferable to a) have two statements, one used if :v is null, the other if not null; b) dynamically build a ref cursor; c) leave as is; or d) do something else.

This question shows that one can learn something new every day. My initial response was going to be that A or B is the right answer, but testing has shown that today C is the right answer! A or B used to be correct (and might be for some of you out there), but C is the correct answer now. This is why I always try to provide a test case demonstrating what I say, since things can and do change over time.

When I thought that A or B was the right answer, I was applying my knowledge of the rule-based optimizer (RBO) to queries in general. When I set up the test to show how superior A or B would be as compared to C, I discovered that A and B are now inferior to C using the cost-based optimizer (CBO)! What I'll do to show this is to set up a very simple but sizable table: 

SQL> create table t
  2  ( acc_nbr number,
  3    data char(255),
  4    constraint t_pk
  5    primary key(acc_nbr) );
Table created.

SQL> insert into t
  2  select rownum, 'x'

  3    from all_objects;
29882 rows created.

SQL> analyze table t compute statistics
  2  for table
  3  for all indexes
  4  for all indexed columns;
Table analyzed.


And now I'll test the efficiency of the various approaches. I will run two queries that differ only in the optimizer they use. That query is: SELECT COUNT(*) FROM T WHERE ACC_NBR = nvl(:n,ACC_NBR) . I'll run it with and without a /*+ RULE */ hint, and I'll run each version of it with a NULL bind value as well as a non-NULL value and compare the results using SQL_TRACE and TKPROF. Starting with the query using the CBO and a NULL bind variable, I find the following: 

select count(*)
  from t n_is_null_cbo
 where acc_nbr = nvl(:n,acc_nbr);
call    count     cpu   elapsed   query
-----    -----  ----    --------        ------
total       4    0.23      0.27    62

        Rows    Row Source Operation
         ------ ---------------------------
        1       SORT AGGREGATE
        29882   CONCATENATION

        29882   FILTER
        29882   INDEX FAST FULL SCAN T_PK
        0       FILTER
        0       INDEX UNIQUE SCAN T_PK


Notice how the query plan is almost like two plans in one: There is an INDEX FAST FULL SCAN that is used when the bind variable is NULL and an INDEX UNIQUE SCAN that, in this case, did not execute at all. The plan was constructed in such a way that at runtime, N was NULL, so that portion of the plan would not execute. I got my answer in this case using an efficient FAST FULL SCAN of the small primary key index. Compare this to the query using the RBO: 

select /*+ RULE */ count(*)
  from t n_is_null_rbo
 where acc_nbr = nvl(:n,acc_nbr);

call    count    cpu    elapsed query
-----   -----   ----    --------        -----
total       4   0.10       0.13  1131

          Rows  Row Source Operation
        ------  ----------------------
        1       SORT AGGREGATE
        29882     TABLE ACCESS FULL T


The RBO is pretty simplistic here. It is not aware of the fact that it could use the index with a FAST FULL SCAN (because that access path is unique to the CBO). Further, it has a "one plan fits all" mentality as well. This query makes a full scan of the table regardless of the value of :N at runtime. Just comparing the efficiency of the INDEX FAST FULL SCAN used by the CBO with 62 logical (query column) IOs versus the RBO FULL SCAN with 1,131 logical IOs indicates that there is great benefit in the CBO already. But it gets better.

Now, I set the bind variable : N to the number 55 and rerun both queries.

Here I observe the following: 

select count(*)
  from t n_is_55_cbo
 where acc_nbr = nvl(:n,acc_nbr);

call    count    cpu     elapsed         query
-----   -----   ----    --------        ------
total      4    0.00      0.00       2

Rows   Row Source Operation
----   -----------------------------
   1   SORT AGGREGATE
   1    CONCATENATION
   0     FILTER
   0      INDEX FAST FULL SCAN T_PK
   1     FILTER
   1      INDEX UNIQUE SCAN T_PK


Notice how in this case the INDEX FAST FULL SCAN portion of the query does not execute at all. That is clear not only by the zero row counts in the query plan but also even more so by the reduction in logical IOs—only 2 logical IOs, in this case, to do the INDEX UNIQUE SCAN. Now, comparing the same to the RBO using : N = 55, I observe the following: 

select /*+ RULE */ count(*)
  from t n_is_55_rbo
 where acc_nbr = nvl(:n,acc_nbr)

call    count    cpu     elapsed  query
-----   -----   ----    --------        -----
total       4   0.10      0.13    1131

        Rows    Row Source Operation
        ------  ----------------------
        1       SORT AGGREGATE
     1    TABLE ACCESS FULL OBJ#


This is predictable. No matter what the inputs, this query behaves the same: a full scan of all 1,131 blocks every time. So, there are two things to take away from this question. One is that things change and you learn something new every day. The other thing to take away from this question is that the CBO offers great benefits once you start using it.

Next Steps


ASK Tom
 asktom.oracle.com
Oracle Vice President Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.

READ
 more Tom
asktom.oracle.com

Oracle documentation
oracle.com/technetwork/documentation

DISCUSS
Oracle technology
oracle.com/forums

Advanced access paths are available to the CBO that simply are not available to the RBO. Intelligent, sophisticated query plans—such as those demonstrated above—are unique to the CBO and are just not available to the RBO. Even more compelling, perhaps, is the fact that the release that follows Oracle9i Release 2 will not even include a rule-based optimizer.

Read-Only Distributed Queries

We started our standby database in read-only mode. Almost all users are able to run read-only queries just fine. One user however, has a database link to another database, and he cannot run a read-only query. He receives an ORA-16000 "database open for read-only access" error. The target database he is connecting to is read-write. The source database temp data files are read-write. Could you please enlighten us on what is going wrong?

The distributed facility in the database starts a distributed transaction for you "just in case." If you set your transaction to read-only prior to executing the queries, you can disable this behavior. Issue a SET TRANSACTION READ ONLY command before issuing any other statement, and all of your distributed queries will then run successfully.

If you attempt this when connected as the SYS user, however, you will still receive the ORA-16000. That is because the SYS user can never be in a read-only transaction. But this solution works for all "normal" users, and we all know that we should not be using SYS for anything anyway!


Tom Kyte (thomas.kyte@oracle.com) has worked for Oracle since 1993. Kyte is a vice president in the Oracle Government, Education, and Health group; the author of Expert One-on-One: Oracle (Wrox Press); and a coauthor of Beginning Oracle Programming (Wrox Press).

 

Send us your comments