As Published In
Oracle Magazine
July/August 2007

TECHNOLOGY: Ask Tom


On Ignoring, Locking, and Parsing

By Tom Kyte

Our technologist looks out for WHEN OTHERS, locks, cascades, and parses.

There's a programming construct, WHEN OTHERS, in PL/SQL that I wish weren't there. This clause, when used in an exception block, catches any unhandled exceptions. There are times when this is useful, such as for logging errors: 

exception
   when others
   then 
      log_error(....);
      raise;
end;

But the problem I see is that far too many people use the WHEN OTHERS clause without following it by a RAISE or RAISE_APPLICATION_ERROR call. That effectively hides the error. In reality, the error happened, but it was not handled in any meaningful way; it was just ignored—silently. The invoker of your code has no idea that something went horribly wrong and that your code failed, and he or she typically thinks that everything was, in fact, successful.

On Ask Tom (asktom.oracle.com), I see questions and responses about this mistake made over and over again. For example, I recently received this question:

I created a package that spools 10 jobs with ranges of data to bulk-load into a flat file. I need to use UTL_FILE, because in the middle of the process, I call three procedures to retrieve some data. The total time of the process is six hours to bulk-load 9,500,000 strings into the file. I need your advice, because I need to decrease the total time.

This is the code: 

PROCEDURE prcl_MakeFile(...)
IS
... variables ... 
BEGIN
    l_FileID := UTL_FILE.FOPEN (...);
    OPEN cur;
    LOOP
        ... process records here ...
        ... lots of code ...
    END LOOP;
    CLOSE cur;
    UTL_FILE.FCLOSE(l_FileID);
EXCEPTION
    WHEN OTHERS THEN
        IF (UTL_FILE.IS_OPEN(l_FileID)) 
        THEN UTL_FILE.FCLOSE(l_FileID); 
        END IF;
END prcl_MakeFile; 

My answer was simple: I could make this code go infinitely fast. All this procedure needs to do is 

PROCEDURE prcl_MakeFile(...)
IS
... variables... 
BEGIN
    Return;
END prcl_MakeFile;

The two procedures are logically equivalent, but mine is much faster! So why are they logically equivalent? It is all because of the WHEN OTHERS clause coded in the procedure—the WHEN OTHERS clause that is not followed by a RAISE or RAISE_APPLICATION_ERROR. Suppose that the call to UTL_FILE .FOPEN failed in that routine—what would happen? All of that code would be skipped, but no one would know . If you have code wrapped with an exception block that contains WHEN OTHERS—but does not reraise the exception—all of the code in that exception block can be safely deleted, in my opinion. You do not need it, because if the code fails to execute, you ignore the fact that it did not execute . If the code is allowed to not execute sometimes, you can, in fact, allow that code to never execute . You cannot ever rely on this code's actually running, so you never have to execute it.

I also assert that it would be safer to not run the code than to run it. At least if you do not execute it, you know what state the database will be in. If you invoke a piece of code that includes the WHEN OTHERS construct (not followed by a reraising of the exception), the database will be left in some unknown state. For example, given the following procedure: 

procedure p
is
begin
   insert into t1 values(1);
   insert into t2 values(2);
   insert into t3 values(3);
exception
   when others then
      dbms_output.put_line
      ('something bad happened!'); 
end;

The invoker of this procedure would never know if 

  • No rows were inserted

  • All three rows were inserted

  • T1 was inserted, but not T2 and T3

  • T1 and T2 were inserted, but not T3

There are four possible outcomes for this procedure, but the invoker would never know which one had happened. At least if the WHEN OTHERS code were removed, the invoker would understand what had taken place after any given call—specifically, nothing .

To read more on this important topic, see 


Locking on the Web

I recently came across a .NET application running against Oracle Database 10g in which the developers had used optimistic locking (get a version ID from a table, update the required row, and then update the version ID) because it was the only way they could make sure that the same record was not updated by many users at the same time. I believe that this can be done with the FOR UPDATE clause rather than with unnecessary code that creates artificial locking. Am I right?

There are two ways n -tier applications access the database:

1. Stateful: holding a connection for a long period of time over many Web pages

2. Stateless: holding a connection for a very short period of time, maybe less than the time it takes just to generate a single HTML page

Most applications today use method 2, in my experience. In this fashion, end users consume resources only when they are actually "active" in the database. They grab a connection, use it, and release it.

If you have a stateless connection to the database, you cannot use pessimistic (FOR UPDATE) locking. You lose the lock you asked for with the FOR UPDATE clause after each page is generated. For that type of application, optimistic locking is the only way to go.

So the developers you are working with are probably doing the right thing. In my book Expert Oracle Database Architecture: 9 i and 10 g Programming Techniques and Solutions (Apress, 2005), I wrote about this extensively. The following is a small excerpt; the content has been edited for style and length.

Optimistic or Pessimistic Locking?

Which method is best? In my experience, pessimistic locking works very well in Oracle Database (but perhaps not in other databases) and has many advantages over optimistic locking. However, it requires a stateful connection to the database, such as a client/server connection, because locks are not held across connections. This single fact makes pessimistic locking unrealistic in many cases today. In the past, with client/server applications and a couple dozen or hundred users, it would have been my first and only choice. Today, however, optimistic concurrency control is what I recommend for most applications. Having a connection for the entire duration of a session is just too high a price to pay.

There are many ways to implement optimistic concurrency control, including 

  • Using a special column maintained by a database trigger or application code to identify the version of the record

  • Using a checksum or hash that was computed with the original data

  • Using the new Oracle Database 10g feature ORA_ROWSCN

So which do I use? I tend to use the version-column approach with a time stamp column. It gives me extra information about when a particular row was last updated. It is less computationally expensive than a hash or checksum, and it doesn't run the risk of encountering a hash or checksum in the processing of LONG, LONG RAW, CLOB, BLOB, and other very large columns.

If I had to add optimistic concurrency controls to a table that was still being used with a pessimistic locking scheme, such as if the table was accessed in client/server applications as well as on the Web, I would opt for the ORA_ROWSCN approach. That's because the existing legacy application might not appreciate it if a new column appeared, and even if I took the additional step of hiding the extra column, I might not appreciate the overhead of the trigger necessary to maintain it. The ORA_ROWSCN technique would be nonintrusive and lightweight in that respect (well, after I got over the table recreation, that is).

The hashing/checksum approach is very database-independent, especially if we compute the hashes or checksums outside of the database. However, performing the computations in the middle tier rather than in the database means higher resource usage penalties in terms of CPU usage and network transfers.

Cascading Updates

I have an EMP table with child tables, and each of the child tables has its own child tables. I want to update employee code in EMP, and I want all child tables (including child-of-child tables) to update automatically. How can I do this?

Primary keys are supposed to be immutable—unchanging, constant. If you ask me, you have a data model problem, not a SQL problem. If you feel the need to update a primary key and have it cascade, you really need to rethink your approach. You should realize that the employee code in your example is not the primary key of the EMP table—not if it changes. You would need to choose something else (even an artificial key) to be the primary key.

However, you can use an UPDATE cascade if it is truly needed (for example, as a one-time fix for data you are merging). Use deferrable constraints and a stored procedure; for example, suppose you have the following schema: 

create table p 
( x int primary key );

create table c1 
( x constraint c1_fk_p r
    references p deferrable, 
  y int, 
  primary key(x,y) );

create table c2 
( x int, 
  y int, 
  z int,
  constraint c2_fk_c1 
foreign key(x,y) 
  references c1 deferrable,
  primary key(x,y,z));

Now you can set the constraints on C1 and C2 to be deferred—not checked until you either set them as IMMEDIATE or commit the transaction—whichever comes first. This allows you to write a stored procedure, as shown in Listing 1.

Code Listing 1: Cascading changes with deferred constraints 

create or replace procedure 
cascade_p_c1_c2
( p_old in int, p_new in int )
as
begin
   execute immediate 'set constraint c1_fk_p deferred';
   execute immediate 'set constraint c2_fk_c1 deferred';
   update p set x = p_new where x = p_old;
   update c1 set x = p_new where x = p_old;
   update c2 set x = p_new where x = p_old;
   execute immediate 'set constraint c1_fk_p immediate';
   execute immediate 'set constraint c2_fk_c1 immediate';
end;

And now you can call this procedure, and it will be able to successfully cascade the update from the parent table P to the child table C1 and its child table C2. But again, you would do this only on the rarest of rare occasions—this is not something that should become part of your permanent design and implementation strategy.

A Parse Is a Parse Is a Parse

I have a database with "latch : library cache" problems. I'm trying to identify the source of these problems with STATSPACK. 

                 Per Second    Per Transaction
                -----------    ---------------
...
User calls:        1,107.76              53.24
Parses:              389.92              18.74
Hard parses:           0.28               0.01
...

Do you have any suggestions?

This is a system that parses a lot of SQL—your system is parsing about 390 times per second. The good news is that the parses are mostly soft .

The only people who can reduce parses are the developers themselves. Oracle Database parses a SQL statement every time the application tells it to, and here the application is telling the database to do it a lot.

In the 12.98 minutes of your STATSPACK information, you did about 303,669 parses. A parse requires the library cache latch.

The ultimate solution: realize that the only good parse is a nonexistent parse and have the code that executes against the database adhere to this philosophy and have it not parse so much. Keep cursors open. Do not close them until you do not need them. You can promote this approach easily by placing all SQL into stored procedures (where PL/SQL will automatically cache them—not close them—so even if you say "PL/SQL close this cursor," the database will ignore you and cache it).

If you use Java, use Java DataBase Connectivity (JDBC) statement caching, so that JDBC will ignore your developers when they say "close this cursor."

The only way to reduce latching is to reduce the number of times you do the thing that requires latching. And parsing is a huge user of latches.

One thing to look into: if you have not set the session_cached_cursors parameter, you might consider using that setting. It can make your soft parses "softer." This setting will help if your application performs the following logic over and over:

1. Parse
2. Bind
3. Execute
4. Close
5. Go back to 1 and repeat all of the above, over and over

Consider this example, which I monitored by using a small test harness I use frequently called runstats. I'll start with a small routine that just parses and parses over and over again, using dynamic SQL (the parses will be mostly soft parses), as shown in Listing 2.

Code Listing 2: Parsing over and over 

create or replace procedure p( p_n in number )
as
    l_cursor sys_refcursor;
begin
    for i in 1 .. p_n
    loop
       open l_cursor for 
         'select * from dual d' || mod(i,2);
       close l_cursor;
    end loop;
 end;

Now, if I compare the latching differences when I run that procedure with and without session cursor caching, I'll see a large difference, as shown in Listing 3.

Code Listing 3: Latches with session_cached_cursors=0 

SQL> alter session set session_cached_cursors=0;
Session altered.

SQL > exec runStats_pkg.rs_start;
PL/SQL procedure successfully completed.

SQL > exec p(1);
PL/SQL procedure successfully completed.

SQL > exec runStats_pkg.rs_middle;
PL/SQL procedure successfully completed.

SQL > exec p(100000);
PL/SQL procedure successfully completed.

SQL > exec runStats_pkg.rs_stop(90000);

Name                            Run1       Run2     Diff
STAT...parse count (total)         6    100,005   99,999
LATCH.shared pool simulator        8    100,012  100,004
LATCH.shared pool                 10    100,053  100,043
LATCH.library cache lock          36    400,044  400,008
LATCH.library cache               67    400,093  400,026

Run1 latches total versus runs -- difference and pct
        Run1    Run2    Diff    Pct
        361     11,001,012      1,000,651       .04%
PL/SQL procedure successfully completed.

So, Listing 3 shows that 100,000 soft parses needed about 11,000,000 latches—most on the library cache—when I parsed over and over and over. Now I set the session_cached_cursors parameter as follows: 

SQL> alter session set 
session_cached_cursors=100;
Session altered.

I run the same example and observe numbers similar to those in Listing 4.

As you can see in Listing 4, there is a huge reduction in latching, but the parsing—the raw number of times I parse—is the same, because the session cursor cache came into play and made the soft parses softer .

Code Listing 4: Latches with session_cached_cursors=100 

Name                            Run1       Run2     Diff
STAT...parse count (total)         6    100,005   99,999
STAT...execute count               6    100,005   99,999
STAT...session cursor cache hi     2    100,001   99,999
STAT...calls to get snapshots      2    100,001   99,999
STAT...opened cursors cumulati     6    100,005   99,999
STAT...recursive calls             5    300,002  299,997

Run1 latches total versus runs -- difference and pct
Run1   Run2    Diff    Pct
 304    845     541 35.98%

Stringing Them Up

I need to take the results of a query and pivot a value. That is, I would like the output from the EMP table to look like this: 

      DEPTNO      ENAME
------------      --------------------
          10      clark king miller
          20      adams ford ...
...

Can this be done in just SQL?

With the addition of analytic functions in Oracle8i Release 2 and the SYS_CONNECT_BY_PATH() function in Oracle9i Database Release 1, this became something you can do rather easily in SQL. Take the following approach:

1. Partition the data by DEPTNO and, for each DEPTNO, sort the data by ENAME, and assign a sequential number by using the ROW_NUMBER() analytic function.

2. Use a CONNECT BY query, starting with ROW_NUMBER() equal to 1 and connecting that record to the same DEPTNO value with ROW_NUMBER() equal to 2, and so on. So, you eventually end up with a record that is the result of connecting 1 to 2 to 3 to 4, and so on, for each DEPTNO value.

3. Select just the "longest connect by path" for each DEPTNO value—the longest connect by path for each DEPTNO value will have all of the ENAME values gathered together.

The SYS_CONNECT_BY_PATH() function will return the list of concatenated ENAME values.

The query looks like this: 

select deptno,
    max(sys_connect_by_path
       (ename, ' ' )) scbp
  from (select deptno, ename, 
            row_number() over 
           (partition by deptno 
            order by ename) rn
         from emp
          )
start with rn = 1
connect by prior rn = rn-1 
and prior deptno = deptno
  group by deptno
  order by deptno
 /

   DEPTNO         SCBP
---------         ----------------------------------
       10         CLARK KING MILLER
       20         ADAMS FORD JONES SCOTT ...
       30         ALLEN BLAKE JAMES MARTIN ... 

Cutting Down on Redo

Next Steps



ASK Tom
Oracle Vice President Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.
 asktom.oracle.com

READ more Tom
Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions

I have a PL/SQL package that copies data across multiple databases by using a database link to our data warehouse. It uses BULK_COLLECT with a limit of 1,000 to 2,500, depending on the column count within each table. I am committing outside the loop, so there is only one commit.

The DBAs are stating that the amount of REDO is incredible, and they have had to increase the disk space available for Oracle Database to catch up. They haven't specifically stated that the issue is from my code, but the issue appeared around the time of my code implementation.

How can I monitor or ensure that the logging is optimized when writing code?

By doing things in single SQL statements as much as possible. The biggest impact you can have on REDO generation is to limit the amount of work done per call (per execution).

Also, look for opportunities to do direct-path, nonlogged operations. (But coordinate with your DBAs! They need to do a backup right after the operation.)

Look at the difference in REDO generated between a row-by-row (slow-by-slow) approach and a single SQL statement, shown in Listing 5.

Code Listing 5: Minimizing redo 

SQL> create table t ( x int primary key, y char(10), z date );
Table created.

SQL > create index t_idx1 on t(y);
Index created.

SQL > create index t_idx2 on t(z);
Index created.

SQL > @mystat "redo size"

NAME                   VALUE
-------------          -------------
redo size              84538288

SQL > begin
  2    for x in (select object_id a, 'x' b, created c from all_objects)
  3    loop
  4        insert into t values ( x.a, x.b, x.c );
  5    end loop;
  6  end;
  7  /
PL/SQL procedure successfully completed.

SQL > @mystat2

NAME                   VALUE             DIFF
-------------          -------------     ---------- 
redo size              144124840         59,586,552

SQL > truncate table t;
Table truncated.

SQL > @mystat "redo size"
SQL > set echo off

NAME                   VALUE
-------------          ------------ 
redo size              144294508

SQL > begin
  2   insert into t select object_id, 'x', created from all_objects;
  3  end;
  4  /
PL/SQL procedure successfully completed.

SQL > @mystat2

NAME                  VALUE             DIFF
-------------         -----------       -----------
redo size             168114280         23,819,772

That is 59MB of REDO with a row-by-row insert versus 23MB with a single, efficient SQL statement! 


Tom Kyte has worked for Oracle since 1993. He is a vice president in the Oracle Public Sector group and the author of Expert Oracle Database Architecture: 9 i and 10 g Programming Techniques and Solutions (Apress, 2005) and Effective Oracle by Design (Oracle Press, 2003), among others.

Send us your comments