By Steven Feuerstein
Explode and decode the myths in your code.
Don't use the XYZ feature; there was a problem back in version n.n." "Always use explicit cursors; doing so is the most efficient way to fetch a single row of data."
"Avoid packages; they use too much memory."
I bet you've heard one variation or another of such statements. They are just a few of the myths that are perpetuated in our PL/SQL applications. In some cases, there was a very good reason for their initial injection into our software. In many other cases, ignorance is a more active ingredient. Whatever the original motivation, software that reflects these myths can be very hard to ferret out and even harder to replace with "true" code. This article explodes a variety of the most pernicious PL/SQL myths, shows you how to systematically remove them, and offers advice about how to avoid writing today what will be mythological code tomorrow.
How do myths creep into code? Let's look at a common scenario in software development: coding bug workarounds.
Suppose I am building an application that relies on a third-party analysis package called analyze_rates . It contains the optimal_plan function, which returns information about the optimal plan for a given company.
CREATE OR REPLACE PACKAGE analyze_rates IS TYPE optimal_info_rt IS RECORD ( rate_level PLS_INTEGER, rate_type PLS_INTEGER, is_optimal BOOLEAN); FUNCTION optimal_plan (id_in IN company.id%TYPE) RETURN optimal_info_rt;
The function is supposed to return a value of FALSE in the is_optimal field of the optimal_info_rt record TYPE if the plan is not optimal. Unfortunately, this function has a bug: It always returns TRUE for is_optimal , even if the plan is not the best. Even worse, I cannot change this package, because it is provided by a third party and is wrapped.
Fortunately, there is a workaround: It turns out that if the rate level and type are NULL, the plan is not optimal.
In a big hurry, I send out a memo to my team of 25 developers, explaining the problem and the solution. I instruct them all to write code like the following whenever they call analyze_rates.optimal_plan :
DECLARE l_company_id company.id%TYPE; opt_info analyze_rates.optimal_info_rt; BEGIN get_company (l_company_id); opt_info := analyze_rates.optimal_plan (l_company_id); IF opt_info.rate_level IS NULL AND opt_info.rate_type IS NULL THEN -- Not the optimal plan!
That works—for now. So all of my developers will copy and paste this IF statement into their programs whenever they need it. The foundation of a myth has thereby been established.
What happens when and if the vendor changes the behavior of the optimal_plan function, rendering the workaround invalid? What happens when the vendor actually fixes the problem and optimal_plan does finally return FALSE in the is_optimal field of the record?
It's very unlikely that my team is going to comb through all its programs to find and replace the compound IF statement.
The likely outcome is that this workaround will become a permanent part of the application, yet it is also a myth reflecting an old reality, causing confusion, and greatly increasing the potential for more future bugs.
At the end of this article, in the section "How to Keep Myths Out of Your Code," I show how to avoid such a scenario and make it easy to upgrade your code as bugs are fixed and functionality improves, thereby erasing the myth before it can take hold in the code. First, however, let's consider a few of the most commonly occurring myths in PL/SQL programs.
PL/SQL was designed—and has consistently proved itself—to be the fastest, easiest programmatic path to the Oracle Database. This is evident when it comes to querying data from a database, because PL/SQL gives you several ways to write and execute SELECT cursors, including implicit static cursors, explicit static cursors, DBMS_SQL dynamic cursors, native dynamic SQL cursors, and BULK COLLECT INTO . Implicit and explicit static cursors are the most common ways to query information from the database.
CREATE OR REPLACE PROCEDURE change_team ( id_in IN team_member.id%TYPE ,new_team_id_in IN team_member.team_id%TYPE ) IS l_team_member team_member%ROWTYPE; BEGIN SELECT * INTO l_team_member FROM team_member WHERE id = id_in;
The explicit cursor: If you would like more control over your cursor processing—and possibly want to reuse your SQL statement—you can declare your own cursor explicitly and then perform each operation yourself. Here is a rewrite of the same change_team program, this time with an explicit cursor:
CREATE OR REPLACE PROCEDURE change_team ( id_in IN team_member.id%TYPE ,new_team_id_in IN team_member.team_id%TYPE ) IS CURSOR member_cur IS SELECT * FROM team_member WHERE id = id_in; l_team_member member_cur%ROWTYPE; BEGIN OPEN member_cur; FETCH member_cur INTO l_team_member;
For years, many experts in the Oracle world (including me) preached a fairly rigid dogma about explicit and implicit cursors. The dogma went like this:
You should always use explicit cursors ( CURSOR <cursor_name> IS ) and avoid implicit cursors ( SELECT INTO ) entirely, because explicit cursors are always more efficient. They are more efficient because implicit cursor behavior must conform to the ANSI standard, which dictates that even for a single-row query, you must always perform two fetches: one to fetch the row and a second to see if there are too many rows (two or more). Hence, using an implicit cursor is slower than using an explicit cursor, for which you can issue a single FETCH command.
Doesn't that sound logical and convincing? The problem is that although it might have been true a long time ago in Oracle6, it certainly isn't true for Oracle8i, Oracle9i, or Oracle 10g. In fact, implicit cursors often run faster than explicit cursors. That explicit cursors are always going to be more efficient is one of the most enduring of PL/SQL myths, because it was propagated early and widely.
So what should you do when it comes to writing cursors? I suggest the following guidelines:
1. Whenever possible, use BULK COLLECT —it offers dramatic performance gains. In Oracle9i Release 2, you can even use BULK COLLECT to fetch multiple rows directly into a collection of records.
2. For fetching just a single row, using BULK COLLECT doesn't make sense. Whatever kind of cursor you decide to use for your single-row fetches, hide that cursor behind a function interface. By placing a layer of PL/SQL code between your application logic and the underlying SQL, you give yourself the latitude to change the implementation of the query without affecting your application code. With this approach, my change_team program would look like the following:
CREATE OR REPLACE PROCEDURE change_team ( id_in IN team_member.id%TYPE ,new_team_id_in IN team_member.team_id%TYPE ) IS l_team_member team_member%ROWTYPE; BEGIN l_team_member := team_member_qp.onerow (id_in);
team_member_qp is the query package for team_member , encapsulating a wide range of query logic for the table, including the onerow function. Within the onerow function, I can choose an implicit or explicit cursor—or even see if caching team_member data in a package-level collection might make sense. The point is that this implementation issue is below the radar of the application programmer, so it can be changed without causing a ripple effect.
I have a lot of respect in general for Oracle database administrators. They have to know an awful lot about Oracle architecture and internals, and they have to keep up with all the changes that come with each new release. If a DBA doesn't keep up with the latest developments from Oracle or didn't correctly understand some aspect of the database architecture, however, that person can insist on certain rules that make no sense.
When a DBA tells developers to avoid packages because they use too much memory, for example, it reveals ignorance about the runtime architecture of PL/SQL. Of course, as with any other myth or legend, there is always a grain of truth at its center, and this myth is no exception.
When you call any single program in a package, it is true that the partially compiled code for the entire package is loaded into memory. So if you design your package carelessly, you may use too much memory, in the following sense: Suppose I have created an enormous package, BIGPKG , containing 100 programs that altogether require 250K of memory, and those programs cover a wide array of functionality. Suppose I then call the program BIGPKG.ONETHING in my application but don't use any of the other programs in the package. If ONETHING requires only 20K of memory, I will be using an extra 230K of system global area (SGA) memory needlessly.
So there is a potential for memory abuse—but it is not the fault of the package structure. It is the consequence of poor package and application design. In fact, if you design your packages carefully (see the following guidelines), you can improve the memory utilization of your code base in the SGA.
For building a PL/SQL-based application, I recommend the following package use guidelines:
Put all of your code in packages; avoid stand-alone procedures and functions. (Yes, the guideline is the precise opposite of the myth.) Even if your package has just a single program right now, you are likely to think of another program with similar functionality later, and then you will have a logical place to put it.
Create lots of small, tightly focused packages. It does not make sense, for example, to have a package called UTILITIES that is a dumping ground for anything and everything. Instead, create several more narrowly defined packages such as string_utils, date_utils, constants_pkg , and config_pkg .
Make sure the names of your packages and programs accurately describe their contents. Carefully chosen names can make the difference between an application that is easily understood and maintained and an application that is a tangle of murky elements.
The big question remains: How are you going to avoid putting more and newer myths into your code?
There are two basic answers to this question: Question authority and encapsulate.
Question authority. If your DBA tells you to avoid an area of PL/SQL functionality because of a historical grievance such as "There once was a bug. . . ." or to believe some high-level policy statement such as "That takes too much memory," check for yourself. Is there still a bug, or has it been fixed? Oracle offers an excellent resource in MetaLink to help you get answers to such questions.
If your questions have more to do with performance or functionality, you can write your very own tests. These aren't philosophical questions, after all; they are technical statements that can be challenged and either verified or dismissed. One of the great attractions of software (indeed, the fundamental reason I believe that writing code can sometimes seem positively addictive) is that we are working with a closed system. A computer is a machine that follows instructions. There are no mysteries of the soul, and the PL/SQL runtime engine will never lie to you.
So if someone claims there is a memory problem with packages, run your own tests and analyze memory usage. Listing 1 contains a program you can use to display the UGA and PGA memory utilization in a specific session, which is helpful for PL/SQL developers, because program data such as collections takes up space in the PGA.
If someone claims that a certain technique, such as implicit cursors, is slow or slower than another approach, write yourself a program to compare the various approaches. You can use SET_TIMING ON in SQL*Plus or take advantage of functions in DBMS_UTILITY (either GET_TIME or GET_CPU_TIME —the latter is new in Oracle 10g) that help you calculate the elapsed time with subsecond precision. Listing 2 , for example, contains a specification of an object type that allows you to start and stop timers within your program.
Encapsulate. The second primary mechanism for dealing with mythological code is encapsulation. When you encapsulate, you build a layer of code between two other pieces of software, usually the application logic and some underlying functionality. If you have a reason to believe that this underlying functionality may change in the future, hide it within a function or a procedure and then call that program. When the functionality does evolve (for example, the bug is fixed), you simply swap out the old implementation for the new—and the mythological code is gone before it can become a myth.
Because you do not change the public face of that program—its name and parameter list—none of the programs in the application layer of code are affected. This process of changing the internals of a program without modifying its external interface is known in much of the programming world, by the way, as refactoring.
Encapsulating workarounds or patches for bugs is especially important, and there are a few specialized steps you should take for this kind of encapsulation. Let's now revisit the optimal_plan bug introduced at the beginning of this article.
Earlier I cautioned developers against exposing the workarounds in their code by writing lines like this:
IF opt_info.rate_level IS NULL AND opt_info.rate_type IS NULL THEN -- Not the optimal plan!
Let's see how encapsulation can help us out in such situations. Here is an alternative set of steps to follow when responding to bugs like this:
1. Create a separate package to hold the workarounds for the analyze_rates package. Let's call it analyze_rates_wa .
2. Create a function in this workarounds package that sits on top of optimal_plan , has the same name, and calls that third-party function but also implements the workaround for the is_optimal bug. This workaround function acts the way the original is supposed to act.
3. Set a rule for your application development that analyze_rates.optimal_plan is never to be called and that analyze_rates_wa.optimal_plan should be called instead. Note that you can effect this change in your existing code with a carefully executed and reviewed global search and replace.
4. Document in the workaround function exactly what is wrong and how to fix it.
5. When the vendor fixes/upgrades its code, you can do one of two things: switch references to analyze_rates_wa.optimal_plan back to analyze_rates.optimal_plan , or, if other bugs remain, leave the call to the workaround program in your code but upgrade the internals of that program to use the new functionality in the provided software.
Listing 3 includes my example implementation of the workarounds package for the optimal_plan function. I have included a comment in my function explaining the cause of the problem, the nature of the workaround, and how to upgrade this code when the bug is fixed.
Suppose the bug is fixed two years after you write the application. The original development team is long dispersed. Because of your foresight in writing a workaround package, the developer now maintaining the application can easily and confidently upgrade the code to reflect the new reality, without having to wrestle with obscure workarounds that have since become myths.
The programs you write generally have a longer lifespan than anyone would expect. And the more code you write, the more code goes into production and into maintenance mode. If you are not careful about how you write your programs, you (or whoever inherits your programs) will end up dealing with complicated, tangled, and fragile applications.
Mythological code—lines of PL/SQL that present a false picture—only make this situation worse. The more you can do today to remove and avoid myths in your code, the better off everyone will be.