You always write your best code, but you should be prepared for imperfection.
The folks at extremememe.info have been feeling lots better about their application lately. Their daily job now runs in well under 24 hours (as documented at bit.ly/ZtmApg). They’ve learned to make the most of SQL, but also that there are times when persistent data and SQL processing should be replaced by in-session memory management enabled by PL/SQL code (as reported at bit.ly/YU4Wto). In my latest visit to the company, I found that their focus had shifted to best practices, specifically with the objective of producing more-consistent, high-quality code across their entire team.
They asked me to come up with a standards document and go through a detailed training on best practices. I responded with the voice of experience.
I’ve worked with many companies over the years on best practices for PL/SQL. Along the way, I’ve come to accept certain harsh realities when it comes to improving code quality, including the following:
So I suggested that the team at extremememe.info identify a specific and significant pain point in the development process. They can then come up with a plan of action, figure out how best to apply it, and see if it can be put into practice effectively.
“Ooh, ooh, I’ve got one!” piped up Miguel, one of the newest members of the team. “A user reported a bug on her screen last week, and I had the hardest time sorting out what might be causing the problem. I couldn’t reproduce it in our sandbox, and, of course, I couldn’t see what the user was seeing. There was nothing in the error log, and the error message shown to the user was simply ‘Application Error. Call Support.’”
Several other developers were shaking their heads sadly. It seems that they knew exactly what Miguel was talking about.
“All right,” I replied. “That’s a fine place to start. Miguel, did you ever figure it out?”
“No way,” I replied. “I really doubt that your problem has to do with defining business logic ‘deep’ inside the database. Instead, it has more to do with how you anticipate problems that might occur in that logic—and bullet-proof your code to make it easier to diagnose and resolve problems that arise. Show me your procedure.”
Miguel showed me the em_process_data procedure:
CREATE TABLE em_memes ( meme_id INTEGER PRIMARY KEY, meme_name VARCHAR2 (1000) UNIQUE, discovered_on DATE, meme_status VARCHAR2 (100) ) / PROCEDURE em_process_data ( for_this_meme_id_in IN INTEGER) IS l_meme_info em_memes%ROWTYPE := em_pkg.g_memes ( for_this_meme_id_in); BEGIN /* Do lots of stuff with l_meme_info. */ . . . . END;
With the quickest glance at the start and end of this procedure, I was able to identify a number of potential problems, including
Of course, a quick glance can lead to quick and careless advice. So it was time to dive in and drive the lessons home. First up: the assignment of a value within the declaration section, as a default value.
It’s always important to keep in mind one key feature of PL/SQL exception handling: The exception section of a block can only possibly handle exceptions raised in the executable section of that block.
If an exception is raised when a value is assigned to a variable or constant in the declaration section of a block, that exception will always cause that block to shut down with an unhandled exception, even if you have included a WHEN OTHERS handler.
In the em_process_data procedure, Miguel used the value of the for_this_meme_id parameter as an index value in the em_pkg global collection. If there is no element defined at that index value, Oracle Database raises the NO_DATA_FOUND exception.
There is no way to trap this NO_DATA_FOUND exception inside em_process_data, as the procedure is currently written.
Bottom line, I told Miguel, is that if you do want to trap this exception inside the em_process_data procedure, you should move the assignment from the declaration section to the executable section, optimally into its own nested initialization subprogram, like this:
PROCEDURE em_process_data ( for_this_meme_id_in IN INTEGER) IS l_meme_info em_memes%ROWTYPE; PROCEDURE initialize IS BEGIN l_meme_info := em_pkg.g_memes ( for_this_meme_id_in); END; BEGIN initialize; /* Do lots more stuff */ . . . . END;
Miguel’s response was quick: “OK, I get that, and I like using that subprocedure to hide all the initialization logic. One thing, though: The guideline we follow in our application is that we do not trap the exception locally. That would mean having to write WHEN clauses all over the place. Very unproductive. Instead, we just use WHEN OTHERS in the very top-level PL/SQL block. That way, no matter what goes wrong, and no matter where it goes wrong, we’ve got it covered.”
I somehow managed to restrain myself from pointing out that in fact they didn’t have it covered at all, since it took Miguel way too long to identify and fix last week’s bug.
“That’s a strategy I hear occasionally from developers,” I replied, “but I must admit it has never made a whole lot of sense to me. So let’s take a closer look at that issue.”
I shared with Miguel that most developers I had worked with readily agree that they really don’t want exceptions to go unhandled and sent straight to users. When that happens, the users see a most unfriendly and usually unintelligible error message. This reduces their confidence in the application and, even worse, in their developers.
Miguel nodded his head vigorously. “At my last job, users got so sick of seeing those errors that they stopped even bothering to report them and stopped coming to meetings with our team. It was not a good feeling.”
And that’s not all. Some organizations also need to be very careful, from a security standpoint, about the error information provided to users. If people are trying to hack into a system, they are likely to hit lots of errors along the way. You don’t want an error message to inadvertently help the hacking process along.
This caution is evident in Oracle Database’s error messages. For example, if I try to select from a table that does not exist, I see this message:
ORA-00942: table or view does not exist
Notice that it does not say this:
ORA-00942: "CRITICAL_DATA" does not exist
So sorting out what kind of message to show users when an error occurs requires some planning and attention to the requirements of the application itself. Just as important, however, is deciding what information you need to store in your error log to make it as easy and quick as possible to diagnose and repair the problem.
PL/SQL provides a number of functions that allow you to retrieve “generic” information about the state of your application when the exception was raised:
This is all great information to have, and whenever you handle an error you should call these functions and store the relevant information in your log.
These functions, however, mostly make it easier to figure out where the error occurred, not why. And as many of you undoubtedly know, the why of an error is usually much more difficult to figure out than the where.
To figure out why an error occurred, you almost always need to know the state of your application at that moment, and that state is reflected not only in the contents of tables, but in the values held by variables and constants.
This obvious fact bears directly on the question, where do you put your exception handlers? If you choose to avoid exception sections on “inner” subprograms, and only include a WHEN OTHERS or a WHEN clause at the top-level block, two facts become clear:
Inaccessibility of values of local variables is the price you pay when you choose to rely solely on a single exception section at the “top.” It is certainly easier and improves developer productivity when writing new programs, but you pay the price of a loss of information when an error occurs, which can lead to dramatically reduced productivity when it comes to debugging that code.
When writing new applications, you are always under lots of pressure to deliver that application on time. So the idea of a single exception section is terribly attractive. Yet if you consider the overall lifespan of an application, developers spend much more time in maintenance and support of an application than with new software development. So a guideline that makes it easier to meet your initial production deadline and makes it harder to keep that application up and running—and responsive to user enhancement requests—seems like a bad idea.
For that reason, I offer the following recommendation for where and how to add exception sections to your subprograms:
When you’ve “finished” your subprogram (are you ever really finished?), ask yourself, “When something goes wrong, will I want to see the values of any locally declared data?” If the answer is yes, add an exception section and log those values. If the answer is no, then at least for the time being, you can let the exception propagate out of the subprogram unhandled.
Inside extrememe.info’s em_process_data procedure, a reference to the meme ID is passed as an argument in the following expression:
Looking at the names, I can deduce that em_pkg is a package and g_memes is a global collection declared in the package specification. A deduction, however, is not a certainty, so I need to look at that specification to make sure.
Miguel very obligingly opens up a window for the em_pkg package, and I find the following:
PACKAGE em_pkg IS TYPE meme_ids_t IS TABLE OF em_memes%ROWTYPE INDEX BY PLS_INTEGER; g_memes meme_ids_t; END;
So the g_memes collection is an associative array of meme records, indexed by integer.
Looking at this code does not, unfortunately, answer all my questions. Instead, it raises other questions, including the following:
These are the kinds of questions that usually arise when you reference global data structures (declared in a package specification) directly inside your subprograms. These kinds of hidden references (not visible in the subprogram header) cause entanglements and dependencies that result in easily broken code.
A much better way to design the code is to move all global references into the parameter list. The interfaces between subprograms will then be cleaner: easy to understand, debug, and test.
Assuming that the em_process_data procedure does not reference the for_this_meme_id_in parameter after using it to get the meme info from the global collection, I could change the procedure header as follows:
procedure em_process_data ( for_this_meme_id_in in em_memes%rowtype)
and it can be called as follows:
em_process_data (em_pkg.g_memes (l_my_id));
This change does not address concerns about the index value used in the reference to g_memes, but it does remove the dependency on em_pkg.g_memes inside the em_process_data procedure.
Every program makes assumptions about the state of the application when the program is executed. Of course, programs do not write themselves, so what this really means is that the programmer made assumptions—and Miguel is no exception to this rule.
Some assumptions are fundamental, such as, “The em_memes table is defined and contains data.” But other assumptions, especially those very specific to the program being written, should be given careful consideration and attention—right at the start of the program.
Specifically, it is critical to make sure that the assumptions have been met, before the application starts executing logic that depends on those assumptions. If assumptions are not validated, the programs are much more likely to break in ways that confuse both users and developers.
Consider the original form of extrememe.info’s em_process_data procedure:
PROCEDURE em_process_data ( for_this_meme_id_in IN INTEGER) IS l_meme_info em_memes%ROWTYPE := em_pkg.g_memes ( for_this_meme_id_in);
If the value of for_this_meme_id_in does not identify a defined index value in g_memes, Oracle Database will raise a NO_DATA_FOUND exception. If for_this_meme_id_in is NULL, then Oracle Database will raise a VALUE_ERROR exception with this error message:
ORA-06502: PL/SQL: numeric or value error: NULL index table key value
Clearly, em_process_data assumes that a meme ID value has been passed to the procedure. Simply hoping that the meme ID will be valid can leave a developer having to sort out the source of the very common NO_DATA_FOUND and VALUE_ERROR exceptions.
A much better approach is to explicitly validate any assumptions, and raise an exception that proactively warns of the problem, as in
PROCEDURE em_process_data ( for_this_meme_id_in IN INTEGER) IS l_meme_info em_memes%ROWTYPE; PROCEDURE initialize IS BEGIN IF for_this_meme_id_in IS NULL THEN raise_application_error ( -20000, 'Meme ID is NULL'); END IF; IF NOT em_pkg.g_memes.EXISTS ( for_this_meme_id_in) THEN raise_application_error ( -20000, 'Meme ID not found in g_memes array'); END IF; l_meme_info := em_pkg.g_memes ( for_this_meme_id_in); END;
Miguel interrupted me. “I hate to say it, Steven, but I don’t think that I am ever going to feel like I have the time to write all that validation code, especially if I have to do it over and over again. Should I just plan on copying and pasting the original assertion, and change the message each time?”
“No, no, no!” I almost shouted, immediately feeling bad, but that’s what happens when I hear the words “copy and paste” associated with code. “A much better approach is to use a generic, reusable assertion package that lets you easily assert that your assumptions are valid, and if not, procedures in the assertion package will raise an error, write information to a log, display text on the screen—really, whatever you want it to do.”
In just a few minutes, I modified Miguel’s em_process_data procedure to give him an idea of what that might look like:
PROCEDURE em_process_data ( for_this_meme_id_in IN INTEGER) IS l_meme_info em_memes%ROWTYPE; PROCEDURE initialize IS BEGIN em_assert.is_not_null ( for_this_meme_id_in, 'Meme ID is NULL'); em_assert.is_true ( em_pkg.g_memes.EXISTS ( for_this_meme_id_in), 'Meme ID not found in g_memes array'); l_meme_info := em_pkg.g_memes ( for_this_meme_id_in); END;
Miguel looked relieved. “Ah, OK, that looks much more manageable. Now I just need to write that package. Maybe I can find some time over the weekend.”
“No, no, no, Miguel,” I said, this time in a much calmer voice. “I’ve already done the work for you. Download my assertion package from the PL/SQL Challenge website. Click Library, then Utilities, and then search on
“Thanks, Steven, that’s going to help a lot,” said Miguel. “But I don’t think it’s going to help with one of the most frustrating situations I have to deal with: a user reports a problem in her production version of the application that I simply cannot reproduce in our development environment.”
“You’re right, Miguel. Assertions are much more helpful during development and less so in production. And I suppose you’d really rather not tell them, ‘Sorry, we cannot reproduce the problem. Let us know if it happens again.’”
Miguel shook his head emphatically. “No way. I tried that once. Next thing I knew, my manager was calling me to say that she had gotten a call from the user’s manager. I definitely do not want to experience that again.”
“OK,” I replied. “We need to talk about instrumenting your code.”
As with the error-related functions, Oracle Database does provide lots of real-time information about the database instance and sessions to help developers resolve production issues. Yet, once again, there is often a big hole in that data, specifically when it comes to viewing the values of arguments and local variables in user sessions.
There is, unfortunately, no way to ask Oracle Database to write those values to a trace file or return them to the program with a function call. Developers must, instead, add code to their programs to save the values they know are most important to diagnosing a problem to a table or some other repository. This is known as tracing or instrumentation.
It’s not a terribly complicated aspect of programming. Generally, the program will call a procedure and pass it the values to store, along with text describing what is being saved. That procedure inserts the data into your trace table and then commits the insert. To avoid also committing the application transaction, trace procedures are defined as autonomous transactions.
Oracle Database offers the DBMS_APPLICATION_INFO package to help with tracing. There are also open source tracing utilities available for download.
“Which one should I use?” asked Miguel.
I responded, “That’s a good question, but an even better question is, ‘What features should I look for to ensure effective tracing implementation?’”
Regardless of the way a developer chooses to do tracing, I offer the following guidelines for tracing most effectively:
BEGIN IF em_trace_pkg.trace_enabled THEN em_trace_pkg.trace_parameter ( parameter_name => 'for_this_meme_id_in', parameter_value => for_this_meme_id_in); END IF;
“Those first three make perfect sense, and I will definitely use them to review tracing options,” said Miguel. “But that fourth point has me worried. Sure, with em_process_data, I have only a single parameter, so it’s no big deal. It’s like what you just showed me in guideline number 2. But another procedure I have to debug all the time has 20 parameters. Do I really have to write an IF statement and call trace_parameter for every one of those? Honestly, Steven, I will probably never take the time to do that.”
Over my many years of giving advice to developers, I have come to accept that what Miguel is saying is bedrock reality: It doesn’t matter how good the idea is. If you don’t make it really easy and fast to do, many developers won’t do it.
“Good point, Miguel,” I replied. “You need to find a way to generate the trace code for procedures like that. And whenever you think about generating code based on database objects, you should always check to see if there is a data dictionary view that will help. In this case, we can use ALL_ARGUMENTS.”
I showed Miguel how you can query the ALL_ARGUMENTS view to obtain all the parameters of em_process_data:
SELECT * FROM all_arguments WHERE owner = USER AND object_name = 'EM_PROCESS_DATA'
That certainly looks straightforward enough, but ALL_ARGUMENTS can be a bit tricky, especially when you are dealing with overloaded subprograms. I don’t have space in this article to get into all the details, but that’s not necessary anyway, because I have created a procedure that does all the work for you.
Here’s an example that calls the sf_gen_trace_call procedure for em_process_data:
BEGIN sf_gen_trace_call ( pkg_or_prog_in => 'EM_PROCESS_DATA', pkg_subprog_in => NULL, tracing_enabled_func_in => 'em_trace_pkg.trace_enabled', trace_func_in => 'em_trace_pkg.trace_parameter'); END;
Note: You can download the procedure at bit.ly/1uLkSbJ.
Most programmers I meet are optimists. No matter how bad the last application rollout went, they are convinced that this time they’ll get it right. Oh, and that most of the problems in the last rollout were not their fault.
Let’s face the facts, though: We will all still make mistakes, and even if we didn’t, we can certainly depend on our users to make their own. There will be production problems; there will be upset, impatient users.
That’s why experienced programmers assume that trouble will be knocking on their doors. So they build into the software, right from the start, the ability to gather lots of information when an error occurs, as well as the ability to “flip a switch” in their production application and obtain live tracing data so they can see what users and their applications are doing.
I assured Miguel that I thought his code was better than most that I review. And I encouraged him to detach his ego from his code as much as possible, while also remaining committed to applying best practices whenever possible to fend off problems later in production.
Take the Challenge
Each PL/SQL article offers a quiz to test your knowledge of the information provided in it. The quiz appears below and also at PL/SQL Challenge (plsqlchallenge.com), a website that offers online quizzes on the PL/SQL language as well as SQL, Oracle Application Express, database design, and deductive logic.
Here is your quiz for this article:
I execute the following statements:
CREATE OR REPLACE PACKAGE plch_pkg IS g_save_value NUMBER; END; / CREATE OR REPLACE PROCEDURE plch_log_error ( data_in IN VARCHAR2) IS BEGIN DBMS_OUTPUT.put_line ( SQLCODE || '-' || data_in); END; /
Note: SQLCODE returns -6502 for a VALUE_ERROR exception.
CREATE OR REPLACE PROCEDURE plch_do_stuff ( value_in IN NUMBER) IS l_value NUMBER := value_in * 100; BEGIN RAISE VALUE_ERROR; EXCEPTION WHEN OTHERS THEN plch_log_error ('Value=' || l_value); RAISE; END; / DECLARE l_value NUMBER; BEGIN plch_do_stuff (10); EXCEPTION WHEN OTHERS THEN plch_log_error ('Value=' || l_value); END; /
CREATE OR REPLACE PROCEDURE plch_do_stuff ( value_in IN NUMBER) IS l_value NUMBER := value_in * 100; BEGIN RAISE VALUE_ERROR; EXCEPTION WHEN OTHERS THEN plch_pkg.g_save_value := l_value; RAISE; END; / DECLARE l_value NUMBER; BEGIN plch_do_stuff (10); EXCEPTION WHEN OTHERS THEN plch_log_error ('Value=' || plch_pkg.g_save_value); END; /
CREATE OR REPLACE PROCEDURE plch_do_stuff ( value_in IN NUMBER) IS l_value NUMBER := value_in * 100; BEGIN RAISE VALUE_ERROR; END; / DECLARE l_value NUMBER; BEGIN plch_do_stuff (10); EXCEPTION WHEN OTHERS THEN plch_log_error ('Value=' || l_value); END;