DEVELOPER: PL/SQL Practices
On Signatures and Changing WHEREBy Steven Feuerstein
Best practices for changing headers and handling different WHERE clauses
I maintain a large application implemented in PL/SQL. Lately, a number of enhancements have required changes to the signatures of several procedures and functions. I have had to add new parameters and remove others. This has led to the need to change many other programs that call these units. How can I minimize the ripple effect of changes such as these?
Once a procedure or function is in use (called by other programs), you have to be very careful about how you change the header (also known as the signature ) of that program. You have in essence told other developers on your application team that they can use the program If they call it a certain way. Thus, the current header (name, parameter list, and return value if a function) constitutes a kind of contract with the users of the program.
Any change to that header is, as you note in your question, likely to cause a ripple effect throughout the rest of your code base. At best, dependent programs will be invalidated and will have to be recompiled. In fact, with some enhancements in Oracle Database 11g, the improved best scenario is that Oracle Database will no longer even invalidate your programs (or certainly much less frequently). At worst, those dependent programs will have compile errors and you will have to change the way your program is called in all those dependent programs.
Changes must also be propagated from the development instance, through one or more test instances, and finally into the live production environment. The better you manage the header change—and minimize the impact of that change—the shorter the downtime your production instance will experience as you upgrade your application code base.
In this answer, I first take a look at the impact of header changes (how many other programs will be affected). Then I review the different kinds of changes to a program’s signature and the corresponding actions you might need to take to get the rest of your code base back in sync.
How Oracle Database deals with your changes. Oracle Database automatically maintains dependency information about all the objects in your database. This information is then made available through a set of data dictionary views, such as ALL_DEPENDENCIES.
Whenever you change a database object, all dependent PL/SQL program units are then marked invalid—they must be recompiled before they can be used. Thus, Oracle Database does an awful lot of work for us to keep our code base in sync.
Prior to Oracle Database 11g, the granularity of this dependency tracking was at the object level. Suppose, for example, that procedure PKG1.PROC1 calls subprogram PROC2 defined in package PKG2. Any change to PKG2, in either PROC2 or any other subprogram of the package, will cause invalidation of PKG1. Furthermore, if your program is dependent on a database table, any change to that table—even one that has no impact on your program—sets the status of that program to INVALID.
In Oracle Database 11g, Oracle offers fine-grained dependency tracking: the granularity of dependency information is now at the level of an element within a program unit. This means that Oracle Database will mark a dependent program unit as invalid much less frequently than in the past, minimizing the need for recompilation.
In the following example, I demonstrate this fine-grained dependency tracking. My first step is to compile a procedure (use_last_name) that references only the last_name column of the employees table. My next step is to change a different column (first_name), and the status of my procedure remains VALID:
CREATE OR REPLACE PROCEDURE use_last_name ( id_in IN employees.employee_id%TYPE) IS l_name employees.last_name%TYPE; BEGIN SELECT e.last_name INTO l_name FROM employees e WHERE e.employee_id = use_last_name.id_in; END use_last_name; / ALTER TABLE employees MODIFY first_name VARCHAR2(2000) / SELECT status FROM user_objects WHERE object_name = 'USE_LAST_NAME' / VALID
Note that I made sure to qualify every reference to a column with its table name alias and every reference to a PL/SQL variable with the procedure name. This approach has always been a recommended best practice; with fine-grained dependency tracking, it is more important than ever. When you qualify all these identifiers, Oracle Database is able to further reduce the need to invalidate and then recompile program units.
Getting a sense of the scope of the ripple. Before making any changes to your existing programs, you should analyze the potential impact of those changes. You have the following options for analyzing the impact of changing the signature of a program:
1. Search your source code (either in files or in the ALL_SOURCE data dictionary view) to find all the program units that contain specific strings, such as the name of a program. The following query, for example, finds all programs that call my_package.my_procedure:
SELECT * FROM ALL_SOURCE WHERE upper (text) LIKE '%MY_PACKAGE.MY_PROCEDURE%'
2. Query one of the dependency views to find all program units that will be affected by the change you are about to make to a program. Here, for example, is a query that shows all program units that depend on the package (my_package) defined in the hr schema:
SELECT * FROM ALL_DEPENDENCIES WHERE referenced_owner = 'HR' AND referenced_name = 'MY_PACKAGE'
3. In Oracle Database 11g only, use the new PL/Scope feature to obtain much more detailed information about the way your programs are used throughout your application.
Analysis based on the ALL_DEPENDENCIES view tells us that one program unit depends on another. It does not, however, tell us the nature of the dependency, nor is that dependency information sufficiently detailed. I can find out, for example, that procedure PROC1 depends on package PKG1, but I cannot determine which subprogram within PKG1 is actually called.
PL/Scope can do that, and much more. A full treatment of PL/Scope is outside the scope of this discussion, but the following example will demonstrate the power of this feature.
To use PL/Scope, I must first direct the PL/SQL compiler to analyze all of the identifiers in my program when it is compiled:
ALTER SESSION SET plscope_settings='IDENTIFIERS:ALL' /
Then when I compile a program unit, Oracle Database will populate the data dictionary with detailed information about how each named element (variables, types, programs, and so on) in my program is used.
I next create a package and two schema-level procedures that each rely on a different subprogram from that package:
CREATE OR REPLACE PACKAGE my_package IS FUNCTION func (arg NUMBER) RETURN NUMBER; PROCEDURE proc (arg VARCHAR2); END my_package; / CREATE OR REPLACE PROCEDURE use_proc IS BEGIN my_package.proc ('a'); END use_proc; / CREATE OR REPLACE PROCEDURE use_func IS BEGIN DBMS_OUTPUT.put_line (my_package.func (1)); END use_func; /
Now I want to find all the places in my code where the my_package.proc procedure is called. I can do so with this query against the USER_IDENTIFIERS view:
SELECT called.object_name FROM user_identifiers called, user_identifiers declared WHERE declared.USAGE = 'DECLARATION' AND declared.NAME = 'PROC' AND declared.object_name = 'MY_PACKAGE' AND called.USAGE = 'CALL' AND called.signature = declared.signature AND called.object_name <> 'MY_PACKAGE' /
which then shows me a single row:
Note that the use_func procedure does not show up in the results; that procedure is dependent on a different subprogram of the package.
Hopefully, even from this very simple example, you can see that PL/Scope offers enormous potential in helping us better understand our code and analyze the impact of change on that code.
Now let’s examine the different ways you might change a program’s signature.
Move a schema-level program into a package. Suppose you are working on an existing application that already has one schema-level function to calculate a mortgage rate and you now need to define several others. You create a package, write your new functions, and move the existing function into the package. Much better!
Now, however, you must change every call to the original program to call the packaged version. This is not a difficult task. Just put “package_name.” in front of the program name. What if, however, this program is called in dozens of existing production programs and your manager definitely does not want to change all those programs just because you like packages?
Simple. Leave the schema-level program in place but change its implementation so that it is nothing more than a pass-through to the packaged program. For example
FUNCTION loan_rate ( loan_type_in IN loans.loan_type%TYPE ,years_in in PLS_INTEGER) RETURN NUMBER IS BEGIN RETURN mortgage_calcs.loan_rate (loan_type_in, years_in); END loan_rate;
With this approach, none of the existing programs needs to be changed, all future code can use the packaged version, and you have avoided redundant logic in your application.
Change program names. Names of programs are very important. If a name is misleading or cryptic, developers will have difficulty (and experience resistance to) using and reusing that program. Once the program has been called in an application, however, you should not change program names because you believe that the new names are better. Developers get to know the names of programs, and when you change them, you create confusion for the team.
So change names of existing programs with great care and only when absolutely needed. And when you do, follow the advice from the previous section: create a pass-through program with the old name that simply calls the new program name. That way you keep existing code valid and can selectively (perhaps over time) “upgrade” the old names to the new ones.
Change parameter names. Bad parameter names make it harder to use a program. And you might think that it is no problem to change the names of parameters, because the name appears only in the header of the program anyway. But is that really true?
Certainly, if you call a program by using positional notation, the name of the parameter will not appear in the call to that program. Here is an example that calls loan_rate function with positional notation:
DECLARE l_type loans.loan_type%TYPE := 'FIXED'; l_term PLS_INTEGER := 30; BEGIN DBMS_OUTPUT.PUT_LINE ( mortgage_calcs.loan_rate (l_type, l_term)); END loan_rate;
If every call to loan_rate used positional notation, I could change the parameter names every day and it wouldn’t make any difference. If, on the other hand, I use named notation , such changes will break my code:
DECLARE l_type loans.loan_type%TYPE := 'FIXED'; l_term PLS_INTEGER := 30; BEGIN DBMS_OUTPUT.PUT_LINE ( mortgage_calcs.loan_rate ( loan_type_in => l_type ,years_in => l_term); END loan_rate;
Named notation leads to code that is much more readable and easier to maintain, but it also hard-codes the parameter name in the call to the program.
There is, unfortunately, no way to mask changes to parameter names when your code uses named notation. You will need to find every occurrence of named notation calls to the changed program and change the parameter name.
Change the program type. Sometimes you may find the need (or feel the desire) to change the type of your program from procedure to function or vice versa. A previous developer might, for example, have created a function that returns values via OUT arguments in the function. Many of us feel that that is a bad design for a function and would prefer to use a procedure.
In such a case, I would make sure that the program is in a package and then create a second overloading of the program with the same name but a different type. Then I would define one of those programs as a pass-through, so that there is a single implementation of the program’s algorithm, but also allow existing calls to the program to remain intact.
Add an IN parameter with a default value to the end of the parameter list. The most common change to programs involves the addition of parameters to the parameter list. If you add an IN parameter to the end of the parameter list, none of the existing calls to the program needs to be changed in the least—as long as you provide a default value for that parameter.
If any of those calls need to use a nondefault value for that new IN parameter, however, you will need to change only those calls, adding a value for that parameter. Otherwise, PL/SQL will automatically assign a default value as the actual argument and then execute the code in the program.
Make any other changes to the parameter list. Suppose you need to change the parameter list in any other way, such as
Any of these parameter changes will force you to change every existing call to the program. To avoid this, you should leave the existing program unchanged and add an overloading with that same program name and the new parameter list, which means that all existing calls to the program remain valid. You can then very selectively change only those calls that actually need to take advantage of the change in the parameter list.
Conclusion. We change the header of a program because our users have changed the requirements for that program. Sometimes those changes affect every single call to the program already in place in the application. In other circumstances, some or all of the existing calls are not affected by the change in requirements.
If you are careful about how you implement your changes, you can minimize the need to propagate those changes throughout your application code base. The best ways to do this include moving your schema-level programs into packages and taking advantage of overloading, adding trailing IN parameters with default values, and redefining existing programs as pass-throughs.
I need to write a procedure to process multiple rows of data from a table, and each time I call the procedure, the WHERE clause may change. I would like to use EXECUTE IMMEDIATE, but that lets me return only a single row of data. How can I avoid the nightmare of maintaining code in multiple procedures that is completely identical except for the WHERE clause?
When native dynamic SQL was first introduced in Oracle8i Database, you could use EXECUTE IMMEDIATE only to query single rows of data. You could, on the other hand—even in Oracle8i Database—use the OPEN FOR statement with a cursor variable based on a dynamically constructed query to achieve your desired results.
Listing 1 shows a simple example of such a program, the process_rows procedure, querying data from the classic employees table.
Code Listing 1: The process_rows procedure using the OPEN FOR statement
CREATE OR REPLACE PROCEDURE process_rows (where_in IN VARCHAR2) IS TYPE weak_rc IS REF CURSOR; l_cursor weak_rc; l_name employees.last_name%TYPE; BEGIN DBMS_OUTPUT.put_line ( 'Employees identified by " ' || where_in || ' " '); OPEN l_cursor FOR 'SELECT last_name FROM employees WHERE ' || where_in; LOOP FETCH l_cursor INTO l_name; EXIT WHEN l_cursor%NOTFOUND; /* Do the processing here. */ DBMS_OUTPUT.put_line (l_name); END LOOP; CLOSE l_cursor; END process_rows; /
I then run the process_rows program for two different WHERE clauses and see these results:
BEGIN process_rows ('department_id = 10'); process_rows ('salary > 15000'); END; / Employees identified by "department_id = 10" Whalen Employees identified by "salary > 15000" King Kochhar De Haan
In Oracle9i Database, Oracle added a predefined weak ref cursor type, called SYS_REFCURSOR, enabling us to simplify the declaration of the cursor variable to nothing more than this:
But Oracle added even more goodies in Oracle9i Database. Starting with this version, you could use EXECUTE IMMEDIATE to fetch multiple rows of data, depositing the result into a collection. You could then write much simpler code and also improve performance, because this approach also involves the use of BULK COLLECT. Listing 2 shows a rewrite of the process_rows procedure with EXECUTE IMMEDIATE.
Code Listing 2: The process_rows procedure using EXECUTE IMMEDIATE
PROCEDURE process_rows (where_in IN VARCHAR2) IS TYPE employees_t IS TABLE OF employees.last_name%TYPE INDEX BY PLS_INTEGER; l_names employees_t; BEGIN DBMS_OUTPUT.put_line ( 'Employees identified by " ' || where_in || ' " '); EXECUTE IMMEDIATE 'SELECT last_name FROM employees WHERE ' || where_in BULK COLLECT INTO l_names; FOR l_index IN 1 .. l_names.COUNT LOOP /* Do the processing here. */ DBMS_OUTPUT.put_line (l_names (l_index)); END LOOP; END process_rows; /
Note that I iterate through the collection from index value 1 to the count of elements in the collection. I could have used this formulation instead:
FOR l_index IN l_names.FIRST .. l_names.LAST
but if the collection is empty, both the FIRST and LAST methods return NULL. If either of the low or high values in a numeric FOR loop evaluates to NULL, Oracle Database will raise the ORA-06502 VALUE_ERROR exception. Because BULK COLLECT always fills its collection starting at index value 1, we avoid even the possibility of this error by iterating from 1 to the COUNT.
Using EXECUTE IMMEDIATE with BULK COLLECT is the simplest solution to your challenge. If, however, your table has a very large number of rows, the collection it fills will consume a very large amount of memory. To achieve best performance with reasonable memory consumption, you should use OPEN FOR and FETCH BULK COLLECT with a LIMIT clause, as shown in Listing 3.
Code Listing 3: Using OPEN FOR, FETCH BULK COLLECT, and LIMIT
PROCEDURE process_rows ( where_in IN VARCHAR2, limit_in IN PLS_INTEGER DEFAULT 100) IS TYPE employees_t IS TABLE OF employees.last_name%TYPE INDEX BY PLS_INTEGER; l_names employees_t; l_cursor sys_refcursor; BEGIN DBMS_OUTPUT.put_line ( 'Employees identified by " ' || where_in || ' " '); OPEN l_cursor FOR 'SELECT last_name FROM employees WHERE ' || where_in; LOOP FETCH l_cursor BULK COLLECT INTO l_names LIMIT limit_in; FOR l_index IN 1 .. l_names.COUNT LOOP /* Do the processing here. */ DBMS_OUTPUT.put_line (l_names (l_index)); END LOOP; EXIT WHEN l_cursor%NOTFOUND; END LOOP; CLOSE l_cursor; END process_rows; /
The bottom line is that you can definitely avoid writing multiple procedures for different WHERE clauses by using EXECUTE IMMEDIATE or OPEN FOR. Furthermore, you can leverage BULK COLLECT to greatly speed up your row retrieval.
One final note: in these code examples, I created two procedures that each accept a WHERE clause as dynamic text, to be concatenated into the static part of the SQL query. This is a classic scenario for SQL injection, in which attackers trick the SQL engine into executing unintended commands. Here is a quick demonstration of SQL injection taking advantage of my process_rows procedure:
BEGIN process_rows ( 'department_id = 10 UNION SELECT username FROM all_users'); END; / Employees identified by "department_id = 10 UNION SELECT username FROM all_users" ANONYMOUS APEX_PUBLIC_USER BI ... Whalen XDB
Note that I now see the last name Whalen as well as all the users defined in this instance. This is not the intended use for this procedure, but a clever (and malicious) user can twis t the intent of my procedure to execute a very different sort of code.
SQL injection presents a serious security challenge in any database programming environment. I do not have space in this column to address this topic, but Oracle has set up a Security Technology Center on the Oracle Technology Network. This center offers guidance on SQL injection and other security-related issues. I strongly recommend that you take advantage of this center to reduce the likelihood of putting into production use any stored programs that allow users to provide text to be concatenated into dynamic SQL strings.
Steven Feuerstein (firstname.lastname@example.org) is Quest Software’s PL/SQL evangelist. He has published 10 books on Oracle’s programming language, including Oracle PL/SQL Programming and Oracle PL/SQL Best Practices (O’Reilly Media). Feuerstein’s self-appointed mission in life these days is to improve the quality and quantity of PL/SQL code testing.