DEVELOPER: PL/SQL Practices
On Privacy and FunctionBy Steven Feuerstein
Use Oracle Virtual Private Database and Function Result Cache—securely.
We just upgraded to Oracle Database 11g, and I am excited about being able to use the new Function Result Cache feature. We also use Oracle Virtual Private Database (Oracle VPD) to restrict the rows in our tables that can be seen by our users. Can I use the Function Result Cache with an application built on Oracle VPD?
You certainly can use Function Result Cache to optimize retrieval of data from tables on which you have specified access control through Oracle VPD. You will, however, likely need to change the parameter list of your function to ensure that the correct data is always returned by Function Result Cache.
In this answer, I will demonstrate problems that can arise when you’re using Function Result Cache with Oracle VPD-controlled tables. I will then offer some suggestions on how to avoid these problems. And at the end of this answer, I will describe an Oracle Database 11g Release 2 enhancement for Function Result Cache that will make all our lives much easier.
I previously wrote about Function Result Cache in “On the PL/SQL Function Result Cache,” in the September/October 2007 issue of Oracle Magazine, so I will assume a rudimentary knowledge of the basics of this feature. Oracle VPD is a very powerful and complex feature of Oracle Database that cannot be fully explained in this answer; I will provide only a high-level explanation and relatively simple examples. I recommend that you review the Oracle documentation (links are provided in Next Steps) for information on this topic.
When you use Oracle VPD in your application, you define security or access policies for SQL operations on tables. Oracle Database then automatically adds these policies in the form of WHERE clause predicates to restrict the rows a user can query or change in that table. The bottom line: two different users can run what seems to be the same query (such as SELECT last_name FROM employees) and get different results. It is impossible for a user to get around these policies, because they are applied inside the SQL layer—and they are invisible to the user.
Function Result Cache can dramatically reduce the time it takes to return previously queried data to the application. When a result-cache-enabled function is called, Oracle Database first checks to see if a call to this function with matching input values has been executed. If so, the function is not executed again; rather, the return value cached from the previous invocation is returned. If the result-cache-enabled function is called with a new combination of input values, Oracle Database will execute the function and cache both inputs and return values. The cache is shared among all sessions connected to the same instance of the database. The cache can also be tied to underlying database tables, so as soon as a change has been committed to a table, the cache will be invalidated, guaranteeing clean data.
Now let’s take a look at how Function Result Cache can cause problems when combined with Oracle VPD. In this first and very simplistic example, my application data and code are owned by the appowner schema and the data is accessed from two user schemas, user1 and user2.
The application’s table—data_by_user—keeps track of text by ID; there is also an audit column populated by a trigger that tells us the schema to which a user was connected when the data was inserted:
CONNECT appowner/appowner CREATE TABLE data_by_user ( id NUMBER , text VARCHAR2 (100) , created_by VARCHAR2 (30) ) / CREATE OR REPLACE TRIGGER data_by_user_audit BEFORE INSERT ON data_by_user FOR EACH ROW DECLARE BEGIN :new.created_by := USER; END data_by_user_audit; / GRANT ALL ON data_by_user TO user1 / GRANT ALL ON data_by_user TO user2 /
The application also provides a function for retrieving text by ID: text_for_id, shown in Listing 1. Note that it is a result cache function and that it displays an indicator that the function was executed.
Code Listing 1: Creating text_for_id function
CREATE OR REPLACE FUNCTION text_for_id (id_in IN data_by_user.id%TYPE) RETURN data_by_user.text%TYPE RESULT_CACHE RELIES_ON ( data_by_user ) IS l_text data_by_user.text%TYPE; BEGIN DBMS_OUTPUT.put_line ('Retrieving text for ID ' || id_in); SELECT dbu.text INTO l_text FROM data_by_user dbu WHERE dbu.id = text_for_id.id_in; RETURN l_text; END text_for_id; / GRANT EXECUTE ON text_for_id TO user1 / GRANT EXECUTE ON text_for_id TO user2 /
Each user then inserts two rows into the table, as shown in Listing 2.
Code Listing 2: Inserting two rows into the data_by_user table
connect user1/user1 BEGIN INSERT INTO appowner.data_by_user (id, text ) VALUES (1, 'User1 data row 1' ); INSERT INTO appowner.data_by_user (id, text ) VALUES (2, 'User1 data row 2' ); COMMIT; END; / connect user2/user2 BEGIN INSERT INTO appowner.data_by_user (id, text ) VALUES (3, 'User2 data row 1' ); INSERT INTO appowner.data_by_user (id, text ) VALUES (4, 'User2 data row 2' ); COMMIT; END; /
Because there are not yet any Oracle VPD policies defined for the data_by_user table, both user1 and user2 can successfully retrieve all rows from the table with either a SELECT statement or the text_for_id function.
In addition, because the function is defined as a result cache function, the text for a particular ID will be retrieved from the table the first time it is requested. After that (and until the cache is invalidated) the function will not be executed to retrieve that same text, as you can see in Listing 3 (note that the “Retrieving text . . .” message appears only once).
Code Listing 3: Executing the text_for_id result cache function
SQL> connect user1/user1 Connected. SQL> SET SERVEROUTPUT ON SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 4 END; 5 / Retrieving text for ID 3 User2 data row 1 User2 data row 1 SQL> connect user2/user2 Connected. SQL> SET SERVEROUTPUT ON SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 END; 4 / User2 data row 1
Now let’s bring Oracle VPD into the picture. The access control policy for the data_by_user table is very simple: you should be able to see only rows you’ve inserted. In other words, when I am connected to user1, I should not be able to see rows inserted by user2; yet that is precisely what happened in Listing 3 (the row with id 3 was created by user2).
To implement this access control policy, I first create a packaged function—vpd_policies—that returns a WHERE clause predicate for the policy, as shown in Listing 4. Note that the schema and name arguments are required by Oracle VPD, even if they are not used within the function.
Code Listing 4: Creating the vpd_policies packaged function
CREATE OR REPLACE PACKAGE vpd_policies IS FUNCTION your_data_only (schema_in VARCHAR2, NAME_IN VARCHAR2) RETURN VARCHAR2; END vpd_policies; / CREATE OR REPLACE PACKAGE BODY vpd_policies IS FUNCTION your_data_only (schema_in VARCHAR2, NAME_IN VARCHAR2) RETURN VARCHAR2 IS BEGIN RETURN 'created_by = ' ' ' || USER || ' ' ' '; END your_data_only; END vpd_policies; /
I then call the DBMS_RLS.ADD_POLICY procedure (RLS stands for row-level security ) to apply this your_data_only policy function to the data_by_user table, as shown in Listing 5.
Code Listing 5: Calling DBMS_RLS.ADD_POLICY
Now that the policy is in place, user2 should not be able to see any rows entered by user1. This is easy to verify with a SELECT statement against the data_by_user table:
SQL> connect user1/user1 Connected. SQL> SELECT * 2 FROM appowner.data_by_user 3 WHERE id = 3; no rows selected
What happens, however, if we access this data through the text_for_id result cache function?
Suppose the database instance was just started by the DBA; all result caches are empty. I connect to user2 and ask to see the text for ID 3 with a call to the text_for_id function. Because that row was originally created by user2, it is displayed—after the function is executed and the information cached:
SQL> connect user2/user2 Connected. SQL> set serveroutput on SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 END; 4 / Retrieving text for ID 3 User2 data row 1
I then connect to user1. Because the Oracle VPD policy is in place, I should not be able to see the text for ID 3, but in fact I can:
SQL> connect user1/user1 Connected. SQL> set serveroutput on SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 END; 4 / User2 data row 1
The Oracle VPD policy has been violated. How could this have happened? Note that when I ran this function in the user1 schema, the “Retrieving text . . .” line was not displayed. That’s because the inputs and the return string were previously cached—so the body of the function, which includes the SELECT statement on which the Oracle VPD policy would have been applied, was not executed.
In other words, the benefit of the result cache is precisely the cause of the policy violation.
Avoiding Result Cache Interference with Oracle VPD Policies
You might conclude that Function Result Cache should never be used with any function that queries the contents of a table on which Oracle VPD policies are defined. That is too extreme a conclusion, though. In fact, you can make these technologies compatible, but it will take some re-engineering of your code.
I will show you how to do this for the text_for_id function. Then I will generalize this technique to a rule and apply it to a more complex, real-world scenario.
The problem with my current text_for_id function implementation is that it specifies a single cache for all users in the database instance, yet different users need to see different data. To address that need, I could ask Oracle Database to create a separate cache for each user, by adding the user name to the parameter list of my function.
I could also simply add the user_in argument to the original, “public” text_for_id function and provide a default value of USER. This approach, though, leaves the application more vulnerable to bugs. The application requires that the name of the currently connected user be passed to the result cache function, so why give the programmer an opportunity to provide a different value?
A better approach is to create a “private” result cache function—priv_text_for_id—that accepts the user name as an argument, as shown in Listing 6.
Code Listing 6: Creating the priv_text_for_id “private” result cache function
CONNECT appowner/appowner CREATE OR REPLACE FUNCTION priv_text_for_id ( id_in IN data_by_user.id%TYPE, user_in in VARCHAR2 ) RETURN data_by_user.text%TYPE RESULT_CACHE RELIES_ON ( data_by_user ) IS l_text data_by_user.text%TYPE; BEGIN DBMS_OUTPUT.put_line (' Retrieving text for ID ' || id_in); SELECT dbu.text INTO l_text FROM data_by_user dbu WHERE dbu.id = priv_text_for_id.id_in; RETURN l_text; END priv_text_for_id; /
With the private function in place, next create the “public” function (text_for_id—the one that can be executed outside of the appowner schema) to call the private priv_text_for_id function and pass along the username, as shown in Listing 7. Note that the text_for_id function is no longer a result cache function.
Code Listing 7: Creating new text_for_id “public” function—no result cache
CREATE OR REPLACE FUNCTION text_for_id (id_in IN data_by_user.id%TYPE) RETURN data_by_user.text%TYPE IS BEGIN return priv_text_for_id (id_in, USER); END text_for_id; /
With this new configuration, the Oracle VPD access policy will be applied even when the text_for_id function is called:
SQL> connect user2/user2 Connected. SQL> SQL> /* Cache the data for id 3 */ SQL> SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 END; 4 / SQL> connect user1/user1 Connected. SQL> SQL> /* Try to get text for id 3. */ SQL> SQL> BEGIN 2 DBMS_OUTPUT.put_line (appowner.text_for_id (3)); 3 END; 4 / BEGIN * ERROR at line 1: ORA-01403: no data found
The vpd_policies.your_data_only function concatenated USER into the predicate string. Including USER in the parameter list of the priv_text_for_id function ensured a distinct cache for each distinct predicate clause.
From this reworking of the text_for_id function, we can come up with a general rule for using Oracle VPD with the result cache: Make sure the parameter list of the result cache function includes all variant data used by the access policy function to produce the predicate clause.
Avoiding Oracle VPD Violations in the Real World
Although the use of USER in the parameter list of the priv_text_for_id function provided a solution for using Oracle VPD and Function Result Cache, I doubt that there are very many Oracle VPD policy functions that rely on the USER built in to determine the predicate clause returned by the policy function. First of all, these policies are likely to be much more complex, relying on multiple values. Second, most applications these days use connection pooling, enabling many users to share the same database connection. USER simply doesn’t provide specific enough information about the person who is actually sitting at the keyboard and trying to access the database.
Oracle Database offers session-specific application contexts (essentially, name/value pairs) to enable developers to obtain more-detailed information about user identity. These contexts are usually set in a logon trigger. A developer can then call the SYS_CONTEXT function to retrieve the value for a specific context name.
In addition, Oracle Database makes available a wide variety of “user environment” settings through a predefined USERENV context.
You can obtain the current user, for example, with this call:
SYS_CONTEXT ('USERENV', 'CURRENT_USER')
You can also obtain client-specific information. For example, if in the logon trigger, you make a call like this:
DBMS_SESSION.SET_IDENTIFER ('real user name')
then in your VPD policy function, you can retrieve that identifier with
SYS_CONTEXT ('USERENV', 'CLIENT_IDENTIFIER')
If your application requires the use of application contexts to generate the VPD predicate clause, you will need to pass this context as an argument to the result cache function.
To demonstrate this with the text_for_id function code, I would change my “public” program as shown in Listing 8.
Code Listing 8: text_for_id function with application contexts
CREATE OR REPLACE FUNCTION text_for_id (id_in IN data_by_user.id%TYPE) RETURN data_by_user.text%TYPE IS BEGIN return priv_text_for_id ( id_in, SYS_CONTEXT ('USERENV', 'CLIENT_IDENTIFIER')); END text_for_id; /
If my predicate function is more complex and, say, references the client identifier, the host name, and an application-specific context value, I will then need to pass all three of those values. Listing 9 includes an example of what this might look like.
Code Listing 9: text_for_id function with multiple application contexts
CREATE OR REPLACE FUNCTION text_for_id (id_in IN data_by_user.id%TYPE) RETURN data_by_user.text%TYPE IS BEGIN return priv_text_for_id ( id_in , SYS_CONTEXT ('USERENV', 'CLIENT_IDENTIFIER') , SYS_CONTEXT ('USERENV', 'HOST') , SYS_CONTEXT ('My_app_context', 'Favorite Flavor') ); END text_for_id; /
To conclude, you definitely can use the function result cache feature with Virtual Private Database-based applications. You will, however, need to carefully review all your policy functions. Identify all session-specific (and other) dependencies, and then add parameters to your result cache functions so that you can pass in those values at runtime. If you do this, your users will never see the wrong data and they will benefit from the improvements in performance that come with the Oracle Database 11g function result cache feature.
Making Our Lives Easier
One very nice enhancement in Oracle Database 11g Release 2 relates to the function result cache. In Oracle Database 11g Release 1, you had to supply a list of tables on which the cached data is based, with the RELIES_ON clause. Oracle Database would then use this list to automatically invalidate caches when changes were committed to any of the tables in the list.
If you forgot to include the RELIES_ON clause and the table list or if you left a table off that list, the caches could become dirty and deliver the wrong data to your users.
Now with Oracle Database 11g Release 2, Oracle Database has made result cache management smarter. You can leave out the RELIES_ON clause, and the result cache will automatically determine the set of tables on which the cache depends. If your function includes a RELIES_ON clause, it will be ignored in Oracle Database 11g Release 2.
Steven Feuerstein (firstname.lastname@example.org) is Quest Software’s PL/SQL evangelist. He has published 10 books on Oracle’s programming language, including Oracle PL/SQL Programming and Oracle PL/SQL Best Practices (O’Reilly Media). Feuerstein’s self-appointed mission in life these days is to improve the quality and quantity of PL/SQL code testing.