TECHNOLOGY: Ask Tom
The Trouble with TriggersBy Tom Kyte
Our technologist looks at trigger maintenance and implementation challenges.
Those of you who frequent the asktom.oracle.com Web site know that I have an aversion to triggers. Once upon a time, a long time ago, I thought triggers were the coolest thing ever and I used (and abused) them heavily. Now, whenever possible, I will go very far out of my way to avoid a trigger.
I have two primary reasons for avoiding triggers:
Hopefully the first reason to avoid triggers, the maintenance headache, is fairly easy to see. Suppose you are working on someone else’s project, because you have inherited someone else’s work. Say you have a bit of code or better yet a stored procedure that represents a transaction, and you read it. You might presume, “I understand what this does. I get it.” But if you are on a system laden with triggers all over the place, you won’t have gotten it at all—you’ll have gotten it wrong, at best.
Triggers cause side effects. They do things out of the mainstream. When you run an update and see “1 row processed,” 500 other things may also have happened. More than once, I’ve received an e-mail similar to the following (this is a cut-and-paste—I did not make this up):
We have a problem when updating a column. When we update that column (type is varchar2), the update is showing 1,972 rows updated (and we commit after the UPDATE). When we retrieve that column, the query does not return the updated column values, but other columns are being updated and we are able to see those values. What is the problem?
I wrote back simply, “Got triggers?” The response back to me was, “Oh, that explains it. We have discovered the issue. We had ‘:new.name := :new.fname || ‘ ’ || :new.lname;’ in the trigger.” (I hear this type of thing week after week. It happens more often than you might think, in real life.)
I looked at their trigger and decided immediately that I did not like it.
First, their NAME attribute should have been a column in a view (or, in Oracle Database 11g, a virtual column). This name information is derived data based on the values in two other columns. The NAME attribute is the result of a trivial function, the function is not computationally expensive, and they could have indexed the function if necessary. There was no reason to ever store the result of the function in the table.
Second, they blindly overwrite the NAME attribute in the trigger. This causes confusion (otherwise, I would not have received the e-mail in the first place!), and it makes working with the data cumbersome. In this system, this NAME attribute was apparently not expected to always be “fname||’ ‘||lname”; otherwise, this update would never have been attempted (because it does not set the NAME attribute to that function result!). Apparently they were trying really hard to set it to some other value (but they could not do so without first disabling the trigger).
Triggers make understanding an existing system hard. Triggers make maintaining an existing system hard. Triggers cause confusion. Triggers are hidden in the data definition language (DDL) and in the schema.
During a code review, triggers are typically not even looked at in the context of the code being reviewed. But they should be. The trigger is like a subroutine that is called over and over again. A trigger is just like a subroutine , but most people consider triggers part of the DDL. And just as you would not review a CREATE TABLE statement while reviewing a package implementation, most people do not review the hidden code in a trigger.
The first reason I have for disliking triggers (the long-term maintenance headache) is merely an “I find them a pain in the neck” reason. With proper documentation and proper review processes, we all could probably deal with the maintenance issues.
But dealing with incorrect implementation of triggers is another thing. Far too often, just anyone will create a trigger, without understanding the ramifications of doing so. For example, can you immediately see the huge bug in this trigger?
SQL> create trigger send_mail 2 after insert on purchase_order 3 for each row 4 begin 5 utl_mail.send 6 (sender=>'firstname.lastname@example.org', 7 recipients=>'email@example.com', 8 subject=>'New Order ' || :new.po_number, 9 message=> ' ... ' ); 10 end; 11 / Trigger created.
Obviously, it is syntactically correct—it compiles. And if you insert a record, the trigger will run flawlessly. But it is very wrong: The implementation has a huge mistake.
The mistake boils down to one word: rollback. What happens if you insert 100 rows into the PURCHASE_ORDER table and then decide to roll back the insert and not commit it? Well, SMTP (Simple Mail Transfer Protocol) won’t participate in a distributed transaction with Oracle Database, so the sending of the e-mail will not roll back. You’ll have 100 e-mails sent that describe 100 new orders that never really happened . This is perhaps the most frequent error made in the use of triggers—performing an operation that cannot be rolled back.
(Note that you do not need to perform the rollback yourself to have a rollback happen. As part of normal processing, Oracle Database rolls back updates, deletes, and merges all the time without telling you that it has done so. See tkyte.blogspot.com/2005/08/something-different-part-i-of-iii.html , tkyte.blogspot.com/2005/08/part-ii-seeing-restart.html, and tkyte.blogspot.com/2005/09/part-iii-why-is-restart-important-to.html for details on this behavior.)
What this means is that any trigger that invokes many of the UTL_ functions is probably doing it wrong. For example, if you use a trigger to invoke UTL_FILE to write text to a file, that is probably wrong. The UTL_FILE .PUT_LINE calls in a trigger will not roll back; you’ll have written to a file about an event that never happened. If you use a trigger to invoke UTL_HTTP to invoke a service on a Web server, you’ll have invoked that service for an event that never happened. UTL_MAIL, UTL_SMTP, and UTL_TCP all suffer from the same problem: if the transaction that caused the trigger to fire rolls back, they’ll have performed operations that cannot be rolled back.
So, for anyone who writes triggers, the first rule of trigger coding is, “Do not perform any operation that cannot be rolled back.” Consider what happens if your trigger fires—the code executes—but the transaction rolls back.
The next implementation problem with triggers stems from the fact that developers often do not understand all the nuances of concurrency controls and isolation levels in the database. One of Oracle Database’s greatest features is that reads do not block writes and writes do not block reads. However, that single feature, when not fully understood by a developer, turns into a liability when the developer turns to triggers, especially triggers that enforce some “rule.”
For example, just recently on asktom.oracle.com, I was presented with the following scenario:
Suppose we have a table that includes country currency combinations with a primary currency. The following is sample data:
Country Currency Primary_Currency US USD Y US USN N US USS N
We need to enforce the rule that at most one currency can be primary for a given country. We have a BEFORE UPDATE trigger on the above table for each row (using autonomous transaction to avoid the mutating error) to check whether the country has any primary currency.
That was all I needed to read. I knew they had a serious bug on their hands when I read—paraphrasing:
The trigger would have looked something like this:
SQL> create or replace 2 trigger currencies_trigger 3 before update on currencies 4 for each row 5 declare 6 PRAGMA AUTONOMOUS_TRANSACTION; 7 l_cnt number; 8 begin 9 select count(*) 10 into l_cnt 11 from currencies 12 where primary_currency='Y' 13 and country = :new.country; 14 if ( l_cnt > 1 ) 15 then 16 raise_application_error 17 (-20000, 'only one allowed'); 18 end if; 19 end; 20 / Trigger created.
Now, there are many things wrong with this trigger. But the first obvious clue that something was seriously wrong was their need to use an autonomous transaction. They did that because without it, an update would produce the following result:
SQL> update currencies 2 set primary_currency = 'Y'; update currencies * ERROR at line 1: ORA-04091: table OPS$TKYTE.CURRENCIES Is mutating, trigger/function may not see it ORA-06512: at "OPS$TKYTE.CURRENCIES_TRIGGER”, line 4 ORA-04088: error during execution of trigger 'OPS$TKYTE.CURRENCIES_TRIGGER'
That is not really an error but more of a warning. Basically it is saying, “You are doing something so fundamentally wrong in your trigger that Oracle Database is just not going to permit you to do that.” If the database allowed the trigger to read the table it was defined on, as the update was proceeding, the trigger would see the table partially updated. If five rows were being updated, the row trigger would see the table with one of the rows modified, then two, then three, and so on. It would see the table in a manner in which the table never should be seen.
By way of example, suppose the CURRENCIES table, above, was in place with the sample data provided in the question and the trigger was permitted to read the table while it was changing. Now issue the following command:
update currencies set primary_currency = decode(currency, 'USD', 'N', 'USN', 'Y') where country = 'US' and currency in ( 'USD', 'USN');
That should be OK, because it moves the primary currency flag from USD to USN. After the statement finishes, there will be only one primary currency row. But what if the rows get updated in order of first USN and then USD. Well, when the trigger fires the first time, it will see USN with PRIMARY_CURRENCY=‘Y’ and USD with PRIMARY_CURRENCY=‘Y’. The trigger will fail the statement, but the statement is supposed to succeed. On the other hand, what if the data is processed in the order of first USD and then USN? In that case, the trigger will fire and find zero PRIMARY_CURRENCY=‘Y’ rows and then fire again, see only one, and be done.
So, for this trigger, the update will work for some data, sometimes. For other bits of data, the update will not work, sometimes. Two databases with the same data will fail on different sets of rows and succeed on others. It will all depend on how the data happens to be organized on disk and in what order it is processed. And that is unacceptable (not to mention very confusing).
That, in a nutshell, is why the mutating table constraint exists: to protect us from ourselves. But unfortunately, the developers asking this question found a way around the mutating table constraint: the autonomous transaction. That “feature” permits developers to query the table the trigger is firing on yet query it as if it were in another session/transaction altogether. The trigger will not see its own modifications to the table, and that is the huge flaw with this thinking: the trigger is attempting to validate the very modifications it cannot see. The trigger’s sole purpose is to validate the modifications to the data, but the trigger is reading the data before the modifications take place. It therefore cannot work!
The developers might find another way around this mutating table constraint. A common technique is to use a package and three triggers (as documented at asktom.oracle.com/tkyte/Mutate. Note that in Oracle Database 11g, there is a new feature, the compound trigger, that can be used in place of this “package and three triggers” technique, but the results will be the same as described below.)
With this technique, the package would have a global variable of a PLSQL table type, such as an array. The global variable would be set to “empty” by a BEFORE statement trigger. Then the global variable would be populated with primary keys by the FOR EACH ROW trigger (the keys of the modified rows) or their rowids. Last, an AFTER statement trigger would iterate over the global variable values and would be able to query the table the trigger was defined on, because the modification has already taken place. This sounds like it would work, and it would, if you were the only user of the database and never had more than one transaction at a time! This trigger solution works purely in a single-user environment, as do many other triggers I’ve reviewed. They work well in isolation but fail to do their job when multiple users invoke them simultaneously.
For example, suppose the currency table started this way (with all Ns for primary_currency):
Country Currency Primary_Currency US USD N US USN N US USS N
Now, in one session, I issue the following command:
update currencies set primary_currency='Y' where country = 'US' and currency = 'USD';
With the three-trigger solution, the BEFORE statement trigger will fire and set the session’s global variable to empty. Then the FOR EACH ROW trigger will fire and remember the countries modified in this global variable, just by recording the :NEW.COUNTRY and :OLD.COUNTRY values. Last, after all the records are updated, the AFTER statement trigger will fire and iterate over the COUNTRY values it discovers; it will query and count the PRIMARY_CURRENCY=‘Y’ and COUNTRY=‘US’ records and discover that only one record exists. All is well and good.
However, in another session, right after that update (which has not committed yet), I issue
update currencies set primary_currency='Y' where country = 'US' and currency = 'USN';
The triggers will now fire in this session, and when this AFTER statement trigger performs the same count as the AFTER statement trigger did in the other session, this session’s trigger will also discover that there is only one record with PRIMARY_CURRENCY=‘Y’ for COUNTRY=‘US’! That is because the read in this trigger is not blocked by the write in the other session—this session’s trigger cannot see the other session’s update.
Now both transactions commit, and we are left with a table that has two records for COUNTRY=‘US’ that have the PRIMARY_CURRENCY value=‘Y’. That is, the rule we were trying to validate was not validated. To enforce this rule in a trigger, we’d have to lock the table and serialize access to it. We’d need to do that to prevent two sessions from inserting a primary currency at the same time for the same country. (It is very hard to lock data we cannot see; we cannot see someone else’s insert, hence we have to prevent that person from inserting).
I like to say that if you have a trigger that enforces integrity across rows in a table, as this one tries to, and/or across tables, to enforce referential integrity, and you have not used the LOCK TABLE command to serialize access, you have almost certainly done it wrong. Because reads are not blocked by writes, and vice versa, enforcing entity integrity in a trigger is very complex and needs to involve explicit locking. Furthermore, this locking must be at a level far above row-level locking. It must usually be table-level locking.
But I have yet to see an application that uses the LOCK TABLE command in that fashion. I have seen many applications using triggers to enforce entity integrity, but they are almost universally implemented incorrectly. They have race conditions that, in a multiuser scenario, cause invalid data to be entered and stored. This, in turn, leads to failures in downstream processing, where other processes are working under the assumption that the data is “clean, valid, and follows the rules.”
What is the correct answer for the above business problem (at most one currency can be primary for a given country)? Well, I have two answers, both of which are extremely concurrent (scalable) and correct (they actually work). If you use triggers, you may pick either extremely concurrent or correct—you cannot have both simultaneously. If you are extremely concurrent with your trigger implementation, you have probably done it wrong. If you are correct with your trigger implementation, you are almost certainly not very scalable.
So, in reference to my first answer, I need to fix the data model, because the data model is entirely wrong. This business problem screams out for two tables: a table with primary currencies and a table with other currencies. If the application needs a single “table” for querying ease, then views would allow us to pull all the data together. To enforce the “at most one currency can be primary” rule, we would simply issue the following:
SQL> create table primary_currency 2 ( country varchar2(2) primary key, 3 currency varchar2(3) 4 ) 5 / Table created. SQL> create table other_currencies 2 ( country varchar2(2), 3 currency varchar2(3), 4 constraint other_currencies_pk 5 primary key(country,currency) 6 ) 7 / Table created.
We are done. We have implemented the one “rule.” I defy you to create two primary currencies for COUNTRY=‘US’. It is impossible. The primary key enforces that.
In the real world, however, this is typically just the beginning. The next rule that goes with the “at most one currency can be primary for a given country” rule is usually “and a country must have a primary currency.” That is, there must be at most one primary currency and at least one primary currency.
Again, we need no code for that. A simple declarative constraint does everything we need:
SQL> alter table other_currencies add 2 constraint must_have_at_least_one_primary 3 foreign key(country) 4 references primary_currency(country) 5 / Table altered.
We are done again, this time with the original and the follow-up rules—code-free, maintainable, understandable, and done. The foreign key makes it impossible to have another currency without having a primary currency. And it is done correctly, in a very scalable fashion.
And—to keep stressing a point—this being the real world, we are probably still not done with the rules needed for this data. The real, complete rule is probably as follows: “A country must have at least and at most one primary currency, and the primary currency is not allowed to be another currency.”
That is, if USD is the primary currency for the U.S., then USD cannot appear in the OTHER_CURRENCIES table when COUNTRY=‘US’.
That one is tricky. That is like an “antiforeign key”—a feature that does not exist. We can implement it as a database rule, however. Basically, we need to make sure that if we join PRIMARY_CURRENCY to OTHER_CURRENCIES by COUNTRY and CURRENCY, there are always zero records as a result.
Listing 1 creates a materialized view that does that join and puts a constraint on it that ensures that no data is ever produced as a result of the join.
Code Listing 1: Materialized view and constraint ensuring that join produces no data
SQL> create materialized view log 2 on primary_currency with rowid 3 / Materialized view log created. SQL> create materialized view log 2 on other_currencies with rowid 3 / Mat erialized view log created. SQL> create materialized view primary_is_not_other 2 refresh fast 3 on commit 4 as 5 select a.rowid arid, b.rowid brid 6 from primary_currency a, other_currencies b 7 where a.country = b.country 8 and a.currency = b.currency 9 / Materialized view created. SQL> alter table primary_is_not_other 2 add constraint primary_curr_cannot_be_other 3 check (arid is null and brid is null) 4 / Table altered.
That is the first answer that satisfies the original rule, “at most one currency can be primary for a given country,” and the extra rules likely to be required of the same information. For the second answer—if the original rule were the only rule (if we didn’t have the other two parts of the rule described above)—we could use the following index on the original table:
create unique index only_one_can_be_primary on country_currency_ref ( case when primary_currency = 'Y' then country end );
This answer uniquely indexes COUNTRY when PRIMARY_CURRENCY=‘Y’. If PRIMARY_CURRENCY is not ‘Y’, the case statement returns an entirely NULL index key that is not indexed. This way we end up uniquely indexing just the PRIMARY_CURRENCY=‘Y’ records.
I consider this index to be a “bad hack.” It does the job, but it is not pretty. And it is not capable of enforcing the other constraints that probably exist.
The correct data model, when designed to support the requirements, does it all for us nicely with declarative syntax.
Less code equals fewer bugs. Look for ways to write less code.
Triggers should be viewed with a skeptic’s eye. If the trigger is in place to enforce entity integrity, be very suspicious of it. Think about multiuser conditions. Think about what happens when two or three people operate on similar data at, or at about, the same time. Run all the combinations in your head or on the whiteboard. Play with your schema, using multiple sessions. See what happens when you have concurrent access.
If the trigger is there to supply values to columns, be aware of the possible maintenance issues and the “Whoops, I didn’t know that would happen” side effect.
Triggers should be the exception, not the rule. They should be used only when you cannot do something any other way. Given the concurrency issues, the problems with doing nontransactional operations in them, and the maintenance problems, triggers are something to use sparingly, if at all.
Tom Kyte is a database evangelist in Oracle’s Server Technology division and has worked for Oracle since 1993. He is the author of Expert Oracle Database Architecture: 9i and 10g Programming Techniques (Apress, 2005) and Effective Oracle by Design (Oracle Press, 2003), among others.