Mining for Clues
By Arup Nanda
Use Oracle LogMiner to recreate transactions and undo damage.
Paul is on the other end of the line and is irate. He had been the account manager for a customer named Ellen Smith for the past 20 years. But today, when Ellen walked into the bank, the system said Paul was no longer the account manager! Embarrassed and justifiably upset, he is screaming for answers.
John, the lead DBA at the bank, needs to act fast. He performs a quick query on the ACCOUNTS table, which shows that the value of the ACC_MGR_ID column for account number 4567 (Ellen's account) is 1, whereas Paul's ID is 5. The value of the column was definitely 5 before. Who could have changed it, and why? Furthermore, a separate query shows that Paul was assigned another customer (with account number 9876), whom he does not recognize. Why did this happen, and who is the real account manager for account 9876?
Digging Through the Archives
What options does John have for rectifying the situation?
Let's start by looking at Oracle's undo and redo logs. Note that Oracle Database generates an undo and a redo entry for each change made to the database. In simple terms, an undo entry provides the values of data stored before a change and the redo entry provides the values of data stored after a change. The redo entries are written to log buffers in the system global area (SGA) whenever a change occurs. The log writer (LGWR) process then writes the redo entries to the online redo logs, and, if the database is in ARCHIVELOG mode, the archiver process (ARC0) writes the redo entries to the archived logs. The purpose of the redo entries is to ensure that database activities can be replayed in case a disaster occurs in the database. In the event of a complete loss, a DBA can restore a previous backup of the database and apply all the archived logs generated since that time to roll it forward to the time of failure.
What about undo? According to Oracle's read consistency model, a user sees the value of a database change only after the change is committed; otherwise, the database returns the previous (unchanged) values. These previous values are retrieved from the undo segments. Furthermore, during data recovery after a disaster, Oracle Database must roll back all changes that were not committed, to maintain consistency and data integrity. These entries are also obtained from the undo segments. Because undo segments are also database changes, they generate redo entries as well, which go into online redo logs and then to archived logs.
As a result, online and archived logs contain both undo and redo information. DBAs can use them to see both old and new data values. However, online and archived logs have an unpublished format and are not human-readable. To read them, DBAs use a tool called Oracle LogMiner , which makes it possible to read and search both online and archived logs. It is implemented via a supplied package named DBMS_LOGMNR , introduced in Oracle8i. This article describes how to use this powerful tool to solve common and not-so-common problems involving changes made to an Oracle database.
Let's return to the issue of Ms. Smith's account. John can start looking for how and when the account manager was changed by using Flashback Query to select prior values of the record in this table. Flashback Query is an Oracle feature that lets a user see the value of a column at some point in the past, before that value may have been changed and committed.
However, the Flashback Query approach works only if a change occurred within the period specified by the UNDO_RETENTION parameter, which is often as short as 30 minutes for an OLTP database. In this case, there's a good chance that Flashback Query will not find the change.
John can find the answer to the question at hand by looking directly at the online redo log files, using LogMiner. Here are the steps for setting up and using LogMiner:
In the Listing 1 output, John sees that a user named Joe updated the table at 2:16 p.m. The SID and SERIAL# columns indicate those identifiers of the session in which Joe was logged in.
Code Listing 1: Retrieving information from V$LOGMNR_CONTENTS
SQL> select username, to_char(timestamp,'mm/dd/yy hh24:mi:ss') timestamp, 2 seg_type_name, seg_name, table_space, session# SID, serial# 3 from v$logmnr_contents 4 where table_name = 'ACCOUNTS' 5 and seg_owner = 'ARUP' 6 and operation = 'UPDATE'; USERNAME TIMESTAMP SEG_TYPE_N SEG_NAME TABLE_SPAC SID SERIAL# ___________ __________________ _____________ ___________ ___________ ____ ______ JOE 01/26/05 14:16:41 TABPART ACCOUNTS,P1 ACC_D1 532 62 JOE 01/26/05 14:16:41 TABPART ACCOUNTS,P2 ACC_D2 532 62 JOE 01/26/05 14:16:41 TABPART ACCOUNTS,P3 ACC_D3 532 62 ...
Another column in this view, SESSION_INFO , provides additional information on Joe's session. The value of the SESSION_INFO column for all the rows is
login_username=JOE client_info= OS_username=jsmoe Machine_name=ACMENTNY\JSMOE OS_terminal= OS_process_id=4080:3096 OS_program name=sqlplus.exe
Based on this information, John knows that Joe updated some records in the ACCOUNTS table from a client machine named JSMOE connected to the ACMENTNY network. The last piece of information, OS_program name=sqlplus .exe , provides another big clue. Because Joe was using SQL*Plus, he probably performed an ad hoc update, but because Joe is a developer, he is not supposed to perform updates to a production database except in a dire emergency. Why would he have made these changes?
In any event, John needs to know what Joe updated as well as the old values stored before the update. To find this information, John modifies the query in Listing 1 to select the two very important columns in the V$LOGMNR_CONTENTS view: SQL_REDO , which shows a statement that can be executed to cause the same change made by the user, and SQL_UNDO , a statement that can be executed to undo the changes. Listing 2 shows the output of this modified query.
Code Listing 2: Output of the SQL_UNDO and SQL_REDO columns
SQL> select sql_undo, sql_redo 2 from v$logmnr_contents 3 where table_name = 'ACCOUNTS' 4 and seg_owner = 'ARUP' 5 and operation = 'UPDATE'; SQL_UNDO SQL_REDO ____________________________-_______ ___________ _______________________ update "ARUP"."ACCOUNTS" set "ACC_M update "ARUP"."ACCOUNTS" set "ACC_M GR_ID" = '2' where "ACC_MGR_ID" = ' GR_ID" = '6' where "ACC_MGR_ID" = ' 6' and ROWID = 'AAOKdAALAAAGcEAB' 2' and ROWID = 'AAOKdAALAAAGcEAB' update "ARUP"."ACCOUNTS" set "ACC_M update "ARUP"."ACCOUNTS" set "ACC_M GR_ID" = '1' where "ACC_MGR_ID" = ' GR_ID" = '6' where "ACC_MGR_ID" = ' 6' and ROWID = 'AAAOKdAALAAAGcEAB0' 1' and ROWID = 'AAAOKdAALAAAGcEAB0' update "ARUP"."ACCOUNTS" set "ACC_M update "ARUP"."ACCOUNTS" set "ACC_M GR_ID" = '5' where "ACC_MGR_ID" = ' GR_ID" = '6' where "ACC_MGR_ID" = ' 6' and ROWID = 'AAAOKdAALAAAGcEAB5' 5' and ROWID = 'AAAOKdAALAAAGcEAB5' ...
When confronted, Joe admits that he made the changes. In his defense, however, Joe provides a strong rationale. A very important customer—Ellen Smith—had an unpleasant experience with her assigned account manager and demanded to deal with a new one. These changes are usually made by the general manager, but because he was away, the new account manager had called Joe and asked him if he could somehow "work his magic." Joe did so by issuing the following statement:
UPDATE ACCOUNTS SET acc_mgr_id = 6 WHERE first_name = 'Ellen' AND last_name = 'Smith';
Joe did not notice, however, that the above statement updated 340 records , each of which corresponded to a customer with the name Ellen Smith. Joe wanted to update the record of only one customer, but he inadvertently updated 340 records. Among those affected was Paul's customer, who clearly didn't want her account manager changed.
Note that in the above example, John mined the online redo logs. However, if the issue was old, the contents of the online redo log would already have gone to the archived log files. In that case, he could have mined archived logs as well. To use LogMiner against archived logs, he would simply have supplied the archived log file names in the call to the DBMS_LOGMNR.ADD_LOGFILE procedure. The rest of the tasks remain the same. He can locate the archived log file names from the V$ARCHIVED_LOG data dictionary view.
Undoing the Damage
John's next task is to update the other 339 records to their old values immediately. Note that in the output in Listing 2, the text under SQL_REDO shows what originally happened and the text under SQL_UNDO shows the statements that can be used to reverse those changes. Although Joe originally used a single statement, LogMiner reconstructs the statements to modify one record per statement, as shown. To reinstate the old account manager IDs, John can execute each of the statements listed under SQL_UNDO . This will undo the changes Joe made.
The output in the SQL_UNDO and SQL_REDO columns in Listing 2 is not very easy to read, because everything is on one line. John can use the print_pretty_sql option to solve that problem. He also wants to execute the SQL statements in a PL/SQL loop, using dynamic SQL, and this approach requires that the semicolon at the end of each statement be removed. The no_sql_delimiter option handles that task.
John decides to start a LogMiner session, using the options as follows:
begin dbms_logmnr.start_logmnr( options => dbms_logmnr. dict_from_online_catalog + dbms_logmnr.print_pretty_sql + dbms_logmnr.no_sql_delimiter ); end;
After this command runs, the output for the SQL_UNDO column from the query in Listing 2 against V$LOGMNR_CONTENTS will look like the following:
SQL_UNDO ______________________________ update "ARUP"."ACCOUNTS" set "ACC_MGR_ID" = 2 where "ACC_MGR_ID" = 6 and ROWID = 'AAOKdAALAAAGcEAB' ...
The previous sections discussed a way to retrieve the SQL statements to undo all the update operations made to the ACCOUNTS table. Of course, many other users may have updated the table as well, and John would not want to undo their changes. John can filter the contents of the V$LOGMNR_CONTENTS view to show only the pertinent ones, as shown in Listing 3.
Code Listing 3: Selective output from V$LOGMNR_CONTENTS
select acc_type, first_name, last_name, dbms_logmnr.mine_value(redo_value,'ARUP.ACCOUNTS.ACC_MGR_ID') new_value, dbms_logmnr.mine_value(undo_value,'ARUP.ACCOUNTS.ACC_MGR_ID') old_value from v$logmnr_contents l, accounts a where table_name = 'ACCOUNTS' and operation = 'UPDATE' and dbms_logmnr.column_present(undo_value,'ARUP.ACCOUNTS.ACC_MGR_ID') =1 and dbms_logmnr.column_present(redo_value,'ARUP.ACCOUNTS.ACC_MGR_ID') =1 and a.rowid = l.row_id; A FIRST_NAME LAST_NAME NEW_VALUE OLD_VALUE ________________ _____________ _____________ _________ D Ellen Smith 5 6 M Ellen Smith 5 10 D Ellen Smith 5 7 ...
Review this SQL carefully. It uses additional functions provided in the DBMS_LOGMNR package. The first function, MINE_VALUE , selects the value of the specified column as recorded before or after the change. For instance, DBMS_LOGMNR.MINE_VALUE (UNDO_VALUE, 'ARUP.ACCOUNTS.ACC_MGR_ID') shows the value of the ACC_MGR_ID column before the change. If the UNDO_VALUE parameter is changed to REDO_VALUE , the statement will instead return the value after the change.
Furthermore, the COLUMN_PRESENT function returns 1 if the column is present and 0 otherwise. When joined with the ACCOUNTS table, it shows other pertinent details of the account.
The output of Listing 3 shows the old and updated values of ACC_MGR_ID , the ACC_TYPE , and the name ( FIRST_NAME and LAST_NAME ). This output helps John assess, in one glance, the overall damage done and provides a way to correct it. Using the SQL statement in Listing 3, John can write some simple PL/SQL code to update the ACC_MGR_ID values to their original values selectively after checking other columns such as ACC_TYPE or FIRST_NAME , as shown in Listing 4.
Code Listing 4: PL/SQL for executing SQL_UNDO values
SQL> begin 2 for undo_rec in ( 3 select sql_undo, row_id 4 from v$logmnr_contents 5 where table_name='ACCOUNTS' and seg_owner='ARUP' and operation='UPDATE' 6 and dbms_logmnr.column_present(undo_value,'ARUP.ACCOUNTS.ACC_MGR_ID')=1 7 and dbms_logmnr.column_present(redo_value,'ARUP.ACCOUNTS.ACC_MGR_ID')=1 8 ) 9 loop 10 begin 11 execute immediate undo_rec.sql_undo; 12 exception 13 when no_data_found then 14 dbms_output.put_line('RowId '||undo_rec.row_id||' Failed.'); 15 end; 16 end loop; 17 end;
The foregoing discussion showed how to use LogMiner on the source database in the production environment. For increased security, John may want to transfer the archived logs to a different database and perform the mining there. Because this remote database would not be accessible to the DBAs of the source database, this approach minimizes the chances that a malicious DBA will remove his trail of commands. Another motivation behind remote mining is to limit the performance impact on the source database by offloading mining activity to a separate database.
With the present setup, however, there is one minor issue John must first address. In the scenario above, John used the online data dictionary as the dictionary for translating the redo log object names from hexadecimal values to readable names. That approach works fine for using the source database, but when the mining is performed on a remote database, the online dictionary of the source database is not available.
If John uses remote mining, he has two alternatives for solving this problem:
Both options require downloading the online data dictionary, as either a redo log or a flat file. This task can be performed with a supplied package called DBMS_LOGMNR_D . If the package is not already present, John can create it by running the dbmslmd.sql and prvtlmd.sql scripts in the $ORACLE_HOME/rdbms/admin directory.
If John chooses the first option (extracting the source data dictionary to the redo logs), he will execute the following command:
begin dbms_logmnr_d.build ( options => dbms_logmnr_d.store_in_redo_logs ); end;
Note that the database must be in archivelog mode to enable this option. Archivelog mode adds the dictionary information to the online redo logs, which, in turn, go into the archived logs. When he is ready to mine, John will transfer these archived log files, using a file transfer utility such as FTP, and include them for mining in the add_logfile procedure in the dbms_logmnr package.
After that, he starts the LogMiner session, but with a slightly different parameter:
dbms_logmnr.start_logmnr( options => dbms_logmnr.dict_from_redo_logs );
If John chooses the second option (storing the data dictionary in a flat file), he will specify a file where the new dictionary should be generated. Oracle Database must be able to write to the directory, using the UTL_FILE package, so the file location must be specified in the UTL_FILE_DIR initialization parameter. If it is not set already, John will set it to a directory such as /tmp (in UNIX):
utl_file_dir = '/tmp'
John will then restart the source database instance for this change to take effect and will then generate the new dictionary by executing
begin dbms_logmnr_d.build ( dictionary_filename => 'ora_dict.txt', dictionary_location => '/tmp' ); end;
This command generates a file named ora_dict.txt in the /tmp directory. John will then transfer the /tmp/ora_dict.txt file from the source to the remote database server in the /tmp directory and start the LogMiner session by executing
dbms_logmnr.start_logmnr(dictfilename => '/tmp/ora_dict.txt');
Note the new dictfilename parameter, which points to the dictionary file created earlier. There is no need for the OPTIONS parameter, because John is not reading from the online dictionary or the redo logs. He can still use the OPTIONS parameter for other things such as formatting the SQL, removing semicolons at the end, and so on.
Which approach is better? With the first approach, the redo stream contains the dictionary, so those redo logs must be available during mining. When you are mining a small set of log files, it may be inconvenient to keep those redo logs in the set as well. With the second approach, the flat file can be made available regardless of the number of redo log files to manage. If the UTL_FILE_DIR initialization parameter is not set, however, the flat-file option becomes unavailable; the only option is to use redo logs.
Let's look at a slightly different scenario for the "Ellen Smith Account Manager" story. Suppose that the ACCOUNTS table is range-partitioned on the column ACC_MGR_ID , where one partition is created for each value of the ACC_MGR_ID . In this case, when Joe updated the ACC_MGR_ID column, the rows moved to a different partition. For instance, a row with ACC_MGR_ID = 6 was in the sixth partition, but when Joe updated the column value to 3, the row moved to the third partition. Because ROWID is based on the physical location of the row, this physical movement changed the ROWID of the row.
In this case, how would you proceed?
If you execute the following undo statement retrieved from the LogMiner session (and shown in Listing 2)—
update "ARUP"."ACCOUNTS" set "ACC_MGR_ID" = '6' where "ACC_MGR_ID" = '3' and ROWID = 'AAAOKeAAcAAAH8MAA7';
—it may not update anything, because this ROWID may not correspond to any current ROWID in the table. Worse, it may correspond to an actual ROWID , but it will be the wrong row to update, because the ROWID has definitely changed for the row in question. So you cannot depend on ROWID to get to the changed row.
To uniquely identify the row, you would use the primary key of the table. But the primary key is not recorded in the redo log files, by default.
The way to get to the primary key information in the log files is to enable supplemental logging , which records additional column values in the redo logs. You would issue the statement
alter database add supplemental log data (primary key) columns;
to begin supplemental logging. This statement causes the primary key values to be recorded, along with the other changes in the redo logs. Note that the above statement takes effect for changes occurring only after the statement is issued, not retroactively. If you had issued the above statement before the original changes were made, you would have been be able to mine the logs and see the contents of the SQL_UNDO column in the Listing 2 output as
update "ARUP"."ACCOUNTS" set "ACC_MGR_ID" = '6' where "ACC_NO" = '5720' and "ACC_MGR_ID" = '3' and ROWID = 'AAAOKeAAcAAAH8MAA7';
Note how this SQL_UNDO column output now lists the primary key, ACC_NO , in addition to ROWID , which makes it possible to identify the correct rows.
With supplemental logging, John can get the primary key values from LogMiner, which helps him uniquely identify a row. If a table does not have a primary key, John can use supplemental logging of all columns, which makes the row unique. On the other hand, recording this extra information places an additional burden on the database server, affecting performance, and should be evaluated carefully.
Oracle redo logs are literally a transcript of what goes on inside the database and contain a mine of information. LogMiner is the tool for retrieving the information from these log files and applying it to identify user transactions or even undo them.
Arup Nanda (email@example.com) is the director of Database Engineering and Operation at Starwood Hotels and Resorts in White Plains, New York. The recipient of Oracle Magazine's 2003 DBA of the Year award, Nanda is the coauthor of Oracle Privacy Security Auditing, from Rampant Press (rampant-books.com), 2003.