As Published In
Oracle Magazine
January/February 2012

TECHNOLOGY: Ask Tom

  

On History, Basics, and Network Performance

By Tom Kyte

 

Our technologist recalls a Web seminar, gives the right hint, and dishes on network performance.

I recently delivered an online Web seminar on Oracle Database security. At the end of the seminar, I took questions from the audience. There were many participants—a lot of questions on Oracle Total Recall—and when it was done, many unanswered questions.

Until now. The unanswered questions were forwarded to me, and I’ll be addressing them here in this column. But before I begin, here’s a quick overview of what Oracle Total Recall is and does.

How Oracle Total Recall Works

The purpose of Oracle Total Recall is to provide long-term flashback query capability—the flashback query can go many days, weeks, months, or even years into the past. Syntactically, an Oracle Total Recall query looks no different than a standard flashback query. It uses the AS OF and VERSIONS BETWEEN syntax in the FROM list. But under the covers, it operates very differently.

When flashback query was first introduced in Oracle9i Database, I heard from many developers asking if this new database feature could be used to replace their own custom audit trails. The developers had developed custom triggers that would save the :OLD records in an audit trail, and this approach enabled them to reconstruct the data their tables contained at any prior point in time. The problem with this implementation was twofold. First, coding the query to retrieve the point-in-time data was nontrivial; it required a UNION ALL between the current table and the history table and a messy, complex WHERE clause to get the right version of a row. Second, it necessarily made the original UPDATE and DELETE transactions at least twice as big datawise as they were before the addition of these custom triggers, which resulted in increased response time for the end users. So, the developers were motivated to find another approach that was less intrusive and easier to implement.

When they asked, “Can we use flashback query instead of our own custom audit trails,” the answer was simply, “No.” There were a few technical reasons why that was the answer. The first was that flashback query is based on UNDO, so to execute a flashback query on the data as of five hours ago, you would have to have all of the UNDO generated in the last five hours available online. Likewise, to execute a flashback query on the data as of two days ago, you’d need all of the UNDO generated in the last two days to be available online. I do not know of many systems in which the DBA would configure the UNDO tablespace to be able to contain two days of UNDO—even five hours is somewhat rare—let alone months or years of UNDO. The UNDO tablespace would be huge.

Another reason flashback query is not a replacement for custom audit trails is that the theoretical limit for a flashback query is five days, so using UNDO-based flashback query is strictly limited to the last five days of uptime for the database. So, even if you kept the UNDO for a really long time, you still couldn’t execute a flashback query on data as it stood more than five days ago.

The last reason is that UNDO-based flashback query is somewhat nonscalable. The further back in time you use flashback query, the longer it takes, because more work has to be performed. To execute a flashback query on data as of one hour ago, the database would have to roll back all the blocks it hit during the query to put them back the way they were an hour ago. If, for instance, a given block was modified by 100 different transactions in that last hour, the database would have to perform 100 rollback operations. Now, if you asked for the same data as of two hours ago, the database would likely have to roll back many more changes to that block—it would take longer to execute a flashback query as of two hours ago than it would to execute a flashback query as of one hour ago. The further back in time you execute a flashback query, the longer it is likely to take for the query to execute, because there are many more changes to roll back.

Enter Oracle Total Recall, available as of Oracle Database 11g. It solves both the performance issue and the UNDO storage and scalability issues associated with flashback query. With Oracle Total Recall, the client transaction is not affected: the processing performed by Oracle Total Recall takes place in the background, using a new database process called Flashback Data Archiver (FBDA). The client transaction just does its modifications, generates the UNDO for those modifications (as it always has done), and commits. Shortly after that client transaction commits, the FBDA process will mine the generated UNDO, looking for UNDO generated against tables in the flashback data archive. The DBA will have identified which tables need the special long-term query capability of the flashback data archive, and the FBDA process will look for UNDO generated against those tables.

When the FBDA process finds “interesting” UNDO—for tables in the flashback data archive—it rolls back the change, then and there, and reconstructs the row as it appeared before the UPDATE or DELETE operation. This reconstructed row is inserted into a flashback data archive table—a history table, if you will—and the client transaction response time is not affected. That solves the performance issue as well as the scalability issue. To execute a flashback query as of six months ago against data in the flashback data archive, the database does not have to roll back all the changes made in the last six months, because the FBDA process has been doing that all along. Instead, the database just has to query the row that was in place six months ago—a much easier challenge, with the same amount of time needed to query the data as of six months ago as six years ago. Oracle Total Recall also helps reduce storage requirements. UNDO-based flashback query would require preservation of all UNDO generated against every single table in the database. With the flashback data archive used by Oracle Total Recall, you store only the historical rows of interesting tables—not the entire database.

So, with that background in place, I can start looking at the questions received online during the Web seminar. I also encourage you to check out the Oracle Total Recall product page at oracle.com/us/products/database/options/total-recall.

Oracle Total Recall: The Questions

What is the performance impact of enabling Oracle Total Recall?

Oracle Total Recall was designed to be as nonintrusive as possible. The vast majority of the work performed by Oracle Total Recall happens asynchronously—in the background—after your transaction commits. Therefore, assuming that your database server has the excess capacity to handle this background processing, the impact on existing applications will be nominal.

If that assumption is not true—if your existing database server is running at full utilization right now—it would have an impact, of course, but that impact can be mitigated by rightsizing your hardware. You would need some additional CPU for the FBDA process, some additional I/O capabilities (you’ll generally be reading the UNDO out of the buffer cache, but you’ll be generating more overall UNDO at the system level; more overall REDO at the system level; and of course, writing to the flashback data archive itself), and you might need a little more memory to make this all run smoothly.

As always, I recommend benchmarking any change such as this before introducing it in production. You can benchmark either with your own tools or by using a product such as Oracle Real Application Testing (oracle.com/us/products/database/options/real-application-testing).

If a table is set up for Oracle Total Recall, will it affect DML and DDL on that table?

This question is somewhat version-specific and has two parts: data manipulation language (DML) and data definition language (DDL).

As for DML, the short answer is that it will not affect DML operations, except that it will permit you to use flashback query syntax to query the table as of a long time ago. So a SELECT statement would be affected, but only in a positive way. Your other DML operations are not affected.

As for DDL, the answer is version-dependent. In the first release of Oracle Total Recall, in Oracle Database 11g Release 1, DDL was very much restricted. Just about the only DDL that was permitted against a table used with Oracle Total Recall was the ALTER statement for adding a column. You could not drop a column, truncate the table, and so on. In short, pretty much anything that did not generate UNDO could not be performed against the table.

These restrictions have been removed as of Oracle Database 11g Release 2. In this release, most DDL is natively supported for tables used with Oracle Total Recall, and even the DDL that is not directly supported can be executed by administrators. If administrators need to perform an unsupported operation, such as an ALTER statement to exchange a partition in a partitioned table, they can first invoke the DBMS_FLASHBACK_ARCHIVE.DISASSOCIATE_FBA procedure, perform their operations, and then invoke DBMS_FLASHBACK_ARCHIVE.REASSOCIATE_FBA so the modified table will be rejoined with its history.

So, in short, there are no DML restrictions and effectively no DDL restrictions for Oracle Total Recall in Oracle Database 11g Release 2.

What Is Oracle Total Recall?


Introduced in Oracle Database, Enterprise Edition 11g, Oracle Total Recall is a database option that provides a secure, efficient, easy-to-use, and application-transparent solution for long-term storage and auditing of historical data. Oracle Total Recall also makes it simple to securely track and query historical data for any database table. Oracle Total Recall can be used for many purposes. Examples include

 

  • Data forensics—find and revert changes made by a disgruntled employee

  • Information lifecycle management (ILM)— guarantee immutable history of data

  • Retention policy enforcement—automatically purge history more than five years old

  • Historical reporting—analyze product changes over time

  • Error recovery—restore erroneously removed or updated records

  • Employee fraud detection—find assets that were deleted but never sold

You said that it would take thousands of GB for an UNDO-based flashback query to be able to query months or years in the past. How much space would be needed for Oracle Total Recall?

Yes, I did say that if you attempted to do a long-term UNDO-based flashback query, you’d have to have all of the UNDO generated for that entire period of time—for the entire database—online and available. And that would quickly get into many terabytes of UNDO over time.

The Oracle Total Recall processing will minimize the overall amount of storage you need, because it is enabled table by table—not databasewide—and is stored in a compressed format. So, first and foremost, you’ll need sufficient storage for only your interesting tables, not for every table in the database.

The answer to how much space you will need is that it depends. If you have a table you mostly insert into and hardly ever delete from or update, your Oracle Total Recall storage needs will be minimal, because you need to log only the before images of rows that were updated or deleted. On the other hand, if you have a one-row table that gets updated 1,000 times per day, your Oracle Total Recall storage for the table will be many times larger than the base table itself, because you will have to enter 1,000 rows in the archive history every single day—while the table itself remains small.

So, you will need to understand how often you update and delete in the table(s) in your archive to determine how much storage will be required over time.

Can Oracle Total Recall be used in the application realm as well as for security? For example, instead of using “effective dating,” could I use Oracle Total Recall to get a view of historic data?

Absolutely. This is definitely one of the use cases for Oracle Total Recall. You can enable it on tables that do not use effective dating processing—without modifying the application—and it helps solve data purging issues as well.

Typically, when applications use the effective dating type of processing, they need to retain the data for some period of time and then purge it. This purging is often accomplished with the DELETE statement, which is probably the slowest, most resource-intensive approach to purging old data. Additionally, after the DELETE, you may feel compelled to reorganize your tables and rebuild your indexes to reclaim the space you just freed up. With Oracle Total Recall, you get the ability to purge old information simply, without using DELETE and without consuming resources.

When DBAs create flashback data archives with Oracle Total Recall, they specify a retention period that tells Oracle Total Recall how far back in time they want to flashback-query the tables in the archive. Oracle Total Recall will set up a partitioned table (don’t worry if you don’t have the partitioning option; it comes with Oracle Total Recall), and over time it will simply drop old partitions as they become older than needed. The dropping of a partition doesn’t generate REDO or UNDO (as opposed to DELETE), and any indexes on the flashback data archive will be maintained through DDL as well (for example, dropping a table partition will drop the corresponding local index partitions).

Will the database halt if the Oracle Total Recall allocated space is exceeded?

The database will not halt, but DML in affected tables may be prevented. For example, if the tablespace containing the flashback data archive for a table is full and the FBDA process cannot log more changes in it, applications that attempt to modify the table will receive an ORA-55617 “Flashback Archive <name> has run out of space and tracking on <name> is suspended” error message. So the outage will be contained to only those tables in the affected flashback data archive.

There were a few more questions, but they were all variations on these. You can watch a replay of this Webcast anytime, at bit.ly/omagdbsecurity.

Back to Basics

We use indexed tables and well-tuned queries in our application, but some of the tables are not making use of the indexes.

The database table and index are

create table records
(system_no char(3),
cust_id char(10),
rec_no char(4),
start_date char(8),
end_date char(9),
...);

create index recpk
on records
(system_no,cust_id,rec_no);

The rec_no field contains the values in descending order from 9999 to 0001.

While selecting a row from the table, we are getting unexpected results—as if the index hadn’t been created. By “unexpected results,” I mean that the following query

select /*+ INDEX_ASC
(records recpk) */ *
from records
where system_id='123'
and cust_id='3456218791'
and rec_no>'0000'
and rownum<2
order by system_id asc,
cust_id asc,
rec_no asc;

is working fine in instance1, but it’s returning wrong information in instance2, and vice versa.

Your query—nothing else—is the problem. You cannot assume that a hint will be observed. A hint is a hint; it isn’t a directive. If the index is unavailable for any reason (due to a different name, because it’s unusable . . . whatever), the query will just ignore it.

The solution is to code the query the way I’ll demonstrate, and I recommend not using the INDEX_ASC hint at all.

What your query is asking for is to find the first record that matches your predicate on SYSTEM_ID, CUST_ID, and REC_NO. (The ROWNUM<2 portion of the predicate happens before the ORDER BY does.) Hence, you are getting a record—any record at all—that matches your predicate and then sorting it.

You are getting the right answer for both of your instances. The SQL is allowed to return pretty much anything here, because you’ve said, “Find the first record such that this condition is true, and then sort it.”

What you need to do instead is say, “Sort these matching records, and return the first one.” But you must ask the right question—that is imperative.

The right question for you would look like this:

 

select * 
  from 
(
select /*+ first_rows(1) */ * 
  from records 
 where system_id='123' 
   and cust_id='3456218791' 
   and rec_no>'0000'
 order by system_id
          asc,cust_id asc,
          rec_no asc
)
 where rownum < 2;

 

Next Steps


 ASK Tom
Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.
 

READ more Tom
 Oracle Database Concepts 11g Release 2 (11.2)
 Expert Oracle Database Architecture: Oracle Database Programming 9I, 10g, and 11g Techniques and Solutions, Second Edition

 WATCH the Oracle Database Security Webcast

READ more about
 Oracle Total Recall
 Oracle Real Application Testing

 DOWNLOAD Oracle Database 11g Release 2

FOLLOW Oracle Database
 on Twitter

 on Facebook
 

That tells the database to find the records that match your predicate, sort them, and then return the first one. The database probably doesn’t have to do that much work—it would tend to use the index all by itself (without a hint), now that it knows that you want the first row of that ordered set and only that row. The FIRST_ROWS(1) hint is a much better choice than your INDEX_ASC hint, because it tells the optimizer what your goal is, and the optimizer will find the fastest way to meet that goal. If your index is unavailable, the database will use a top-n query optimization to find the row as quickly as possible, but if an index is available, the database will tend to use that instead.

But you’ll always get the right answer, because you’ve asked the right question.

Oracle OpenWorld: One Thing 

I was not able to attend Oracle OpenWorld 2011, but I heard about your “Five Things” presentations. Could you walk through some of those? 

Yes, at Oracle OpenWorld 2011, I gave two presentations that started with “Five things you probably didn’t know about . . . .” One was for SQL, and the other was for PL/SQL. Here is one highlight from the SQL talk.

Oracle Net Services compression. Did you know that Oracle has been silently compressing your data on the network for quite a few years now? Oracle Net Services automatically compresses data in the data stream by putting only the changes from the previous row into the data stream from the server to the client—the deltas from the previous row, if you will. So, if you return two rows that contain many of the same values, the second row will not send back very many bytes at all—just the differences from the first row. This sort of compression works amazingly well on database data, especially when you use an ORDER BY clause.

Here is a small example demonstrating this compression. I’ll start with a test table:

 

SQL> create table t
  2  as
  3  select *
  4    from all_objects;
Table created.

SQL> begin
  2    dbms_stats.gather_table_stats
  3    ( user, 'T' );
  4  end;
  5  /
PL/SQL procedure successfully completed.

 

Now, using AUTOTRACE TRACEONLY STATISTICS in SQL*Plus, I’ll retrieve that data and measure the bytes transferred, as shown in Listing 1.

Code Listing 1: Testing Oracle Net Services compression—control

 

SQL> select * from t;
72228 rows selected.

Statistics
————————————————————————————————————————————————————
       5794  consistent gets
    8015033  bytes sent via SQL*Net to client
      53385  bytes received via SQL*Net from client
       4817  SQL*Net roundtrips to/from client
      72228  rows processed

 

Note that it took about 8 MB of network traffic to deliver that result set and that the query performed 5,794 logical I/Os (consistent gets). If I modify the query slightly by adding an ORDER BY clause, I can change those numbers dramatically. I’m going to use ORDER BY TIMESTAMP in this case, because I know that this column is very wide (19 bytes), is NOT NULL, and has few distinct values compared to the number of rows in the table. The results are shown in Listing 2.

Code Listing 2: Testing Oracle Net Services compression—ordered

 

SQL> select * from t order by timestamp;
72228 rows selected.

Statistics
—————————————————————————————————————————————————————
       1031  consistent gets
    3427630  bytes sent via SQL*Net to client
      53385  bytes received via SQL*Net from client
       4817  SQL*Net roundtrips to/from client
      72228  rows processed

 

That was pretty dramatic. It dropped from 8 MB to 3.4 MB of data transferred, and that was entirely due to this Oracle*Net compression taking place. Because the TIMESTAMP value repeated so often, it didn’t need to be sent over and over again. Additionally, you might have noticed that the consistent gets went from 5,794 down to 1,031. This was another side effect of the ORDER BY. The first run of the query read the data directly out of the table but did not sort it and did not need to write it into temporary space. So every time I fetched data (the default array fetch size in SQL*Plus is 15 rows at a time), I had to get a block from the buffer cache and get 15 rows from it. My table stores approximately 73 records per block, so that meant that when I did not use ORDER BY, I had to retrieve the first block from the cache about five times to get all 73 rows from it, 15 rows at a time. When I sorted the data, I needed to read all the rows and sort them into temporary memory or temporary space on disk. When I retrieved them 15 at a time, I had to read them out of temporary space—not from the buffer cache.

So, what can you take away from that? Should you be applying ORDER BY statements to all your SQL to reduce the data transferred and to decrease logical I/Os? Of course not. The cost of sorting the data would almost certainly outweigh any gains achieved by compression and the reduction in the number of logical I/Os. That would be adding a lot of work (sorting). What you can take away is that if you are already sorting data, as you must in order to meet the application requirements, you are deriving some benefit, likely in the form of reduced data on the network and possibly less buffer cache contention.

Now, what if I went a step further and sorted the data even more? Because TIMESTAMP repeats frequently and I know that OBJECT_TYPE and OWNER do as well (SYS owns a lot of TABLES, for example), I might expect to see a further reduction in transferred data, as shown in Listing 3.

Code Listing 3: Testing Oracle Net Services compression—very ordered

 

SQL> select * 
       from t
      order by timestamp, 
        object_type, owner;
72228 rows selected.

Statistics
————————————————————————————————————————————————————
       1031  consistent gets
    3280011  bytes sent via SQL*Net to client
      53385  bytes received via SQL*Net from client
       4817  SQL*Net roundtrips to/from client
      72228  rows processed

 

The transferred data dropped from 3.4 MB to 3.2 MB. Note that the logical I/Os did not change—they cannot. The query must read every block in the table at least once, so 1,031 is the minimum number of I/Os, but the amount of data being transferred decreased a little bit.

Now, you may be asking, “What happens if I get more than 15 rows at a time?” It seems logical that the array size could have some impact on the amount of data transferred. If I send more rows back at a time, I’ll have more data that could be repeating and I could likely compress it better. Indeed, testing proves that.

Table 1 shows the amount of data transferred (in megabytes), data transferred as a percentage of the original query (showing the percentage reduction), and number of consistent gets. The “No Order” columns represent the query executed without an ORDER BY, the “Some Order” columns represent the ORDER BY TIMESTAMP executions, and the “Very Ordered” columns represent the three-column sort. The number in the column headings represents the array size.

  No Order
15
Some Order
15
Very Ordered
15
No Order
100
Some Order
100
Very Ordered
100
Bytes Sent 8.01 MB 3.42 MB 3.28 MB 7.48 MB 2.90 MB 2.76 MB
% of Original 100% 43% 41% 93% 36% 34%
Consistent Gets 5,794 1,031 1,031 1,741 1,031 1,031

Table 1: Comparing Oracle*Net compression impact of 15- and 100-row fetches

As you can see, introducing a larger array size—100 instead of 15—had a material effect on the amount of data transferred. The original unordered query benefited from it by the same percentage as the other queries—they all dropped another 7 percent. Before you get too excited thinking, “If 100 was better than 15, then 1,000 will be even better”—don’t. You will hit the law of diminishing marginal returns here. I did the test with an array size of 1,000, and Table 2 shows the results.

  No Order
1,000
Some Order
1,000
Very Ordered
1,000
Bytes Sent 7.39 MB 2.82 MB 2.67 MB
% of Original 92% 35% 33%
Consistent Gets 1,105 1,031 1,031

Table 2: Comparing Oracle*Net compression impact of 1,000-row fetches

As you can see, 1,000 was not much different from 100. There was about a 1 percent drop in data transferred—not really worth it. Also, the amount of client and server memory needed to package this 1,000-row fetch was about an order of magnitude more than in the case of the 100-row fetch. In my experience, somewhere between 100 and 500 rows at a time generally works extremely well, with 100 being a historically good number for me. I recommend making the array size your applications use a configurable parameter so you can try various values.

I’ll post more of the “Five things you probably didn’t know about . . .” presentations in future columns.

 


Tom Kyte Headshot


Tom Kyte is a database evangelist in Oracle’s Server Technologies division and has worked for Oracle since 1993. He is the author of Expert Oracle Database Architecture (Apress, 2005, 2010) and Effective Oracle by Design (Oracle Press, 2003), among other books.

 

 



Send us your comments