11g logo

Oracle Database 11g:
The Top Features for DBAs and Developers

by Arup Nanda Oracle ACE Director

Database Replay

Learn how to use the Database Replay, the shiny new tool in Oracle Database 11g that captures complete database workload so you can "replay" it at will.

See Series TOC

What is your biggest concern when you need to make a change in the database—be it some minor change, such as altering initialization parameters and database attributes, or major but inevitable ones such as applying patchsets? What about for your upgrade to Oracle Database 11g itself?

For me, the biggest concern is the risk of the change "breaking" something. Even the most minor changes can have a domino effect, eventually producing a visible impact.

To minimize this risk, most shops make the change in a control environment similar to the production one, apply a workload similar to your production system's, and observe the impact. It's rather trivial, at least technologically speaking, to replicate your production system but reproducing the workload is a different story. That's easier said than done.

Most organizations attempt to do that using some third-party load generation tool that can run automatically to simulate real user activity. Although this approach may be adequate in most cases, it's never a truly faithful reproduction of your production database workload. These third-party tools merely execute a pre-written query several times with different parameters; you have to supply the query to the tool and give it a range of parameters that it can use randomly. This is not a representative workload of your production system but rather merely the running of a small part of your production workload executed several times—resulting in as little as 1 percent of your application code being tested. Worst of all, these tools require you to supply all the queries from the production workload yourself, which can take weeks or months for small applications or even up to a year for complex ones.

If you could, wouldn't it be a better approach to record all database operations—DML-related and otherwise—inside the database itself, and then replay them in the very sequence they occurred?

Enter Database Replay

In Oracle Database 11g, your wish is granted, and then some. The new Database Replay tool works like a DVR inside the database. Using a unique approach, it faithfully captures all database activity beneath the level of SQL in a binary format and then replays it either in the same database or in a different one (which is exactly what you would want to do prior to making a database change). You can also customize the capture process to include certain types of activity, or perhaps exclude some.

Database Replay delivers half of what Oracle calls Oracle Database 11g's Real Application Testing option; the other half is provided by another tool, SQL Performance Analyzer. The main difference between these two tools is the scope involved: whereas Database Replay applies to the capture and replay of all (subject to some filtering) activities in a database, SQL Performance Analyzer allows you to capture specific SQL statements and replay them. (You can't see or access specific SQLs captured in Database Replay, while in SQL Performance Analyzer you can.) The latter offers a significant advantage for SQL tuning because you can tweak the SQL statement issued by an application and assess its impact. (SQL Performance Analyzer is covered in a forthcoming installment in this series.)

Conceptually, Database Replay works in the sequence shown in the figure below.

Figure 1

You start a capture process that records activities against the database. 

  • The process writes the activities to special files called "capture files" in a directory called /capture directory/.
  • After a while you stop the capture process and move these capture files to a test system in a directory called /replay directory/.
  • You start a replay process and several replay clients to replay all these capture files.
  • The capture files are applied against the test database.


So, what does Database Replay provide that third-party tools don't? Well, other tools merely replay several synthetic statements, which you provide. In contrast, Database Replay does not need you to provide SQL statements. Since it captures all activity underneath the SQL, you don't risk missing out on some key operations that may be the root of any performance issue. In addition, since you can capture selectively—for specific users, programs, and so on—and you can specify a time period when the workload is captured, you can replay specific workloads that cause you problems, not the entire database.

For instance, you notice that the month-end interest calculation program is causing issues and you suspect that changing a parameter will ease the process. All you have to do is capture the workload for the duration the month-end program runs, make the change in parameter on a test system, and then replay the capture files on that test system. If the performance improves, you have your solution. If not, well, it's only a test system. You didn't impede the operation of the production database.

In my opinion, this tool alone makes the upgrade to Oracle Database 11g worthwhile. Now, I'll show you how it works.

Capturing

The first task is to capture the workload from your database. All the tasks are done either via command line or Oracle Enterprise Manager Database Control, but you'll use the latter here.

  1. The workload captured is stored in the system on files—the veritable "tape" inside your "camcorder." This directory should be empty. So, the first task will be to create the directory if you don't have one. For this example, create a directory called /home/oracle/dbcapture.
    
      $ cd /home/oracle
    $ mkdir dbcapture
  2.  Create a directory object in the database for this directory:
    SQL> create directory dbcapture as '/home/oracle/dbcapture';
          
    Directory created.
          
  3.  Now you are ready to initiate capture. To demonstrate a real-life example, you will create a simple test harness that will generate a lot of INSERT statements and insert into a table called TRANS.
    
    create table trans (
            trans_id        number,
            cust_name       varchar2(20),
            trans_dt        date,
            trans_amt       number(8,2),
            store_id        number(2)
    )
    /
          


    Here is the little PL/SQL code snippet that does the trick. It generates 1,000 insert statements and executes them. (Note that it generates 1,000 distinct insert statements, not inserts 1,000 times in the same statement or program.)

    declare
      l_stmt varchar2(2000);
    begin
      for ctr in 1..1000 loop
         l_stmt := 'insert into trans values ('||
            trans_id_seq.nextval||','||
            ''''||dbms_random.string('U',20)||''','||
            'sysdate - '||
            round(dbms_random.value(1,365))||','||
            round(dbms_random.value(1,99999999),2)||','||
            round(dbms_random.value(1,99))||')';
         dbms_output.put_line(l_stmt);
         execute immediate l_stmt;
         commit;
      end loop;
    end;
          
  4. Just create the file with the contents as above; do not run it. Call this file add_trans.sql. (All the above steps were necessary for this lesson only. With the exception of the directory object, they are not needed when you perform the operation in production.)
  5.  In the real world, you would probably run the replay on a different database. For our purposes here, however, you will merely flashback the same database and replay the activities there. You can mark this spot by creating a Restore Point called GOLD.
    
    SQL> create restore point gold;
          


    Now, you are ready to capture. Navigate to the main Database Replay page in Oracle Enterprise Manager Database Control. From the home page, choose Software and Support (shown in the figure below, marked by "1").

    Figure 2
  6.  Click Database Replay (under Real Application Testing) to launch the Database Replay page (see below).

    Figure 3
  7.  In the left-hand pane, you will see a series of activities. Choose the first activity ( Step 1: Capture Workload) by clicking on the Go to Task icon next to it.
  8. The next screen brings up three assumptions you should carefully examine and confirm before starting the capture process:

    • That the current database can be restored on the replay system to the SCN when workload capture begins
    • That there is enough disk space to hold the captured workload
    • That you are ready to restart the database before workload capture begins, if you choose to
  9. Tick all check boxes to acknowledge.
  10. Click Next.
  11. The next screen has two different action items. On the top half of the screen you will see two radio buttons for you to choose if you want to restart the database before the capture process.

    When you start the capture process, there could be in-flight transactions, some of which may be captured and some not. Restarting the database will void these in-flight transactions. Restarting the database clears this "noise". Furthermore, restarting the database gives you a clean backup to be restored on a test system, ensuring that you are replaying the activities on a system that has the same SCN number as the production system.

    For these reasons, especially the first one, Oracle recommends that you restart the database prior to capture (and this selection is default). But you don't have to. If you don't want to restart, choose the other radio button.

  12. The bottom part of the screen now shows something similar to that shown below.

     

    Figure 4



    Now you will record the filters that the capture process will take into account while capturing activities. Two filters are there by default: to exclude all activities coming from Oracle Management Server and those coming from Oracle Management Agent.


    You can add additional filters too. For example, to add a filter to exclude all perl programs, click Add Another Row and enter "perl" and "%perl%" in the fields "Filter Name" and "Value" respectively. Similarly, correct a small mistake in the default parameter—the value of the Oracle Management Agent filter should be "%emagent%", not "emagent%".

    Or, suppose you want to exclude all SYS user actions. Then you will need to choose USER from the Session Attribute drop down box and enter SYS in the "Value" column.

  13. Click Next. This brings up a screen similar to the one shown below:

     

    Figure 5
  14.  In this screen, choose the directory name from the drop-down box where capture files will be stored. In this case you have used the directory DBCAPTURE. If you have not created this directory as shown in earlier steps, you can still create it by clicking Create Directory Object. Then click Next.
  15.  In the next screen you will see the Job Details such as when it needs to be executed and so on. Choose the radio button Immediate to execute this immediately.
  16. Fill in the other details in the page such as the OS username, SYS password, and so on, and click Next.
  17.  The next screen, labeled "Step 5 of 5", shows you all the information you entered such as the job name and the exclusion filters. If everything looks as you desired, Click Submit. Otherwise you can go back to make changes.
  18.  Once you hit Submit, the workload capture will start. You will see a confirmation screen as shown below.

     

    Figure 6



    Note the Status, which shows "In Progress".
  19.  Now that the workload is being captured, run your simulation workload from a SQL*Plus prompt. Of course, in a real-life system, you will not need to run any simulation; you will merely let the capture run for a while to capture all your workload.
    
    SQL> connect arup/arup
    SQL> @ins_trans
          

    This will execute 1,000 insert statements into the table TRANS. After the workload is completed, click the Stop Capture button as shown in the screen above. You will be asked to confirm.
  20.  Oracle takes Automated Workload Repository (AWR) snapshots automatically before and after the workload capture. In the next screen you will be asked if you want to export the AWR data. This is important if you replay on a different system and you would want to export the AWR data from this database to the target database, as shown in the screen below. Click Yes.

     

    Figure 7
  21.  This will create a Scheduler Job to export the AWR. Click on the job name and refresh the status screen until you see the jobs disappears from the Running tab.

You have just captured the workload in the files in the directory /home/oracle/dbcapture!

Pre-processing

Now that you have captured the workload, you can replay it. Usually, you will want to replay in a separate, test system so you will need to copy the files in the directory /home/oracle/dbcapture to a new host. Make sure that the directory is empty before you copy files to it. For learning purposes, here you will use the same database for replay.

Replaying in the same database is an uncommon but conceivable requirement. For example, you may want to replay the transactions in your main system and after testing is complete flashback to the starting point. You may have an outage window within which you want to test the effect of a parameter change, which you would do in the same database.

You will need to pre-process the workload captured before you can play it. Pre-processing makes these captured files ready for replay.

  1. Go the main Database Replay page.

  2. Select Step 2: Preprocess Workload.

  3. Choose the directory object from the drop down list box. It will show the captured workload. In your case, it's DBCAPTURE. If you have not created the directory object, you can easily create the directory by clicking the appropriate button.

  4. Click Preprocess Workload.

  5. In the next page you will be asked to provide a job name and the associated details like host username and password. Accept the defaults unless you want a specific job name. Choose to run this job immediately. The host userid and passwords should be already populated. If they are not, enter the appropriate values; click Submit.

  6. In the next page, you will see a conformation and a link to see the job status. Click on it.

  7. Refresh this screen until you see the status as "Succeeded."

The workload has now been pre-processed and is ready for replay.

 

Replaying

After the workload is captured and pre-processed, you can replay it in the test database. Again, for learning purposes, you have pre-processed the workload in the same database and will use the same database to replay the activities. To do so, you have to reset the database back to the starting point. You can easily do that by flashing it back to the restore point GOLD you created during capture process.

SQL> shutdown immediate;
... database shuts down ...
SQL> startup mount
... instance starts and mounts the database ...
SQL> flashback database to restore point gold;
... database will be flashed back ...
SQL> alter database open resetlogs;
... database is opened ...


Now you are at a point before the workload started and you can replay the workload you captured earlier. Follow the steps below to replay it.

  1. Go to the main Database Replay screen from the Database homepage as shown in the "Capturing" section.
  2. From the menu, select Step 3: Replay Workload. This will take you to the main Replay screen.
  3. You will see a drop down box for choosing the directory. Choose the directory where you placed the replay files. This is the directory object; not the actual UNIX directory. In the earlier example, you used the directory object DBCAPTURE, so choose that one. If you have not created the directory yet, you can click Create Directory and create a directory object.
  4. Click Setup Replay in the top right-hand corner.
  5. The next screen brings up a list if information about what is about to happen. Here is the lowdown on each of these informational items.
  6. Restore Database—When you replay a workload captured earlier, you are probably doing it on a test system. How did you build the test system? You probably restored the production database to the test system and restored it. Most likely the production database has not been shut down for this activity so your recovery is probably incomplete. In that case, confirm that the recovery operation is performed up to the SCN number specified during capture as well as preprocessing steps.


    In this example, you flashed back the database to that SCN number. So, you're in compliance.

  7. Perform System Changes—This is why you are doing the replay in the first place: to test a system change such as a parameter change or a setting change. Well, of course, you have to make the change prior to replay.

  8. Resolve References to External Systems—Suppose you had a directory object in the production database pointing to /home/appman/myfiles and this directory doesn't exist on the test system. When you replay, the references to this directory will fail. Similarly, all the DB Links in the source system will fail in the test system if they do not exist. So, you have to resolve all of them by creating or changing them. The next screen will allow you to change them.
  9. Set Up Replay Clients—You will see how to do that in subsequent steps.
  10. Click Continue, which bring up a screen such as the following:


    Figure 8



    You can change all the unreferenced parameters by clicking the links shown on the page. Please note that you will navigate away from the Database Replay page when you click on any one of them. So, it's preferable to change them separately, in SQL*Plus. Click Continue.
  11.  Enter a Replay Name or accept the default.
  12.  The next screen shows you some potential issues due to unresolved references to DB links, directories, and so on.


    Figure 9



    If you like you can make changes to the replay system on the right-hand side of the screen. In this example, as you are running on the same database, this step will not be necessary.
  13.  Click Next. This will bring up a screen as shown in the figure below:

     

    Figure 10



    This screen shows that the replay process is now waiting for replay clients. The replay clients are executed from outside the Database Control screen. These are client programs that read the captured workload and replay them. The program is named wrc (both on UNIX and Windows systems). To start a replay client, you need to go to the UNIX prompt and execute the following line:
    
    $ wrc userid=system password=* replaydir=/home/oracle/dbcapture
                    
  14.  Of course, you need to supply the correct password for SYSTEM. Change the directory name if you stored the captured files in a different place. It should return with the following message:
    
    Workload Replay Client: Release 11.1.0.6.0 - Production on Tue Sep 4 19:50:44 2007
     
    Copyright (c) 1982, 2007, Oracle.  All rights reserved.
     
    Wait for the replay to start (19:50:44)
                      
  15.  At this time the replay client is merely waiting for the replay governor (Database Control) to tell it to start. You may decide to start more clients to process the workload in parallel.
  16.  Immediately go to the Database Control screen. You should see the screen has changed to display the fact that the replay clients are connected. It displays the host name they connected from, the OS process id, and so on:

     

    Figure 11
  17.  Click Next and then Submit to start the replay process. If you go to the UNIX session now, you will see an additional message: "Replay started (01:49:56)". The screen will display the progress bar that displays how much of the data has been processed so far.
  18.  After some time the UNIX session will show "Replay finished (01:50:35)". At that time, if you check the Database Control screen, you will see a screen similar to the following.

     

    Figure 12
  19. This shows the detailed status of the replay job. The key field is "Status" at the upper left-hand corner that shows "Completed", indicating that the job is complete.
  20. Now you can analyze the run. The screen shows the metrics on the lower half of the screen, under heading "Comparison". In this example it indicates that the replay completed in 39.08% of the capture time. So, is this good news? Were the changes you implemented effective?

     

    Figure 13


    Not necessarily. Look at the next metric—Database Time—which is 180% of the capture. To dig more, click on the tab Report, which brings up the screen shown below:

     

    Figure 14
  21.  This screen shows various options for reports. Start with the simplest, the Workload Report. This report does not compare performance but shows you "divergence"—how much data was different in the replay. For instance, if you had a record with ID 3, it was updated and later deleted. During replay, suppose it was first deleted and then updated; that would qualify as divergence. The less the divergence, the more accurate the replay.
  22.  But don't stop there. For a definitive analysis, examine the AWR Compare Period Report, shown just below, for periods during the capture and the replay and see the difference for many other metrics such as latch contention, locks, redo generation, consistent gets, and so on, which give you a much better and clearer picture of the impact of your changes.

     

    Figure 15



    This report shows the differences between the capture and replay loads. During replay, physical writes and reads went up to 367% and 111% of that during capture respectively. Other parameters, such as sorts and logical reads, also went up, albeit not so dramatically. So you could conclude that whatever changes were made hurt performance rather than helped it.

 

Release 2 Addendum:

In Release 2, several new types of information are captured into the AWR repository. These new data sources are exposed as views with the name DBA_HIST in them.

DBA_HIST_DISPATCHER
DBA_HIST_DYN_REMASTER_STATS
DBA_HIST_IOSTAT_DETAIL
DBA_HIST_SHARED_SERVER_SUMMARY
DBA_HIST_SQLCOMMAND_NAME
DBA_HIST_TOPLEVELCALL_NAME


One view is particularly important to mention here. DBA_HIST_IOSTAT_DETAIL displays I/O statistics aggregated by combination of file type and function (component). This view contains snapshots of V$IOSTAT_FILE and V$IOSTAT_FUNCTION. Here is a snapshot of the contents:

SNAP_ID DBID INSTANCE_NUMBER FUNCTION_ID FUNCTION_NAME
---------- ---------- --------------- ----------- ------------------------------
FILETYPE_ID FILETYPE_NAME SMALL_READ_MEGABYTES
----------- ------------------------------ --------------------
SMALL_WRITE_MEGABYTES LARGE_READ_MEGABYTES LARGE_WRITE_MEGABYTES SMALL_READ_REQS
--------------------- -------------------- --------------------- ---------------
SMALL_WRITE_REQS LARGE_READ_REQS LARGE_WRITE_REQS NUMBER_OF_WAITS WAIT_TIME
---------------- --------------- ---------------- --------------- ----------
1099 1883196973 1 2 LGWR
1 Control File 4
4 0 0 279
286 0 0 279 985

1099 1883196973 1 2 LGWR
3 Log File 0
586 0 494 52
165586 0 1438 104 457

1099 1883196973 1 5 Streams AQ
1 Control File 13
4 0 0 861
246 0 0 861 1471

1099 1883196973 1 5 Streams AQ
2 Data File 5
0 0 210 569
82 0 210 610 19582

1099 1883196973 1 8 Buffer Cache Reads
2 Data File 664
0 346 0 62272
0 779 0 50136 396220

1099 1883196973 1 9 Direct Reads
2 Data File 62
0 1139 0 761
4 2340 1 0 0

1099 1883196973 1 10 Direct Writes
2 Data File 0
4 0 6 0
316 0 24 0 0

This output shows clearly the different types of IO calls (small or large reads, etc.) issued by components (DBWR, LGWR, Streams, Data Pump etc.) on different types of files such as Datafiles, Controlfiles, Redo Log files and Data Pump files.


Furthermore, in Oracle Database 11g Release 2, Database Replay has a new feature called scale up multiplier that can increase workload multiple times on the database during workload replay, as compared to the workload during the capture.

Let’s see how to use this feature with an example. We will start with a captured workload captured in the directory /u01/ratcapture. We have also defined a directory object called RATCAPTURE on that unix directory. With this information, we can replay the captured files. Let me demonstrate that that with the Database Workload Capture API instead of Enterprise Manager. (As of this writing, Enterprise Manager did not have the provision to put the special parameter required, so API was the only choice anyway.)

Here are the steps. At this point the workload has been captured and ready for processing at the target system.

First we will preprocess the captured files.

begin
  dbms_workload_replay.process_capture (capture_dir => 'RATCAPTURE');
end;
/


We will need to initialize the database for the replay. We have to give a name to the replay; let’s choose REPLAY1. 


begin
        dbms_workload_replay.initialize_replay (
                replay_name     => 'REPLAY1',
                replay_dir      => 'RATCAPTURE'
);
end;
/


We will then prepare the database for replay. This is where we will use a special parameter to scale up. This parameter – named appropriately scale_up_multiplier – when set to 10 increases the workload replay to 10x the original rate during capture.


begin
   dbms_workload_replay.prepare_replay (
      synchronization              => 'SCN',
      connect_time_scale           => 100, 
      think_time_scale             => 100,
      think_time_auto_correct      => TRUE,
      scale_up_multiplier          => 10
   );
end;
/


On a separate Unix session we will need to start the wrc client. This will wait for the actual replay to start.


# wrc system/oracle REPLAYDIR=/u01/ratcapture

Workload Replay Client: Release 11.2.0.1.0 - Production on Thu Aug 12 17:24:30 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.


Wait for the replay to start (17:24:30)


On the first SQL*Plus session we will start the replay.


begin
  dbms_workload_replay.start_replay;
end;
/  


When the replay starts, the wrc client will start processing as well. It will display that fact along with the time it started.


Replay started (17:27:04)


After the replay has consumed all the captured workload, it will show that it is finished, along with the time and return to the unix prompt.


Replay finished (17:29:51)


ow is the time to analyze what happened during the replay, or look at the Replay Report. To do that, we will need to find out the replay ID:


SQL> select id, name, scale_up_multiplier
  2* from dba_workload_replays;

ID    NAME                           SCALE_UP_MULTIPLIER
---------- ------------------------------                         -------------------
    4 REPLAY-D112D2-20100812170729                     1
    6 REPLAY1                                         10


We can see that there were two replays. Once, with ID 4 was run without any scale up. The other, we just ran, was scaled up 10 times.


Next, we'll get the report for the ID 6. You can do that by executing the function REPORT from the package dbms_workload_replay. The output of the function is the report in CLOB datatype. You need to set LONGSIZE first before making the select, otherwise it will show only the first 80 bytes by default.

SQL> set longsize 999999
SQL> select dbms_workload_replay.report (
  2>   replay_id          => 6,
  3>   format             => 'TEXT'
  4> )
  5> from dual;
Here is the output:

 


DB Replay Report for REPLAY1
------------------------------------------------------------------------

-------------------------------------------------------------------------
| DB Name | DB Id | Release | RAC | Replay Name | Replay Status |
-------------------------------------------------------------------------
| D112D2 | 1883196973 | 11.2.0.1.0 | NO | REPLAY1 | COMPLETED |
-------------------------------------------------------------------------

Replay Information
------------------------------------------------------------------------
| Information | Replay | Capture |
------------------------------------------------------------------------
| Name | REPLAY1 | CAPTURE-D112D2-20100812162527 |
------------------------------------------------------------------------
| Status | COMPLETED | COMPLETED |
------------------------------------------------------------------------
| Database Name | D112D2 | D112D2 |
------------------------------------------------------------------------
| Database Version | 11.2.0.1.0 | 11.2.0.1.0 |
------------------------------------------------------------------------
| Start Time | 12-08-10 21:27:04 | 12-08-10 20:28:47 |
------------------------------------------------------------------------
| End Time | 12-08-10 21:27:10 | 12-08-10 20:28:59 |
------------------------------------------------------------------------
| Duration | 6 seconds | 12 seconds |
------------------------------------------------------------------------
| Directory Object | RATCAPTURE | RATCAPTURE |
------------------------------------------------------------------------
| Directory Path | /u01/ratcapture | /u01/ratcapture |
------------------------------------------------------------------------

Replay Options
---------------------------------------------------------
| Option Name | Value |
---------------------------------------------------------
| Synchronization | SCN |
---------------------------------------------------------
| Connect Time | 100% |
---------------------------------------------------------
| Think Time | 100% |
---------------------------------------------------------
| Think Time Auto Correct | TRUE |
---------------------------------------------------------
| Number of WRC Clients | 1 (1 Completed, 0 Running ) |
---------------------------------------------------------

Replay Statistics
------------------------------------------------------------
| Statistic | Replay | Capture |
------------------------------------------------------------
| DB Time | 5.047 seconds | 0.407 seconds |
------------------------------------------------------------
| Average Active Sessions | .84 | .03 |
------------------------------------------------------------
| User calls | 1120 | 112 |
------------------------------------------------------------
| Network Time | 3.748 seconds | . |
------------------------------------------------------------
| Think Time | 24.591 seconds | . |
------------------------------------------------------------

Replay Divergence Summary
-------------------------------------------------------------------
| Divergence Type | Count | % Total |
-------------------------------------------------------------------
| Session Failures During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| Errors No Longer Seen During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| New Errors Seen During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| Errors Mutated During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| DMLs with Different Number of Rows Modified | 0 | 0.00 |
-------------------------------------------------------------------
| SELECTs with Different Number of Rows Fetched | 0 | 0.00 |
-------------------------------------------------------------------


Now compare this report to the one with a scale up multiplier was left to default 1. This is the replay with ID 4 (from the output of DBA_WORKLOAD_REPLAYS view shown earlier).


DB Replay Report for REPLAY-D112D2-20100812170729
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
| DB Name | DB Id | Release | RAC | Replay Name | Replay Status |
------------------------------------------------------------------------------------------
| D112D2 | 1883196973 | 11.2.0.1.0 | NO | REPLAY-D112D2-20100812170729 | COMPLETED |
------------------------------------------------------------------------------------------

Replay Information
-----------------------------------------------------------------------------------
| Information | Replay | Capture |
-----------------------------------------------------------------------------------
| Name | REPLAY-D112D2-20100812170729 | CAPTURE-D112D2-20100812162527 |
-----------------------------------------------------------------------------------
| Status | COMPLETED | COMPLETED |
-----------------------------------------------------------------------------------
| Database Name | D112D2 | D112D2 |
-----------------------------------------------------------------------------------
| Database Version | 11.2.0.1.0 | 11.2.0.1.0 |
-----------------------------------------------------------------------------------
| Start Time | 12-08-10 21:13:17 | 12-08-10 20:28:47 |
-----------------------------------------------------------------------------------
| End Time | 12-08-10 21:13:22 | 12-08-10 20:28:59 |
-----------------------------------------------------------------------------------
| Duration | 5 seconds | 12 seconds |
-----------------------------------------------------------------------------------
| Directory Object | RATCAPTURE | RATCAPTURE |
-----------------------------------------------------------------------------------
| Directory Path | /u01/ratcapture | /u01/ratcapture |
-----------------------------------------------------------------------------------

Replay Options
---------------------------------------------------------
| Option Name | Value |
---------------------------------------------------------
| Synchronization | SCN |
---------------------------------------------------------
| Connect Time | 100% |
---------------------------------------------------------
| Think Time | 100% |
---------------------------------------------------------
| Think Time Auto Correct | TRUE |
---------------------------------------------------------
| Number of WRC Clients | 1 (1 Completed, 0 Running ) |
---------------------------------------------------------

Replay Statistics
-----------------------------------------------------------
| Statistic | Replay | Capture |
-----------------------------------------------------------
| DB Time | 1.458 seconds | 0.407 seconds |
-----------------------------------------------------------
| Average Active Sessions | .29 | .03 |
-----------------------------------------------------------
| User calls | 112 | 112 |
-----------------------------------------------------------
| Network Time | 0.087 seconds | . |
-----------------------------------------------------------
| Think Time | 2.786 seconds | . |
-----------------------------------------------------------

Replay Divergence Summary
-------------------------------------------------------------------
| Divergence Type | Count | % Total |
-------------------------------------------------------------------
| Session Failures During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| Errors No Longer Seen During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| New Errors Seen During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| Errors Mutated During Replay | 0 | 0.00 |
-------------------------------------------------------------------
| DMLs with Different Number of Rows Modified | 0 | 0.00 |
-------------------------------------------------------------------
| SELECTs with Different Number of Rows Fetched | 0 | 0.00 |
-------------------------------------------------------------------


Did you note some important differences?

  • The replay duration was 6 seconds in scaled up case, compared to 5 seconds in the non-scaled case
  • The DB Time was 5.047 seconds compared to 1.458 seconds
  • Average active sessions and user calls were 0.84 and 1120 respectively while these numbers were 0.29 and 112 in the default case. Incidentally user calls is also 112 during capture, the same as the non-scaled up case. Since the scaled up factor was 10, the number of user calls increased to 10 times.

As you can clearly see, the scaled up multiplier increased the number of user sessions by that number. The test not only replayed the activities captured in production faithfully but scaled them up as well to check whether the system will be able to handle it.

There is something important you should keep in mind: The workload captured was just a SELECT query. Had there been DMLs, only one set of replays would have issued those statements and the others would have replayed only SELECTs. Running the same delete statement by 5 different replay clients would not have resulted in 5 deletes and 5 insert statements probably would have resulted in four failures due to primary key violation.

Also, consider this: say you have captured all the workload from the production system and replayed it on a test system. After some rounds of testing and tuning, you promoted the test system to production. Suppose one application still needs some further schema changes (such as additional indexes or different partitioning) and naturally you want to make sure those changes will enhance the performance and, more important, will not cause additional issues. Replaying the workload one more time with those schema changes in place is the most definitive way to address that concern.

How do you make sure you replay the activities of only a specific user, e.g. SH? One option is to capture the workload one more time from production by adding a filter that captures the activities of SH only and replaying that new workload on the test system. But this approach may not be feasible due to many causes:

  • The new workload has been captured as of a timeframe different from that in the test system making the replay ineffective.

  • Capturing the new workload may not be even possible because the old production is no longer working and the new production (the old test) has no activity related to SH schema.

  • Even if it were possible to capture the workload once again, you may not want to do it to avoid the risk of affecting production performance, or

  • Other logistical challenges like change control restrictions

What can you do?

In Oracle Database 11g Release 2, you have a simple choice: just replay selective workload during the replay phase. So, not only can you apply a filter during capture, you can apply a filter during replay as well. Let’s see how this works.

During the replay, you can create filters and use them. Let’s create a filter to use only the SH user:

begin
        dbms_workload_replay.add_filter (
                fname      => 'SH_ONLY',
                fattribute => 'USER',
                fvalue     => 'SH'
        );
end;
/


You can define any number of filters on many users. What’s more, you can define filters on different attributes, not just usernames. In the parameter FATTRIBUTE, you can use these values and in the FVALUE parameter you can use the value of your interest. Here are the possible values for attributes:


USER 
 the usernames of the calling users
MODULE 
 the module
ACTION 
 the action
PROGRAM 
 the program
SERVICE 
 service_name of the calling user
CONNECTION_STRING 
 the connect string used (like the one you use in tnsnames.ora file) 


After adding the filter, create a filter set:


begin
        dbms_workload_replay.create_filter_set (
                replay_dir       => 'RATCATURE',
                filter_set       => 'SH_ONLY',
                default_action   => 'INCLUDE'
        );
end;
/


Note the parameter default_action, set to INCLUDE, which is the default. It specifies the default action of the filter set. In this case INCLUDE means that the filters are included, i.e. SH user is included in the workload that is replayed. Had you wanted just the opposite, i.e. running the action of all the users except SH, you would have used EXCLUDE as a value of this parameter.

Once a filter set is created, you can then use it in replay. After the database is initialized prior to the replay:


begin
        dbms_workload_replay.initialize_replay (
                replay_name     => 'REPLAY1',
                replay_dir      => 'RATCATURE'
);
end;
/


You can use the filter set:


begin
        dbms_workload_replay.use_filter_set (
                filter_set     => 'SH_ONLY'
        );
end;
/


The ability to replay a subset of the workload is very useful in cases where you perform selective testing – in specific schemas, service names and so forth. It saves you from capturing the workload multiple times. Ideally you capture the entire workload of production only once and then replay selectively to test various components independently.

 

Use Cases

Database Parameter Change—Consider this scenario: You are wondering if you should change the default value of the parameter db_file_multiblock_read_count from 16 to 256. Well, is 256 good, or should you set it to 128? Or, should it be 64 or 32? The choices are finite but the impact may be limitless; changing the value has a profound impact on the optimizer so what might help a query can break 100 others. How can you determine the optimal value of the parameter?

Database Replay comes in very handy in this situation. You can capture the workload from the production system, move the captured load into a different test system, set the db_file_multiblock_read_count to 32, and then replay the workload. Then you can flashback the database to the original state, set the value to 64, and replay the same workload. You can re-execute this cycle of flashing back, setting the value, and replaying the captured load for all possible values of the parameter. In each of the replays, you will run the AWR reports before and after each replay and compare them. You can then choose the value of the parameter that yields the best overall result. Without Database Replay, it would have been impossible to determine the best value.

OS Upgrades—You are planning to upgrade the OS or even apply a small patch to fix an I/O problem, but how can you ensure that it will not break anything or introduce some other problem? Simple: Just capture the load and replay it in a test system where the patch is applied. This technique applies to kernel parameter changes as well.

Applying Patches—Say you hit a bug and there is a patch available for it. But you are not sure what impact it will have on existing operations and, of course, you and 1,000 others in your organization would give anything to find out. Database Replay is your answer.

Debugging—There's always a pesky program (or programs?) that throw out some results that you don't expect. Fortunately, debugging has never been easier with the Database Replay. Just capture the workload during which the programs run, move to a new system, change the program logic to put in some debugging information, replay the workload, analyze the output, and look like a hero. If it doesn't work the first time, don't lose heart. Repeat the process (from the replay onward; you don't need to capture again) until you find the solution.

Object Changes—You want to add an index or convert an index from b-tree to bitmap. What impact will it have on the INSERT statements? And where? Don't speculate; just get the captured workload and replay it in the test system.

Database Upgrades—This is the holy grail of change assurance. The time has come for upgrading to Oracle Database 11g. The zillion-dollar question is: Will all your applications work just as well, or even better? Instead of speculating, just capture the workload from Oracle Database 10g and replay it in Oracle Database 11g. You are not testing some synthetic transactions on the new version—rather, you are testing the very same SQL your applications use every day. If something doesn't go as planned, tune it in the new system until you are absolutely satisfied with the outcome.

Platform Changes—Suppose you want to migrate your database platform from Solaris to HP-UX, where asynch I/O is not available for filesystems. Will the performance be same? Why speculate? Just capture the workload in Solaris and replay in HP-UX.

Conversion to Oracle Real Application Clusters (RAC)—This is a common question: you are planning to convert the database from a single instance to a RAC one. Will the applications behave the same way? The only way to find out is to run an actual workload, capture it, and replay in the RAC database.

Conclusion

Change is never painless but it doesn't have to be unbearable either. You can mitigate many risks by capturing the exact activities your end users put into the system using the new Database Replay tool and then replay them on a test system to accurately gauge the impact of change, all through a few mouse clicks and key strokes. Remember, you can test the functionality of applications as well, not just performance.

Back to Series TOC