Fast-start checkpointing refers to the periodic writes that the database writer
(DBWn) processes perform to write changed data blocks from the Oracle
buffer cache to disk and advance the thread checkpoint. Setting the database
parameter FAST_START_MTTR_TARGET to a value greater than zero enables the fast-start
checkpointing feature. Fast-start checkpointing should always be enabled for
the following reasons:
- It reduces the time required for cache recovery, and makes instance recovery
time-bounded and predictable. This is accomplished by limiting the number
of dirty buffers (data blocks which have changes in memory that still need
to be written to disk) and the number of redo records (changes in the database)
generated between the most recent redo record and the last checkpoint.
- Fast-start checkpointing eliminates the bulk writes and corresponding I/O spikes
that traditionally occur with interval-based checkpoints, providing a smoother,
more consistent I/O pattern that is more predictable and easier to manage.
If the system is not already near or at its maximum I/O capacity, fast-start
checkpointing will have a negligible impact on performance. Although fast-start
checkpointing results in increased write activity, there is little reduction
in database throughput, provided the system has sufficient I/O capacity.
If determining the time to recover from an instance failure is a necessary
component for reaching required service levels, then FAST_START_MTTR_TARGET
should be set to the desired mean time to recover (MTTR), in seconds. For example, if service
levels dictate that when a node fails, instance recovery time can be no more
than 3 minutes, FAST_START_MTTR_TARGET should be set to 180.
When enabling fast-start checkpointing, the following initialization parameters
should be removed or disabled (set to 0):
- LOG_CHECKPOINT_INTERVAL
- LOG_CHECKPOINT_TIMEOUT
- FAST_START_IO_TARGET
Although LOG_CHECKPOINT_INTERVAL and LOG_CHECKPOINT_TIMEOUT can also be used
to control checkpointing to some degree (hence influencing MTTR), neither takes
into consideration the number of data blocks that correspond to the amount of
redo generated since the last checkpoint, nor do they consider the time it actually
takes to read redo blocks or process data blocks during recovery. This results
in checkpoint behavior that remains static as the environment changes, making
it unfeasible to target a specific MTTR.
When fast-start checkpointing is enabled, Oracle automatically maintains the
speed of checkpointing so that the requested MTTR is achieved. Setting a non-zero
value for LOG_CHECKPOINT_INTERVAL or LOG_CHECKPOINT_TIMEOUT interferes with
fast-start checkpointing, resulting in a different MTTR than expected.
Fast-Start Checkpointing and Performance
When using fast-start checkpointing, performance may be influenced in the following
ways (as compared to typical interval-based checkpointing):
- Instance and crash recovery times are reduced significantly
- Database throughput is not affected unless an aggressive FAST_START_MTTR_TARGET
setting is used
- The number of physical writes to the disk increases
- The I/O profile is smoother, more consistent, and more predictable
Overview of MTTR and Crash Recovery Components
FAST_START_MTTR_TARGET is designed to control the time it takes the database
to do crash recovery for a single instance. Crash recovery implies that there
are no running instances and that an instance restart is a necessary step. The
main components of crash recovery are:
- Instance startup
- Opening of all database files
- Cache recovery (rolling forward)
- Transaction recovery (rolling back)
There is a timing associated with each component. However, since transaction
recovery is done by SMON in the background along with normal user activity after
the database is opened, it is not factored into fast-start checkpointing calculations.
The timings for instance startup and the opening of all datafiles are taken
from the actual values for the currently running instance. For example, if it
takes 50 seconds to start up the instance and 10 seconds to open and set up
all the datafiles, those are the values used in calculating how aggressively
the checkpoint needs to advance in order to ensure MTTR is within the FAST_START_MTTR_TARGET
setting.
The time required to do cache recovery is determined by two factors:
- The amount of redo (changes in the database) that needs to be processed
during recovery. This is determined by the number of redo log blocks between
the last thread checkpoint and the end of the redo log.
- The number of data blocks that need to be processed during recovery, which
is determined by the number of dirty blocks (changes to blocks in memory which
have not yet been written to disk) in the cache at the time of the crash.
With fast-start checkpointing, Oracle automatically advances the thread checkpoint
to control the amount of redo needed for recovery and limit the number of dirty
buffers remaining in the cache so that recovery time is bounded. For example,
in a large environment it may take 50 seconds to start up the instance and 10
seconds to open and set up all the datafiles. The final component is cache recovery,
or the roll-forward phase. With FAST_START_MTTR_TARGET set to 180, those 60 seconds
leave 120 seconds of the target, so the Oracle server uses fast-start checkpointing
to limit the cache recovery time to 120 seconds. Oracle maintains statistics about previous recoveries to estimate how
long a future recovery would take. The statistics are weighted so that the
estimates become more accurate as more, and larger, recoveries are completed. This
information is stored in the control file.
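The arithmetic behind the 180-second example can be sketched in a few lines of Python. This is only an illustration of how the target is divided among the crash recovery components described above; the function name is hypothetical and the calculation is not Oracle's internal algorithm.

```python
def cache_recovery_budget(mttr_target_s, startup_s, file_open_s):
    """Seconds left for cache recovery (roll forward) within the MTTR target.

    Transaction recovery (rolling back) is left out because it runs in the
    background after the database is opened.
    """
    budget = mttr_target_s - startup_s - file_open_s
    # Assumption: a budget below zero is reported as 0, meaning the target
    # is already consumed by instance startup and file opening.
    return max(budget, 0)


# Figures from the example: target 180 s, startup 50 s, file open 10 s.
print(cache_recovery_budget(180, 50, 10))  # prints 120
```

A smaller target leaves less time for roll forward, so the thread checkpoint has to advance more aggressively to stay within it.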
Fast-Start Advisory
Starting in Oracle9i Release 2 (9.2), MTTR advisory is available to help
you evaluate the effect of different MTTR settings on system performance in
terms of extra physical writes.
How MTTR Advisory Works
When MTTR advisory is enabled, after the system runs a typical workload for
a while, you can query V$MTTR_TARGET_ADVICE, which tells you the ratio of estimated
number of cache writes under other MTTR settings to the number of cache writes
under the current MTTR. For instance, a ratio of 1.2 indicates 20% more cache
writes. By looking at the different MTTR settings and their corresponding cache
write ratio, you can decide which MTTR value fits your recovery and performance
needs. V$MTTR_TARGET_ADVICE also gives the ratio on total physical writes (including
direct writes), and the ratio on total I/Os (including reads).
Enabling MTTR Advisory
Enabling MTTR advisory involves setting two parameters:
- STATISTICS_LEVEL
- FAST_START_MTTR_TARGET
Make sure that STATISTICS_LEVEL is set to TYPICAL or ALL. To enable MTTR advisory,
set the initialization parameter FAST_START_MTTR_TARGET to a nonzero value.
If FAST_START_MTTR_TARGET is not specified, then MTTR advisory will be OFF.
When MTTR advisory is ON, it simulates checkpoint queue behavior under five
different MTTR settings: the current MTTR, and 0.1, 0.5, 1.5, and 2 times the
current MTTR.
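The ratios that the advisory reports for each simulated setting, described under How MTTR Advisory Works, can be read as percentage changes relative to the current MTTR. The following Python sketch shows that reading and one possible way to pick a target from the results. The function names are hypothetical, and the rows are made-up values for illustration only, not measurements from any system.

```python
def extra_writes_percent(write_ratio):
    """Convert an advisory ratio into a percentage change versus the current MTTR."""
    return round((write_ratio - 1.0) * 100, 1)


def smallest_acceptable_target(rows, max_ratio):
    """Return the lowest MTTR target whose cache-write ratio is at most max_ratio.

    rows is an iterable of (mttr_target_seconds, cache_write_ratio) pairs.
    Returns None when no row qualifies.
    """
    acceptable = [target for target, ratio in rows if ratio <= max_ratio]
    return min(acceptable) if acceptable else None


# Made-up advisory rows (illustration only): (MTTR target in seconds, ratio).
hypothetical_rows = [(18, 3.1), (90, 1.6), (180, 1.0), (270, 0.8), (360, 0.7)]

print(extra_writes_percent(1.2))                           # prints 20.0
print(smallest_acceptable_target(hypothetical_rows, 2.0))  # prints 90
```

In this sketch, tolerating up to twice the current cache writes would allow the target to be halved; the acceptable ratio is a judgment that depends on the spare I/O capacity of the system.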
Viewing MTTR Advisory
Starting in Oracle9i Release 2 (9.2), a dynamic performance view is provided for viewing
statistics or advisories collected by MTTR advisory. Enterprise Manager also
provides a graph to view MTTR statistics, which maps MTTR target values against
changes in total I/O. The MTTR graph from Oracle Database 10g Enterprise Manager
is shown on the right.
If MTTR advisory has been turned on, V$MTTR_TARGET_ADVICE shows the advisory
information collected. Usually this view shows five rows, corresponding to the
current MTTR, 0.1 times the current MTTR, 0.5 times the current MTTR, 1.5 times
the current MTTR and 2 times the current MTTR. However, if one or more of the
5 values are less than the smallest MTTR target the system can sustain, their
corresponding rows are replaced with a single row corresponding to the smallest
MTTR target the system can have. Similarly, if one or more of the 5 values are
larger than the worst-case MTTR target the system can have, their corresponding
rows are replaced with a single row corresponding to the worst-case MTTR target
the system can have. If MTTR advisory is currently OFF, the view shows information
collected the last time MTTR advisory was on.
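The row-selection rule described above can be sketched as follows. This is a minimal illustration, assuming the smallest and worst-case MTTR targets are already known; the function name, the rounding to whole seconds, and the bounds of 100 and 300 seconds are assumptions made for the example, not values taken from Oracle.

```python
def advisory_targets(current_s, smallest_s, worst_case_s):
    """List the MTTR targets (in seconds) that the advisory rows would cover.

    Candidates are 0.1, 0.5, 1, 1.5, and 2 times the current MTTR. Candidates
    below the smallest sustainable target collapse into one row at that
    target, and candidates above the worst-case target collapse into one row
    at the worst-case target.
    """
    multiples = (0.1, 0.5, 1.0, 1.5, 2.0)
    targets = []
    for multiple in multiples:
        candidate = round(current_s * multiple)  # assumption: whole seconds
        clamped = min(max(candidate, smallest_s), worst_case_s)
        if clamped not in targets:  # several clamped candidates share one row
            targets.append(clamped)
    return targets


# Current MTTR of 180 s with hypothetical bounds of 100 s and 300 s.
print(advisory_targets(180, 100, 300))  # prints [100, 180, 270, 300]
```

Here the 18-second and 90-second candidates both fall below the assumed minimum and share a single row, and the 360-second candidate is replaced by the worst-case row, so four rows result instead of five.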