on Oracle Linux
Adapted from an Oracle white paper written by Rick Stehno
Published April 2014
This article describes how to obtain performance gains and improve response times by combining Oracle's Sun Flash Accelerator F40 PCIe Card with the Database Smart Flash Cache feature of Oracle Database running on Oracle Linux with the Unbreakable Enterprise Kernel.
These are the major tasks you'll need to perform:
Note: The capability described in this article is supported only on the Oracle Linux and Oracle Solaris operating systems beginning with Oracle Database 11g Release 2. The procedures in this article apply to Oracle Linux.
Online transaction processing (OLTP), data warehousing (DW), and analytics are typical uses for Oracle Database. OLTP and DW require fast response times and high throughput, making it difficult for database administrators (DBAs) to maintain and scale their infrastructure as the number of users grows and the amount of data increases. While performance bottlenecks might appear in several areas, including the network and the processor, they are most often caused by slow hard disk drives.
Flash-based storage provides performance that falls between the performance levels of hard disk drives and DDR3 memory. For example, the typical response time for a small data read from a hard disk drive is 5 milliseconds. Flash-based devices do this in 50 microseconds to 300 microseconds. Initial implementations of flash-based drives (SSDs, or solid-state drives) were intended to replace a hard disk drive in direct-attach storage or RAID subsystems. Mounting SSDs on a PCIe card is a recent innovation that alleviates throughput constraints that are also caused by the storage interface.
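To put the latency figures above in proportion, the following is a minimal arithmetic sketch. The function name `speedup` is hypothetical, and the inputs are just the 5 millisecond and 50 to 300 microsecond figures quoted above, expressed in microseconds.

```sh
# speedup HDD_US FLASH_US
# Prints how many times faster the flash read is (integer, rounded down).
speedup() (
  hdd_us=$1
  flash_us=$2
  echo $(( hdd_us / flash_us ))
)

speedup 5000 50     # best case quoted above
speedup 5000 300    # worst case quoted above
```

Under these assumptions, a small read from flash completes somewhere between roughly 16 and 100 times faster than the same read from a hard disk drive.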
Sun Flash Accelerator F40 PCIe Card offers 400 GB capacity with over 149,000 random input/output operations per second (IOPS) and 2.1 GB/sec bandwidth performance in a single low-profile PCIe card. Its low latency and high random IOPS performance result in fast response and increased I/O throughput. It uses advanced onboard controllers for enhanced reliability and low CPU overhead. It presents itself to the operating system as a flash card with four SAS drives that can be used for nonpersistent (cache) and persistent (storage) data.
Sun Flash Accelerator F40 PCIe Card is designed for a high level of reliability and compatibility with Oracle hardware and storage systems and Oracle software. It's ideal for use with the Database Smart Flash Cache feature available with Oracle Database 11g Release 2 and later releases.
Oracle Database 11g Release 2 Enterprise Edition allows you to use flash devices to increase the effective size of the Oracle Database buffer cache (Level 2 cache) without adding more main memory. This capability is referred to as Database Smart Flash Cache.
Figure 1 shows a system with and without Database Smart Flash Cache. As you can see, the system treats Sun Flash Accelerator F40 PCIe Card as a transparent extension of the buffer cache. Because frequently accessed data is cached in the card, the database does not have to wait for data to arrive from slow hard disk drives. I/O service times can be up to 15 times faster.
Figure 1. A system with and without Database Smart Flash Cache.
When the database requests data I/O, the system first looks in the buffer pool. If the data is not found, the system then looks in the Database Smart Flash Cache buffer. If it does not find the data there, only then does it look in disk storage. Not only are performance and response times greatly improved, but also much better IOPS/$, IOPS/GB, IOPS/Watt, and server utilization efficiency are achieved.
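The lookup order described above can be illustrated with a toy model. This is only a sketch of the ordering, not of how Oracle Database implements it: the block IDs, the two list variables, and the function names are all hypothetical.

```sh
# Hypothetical toy model: each tier is a space-separated list of block IDs.
BUFFER_CACHE="101 102"
FLASH_CACHE="103 104"

# contains LIST ITEM: succeeds if ITEM is one of the words in LIST.
contains() (
  list=$1
  item=$2
  for b in $list; do
    [ "$b" = "$item" ] && exit 0
  done
  exit 1
)

# lookup_block ID: prints the first tier that holds the block,
# checking the buffer cache, then the flash cache, then falling back to disk.
lookup_block() (
  block=$1
  if contains "$BUFFER_CACHE" "$block"; then
    echo "buffer cache"
  elif contains "$FLASH_CACHE" "$block"; then
    echo "flash cache"
  else
    echo "disk"
  fi
)

lookup_block 101
lookup_block 104
lookup_block 999
```

Only the third request in this example would have to wait for disk storage.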
Oracle recommends a flash cache size of 4 to 10 times the Oracle Database System Global Area (SGA) size. This amount will allow you to offload most of your disk I/O to flash. A block read from disk is stored in the flash cache once it is evicted from the database buffer cache. All subsequent reads of that block are then done from flash. This is a clean (read) cache, because any dirty blocks (writes) are flushed to disk. This approach provides the necessary data protection, because any changes are already written to disk. Therefore, no RAID or mirroring is required.
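As a quick worked example of the 4x to 10x sizing guideline, the sketch below computes the range for a given SGA size. The function name `flash_cache_range_gb` and the 25 GB SGA are hypothetical values chosen for illustration.

```sh
# flash_cache_range_gb SGA_GB
# Prints the lower and upper bounds (in GB) of the 4x to 10x guideline.
flash_cache_range_gb() (
  sga_gb=$1
  echo "$(( sga_gb * 4 )) $(( sga_gb * 10 ))"
)

flash_cache_range_gb 25
```

For an assumed 25 GB SGA, the guideline suggests a flash cache of between 100 GB and 250 GB.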
Both OLTP and DW environments can experience improved transaction throughput and application response times.
Sun Flash Accelerator F40 PCIe Card is a block device optimized for 8K block sizing and alignment (consistent with that of Oracle databases). This section explains actions you can take to tune the card for maximum performance in an Oracle Linux environment with Unbreakable Enterprise Kernel.
The following steps configure the card as one file system that one database can use. Other options would be to create multiple aligned partitions on the card and allocate these partitions to other databases residing on the server for their own Database Smart Flash Cache.
To align the card, run the sfdisk command to create the aligned partition. The following sfdisk command will align the card on an 8 KB boundary:
echo "16,," | sfdisk -uS /dev/sda
Run the fdisk command after creating the aligned partition to produce the following results on Sun Flash Accelerator F40 PCIe Card:
fdisk -lu /dev/sda
Disk /dev/sda: 400.0 GB, 399999762432 bytes
255 heads, 63 sectors/track, 48630 cylinders, total 781249536 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sda1 16 781249535 390624760 83 Linux
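The alignment can be checked arithmetically from the Start column of the fdisk output: a partition is 8 KB aligned when its starting byte offset is a multiple of 8192. The following is a minimal sketch; the function name `is_aligned` is hypothetical, and the 512-byte sector size is taken from the fdisk output above.

```sh
# is_aligned START_SECTOR [SECTOR_BYTES] [ALIGN_BYTES]
# Succeeds when the partition's starting byte offset is a multiple of
# ALIGN_BYTES (default 8192, with 512-byte sectors).
is_aligned() (
  start_sector=$1
  sector_bytes=${2:-512}
  align_bytes=${3:-8192}
  [ $(( start_sector * sector_bytes % align_bytes )) -eq 0 ]
)

is_aligned 16 && echo "start sector 16: aligned on 8 KB"
is_aligned 63 || echo "start sector 63: not aligned on 8 KB"
```

Sector 16 corresponds to byte offset 16 x 512 = 8192, which is why the sfdisk command above starts the partition there. Sector 63, a common legacy default, is used here only as a counterexample.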
Use either of the following commands to create an EXT-2 file system, which does not journal (the noatime option is described below):
mkfs -t ext2 /dev/sda1
or
mkfs.ext2 /dev/sda1
When mounting the new non-journaling device, use the following command:
mount -t ext2 -o noatime /dev/sda1 /mountpoint
Use the following command to create an EXT-4 file system:
mkfs -t ext4 /dev/sda1
or
mkfs.ext4 /dev/sda1
After creating the EXT-4 file system, journaling will be on by default. Verify this by executing the following command:
tune4fs -l /dev/sda1 | grep 'Filesystem features'
The has_journal feature should be listed. To turn off journaling, execute the following:
tune4fs -O ^has_journal /dev/sda1
To verify that journaling is disabled, execute the following command to make sure has_journal is not listed as enabled:
tune4fs -l /dev/sda1 | grep 'Filesystem features'
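The journaling check can be scripted by testing the "Filesystem features" line for the has_journal word. The sketch below works on a line of text passed as an argument so that it can be tried without a device. The function name `has_journal` and the sample feature lists are hypothetical; real output will vary by system.

```sh
# has_journal FEATURES_LINE
# Succeeds when the features line contains the word "has_journal".
has_journal() (
  features_line=$1
  case " $features_line " in
    *" has_journal "*) exit 0 ;;
    *) exit 1 ;;
  esac
)

# Hypothetical sample line, shaped like the grep output above.
sample='Filesystem features:      has_journal ext_attr dir_index filetype extent'
if has_journal "$sample"; then
  echo "journaling is enabled"
else
  echo "journaling is disabled"
fi

# On a real system the line would come from the command used above, e.g.:
#   has_journal "$(tune4fs -l /dev/sda1 | grep 'Filesystem features')"
```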
When mounting the new EXT-4 device, use the following command (the noatime option is described below):
mount -t ext4 -o noatime /dev/sda1 /mountpoint
cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq
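In the scheduler file, the kernel marks the active I/O scheduler with square brackets, so the output above shows that deadline is in use. A script can extract the active name as sketched below. The function name `active_scheduler` is hypothetical, and the function takes the line as an argument so it can be tried without the device present.

```sh
# active_scheduler LINE
# Prints the scheduler name enclosed in square brackets, if any.
active_scheduler() (
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
)

active_scheduler "noop anticipatory [deadline] cfq"

# On a real system the line would come from the file shown above, e.g.:
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
```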
Specify the noatime file system mount option in the /etc/fstab file. This option eliminates the need for the system to write to the file system when objects are only being read. This option also enables faster access to the files, plus it causes less wear on Sun Flash Accelerator F40 PCIe Card.
This example shows how the /etc/fstab entry invokes the noatime option:
/dev/sda1 /osfc ext2 defaults,noatime 1 2
An alternative to invoking the noatime option is to specify it when executing the mount command (see Step 3 above for an example of combining the noatime option with the ext2 file system option):
mount -o noatime /dev/sda1 /osfc
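To confirm that an /etc/fstab entry carries the noatime option, a script can split the entry into fields and inspect the fourth one, which holds the mount options. This is a minimal sketch; the function name `fstab_has_noatime` is hypothetical.

```sh
# fstab_has_noatime ENTRY
# Succeeds when the options field (4th field) of an fstab entry
# includes "noatime".
fstab_has_noatime() (
  entry=$1
  set -f           # disable globbing before splitting the entry
  set -- $entry    # split into whitespace-separated fields
  case ",$4," in
    *,noatime,*) exit 0 ;;
    *) exit 1 ;;
  esac
)

if fstab_has_noatime "/dev/sda1 /osfc ext2 defaults,noatime 1 2"; then
  echo "noatime is set"
fi
```

Wrapping the options field in commas before matching avoids false positives from longer option names that merely contain the string noatime.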
The nr_requests parameter will need to be modified to a value that is the same as or larger than the new queue depth value. Here are examples of modifying both the nr_requests and queue_depth parameters:
echo "256" > /sys/block/sda/queue/nr_requests
echo "256" > /sys/block/sda/device/queue_depth
To ensure these settings persist across reboots, place these commands in the
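The relationship between the two values (nr_requests must be the same as or larger than the queue depth) can be checked with a small helper before or after applying the settings. This is a sketch only; the function name `check_queue_settings` is hypothetical.

```sh
# check_queue_settings NR_REQUESTS QUEUE_DEPTH
# Prints "ok" when nr_requests is at least as large as queue_depth;
# otherwise prints a warning and fails.
check_queue_settings() (
  nr_requests=$1
  queue_depth=$2
  if [ "$nr_requests" -ge "$queue_depth" ]; then
    echo "ok"
  else
    echo "nr_requests ($nr_requests) is smaller than queue_depth ($queue_depth)"
    exit 1
  fi
)

check_queue_settings 256 256

# On a real system the values could be read back from sysfs, e.g.:
#   check_queue_settings "$(cat /sys/block/sda/queue/nr_requests)" \
#                        "$(cat /sys/block/sda/device/queue_depth)"
```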
To provide expanded capacity for the Database Smart Flash Cache, multiple Sun Flash Accelerator F40 PCIe Cards can be deployed using volume management software. Multiple cards can be mirrored, but in the following example, two cards were installed using Oracle Automatic Storage Management to expand the cache area capacity.
A benefit of using an Oracle Automatic Storage Management diskgroup for Database Smart Flash Cache is that multiple databases on the server will be able to share this diskgroup to create multiple Database Smart Flash Caches for the different databases.
To configure Oracle Automatic Storage Management over two Sun Flash Accelerator F40 PCIe Cards, do the following:
Oracle Linux provides an Oracle Automatic Storage Management tool to create the Oracle Automatic Storage Management disks and the diskgroup.
/usr/sbin/oracleasm createdisk D1 /dev/sda1
/usr/sbin/oracleasm createdisk D2 /dev/sdb1
SQL> create diskgroup Flash disk 'ORCL:D1', 'ORCL:D2' external redundancy;
Another option for creating an Oracle Automatic Storage Management diskgroup is to use the Oracle ASM Configuration Assistant, which uses a graphical user interface to create the diskgroup.
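When more than two cards are installed, the createdisk commands follow the same pattern for each partition. The sketch below only prints the commands for review rather than running them; the function name `asm_createdisk_cmds` and the D1, D2, ... naming scheme (continued from the example above) are assumptions.

```sh
# asm_createdisk_cmds PARTITION...
# Prints one oracleasm createdisk command per partition, naming the
# disks D1, D2, ... in argument order. Nothing is executed.
asm_createdisk_cmds() (
  n=1
  for part in "$@"; do
    printf '/usr/sbin/oracleasm createdisk D%s %s\n' "$n" "$part"
    n=$(( n + 1 ))
  done
)

asm_createdisk_cmds /dev/sda1 /dev/sdb1
```

Printing the commands first makes it possible to confirm the device names before anything is written to the cards.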
This section describes the changes needed to enable and configure an Oracle Database 11g Release 2 database with the Database Smart Flash Cache feature, which is supported by Oracle Linux and by Oracle Solaris.
Setting up the Database Smart Flash Cache feature is very easy and requires just a few steps to aggregate the Sun Flash Accelerator F40 PCIe Cards to a pool, specify the path to the flash devices, and specify the size of the flash devices, as follows.
Use a volume manager, for example, Solaris Volume Manager (SVM), which is free, Veritas Volume Manager (VxVM), or Oracle Automatic Storage Management.
No mirroring is needed, since it is a cache.
Specify the path to the flash card(s), for example:
db_flash_cache_file = </dev/svm-Flashcard or +FLASH/filename>
Specify the size of the flash area:
db_flash_cache_size = <flash cache size>
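The two parameters above are typically set with alter system statements. As a sketch, the helper below prints those statements for a given file and size so they can be reviewed and pasted into SQL*Plus. The function name `flash_cache_params` is hypothetical, and the path and size in the example call are illustrative values only.

```sh
# flash_cache_params FILE SIZE
# Prints the two alter system statements that set the flash cache
# file and size in the spfile. Nothing is executed.
flash_cache_params() (
  file=$1
  size=$2
  printf "alter system set db_flash_cache_file='%s' scope=spfile;\n" "$file"
  printf 'alter system set db_flash_cache_size=%s scope=spfile;\n' "$size"
)

flash_cache_params /osfc/oradata/osfc/flash.dbf 175g
```

Because the statements use scope=spfile, they take effect only after the database is restarted.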
While the Database Smart Flash Cache is dynamic and very efficient because it automatically migrates and evicts data as needed, DBAs have the option to pin objects/hot data to the Database Smart Flash Cache using the KEEP command. Normally a DBA would pin an object to the KEEP buffer pool, which resides in memory. By pinning an object to the Database Smart Flash Cache, real memory requirements are reduced, and performance is increased since more selected objects/hot data can then be accessed directly from the faster flash device instead of from a much slower disk. The syntax for pinning an object to the Database Smart Flash Cache is the following:
alter table|index <object_name> storage (flash_cache keep);
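To apply the syntax above to several objects, the statements can be generated from a list. This is a minimal sketch: the function name `flash_keep_stmt` and the object names `orders` and `orders_pk` are hypothetical.

```sh
# flash_keep_stmt KIND NAME
# Prints the statement that pins a table or index to the flash cache.
# KIND must be "table" or "index"; anything else is rejected.
flash_keep_stmt() (
  kind=$1
  name=$2
  case $kind in
    table|index)
      printf 'alter %s %s storage (flash_cache keep);\n' "$kind" "$name"
      ;;
    *)
      echo "unsupported object type: $kind" >&2
      exit 1
      ;;
  esac
)

flash_keep_stmt table orders
flash_keep_stmt index orders_pk
```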
Install the required database patch to enable the Database Smart Flash Cache:
After installing the patch and bouncing the database, the db_flash_cache_file and db_flash_cache_size parameters need to be set in the database to enable the Database Smart Flash Cache feature.
The following database settings were used in these tests when invoking the Database Smart Flash Cache:
SQL> alter system set db_flash_cache_file='/osfc/oradata/osfc/flash.dbf' scope=spfile;
SQL> alter system set db_flash_cache_size=175g scope=spfile;
SQL> show parameter flash
NAME TYPE VALUE
db_flash_cache_file string /osfc/oradata/osfc/flash.dbf
db_flash_cache_size big integer 175G
These commands enabled the Database Smart Flash Cache inside the database using file system /osfc and allocated 175 GB to the cache. The database was then bounced to enable it to use this cache.
To implement the Database Smart Flash Cache feature using multiple Sun Flash Accelerator F40 PCIe Cards, we installed and implemented Oracle Automatic Storage Management to create a diskgroup over multiple flash cards to increase the capacity of the Database Smart Flash Cache. For details on how to do this, see the "How to Configure Oracle Automatic Storage Management to Use Multiple Sun Flash Accelerator F40 PCIe Cards" section of this article.
The following database settings were used in the benchmarks when using multiple Sun Flash Accelerator F40 PCIe Cards with the Database Smart Flash Cache:
SQL> alter system set db_flash_cache_file='+FLASH/flash.dbf' scope=spfile;
SQL> alter system set db_flash_cache_size=250g scope=spfile;
SQL> show parameter flash
NAME TYPE VALUE
db_flash_cache_file string +FLASH/flash.dbf
db_flash_cache_size big integer 250G
The following database settings were used for all benchmarks:
Flash accelerates applications, increases productivity, and improves business responsiveness. Based upon the benchmarks that were executed for this article using Oracle's Sun Flash Accelerator F40 PCIe Card and the Database Smart Flash Cache feature of Oracle Database running on Oracle Linux with Unbreakable Enterprise Kernel, large performance gains were realized. Whether you are running Oracle Database or other I/O-intensive applications, similar performance gains and improved response times can be realized in the enterprise using the configuration presented in this article for workloads that
As a side benefit, implementing the Database Smart Flash Cache feature with Sun Flash Accelerator F40 PCIe Card reduces the hard disk IOPS for reads, and the reduction in IOPS for reads results in improved physical writes with less latency to disk. This not only improves application performance and response times, but increases server efficiency due to less storage I/O waiting.
These are significant benefits for customers running large databases. The Database Smart Flash Cache feature—used with Sun Flash Accelerator F40 PCIe Card and Oracle Linux with Unbreakable Enterprise Kernel or Oracle Solaris—provides a platform that can scale and perform to the demanding needs of growing enterprises.
Revision 1.0, 04/18/2014