by Art Licht
Published July 2012Pillar Axiom Storage System and Sun ZFS Storage Appliance
The case study described in this article explains how the Hybrid Columnar Compression feature of Oracle Database 11g Release 2 (126.96.36.199) was used with external NAS and SAN storage to evaluate the space and performance implications of using Hybrid Columnar Compression with external storage. This article also describes how you can enable the Hybrid Columnar Compression feature to achieve similar benefits.
The products used for the case study were Oracle's Sun ZFS Storage Appliance and Pillar Axiom storage system for SANs. The focus was on space savings as well as the effect that Hybrid Columnar Compression had on query performance. The investigation led to an analysis of the I/O differences between running queries on compressed versus uncompressed data.
The analysis showed that the Hybrid Columnar Compression feature can lead to a 19x reduction in required storage space and an average 6x increase in database query performance.
Oracle's Pillar Axiom 600 storage system provides integrated, dynamic management of I/O prioritization to ensure that critical applications perform optimally under any load condition. It is a modular storage platform built on three intelligent hardware assemblies—Pillar Axiom Slammers (storage controllers), Pillar Axiom Bricks (drive enclosures), and Pillar Axiom Pilot (management platform)—with a complete suite of data management capabilities, including support for the Oracle Database 11g Release 2 Hybrid Columnar Compression feature. In addition, Pillar Axiom Quality of Service technology allows you to prioritize data access based on the significance of key applications.
Oracle's Sun ZFS Storage Appliance family provides enterprise-class NAS capabilities and a complete suite of data services including snapshots, clones, replication, and industry-leading performance analytics, as well as native compression, deduplication, Oracle Database integration, and support for the Hybrid Columnar Compression feature.
To enable applications to run as efficiently as possible, Oracle Database has been optimized—through features such as Database Smart Flash Cache, partitioning, and compression—to reduce I/O to physical hard disk drives (HDDs).
There are multiple levels and types of compression. OLTP Table Compression, a part of the Oracle Advanced Compression Option of Oracle Database 11g, is well suited for compressing data that is frequently updated. Hybrid Columnar Compression is well suited to historical/archive data that needs to be online and accessible. For more information on the Oracle Advanced Compression Option or OTLP Table Compression, see "Advanced Compression Option with Oracle Database 11g."
Compression is an ideal technology for reducing required storage capacity. Deduplication is another approach for reducing required storage capacity. Deduplication is effective with backups when a lot of the data is the same, which often occurs in full backups that are run week after week. Deduplication can be an efficient way to turn full backups into infinite incrementals. Deduplication has not proven as effective for OLTP database backups.
Deduplication has also proven effective with online storage for virtual desktop environments when multiple copies are typically made of the same operating system. Where deduplication has not been proven effective is in the more traditional, online application space. This is where Oracle Database compression excels.
Data can be compressed in multiple ways. Within Oracle Database, there are multiple types of compression with variants of each. OLTP Table Compression can be applied at the partition, table, and tablespace level. Hybrid Columnar Compression is typically applied to the appropriate tables or table partitions. Table 1 lists some of the database-aware compression technologies available for use with Oracle Database.Table 1. Compression Technologies
|Table Compression Method||Create/Alter Table Syntax||Direct-Path Insert||Notes|
|Basic compression|| ||Rows are compressed with basic compression.||Does not maintain compression for Data Manipulation Language (DML) INSERT/UPDATE operations after bulk load.|
|OLTP compression|| ||Rows are compressed with OLTP compression.||Maintains compression for DML operations.|
|Warehouse compression (Hybrid Columnar Compression)|| ||Rows are compressed with warehouse compression.||Ideal for active data warehouses.|
|Archive compression (Hybrid Columnar Compression)|| ||Rows are compressed with archive compression.||Ideal for archive/historic data.|
Oracle Database-aware technologies have the advantage of compressing the data before it's sent to the storage system, which generally results in the movement of less data and higher performance. Additionally, because the compression is fully integrated with Oracle Database, queries often can run directly on the compressed data.
Hybrid Columnar Compression technology is a new method for organizing data within a database block. As the name implies, this technology uses a combination of both row and columnar methods for storing data. This hybrid approach achieves the compression benefits of columnar storage, while avoiding the performance shortfalls of a pure columnar format.
Hybrid Columnar Compression is enabled when Oracle Database detects that either the Pillar Axiom storage system or the Sun ZFS Storage Appliance is attached to the database server. It is also enabled for Oracle databases on Oracle Exadata storage. FC and iSCSI connectivity are supported using Oracle Automatic Storage Management with the Pillar Axiom storage system. NFS connectivity via Oracle Database 11g Direct NFS Client is supported with the Sun ZFS Storage Appliance.
Oracle Automatic Storage Management is an integrated, high-performance database file system and disk manager based on the principle that the database, rather than an administrator, should manage storage. Oracle Automatic Storage Management eliminates the need for administrators to directly manage potentially thousands of Oracle Database files.
Oracle Database 11g Direct NFS Client is an optimized NFS client that provides simpler, faster, and more scalable access to NFS storage located on NAS storage devices. Direct NFS Client is built directly into the Oracle Database kernel.
The following sections describe the case study that was performed, the architecture that was used, and the results achieved.
Using data typical of a retail application, we created the database described in Table 2.Table 2. Database
| || |
The total storage space required for this database was 735 GB.
Multiple copies of the database were created on the Sun ZFS Storage Appliance as well as the Pillar Axiom storage system so that each database had exactly the same hardware resources. In fact, the two database copies on the Sun ZFS Storage Appliance existed on the same physical HDDs to ensure that any performance differences were due to Hybrid Columnar Compression.
After creating the uncompressed version of the data, we created an OLTP Table Compression version. Applying OLTP Table Compression to the same database tables reduced the space requirement, as shown in Table 3.Table 3. Space Requirements with OLTP Table Compression
| || |
The total storage space required for this database was 243 GB, which is a 3x reduction in overall storage space. This reduction is typical for OLTP Table Compression with the Advanced Compression Option of Oracle Database 11g, which typically provides 2x to 4x compression ratios.
Applying the Query High compression level of Hybrid Columnar Compression to the same database tables reduced the space requirement to those shown in Table 4.Table 4. Space Requirements with Hybrid Columnar Compression
| || |
The total storage space required for this database was 38 GB, which is a 19x reduction in overall storage space. This reduction is slightly higher than typical for Query High compression, which normally provides 6x to 10x compression ratios.
Data also can be compressed in the storage array. When the storage array compresses the data, the application or database is unaware that the data has been compressed—the compression is transparent to the application.
Storage array–based compression requires that the data be uncompressed to transfer it to the server. There are multiple compression algorithms deployed in various storage systems. For this test, we chose Sun ZFS Storage Appliance LZJB compression because of its light use of the CPU. LZJB is a lossless compression algorithm. It includes a number of improvements to the LZRW1 algorithm, which is a member of the Lempel-Ziv family of compression algorithms.
Figure 1 is a screenshot from the Sun ZFS Storage Appliance showing the compression settings. In order to run queries on the data, the data must be uncompressed and transferred to the database server.
Figure 1. Sun ZFS Storage Appliance Array-Based Compression Configured as LZJB
The total storage space required for this database was 143 GB. When LZJB was used on the OLTP Table Compression compressed database, the size was reduced to 131 GB, as shown in Table 5.Table 5. Impacts of Compression Types
|COMPRESSION TYPE||SIZE||SPACE SAVINGS|
|No compression||735 GB||NA|
|OLTP Table Compression (Advanced Compression Option)||243 GB||3.0x|
|Hybrid Columnar Compression||38 GB||19.3x|
|Sun ZFS Storage Appliance LZJB compression||143 GB||5.1x|
|LZJB plus Advanced Compression Option||131 GB||5.6x|
Database-aware compression technologies, such as Advanced Compression Option and Hybrid Columnar Compression, allow the data to be transferred between the storage and server in the native compressed format. The data remains compressed not only on disk, but it also remains compressed in the Database Smart Flash Cache, on the network, in the database server buffer cache, as well as during Oracle Recovery Manager (Oracle RMAN) backups or log shipping with Oracle Active Data Guard.
Advanced Compression Option and Hybrid Columnar Compression are specific to Oracle Database, Enterprise Edition. Array compression, such as LZJB in the Sun ZFS Storage Appliance, is available and supports any application environment.
Space saved by all compression techniques is related to the specific data being compressed and the size of the data set. For this test, a small amount of retail data was used that consisted of retail transactional data, cash register receipts, and other data typical of retail environments. In both the NAS (Sun ZFS Storage Appliance) and SAN (Pillar Axiom storage system) environments, the Hybrid Columnar Compression space savings were identical at 19x.
As part of the case study, we ran the Oracle Advanced Compression Advisor on each table with OLTP and with Query High compression. In all instances, its predictions were more than 95 percent accurate. You can run the Oracle Advanced Compression Advisor with any storage type. See the "Running the Oracle Advanced Compression Advisor" section for more information.
To evaluate the effect that Hybrid Columnar Compression had on query performance, we built two different environments. A NAS environment was set up, and the storage was connected to the server via Direct NFS Client. This environment was intentionally made small with limited I/O bandwidth in order to evaluate the effect of Hybrid Columnar Compression in a limited I/O environment. It used a small, single-tray Sun ZFS Storage Appliance with only 1 GbE.
Each database existed on the same set of HDDs within the Sun ZFS Storage Appliance. All HDDs were striped across the same Sun ZFS Storage Appliance pool, which was configured as RAID-Z1. All databases were hosted on the same physical server, an Intel four-socket Nehalem system. Each database existed in its own cloned virtual machine (Oracle VM 3.0), and each had the same configuration parameters (DRAM, CPU, SGA, and so on). The server was connected to the Sun ZFS Storage Appliance via 1 GbE. See Appendix A for the hardware configuration details.
To evaluate the performance benefits, we created six queries, which are listed in Appendix B. The queries varied in complexity and I/O versus computation. It should be noted that it is possible to build an extremely large storage system that would eliminate all I/O waits and, therefore, create a CPU-limited architecture. That would require large and expensive systems with many drives, flash storage, and possibly even multiple arrays. Hybrid Columnar Compression is able to make storage much more efficient, reducing the requirement for extra storage and extra arrays.
Sun ZFS Storage Appliance performance monitoring software showed a high cache hit rate and low disk utilization, typical of a healthy storage system. Figure 2 shows that disk activity was less than 10 percent utilization, on average.
Figure 2. Disk Activity
Figure 3 shows cache hit rates that appeared to be quite high—more than 90 percent and slightly more than 1,100 accesses/sec. The 1,100 accesses/sec equated to slightly more than 100 MB/sec, the limit of gigabit Ethernet.
Figure 3. Cache Hit Rates
The performance-limiting factor in this case was the gigabit connectivity. Because of this, the queries on the uncompressed data were throttled by I/O. A snapshot from a database automated workload repository (AWR) report showed a large amount of wait I/O, as shown in Table 6.Table 6. AWR Report with Large Amount of Wait I/O
|LOAD AVERAGE BEGIN||LOAD AVERAGE END||%USER||%SYSTEM||%WIO||%IDLE|
The same set of queries was then run on the Hybrid Columnar Compression compressed tables stored on the same eight 7,200 RPM drives of the Sun ZFS Storage Appliance. Table 7 shows the database AWR report for same time period.Table 7. AWR Report with Hybrid Columnar Compression
|LOAD AVERAGE BEGIN||LOAD AVERAGE END||%USER||%SYSTEM||%WIO||%IDLE|
The I/O wait time was eliminated. Sun ZFS Storage Appliance analytics showed that the I/O requests coming to the storage system were greatly reduced. With the queries on the uncompressed tables, the system averaged approximately 1,100 I/O accesses/sec. Queries on the Hybrid Columnar Compression data reduced the I/O requests to approximately 375/sec.
As a result of the reduced I/O, the load average dropped and the CPUs were able to do more work in less time. Figure 4 shows 375 cache accesses/sec, down from more than 1,100 accesses/sec.
Figure 4. Cache Accesses per Second
We ran three unique queries on the Sun ZFS Storage Appliance, which varied in complexity and also varied the amount of I/O versus processing required. Table 8 shows the results of the three different queries.Table 8. Query Results
|QUERY||UNCOMPRESSED||COMPRESSED BY HYBRID COLUMNAR COMPRESSION|
|Query 1||52 minutes, 0 seconds||10 minutes, 52 seconds|
|Query 3||2 hours, 22 minutes, 44 seconds||16 minutes, 50 seconds|
|Query 5||7 minutes, 6 seconds||1 minute, 6 seconds|
In all cases, the queries on the Hybrid Columnar Compression data were faster than on the uncompressed data. On average, queries ran 7x faster.
The second environment was a high-performing environment in which FC was deployed. The SAN environment was set up to have significantly more I/O capability. Instead of eight 7,200 RPM drives, the Pillar Axiom storage system had 48 FC 15,000 RPM drives. The storage was connected to the server using 4 Gb FC. FC LUNs were presented to the server and imported to Oracle Automatic Storage Management. The SAN environment had 4x the bandwidth of the NAS environment (4 Gb versus 1 Gb). Both disk configurations were single-parity RAID.
Table 9 shows the results of the three queries.Table 9. Query Results
|QUERY||UNCOMPRESSED||COMPRESSED BY HYBRID COLUMNAR COMPRESSION|
|Query 2||29 minutes, 6 seconds||6 minutes, 58 seconds|
|Query 4||30 minutes 45 seconds||5 minutes, 2 seconds|
|Query 6||12 minutes, 4 seconds||4 minutes, 18 seconds|
Even in this higher I/O environment, the CPUs were waiting on I/O, as shown in Table 10.Table 10. Results in Higher I/O Environment
In all cases, the queries on the Hybrid Columnar Compression tables were faster than on the uncompressed data—4.5x faster, on average. Again, we observed the I/O wait disappear.
Queries on uncompressed data that cause the CPUs to wait on I/O, as measured by
iowait, will see the greatest benefit from Hybrid Columnar Compression because it reduces the amount of I/O that has to be done. For example, Query 5 on the uncompressed data required 4.5 million physical reads, while the same query on the Hybrid Columnar Compression compressed data required only 137 K physical reads, as measured by
autotrace. Query 5 read a table that was 35 GB uncompressed and slightly more than 1.08 GB when Hybrid Columnar Compression Query High compressed it. That is a 32x reduction in table size. The physical I/O reduction from running the query is 4,503,701 to 137,416, which also is a 32x reduction.
Hybrid Columnar Compression is typically used mostly with query-only or historical data. OLTP Table Compression can be utilized with transactional, read/write data and can typically yield 2x to 4x space savings. Hybrid Columnar Compression can typically yield 10x to 50x space savings.
In this test environment, OLTP Table Compression yielded 3x space savings and Hybrid Columnar Compression yielded 19x space savings across the tables. Each compression type has different use cases. When both are used appropriately, the greatest benefit can be obtained.
As shown in Figure 5, you can use database partitioning to segment the table by date and then apply OLTP Table Compression to the transactional data in the 2011 and 2012 partitions, which can exist on any storage device. Then you can apply Hybrid Columnar Compression to the historical data.
Figure 5. Database Partitioning to Segment the Table
If the database is a 50 TB database that has compression rates similar to those used in this case study, applying only Hybrid Columnar Compression to the historical data would reduce the storage from 50 TB to 15 TB.
The table size reduction has additional benefits because Oracle Database features such as Oracle RMAN, Oracle Active Data Guard, and Database Smart Flash Cache operate on the Hybrid Columnar Compression compressed data. This enables backups to run faster and restores to be quicker, because there is less data to transfer.
To evaluate the effectiveness of using Hybrid Columnar Compression in your existing environment, run the Oracle Advanced Compression Advisor, which is a standard part of Oracle Database 11g, to determine the potential compression benefits.
There are four different levels of Hybrid Columnar Compression: Query High, Query Low, Archive High, and Archive Low. Oracle Advanced Compression Advisor can be run with all four levels.
For more details on the Oracle Advanced Compression Advisor, see the Oracle Advanced Compression Advisor Website.
The following sections describe how to enable the Hybrid Columnar Compression feature of Oracle Database for use with the Sun ZFS Storage Appliance and the Axiom Pillar storage system.
Refer to My Oracle Support Note 10404530 at support.oracle.com (requires registration).
The following is sample
fstab entry for an NFS-mounted ZFS share. Note that when Direct NFS (dNFS) is used, the database sets up the
wsize parameters, as needed.
<Appliance IP address>:<Appliance share mount point> nfs nfsvers=3,proto=tcp,hard,intr,rsize=<# of bytes>,wsize=<# of bytes>
oracle, enable dNFS on the Oracle Database server, by doing the following:
rm -f libodm11.so
ln -s libnfsodm11.so libodm11.so
[Oracle @ myhost] $ sqlplus / as sysasm SQL> alter system set compatible='188.8.131.52.0' scope=spfile sid='*'; SQL> alter diskgroup data_hcc set attribute 'compatible.asm'='184.108.40.206.0'; SQL> alter diskgroup data_hcc set attribute 'compatible.rdbms'='220.127.116.11.0';
storage.typeattribute to a value of
SQL> alter diskgroup data_hcc set attribute 'storage.type'='AXIOM';
Note: Here is a way to check that the disk group is really Pillar Axiom storage. If you execute the previous command on non-Pillar Axiom storage, error messages similar to the following are displayed:
ORA-15287: could not set disk group attribute storage.type due to incompatible disks ORA-15285: disk '/dev/mapper/XXXXXXXX' violates disk group attribute storage.type
Compression technologies can save space and improve performance, and they can reduce the cost of storing and managing data. Two things need to be considered: the amount of spaced saved and the performance implications, if any, of saving that space.
In the tests described in this paper, array-based compression was able to save space, but it did not improve query performance because data had to be uncompressed to be transmitted to the database. With database-aware Hybrid Columnar Compression, no decompression was required to transmit the data.
In this particular test with a retail data schema, the Hybrid Columnar Compression feature of Oracle Database 11g Release 2 was able to reduce the capacity required to store the information by 19x and to improve query performance by nearly 6x.
This paper would not have been completed without the assistance of Maury Edmonds. He currently works as a systems engineer with the Oracle Solution Center, specializing in virtualization technologies, technical demonstration development, and customer proof-of-concept support. His help was instrumental in building out all the test environments for this paper's research.
Art Licht is Chief Architect for the Oracle Solution Centers in North America. Art joined Oracle as part of the acquisition of Sun Microsystems, where he was a distinguished engineer and Chief Technologist for the North American Storage group. Art is the author of numerous storage-related best practices and blueprints as well as the architect of various large, heterogeneous SANs. Art has led teams implementing some of the industry's largest database and SAP implementations.
|PRIMARY STORAGE||1||Sun ZFS Storage Appliance |
64 GB DRAM per controller
1 x 20 - 1 TB 7,200 RPM disk tray
4 write flash accelerators
4 read flash accelerators
1 GbE connectivity
|1||Pillar Axiom SAN array 1 Pillar Axiom Fibre Channel SAN Slammer 24 GB cache 4 Pillar Axiom Fibre Channel Bricks of 15 K RPM drives Pillar Axiom Pilot 4 Gb FC connectivity|
|PRIMARY SERVER||1||Intel Nehalem with 4 sockets |
96 GB DRAM
2 internal mirrored boot drives HDDs
|NETWORK||1||1 GbE network switch|
|1||Cisco MDS SAN|
The Oracle team estimates that it would take a significantly greater amount of third-party hardware to approximate the performance realized by implementing Hybrid Columnar Compression on Oracle's Pillar Axiom storage system or Oracle's Sun ZFS Storage Appliance. Because there are several variables that account for I/O waits to the database (for example, network and CPU utilization), it is not practical at this time to state how much more third-party hardware would be required.
select count(distinct(t.trx_nbr)) from DWB_RTL_SLS_RETRN_LINE_ITEM li ,DWB_RTL_TRX t ,DWR_SKU_ITEM p ,DWR_ORG_BSNS_UNIT s where li.trx_nbr = t.trx_nbr and li.day_key = t.day_key and li.sku_item_key = p.sku_item_key and t.bsns_unit_key = s.org_bsns_unit_key and li.actn_cd = 'Sale' and t.day_key = 20080905 and p.sku_item_desc = 'Chili Sauce' and s.state in ('AZ', 'CA', 'NM', 'TX')
select count (*) from DWB_RTL_SLS_RETRN_LINE_ITEM
column c format 999,999,999,999 set echo on set timing on select count(trx_nbr) c from DWB_RTL_SLS_RETRN_LINE_ITEM li
select to_char(cast(li.end_dt_time_stamp as date), 'YYYYMMDD') ,li.actn_cd ,count(distinct(t.trx_nbr)) from DWB_RTL_SLS_RETRN_LINE_ITEM li ,DWB_RTL_TRX t ,DWR_SKU_ITEM p ,DWR_ORG_BSNS_UNIT s where -- Joins li.trx_nbr = t.trx_nbr and li.day_key = t.day_key and li.sku_item_key = p.sku_item_key and t.bsns_unit_key = s.org_bsns_unit_key -- Filters and t.day_key between 20080905 and 20080907 and p.sku_item_desc = 'Chili Sauce' and s.state in ('AZ', 'CA', 'NM', 'TX') group by to_char(cast(li.end_dt_time_stamp as date), 'YYYYMMDD') ,li.actn_cd order by 1, 2
select state, count(*) from dwb_rtl_trx t, dwr_org_bsns_unit b where t.bsns_unit_key = b.org_bsns_unit_key group by state
select state, count(*), sum(qty) from dwb_rtl_trx t, dwr_org_bsns_unit b, dwb_rtl_sls_retrn_line_item l where t.bsns_unit_key = b.org_bsns_unit_key and l.trx_nbr = t.trx_nbr and l.day_key = t.day_key and t.day_key = 20080907 group by state;
|Revision 1.0, 07/03/2012|