TECHNOLOGY: Oracle Exadata
Smart Scans Meet Storage IndexesBy Arup Nanda
Understand how Oracle Exadata uses storage indexes to speed I/O.
The Oracle Exadata Database Machine brings database performance to a whole new level, but have you ever wondered what exactly makes it so fast? Several components of the Oracle Exadata Database Machine, such as Oracle Database 11g Release 2; Oracle Exadata’s Smart Flash Cache, Hybrid Columnar Compression, and SmartScan features; and InfiniBand interconnect, help deliver high performance. One of the key technologies that supports this performance is the storage index, which is not a regular database index. Storage indexes reside in the memory of the storage servers—also called storage cells—and significantly reduce unnecessary I/O by excluding irrelevant database blocks in the storage cells. This article shows you how storage indexes work and how to make sure they are used effectively.
Traditional Database I/O
Computational power in the form of CPU capability and memory has increased exponentially in the last several decades, whereas the power of the storage subsystem has not increased nearly as much. There are many reasons why storage performance has not improved with CPU and memory performance, and physics is one of them. A disk can spin only so fast, and each platter in the disk array can hold only a finite amount of information, so the current weakest link in the processing chain will continue to be storage and will account for the biggest part of compute response time.
Consider what happens when a user issues the following query against an Oracle Database instance that is not running on Oracle Exadata:
select avg(amt) from sales where cust_level = 3.
Assume there is no index on the cust_level column, meaning that the optimizer will choose a full table scan for this query. The Oracle Database server process corresponding to the session will issue a request to get all the database blocks of the sales table from the storage, examine the data, and discard the rows that do not satisfy the query. (When a row is requested by Oracle Database, that entire database block—typically 8 KB in size—has to go into memory from disk.)
The storage subsystem is concerned with bits and bytes—it does not have any idea about the actual data stored inside the database blocks. The database server alone can determine what data is in what blocks.
Oracle Exadata I/O and Smart Scan
Storage in Oracle Exadata changes query processing so that not all blocks have to go to the database server for that server to determine which rows might satisfy a query. Oracle Exadata’s Smart Scan feature enables certain types of query processing to be done in the storage cell. With Smart Scan technology, the database nodes send query details to the storage cells via a protocol known as iDB (Intelligent Database). With this information, the storage cells can take over a large portion of the data-intensive query processing. Oracle Exadata storage cells can search storage disks with added intelligence about the query and send only the relevant bytes, not all the database blocks, to the database nodes—hence the term smart scan.
With Oracle Exadata, full table scans with selected functions and operators such as =, >, and so on in predicates and index fast full scans can use smart scans. To see functions that can benefit from Smart Scan, you can issue the query
select name from v$sqlfn_metadata
While scanning storage, Oracle Exadata storage cells can identify which areas of the disk storage will definitely not contain the values the query is interested in and avoid reading those areas.
How do the storage cells know how to avoid reading the areas of disk storage that do not include the data a query is interested in? They use storage indexes. A storage index is an in-memory structure that holds some information about the data inside specified regions of physical storage. This information tells the storage cell which areas of the disk do not contain the values the query is interested in, so those areas are not accessed during a scan.
Figure 1 shows how the data inside a storage index is maintained. There are storage index entries for different columns: PROD_CODE, SALES_DT, and CUST_LEVEL. Each storage index entry covers a physical region of the table, contains minimum and maximum values of the columns in that region, and also indicates whether any of the rows in that region contain nulls. In this example, region 1 of the storage index represents rows 1 through 3 of the table and region 2 of the storage index represents table rows 4 through 6. The storage index entry for region 1 in Figure 1 shows the minimum and maximum values of the cust_level column in the region to be 1 and 2, respectively. The entry also shows that there are no null values in the rows of the table in region 1. The storage index entry for region 2 shows the minimum and maximum values of the cust_level column to be 3 and 4, respectively. One row of the table in region 2 includes a null value, so the null indicator in the storage index entry for region 2 shows Yes. For the sake of simplicity, Figure 1 shows the complete details of only one column (CUST_LEVEL) inside the storage index and the next column (PROD_CODE) is only partially displayed.
Figure 1: Conceptual representation of a storage index
In the earlier example query, the user issued the statement
select avg(amt) from sales where cust_level = 3
When this query is run against Oracle Exadata storage, the storage index entry on the CUST_LEVEL column in Figure 1 indicates that region 1 has values between 1 and 2, so no cust_level = 3 results will be found in that storage index region. Therefore, the storage cell does not access that region of the disk. Region 2 of the storage index shows the values to be between 3 and 4, so cust_level =3 will be found in at least one row there. The storage cell reads region 2 from disk.
Similarly, suppose the query is instead
select avg(amt) from sales where cust_level is null
In this case, the storage index indicates that rows 1, 2, and 3 of the table (region 1) will definitely not satisfy the WHERE cust_level IS NULL condition, whereas one or more of rows 4, 5, and 6 (region 2) may satisfy the condition.
As you can see, Oracle Exadata storage indexes do not locate the areas of the table that contain the values of interest to the user; rather, they identify the areas that definitely will not contain the values, thus eliminating them from I/O processing. In a manner of speaking, they act as negative indexes, just the opposite of traditional database indexes, which are for locating—not eliminating—the database blocks that may contain the information.
Storage indexes are not stored on disk; they are resident in the memory of the storage cell servers. They are created automatically after the storage cells receive repeated queries—with predicates—for columns. No user intervention is needed to create or maintain storage indexes. And because they are memory-resident structures, they disappear when the storage cells are rebooted.
To use storage indexes, Oracle Exadata queries must use smart scans, so not all types of applications can benefit from storage indexes. Applications with queries that include predicates and perform a lot of full table scans or fast full scans of indexes—typically those used in data warehousing environments—will benefit greatly from storage indexes. Online transaction processing (OLTP) applications, on the other hand, typically access a small number of rows through standard indexes and do not perform full table scans, so they may not benefit from storage indexes.
Checking for Savings
Let’s see how much I/O can be reduced with storage indexes. I’ll check for I/O savings on Oracle Exadata by querying the V$MYSTAT view.
This view includes the values of several metrics for the current session. The two metrics I am interested in are
Cell physical IO bytes saved by storage index. This metric shows how many bytes of I/O were eliminated by the application of storage indexes at the storage cell level.
Cell physical IO interconnect bytes returned by smart scan. This metric shows how many bytes of I/O were returned by a smart scan to the database server.
The code in Listing 1 returns the names and values for these two metrics. To save space in the output of Listing 1, I present the metrics with shorter names: “SI Savings” for cell physical I/O bytes saved by storage index and “Smart Scan” for cell physical I/O interconnect bytes returned by smart scan. And I show the values for both metrics, in megabytes.
Code Listing 1: Checking for I/O savings attributable to storage indexes
col stat_value format 9,999.9999 select decode(name, 'cell physical IO bytes saved by storage index', 'SI Savings', 'cell physical IO interconnect bytes returned by smart scan', 'Smart Scan' ) as stat_name, value/1024/1024 as stat_value from v$mystat s, v$statname n where s.statistic# = n.statistic# and n.name in ( 'cell physical IO bytes saved by storage index', 'cell physical IO interconnect bytes returned by smart scan' ) /
After starting the session on Oracle Exadata, I issue
select avg(amt) from sales where cust_level is null
and then execute the query in Listing 1 to see the effect of using storage indexes. Here is the output:
STAT_NAME STAT_VALUE —————————— —————————————— SI Savings 545.9234 Smart Scan 0.0012
As the output shows, the I/O savings attributable to storage indexes is about 546 MB and the storage cells returned 0.0012 MB to the database. These numbers are highly dependent on workload and environment, but the measurement concept is still the same. Using this query, you will be able to examine the savings in your specific case, particularly in those cases where storage indexes are not used or are not created yet.
If storage indexes are so great, why doesn’t Oracle Exadata use them all the time? The short answer is that they are created and used only when they will be beneficial.
First, data distribution plays a big role. Say, for instance, that I execute the following query against an example sales table on Oracle Exadata:
select avg(amt) from sales where sales_dt = '13-MAR-11';
I then execute the query in Listing 1 and examine the output to see the effect of storage indexes:
STAT_NAME STAT_VALUE —————————————— —————————————— SI Savings 0.0000 Smart Scan 0.9035
But why wasn’t the storage index used? Because the values of the sales_dt column of the sales table are randomly spread over the blocks of the table, with almost all blocks containing rows that satisfy
where sales_dt = '13-MAR-11'
The storage index, even if used, would not have been able to reduce the I/O, because it would have identified all the blocks as potentially satisfying the query, resulting in no I/O savings.
To remedy that situation, you can increase the likelihood that more matching rows will be found in fewer database blocks so that Oracle Exadata storage indexes find fewer blocks to send to the database server nodes. You can do that by reloading the table, ordered by the sales_dt column. Here are the SQL statements:
SQL> rename sales to sales_old; SQL> create table sales nologging as select * from sales_old where 1=2; SQL> insert /*+ APPEND */ into sales select * from sales_old order by sales_dt;
Now issue the query against the sales table:
select avg(amt) from sales where sales_dt = '13-MAR-11';
Checking the metrics by executing the code in Listing 1, I can see that 434 MB of I/O was saved with the storage indexes:
STAT_NAME STAT_VALUE ————————————— ————————————— SI Savings 434.0045 Smart Scan 0.0012
ConclusionStorage indexes in Oracle Exadata know the distribution of data in storage cells and help eliminate physical I/O within storage cells during a smart scan. This significant reduction in I/O results in even faster processing of data within the Oracle Exadata Database Machine. You can check to see whether storage indexes are being used in Oracle Exadata queries, and you can measure the I/O savings in a specific query.
Arup Nanda (firstname.lastname@example.org) has been an Oracle DBA for more than 14 years, handling all aspects of database administration, from performance tuning to security and disaster recovery. He was Oracle Magazine’s DBA of the Year in 2003.
Send us your comments