Cache large tables in the buffer cache and improve performance without expensive storage.
John, the lead DBA at Acme Bank, puts his coffee down and studies the group of visitors in his office. Caitlyn, the new chief technology officer, is frustrated by constant complaints about database performance issues due to long I/O response times. To enhance performance, she wants to increase the buffer cache of the database instance to allow more data in the cache, thereby reducing the frequent need to go to disk.
High memory prices used to keep the buffer cache size limited. “But if memory serves me right,” she chuckles at her own pun, “memory prices have fallen so drastically that we can afford to put a lot more memory in database servers now to have a really large buffer cache. But the problem is”—she pauses to cast a skeptical look at Debbie, a longtime DBA of the bank’s most critical database—“I’ve been told that adding memory won’t keep the tables in the buffer cache.”
Debbie, who has weathered many such confrontations in her career, explains, “Our applications issue a lot of full table scans, and full table scans cause Oracle Database to perform a direct path read (DPR) on table blocks that are not in the buffer cache.” Therefore, increasing the buffer cache after adding physical memory to the server, she correctly concludes, is not going to help, because the buffer cache is not even used. The applications could be rewritten with hints to force indexes to skip DPRs and cause buffer cache use, but changing thousands of queries in all of Acme’s applications would be a very expensive and time-consuming process and no one wants to even entertain that possibility. The only way to improve the performance of full table scans, Debbie declares with finality, is to improve the throughput of the I/O subsystem, such as by using flash storage, but that would be substantially expensive. With all that inexpensive memory on the server, Caitlyn implores, isn’t there a way to force Oracle Database to take advantage of it to avoid having to invest in expensive storage?
Yes there is, John assures them: a new feature in Oracle Database 12c Release 22.214.171.124 enables even full table scans to be loaded and kept in an available large buffer cache.
Before John can explain the new feature, everyone in the office wants to know why a full table scan doesn’t use the buffer cache. When a session connected to the Oracle Database instance selects data from a table, John elaborates, the database server process reads the appropriate data blocks from the disk and puts them into the buffer cache by default. Each block goes into a buffer in the buffer cache. The reason is simple: if another session wants some data from those blocks, it can be served from those cached blocks much faster than being served from disk. The buffer cache is limited and usually smaller than the entire database, so when the cache is full and a new database block comes in, Oracle Database forces old buffers that have not been accessed in a long time out of the cache to make room for the new blocks coming in.
However, John continues, consider the case of a full table scan query that selects all the blocks of the table. If that table is large, its blocks will consume a large portion of the buffer cache, forcing out the blocks of other tables. It’s unlikely that all the blocks of a large table will be accessed regularly, so having those blocks in the cache does not actually help performance. But forcing out the blocks of other tables, especially popular blocks, degrades the overall performance of the applications running against the database. That is why, John explains, Oracle Database does not load the blocks into the buffer cache for full table scans.
With the room satisfied by John’s explanation of why full table scans don’t use the buffer cache, Caitlyn has another question: How does Oracle Database decide which blocks in the buffer cache stay or go?
John explains that traditionally the database instance uses the least recently used (LRU) list to track how long ago a buffer in the buffer cache was accessed. When a buffer is first accessed by a session, it is placed at the top of the LRU list. Over time, as other buffers are accessed by sessions, they move to the top, which causes the original buffer to slide downward from the top of the list. When the original buffer is accessed again, it moves up the list toward the top, but if it’s not accessed for some time and other buffers are, it falls all the way to the bottom and eventually is bumped off the list. At that point, the buffer is removed from the cache. The actual process, John hastens to add, is a bit more complicated, because the instance also takes into account other factors such as how many times the buffer has been touched (using what is known as a touch count counter).
Although this process works well in many cases, John continues, there are exceptions. Because the decision to cache or remove a buffer is based on the frequency and recentness of access, not all the blocks of the table might be in the buffer cache, because not all buffers are accessed equally. If even a single buffer is not found in the cache, a full table scan on that table has to go to disk to get the rest of the data.
But in Oracle Database 12c Release 126.96.36.199, John announces to the room, the decision about the candidates for the buffer cache for full table scans has changed in three dramatic ways:
Now, taking advantage of the new generous available memory size, Debbie can cache even big tables by using this new feature, which John describes as “automatic big table caching.”
But Debbie looks skeptical. The database size is huge, and the buffer cache, albeit larger, is still much smaller than the database size. In fact, there are some tables that are quite big. When they are moved into the buffer cache, they will force a lot of popular buffers out. So, she muses, the risk of flooding the buffer cache still remains.
The risk is still lurking, John agrees. Therefore, instead of risking the entire buffer cache, Oracle Database enables just a portion to be used for caching big tables. This is controlled by an initialization parameter. Debbie wants to allocate up to 40 percent of the buffer cache for this purpose, so John uses the following SQL statement:
alter system set db_big_table_cache_percent_target = 40;
This sets aside 40 percent of the buffer cache for big table caching. The other 60 percent will not see any buffers from full table scans on large tables, so popular buffers can still be present in the buffer cache. John informs everyone that 90 percent is the maximum amount that can be allocated. This is a dynamic parameter, so John does not have to recycle the database instance for this parameter to take effect.
Code Listing 1: Checking big table cache statistics (at first setup)
select * from v$bt_scan_cache; BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP CON_ID —————————————— ——————————————— ———————————— ———————————————— ——————————————— —————— .210158085 40 0 0 1000 0 select * from v$bt_scan_obj_temps; no rows selected
Two dynamic performance views show John how the big table cache is working. He selects from the views, as shown in Listing 1, and explains their workings to the people in his office. The first view, V$BT_SCAN_CACHE, shows a summary of the big table cache. The V$BT_SCAN_CACHE view includes the following columns:
BT_CACHE_ALLOC. Even though John allocated 40 percent to the big table cache, not all of that cache might be required. As the demand for buffers in the big table cache rises due to incoming big tables, more space will be allocated. This column shows the ratio of the buffer cache used by the big table cache right now to the overall size of the buffer cache; in our example, it’s 0.210158085, or about 21 percent, because there is no big table in the buffer cache yet.
BT_CACHE_TARGET. This column shows the target allocation percentage, 40 percent, as set by John earlier.
OBJECT_COUNT. This column shows how many objects are in the big table cache. Because John just set up this cache, there are no objects yet; hence, this shows 0.
MEMORY_BUF_ALLOC. This column shows how many buffers are allocated to the objects in the big table cache right now. Again, because John just set up the big table cache, this shows 0.
MIN_CACHED_TEMP. As John explained earlier, temperature is a new way to designate the usefulness of an object (for example, a table) in the big table cache. The more often a table is accessed, the higher its temperature and, hence, the more beneficial it is for the table to be in this cache. This column shows the minimum temperature of objects that will be considered for this cache. It shows 1,000; therefore, objects with temperatures below 1,000 will not be considered.
CON_ID. This column shows the container ID for a database in a multitenant environment.
The second view, V$BT_SCAN_OBJ_TEMPS, shows the details of the big table cache. The V$BT_SCAN_OBJ_TEMPS view includes the following columns:
TS#. This column shows the tablespace number an object resides in. You can join this view with the TS$ table to get the tablespace name.
DATAOBJ#. This column shows the data object number of the object. You can join this view with DBA_OBJECTS to get the object name.
SIZE_IN_BLKS. This column shows the number of blocks of this object that were considered for the big table cache in this database instance.
TEMPERATURE. This column shows the temperature of the object.
POLICY. This column shows how the object was cached: either partially or in its entirety.
CACHED_IN_MEM. This column shows how many blocks of this object are in the big table cache.
CON_ID. This column shows the container ID for a database in a multitenant environment.
Pointing to Listing 1, John notes that selecting from the V$BT_SCAN_OBJ_TEMPS view returns no rows. This is not surprising, he adds, because he just set up the big table cache and no table has been cached yet. To demonstrate the effect of this new caching behavior, John picks a large table in the database named T1. First, he uses the following SQL query to find out how many blocks are used by this table:
select blocks from user_tables where table_name = 'T1'; BLOCKS ——————————————— 335952
This table has 335,952 blocks—indicating that the table is pretty large. He executes a full table scan on the table:
select count(1) from t1;
After the full table scan, he selects from the two views described earlier. This time the views show different data, as shown in Listing 2. Pointing to the output from the V$BT_SCAN_CACHE view, John notes that the BT_CACHE_ALLOC column now shows 0.400012911, indicating that about 40 percent of the buffer cache is allocated to the big table cache. This means that the big table cache is now using the complete 40 percent he allocated, up from the 21 percent he observed earlier.
Code Listing 2: Checking big table cache statistics (after full table scan operation)
select * from v$bt_scan_cache; BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP CON_ID —————————————— ——————————————— ———————————— ———————————————— ——————————————— —————— .400012911 40 1 49570 1000 0 select * from v$bt_scan_obj_temps; TS# DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY CACHED_IN_MEM CON_ID —————— —————————— ———————————— ——————————— ——————— ————————————— ————————— 196612 95956 335952 1000 MEM_PART 49570 0
The output in Listing 2 also shows 1 for the OBJECT_COUNT column, indicating that one object is in the cache. The MEMORY_BUF_ALLOC column shows 49,570, which is how many buffers are in the big table cache.
Next John points to the output from the V$BT_SCAN_OBJ_TEMPS view in Listing 2, which shows the specific details of how the cache is used. The DATAOBJ# column shows 95,956. John uses the following query to find out which object that is:
select object_name from dba_objects where data_object_id = 95956; OBJECT_NAME ————————————————— T1
The object is the same T1 table on which John forced the full table scan. That table is now in the buffer cache. However, he notes that of the 335,952 total blocks of the table, only 49,570 are in the cache, as shown in the CACHED_IN_MEM column. The reason, he explains, is that the big table buffer cache is too small to hold all the blocks of the table. He confirms this by checking the value of the POLICY column, which shows MEM_PART, indicating that only a part of the table, not the entire table, is loaded into memory.
“But we have additional memory now,” observes Debbie. “Sure,” agrees John. “That enables us to increase the total system global area [SGA] size, which will increase the buffer cache as well.” He increases the SGA size, using the following SQL statement:
alter system set sga_target = 3G;
He also increases the big table cache percentage to 90 percent:
alter system set db_big_table_cache_percent_target = 90;
After this he executes a few full table scans on the T1 table as he did earlier and checks the views again as shown in Listing 3. There are several interesting pieces of data in the output, he notes. First, the big table cache allocation is now .865177265, or 86.5 percent, of the buffer cache. The temperature of the object in the big table cache is 4,000—up from 1,000 earlier. This increase was due to the increased number of full table scans on the table. The CACHED_IN_MEM column shows 335,952, the same as the number of blocks of the table, indicating that the entire table has been cached.
Code Listing 3: Checking big table cache statistics (after increasing cache size)
select * from v$bt_scan_cache; BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP CON_ID —————————————— ——————————————— ———————————— ———————————————— ——————————————— —————— .865177265 90 1 175625 1000 0 select * from v$bt_scan_obj_temps; TS# DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY CACHED_IN_MEM CON_ID —————— —————————— ———————————— ——————————— ——————— ————————————— ————————— 196612 95956 335952 4000 MEM_ONLY 335952 0
He confirms that by checking the POLICY column, which shows MEM_ONLY, which means that all access to the table is from memory only, not from disk. And this all-memory access occurred even though the table was accessed via full table scans only. Oracle Database determined that the table had been accessed more than others (as indicated by the temperature). Hence, it will be beneficial to cache it in memory—and there is enough memory available to cache it. John’s audience loves to hear that.
Debbie still looks a bit unconvinced, though. The database doesn’t have just one big table as John showed, she points out. It has many such big tables, and the pattern of access to them will be unpredictable. The memory can’t just grow infinitely, so how will the database system decide how much of which of those tables to cache, she asks.
That is easy to show through a demo, John replies. He performs a full table scan on table T2:
SQL> select count(1) from t2;
Then he checks the views again, as shown in Listing 4. He directs the attention of his colleagues to the second line in the second query, in which the DATAOBJ# column shows a new object: 95964. Using the query he used earlier, John ascertains that this object is table T2, which is exactly what he used in the second full table scan. However, the POLICY column shows MEM_PART, which means that the table is only partially in memory, unlike table T1, which is completely in memory. But table T2 was accessed most recently, Debbie points out, so shouldn’t the database have edged blocks of the T1 table (which were accessed earlier than T2) out of the cache to make room for all the blocks of T2?
Code Listing 4: Checking big table cache statistics (after adding second table)
select * from v$bt_scan_cache; BT_CACHE_ALLOC BT_CACHE_TARGET OBJECT_COUNT MEMORY_BUF_ALLOC MIN_CACHED_TEMP CON_ID —————————————— ——————————————— ———————————— ———————————————— ——————————————— —————— .866228351 90 2 277731 1000 0 select * from v$bt_scan_obj_temps; TS# DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY CACHED_IN_MEM CON_ID —————— —————————— ———————————— ——————————— ——————— ————————————— ————————— 196612 95956 335952 4000 MEM_ONLY 335952 0 196612 95964 334149 1000 MEM_PART 102106 0
No, replies John, this is precisely where the TEMPERATURE column comes in. Even though T1 was accessed earlier than T2, it has been accessed more often (as shown by its temperature, 4,000) than T2 (whose temperature is 1,000). Therefore, T2 was given lower priority than T1.
Continuing the demo, John performs several more full table scans on T2. After that he checks the big table caching statistics, shown in Listing 5. Pointing to the output, he notes that table T1 (with a DATAOBJ# value of 95,956) still has a temperature of 4,000, which is less than that of T2 (with a DATAOBJ# value of 95,964), which has now heated up to a temperature of 5,000. This changes the priority of the tables. Because T2 has a higher temperature, more buffers of that table are now in memory. In fact, all of table T2 is now in memory, which John confirms by checking the value of POLICY, which is MEM_ONLY. The buffers of table T1 are only partially in the cache, as indicated by MEM_PART under POLICY.
Code Listing 5: Checking big table cache statistics (after temperature change)
select * from v$bt_scan_obj_temps; TS# DATAOBJ# SIZE_IN_BLKS TEMPERATURE POLICY CACHED_IN_MEM CON_ID —————— —————————— ———————————— ——————————— ——————— ————————————— ————————— 196612 95964 334149 5000 MEM_ONLY 334149 0 196612 95956 335952 4000 MEM_PART 107376 0
Debbie is now fully convinced. Not only will this enable them to take advantage of the additional memory, reducing the burden on disks without any configuration changes to the applications or the database objects, but the amount of memory for this special cache can also be dynamically controlled without recycling of the database. Caitlyn is ecstatic to hear that the database performance can improve without an investment in expensive storage. The audience members thank John and file out of his office.
READ more about big table caching.
Prior to Oracle Database 12c Release 188.8.131.52, the only way to improve the performance of a full table scan of a large table was by investing in faster but significantly more expensive storage.
The big table cache feature in Oracle Database 12c Release 184.108.40.206 enables DBAs to allocate a portion of the buffer cache to a big table cache for full table scans, which can reduce disk access and improve performance significantly.
Arup Nanda has been an Oracle DBA since 1993, handling all aspects of database administration. He was Oracle Magazine’s DBA of the Year in 2003 and received an Oracle Excellence Award for Technologist of the Year in 2012.