TECHNOLOGY: Ask Tom
On External Table Queries, Data Consistency, and NothingBy Tom Kyte
Our technologist queries the operating system, locks manually, and uses the right NULL.
In some of my customer’s databases, the DBAs are using datafiles with automatic extension but with many datafiles sharing the same file system, such as
tablespace A, datafiles /u01/oradata/ts_A_file01.dbf autoextend unlimited
and so on. The requested extension is that all datafiles must be capable of growing by at least 20 percent of their current size, so if, for example, ts_A_file01.dbf currently is 100 GB and ts_B_file01.dbf is 200 GB, we must ensure that at least 20 + 40 = 60 GB is free in the /u01/oradata file system.
The question is: how can we monitor this in a single query inside the database? Right now we have a complex script gathering free space from the df command in a text file, opening a cursor, calculating the current allocated space from DBA_DATA_FILES, and reading the df data via an external table.
This can be done in a single SQL query. To accomplish this, I’ll need to be able to query disk free (df) interactively—without a complex set of maneuvers such as running a script or redirecting output. I’ll start by making it so that the df output can be queried as if it were in a table. I’ll be relying on a feature first added in Oracle Database 11g Release 2 but subsequently back-ported to version 10.2.0.5 as well (so this works in Oracle Database 10g Release 2 and above). This feature is the preprocessor directive. (I’ve written about this previously.)
To start I’ll create a directory where I can place a small shell script that will produce the df output:
SQL> create or replace 2 directory exec_dir 3 as '/home/tkyte/df' 4 / Directory created.
Next, I’ll create a shell script named run_df.sh in that directory. This script will contain only the following:
#!/bin/bash /bin/df –Pl
And the output of that script will look something like Listing 1.
Code Listing 1: Output of the df shell script
SQL> !./run_df.sh Filesystem 1024-blocks Used Available Capacity Mounted on /dev/mapper/VolGr... 18156292 10827600 6391528 63% / /dev/sda1 101086 12062 83805 13% /boot tmpfs 517520 0 517520 0% /dev/shm
Note that in the run_df.sh script, I used explicit pathnames to run df—I did not rely on the environment and on the path environment variable in particular. This is very important: when coding scripts for external tables—when coding scripts in general!—you always want to use explicit paths to run the program you actually intend to run. You don’t have any real control over the environment this script will run in, so relying on the environment being set a certain way is a recipe for disaster.
So, now that I have the script and the directory, I am ready to create the external table. As you can see from the output in Listing 1, all I need to do is have the external table skip the first record and then parse each subsequent line, using white space as a delimiter. This is something an external table can do easily, as shown in Listing 2.
Code Listing 2: Creating the df external table
SQL> create table df 2 ( 3 fsname varchar2(100), 4 blocks number, 5 used number, 6 avail number, 7 capacity varchar2(10), 8 mount varchar2(100) 9 ) 10 organization external 11 ( 12 type oracle_loader 13 default directory exec_dir 14 access parameters 15 ( 16 records delimited 17 by newline 18 preprocessor 19 exec_dir:'run_df.sh' 20 skip 1 21 fields terminated by 22 whitespace ldrtrim 23 ) 24 location 25 ( 26 exec_dir:'run_df.sh' 27 ) 28 ) 29 / Table created.
With the df external table created, I can now review the df output easily in a query, as shown in Listing 3.
Code Listing 3: Querying the df external table
SQL> select * from df; FSNAME BLOCKS USED AVAIL CAPACITY MOUNT ——————————————————————————————— ———————— ———————— ——————— ——————— —————— /dev/mapper/VolGroup00-LogVol00 18156292 10827600 6391528 63% / /dev/sda1 101086 12062 83805 13% /boot tmpfs 517520 0 517520 0% /dev/shm
Note: Hopefully you see how this approach could work easily for ps, ls, du, and so on—all the UNIX utilities could easily be considered “tables” now!
With this data available to me in the df external table, it is easy to start working on the query. All I need to do is join df to DBA_DATA_FILES with a join condition that matches the longest mount point possible to each filename. Before I do this, however, I’m going to change my df output for testing purposes. Because I have a rather small file system with just one mount point and I want to test my logic, I’m going to “fake” my df data by changing the script as shown in Listing 4.
Code Listing 4: Fake data for df (for testing purposes)
SQL> !cat run_df.sh #!/bin/bash #/bin/df -Pl echo Filesystem 1024-blocks Used Available Capacity Mounted on echo /dev/mapper/VolG... 18156292 10827600 6391528 63% / echo /dev/mapper/VolG... 18156292 10827600 6391528 63% /home/ ora11gr2/app/ora11gr2/oradata/ora11gr2/ORA11GR2 echo /dev/sda1 101086 12062 83805 13% /boot echo tmpfs 517520 0 517520 0% /dev/shm
Instead of running df, I’ll fake the output by echoing two possible mount points for the datafiles. Also, I’ll be able to change the available amount of data to test for the 20 percent datafile extension condition.
My single-query solution to the issue in this question is in Listing 5. Here’s what’s happening on some of the lines:
Code Listing 5: Single-query monitoring solution
SQL> with fs_data 2 as 3 (select /*+ materialize */ * 4 from df 5 ) 6 select mount, 7 file_name, 8 bytes, 9 tot_bytes, 10 avail_bytes, 11 case 12 when 0.2 * tot_bytes < avail_bytes 13 then 'OK' 14 else 'Short on disk space' 15 end status 16 from ( 17 select file_name, mount, avail_bytes, bytes, 18 sum(bytes) over 19 (partition by mount) tot_bytes 20 from ( 21 select a.file_name, 22 b.mount, 23 b.avail*1024 avail_bytes, a.bytes, 24 row_number() over 25 (partition by a.file_name 26 order by length(b.mount) DESC) rn 27 from dba_data_files a, 28 fs_data b 29 where a.file_name 30 like b.mount || '%' 31 ) 32 where rn = 1 33 ) 34 order by mount, file_name 35 / MOUNT FILE_NAME BYTES TOT_BYTES AVAIL_BYTES STATUS ————— ——————————————————————————————— ————————— —————————— ——————————— —————— / /home/ora11gr2/app/ora11gr2/ 360710144 2410283008 6544924672 OK oradata/ora11gr2/example01.dbf / /home/ora11gr2/app/ora11gr2/ 1101004800 2410283008 6544924672 OK oradata/ora11gr2/sysaux01.dbf / /home/ora11gr2/app/ora11gr2/ 924844032 2410283008 6544924672 OK oradata/ora11gr2/system01.dbf / /home/ora11gr2/app/ora11gr2/ 131072 2410283008 6544924672 OK oradata/ora11gr2/system02.dbf / /home/ora11gr2/app/ora11gr2/ 23592960 2410283008 6544924672 OK oradata/ora11gr2/users.dbf /home/or /home/ora11gr2/app/ora11gr2/ 144703488 376438784 6544924672 OK a11gr2/a oradata/ora11gr2/ORA11GR2/ pp/ora11 datafile/o1_mf_big_tabl_ gr2/orad 7y3thv78_.dbf ata/ora1 1gr2/ORA 11GR2 /home/or /home/ora11gr2/app/ora11gr2/ 231735296 376438784 6544924672 OK a11gr2/a oradata/ora11gr2/ORA11GR2/ pp/ora11 datafile/o1_mf_undotbs_ gr2/orad 78w1hprj_.dbf ata/ora1 1gr2/ORA 11GR2 7 rows selected.
On lines 3 and 4, I query the df external table. I purposely use a materialize hint to force the optimizer to load the df data into the equivalent of a global temporary table, because the query would tend to read and reread the external table over and over and the results of the df table could change as the query runs. This provides the logical equivalent of a consistent read on the df data. Also, if the query plan did involve rereading the external table, I would receive an error message at runtime:
KUP-04108 unable to reread file string
The documentation explains the error:
Cause: The query that is executing requires that a datasource for an external table be read multiple times. However, the datasource is a sequential device that cannot be reread. Examples of this type of datasource are a tape or pipe.
Action: There are a few ways around this problem. One is to rewrite the query so that the external table is referenced only once. Another option is to move the datasource to a rereadable device such as a disk file. A third option is to load the data for the external table into a temporary table and change the query so that it references the temporary table.
On lines 27–30, I join DBA_DATA_FILES to df data with a WHERE clause, using LIKE. This will join every file in DBA_DATA_FILES to every possible mount point in the df output. I know that the goal, however, is to find the “longest” matching mount point, so to accomplish that I assign—on lines 24–26—a ROW_NUMBER to each row. This ROW_NUMBER will be sequentially assigned to each duplicated row in DBA_DATA_FILES, so if the FILE_NAME matches more than one MOUNT, each FILE_NAME occurrence will be assigned a unique, increasing ROW_NUMBER value. This ROW_NUMBER will be assigned after the data is sorted by the length of the MOUNT, from big to small.
Once I have that data, I apply a WHERE clause to save only the first entry for each FILE_NAME value—that predicate is WHERE rn = 1, on line 32. At the same time, I’ve added another column—TOT_BYTES—on lines 18-19. This will enable me to verify the 20 percent threshold.
The last step is to format and output the data. I print the columns I’m interested in and add a CASE statement on lines 11–15 to verify that 20 percent of the total bytes of storage allocated on a given mount point does not exceed the remaining available bytes of free storage.
So, now you see how to use external tables to query operating system output such as df, ps, and ls. Additionally, you can use them to query anything that writes to standard out, including gunzip, sed, and so on. If you’d like to see an interesting example of using an external table to query multiple gzipped compressed files, take a look at bit.ly/PQgOol.
Question on Locking
If Oracle Database provides automatic locking and resolves deadlock situations automatically, why are there manual locking and deadlock solutions?
There are many reasons. The automatic locking falls into the 80/20 rule: 80 percent of the time, it is sufficient. But developers need to understand when locking is something they must be concerned about. Here is one of the simplest reasons: loss of update detection.
Suppose you have a table containing a primary key ID, names, addresses, and phone numbers. Further, there is an application that can read a row from this table, put it onscreen, and let end users update the values. When a user clicks a button, the application will issue update table t set name = :name, address = :address, phone = :phone where id = :id.
Now, think about what would happen if two people pulled up your record at about the same time and one of them changed the address and the other changed your phone number. What would happen if they both clicked the button now?
One update would succeed—say the one that changed your address. All would be well and good so far.
The second update, to change your phone number, would end up updating the entire row again, reverting your address to what it was before, and the address update would be lost.
In a client/server application, you might employ what is known as pessimistic concurrency control. Before the application lets you type over the address or the phone number, it will issue a SELECT * FROM t WHERE id = :id AND name = :name AND address = :address AND phone = :phone FOR UPDATE NOWAIT to manually lock the record as long as the record still exists. If it does, great—you can modify the columns and be assured that the update will succeed and not overwrite anyone else’s changes. If the SELECT returns zero rows, you’ll know that someone else modified the record and that you have to requery the record before you can modify it. If the query returns an error message, you’ll know that someone else is currently working on it.
That is one reason for manual locking. Here is another situation. Suppose you have a business rule that says, “The total number of employees in a department cannot exceed 100.” How do you enforce that? You have to make sure that when you make an insert into the EMP table, no one else is inserting into that same department. You have to serialize. You might use a DBMS_LOCK (user-defined lock) to do this, you might use the LOCK TABLE command, you might use a select * from dept where deptno=:x for update nowait to serialize at the department level, and so on. But you need a manual lock.
Automatic locking covers 80 percent of everything you need. You need manual locking for the rest to ensure logical data consistency.
NULLS and Cardinality
A few issues ago, I wrote up part of a presentation I delivered at Oracle OpenWorld in 2011. It was from “Five Things You Probably Didn’t Know about SQL.” In that article, I said I’d be writing up other sections of that presentation, and that is what I’m doing here. This time I’ll take a look at NULLs and their possible effect on cardinality estimates and index use.
I’ve written many times (seebit.ly/Pjrbyr, for example) about how important cardinality estimates are to the cost-based optimizer (CBO). In short, if the optimizer guesses incorrectly how many rows will flow through steps in a query plan, you’ll probably get a bad plan and a poorly performing query. For example, if the optimizer believes that it is going to retrieve 2 rows out of 1,000,000, it will likely choose to use an index if one is available. But what if, at runtime, you discover you are not getting just 2 rows but 500,000? Then using the index will be the worst-possible approach.
Getting the right estimated cardinality values is important to the optimizer, and anything you do to throw that off—to make it harder to get the right estimate—is bad.
But what does this have to do with NULL values? Nothing really—it has to do with what can happen when developers do not use a NULL value when they should have. Many times developers fear using NULLs: they do not understand them, and they do not believe they can be indexed, so they avoid them. They will use a “fake” value—such as 01-JAN-9999—to represent a missing date value. This is a bad idea for many reasons; the first I’ll show you is how it can throw off cardinality estimates.
I’ll start with a table that uses a NULL value. This table will represent “effective-dated” records—that is, each record has an effective start date and end date. If a record is “current”—that is, it doesn’t have an end date yet—the end date will be NULL. Here is the CREATE TABLE statement for that table:
SQL> create table t 2 pctfree 20 3 as 4 select a.*, 5 case when mod(rownum,100) <= 50 6 then last_ddl_time 7 end end_date 8 from all_objects a; Table created.
In this table, about half of the rows have an END_DATE and half are NULL. Next, I’ll create an index on the END_DATE column for retrieval purposes:
SQL> create index t_idx 2 on t(end_date); Index created.
If I look at the data, I’ll discover that many of the values are clumped together in one month:
SQL> select count(*) 2 from t 3 where end_date 4 between to_date('01-sep-2010', 'dd-mon-yyyy') 5 and to_date('30-sep-2010', 'dd-mon-yyyy'); COUNT(*) ————————————— 36267
My ALL_OBJECTS view has about 72,000 records in it, so table T has about 72,000 records, and half of them fall in the month of September 2010. Next I’ll gather statistics on this table, and these statistics will include histograms on the END_DATE column. If you are curious about how or why they have histograms, refer to the “Why Does My Plan Change?” section of bit.ly/PQnpPB.
SQL> begin 2 dbms_stats.gather_table_ stats(user, 'T'); 3 end; 4 / PL/SQL procedure successfully completed.
Now I’ll look at the data to understand it before I proceed:
SQL> select count(*), 2 count(distinct end_date), 3 count(end_date), 4 min(end_date), 5 max(end_date) 6 from t; CNT CNTD CNT2 MIN MAX —————— ————— —————— ————————— ————————— 72228 703 36850 01-OCT-02 30-SEP-11
As you can see, there are 72,228 rows in the table, with a low value of October 1, 2002, and a high value of September 30, 2011. I know that about half of these rows fall into September 2010 (there is data skew). Also, it has 703 distinct dates, which will affect the type of histogram the optimizer can store. Because there are more than 255 distinct values, the optimizer will have access to a height-balanced, rather than frequency-based, histogram.
Now if I execute a query against this table, asking for the rows in September 2010, the optimizer will be able to intelligently come up with a query plan, as shown in Listing 6.
Code Listing 6: Query of NULLs and a good plan
SQL> set autotrace traceonly explain SQL> select * 2 from t 3 where end_date 4 between to_date( '01-sep-2010', 'dd-mon-yyyy' ) 5 and to_date( '30-sep-2010', 'dd-mon-yyyy' ); Execution Plan —————————————————————————————————————————————————————————————————————————— Plan hash value: 1601196873 —————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | —————————————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | 36024 | 3588K| 339 (1)| 00:00:05 | |* 1 | TABLE ACCESS FULL| T | 36024 | 3588K| 339 (1)| 00:00:05 | —————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): —————————————————————————————————————————————————————————————————————————— 1 - filter(“END_DATE”<=TO_DATE(' 2010-09-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "END_DATE">=TO_DATE(' 2010-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Because the optimizer correctly estimated the cardinality of this query at 36,024, it knew to do a full table scan. Using an index would have been a bad idea.
However, what if the developer decides to use a fake date instead of NULL? What will happen then? I’ll update the data and gather statistics again:
SQL> update t 2 set end_date = 3 to_date( '01-jan-9999', 'dd-mon-yyyy' ) 4 where end_date is null; 35378 rows updated. SQL> commit; Commit complete. SQL> begin 2 dbms_stats.gather_table_ stats(user, 'T'); 3 end; 4 / PL/SQL procedure successfully completed.
Now when I run the query in Listing 6 again, I see the results in Listing 7.
Code Listing 7: Result of rerunning the Listing 6 query against fake data—and a bad plan
—————————————————————————————————————————————————————————————————————————— | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| —————————————————————————————————————————————————————————————————————————— | 0 | SELECT STATEMENT | | 175 | 18375 | 10 (0)| | 1 | TABLE ACCESS BY INDEX ROWID| T | 175 | 18375 | 10 (0)| |* 2 | INDEX RANGE SCAN | T_IDX | 175 | | 2 (0)| —————————————————————————————————————————————————————————————————————————— Predicate Information (identified by operation id): —————————————————————————————————————————————————————————————————————————— 1 - filter("END_DATE"<=TO_DATE(' 2010-09-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "END_DATE">=TO_DATE(' 2010-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
The optimizer has incorrectly estimated the cardinality for this predicate, leading it to think an index would be appropriate when it is not. This is all due to the fake values.
Here is another reason to avoid fake values. In Washington DC, a car owner decided to get a vanity (personalized ) license plate for his car. The license plate the owner chose was NO TAGS. The owner chose it as a joke, but guess what value the programmers at the Department of Motor Vehicles (DMV) had decided to use (instead of NULL) to represent a missing license plate value? The title of the news article (yhoo.it/NRUisL) about this is “DC Man’s ‘NO TAGS’ Vanity Plate Earns Him $20,000 in Tickets.”
Easy to fix, you might say: just don’t get NO TAGS as a license plate. Well, not so fast. Another motorist, Nick Vautier, wanted his initials—NV—as his tag. In California, where he lives, the DMV had decided to use NV, for “not visible,” as its “no tags” value.
Alabama uses XXXXXXX as its “no tags” value, and a news article about the car owner—Scottie Roberson—who got that license plate, mentioned a possible solution: “Birmingham Traffic Engineer Gregory Dawkins says the city may change the system to keep Roberson from receiving more tickets. He says “maybe we just need to leave that part blank altogether.’”
See bit.ly/OgmnJq for other examples. There is no good default value to represent data that is missing. Just use NULL.
Tom Kyte is a database evangelist in Oracle’s Server Technologies division and has worked for Oracle since 1993. He is the author of Expert Oracle Database Architecture (Apress, 2005, 2010) and Effective Oracle by Design (Oracle Press, 2003), among other books.
Send us your comments