by Yuli Vasiliev
Published September 2013
Can Linux utilities be considered database tables? Apparently, they can—with the preprocessor directive feature available for external tables in Oracle Database 11g. It works as follows. You wrap an operating system utility in a bash script, specifying the script as the preprocessor program when creating an external table. Then, to invoke the utility and get the generated results as table rows, you just query the external table within an SQL statement. This is done in a manner similar to querying a regular relational table. In other words, this approach allows you to have operating system data structured in tables, just like other database data. You might ask: what advantages does it offer? The main advantage is that you can easily join operating system data to database data—all within a single SQL query.
It's interesting to note that the approach above is not the only one you can use to launch operating system utilities from within the database. For example, with the CREATE_PROGRAM procedure of DBMS_SCHEDULER, you can have an external executable be launched on some schedule or directly from within PL/SQL code.
To follow the examples provided in this article, you'll need a Linux operating system and Oracle Database installed on it. If you don't have Oracle Database installed on your Linux system (or you don't have Linux installed), the simplest and quickest way to prepare your working environment is to install Oracle VM VirtualBox (it's available for Microsoft Windows, Mac OS X, Oracle Solaris, and Linux), and then install a virtual guest appliance that provides preconfigured Oracle software within it. This appliance contains Oracle Linux as the operating system and is preconfigured with Oracle Database, Enterprise Edition 11g Release 2, which is ready to be used right away.
In Oracle Database, the external tables feature can be used as a data loading tool, providing a good alternative to SQL Loader. Also, you can use external tables to query operating system data, filling an external table with the output generated by an operating system utility. This approach is based on using the preprocessor directive feature available for external tables, starting with Oracle Database 11g. In a nutshell, you write a shell script that invokes a system utility and then specify that script as the preprocessor program in an external table. The general steps are as follows:
Figure 1 shows how the process of querying the operating system with external tables might look:
Figure 1: General steps for querying the operating system with an external table.
This approach might work for a number of Linux utilities, including those that show disk space and system information, as well as those that perform monitoring/debugging, and so on. Thus, you might easily use this approach with commands such as
uname, and so on. Please note, however, that in most cases, when it comes to performing database-related tasks, you may avoid using operating system utilities directly and instead rely on the database SQL commands and database objects.
The following example illustrates how you might use an external table to wrap up the input of the
ls command issued against a Database File System file system containing files stored in an Oracle database.
Note: The Database File System is an Oracle Database 11g feature that provides a standard file system interface to store and access files in the database. These files are stored in a database table as SecureFiles LOBs. At the same time, since the file system is mounted on a regular mount point, you can access it like any other file system in Linux. Details of Database File System are out of the scope of this article. For a brief introduction, see the "Introducing the Oracle Database File System" section in the Oracle Database SecureFiles and Large Objects Developer's Guide.
This example assumes you have a DBFS file system created and mounted in your Linux system. If not, you can do that now by following the steps below (these are the general steps; for details, refer to "DBFS File System Client").
dbfs_create_filesystem.sqlscript to create the file system.
kernel-develpackage and the
FUSEpackage are installed (this step is required on Linux).
After completing the above steps, you can work with the DBFS file system that was created and mounted. The first step might be to copy some files to the file system.
Next, you might enter the root directory of the file system and then issue the
ls command to make sure that this is like any other file system in your system:
$ cd /mnt/dbfs/staging_area $ ls -a -l -X total 9 -rw-r--r-- 1 oracle oracle 1110 Jul 4 21:48 test01.bmp -rwxrwxrwx 1 oracle oracle 1398 Jul 4 21:49 test02.bmp -r--r----- 1 oracle oracle 1765 Jul 4 21:52 test01.java -rw-r--r-- 1 oracle oracle 71962 Jul 4 21:50 test01.jpg -rwxrwxrwx 1 oracle oracle 10197 Jul 4 21:50 test02.jpg -rw-r--r-- 1 oracle oracle 9546 Jul 4 21:51 test04.jpg drwxr-xr-x 7 root root 0 Jul 4 21:28 .sfs -rw-r--r-- 1 oracle oracle 477 Jul 4 21:52 test01.xml
Note the use of the
-X parameter, which instructs
ls to sort the output alphabetically by file extension.
Now you can create an external table that will hold the output above as if it were relational data. Here are the steps to follow:
$ mkdir /home/oracle/ext_tbl_dir
ex_ls.sh shellscript with the following content:
#!/bin/bash /bin/ls /mnt/dbfs/staging_area -a -l -X
# chmod +x /home/oracle/ext_tbl_dir/ex_ls.sh
CONNECT /AS SYSDBA;
ex_ls.shscript, in this particular example) is located.
CREATE OR REPLACE DIRECTORY my_dir AS '/home/oracle/ext_tbl_dir';
my_dirdirectory to the
dbfs_user1schema (here, you're using the same user to interact with the DBFS file system and the external table):
GRANT READ,WRITE,EXECUTE ON DIRECTORY my_dir TO dbfs_user1;
CREATE TABLE ls_tbl( file_priv VARCHAR2(11), file_links NUMBER, file_owner VARCHAR2(25), file_owner_gr VARCHAR2(25), file_sz NUMBER, file_month VARCHAR2(3), file_day NUMBER, file_tm VARCHAR2(6), file_nm VARCHAR2(25)) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY my_dir ACCESS PARAMETERS ( records delimited by newline preprocessor my_dir:'ex_ls.sh' skip 1 fields terminated by whitespace ldrtrim ) LOCATION(my_dir:'ex_ls.sh') ) /
After completing the above steps, you can issue a query against the
ls_tbl external table to make sure that everything is OK so far:
SELECT file_nm "file", file_priv "privileges", file_owner "owner", file_sz "size" FROM ls_tbl WHERE file_sz>0;
The output generated might look like this:
file privileges owner size -------------------- ----------- --------------- ---------- test01.bmp -rw-r--r-- oracle 1110 test02.bmp -rwxrwxrwx oracle 1398 test01.java -r--r----- oracle 1765 test01.jpg -rw-r--r-- oracle 71962 test02.jpg -rwxrwxrwx oracle 10197 test04.jpg -rw-r--r-- oracle 9546 test01.xml -rw-r--r-- oracle 477
As you can see, the row order specified by the
-X parameter of the
ls command remains the same in the table query output above. To recap, using
ls sort the outputted files alphabetically by extension. Doing this by means of SQL would be still possible, of course, but a bit tricky.
Wherever the data in an external table has come from, you can query it as if it were regular relational data, joining it, if necessary, with some other data that you can obtain from within the database. Turning back to the example in the preceding section, you might want to consolidate the external table data and the data available via the DBFS_CONTENT view.
Schematically, this might look like Figure 2:
Figure 2: Joining the data obtained from an external table and the DBFS_CONTENT view.
The following query provides an example of such a join:
SELECT e.file_nm, e.file_owner, dbms_lob.getlength(d.filedata) FROM ls_tbl e, dbfs_content d WHERE e.file_nm = substr(d.pathname,instr(d.pathname,'/',-1,1)+1,length(d.pathname));
The output generated by the query above should look like this:
FILE_NM FILE_OWNER DBMS_LOB.GETLENGTH(D.FILEDATA) -------------------- --------------- ------------------------------ test01.bmp oracle 1110 test02.bmp oracle 1398 test01.java oracle 1765 test01.jpg oracle 71962 test02.jpg oracle 10197 test04.jpg oracle 9546 .sfs root test01.xml oracle 477 8 rows selected.
The example above illustrates how you might join the output generated by a Linux utility with regular SQL data. Actually, you might use this same approach to join the operating system data to any data available from within the database, including the data dictionary. An example of joining
df to DBA_DATA_FILES can be found in Oracle Magazine's article "On External Table Queries, Data Consistency, and Nothing," by Tom Kyte.
As mentioned, using the preprocessor feature of an external table is not the only way to launch an operating system utility from within the database. Thus, you might use the CREATE_PROGRAM procedure of DBMS_SCHEDULER to wrap an external executable in a program that might be then called from within the database. Moreover, you could specify some schedule on which the program will be executed automatically.
The following example illustrates how you might define a scheduler program that wraps a shell script that invokes the
ls command and saves the output into an external text file. Then, you'll be able to define an external table that derives its data from that text file, thus making the
ls output available in the form of relational data. Figure 3 gives a high-level view of how it works.
Figure 3: Launching a Linux utility with the help of DBMS_SCHEDULER.
The following steps walk you through how to create such a scheduler program wrapping a shell script, as well as how to create an external table upon the text file to which the shell script saves its output:
#!/bin/bash /bin/ls /mnt/dbfs/staging_area -a -l -X > /home/oracle/ext_tbl_dir/flist.txt
CONNECT /AS SYSDBA;
BEGIN DBMS_SCHEDULER.CREATE_PROGRAM( program_name => 'sys.ls_to_file', program_type => 'executable', program_action => '/home/oracle/ext_tbl_dir/ls_to_file.sh', enabled => TRUE); DBMS_SCHEDULER.CREATE_JOB( job_name => 'sys.ls_job', program_name => 'sys.ls_to_file', start_date => '15-JUL-13 1.00.00AM US/Pacific', repeat_interval => 'FREQ=DAILY;BYHOUR=1;BYMINUTE=0', enabled => TRUE); END; /
BEGIN DBMS_SCHEDULER.RUN_JOB('sys.ls_job'); END; /
After this step, you should have the
flist.txt file containing the
ls_to_file.sh script output.
CREATE TABLE ls_ext_tbl( file_priv VARCHAR2(11), file_links NUMBER, file_owner VARCHAR2(25), file_owner_gr VARCHAR2(25), file_sz NUMBER, file_month VARCHAR2(3), file_day NUMBER, file_tm VARCHAR2(6), file_nm VARCHAR2(25)) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY my_dir ACCESS PARAMETERS ( records delimited by newline skip 1 fields terminated by whitespace ldrtrim ) LOCATION('flist.txt') ) /
SELECT count(*) FROM ls_ext_tbl; COUNT(*) ---------- 10
It's important to note that the
ls_job job created in Step 3 will be executed automatically on the schedule specified. So, launching the job manually is possible but not required.
This article illustrated Oracle Database features that can be useful for Linux administrators, including how to use external tables to query operating system data and then join that data with the database data. It also covered how to launch Linux utilities from within PL/SQL code, using the DBMS_SCHEDULER package.
Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open source development, Java technologies, business intelligence (BI), databases, service-oriented architecture (SOA) and, more recently, virtualization. He is the author of a series of books on Oracle technology, the most recent one being Oracle Business Intelligence: An Introduction to Business Analysis and Reporting (Packt, 2010).
|Revision 1.0, 08/21/2013|