How to XQuery Non-JDBC Sources from JDBC


By Yuli Vasiliev


Learn to use XQuery from within JDBC code to access data in non-JDBC sources.

Published March 2011

It has been possible to use XQuery, the query and programming language for manipulating XML data, from JDBC code since the release of Oracle Database 10g Release 2, when XQuery first appeared in Oracle XML DB. However, using the XQuery and XML DB features that first became available with Oracle Database 11g Release 2 -- like extension-expression pragmas and indexing unstructured XML content – can now significantly improve the efficiency of your JDBC code making XQuery queries.

This article provides some examples of using XQuery from within JDBC, accessing data in non-JDBC sources. It also covers the XQuery API for Java (XQJ; JSR 225), an alternative to JDBC  that provides a standard Java API for interacting with an XQuery engine.

Before You Start

To follow the article examples, you’ll need access to an Oracle 11g Release 2 database. This release of Oracle Database comes with the JDBC drivers compatible with JDK 1.5. Moreover, the JDBC Thin driver that will be used in the examples, as well as the JDBC OCI driver, support JDK 1.6 (although it’s not required here). So, what you’ll need to do is to add the following jar files to the CLASSPATH environment variable:

ORACLE_HOME/jdbc/lib/ojdbc5.jar
ORACLE_HOME/jlib/orai18n.jar

The next step is to check out whether the Java compiler and the Java interpreter are available in your system so that you’re sure that the code you’ll be working with can be compiled and run. It’s important to note that Oracle Database comes with both Java compiler and Java interpreter. So, make sure you have ORACLE_HOME/jdk/bin added to the path environment variable. Then, from an operating system prompt, you can issue the commands Javac and Java, with a version option to make sure you’re using the right software.

Note that in Oracle Database 11g Release 1 and later, you can configure fine-grained access control to external network resources using the access control list (ACL) feature. By default, Oracle Database denies access to an external network resource. Since the examples used here illustrate using XQuery queries accessing network resources through their URLs, you’ll have to create an ACL allowing connection to those resources by the database user you’re going to use. For example, if you’re going to use the HR demonstration schema, connecting to the resources located on the localhost, then you might issue the following PL/SQL block utilizing the CREATE_ACL and ASSIGN_ACL procedures of the DBMS_NETWORK_ACL_ADMIN supplied PL/SQL package:

CONN /AS SYSDBA

BEGIN
DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(
  acl => 'localhost.xml',
  description => 'localhost ACL',
  principal => 'HR',
  is_grant => true,
  privilege => 'connect');
DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(
  acl => 'localhost.xml',
  host => 'localhost');
END;
/
COMMIT;


Once the above code has been completed, the connect privilege for localhost is granted to HR – meaning you can now use this schema to access the localhost resources through the UTL_TCP, UTL_HTTP, UTL_SMTP, and UTL_MAIL PL/SQL packages. It's important to emphasize that XQuery uses UTL_HTTP behind the scenes when it comes to querying network resources through the HTTP protocol.

Diagrammatically, this might look like the figure below:

 xquery-jdbc-f1 

XQuery-ing Web Documents

The need to query a non-JDBC source like a Web document, using its URL, is not the only case when you might want to turn to XQuery, of course. You can consult the “Using XQuery with Oracle XML DB” chapter in the Oracle XML DB Developer's Guide to learn about all the exciting possibilities that are open to you with Oracle XQuery. Just to summarize here, you can use XQuery to efficiently query, construct, and transform any data that can be expressed in XML, including relational data. For example, you might issue an XQuery statement against a relational source to transform relational data into XML.

The examples in this article, however, will illustrate how you might take advantage of the new Oracle XQuery features when querying Web documents such as XML and HTML, which by no means represent JDBC-accessible data sources.

At this point you might be wondering: If such documents are not JDBC-accessible, how can I access them from within JDBC code? To exclude possible misunderstandings here, it’s important to understand that although the documents you’re XQuery-ing may not be JDBC-accessible, the XQuery engine being used resides within an Oracle Database, which indeed represents a JDBC-accessible source.

In other words, your JDBC code knows nothing about the source it's querying -- just passing the XQuery statement to the XQuery engine that evaluates that statement and access the source specified in it. The XQuery engine, in turn, is known to your JDBC code, since it’s an integrated part of the Oracle Database, which is JDBC-accessible indeed. So, in this context, the database can be thought of as an intermediary between your JDBC code and sources it accesses through XQuery.

Schematically, this architecture might look like this:

xquery-jdbc-f2 

Whatever data source you query with Oracle XQuery from JDBC, you first have to obtain a JDBC connection to the Oracle database you want to use to process the XQuery statements being issued.

This is best understood by example. Suppose you want to query the following XML document accessible through http://localhost/bonuses.xml:

<?xml version="1.0" ?>
<employees>
 <employee>
  <empno>100</empno>
  <ename>Steven King</ename>
  <email>SKING</email>
  <bonus>3000</bonus>
 </employee>
 <employee>
  <empno>171</empno>
  <ename>William Smith</ename>
  <email>WSMITH</email>
  <bonus>2000</bonus>
</employee>
</employees>


Then, you might use the following JDBC code to access the above document:

import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;
class JDBCXQuery
{
 public static void main (String args[]) throws SQLException
 {
  OracleDataSource ods = new OracleDataSource();
  ods.setURL("jdbc:oracle:thin:hr/hr@localhost:1521/orcl11g");
  Connection conn = ods.getConnection();
  Statement stmt = conn.createStatement();
  String qry = "SELECT * FROM XMLTable("+
      "'for $i in $h/employees/employee " +
       "return $i '" +
       "PASSING xmlparse(document " +
       "httpuritype('http://localhost/bonuses.xml').getCLOB()) as \"h\" " +
       "COLUMNS ename VARCHAR2(45) PATH '/employee/ename', " +
       "        bonus NUMBER(10,2) PATH '/employee/bonus' )";
  ResultSet rs = stmt.executeQuery(qry);
  System.out.println("Bonuses:");
  while(rs.next())
       System.out.println(rs.getString(1) + ": $" + rs.getFloat(2));
 }
}


The above should produce the following output:

Bonuses:
Steven King: $3000.0
William Smith:  $2000.0


Multi-target XQuery Queries

The preceding example illustrated how you might use the XMLTable SQL function to decompose the results of an XQuery query into regular SQL rows, so that you can float them down with the next() method of a ResultSet instance. However, Oracle XQuery can be much more flexible than that.

For example, having got a virtual table generated by XMLTable, you can use it in a join query, linking its rows to the ones derived from regular relational tables. Here is an example:

import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;
class MultitargetXQuery
{
 public static void main (String args[]) throws SQLException
 {
  OracleDataSource ods = new OracleDataSource();
  ods.setURL("jdbc:oracle:thin:hr/hr@localhost:1521/orcl11g");
  Connection conn = ods.getConnection();
  Statement stmt = conn.createStatement();
  String qry = "SELECT d.department_name, b.ename, b.bonus FROM XMLTable("+
      "'for $i in $h/employees/employee " +
       "return $i '" +
       "PASSING xmlparse(document " +
       "httpuritype('http://localhost/bonuses.xml').getCLOB()) as \"h\" " +
       "COLUMNS empno NUMBER(6) PATH '/employee/empno', " +
       "        ename VARCHAR2(45) PATH '/employee/ename', " +
       "        bonus NUMBER(10,2) PATH '/employee/bonus' ) b," +
       "employees e," +
       "departments d " +
       "WHERE (e.employee_id = b.empno) AND (d.department_id=e.department_id)";
  ResultSet rs = stmt.executeQuery(qry);
  System.out.println("Bonuses:");
  while(rs.next())
       System.out.println("Department: " + rs.getString(1) +": " + rs.getString(2) + ": $" + rs.getFloat(3));
 }
}


As you can see in the above example, the virtual table generated by XMLTable participates in a join, along with the employees and departments relational tables from the HR demonstration schema. The join conditions are highlighted in bold. Examining the select list of this join query, however, you may notice that it doesn’t include any column of the employees table. That’s right: this table, in this particular example, acts as the intermediary between the rows generated by XMLTable and the departments’ rows. The output of this program should be the following:

Bonuses:
Department: Executive: Steven King: $3000.0
Department: Sales: William Smith:  $2000.0


Using Oracle XQuery Extension-Expression Pragmas

The examples discussed so far should have helped you to get a grasp on how XQuery can be used in JDBC. What they’ve not illustrated, however, are those new Oracle XQuery features, which were first introduced in Oracle Database 11g Release 2 and which might help you increase the efficiency of your JDBC code.

One of those features is extension-expression pragmas, which, among other things, provides the ability to treat an invalid XPath expression operating on XML schema-based data, as if its targeted node does not exist, not raising an invalid XPath exception and, therefore, making it unnecessary to catch this exception in JDBC code. The example discussed in this section will illustrate how you might take advantage of this new feature.

From the preceding example, you learned how you might intermix the rows generated as a result of an XQuery evaluation with the rows selected from relational tables, within a single SQL statement. In the following example, you’ll look at how you might intermix the rows generated by two different XQuery queries – each evaluated by a separate XMLTable. The target of the first XQuery query is the bonus.xml document used in the preceding examples, and the target of the second one is the XML schema-based XMLType table purchaseorder from the OE demonstration schema. So, the first thing you need to do is to grant the connect privilege for localhost to OE:

BEGIN
DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(
  acl => 'localhost.xml',
  principal => 'OE',
  is_grant => true,
  privilege => 'connect');
END;
/
COMMIT;


Now you can compile and run the following code, which includes, in one of the XQuery expressions, an XPath expression that targets a non-existent node in the purchaseorder document:

import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;
class XQueryPragmas
{
 public static void main (String args[]) throws SQLException
 {
  OracleDataSource ods = new OracleDataSource();
  ods.setURL("jdbc:oracle:thin:oe/oe@localhost:1521/orcl11g");
  Connection conn = ods.getConnection();
  Statement stmt = conn.createStatement();
  String qry = "SELECT e.ename, po.reference FROM XMLTable("+
      "'for $i in $h/employees/employee " +
       "return $i '" +
       "PASSING xmlparse(document " +
       "httpuritype('http://localhost/bonuses.xml').getCLOB()) as \"h\" " +
       "COLUMNS ename VARCHAR2(25) PATH '/employee/ename', " +
       "        email VARCHAR2(25) PATH '/employee/email') e, " +
       "purchaseorder p, " +
       "XMLTable(" +
       "'for $i in $po/PurchaseOrder " +
       "return $i ' " +
       "PASSING OBJECT_VALUE as \"po\" " +
       "COLUMNS reference VARCHAR2(30) PATH '/PurchaseOrder/Reference', " +
       "        usr VARCHAR2(25) PATH '/PurchaseOrder/User', " +
       "        orddt VARCHAR2(25) PATH '/PurchaseOrder/Date') po " +
     "WHERE po.usr = e.email " +
     "ORDER BY po.orddt"; 
  ResultSet rs = stmt.executeQuery(qry);
  while(rs.next())
       System.out.println("User: " + rs.getString(1) + "PORef" + rs.getString(2));
 }
}


Unsurprisingly, because the /PurchaseOrder/Date node does not conform to the XML schema associated with the purchaseorder XML document and, therefore, may not exist in this document, the above code will end up with the following error:

ORA-19276: XPST0005 – XPath step specifies an invalid element/attribute name: (Date)


To fail gracefully, you might of course, create an exception block in the program to catch such errors. However Oracle XQuery provides another solution to this problem, without JDBC code having to generate any exception. In particular, you can put pragma #ora:invalid_path empty #  right before a questionable XPath expression. So, the updated query string would look like this:

...
  String qry = "SELECT e.ename, po.reference FROM XMLTable("+
      "'for $i in $h/employees/employee " +
       "return $i '" +
       "PASSING xmlparse(document " +
       "httpuritype('http://localhost/bonuses.xml').getCLOB()) as \"h\" " +
       "COLUMNS ename VARCHAR2(25) PATH '/employee/ename', " +
       "        email VARCHAR2(25) PATH '/employee/email') e, " +
       "purchaseorder p, " +
       "XMLTable(" +
       "'for $i in $po/PurchaseOrder " +
       "return $i ' " +
       "PASSING OBJECT_VALUE as \"po\" " +
       "COLUMNS reference VARCHAR2(30) PATH '/PurchaseOrder/Reference', " +
       "        usr VARCHAR2(25) PATH '/PurchaseOrder/User', " +
       "        orddt VARCHAR2(25) PATH '(#ora:invalid_path empty #) {/PurchaseOrder/Date}') po " +
     "WHERE po.usr = e.email " +
     "ORDER BY po.orddt"; 
...


Now, if you recompile XQueryPragmas and then run it again, this should produce the following output: 

User: Steven King   PORef:  SKING-20021009123337503PDT
User: William Smith PORef:  WSMITH-20021009123335450PDT
User: Steven King   PORef:  SKING-20021009123337703PDT
User: Steven King   PORef:  SKING-20021009123338294PDT
User: Steven King   PORef:  SKING-20021009123337974PDT
User: Steven King   PORef:  SKING-20021009123336392PDT
User: Steven King   PORef:  SKING-20021009123336131PDT
User: Steven King   PORef:  SKING-20021009123336822PDT
User: Steven King   PORef:  SKING-20021009123336622PDT
User: Steven King   PORef:  SKING-20021009123336321PDT
User: Steven King   PORef:  SKING-20021009123335560PDT
User: Steven King   PORef:  SKING-20021009123336952PDT
User: Steven King   PORef:  SKING-20021009123337153PDT
User: William Smith PORef:  WSMITH-20021009123337924PDT
User: William Smith PORef:  WSMITH-20021009123338154PDT
User: William Smith PORef:  WSMITH-20021009123336412PDT
User: William Smith PORef:  WSMITH-20021009123335751PDT
User: William Smith PORef:  WSMITH-20021009123335741PDT
User: William Smith PORef:  WSMITH-20021009123335650PDT
User: Steven King   PORef:  SKING-20021009123337383PDT


The important thing to note here is that the Oracle XQuery extension-expression pragmas in general, and #ora:invalid_path empty # discussed in this example in particular, are not bound to a certain function – say, XMLTable – and can be used with the others functions performing XQuery expression evaluation. For example, the SELECT statement discussed here could be rewritten in a way that the second appearance of XMLTable (whose XQuery expression argument targets the purchaseorder document) is omitted in favor of a series of the XMLCast(XMLQuery(…)) combinations. However, you still would be able to use the pragma. This is how the implementation of the query’s ORDER BY clause based on XMLCast(XMLQuery(…)) might look like:

ORDER BY XMLCast(XMLQuery('(#ora:invalid_path empty #){$p/PurchaseOrder/Date}' 
  PASSING OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(30))


Indexing

XMLIndex, first introduced in Oracle Database 11g Release 1 and enhanced in Release 2, now supports structured as well as unstructured components. Using XMLIndex with an unstructured component can be especially powerful for indexing Web documents, because in this case the domain of information that has little structure is very large. On the other hand, a structured component can be useful when you encounter document-centric data containing structured islands, organizing such islands in a relational format. For example, a Web document representing an article may contain metadata such as title, author, pubDate, which can be considered relational islands inside that document and, therefore, can be profitably organized in a relational format.

To see XMLIndex in action, let’s walk through a simple example. The first thing we will do is create a new database schema and grant all the required privileges to it:

CONN /as sysdba

CREATE USER usr IDENTIFIED BY usr;        

GRANT connect, resource TO usr;     

BEGIN
DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(
  acl => 'localhost.xml',
  principal => 'USR',
  is_grant => true,
  privilege => 'connect');
END;
/
COMMIT;


The next step is to create a table to be indexed, in the newly created schema:

CONN usr/usr

CREATE TABLE otn_xml(
  link VARCHAR2(200) PRIMARY KEY,
  item XMLType);


As you can see, the otn_xml table created here contains an XMLType column, which you can index with XMLIndex. Before doing that, though, it would be interesting to look at the execution statistics of a query issued against unindexed XML data stored in the table, so that you’ll be able to compare it with the statistics generated for this same query when the data has been indexed. Thus, you first have to populate the otn_xml table with data. This can be done with the following JDBC code:

import java.sql.*;
import oracle.jdbc.*;
import oracle.jdbc.pool.OracleDataSource;
class XMLIndexXQuery
{
 public static void main (String args[]) throws SQLException
 {
  OracleDataSource ods = new OracleDataSource();
  ods.setURL("jdbc:oracle:thin:usr/usr@localhost:1521/orcl11g");
  Connection conn = ods.getConnection();
  Statement stmt = conn.createStatement();
  String qry =
      "INSERT INTO otn_xml(item, link) "+
      "SELECT * FROM XMLTable(" +
        "'for  $i in $h/rss/channel/item "+
        "return
{$i/title, $i/link, $i/pubDate, $i/description}' " +
        "PASSING xmlparse (document httpuritype
('http://feeds.delicious.com/v2/rss/OracleTechnologyNetwork/otntecharticle').getCLOB()) as \"h\" "+
        "COLUMNS item XMLType PATH '/', "+
             "link VARCHAR2(200) PATH '/item/link') "+
      "WHERE link NOT IN (SELECT link FROM otn_xml)";
  int rows = stmt.executeUpdate(qry);
  conn.commit();
  System.out.println("Number of rows inserted now is: "+ rows);
 }
}


Since the above JDBC program accesses an Internet resource (the OTN - New Articles RSS document), you must be connected to the Internet during program execution. The program extracts the crucial information about the latest OTN articles from the RSS document and put it into the otn_xml table created as discussed earlier. You can safely run the program any number of times – it won’t save the information about the same article twice, not will it raise a primary key violation error. This is because the condition specified in the WHERE clause of the subquery in the INSERT statement allows only those rows that are not already in the otn_xml table.

Now that the table is populated, you can go ahead and query it. To play with queries, turn back to your SQL prompt tool and connect as usr/usr. Assuming you’re using SQL*Plus, set the AUTOTRACE variable on explain to view the execution statistics of the statements being issued, as follows:

SET AUTOTRACE ON EXPLAIN


Then, issue a query that targets a single record in the otn_xml table, searching for a certain article title being extracted from the XML data stored in the item XMLType column. Such a query might look like this:

SELECT XMLCast(XMLQuery('$t/item/title' PASSING o.ITEM AS "t" RETURNING CONTENT) AS VARCHAR2(200)) FROM otn_xml o 
 WHERE XMLExists('$t/item[title="Enhancing Oracle Database Performance with Flash Storage"]' PASSING o.item AS "t");


Here is the output with the execution plan included:

XMLCAST(XMLQUERY('$T/ITEM/TITLE'PASSINGO.ITEMAS"T"RETURNINGCONTENT)ASVARCHAR2(20
--------------------------------------------------------------------------------
Enhancing Oracle Database Performance with Flash Storage  

Execution Plan
----------------------------------------------------------                     
Plan hash value: 4149380736                                                    
                                                                               
--------------------------------------------------------------------------------------------------------------

| Id  | Operation                           | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                    |                        |    15 | 30030 |     9  (12)| 00:00:01 |

|*  1 |  FILTER                             |                        |       |       |            |          |
 
|   2 |   TABLE ACCESS FULL                 | OTN_XML                |    15 | 30030 |     3   (0)| 00:00:01 |

|   3 |   NESTED LOOPS                      |                        |     2 |     8 |     6  (17)| 00:00:01 |

|   4 |    COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE |     1 |     2 |     2   (0)| 00:00:01 |

|*  5 |    COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE |     2 |     4 |     4  (25)| 00:00:01 |

--------------------------------------------------------------------------------------------------------------


Now, let’s create XMLIndex on the item XMLType column of the otn_xml table.

 CREATE INDEX otn_idx ON otn_xml(item) INDEXTYPE IS XDB.XMLIndex;


Re-running this same query, you should see the following execution plan:

Execution Plan
----------------------------------------------------------                     
Plan hash value: 3507075359                                                    
                                                                                
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT              |                             |     1 |  3034 |     4  (25)| 00:00:01 |
 
|   1 |  SORT GROUP BY                |                             |     1 |  3524 |            |          |

|*  2 |   TABLE ACCESS BY INDEX ROWID | SYS74798_OTN_IDX_PATH_TABLE |     1 |  3524 |     2   (0)| 00:00:01 |

|*  3 |    INDEX RANGE SCAN           | SYS74798_OTN_IDX_PIKEY_IX   |     1 |       |     1   (0)| 00:00:01 |

|   4 |  NESTED LOOPS                 |                             |     1 |  3034 |     4  (25)| 00:00:01 |

|   5 |   SORT UNIQUE                 |                             |     1 |  3022 |     2   (0)| 00:00:01 |

|*  6 |    TABLE ACCESS BY INDEX ROWID| SYS74798_OTN_IDX_PATH_TABLE |     1 |  3022 |     2   (0)| 00:00:01 |

|*  7 |     INDEX RANGE SCAN          | SYS74798_OTN_IDX_VALUE_IX   |     1 |       |     1   (0)| 00:00:01 |

|   8 |   TABLE ACCESS BY USER ROWID  | OTN_XML                     |     1 |    12 |     1   (0)| 00:00:01 |

-------------------------------------------------------------------------------------------------------------

Note that the above execution plan shows almost no improvement as compared with the one generated for this same query accessing the unindexed data, shown previously. The fact is that, by default, an XMLIndexis created with an unstructured component; structured component must be included explicitly. You can add a structured component to an existing XMLIndex by registering a new parameter representing that structured component and then altering the index, adding the registered parameter. In our example, though, let’s recreate the index so that it includes only a structured component:

DROP INDEX otn_idx;

CREATE INDEX otn_idx ON otn_xml(item) INDEXTYPE IS XDB.XMLIndex
PARAMETERS (
'XMLTable otn_ptab ''/item''
COLUMNS
   title VARCHAR2(200) PATH ''title'',
   pubDate DATE PATH ''pubDate''');

As you might guess, the above index targets the title and pubDate nodes under the item root node, which will speed up access to the content of those nodes.

Now if you rerun the query introduced in this section earlier, you should note a significant improvement in the execution plan generated:

Execution Plan
----------------------------------------------------------  
Plan hash value: 296194532  
-------------------------------------------------------------------------------------------------------        
| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |          
-------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT             |                        |     3 |   378 |     2   (0)| 00:00:01 |    
                          
|   1 |  SORT AGGREGATE              |                        |     1 |   114 |            |          |      

|   2 |   TABLE ACCESS BY INDEX ROWID| OTN_PTAB               |     1 |   114 |     1   (0)| 00:00:01 | 

|*  3 |    INDEX RANGE SCAN          | SYS74802_74803_RID_IDX |     1 |       |     1   (0)| 00:00:01 |

|   4 |  NESTED LOOPS SEMI           |                        |     3 |   378 |     2   (0)| 00:00:01 |
 
|   5 |   INDEX FAST FULL SCAN       | SYS_C0011149           |    15 |   180 |     2   (0)| 00:00:01 | 
    
|*  6 |   TABLE ACCESS BY INDEX ROWID| OTN_PTAB               |     1 |   114 |     0   (0)| 00:00:01 | 

|*  7 |    INDEX RANGE SCAN          | SYS74802_74803_RID_IDX |     1 |       |     0   (0)| 00:00:01 |

------------------------------------------------------------------------------------------------------- 

The following query uses the LIKE condition in the WHERE clause to perform a test involving pattern matching within the /item/description node:

SELECT x.title, x.description FROM otn_xml o, XMLTable(
      '/item'
       PASSING o.item
       COLUMNS title VARCHAR2(200) PATH 'title',
       description VARCHAR2(2000) PATH 'description') x
WHERE x.description LIKE '%Architect%';


The output produced might look like this:

TITLE                                                                           
--------------------------------------------------------------------------------
DESCRIPTION                                                                    
--------------------------------------------------------------------------------
Agile Enterprise Architecture                                                   
Can enterprise architecture and agile development play in the same sandbox?
Voices from the community weigh in in the Architect Community column in the Nov-Dec 2010 issue of Oracle Magazine.

Execution Plan
----------------------------------------------------------                     
Plan hash value: 3907039892                                                    
                                                                               
-------------------------------------------------------------------------------------------------------------          
| Id  | Operation                          | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |          
-------------------------------------------------------------------------------------------------------------   
|   0 | SELECT STATEMENT                   |                        |  6126 |    11M|   722  (44)| 00:00:09 |  
|   1 |  NESTED LOOPS                      |                        |  6126 |    11M|   722  (44)| 00:00:09 |           
|   2 |   TABLE ACCESS FULL                | OTN_XML                |    15 | 30030 |     3   (0)| 00:00:01 |  
|*  3 |   COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE |   408 |   816 |    48  (44)| 00:00:01 |  
-------------------------------------------------------------------------------------------------------------  


Examining the above execution plan, you can probably guess that the XMLIndex you’ve got doesn’t help here. This is fairly predictable since that index doesn’t include an unstructured component at the moment, which can be particularly effective in such searches. So, let’s add an unstructured component to the index:

ALTER INDEX otn_idx PARAMETERS('PATH TABLE path_tab');


Then, rerun the query to check to see if there is any improvement. Here is the execution plan you should see now:

Execution Plan
----------------------------------------------------------   
Plan hash value: 445613365     
       
-----------------------------------------------------------------------------------------------------------  
  
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |    
  
----------------------------------------------------------------------------------------------------------- 
 
|   0 | SELECT STATEMENT              |                           |     1 |  1534 |     6   (0)| 00:00:01 | 
 
|*  1 |  FILTER                       |                           |       |       |            |          |   
  
|*  2 |   TABLE ACCESS BY INDEX ROWID | PATH_TAB                  |     1 |  3524 |     2   (0)| 00:00:01 |      
  
|*  3 |    INDEX RANGE SCAN           | SYS74802_OTN_IDX_PIKEY_IX |     1 |       |     1   (0)| 00:00:01 |         
 
|*  4 |  FILTER                       |                           |       |       |            |          |   
  
|*  5 |   TABLE ACCESS BY INDEX ROWID | PATH_TAB                  |     1 |  3524 |     2   (0)| 00:00:01 |         
 
|*  6 |    INDEX RANGE SCAN           | SYS74802_OTN_IDX_PIKEY_IX |     1 |       |     1   (0)| 00:00:01 |     
 
|*  7 |  FILTER                       |                           |       |       |            |          |  
 
|   8 |   NESTED LOOPS                |                           |     1 |  1534 |     4   (0)| 00:00:01 | 
 
|*  9 |    TABLE ACCESS FULL          | PATH_TAB                  |     1 |  1522 |     3   (0)| 00:00:01 |   
 
|  10 |    TABLE ACCESS BY USER ROWID | OTN_XML                   |     1 |    12 |     1   (0)| 00:00:01 | 
 
|* 11 |   FILTER                      |                           |       |       |            |          | 
 
|* 12 |    TABLE ACCESS BY INDEX ROWID| PATH_TAB                  |     1 |  3524 |     2   (0)| 00:00:01 | 
  
|* 13 |     INDEX RANGE SCAN          | SYS74802_OTN_IDX_PIKEY_IX |     1 |       |     1   (0)| 00:00:01 | 
 
----------------------------------------------------------------------------------------------------------- 


From the above, performance improvement should be evident.

XQJ

XQJ is a Java API that enables XQuery queries. XQJ allows you to execute XQuery queries, bind data to them, and then process their results. XQJ is similar to JDBC in many ways, but is used to process XQuery queries only. Oracle XML Developer's Kit (Oracle XDK) supports XQJ, providing you the packages to build XQJ programs. These packages can be found in the following Oracle XDK files: xdk\lib\xqjapi.jar, xdk\lib\xqjori.jar. So, these files must be included to the classpath when compiling a XQJ program. During execution, though, some more JARs may be necessary.

Consider the following example in which the XQJ API is used to connect to the OTN - New Articles RSS document (this same document was used in the preceding example) and then extract those article titles that include the ‘Oracle Database’ substring.

import javax.xml.xquery.*;
 import javax.xml.stream.*;
 import javax.xml.namespace.QName;
class XQJExtSource
 {
  public static void main (String args[]) throws Exception
  {
     XQDataSource xds = new oracle.xquery.xqj.OXQDataSource();
     XQConnection conn = xds.getConnection();
    XQPreparedExpression pEx = conn.prepareExpression(
       "declare variable $doc external;"+
       "for $c in $doc//item "+
       "where fn:contains($c/title,'Oracle Database') "+
       "return $c/title"
     );
    java.net.URL doc = new java.net.URL("http://feeds.delicious.com/v2/rss/OracleTechnologyNetwork/otntecharticle");
     java.io.InputStream  inpt=doc.openStream();;
    pEx.bindDocument(new javax.xml.namespace.QName("doc"), inpt, null, null);
    XQResultSequence rslt = pEx.executeQuery();
     while (rslt.next()){
       System.out.println(rslt.getAtomicValue());
     }
     conn.close(); 
     inpt.close();
  }
 }


Compilation of the above program with the xdk\lib\xqjapi.jar, xdk\lib\xqjori.jar files included to the classpath still may end up with an error, since neither of these files contains the javax.xml.stream.XMLStreamReader class required when the compilation process comes to the XQPreparedExpression.bindDocument() method (at least as of the XDK coming with Oracle Database 11.2.0.1.0). Although the bindDocument() method used in this particular example employs an java.io.InputStream value as the second parameter, the absence of javax.xml.stream.XMLStreamReader makes the compiler generate an error. If the JDK version your database comes with is earlier than 6.0, you won’t find this class in the JDK jars either. The fact is that the XMLStreamReader class belongs to Streaming API for XML (StAX), which ships as part of Java Standard Edition 6 runtime, not part of earlier versions. So, you must have a copy of JDK 6 or at least JRE 6 on your system. You can download it from the Java SE Downloads page at http://www.oracle.com/technetwork/java/javase/downloads/index.html. You’ll find javax.xml.stream.XMLStreamReader in jre\lib\rt.jar.

At runtime, the above program must also have access to the following Oracle XDK files: xquery.jar and xmlparserv2.jar. Also, orai18n-collation.jar from the ORACLE_HOME\jlib directory must be included.

The output generated by the above program might look like this (depends on the actual content of the RSS page, of course):

Enhancing Oracle Database Performance with Flash Storage
 Enhancing Oracle Database Performance with Sun Flash Storage


It’s important to emphasize that the above example illustrated how you might access a Web document through its URL. In contrast, accessing an XML document stored in the local file system or Oracle XML DB Repository does not require you to declare that document as external and then use the XQPreparedExpression.bindDocument() method to bind it to the query. Instead, you can just use the fn:doc XQuery function within the query itself.

Thus, rewriting the preceding program to access the OTN - New Articles RSS document downloaded to the /home/myfiles directory on the local file system and saved as otntecharticle.xml, you might produce the following code:

import javax.xml.xquery.*;
class XQJFileSource
 {
  public static void main (String args[]) throws Exception
  {
     XQDataSource xds = new oracle.xquery.xqj.OXQDataSource();
     XQConnection conn = xds.getConnection();
    XQPreparedExpression pEx = conn.prepareExpression(
       "for $c in fn:doc('/home/myfiles/otntecharticle.xml')//item "+
       "where fn:contains($c/title,'Oracle Database') "+
       "return $c/title"
     );
    XQResultSequence rslt = pEx.executeQuery();
     while (rslt.next()){
       System.out.println(rslt.getAtomicValue());
     }
     conn.close(); 
   }
 }


Conclusion

As you learned in this article, the advantage of using XQuery in JDBC is that you get the ability to access a wide range of different sources, including those considered normally non-JDBC sources. Thus, having got just a URL, you can issue an XQuery query accessing a corresponding HTML, XML, RSS, or any other document on the Web.

Also, you learned that SQL\JDBC is not the only way to enable XQuery queries in Java. Alternatively, you might use XQJ, which makes it possible to issue XQuery queries directly, without an SQL wrapper.


Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open source development, Java technologies, business intelligence (BI), databases, and service-oriented architecture (SOA). He is the author of Oracle Business Intelligence: The Condensed Guide to Analysis and Reporting (Packt, 2010) as well as a series of other books on the Oracle technology.