The Oracle + PHP Cookbook

Easy Trees with Oracle and PHP
by Nick Bollweg

Use a few elegant queries and functions to make working with your hierarchical data a breeze.

Downloads for this article:
 Sample code and listings
 Oracle Database 10g Express Edition
 Oracle Instant Client
 Oracle JDeveloper Extension for PHP

Published December 2005

Almost every data-driven application relies on some form of hierarchical data of varying complexity: products in categories, messages in folders, employees in departments. Of course, you will need to display this data at some point to yield a catalog, inbox, or organization chart. With Oracle's vendor-specific SQL extensions and PHP's uncanny knack for handling arrays, you can retrieve and display a tree that is inherently highly optimized in a way that's concise and easy to maintain.

Because the queries and functions discussed here will be less process-intensive and create more readable code, this HowTo is useful at implementation time as well as when refactoring existing code. If your data has tree-like data shapes (currently displayed or yet to be tapped for their value), this HowTo will be valuable. Users of the latest and greatest RDBMS are in luck as new features make some tricky hierarchical tasks even easier, although versions since Oracle8i have the basic underlying functionality.

Know Your Data

The basic problem is that most users want to use and display data stored in flat tables in a more meaningful way. Some of the most common data shapes of this kind are:

  • Taxonomies: Kingdom, Phylum, Class, or Country, City, County, State
  • Genealogies: Grandparent, Parent, Child
  • Organizations: President, Manager, Employee, or Category, Subcategory, Item, Sub-item
While the values in and position of each standard query's result row refer to only that row, a hierarchical query returns a result whose rows have a location in a tree structure. To tease this structural information out of non-hierarchical results, it is necessary to walk through each of the values, inspecting and building another data structure along the way. Avoiding this process, while letting Oracle do what it does best, cuts out an entire step of developer interaction with the data.

In the examples below, you'll use the last of the above patterns, the business organization. For the specific data, you'll use the relatively straightforward HR database fragment provided in the Oracle Database 10g Express Edition starter database.

figure 1

Provided you have PHP compiled with the OCI8 extension, no special setup is required to use the following queries and methods. Using a data abstraction class such as Pear:DB, ADOdb, or PHP 5.1's PDO can increase your productivity, requiring less code for the same functionality. Not to discourage this practice, but the hierarchical methods described in this HowTo are an extension to SQL available only to Oracle users. Your code will not be portable to other vendors' RDMBS, one of the main goals of abstraction classes. Since all of the abstraction classes implement similar functionality, the examples below will use the base OCI methods.

The CONNECT BY Connection

The first query provides an illustration of CONNECT BY in action:

SELECT ENAME, JOB, EMPNO, MGR
      FROM EMP 
CONNECT BY MGR = PRIOR EMPNO
START WITH MGR IS NULL
The important elements to note here are:
  • CONNECT BY: This tells the query processor that you are interested in a hierarchy. The following expression will tell the processor what columns it needs to look at to understand the hierarchy.
  • PRIOR: This special reserved word says that the following value is in a row higher up in the chain of rows. In this case, you're asking for the situation where an employee's manager is another employee in the table.
  • START WITH: This clause describes where to start in the hierarchy. The nature of this data dictates that a NULL signifies an employee who reports to no other employee, so you look for an employee with a NULL manager. Other choices made at data modeling and implementation time may require more care; NOCYCLE is discussed below.
However, when the query is run, the result still looks like a flat table in no apparent order:
ENAME      JOB            EMPNO        MGR
---------- --------- ---------- ----------
KING       PRESIDENT       7839
JONES      MANAGER         7566       7839
SCOTT      ANALYST         7788       7566
ADAMS      CLERK           7876       7788
FORD       ANALYST         7902       7566
SMITH      CLERK           7369       7902
...
Look more closely: Each of the first three employees after the president is directly below its respective manager. After that, the pattern is more complicated: Ford is an analyst reporting to Jones, but under Adams. That's the essential problem of this flat display of the tree, as only one child can be below its parent. To help us understand this structure, you'll need the first of several pseudocolumns made available automatically when using hierarchical queries. Changing the list of selected columns in the above query to include LEVEL yields:
ENAME      JOB            EMPNO        MGR      LEVEL
---------- --------- ---------- ---------- ----------
KING       PRESIDENT       7839                     1
JONES      MANAGER         7566       7839          2
SCOTT      ANALYST         7788       7566          3
ADAMS      CLERK           7876       7788          4
FORD       ANALYST         7902       7566          3
SMITH      CLERK           7369       7902          4
...
Each row now carries an indication of how deep in the tree structure it lies; Ford and Scott have the same level, and no lower-leveled employee appears between them. They are siblings. This is the essential piece of information you need to handle the results in an optimal way. One additional task you may want to perform is some sorting of our results—if you ORDER your carefully-crafted CONNECT BY query BY ENAME, the delicate tree structure will be obliterated. To account for this issue, add SIBLINGS, a construct introduced in Oracle9i, to ORDER BY; each set of rows under a common parent will be made into an ordered list based on your criteria.
SELECT ENAME, JOB, EMPNO, MGR, LEVEL
             FROM EMP 
       CONNECT BY MGR = PRIOR EMPNO
       START WITH MGR IS NULL
ORDER SIBLINGS BY ENAME;
Your query is ready; let's go get the data. Before fetching the results, you need to connect to the database, create a statement with the query above, and have Oracle parse it. (For a complete listing, see the sample code.) With the statement prepared, you can fetch your results:
$nrows = oci_fetch_all($stmt, $results, 0, 0, 
                                OCI_FETCHSTATEMENT_BY_ROW);
Note that you must use the OCI_FETCHSTATEMENT_BY_ROW flag to pull the data in a row-based representation rather than the default column-based representation. This is critical for providing your yet-to-be-written code with an easily managed structure.

Working with Data

Next, you have to write the workhorse of this pattern. array_map allows you to apply, with very little syntactic overhead, a user callback function to each element of an array in a highly optimized fashion. This strikes a balance between programmer time and processor cycle savings, letting you write faster code concisely. This example concocts a bulleted list of the organization, showing each employee and her job. Such lists are easy to visualize; plus, they nicely degrade and style with CSS.

function treeFunc( $current ){
        // the previous row's level, or null on the first row
        global $last;
        
        // structural elements
        $openItem =     '<li>';
        $closeItem =    '</li>';
        $openChildren = '<ul>';
        $closeChildren =        '</ul>';
        $structure = "";
        
        if( !isset( $current['LEVEL'] ) ){
                // add closing structure(s) equal to the very last
                // row's level; this will only fire for the "dummy"
                return str_repeat($closeItem.$closeChildren, 
                        $last);
        }
        
        // add the item itself
        $item = "{$current['ENAME']} <i>{$current['JOB']}</i>";
        
        if ( is_null( $last ) ) {
                // add the opening structure in the case of
                // the first row
                $structure .= $openChildren; 
        } elseif ( $last < $current['LEVEL'] ) {
                // add the structure to start new branches
                $structure .= $openChildren;
        } elseif ( $last > $current['LEVEL'] ){
                // add the structure to close branches equal to the 
                // difference between the previous and current                  
                // levels
                $structure .= $closeItem.
                           str_repeat( $closeChildren.$closeItem, 
                                $last - $current['LEVEL'] );
        } else {
                $structure .= $closeItem;
        }
        
        // add the item structure
        $structure .= $openItem;
        
        // update $last so the next row knows whether this row is
        // really its parent
        $last = $current['LEVEL'];
        
        return $structure.$item;
}
Much of the above code reflects the level of the row; that, combined with the level of the row above it, tells you all you need to know about the data in a single pass. If the current level is greater than the previous, the tree needs to grow taller. If current level is less or the same, the tree needs to grow wider, though at which level depends on whether it's the former or the latter. One pitfall in understanding this section is that because each row knows only the previous row's level—all closing formatting must be done by the following row. In the case of the very last row, it is necessary to insert a "dummy" row to clean up the formatting.

Let array_map Do the Walking

Now you've got a set of results and a function that knows what to do with it. The goal of the PHP you write here is to allow you to put these two together in the most expedient manner possible:

// you need this value accessible inside the formatting 
// function; in an object-oriented approach, this can
// be a class variable
global $last;

// set a value not possible in a LEVEL column to allow the 
// first row to know it's "firstness"
$last = null;
        
// add a dummy "row" to cap off formatting
$results[] = array();

// invoke our formatting function via callback
$formatted = array_map("treeFunc", $results);array_map( "treeFunc", $results );

//output the results
echo implode("\n", $formatted);implode( "\n", $formatted );
The above code allows you to isolate your database and result set interaction operations from individual-row formatting operations, keeping the code readable and maintainable. The result of this work is:
• KING  
                              
PRESIDENT 
    • BLAKE  
                              
MANAGER 
       • ALLEN  
                              
SALESMAN 
       • JAMES  
                              
CLERK 
       • MARTIN  
                              
SALESMAN 
       • TURNER  
                              
SALESMAN 
       • WARD  
                              
SALESMAN 
    • CLARK  
                              
MANAGER 
       • MILLER  
                              
CLERK 
    • JONES  
                              
MANAGER 
       • FORD  
                              
ANALYST 
           • SMITH  
                              
CLERK 
       • SCOTT  
                              
ANALYST 
           • ADAMS  
                              
CLERK 
                            

Special Features of Oracle9i and Oracle 10g

Oracle9i enables some tasks that previously required building a data structure and recursing through it: The function SYS_CONNECT_BY_PATH takes a column name that will be appended to the value for all descendants, plus a delimiter. With it, you can build out unique IDs based on a row's position in the hierarchy, explode on the delimiter to determine a particular nth generation ancestor or, carry out many other tricks.

Oracle 10g provides an even easier method if you are interested a row's very first ancestor: CONNECT_BY_ROOT. This keyword, at any level, will return the level 1 ancestor's value of the column it precedes, much like PRIOR above. In addition, Oracle 10g introduced several constructs to avoid some of the work-arounds and extensive checking needed avoid common hierarchy issues. Typically if the query processor finds itself inspecting a row more than once to determine ancestry, it throws an error. Some data structures inevitably contain such loops, and data owners consider them healthy and good. In acquiescence to this fact, adding NOCYCLE after CONNECT BY will allow results to be returned, plus populate the pseudocolumn CONNECT_BY_ISCYCLE to be analyzed or displayed as appropriate.

CONNECT_BY_ISLEAF, like CONNECT_BY_ISCYCLE, is a pseudocolumn that provides whether a particular row has any child rows. This could have especially useful ramifications for CSS-based formatting later in the process.

Where to Next?

With the HTML on the page, there are still interesting and important choices to make. Does your tree need to be active, allowing end users to expand and collapse branches? Do you want output in a more spatial fashion? Should rows with different field values be formatted differently? These choices are wholly dependent on your data and output goals, and they can be implemented rapidly by modifying the various structural elements in treeMethod.
// construct an array acceptable as a callback type
$cbm = new ConnectByMapper();
$formatted = array_map(array($cbm,"treeMethod"), $results);array_map( array( $cbm,"treeMethod" ), $results );
An object-oriented approach to the problem is also included in the sample code, providing a starting point for further experimentation.

Conclusion

While not available in all implementations of SQL, the hierarchical method of querying has enough benefits to encourage the savvy developer to know and make use of them, especially if your data is riddled with multi-level hierarchies. Particularly suited to the output of SQL queries, PHP is ready to work with your data in a functional or object-oriented fashion, and leave your code ready for you or the next maintainer to make enhancements. Consider these advantages in your next development process, or look for ways to gain them in your existing code!


Nick Bollweg is a Senior Programmer at the Sidney Kimmel Cancer Center at Johns Hopkins Medical Institute in Baltimore, Maryland.

Send us your comments