The Oracle + PHP Cookbook

 

Building a Locator with ZIP Data


by Nick Bollweg

Help your users understand your data in new ways, and answer the question: "Do you know where your data is?"

Downloads for this article:
 Oracle Database 10g Express Edition
 Zend Core for Oracle
 US Census ZIP data

Published October 2006

If your organization has any data about objects, people and places that exist in the real world, it is likely that you have an address for them. Very few databases escape from having an address entity somewhere, or several address-related attributes in some other entity. Most of the time, developers need to show a selection of these locations based on some spatial criteria readily available in the database, and this is usually not that much of hassel. State of Maryland? New York City? No problem! However, other questions like "What is the closest X to Y in Z?" or "How many X do we have near Y?" can give a developer the shivers. If you're a big enterprise with the most recent software, you enhance your database with additional components, like Oracle Locator (available in all editions of Oracle Database 10g, including the free-as-in-beer Oracle Database 10g Express Edition) or Oracle Spatial (an option for Enterprise Edition). But what if there's only desire for this information, and no desire to pay for it?

 

ZIP code data is relatively well understood, easy to validate and those address entities and attributes probably include it. Combine this with a little external data, some effort and either some tricky geometry or the power of Oracle Locator, and you can display distance-based results for anything with a ZIP code. This approach will be approximate at best, and shouldn't be used for mission critical applications, but for a quick store locator or visualization technique it should do the trick.

 

If you anticipate having relatively low traffic and yet will be making your ZIP-enabled application available on a public website, you can use a service-based map delivery application to show your locations in glorious JavaScript. In this recipe, we'll explore integrating with Google Maps by means of Phoogle.

 

Follow the Data

 

To take advantage of this technique, you will need some data, which correlates US Postal Service ZIP codes to their latitudes and longitudes. Many providers are available for this service, often providing periodic updates to the tens of thousands of frequently changing ZIP codes. For this recipe, I've used the somewhat dated, but readily available 1999 ZIP data from the US Census Bureau. The following example will use the table ZIP, which has the following definition:
CREATE TABLE ZIP 
( ZIP_CODE VARCHAR2(5),
        CITY VARCHAR2(64), 
        STATE VARCHAR2(2), 
        LATITUDE NUMBER(10, 7), 
        LONGITUDE NUMBER(10, 7),
  ZIP_GEO_LOCATION SDO_GEOMETRY, --only use if you have Locator
        PRIMARY KEY ( ZIP_CODE ) VALIDATE );
You'll also need some locations; really any sort of data will do as long as it has ZIP codes defined. The example uses a table called LOCATION, which has the following definition:
CREATE TABLE LOCATION 
        ( LOCATION_ID NUMBER,
                ADDRESS VARCHAR2(128), 
                CITY VARCHAR2(32), 
                STATE VARCHAR2(2),
                ZIP_CODE VARCHAR2(5),
     LOCATION_GEO_LOCATION SDO_GEOMETRY, --only use if you have Locator
                PRIMARY KEY ( LOCATION_ID ) VALIDATE );
If you are using Oracle Locator, you also need what are called spatial indexes. Locator queries which make use of these indexes are much more readable than those done with the Great Circle approximation described below and offer better performance. To set up these queries, you'll need create some metadata in USER_SDO_GEOM_METADATA about the index you are going to create, then create the index itself.
INSERT INTO USER_SDO_GEOM_METADATA --A view based on MDSYS.SDO_GEOM_METADATA_TABLE
        (TABLE_NAME, COLUMN_NAME, DIMINFO, SRID) 
VALUES 
        ('LOCATION', 'LOCATION_GEO_LOCATION', 
   SDO_DIM_ARRAY 
     (SDO_DIM_ELEMENT('LONG', -180.0, 180.0, 0.5), 
     SDO_DIM_ELEMENT('LAT', -90.0, 90.0, 0.5)), 
   8307);


INSERT INTO USER_SDO_GEOM_METADATA --A view based on MDSYS.SDO_GEOM_METADATA_TABLE
        (TABLE_NAME, COLUMN_NAME, DIMINFO, SRID) 
VALUES 
        ('ZIP', 'ZIP_GEO_LOCATION', 
   SDO_DIM_ARRAY 
     (SDO_DIM_ELEMENT('LONG', -180.0, 180.0, 0.5), 
     SDO_DIM_ELEMENT('LAT', -90.0, 90.0, 0.5)), 
   8307);
One pitfall I ran into here is an apparent bug in the trigger that does a few housekeeping things on insert of a new spatial query: the ownership of the new index (as remembered in MDSYS.SDO_GEOM_METADATA_TABLE, only accessible by a DBA, not the actual schema owner) was given to ANONYMOUS, and not the database user HR. I ended up having to run this query as DBA to permit index creation to function properly:
UPDATE
        MDSYS.SDO_GEOM_METADATA_TABLE 
SET
        SDO_OWNER = 'HR'
WHERE
        SDO_TABLE_NAME IN ('ZIP', 'LOCATION')
If everything checks out upon inspection of SDO_GEOM_METADATA_TABLE, you are ready to insert your spatial indexes.
CREATE INDEX 
        ZIP_SIDX 
ON 
        ZIP(ZIP_GEO_LOCATION)
INDEXTYPE IS 
        MDSYS.SPATIAL_INDEX;

CREATE INDEX 
        LOCATION_SIDX 
ON 
        LOCATION(LOCATION_GEO_LOCATION)
INDEXTYPE IS 
        MDSYS.SPATIAL_INDEX;
If you don't have Oracle Locator, some indexes are crucial to efficient operation: here are the ones I've defined:
CREATE INDEX IDX_ZIP_LAT ON ZIP ( LATITUDE );
CREATE INDEX IDX_ZIP_LAT_D ON ZIP ( LATITUDE DESC );
CREATE INDEX IDX_ZIP_LON ON ZIP ( LONGITUDE );
CREATE INDEX IDX_ZIP_LON_D ON ZIP ( LONGITUDE DESC);
CREATE INDEX IDX_LOCATION_ZIP ON LOCATION ( ZIP_CODE );

Get to Know the Neighbors


Oracle Locator provides several powerful query mechanisms that make querying for spatial data much more logical. For the relatively complex query described in the next several sections, the following "nearest neighbor" query using sdo_nn will suffice and yield significantly better performance:
SELECT 
                ADDRESS,
                CITY,
                STATE,
                l.ZIP_CODE,
                SDO_NN_DISTANCE (1) DISTANCE
        FROM
                LOCATION l
 WHERE
                'TRUE' =  SDO_NN(
                        l.LOCATION_GEO_LOCATION,
                        (SELECT ZIP_LOCATION FROM ZIP WHERE ZIP_CODE = :zip ),
                        'sdo_num_res=999 unit=mile')
  AND
                SDO_NN_DISTANCE (1) < :miles
ORDER BY
                DISTANCE
The corresponding wrapper method for the Locator method is similarly concise:
public function getLocationsWithin( $zip, $miles ){
  $q = << query from above >>;
        $bind = array("miles" => $miles * 1, "zip" => $zip);
        $results = array();
        $this->q( $results, $q, $bind );
        $this->d( $results );
        return $results;
}
If you read the next three sections because you don't have Locator or are morbidly curious, you'll see there is a fair amount of work being done here for you. The only reason not to use the Locator method, really, is if a database upgrade for your organization is impossible at this time.

 

The Great Circle


The piece of geometry that allows us to find distances on a globe is a concept called the Great Circle. Several resources give a thoroughgoing explanation of the mathematics involved. For our solution, though, we only need a cursory understanding of the concept: a formula making use of SQL functions and a few shortcuts.

The Great Circle describes any circle drawn on a sphere that has the same circumference as the sphere. The Earth has two famous Great Circles: the Equator and the Prime Meridian. Another way to think about a Great Circle is a place where you could cut a sphere and be left with two equal pieces or hemispheres. However, the Earth is unfortunately not spherical... it's a little squashed on the top and bottom. This will complicate our calculations some, but luckily this has been solved by other, cleverer people.

 

The User is Center of the World


The math concepts used here are adapted from an article by John Weeg that implements a ZIP-based PL\SQL distance function. He references another link which is, unfortunately, no longer available but his formulae work just fine. The first step is to look up the latitude and longitude of your user's desired center from which other locations must be within a given distance. Also at this stage we can calculate the circumference of the earth, as relevant for this location:
select  LATITUDE,
                LONGITUDE,
                3963 - ( 13 * sin( latitude * 0.017453293 ) ) as RADIUS
        from
                ZIP
        where 
                ZIP_CODE = :zip
The explicit values above are the radius of the Earth(3963) and π/8. Since the above-mentioned Earth squashing has the effect of diminishing the planet's circumference as one approaches the north and south poles, we have to subtract a certain amount for every mile we go from the equator. Also, as you will have to correct in other places, Oracle's trigonometric functions operate with radians, not degrees as latitude and longitude are usually provided: this conversion can be performed by multiplying by π/8. One small optimization here would be to store and pre-calculate these degree values.

 

Squaring the Circle


Weeg's technique, with the luxury of PL/SQL's functional approach, can set intermediate variables to accomplish his goals. Since you would like to implement this solution as purely query-driven, this query will be a bit involved. You've already gathered the information regarding the user's location, so some values and strings can be calculated as PHP variables and inserted into the SQL statement:
  • $slat and $slon: the latitude and longitude of the user's location
  • dlat and $dlon: a query fragment that determines the difference between current row's latitude/longitude from the users' latitude/longitude
  • $ep: the one-eighth π discussed previously
  • rad: the radius of the Earth at the user's location
as in:
SELECT 
                        address,
                        city,
                        state,
                        l.zip_code,
                        distance
                FROM
                        location l,
                        (SELECT
                                $rad * 2 * ASIN(
                                        LEAST(
                                                1,
                                                SQRT(
                                                        POWER( SIN( $dlat / 2 ), 2 ) + 
                                                        ( 
                                                                COS( $slat ) * 
                                                                COS( latitude * $ep  ) * 
                                                                POWER( SIN( $dlon / 2 ), 2 )
                                                        )
                                                )
                                        )
                                )
                                 AS distance,
                                zip_code
                        FROM 
                                zip
                        WHERE
                                latitude BETWEEN $minlat AND $maxlat AND
                                longitude BETWEEN $minlon AND $maxlon) d
                WHERE
                        l.zip_code = d.zip_code
                        AND distance <= :miles
                ORDER BY 
                        distance ASC
The four values $minlat, $maxlat, $minlon, $maxlon are optimizations that help Oracle avoid performing the distance calculation on every row in the ZIP table; by describing a square twice as wide as the desired mile radius, this should cut down the distance calculations considerably.

 

Getting Started


Now that you know the nature of the data, you can think about implementation. As this method doesn't use the Oracle Spatial extension, which could handle this process in a jiffy, all you need to try this out on your desktop is a working Oracle database and a PHP-enabled Web server. Real time-savers to this end are Oracle Databasee XE and Zend Core for Oracle. In addition to the external data mentioned previosuly, if you want to experiment with the service-based mapping made available by Google, you'll need to grab an API key from Google and PHP Google Maps API; this recipe uses Phoogle Maps, a PHP class that integrates with the Google Maps API and the geocoder.us database. Note that to use the Phoogle example you will need to have your sample application in the same directory for which your Google Maps API Key is registered. The Phoogle-less page does not have this restriction.

 

Wrap It Up


I've chosen to use a wrapper class to encapsulate all of the calculations necessary to gather the information required. Here are some of the important methods:
public function getLocationsWithin( $zip, $miles ){
        $this->startLocation = $this->getStartLocation( $zip, $miles );
        $ep = $this->eighth_pi;
        $slon = $start['LONGITUDE'][0] * $ep;
        $slat = $start['LATITUDE'][0] * $ep;
        $rad = $start['RADIUS'][0];
        $dlon = "abs( ( longitude * $ep ) - $slon )";
        $dlat = "abs( ( latitude * $ep ) - $slat )";
        $maxd = ( $miles / $rad * 360 );
        $maxlat =  $start['LATITUDE'][0] + $maxd;
        $minlat =  $start['LATITUDE'][0] - $maxd;
        $maxlon = $start['LONGITUDE'][0] + $maxd;
        $minlon = $start['LONGITUDE'][0] - $maxd;
        $q = << long query from above >>;
        $bind = array("miles" => $miles * 1);
        $results = array();
        $this->q( $results, $q, $bind );
        $this->d( $results );
        return $results;
}
The above method, which is called by the output page, performs the text manipulation to provide you with a query to return a list of locations with their distances. It then queries the database and returns the results.
private function getStartLocation($zip){
        $ep = $this->eighth_pi;
        $q = << short query from above >>;
        $results = array();
        $bind = array( "zip"=> trim( $zip ) );
        $this->q( $results, $q, $bind );
        return $results;
}
After looking up the user's location, this method returns the useful information above the location: latitude, longitude, and corrected radius of the Earth.

 

On the Same Page


Once the tricky work of getting the locations themselves from the database, putting them on the page should be pretty easy. In this implementation, I've chosen to display the content as a table as it is, in fact, tabular data. Here's my very simple implementation, which could certainly benefit from some CSS beautification:
define("DBUSER", "HR");
define("DBPWD", "HR");
define("DBXN", "//localhost/testdb");

$zip = @$_REQUEST['zip'];
$miles = @$_REQUEST['miles'];

echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"; ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
         <title>ZIP Code-based locator</title>
 </head>
 <body>
 <div>
 <form action="index.php" method="GET">
 Find locations within <input name="miles" value="<?php echo $miles ?>" size="3" maxlength="3"/> miles of ZIP
                              

   <input name="zip" maxlength="5" size="5" value="<?php echo $zip ?>"/>
 <input type="submit" name="go" value="go" />
</form>
<?php
if( $zip ){
        // open a connection.
        if( $conn = oci_connect( DBUSER,  DBPWD, DBXN ) ) {
                include_once( "Locator.class.php" );
                $loc = new Locator( $conn );
                //$loc->setDebug( true );
                $locations = $loc->getLocationsWithin( $zip, $miles );  
                oci_close( $conn );
                
                echo "<h2>".count( current( $locations ) )." Locations Found</h2>";
                ?><table><tbody><?php
                foreach( array_keys( $locations ) as $col) {
                        ?><th><?php
                        echo $col;
                        ?></th><?php
                }
                foreach( current( $locations ) as $row => $value) {
                        ?><tr><?php
                        foreach( array_keys( $locations ) as $col) {
                                ?><td><?php
                                echo $locations[$col][$row];
                                ?></td><?php
                        }
                        ?></tr><?php
                }
                ?></table></tbody><?php
        }else {
                ?>Cannot connect to database.<?php
        }
}

?>
</div>
</body>
</html>
                            

Put a Place to the Name


The relatively recent availability of high interactivity browser-based maps was made even more interesting by the two largest providers, Google and Yahoo!, providing APIs allowing developers to use these maps in their own applications. Within days of their release, wrapper classes were available. The class we'll look at here, Phoogle, makes simple maps so simple that it will add almost no additional time to your development cycle aside from customization, if so desired. Why not use an API like for this whole problem address by this recipe? To gain the kind of informational advantages of the ZIP to latitude/longitude information in our local database for a remote service, all the location data would need to be available to the provide service; this way, we maintain how outside users are able to access our information.

Here's an example of how use of Phoogle could be added to the file above: before the header of the HTML is generated, you need to instantiate Phoogle:

require_once 'phoogle.php';
$map = new PhoogleMap();
$map->setAPIKey( "< your key here >" );
The header tag should be changed to include this call to the Phoogle object:
<head>
         <title>ZIP Code-based locator</title>
         <? $map->printGoogleJS(); ?>
</head>
After retrieving the locations, but before generating the tabular view, you'd need the following to add the code to add your addresses, center the map, and draw it on the page.
foreach( $locations['ADDRESS'] as $row => $address) {
                        $map->addAddress("$address, 
                                                        {$locations['CITY']}, 
                                                        {$locations['STATE']} 
                                                        {$locations['ZIP']}");
                }
$map->centerMap( $loc->startLocation['LATITUDE'][0],
                                $loc->startLocation['LONGITUDE'][0] ) ;
$map->showMap();
That's about it. This will draw a small map on the page on the client side, adding very little additional processing overhead. Furthermore, the developer can resize the map, toggle controls/views, and set the zoom characteristics of the map. Again, further customizations are possible, including custom images, events, and text.

 

Conclusion


A ZIP-based locator can add some highly useful information to your organization's page without much effort or overhead, incurring only a few tens of thousands of records. An alternative would be to subscribe to a Web service-based geocoding service, which will not suffer from the aging data problem that any subscription data may have. However, for short projects, this approach has the advantage of being self-contained and not requiring sharing of your content with anyone except with those whom you want: your users!

Using a third-party mapping service is a quick way to have some real-world visualization of your information, but should probably be seen as a stepping stone: if your users really like this sort of content, you'll probably want to investigate a more substantial solution. If you can get your hands on Oracle Locator, your work will be much lessened with improved results. Enjoy working with these methods! Nick Bollweg [ nick.bollweg@gmail.com] is a freelance developer in the Minneapolis, Minn., area.

Send us your comments