Advanced Data Cleansing with Oracle9i Warehouse Builder


Today, more companies realize the need for Business Intelligence and establishing customer relationships to gain competitive advantage. Regardless of industry, using the universal foundation of – customer data – enables businesses to gain a clear picture of customers, analyze their buying patterns, and predict future sales. Customer data usually resides in multiple disparate sources and needs to be consolidated. Therefore, companies are turning to solutions such as data warehouses or data marts to help identify and manage their customer relationships. Data must be filtered for unreasonable values and transformed to match other related data. A data warehouse or data mart initiative cannot be expected to deliver a tangible Return On Investment (ROI) unless the data within it is reliable and accurate.

Name and address cleansing in Oracle9i Warehouse Builder

In Oracle Warehouse Builder (OWB), name and address cleansing is a built-in transformation. During cleansing, individual names or organization names and/or addresses are:

  • Parsed into individual elements for improved correction and matching
  • Standardized which involves modification of components to a standard version acceptable to a postal service or suitable for record matching. Standardization of components include:
    • Street and city misspellings/improper abbreviations
    • Directional identifiers: N, S, W, NW
    • Street types: ST, AVE, RUA, PSO, and JLN
    • State/province/neighborhood
    • Business keywords (INC, Corp)
    • Standardized version of first and middle names as separate components
  • Validated and Corrected by using a database of postal data for address components (such as street name, city and zip code). Only addresses will be validated or corrected, names are standardized
  • Augmented, by adding new data elements for mailing, better target marketing or other purposes. Examples are ZIP+4, probable gender, consumer versus business record identification, county codes, and geocodes (including latitude and longitude)
For example, for source data of: Sue Smith, 187 Prospct Str, Norwod MA The end result after parsing, standardization, correction/validation and augmentation would be include the following components: Sue|Susan|Smith|187|Prospect|St|Norwood|MA|02062|2345

Name and Address Mapping Operator

Oracle Warehouse Builder features a Name and Address Mapping Operator that works with OWB Name & Address 9.0 to support parsing, standardization, postal matching, and geocoding of name and address data. OWB Name & Address 9.0 (a server component which needs to be installed in addition to Warehouse Builder) is an API to a set of data libraries of Postal, Directories and geocoders. Currently, Oracle Warehouse Builder uses Trillium for the underlying data libraries. The client component is an operator in the mapping editor of the Oracle9i Warehouse Builder. Name and Address records are parsed, postal matched, and geocoded in that order. Over 30 countries across the world are supported, with seven having postal matching support. The OWB Name & Address 9.0 is currently available for four (4) different regions:
  • North America (US & Canada)
  • Latin America (including Mexico)
  • Europe, Middle East and Africa
  • Asia Pacific (including Australia

Based on the specified input roles, the name/address parser will analyze and standardize the input data and return the requested output components. Input roles can be selected from a list of predefined role types that vary depending on the selected parsing option. An input role specifies the type of data that name/address input item represents. Output Components are defined by adding output parameters, then assigning an output component type to each parameter using the associated predefined list. The Output Name, Output Address and Output Status are what the user maps to targets. Figure below shows a screenshot of a mapping using the Name and Address component in OWB.



Based on the user defined input roles and output components, Oracle Warehouse Builder generates the necessary PL/SQL code which is then passed on to the name and address server for processing.

More Info
Oracle Warehouse Builder on OTN 
Oracle Data Warehousing
Oracle Warehouse Builder Overview - part 1 , part 2

Oracle9iDS Daily Features
Archives
   
E-mail this page
Printer View Printer View
Oracle Is The Information Company About Oracle | Oracle RSS Feeds | Careers | Contact Us | Site Maps | Legal Notices | Terms of Use | Privacy