Normalize Whitespace

The Normalize Whitespace processor normalizes the whitespace in String values: multiple spaces between words are reduced to a single space character, and leading and trailing whitespace is removed.
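The behavior described above can be approximated outside OEDQ with a short Python sketch. This is an illustration only, not the processor's implementation: the function name normalize_whitespace is hypothetical, and the sketch treats as whitespace whatever Python's str.split() does, which may differ from the OEDQ definition.

```python
def normalize_whitespace(value):
    """Collapse each run of whitespace to one space and trim both ends.

    Assumption: "whitespace" here means whatever str.split() with no
    arguments splits on; the OEDQ definition may differ.
    """
    # split() with no arguments drops leading/trailing whitespace and
    # splits on runs of whitespace, so joining with " " normalizes both.
    return " ".join(value.split())


print(normalize_whitespace("  10   Downing    Street "))  # 10 Downing Street
```

A value that contains only whitespace becomes an empty string in this sketch.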

Whitespace is defined in OEDQ as:

Use

Normalize Whitespace is often used before parsing free text fields, to ensure that all values have regular spacing. It is also often useful after other transformations, which may leave extra spaces. For example, when text fields have words or numbers stripped from them, this may leave additional spaces in between words.
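As a rough illustration of the second case, the following Python sketch strips numbers from a text value and then normalizes the spacing that is left behind. The helper names strip_numbers and normalize_whitespace are hypothetical and are not OEDQ APIs, and the input value is made up for the example.

```python
import re


def strip_numbers(value):
    """Remove every run of digits (hypothetical earlier transformation)."""
    return re.sub(r"\d+", "", value)


def normalize_whitespace(value):
    """Collapse whitespace runs to one space and trim both ends."""
    return " ".join(value.split())


raw = "Flat 12 Rose 4 Court"           # made-up example value
stripped = strip_numbers(raw)          # leaves double spaces behind
cleaned = normalize_whitespace(stripped)

print(repr(stripped))  # 'Flat  Rose  Court'
print(repr(cleaned))   # 'Flat Rose Court'
```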

Configuration

Inputs

Any String or String Array type attributes where you wish to normalize whitespace. Number and Date attributes are not valid inputs.

Note that if you input an Array attribute, the transformation will apply to all array elements, and an Array attribute will be output.
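The array behavior can be pictured with a small Python sketch: each element is normalized independently and the result is again an array of the same length. The function names are hypothetical, and how OEDQ treats null elements is not covered here.

```python
def normalize_whitespace(value):
    """Collapse whitespace runs to one space and trim both ends."""
    return " ".join(value.split())


def normalize_whitespace_array(values):
    """Apply the normalization to every element; return a new list."""
    return [normalize_whitespace(v) for v in values]


print(normalize_whitespace_array(["  1  High   St ", "Unit   5"]))
# ['1 High St', 'Unit 5']
```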

Options

None

Outputs

Data attributes

Data attribute | Type | Purpose | Value
[Attribute Name].WhitespaceNormalized | Derived | A new attribute with normalized spacing between words. | The original attribute value, with whitespace normalized.

Flags

None

Execution

Execution Mode | Supported
Batch | Yes
Real time Monitoring | Yes
Real time Response | Yes

Results Browsing

The Normalize Whitespace processor presents no summary statistics on its processing.

In the Data view, each input attribute is shown with its new derived attribute, which holds the whitespace-normalized value, immediately to its right.

Output Filters

None

Example

In this example, the Normalize Whitespace processor is used to normalize the spaces between words in an attribute containing the first line of an address:

Oracle ® Enterprise Data Quality Help version 9.0
Copyright © 2006,2011 Oracle and/or its affiliates. All rights reserved.