
Match Transformation: Denoise

The Denoise transformation strips 'noise' characters from values during clustering or comparison, in the same way as the main Denoise processor. This improves matching accuracy, because noise characters can prevent otherwise matching records from being found. For example, the values "Castle (Investments) Ltd" and "Castle Investments Ltd" are a strong match, but unless the parentheses are removed from the former value, the two values have a character edit distance of 2.
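The edit distance quoted above can be checked with a minimal sketch. It assumes that "character edit distance" means the standard Levenshtein distance (single-character insertions, deletions and substitutions). The function name `edit_distance` is illustrative only and is not part of the product.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed meaning of 'character edit distance')."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# The two parentheses account for the whole difference between the values.
print(edit_distance("Castle (Investments) Ltd", "Castle Investments Ltd"))  # 2
print(edit_distance("Castle Investments Ltd", "Castle Investments Ltd"))    # 0
```

Once the parentheses are stripped, the two values are identical and the distance drops to 0.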

Use

Use the Denoise transformation when matching records on an identifier whose values were entered in a free text field. Free text entry tends to produce the same data in many formats, and invites typographical errors, including the insertion of 'noise' characters such as ( and ). The Denoise transformation allows matching to overcome such errors.

Options

| Option | Type | Purpose | Default Value |
|--------|------|---------|---------------|
| Noise characters Reference Data | Reference Data | List of noise values (characters or text Strings) | *Noise Characters |
| Noise characters | Free text | Additional noise characters. Note: All characters are treated as additional individual denoise characters. The value is not considered as a text String to remove where it appears. | None |
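The difference between the two options can be illustrated with a rough Python sketch. It is not the product's implementation. The function `denoise_with_options`, its parameter names and the sample noise value "(R)" are all hypothetical. The sketch also assumes case-sensitive removal and leaves whitespace untouched, neither of which is stated above.

```python
def denoise_with_options(value, reference_noise=(), extra_chars=""):
    """Hypothetical model of the two options (not the product's code).

    reference_noise: entries as in the Reference Data option; each entry may be
                     a multi-character text string and is removed as a whole.
    extra_chars:     as in the free text option; every character is treated as
                     an individual noise character, never as a whole string.
    """
    # Remove longer entries first so a short entry cannot break up a longer one
    # (an assumption of this sketch).
    for noise in sorted(reference_noise, key=len, reverse=True):
        if noise:
            value = value.replace(noise, "")
    individual = set(extra_chars)
    return "".join(ch for ch in value if ch not in individual)


# "(R)" as a Reference Data entry is removed only where the whole string appears.
print(repr(denoise_with_options("Oracle (R) Reports", reference_noise=["(R)"])))
# 'Oracle  Reports'

# "(R)" typed into the free text option removes every "(", "R" and ")".
print(repr(denoise_with_options("Oracle (R) Reports", extra_chars="(R)")))
# 'Oracle  eports'
```

The second call shows why the note matters: a value entered as free text also strips the "R" from "Reports".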

Example

Example configuration

In this example, the Denoise transformation is used to strip noise characters from company names when matching. The following noise characters are used:

& + ( ) - *

Example transformations

The following table illustrates some example Denoise transformations using the above configuration:

| Value | Transformed value |
|-------|-------------------|
| Castle (Investments) Ltd | Castle Investments Ltd |
| Castle Investments Ltd | Castle Investments Ltd |
| Ipswich & Norwich Co-op | Ipswich Norwich Coop |
| Ipswich + Norwich Co-operative | Ipswich Norwich Cooperative |
| Barclays Bank - Cambridge | Barclays Bank Cambridge |
| Barclays Bank (Cambridge) | Barclays Bank Cambridge |
| George & Sons ***in administration*** | George Sons in administration |
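The transformations in the table above can be reproduced with a short Python sketch. It is an illustration only, not the product's implementation. The name `denoise_company_name` is hypothetical. The sketch also assumes that runs of whitespace left behind by removed characters are collapsed to a single space. That assumption is made only so the output matches the table and is not stated in this topic.

```python
NOISE_CHARS = "&+()-*"  # the noise characters from the example configuration


def denoise_company_name(value: str, noise_chars: str = NOISE_CHARS) -> str:
    """Strip each noise character, then collapse whitespace (sketch assumption)."""
    # str.maketrans with a third argument maps each listed character to None,
    # so translate() deletes it.
    stripped = value.translate(str.maketrans("", "", noise_chars))
    # Assumption: collapse the gaps left by removed characters and trim the ends.
    return " ".join(stripped.split())


examples = [
    "Castle (Investments) Ltd",
    "Ipswich & Norwich Co-op",
    "Barclays Bank - Cambridge",
    "George & Sons ***in administration***",
]
for name in examples:
    print(f"{name!r} -> {denoise_company_name(name)!r}")
```

With this configuration, "Barclays Bank - Cambridge" and "Barclays Bank (Cambridge)" both reduce to the same string, which is what lets them match.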

Oracle ® Enterprise Data Quality Help version 9.0
Copyright © 2006,2011 Oracle and/or its affiliates. All rights reserved.