The Denoise transformation strips 'noise' characters from values, either when clustering or when comparing values, in the same way as the main Denoise processor. This increases matching accuracy, because noise characters make it harder to find matching records. For example, the values "Castle (Investments) Ltd" and "Castle Investments Ltd" are a strong match, but unless the parentheses are removed from the former value, the two values have a character edit distance of 2.
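The edit distance quoted above can be checked with a short sketch. It assumes that "character edit distance" means the standard Levenshtein distance; the function name `edit_distance` is illustrative and is not part of the product.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


# The two parentheses account for the whole difference between the values.
raw_distance = edit_distance("Castle (Investments) Ltd", "Castle Investments Ltd")
print(raw_distance)
```

Once the parentheses are stripped, the two values are identical and the distance falls to 0.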
Use the Denoise transformation when matching records on an identifier whose values were entered in a free text field. Free text fields allow the same data to be entered in many formats, and invite typographical errors, which may include the insertion of 'noise' characters such as ( and ). The Denoise transformation allows such errors to be overcome when matching.
| Option | Type | Purpose | Default Value |
|---|---|---|---|
| Reference Data | | List of noise values (characters or text Strings) | *Noise Characters |
| Noise characters | Free text | Additional noise characters. Note: All characters are treated as additional individual denoise characters. The value is not considered as a text String to remove where it appears. | None |
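The difference between the two options can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the function `strip_noise` and its parameters `noise_strings` and `noise_characters` are hypothetical names, and it assumes that Reference Data entries are removed as whole strings wherever they occur, while the free text option is split into individual characters.

```python
def strip_noise(value: str, noise_strings=(), noise_characters: str = "") -> str:
    """Remove noise from value.

    noise_strings    -- entries removed as whole strings (assumed behavior
                        of the Reference Data option)
    noise_characters -- every character removed individually (the behavior
                        the Note describes for the free text option)
    """
    # Remove longer entries first so a short entry cannot break up a longer one.
    for noise in sorted(noise_strings, key=len, reverse=True):
        value = value.replace(noise, "")
    # Delete each listed character wherever it appears.
    return value.translate(str.maketrans("", "", noise_characters))


name = "Castle (Investments) Ltd"

# As individual characters, "(" and ")" are each removed.
print(strip_noise(name, noise_characters="()"))

# As a single text string, "()" never occurs in the value, so nothing is removed.
print(strip_noise(name, noise_strings=["()"]))
```

Entering `()` in the free text option therefore strips both parentheses, whereas the same two characters held as one text string would only match where they appear side by side.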
Example configuration
In this example, the Denoise transformation is used to strip noise characters from company names when matching. The following noise characters are used:
& + ( ) - *
Example transformations
The following table illustrates some example Denoise transformations using the above configuration:
| Value | Transformed value |
|---|---|
| Castle (Investments) Ltd | Castle Investments Ltd |
| Castle Investments Ltd | Castle Investments Ltd |
| Ipswich & Norwich Co-op | Ipswich Norwich Coop |
| Ipswich + Norwich Co-operative | Ipswich Norwich Cooperative |
| Barclays Bank - Cambridge | Barclays Bank Cambridge |
| Barclays Bank (Cambridge) | Barclays Bank Cambridge |
| George & Sons ***in administration*** | George Sons in administration |
Oracle ® Enterprise Data Quality Help version 9.0
Copyright © 2006, 2011 Oracle and/or its affiliates. All rights reserved.