RegEx Replace
The RegEx Replace processor performs advanced text replacements by matching text values against a regular expression and replacing the matched text with either a specific value or a value derived from the match. For example, a String that matches a regular expression can be replaced with only the first group in the expression.
Regular expressions are a standard technique for expressing patterns and manipulating Strings, and are very powerful once mastered.
Tutorials and reference material about regular expressions are available on the Internet and in books.
There are also software packages available to help you master regular expressions, such as RegExBuddy, and online libraries of useful regular expressions, such as RegExLib.
Use RegEx Replace for advanced text transformations, for example where you need to replace a String that matches a regular expression pattern with a specific value, or where you need to consider the context of a piece of text before deciding whether or not to standardize it.
For example, for an attribute with a fixed number of valid values, you may wish to transform all values over a few alphabetic characters in length that do not match the list of specific valid values to 'Other'. You can do this by running a List Check, and transforming the unmatched values using RegEx Replace.
Note that backslashes (\) and dollar signs ($) are special characters in the replacement String. Dollar signs are used as references to groups within the regular expression used to match against. Backslashes are used to escape literal characters in the replacement String.
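The following minimal sketch illustrates the two special characters in a replacement String. It assumes the processor follows the replacement rules of Java's `java.util.regex` package, which the description above matches but which this page does not state explicitly. The class name `ReplacementEscapes` and the sample values are hypothetical.

```java
import java.util.regex.Matcher;

final class ReplacementEscapes {

    // Replaces every match of regex in input. In the replacement String,
    // $n refers to group n and a backslash escapes the next character.
    static String replace(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // $1 is a reference to the first group: keep only the bracketed digits.
        System.out.println(replace("[123]", "\\[(\\d+)\\]", "$1"));   // prints 123

        // A literal dollar sign must be escaped with a backslash.
        System.out.println(replace("USD 10", "USD ", "\\$"));         // prints $10

        // Matcher.quoteReplacement escapes both special characters, which is
        // convenient when the replacement text should be taken literally.
        System.out.println(replace("price", "price", Matcher.quoteReplacement("$5")));  // prints $5
    }
}
```

If the replacement text is meant literally, escaping every `$` and `\` in it avoids accidental group references.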
A single String attribute.
Option | Type | Purpose | Default Value |
Regular expression | Regular expression | The regular expression to be matched | None |
Replacement | Any value | The replacement String used to replace the matched values | None |
Note: Wherever the regular expression is matched in the input data, it is replaced with the replacement String. For example, using the regular expression a*b, and the replacement String "-", the input value "aabfooaabfooabfoob" would yield a new attribute value of "-foo-foo-foo-".
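The example in the note can be reproduced outside the product with a short sketch. It uses Java's `String.replaceAll` as a stand-in for the processor, which is an assumption; the class and method names are hypothetical.

```java
final class ReplaceEveryMatch {

    // Replaces each non-overlapping match of regex, scanning left to right.
    static String replaceEvery(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // a*b matches "aab", "aab", "ab" and the final "b" in turn,
        // so four separate replacements are made.
        System.out.println(replaceEvery("aabfooaabfooabfoob", "a*b", "-"));  // prints -foo-foo-foo-
    }
}
```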
Data attribute | Type | Purpose | Value |
[Attribute Name].RegExReplaced | Derived | A new attribute with the result of the RegEx replace | The result of the RegEx replace. Note that if the regular expression was not matched, the original input attribute value is carried forward. |
Flag attribute | Purpose | Possible Values |
[Attribute Name].RegExReplaceSuccess | To indicate whether the RegEx Replace was successful or not | Y/N |
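As a rough model of how the two output attributes relate, the sketch below returns both the replaced value and a Y/N flag for one input value. It assumes that "successful" means the regular expression matched at least once, and it uses Java's `java.util.regex` as a stand-in for the processor; the class `RegExReplaceOutputs` and its method are hypothetical and not part of the product.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegExReplaceOutputs {

    // Returns a two-element array: { replaced value, success flag }.
    // Assumption: the flag is "Y" when the expression matched at least once.
    static String[] apply(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            // No match: the original value is carried forward unchanged.
            return new String[] { input, "N" };
        }
        // replaceAll resets the matcher and replaces every match.
        return new String[] { matcher.replaceAll(replacement), "Y" };
    }

    public static void main(String[] args) {
        String[] matched = apply("aabfoo", "a*b", "-");
        System.out.println(matched[0] + " " + matched[1]);      // prints -foo Y

        String[] unmatched = apply("foo", "a*b", "-");
        System.out.println(unmatched[0] + " " + unmatched[1]);  // prints foo N
    }
}
```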
Execution Mode | Supported |
Batch | Yes |
Real time Monitoring | Yes |
Real time Response | Yes |
The RegEx Replace processor produces a summary view of its results, showing the following statistics:
Statistic | Meaning |
Transformed | The number of records which matched the regular expression, and therefore underwent a transformation. |
Untransformed | The number of records which did not match the regular expression, and therefore did not undergo a transformation. |
The following output filters are available from the RegEx Replace processor:
In this simple example, RegEx Replace is used to replace any values in a job title attribute that were found to be invalid by an upstream List Check processor with the value 'Other'.
Regular expression: (.*.)
Replacement String: Other
Results (successful replacements):
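To try the configuration above outside the product, the following sketch applies the same regular expression and replacement String to a few sample values. It assumes Java `java.util.regex` semantics; the class name and the job title values are hypothetical and are not taken from the original results.

```java
final class JobTitleExample {

    // Applies the example configuration: regular expression (.*.), replacement "Other".
    static String toOther(String jobTitle) {
        return jobTitle.replaceAll("(.*.)", "Other");
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for job titles rejected by the List Check.
        for (String value : new String[] { "Sales Clrk", "Mgr??", "x" }) {
            System.out.println(value + " -> " + toOther(value));
        }
    }
}
```

Because the expression requires at least one character, a non-empty single-line value is matched exactly once and replaced by a single 'Other', while an empty value is left unchanged. Under the same Java semantics, the simpler expression (.*) would also match the empty String at the end of the value and produce 'OtherOther'.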
Oracle ® Enterprise Data Quality Help version 9.0
Copyright © 2006, 2011 Oracle and/or its affiliates. All rights reserved.