Equal Attributes Profiler
The Equal Attributes Profiler examines a set of attributes and finds pairs of attributes whose values are frequently equal, for example where FirstName and GivenName attributes are both stored and are normally the same. A threshold option determines whether a pair of attributes is reported as related, based on the percentage of records in which the two attributes hold the same value.
Use the Equal Attributes Profiler to find possibly redundant attributes, or pairs of attributes where values are normally equal, but in some cases are not. The Equal Attributes Profiler can help find bad data where two values in related attributes do not relate to each other when they should.
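The pairwise comparison described above can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name `find_equal_attribute_pairs`, the record layout (a list of dictionaries), and the sample data are all assumptions made for illustration. The sketch treats two null (`None`) values as equal, which corresponds to the default setting of the processor.

```python
from itertools import combinations


def find_equal_attribute_pairs(records, attributes, threshold=0.8):
    """Hypothetical helper: return (attr_a, attr_b, equal_fraction) for every
    pair of attributes whose values are equal in at least `threshold` of the
    records."""
    related = []
    total = len(records)
    if total == 0:
        return related
    # Compare every unordered pair of attributes once.
    for attr_a, attr_b in combinations(attributes, 2):
        equal = sum(1 for r in records if r.get(attr_a) == r.get(attr_b))
        fraction = equal / total
        if fraction >= threshold:
            related.append((attr_a, attr_b, fraction))
    return related


# Invented sample data: FirstName and GivenName agree in 4 of 5 records.
records = [
    {"FirstName": "Ann", "GivenName": "Ann", "Surname": "Lee"},
    {"FirstName": "Bob", "GivenName": "Bob", "Surname": "Ray"},
    {"FirstName": "Cy", "GivenName": "Cyrus", "Surname": "Cy"},
    {"FirstName": "Dee", "GivenName": "Dee", "Surname": "Orr"},
    {"FirstName": "Eve", "GivenName": "Eve", "Surname": "Fox"},
]
pairs = find_equal_attribute_pairs(records, ["FirstName", "GivenName", "Surname"])
print(pairs)  # [('FirstName', 'GivenName', 0.8)]
```

In this sample the third record is the interesting one: it is the single case where the otherwise redundant FirstName and GivenName values disagree.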
Any attributes that you wish to examine for equal attribute linkage.
| Option | Type | Purpose | Default Value |
| Equal attribute threshold | Percentage | Controls the percentage of values that must be equal in two attributes for those two attributes to be considered as related, and to appear in the results. | 80% (Note that the value must be between 50% and 100% inclusive.) |
| Treat nulls as equal? | Yes/No | Controls whether or not pairs of Null values are considered to be equal, and therefore whether or not they will be considered when appraising the Equal attribute threshold (above). | Yes |
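To make the interaction of the two options concrete, here is a small Python sketch. It is illustrative only: the function names and parameters are hypothetical, and it assumes one possible reading of "Treat nulls as equal?" set to No, namely that a pair of nulls simply does not count towards the equal total while the record still counts in the overall total. The product may instead exclude such records from the calculation.

```python
def count_equal(values_a, values_b, treat_nulls_as_equal=True):
    """Count positions where both values are the same.

    ASSUMPTION: when treat_nulls_as_equal is False, a pair of nulls is
    not counted as equal (but is still part of the total)."""
    equal = 0
    for a, b in zip(values_a, values_b):
        if a is None and b is None:
            if treat_nulls_as_equal:
                equal += 1
        elif a == b:
            equal += 1
    return equal


def is_related(values_a, values_b, threshold_pct=80, treat_nulls_as_equal=True):
    """Decide whether two attributes meet the equal attribute threshold."""
    if not 50 <= threshold_pct <= 100:
        raise ValueError("threshold must be between 50 and 100 inclusive")
    total = len(values_a)
    if total == 0:
        return False
    equal = count_equal(values_a, values_b, treat_nulls_as_equal)
    # Integer comparison avoids floating-point rounding at the boundary.
    return equal * 100 >= threshold_pct * total


# Invented sample values: two real matches, two null pairs, one mismatch.
col_a = ["x", None, None, "y", "z"]
col_b = ["x", None, None, "y", "q"]
print(is_related(col_a, col_b))                              # True  (4 of 5 equal)
print(is_related(col_a, col_b, treat_nulls_as_equal=False))  # False (2 of 5 equal)
```

The example shows why the null option matters for sparsely populated attributes: two mostly empty columns can appear related purely because of matching nulls.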
None
None
| Execution Mode | Supported |
| Batch | Yes |
| Real time Monitoring | Yes |
| Real time Response | No |
The Equal Attributes Profiler requires a batch of records to produce its statistics; that is, in order to find meaningful relationships between pairs of attributes, it must run to completion. Therefore, its results are not available until the full data set has been processed, and this processor is not suitable for a process that requires a real time response.
When executed against a batch of transactions from a real time data source, it will finish its processing when the commit point (transaction or time limit) configured on the Read Processor is reached.
The Equal Attributes Profiler provides a summary view of any pairs of attributes that have a high enough percentage of equal values. The top-level view shows the following statistics for each pair of related (equal) attributes:
| Statistic | Meaning |
| Equal | The number of records where the values for both the related attributes were the same. |
| Null pairs | The number of records where the values for both the related attributes were null. |
| Not equal | The number of records where the values for the related attributes were not the same. |
Additional Data
Click on the Additional Data button to display the above statistics as percentages of the records analyzed.
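As a rough illustration of how these three statistics and their percentage view relate, consider the following Python sketch. The function names are hypothetical and the data is invented; the attribute names are borrowed from the example later in this topic. It also assumes that null pairs are counted separately from other equal pairs, which the product may or may not do.

```python
def pair_statistics(records, attr_a, attr_b):
    """Count Equal, Null pairs and Not equal records for one attribute pair.

    ASSUMPTION: a record where both values are null is counted under
    'Null pairs' only, not under 'Equal'."""
    stats = {"Equal": 0, "Null pairs": 0, "Not equal": 0}
    for record in records:
        a, b = record.get(attr_a), record.get(attr_b)
        if a is None and b is None:
            stats["Null pairs"] += 1
        elif a == b:
            stats["Equal"] += 1
        else:
            stats["Not equal"] += 1
    return stats


def as_percentages(stats):
    """Express each count as a percentage of the records analyzed."""
    total = sum(stats.values())
    return {name: (100.0 * count / total if total else 0.0)
            for name, count in stats.items()}


# Invented sample records.
records = [
    {"DT_PURCHASED": "2010-01-05", "DT_ACC_OPEN": "2010-01-05"},
    {"DT_PURCHASED": "2010-02-11", "DT_ACC_OPEN": "2010-02-11"},
    {"DT_PURCHASED": None, "DT_ACC_OPEN": None},
    {"DT_PURCHASED": "2010-03-20", "DT_ACC_OPEN": "2010-03-02"},
]
stats = pair_statistics(records, "DT_PURCHASED", "DT_ACC_OPEN")
print(stats)                  # {'Equal': 2, 'Null pairs': 1, 'Not equal': 1}
print(as_percentages(stats))  # {'Equal': 50.0, 'Null pairs': 25.0, 'Not equal': 25.0}
```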
Drill-down on the number of records where the pair of attributes matched exactly to see a breakdown of the frequency of occurrence of each matching value. Drill-down again to see the records.
Alternatively, drill-down on the number of records where the pair of attributes were not equal to see the records directly. If there should be a relationship between attributes, these will be the records where the relationship is broken.
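The two drill-down paths can be approximated outside the product with a short Python sketch. The function names and sample data below are hypothetical and are meant only to show the shape of the information each drill-down exposes: a frequency breakdown of matching values on one side, and the list of records that break the relationship on the other.

```python
from collections import Counter


def matching_value_frequencies(records, attr_a, attr_b):
    """Frequency of each non-null value shared by both attributes."""
    return Counter(
        r.get(attr_a)
        for r in records
        if r.get(attr_a) is not None and r.get(attr_a) == r.get(attr_b)
    )


def broken_relationship_records(records, attr_a, attr_b):
    """Records where the two attributes hold different values."""
    return [r for r in records if r.get(attr_a) != r.get(attr_b)]


# Invented sample records.
records = [
    {"ID": 1, "DT_PURCHASED": "2010-01-05", "DT_ACC_OPEN": "2010-01-05"},
    {"ID": 2, "DT_PURCHASED": "2010-01-05", "DT_ACC_OPEN": "2010-01-05"},
    {"ID": 3, "DT_PURCHASED": "2010-02-11", "DT_ACC_OPEN": "2010-02-11"},
    {"ID": 4, "DT_PURCHASED": "2010-03-20", "DT_ACC_OPEN": "2010-03-02"},
]
frequencies = matching_value_frequencies(records, "DT_PURCHASED", "DT_ACC_OPEN")
broken = broken_relationship_records(records, "DT_PURCHASED", "DT_ACC_OPEN")
print(frequencies.most_common())  # [('2010-01-05', 2), ('2010-02-11', 1)]
print([r["ID"] for r in broken])  # [4]
```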
In this example, a Customer table is analyzed to see if any of its attributes are commonly equal to each other, using the default configuration. The Equal Attributes Profiler finds that the DT_PURCHASED and DT_ACC_OPEN attributes are normally equal:
Summary View
By drilling down on the number of records where the two fields were equal, you can see a view of all the pairs of equal values:
Oracle® Enterprise Data Quality Help version 9.0
Copyright © 2006, 2011 Oracle and/or its affiliates. All rights reserved.