Split Records from Array |
The Split Records from Array processor allows you to create many records from a single record, by splitting out new records for each element in an input array.
Use Split Records from Array where data that should be represented in many records has been wrongly captured in a single record; that is, to normalize data that has been wrongly denormalized.
Often, the denormalized data that you need to split out needs to be pre-processed, before using this processor. For example, in the following Orders table, multiple orders (with multiple order numbers and product descriptions) have been wrongly entered into a single record, using free text fields:
|
Order_ID |
Order_Number |
Product_Desc |
|
O574112 |
2788143 / 2788144 |
Home PC Package / Colour Printer |
In this case, the Order_Number and Product_Desc attributes both need simple pre-processing using the Make Array from String processor to create arrays using the / character as a separator. The arrays can then be input into Split Records from Array to split out the records as follows:
|
Order_ID |
Order_Number.normalized |
Product_Desc.normalized |
|
O574112 |
2788143 |
Home PC Package |
|
O574112 |
2788144 |
Colour Printer |
Note above that many array attributes may be input to this processor. In this case, the number of output records for each input record will correspond to the number of elements in the array attribute with the largest number of elements. Data from each attribute that is not input is simply copied to all of the output records created from each input record. For example, if you split the following record, inputting Title.array and FirstName.array:
|
Cust_ID |
Title.array |
FirstName.array |
Surname |
|
13451 |
{Mr}{Mrs} |
{John}{Dorothy}{James} |
Smith |
...then the output records will be as follows:
|
Cust_ID.normalized |
Title.array.normalized |
FirstName.array.normalized |
Surname.normalized |
|
13451 |
Mr |
John |
Smith |
|
13451 |
Mrs |
Dorothy |
Smith |
|
13451 |
|
James |
Smith |
Note that Title.array.normalized is Null for the last record, as there is no array element in Title.array that corresponds with the third element in FirstName.array.
One or more Array attributes that you wish to use to split records.
None.
|
Data attribute |
Type |
Purpose |
Value |
|
[Attribute Name].Split |
Added |
Holds the normalized attribute value for each input array attribute after record splitting. |
The normalized attribute value after record splitting, where each element of the input array will be output as a single value. |
|
Flag attribute |
Purpose |
Possible Values |
|
[Attribute Name].SplitFlag |
To indicate whether the split from array was successful |
Y/N |
|
Execution Mode |
Supported |
|
Batch |
Yes |
|
Real time Monitoring |
Yes |
|
Real time Response |
Yes |
|
Statistic |
Meaning |
|
Input records |
The number of records that were input (that is, before splitting) |
|
Output records |
The number of records that were output (that is, after splitting). Drill-down to see all output records |
|
Split % |
The percentage of input records that were split into multiple output records. Drill-down to see the output records where a split occurred (that is, records where one of the input array attributes contained more than one element) |
|
Note: To see a full view of all the output records after splitting records, click on the Data View in the results browser. |
None
In this example, a data set of People's Names is being prepared for matching. The data contains a number of aliases and alternative spellings for people's names. These are pre-processed into a single Aliases.Array attribute, and then split out using Split Records from Array so that each name can be matched separately.
Data View (All records)
Oracle ® Enterprise Data Quality Help version 9.0
Copyright ©
2006,2011 Oracle and/or its affiliates. All rights reserved.