|
|
 SQL Server SSIS Sample Component: RegexBrief DescriptionRegex is an SSIS dataflow component that applies a configured regular expression against an incoming column, matching, extracting, or splitting, as configured by the user. On This PageQuick Details| File Name: | SetupRegex.msi | | Version: | 1.1 | | Date Published: | 1/26/2006 | | Language: | English | | Download Size: | 126 KB | | Estimated Download Time: | 1 min 56K |
OverviewRegex is an SSIS dataflow component that applies a configured regular expression against an incoming column.
- Rows with no match can be routed to the error output, fail the data flow, or be ignored.
- The matched string, as well as any matching substrings, are placed in new dataflow columns.
Test the component by extracting all the saint names from cities in AdventureWorks.Person.
- Configure an oledb data source to read from AdventureWorks.Person.
- Add a Regex component.
- Configure the RegularExpression custom property to have the value "(San|Santa) (?'SaintName'.*)".
- On the Input Columns tab, check the column "City"
- Add downstream transforms (ExportColumn's work with no further configuration) to sink the "Matched" and "NoMatch" outputs.
- Set ErrorRowDisposition on the "Matched" output to RD_RedirectRow.
- Put a data viewer on the "Matched" output path.
The component, so configured, will add the following output columns:
- "0" contains the entire match, e.g. "San Francisco".
- "1" contains the Spanish word for "saint", i.e. "San" or "Santa".
- "SaintName" contains the remainder of the string, e.g. "Francisco".
When I run the package, I get 529 rows sent to the "Matched" output, and 19085 rows sent to "NoMatch". The viewer shows that only those cities containing "San" or "Santa" are represented.
If you don't need a match column, delete it from the "Matched" output and it no longer appears in the downstream rows. For example, delete column "1" and you no longer get the "San" or "Santa". If you are using the component solely to split rows, you can delete all the output columns.
To put a deleted column back in the flow, locate the missing column in the External Columns collection of the "Matched" output, and record its ID. Then, add a column to the Output Columns collection, and set its ExternalMetadataColumnID to the recorded value.
When you modify the regular expression, e.g. to add a new substring match, existing output columns are not disturbed, if they continue to exist in the regular expression. This prevents the wholesale disruption of downstream component configuration.
Modify the ErrorRowDisposition on the "Matched" output to vary component behavior:
- To fail the data flow on no match, set it to RD_NotUsed.
- To redirect unmatched rows down the error output, set it to RD_RedirectRow.
- To send all rows down the default output, nulling column "0" on no match, set it to RD_IgnoreFailure. Look at the viewer to see NULL values in the match columns.
Modify the TruncationRowDisposition on the individual columns to adjust component behavior when the matched string is too long for the output column. You can also modify the output column length to provide more room.
This component is part of a series of components that illustrate increasingly complex behavior, each one exercising a greater proportion of the SSIS object model. If studying in order, this component follows UnDoubleError and precedes InvokeClr.
This component was built to provide an introduction to a component that uses the output column external metadata collection to maintain a snapshot of previously configured output columns, and to support selecting and unselecting of those columns for inclusion in the downstream data flow.
This is the behavior relied on in source components to persist a record of the external data source, and so distinguish newly present columns from those that the user opted not to introduce from the data flow. This is also how external data type changes are detected, and distinguished from user-specified conversions. Top of page
System Requirements- Supported Operating Systems: Windows 2000; Windows Server 2003; Windows XP
SQL Server 2005; Visual Studio 2005 Top of page
InstructionsInstall and build as directed in readme.txt Top of page
Top of page
What Others Are Downloading Top of page
|