| |
| Data Exchange is the buzzword of today's
organizations. Data tends to be everywhere and in various formats.
Often management is not worried about the source of the data, but
rather the way in which data can be presented to them from the various
sources. What this means is that, you would need applications that
consolidate data from various sources into some meaningful
centralized data store so that you can derive reports off it. This
centralized data store can be a SQL Server database. What is missing
in the puzzle then is, how do you aggregate data from the various
sources. There are many different ways you can do this aggregation:
|
|
|
| |
You could
use custom applications that fetch data from your various sources
and then update the relevant SQL Server database.
|
| |
You could use the
Data Transformation Services (DTS) framework that comes with
SQL Server. |
|
|
|
Before we delve into the features
of DTS, let's see why the first method is not suitable. Writing
your own custom applications for data aggregation is a very tedious
task, since you need to write connection code for the various sources
and the destination and also provide facilities for massaging
the data before it reaches its destination. This is quite a lot
of code and if suddenly the process for massaging some data changes,
you need to recompile your code, test it and do a whole lot of other
maintenance chores. This is where DTS steps in. DTS is a set
of tools that lets you extract, transform, and consolidate data
from disparate sources into single or multiple destinations supported
by the DTS connectivity options. By using DTS tools to graphically
build DTS packages or by programming a package with the DTS object
model, you can create custom data movement solutions tailored to
the specialized business needs of your organization. Since most
of the underlying plumbing is taken care for you, all you need to
do is only concentrate on the business task at hand. |
| |
| In this article, we will see the basic
building blocks of DTS and see a few examples of how you can use DTS
to develop some very simple applications. |
| |
DTS Building Blocks
A DTS package is the underlying framework on which
the other elements are built. A DTS package is an organized collection
of connections, tasks, transformations, and workflow constraints
assembled either with a DTS tool or programmatically and saved to
one of the stores that DTS supports. The various stores that
are supported are: Microsoft SQL Server, SQL Server 2000 Meta Data
Services, structured storage file, or a Microsoft Visual Basic file.
Each package contains one or more steps that are executed sequentially
or in parallel when the package is run. When executed, the package
connects to the various data sources, copies data and database objects,
transforms data, and notifies other users or processes of events.
A package consists of the following elements: |
| |
| |
Connections
|
| |
Tasks |
| |
Transformations |
| |
Workflow |
|
| |
DTS Tasks
A DTS task is a discrete set of functionality which is executed
as a single step inside a package. Each task defines a single
work item to be performed as part of the data movement and transformation
process, or as a job to be executed. DTS supplies a number of tasks
that are part of the DTS object model which can be accessed graphically,
through DTS Designer, or programmatically (via a COM API). These
tasks, which can be configured individually, cover a wide variety
of data copying, data transformation, and notification situations.
For example, you have tasks that will let you: |
| |
| |
Import
and Export data |
| |
Transform data |
| |
Copy various database
objects |
|
| |
And many more too. Because DTS
is based on an extensible COM model, you can create your own custom
task and then use them to design your data movement solution. These
custom tasks can also be integrated with the designer. |
| |
DTS Transformations
A DTS transformation is a set of one or more functions that
operate on a piece of data. The data that it operates on is
called the source data and the output of the transformation
is usually what you want at the destination. Sometimes
you might also want to apply more than one transformation before
the data reaches its destination. Transformations make it easy to
implement complex data validation, data scrubbing, and conversions
during the import and export process. Apart from stock set of transformations
that DTS supplies (like SubString, Trim etc), you can also write
custom transformation functions using a scripting language like
VBScript. |
| |
DTS Package Workflow
A workflow allows you to control the flow of execution of the
various tasks inside a DTS package. Workflows allow you set
a precedence constraints that allow you to link
tasks together based on whether the first task executes, executes
successfully, or executes unsuccessfully. You can use precedence
constraints to build conditional branches in a workflow. Steps without
constraints are executed immediately, and several steps can execute
in parallel. You can also manually control the workflow behavior
using ActiveX scripts.
One final note. Packages can also be versioned so that you progressively
build your solution and invoke the various versions based on
demand. You can also protect your packages with a password so that
you can protect who can access your package. Finally, all DTS packages
can be executed from the command line using the command line tool
for DTS.
Ok, now that we have some of the basics out of the way, let's see
some examples of how to work with DTS. We will develop some sample
DTS packages that illustrate some of the core concepts of DTS. Before
we get into the examples, let's see where you can find DTS. Open
the SQL Server enterprise manager and locate the Data Transformation
Services node. The following figure shows where you can find this
node. |
| |
 |
| |
Sample A, Simple Import
As our first case study, let's consider that we have the list of
products as an XL sheet as shown in the following figure: |
| |
 |
| |
We can see that we have 3 columns
of information: the product code, the name of the product and the
item in stock. Our requirement is to import this piece of information
into a SQL Server database table. Let's see how we can write a DTS
package for achieving this. Open the DTS designer by choosing the
Local Packages node, right-clicking the node and choosing New Package
from the context menu. You will now be presented with the DTS designer
canvas. Now follow these steps: |
| |
| |
From
the connection toolbar, choose the Microsoft Excel 97-2000 icon
and drag and drop it on the designer surface. You will now see
the Connection Properties dialog box. In this you will need
to provide a name for the connection and the source of the XL
sheet. This is shown in the following figure: |
| |
|
 |
| |
| |
The designer now
places the XL icon on the designer canvas. You have just finished
configuring the source for the solution. |
| |
From the connection
toolbar, choose the Microsoft OLEDB Provider for SQL Server
icon and drag and drop it on the designer surface. You will
now see the Connection Properties dialog box. In this you will
need to provide a name for the connection, the SQL Server, the
connection information and the database to connect to. This
is shown in the following figure: |
| |
 |
| |
| |
The designer now
places the server icon on the designer canvas. At this stage,
you have finished configuring the destination for the solution. |
| |
From the task toolbar,
click on the Transform data task icon (the
one with a gear and arrow in between). When you move the cursor
to the designer surface, the cursor icon changes to select
source connection. Click on the XL icon for the source.
The cursor now changes to select destination connection.
Click on the server icon for the destination. |
| |
The designer now
draws an arrow mark from the source (XL sheet) to the destination
(SQL Server) indicating that we want to move data from the XL
sheet to the SQL Server. |
| |
Choose the transformation
arrow, right-click and choose Properties. You
are now presented with the Transform Data Task Properties
dialog box. Here, you have various tabs to enter information. |
| |
In the Source
tab, enter a name for the transformation and choose the Table/View
option. Note that the designer automatically selects the first
sheet of the XL file and shows that name. If you click on the
Preview button, you can see the data from the
XL sheet. This is shown in the following figures. |
| |
 |
| |
 |
| |
| |
Choose the Destination
tab and click on the Create button. This will
show a dialog box with a CREATE TABLE script. This is required,
since we need to create a new table to hold the information
from the XL sheet. If a table already exists, you can select
it from the Table name combo box. In the CREATE
TABLE dialog box, click OK. The dialog box now shows the new
table that will be created. This is shown in the following figures: |
| |
 |
| |
 |
| |
|
| |
Note that SQL Server
automatically infers the data type from the XL sheet. You can
change this in the CREATE TABLE dialog box if required. |
| |
Click on the Transformations
tab. You will now see that SQL Server has automatically mapped
the source to the destination. For now, let us not change anything
in this tab. Click on the Ok button in the
dialog box. |
|
| |
| With this we complete the basic DTS
package. You can now execute this package by clicking on the green
arrow icon on the main toolbar. This will execute the package and
also displays a status of the execution. This is shown in the following
figure: |
| |
 |
| |
You can now see if the data has
been successfully imported by executing the following query in the
query analyzer tool (in the pubs database). |
| |
| SELECT
* FROM [Sheet1$] |
| |
| That was a really simple task!! But
if you look at it, we have accomplished an amazing piece of functionality
with a few drag and drops. Imagine if you had to code all this manually!!
Finally, you can save the package by clicking on the Package/Save
menu. This will show the following dialog box: |
| |
 |
| |
Click on the OK
button to save the package to the SQL Server repository. If you
choose Structured Storage File or Visual
Basic File in the Location combo box,
the designer will create an appropriately named file. Note that
a Structured Storage File option is very efficient
for transport across a network. |
| |
Sample B, Simple Export
One of the most common questions that I see coming in the SQL Server
newsgroups is how to export a table as a CSV file. Although there
are many methods to do this, DTS offers a solution, where you can
also transform the data during the process of export and thus is
more configurable. In this example, we will see how to export the
contents of a table into a flat-file and doing some transformation
in the process. |
| |
| |
Start
by creating a new package using the methods outlined in the
earlier example. |
| |
From the connection
toolbar, choose the Microsoft OLEDB Provider for SQL
Server icon and drag and drop it on the designer surface.
Configure the connection to connect to the pubs
database. |
| |
From the connection
toolbar, choose the Text File (Destination)
icon and drag and drop it on the designer surface. In the Connection
Properties dialog box, give the name of a file and
then click on the Properties button. This is
shown in the following figure. |
| |
 |
| |
| |
The properties dialog
box allows you to configure the file format. Here, we will choose
to have a delimited text file as shown in the following figure. |
| |
 |
| |
| |
Once you have done
this, click on the Finish button and on the
Ok button to close all the dialog boxes. |
| |
From the task toolbar,
click on the Transform data task icon and connect
the SQL Server and the text file connections. |
| |
Right-click the transformation
and choose the Properties. This will show the
properties dialog box. |
| |
In the Source
tab, choose the [Sheet1$] table as the source. |
| |
When you click on
the Destination tab, you are provided with
a dialog box titled Define Columns. This dialog
box allows you to decide what columns are to be exported and
whether they need to be delimited. We will not change anything
in this, thus choose "Execute" and the destination
tab will be populated with the values chosen as shown in the
following figure: |
| |
 |
| |
| |
Now click on the
transformations tab and select Delete All.
This will remove all existing transformations. Now click Select
All and click New. In the Create
New Transformation choose ActiveX
script and click Ok. You are now shown the
Transformation Options dialog box. Give a name
to the transformation as show in the following figure. |
| |
 |
| |
| |
Now choose the Properties
button. You are shown the script editor with some default
script already written in VBScript. Let's change the script
a bit so that we multiple the quantity field with 10. Modify
the code as shown in the following figure: |
| |
 |
| |
| |
Close all dialog
boxes by choosing Ok. If you now see the Transformations
tab, you will see that DTS has created a group transformation
which is an ActiveX script as shown in the following figure: |
| |
 |
|
| |
| OK, we are done now and you can execute
the package by clicking on the green execute icon. You will see a
confirmation dialog box that everything went fine. If you see the
output file that was generated (in our case it was generated as C:\Temp\Output.txt),
you will see the following output. |
| |
1001,Clips,1000
1002,File Folders,200
1003,Punching Machines,100 |
| |
| Note that the VBScript calculation
was applied to the "quantity" field. Real-world applications
can do more sophisticated transformations that even connect to other
data sources to fetch data. |
| |
Executing DTS packages
Till now, we have been executing DTS packages from the designer
itself. But in a production scenario, you might want to execute
a DTS package from a stored procedure or even from the command line.
DTS provides a command line tool called dtsrun for this purpose.
There are many options to this command, but we will see the simplest
of them here: |
| |
 |
| |
| This command can be embedded inside
a stored procedure by using the xp_cmdshell option.
You can also encrypt the entire command so that no one can see your
passwords. For this you need to use the /!Y option. |
| |
Conclusion
In this article we saw two very simple examples how DTS can be
used to import and export data. You can develop many complex applications
in DTS and there are many books that have been written on DTS. The
DTS API is also a very complex topic, but is currently out of the
scope of this article. Hopefully, you have got an insight of the
power of DTS and I would encourage you to explore it more. A very
good site on the various features of DTS is www.sqldts.com.
This site is maintained by Darren Green a SQL Server
MVP and is a storehouse of DTS examples. |
| |
| For more questions, you can mail
me at srisamp@hotmail.com
or visit my web site at http://www32.brinkster.com/srisamp.
|
| |