Click Here to Install Silverlight*
IndiaChange|All Microsoft Sites
MSDN
|Developer Centers|Library|Downloads|How To Buy|Subscribers|My MSDN
 
Basics of Data Transformation Services (DTS)
By Srinivas Sampath
 
Article Posted: August 19, 2003
 
Data Exchange is the buzzword of today's organizations. Data tends to be everywhere and in various formats. Often management is not worried about the source of the data, but rather the way in which data can be presented to them from the various sources. What this means is that, you would need applications that consolidate data from various sources into some meaningful centralized data store so that you can derive reports off it. This centralized data store can be a SQL Server database. What is missing in the puzzle then is, how do you aggregate data from the various sources. There are many different ways you can do this aggregation:


  •  
  • You could use custom applications that fetch data from your various sources and then update the relevant SQL Server database.
  •  
  • You could use the Data Transformation Services (DTS) framework that comes with SQL Server.


    Before we delve into the features of DTS, let's see why the first method is not suitable. Writing your own custom applications for data aggregation is a very tedious task, since you need to write connection code for the various sources and the destination and also provide facilities for massaging the data before it reaches its destination. This is quite a lot of code and if suddenly the process for massaging some data changes, you need to recompile your code, test it and do a whole lot of other maintenance chores. This is where DTS steps in. DTS is a set of tools that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by the DTS connectivity options. By using DTS tools to graphically build DTS packages or by programming a package with the DTS object model, you can create custom data movement solutions tailored to the specialized business needs of your organization. Since most of the underlying plumbing is taken care for you, all you need to do is only concentrate on the business task at hand.

     
    In this article, we will see the basic building blocks of DTS and see a few examples of how you can use DTS to develop some very simple applications.
     

    DTS Building Blocks

    A DTS package is the underlying framework on which the other elements are built. A DTS package is an organized collection of connections, tasks, transformations, and workflow constraints assembled either with a DTS tool or programmatically and saved to one of the stores that DTS supports. The various stores that are supported are: Microsoft SQL Server, SQL Server 2000 Meta Data Services, structured storage file, or a Microsoft Visual Basic file. Each package contains one or more steps that are executed sequentially or in parallel when the package is run. When executed, the package connects to the various data sources, copies data and database objects, transforms data, and notifies other users or processes of events. A package consists of the following elements:

     
  •  
  • Connections
  •  
  • Tasks
  •  
  • Transformations
  •  
  • Workflow
     

    DTS Tasks

    A DTS task is a discrete set of functionality which is executed as a single step inside a package. Each task defines a single work item to be performed as part of the data movement and transformation process, or as a job to be executed. DTS supplies a number of tasks that are part of the DTS object model which can be accessed graphically, through DTS Designer, or programmatically (via a COM API). These tasks, which can be configured individually, cover a wide variety of data copying, data transformation, and notification situations. For example, you have tasks that will let you:

     
  •  
  • Import and Export data
  •  
  • Transform data
  •  
  • Copy various database objects
     

    And many more too. Because DTS is based on an extensible COM model, you can create your own custom task and then use them to design your data movement solution. These custom tasks can also be integrated with the designer.

     

    DTS Transformations

    A DTS transformation is a set of one or more functions that operate on a piece of data. The data that it operates on is called the source data and the output of the transformation is usually what you want at the destination. Sometimes you might also want to apply more than one transformation before the data reaches its destination. Transformations make it easy to implement complex data validation, data scrubbing, and conversions during the import and export process. Apart from stock set of transformations that DTS supplies (like SubString, Trim etc), you can also write custom transformation functions using a scripting language like VBScript.

     

    DTS Package Workflow

    A workflow allows you to control the flow of execution of the various tasks inside a DTS package. Workflows allow you set a precedence constraints that allow you to link tasks together based on whether the first task executes, executes successfully, or executes unsuccessfully. You can use precedence constraints to build conditional branches in a workflow. Steps without constraints are executed immediately, and several steps can execute in parallel. You can also manually control the workflow behavior using ActiveX scripts.

    One final note. Packages can also be versioned so that you progressively build your solution and invoke the various versions based on demand. You can also protect your packages with a password so that you can protect who can access your package. Finally, all DTS packages can be executed from the command line using the command line tool for DTS.

    Ok, now that we have some of the basics out of the way, let's see some examples of how to work with DTS. We will develop some sample DTS packages that illustrate some of the core concepts of DTS. Before we get into the examples, let's see where you can find DTS. Open the SQL Server enterprise manager and locate the Data Transformation Services node. The following figure shows where you can find this node.

     
     

    Sample A, Simple Import

    As our first case study, let's consider that we have the list of products as an XL sheet as shown in the following figure:

     
     

    We can see that we have 3 columns of information: the product code, the name of the product and the item in stock. Our requirement is to import this piece of information into a SQL Server database table. Let's see how we can write a DTS package for achieving this. Open the DTS designer by choosing the Local Packages node, right-clicking the node and choosing New Package from the context menu. You will now be presented with the DTS designer canvas. Now follow these steps:

     
  •  
  • From the connection toolbar, choose the Microsoft Excel 97-2000 icon and drag and drop it on the designer surface. You will now see the Connection Properties dialog box. In this you will need to provide a name for the connection and the source of the XL sheet. This is shown in the following figure:
       
     
  •  
  • The designer now places the XL icon on the designer canvas. You have just finished configuring the source for the solution.
  •  
  • From the connection toolbar, choose the Microsoft OLEDB Provider for SQL Server icon and drag and drop it on the designer surface. You will now see the Connection Properties dialog box. In this you will need to provide a name for the connection, the SQL Server, the connection information and the database to connect to. This is shown in the following figure:
     
     
  •  
  • The designer now places the server icon on the designer canvas. At this stage, you have finished configuring the destination for the solution.
  •  
  • From the task toolbar, click on the Transform data task icon (the one with a gear and arrow in between). When you move the cursor to the designer surface, the cursor icon changes to select source connection. Click on the XL icon for the source. The cursor now changes to select destination connection. Click on the server icon for the destination.
  •  
  • The designer now draws an arrow mark from the source (XL sheet) to the destination (SQL Server) indicating that we want to move data from the XL sheet to the SQL Server.
  •  
  • Choose the transformation arrow, right-click and choose Properties. You are now presented with the Transform Data Task Properties dialog box. Here, you have various tabs to enter information.
  •  
  • In the Source tab, enter a name for the transformation and choose the Table/View option. Note that the designer automatically selects the first sheet of the XL file and shows that name. If you click on the Preview button, you can see the data from the XL sheet. This is shown in the following figures.
     
     
     
  •  
  • Choose the Destination tab and click on the Create button. This will show a dialog box with a CREATE TABLE script. This is required, since we need to create a new table to hold the information from the XL sheet. If a table already exists, you can select it from the Table name combo box. In the CREATE TABLE dialog box, click OK. The dialog box now shows the new table that will be created. This is shown in the following figures:
     
     
       
  •  
  • Note that SQL Server automatically infers the data type from the XL sheet. You can change this in the CREATE TABLE dialog box if required.
  •  
  • Click on the Transformations tab. You will now see that SQL Server has automatically mapped the source to the destination. For now, let us not change anything in this tab. Click on the Ok button in the dialog box.
     
    With this we complete the basic DTS package. You can now execute this package by clicking on the green arrow icon on the main toolbar. This will execute the package and also displays a status of the execution. This is shown in the following figure:
     
     

    You can now see if the data has been successfully imported by executing the following query in the query analyzer tool (in the pubs database).

     
    SELECT * FROM [Sheet1$]
     
    That was a really simple task!! But if you look at it, we have accomplished an amazing piece of functionality with a few drag and drops. Imagine if you had to code all this manually!! Finally, you can save the package by clicking on the Package/Save menu. This will show the following dialog box:
     
     

    Click on the OK button to save the package to the SQL Server repository. If you choose Structured Storage File or Visual Basic File in the Location combo box, the designer will create an appropriately named file. Note that a Structured Storage File option is very efficient for transport across a network.

     

    Sample B, Simple Export

    One of the most common questions that I see coming in the SQL Server newsgroups is how to export a table as a CSV file. Although there are many methods to do this, DTS offers a solution, where you can also transform the data during the process of export and thus is more configurable. In this example, we will see how to export the contents of a table into a flat-file and doing some transformation in the process.

     
  •  
  • Start by creating a new package using the methods outlined in the earlier example.
  •  
  • From the connection toolbar, choose the Microsoft OLEDB Provider for SQL Server icon and drag and drop it on the designer surface. Configure the connection to connect to the pubs database.
  •  
  • From the connection toolbar, choose the Text File (Destination) icon and drag and drop it on the designer surface. In the Connection Properties dialog box, give the name of a file and then click on the Properties button. This is shown in the following figure.
     
     
  •  
  • The properties dialog box allows you to configure the file format. Here, we will choose to have a delimited text file as shown in the following figure.
     
     
  •  
  • Once you have done this, click on the Finish button and on the Ok button to close all the dialog boxes.
  •  
  • From the task toolbar, click on the Transform data task icon and connect the SQL Server and the text file connections.
  •  
  • Right-click the transformation and choose the Properties. This will show the properties dialog box.
  •  
  • In the Source tab, choose the [Sheet1$] table as the source.
  •  
  • When you click on the Destination tab, you are provided with a dialog box titled Define Columns. This dialog box allows you to decide what columns are to be exported and whether they need to be delimited. We will not change anything in this, thus choose "Execute" and the destination tab will be populated with the values chosen as shown in the following figure:
     
     
  •  
  • Now click on the transformations tab and select Delete All. This will remove all existing transformations. Now click Select All and click New. In the Create New Transformation choose ActiveX script and click Ok. You are now shown the Transformation Options dialog box. Give a name to the transformation as show in the following figure.
     
     
  •  
  • Now choose the Properties button. You are shown the script editor with some default script already written in VBScript. Let's change the script a bit so that we multiple the quantity field with 10. Modify the code as shown in the following figure:
     
     
  •  
  • Close all dialog boxes by choosing Ok. If you now see the Transformations tab, you will see that DTS has created a group transformation which is an ActiveX script as shown in the following figure:
     
     
    OK, we are done now and you can execute the package by clicking on the green execute icon. You will see a confirmation dialog box that everything went fine. If you see the output file that was generated (in our case it was generated as C:\Temp\Output.txt), you will see the following output.
     
    1001,Clips,1000
    1002,File Folders,200
    1003,Punching Machines,100
     
    Note that the VBScript calculation was applied to the "quantity" field. Real-world applications can do more sophisticated transformations that even connect to other data sources to fetch data.
     

    Executing DTS packages

    Till now, we have been executing DTS packages from the designer itself. But in a production scenario, you might want to execute a DTS package from a stored procedure or even from the command line. DTS provides a command line tool called dtsrun for this purpose. There are many options to this command, but we will see the simplest of them here:

     
     
    This command can be embedded inside a stored procedure by using the xp_cmdshell option. You can also encrypt the entire command so that no one can see your passwords. For this you need to use the /!Y option.
     

    Conclusion

    In this article we saw two very simple examples how DTS can be used to import and export data. You can develop many complex applications in DTS and there are many books that have been written on DTS. The DTS API is also a very complex topic, but is currently out of the scope of this article. Hopefully, you have got an insight of the power of DTS and I would encourage you to explore it more. A very good site on the various features of DTS is www.sqldts.com. This site is maintained by Darren Green a SQL Server MVP and is a storehouse of DTS examples.

     

    For more questions, you can mail me at srisamp@hotmail.com or visit my web site at http://www32.brinkster.com/srisamp.

     

    ©2009 Microsoft Corporation. All rights reserved. Contact Us |Terms of Use |Trademarks |Privacy Statement
    Microsoft