The traditional extract, transform, and load (or ETL) model of integrating data from standard data sources continues to be the model used in most data warehouses.
Businesses today use information from a wide range of data sources. Extracting and merging data from multiple sources to create a single, consistent set can be challenging, especially in environments where data may reside in legacy stores, relational databases, files, and even the Internet.
SSIS provides enterprise-class scalability options out of the box with its advanced data-integration pipeline architecture, high-performance processing, and native support for 64-bit platforms. SSIS also provides native support for asynchronous and parallel execution of packages, tasks, and transformations, which maximizes the efficient use of available resources.
Out of the box, SSIS provides a rich set of advanced transformations, making it possible to build data warehouses and data marts rapidly. SSIS enables the complex aggregations and sorting required to prepare data for loading into dimensions and fact tables.
Data stored and transformed in a business intelligence application may undergo many transformations before it reaches users. Lineage Analysis is critical to understanding the dependencies of data at any point in the flow, and makes it possible to trace data through all its transformations, starting at the source of the data.
SQL Server Integration Services, Analysis Services, and Reporting Services all use a common Visual Studio–based development environment called the SQL Server Business Intelligence Development Studio. The Business Intelligence Development Studio provides an integrated development environment for developing business intelligence applications, thus enabling developers to create SSIS integration solutions.
SQL Server Management Studio aids in the management of SSIS package-execution from a system standpoint. By taking advantage of new manageability features in SSIS, a much more granular level of management and robustness can be built into SSIS packages.
The primary goal of a data integration solution is to assemble data from one or more data sources. As you bring data together, you are likely to find a broad range of data-quality issues that require attention. Examples of poor data quality include duplicate, missing, misrepresented, or incorrect data, compounded by the extremely large volume of data being integrated. SSIS provides a flexible strategy for handing these issues by providing three key tasks for data quality solutions: profiling, cleansing, and auditing.
Even though SSIS provides a comprehensive set of tasks, developers occasionally need to build on the available functionality to meet the special needs of their applications. One of way developers can extend and customize the SSIS platform is by creating custom scripts in Visual Basic .NET.
While SSIS does include a large variety of built-in components, customers may want to extend the capabilities of SSIS to meet the unique needs of their data integration and Business Intelligence applications. The full functionality of SSIS can be accessed through a rich set of APIs, which can be used to build custom components.