Excel Services part 10: Data Connection Libraries, or connecting to databases made easy, manageable, and secure

In the past few posts I have talked about all the work we have done to make managing, sharing, and securing your Excel workbooks better using SharePoint and Excel Services.  Today, I am going to cover a new feature that provides management, sharing, and security of data connections – the Data Connection Library (DCL).

What is a Data Connection Library?
A Data Connection Library (or DCL) is a new type of SharePoint library (much like a document library) that provides a place to store, share, and manage connection files.  By connection files, I mean Office Data Connection (ODC) files which contain all the information and parameters needed to form a data connection, such as server name, OLAP cube or table name, and query (note – ODC files are not a new feature – they were introduced in Excel 2002).  Since the DCL is a library in SharePoint, it comes with all the great SharePoint features you would expect – such as workflow support, file approval, library level/item level security, and sorting filtering based on metadata. You can create a DCL the same way you create any library, and DCLs can be created almost anywhere in SharePoint e.g. on a portal, team site, etc.   Here is what a DCL looks like in SharePoint 12.

(Click to enlarge)

Even though the DCL reuses the library concept in SharePoint, it is much more valuable than just a document library full of connection files – this is because of how Excel interacts with the DCL.  Let’s take a look at that and see how the DCL and Excel 12 solve a few problems.

Connecting to databases made easy …
Setting up a connection to a database in Excel is a task that many users struggle with – for example, if you want to connect to an ODBC datasource or SQL Server Analysis Services cube, users must know server names, cube names, table names, what type of connection to create, user credentials, etc.  Lots of clicks and knowledge are required. 

Excel 12 and DCLs make connecting to databases a much simpler, faster activity.  Users will simply need to know what data they want to work with, and that’s pretty much it.  Let’s take a look at an example of getting started with a PivotTable connected to a database in Excel 12.

To connect to a database (or other external data source like a web query) in Excel 12, users will use the Data tab.  Here is a shot of the Data tab in the beta build (note, this is not the final UI we will ship).

(Click to enlarge)

One of the buttons on the Data tab is the “Existing Connections” button.  To connect to the database they want to use for their PivotTable, the user will start by pressing this button.  When they do, they see a new dialog (“Existing Connections”) which lists the connections that are stored in the DCL. 

(Click to enlarge)

One point to note is that these connections all have friendly names and non-technical descriptions, so it is easy for users to choose the connection they want.  Those names and descriptions are provided by the person(s) who set up the DCL and populate it with ODC files – more on this below. 

Next, the user simply needs to highlight the connection they want to use and press open.  At this point, they see another dialog which allows the user to tell Excel 12 what to do with the data.

(Click to enlarge)

Another point to note is that Excel 12 looks at the connection the user has selected and only offers options that are possible for a particular database (for example, in this case, the datasource was a SQL Server Analysis Server database, which cannot be represented in Excel 12 as a table, so that option is disabled).  At this point, the user needs to simply press OK and they have a PivotTable connected to data in Excel 12.  That’s a total of 3-4 clicks.

Some of you are probably asking how the connections ended up in the DCL in the first place.  In general, we anticipate that either departmental “connection-savvy” power users or IT will author data connection files and put them in DCLs where the connections will be reused by many people in the organization. 

You may also be asking how Excel knows about the existence of DCLs.  SharePoint has a new feature that allows the administrator to “advertise” the location of the DCL to Office 12 clients, allowing connections from a DCL to show up in Excel 12.  Of course, the DCL only shows up if the user has permissions to access those connection files.

Solving connection management problems …
In addition to improving discoverability of connections, DCLs will help customers manage connections.  Information about data sources can change, such as server name, OLAP cube name, table name, etc. – a typical example is a database moving from a test server to a production server.  For organizations that have many authors, it may be difficult to communicate these changes to all the right people.  Worse yet, there may be hundreds of existing workbooks that need to have their data connections updated.  The DCL helps solve these problems because customers will only need to update a single connection file in the DCL with new information.  After they have done so, workbook authors will get the right connection information the next time they use that connection file, and any existing workbooks that were created using the connection file will now have their connections updated automatically the next time that workbook’s data is refreshed. 

How exactly does this work?  By default, workbooks will refresh their connection information from a DCL only when they fail to connect to the data source (you might think of this as a “failover” mechanism).  But we have also added the ability to force workbooks to always get the latest connection information before attempting to connect.  An example of when this might be useful is when you want workbook authors to start using a new database for business reports, but you still want to keep the old database around and functioning for auditing or test reasons.  Connections to the old database still work, but you want current and future workbooks to start using the new database.  The “always use this file to refresh data” setting is designed for exactly that kind of scenario.  The setting is a property in the ODC file itself – it can be set when the ODC file is created (pictured below).

(Click to enlarge)

Making data connectivity more secure …
Now that we have talked about discoverability and manageability, let’s conclude by looking at how the DCL can be used to make connecting to data more secure. 

One common security concern is knowing which data connections are safe to run – for example, data connections can contain malicious queries, or they could contain connection parameters that can slow an app down or compromise the integrity of the data.  By creating a DCL, and by only allowing most knowledgeable and trusted “connection authors” to save connections to the DCL, you add an extra layer of security that helps ensure that connections coming from a DCL safe to run. 

In a previous entry, I talked about Trusted Locations on Excel Services as a means to ensure that malicious workbooks were prevented from running on the server.  Much like Trusted Locations, Excel Services has “Trusted Connection Libraries” for data connections.  Excel Services has a mode where it will only process data connections from DCLs that the administrator has explicitly marked as “trusted” by the server.  As mentioned above, data connections have many security threats associated with them – in many ways processing a data connection can be like running code.  By providing Trusted Connection Libraries, Excel Services gives the administrator the ability to allow only specific data connections to be run on the server.    

That’s it for DCLs.  Next time we will take a look at how Excel Services integrates with SharePoint dashboards.