Microsoft Research Data Visualization Apps for Office

Established: February 6, 2013

At Microsoft Research, we want to let users play with new types of visualizations. We’re doing it with Apps for Office, a new feature for Office 2013 and Office 365. These let users easily add new functionality to Office.

Apps for Office require Office 2013 or Office 365. You can get a free trial of Office 365. Our apps are in the Microsoft Research Office Store. (Need help getting started? Check out this handy guide.)

We’d Love Your Feedback!
Please ask questions and tell us about what you are doing at our Online Forum.

This is a Research Deployment
We have released the Visualization Toolkit for us to learn about how users interact with advanced visualization. We return a small amount of anonymized information to learn how our users are working with the application; we keep no personally-identifiable information. This information is stored on secure servers, and is used only for research purposes.

Histogram

Histogram: Microsoft Research Data Visualization Apps for Office

A histogram shows the distribution of data. It will count and bucket data points, and create a bar graph of the frequency of those points. A histogram can be helpful to see how many data points fall into different categories.

How to use

First, add the Histogram from the Office Store. (Here are the basic instructions for getting started with Apps for Office.)

Click Sample Data to add a column of sample data to your spreadsheet.

Make sure you have data in exactly one selected, and click Set Data to set the data in the histogram to the column.

The slider controls the number of bins: drag it left to make fewer, big bins; drag right to make more, small ones.

The Color box allows you to select a color for the bars.

The histogram can view either categorical data (such as text labels) or continuous values. The Category checkbox forces the system to choose one bin for each distinct value in the dataset.

A Histogram with five bins, showing the distribution of the sample data at left.

2D Histogram (Density Plot)

2D Histogram: Microsoft Research Data Visualization Apps for Office

A 2D Histogram, or Density Plot, visualizes the joint distribution of two different axes: it lets you see how much of one thing there is, given another.

Consider the scenario in the scene below. There are two columns of raw data. On the left side, we have built individual histograms of each one–and they both look a bit like normal curves. Only the 2D histogram reveals that there is a connection between the two dimensions: that high values in one axis go with low values on another.

A 2D histogram is sort of like a scatterplot, but can show clusters of points. Even if there are a few thousand points, or the points overlap, the density plot can still show them all.

How to use

First, add the Histogram 2D App from the Office Store. (Here are the basic instructions for getting started with Apps for Office.)

Click Sample Data to add two columns of sample data to your spreadsheet. Click the button again to see categorical-vs-numerical data, which can look at the frequency of sales of “apples” compare to “pears”. Click a third time to see sample data for categorical-vs-categorical.

Make sure you have data in exactly two columns selected, and click Set Data to set the data in the histogram to the column.The two sliders control the number of bins: drag it left to make fewer, big bins; drag right to make more, small ones.The Color box allows you to select a color for the squares in the chart.The histogram can view either categorical data (such as text labels) or continuous values. The Category checkbox forces the system to choose one bin for each distinct value in the dataset.

Treemap

Treemap: Microsoft Research Data Visualization Apps for Office

A treemap is a hierarchical view of data: it breaks an area into rectangles representing branches of a tree. Each rectangle is broken in turn into subrectangles, representing subbranches. Each subrectangle can be associated with two different numbers, once of which may be set to a color; the other to a size.

A treemap can make it easy to discover how a whole is broken into parts: where the budget in your organization goes, or where the space on your disk has gone.

For example, here’s an organization divided into four offices: North, South, West, and East. Each sells volumes of “food”, “office”, and “tech”; each of these, in turn, has subcategories. We’re using the size of each box to represent sales volume, and the color to represent the profit.

The Treemap was invented by Ben Shneiderman, who maintains a history of different treemaps.

Data Formats

Excel doesn’t naturally have a way to represent hierarchical data: it likes rows and columns. In the Treemap module, you can enter data either as a name list, or as an id/parent set.

Name List

The Name List format might be most natural for things like disk space. In a name list, you enter as many columns as you need to represent the deepest part of the hierarchy–and then leave some parts of it blank. Here’s an example from a disk:

Level 1 Level 2 Level 3 Level 4 Level 5
C:
C: Files Personal User MyFile.txt
C: Temp

You might also use the Name List to handle data that isn’t completely hierarchical:

 

Region

CategoryProductVolumeProfit Each
WestSprocketsType A10000.5
WestSprocketsType B400-5
EastRatchets 20001.0

Note the “Volume” and “Profit” columns. We can use them for size and color, respectively.

Id/parent

If you have an organization chart stored in a database, then it might look different: you might have records with a person and their manager. Here’s a small organization:

NameManager
David FreeseWilfredo Baber
Wilfredo BaberYolanda Hill
Yolanda Hill

Since it’s very annoying to switch between the “Name List” and the “Id/Parent” forms, the Treemap application accepts both.

Using the App

For basic instructions on starting up an App for Office, see the description on t.

The easiest way to get started with the app is to generate sample data. on the “Sample Data” tab, press either “Name List” or “ID/Parent”.

After data is available, it will take three selections to build a complete chart: one to add the names; one to select the column for the size, and one more to select the column for color. Note that these last two columns are optional: it’s a perfectly fine treemap even without size or color.

Streamgraph

Data Visualization Apps for Office

A Streamgraph is a more unusual type of graph. It shows the way that a set of numbers has changed over time by their relative area. (See the sidebar for the history of Streamgraphs).

A streamgraph is based on a rectangle of data: in this example, “Comedy” had a volume of 10 in 2008, and of 12 in 2010.

YearComedyDramaSci-FiAction
2008101060
2009121330
201014181016
201122229.355

Once you have the app installed, you can press “Generate” to get sample data that is appropriately formatted.

There are two buttons to specify the chart data. Press “Series Values…” to select the data that represents the values of the dataset, including the column headers. In the table below, those are italicized.

Press “Horizontal axis labels…” to select the data that represents the column names. In this example, you would select the boldfaced years.

All of the values in the table should be numbers greater than zero. If the application sees a number less than zero, it will replace it with a zero.

Apps for Access

Apps for Access

Our implementations of core visualizations are a testbed to learn about how these tools are used with database tools.

We provide core visualizations, familiar to users of Excel and other popular charting tools:

  • A bar chart and column chart. These differ only in the orientation of the bars.
  • A line chart and an area chart; both offer stacked or overlapping areas. An area chart is filled; a line chart only draws the top line.
  • A pie chart
  • A scatter chart, or scatter plot.

For each of these apps, the system expects see one column of X axis labels, and several columns of series data. Press “Generate” to insert sample data on the sheet.

Year   Sales   Profit
2001   100   14
2002   122   12
   …   …

Click [Series Data…] to select the series: Sales and Profit.

Click [Label Data…] be used to select the labels: the Year column.

This image shows a column chart for the sample data above.

Notes

For the area chart and line chart, the visualization will attempt to interpret the ‘labels’ column as ascending, numeric X axis values. If that works, then it will render them appropriately spaced; the spacing need not be regular. If any of the values are not numeric, or not ascending, then each label will instead be placed as evenly-spaced categories.

Charts behave inconsistently with negative Y-axis values.

Note that in the current version of the charts, the X and Y axes are automaitcal

People