Data Formulator: AI-powered Concept-driven Visualization Authoring

With most modern visualization tools, authors need to transform their data into tidy formats to create the visualizations they want. Because this requires experience with programming or separate data processing tools, data transformation remains a barrier in visualization authoring. To address this challenge, we present a new visualization paradigm, concept binding, that separates high-level visualization intents from low-level data transformation steps, leveraging an AI agent. We realize this paradigm in Data Formulator, an interactive visualization authoring tool. With Data Formulator, authors first define the data concepts they plan to visualize using natural language or examples, and then bind them to visual channels. Data Formulator then dispatches its AI agent to automatically transform the input data to surface these concepts and generate the desired visualizations. When presenting the results (transformed table and output visualizations) from the AI agent, Data Formulator provides feedback to help authors inspect and understand them. A user study with 10 participants shows that participants could learn and use Data Formulator to create visualizations that involve challenging data transformations, and it points to interesting future research directions.
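To make the "tidy format" requirement concrete, here is a minimal sketch of the kind of reshaping an author would otherwise do by hand before binding fields to visual channels. The table, the `to_tidy` helper, and the `encoding` dictionary are all hypothetical illustrations; they are not Data Formulator's API or the code its AI agent generates.

```python
# Hypothetical wide table: one column per year (illustrative values only).
wide_rows = [
    {"product": "eggs", "2023": 2.8, "2024": 3.2},
    {"product": "milk", "2023": 3.9, "2024": 4.0},
]


def to_tidy(rows, id_column, var_name, value_name):
    """Unpivot wide rows so that each output row holds one observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "product", "year", "price")

# Assumed shape of a binding from data concepts to visual channels.
encoding = {"x": "year", "y": "price", "color": "product"}
```

Once the table is tidy, every concept named in `encoding` exists as a column, which is the precondition most chart libraries impose.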

Related tools

Data Formulator

September 5, 2024

Transform data and create rich visualizations iteratively with AI.

Data Formulator v0.5: Vibe with data, in control

Data Formulator is an AI-powered visualization tool that helps analysts iteratively explore and visualize data from any source (clean or messy, small or large) without coding. In agent mode, users provide high-level descriptions, and agents automatically plan, transform, refine, and visualize data to surface insights. To keep users in control, Data Formulator provides an interactive interface for interpreting results and providing fine-grained instructions to guide exploration. Analysts, data scientists, and anyone interested in data are invited to try Data Formulator and share feedback.

Data Formulator – Azure AI Foundry Labs | Early-Stage AI Experiments & Prototypes

Data Formulator – Product Hunt

Data Formulator: Vibe with your data, in control

Data Formulator is an AI-powered tool for analysts to iteratively explore and visualize data. Starting with data in any format (screenshot, text, CSV, or database), users work with AI agents through a novel blended interface that combines user interface (UI) interactions and natural language (NL) inputs to communicate their intents, control branching exploration directions, and create reports to share their insights.


Transcript


[MUSIC]  

[MUSIC FADES INTO SWEEPING SOUND]

KAREN EASTERBROOK: At Microsoft Research, we see fresh innovations emerge out of the labs ready to reshape how people and organizations harness technology in their everyday work.

A fellow Redmond-based colleague, Chenglong Wang, principal researcher, works with his team to reimagine how analysts interact with data. By combining natural language and intuitive interface design, they introduced Data Formulator. This tool helps users guide AI collaboratively—to visualize, explore, and truly “vibe” with data, in control.

Over to you, Chenglong.

[MUSIC]  

[MUSIC FADES INTO SWEEPING SOUND]

CHENGLONG WANG: Hello, everyone. I’m Chenglong from Microsoft Research Redmond. Today, I’d like to introduce you [to] Data Formulator. It’s an interactive AI-powered visualization tool for you to explore data with AI agents, in control.

Why did we build this tool? For example, given a dataset about US product prices from 2005 to 2025, we may be curious about how much product prices have grown over the years, as everything around us has become more expensive lately, right? The best way to answer this question is to explore the data and show insights with some visualizations. But how can we explore effectively?

To get these insights, we need to go through an iterative data exploration process.

We first need to formulate plans based on our data. Then we need programming skills to transform the data and implement the visualizations. We often don’t solve it in one step. We may need to go back to take another branch, to backtrack, and revisit some steps. After a few iterations, we should find some insights and be ready to share.

This process, however, can be challenging. It requires a user to have both data science expertise to formulate good questions and programming skills to implement these designs. How can we make this accessible to everyone who is interested in data?

Luckily, with AI agents that can generate code from user instructions, coding is no longer a big barrier. However, to make data analysis truly accessible, we still need to lower the interaction friction between user and AI.

The first challenge is that natural language can be quite universal, but it can be verbose for describing visualization intent and it may not be very precise. The second is that a chat interface is the usual way we interact with agents, but it doesn’t naturally support the iteration we want for data analysis. Third, agents can be difficult to control if they are working in a black box. The user may easily lose control over what the agent is doing.

In order to solve this problem, we introduce Data Formulator. It’s an interactive data analysis tool for people to work with agents effectively to explore data.

First, it features multi-modal interaction. The multi-modal UI combines UI interaction and natural language so that users can specify their design both precisely and concisely. Second, to support iteration, we organize exploration into threads, allowing users to backtrack, follow up, and revisit. On top of that, we developed agent mode. Instead of running in a black box, it operates on top of the data thread so the user can take control whenever they want.
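One way to picture the thread organization described here is as a tree of exploration steps, where following up adds a child and backtracking means branching again from an earlier node. The sketch below is an assumption made for illustration; `ThreadNode`, `follow_up`, and `path` are hypothetical names and do not describe how Data Formulator stores its data threads.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ThreadNode:
    """One exploration step; the parent link is what enables backtracking."""
    instruction: str
    parent: Optional["ThreadNode"] = None
    children: List["ThreadNode"] = field(default_factory=list)

    def follow_up(self, instruction: str) -> "ThreadNode":
        """Continue the exploration from this step with a new instruction."""
        child = ThreadNode(instruction, parent=self)
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Return the instructions from the root down to this step."""
        steps = []
        node = self
        while node is not None:
            steps.append(node.instruction)
            node = node.parent
        return list(reversed(steps))


root = ThreadNode("load price data")
trend = root.follow_up("line chart of price trends")
crisis = trend.follow_up("compare COVID and the 2008 crisis")
# Backtrack: branch again from the earlier step instead of the latest one.
gas = trend.follow_up("correlation with gas price")
```

Because each node keeps its parent, an agent or a user can resume from any earlier step without losing the other branches.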

Let me show you the experience of using Data Formulator to explore data.

Let’s explore: how much have US product prices increased over the decades? Let’s first load the data. This data. We can start with agent mode and ask the AI agent to automatically explore the data on our behalf. The agent first generates a plan, and then it generates code to execute the plan by transforming the data and visualizing it.

This is the first chart recommended by the agent. It’s a line chart for temporal price trends. Wow, everything has become more expensive.

It has now recommended a second chart. It’s a bar chart to show product volatility. It seems egg prices are the most volatile.
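The talk does not say how volatility is measured. As a hedged illustration only, the sketch below assumes the coefficient of variation (standard deviation divided by mean) and uses made-up prices; neither the metric nor the numbers come from the demo.

```python
from statistics import mean, pstdev

# Hypothetical yearly prices per product (illustrative values only).
prices = {
    "eggs": [1.5, 2.0, 4.5, 3.0],
    "milk": [3.5, 3.6, 3.8, 4.0],
}


def volatility(series):
    """Coefficient of variation: spread relative to the average price."""
    return pstdev(series) / mean(series)


# Products ordered from most to least volatile, ready for a bar chart.
ranked = sorted(prices, key=lambda name: volatility(prices[name]), reverse=True)
```

Dividing by the mean keeps cheap and expensive products comparable, which a raw standard deviation would not.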

Now we can take control; we can ask a follow-up question using natural language: how did COVID and the 2008 crisis affect prices?

It now generates a stacked bar chart, certainly not the best way to visualize it. We can use the UI to change it to a grouped bar chart. Now it’s clear. It seems that COVID affected prices more significantly. That’s very interesting.

We can now go back to the previous branch to explore something different. Hmm, what should we explore here? We can ask the AI agent for some ideas. It seems it would be interesting to see the price correlation with gas prices. Let’s go with that.

Now we get a bar chart to show the price correlation with gas prices. It seems that most product prices are correlated with gas prices, except … tomato?
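The talk does not specify which correlation measure sits behind this chart. The sketch below assumes the Pearson correlation coefficient and uses invented series, so both the `pearson` helper and the numbers are illustrative rather than taken from the demo.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical yearly prices (illustrative values only).
gas = [2.0, 2.5, 3.0, 3.5]
products = {
    "bread": [1.0, 1.2, 1.4, 1.6],   # rises in step with gas
    "tomato": [2.0, 1.8, 2.1, 1.9],  # fluctuates independently of gas
}

correlations = {name: pearson(series, gas) for name, series in products.items()}
```

A value near 1 means a product's price moves with gas prices, while a value near 0, as for the invented tomato series, means it does not.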

We can continue to explore, but now we can also write a report to share our findings. We can ask AI agents to generate a report grounded in our data. It will leverage the computation, data, and charts to compose a report.

As you can see, with AI agents automating the implementations, we can easily vibe with our data. The blended UI-plus-natural-language interaction approach allows us to easily work with agents. While the agent automates the steps, we can easily take control whenever we want to steer the exploration down the path we like.

So this is Data Formulator. It is an interactive tool for people to explore and visualize data. It features a multi-modal interface that makes it easy for users to specify visualization ideas. It provides a new design, data threads, so that users can perform nonlinear interactions with AI agents in exploration tasks. Finally, the agent and interactive modes allow users to vibe with data while still having full control. We invite you to try Data Formulator at data-formulator.ai. Explore some data and let us know what you find.