Microsoft Research Blog

Influencing mainstream software—Applying programming language research ideas to transform spreadsheets

January 14, 2019 | By Andy Gordon, Principal Research Manager; Simon Peyton Jones, Principal Researcher

Simon Peyton Jones writing with a fellow researcher at a desk

Spreadsheets are the world’s most widely used programming language, by several orders of magnitude. We asked ourselves whether programming language research ideas could make spreadsheets a better programming language. If they could, that would empower a huge user community to do more.

One of the joys of working at Microsoft Research is the ability to directly influence mainstream software technologies – in this case, Microsoft Excel. And the Excel product team has been excited to partner with us. The core ideas of our collaboration are these:

  • Make Excel’s data values reflect the datatypes of our end users’ domains, by adding arrays, vectors, records, and even domain-specific data types implemented by third parties. A first step is to make arrays—already a very natural data type for Excel—into first-class values that can be computed by any formula; see Champagne Prototyping: A Research Technique for Early Evaluation of Complex End-User Programming Systems (VLHCC’04).
  • Make Excel functions reflect the abstractions of our end users, by allowing end users to define new functions using an ordinary worksheet, rather than by switching to a totally different language (and computation paradigm) such as Visual Basic. See A User-Centred Approach to Functions in Excel (ICFP’03).

Through deep collaboration between Microsoft Research Cambridge and the Microsoft Excel product team, Excel is now moving decisively to reflect these core principles. The first pieces have already appeared: an Excel cell can now contain a first-class record, linked to external data sources. And ordinary Excel formulas can now compute array values that “spill” into adjacent cells (dynamic arrays).
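
To make the “spill” idea concrete, here is a minimal Python sketch of one way to model it: a formula in a single anchor cell produces a two-dimensional array, and the result fills the cells to the right and below. The names `Grid`, `spill`, and `SpillError` are hypothetical and chosen for illustration only; this is a toy model, not how Excel is implemented. It assumes that a spill fails when a target cell is already occupied, which mirrors the #SPILL! error Excel reports in that situation.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]      # (row, column), zero-based
Grid = Dict[Cell, object]   # hypothetical toy grid: only occupied cells are stored


class SpillError(Exception):
    """Raised when the spill range is blocked (Excel reports #SPILL!)."""


def spill(grid: Grid, anchor: Cell, result: List[List[object]]) -> Grid:
    """Return a new grid with the 2-D `result` placed at `anchor`,
    spilling to the right and downwards."""
    r0, c0 = anchor
    targets = {
        (r0 + i, c0 + j): value
        for i, row in enumerate(result)
        for j, value in enumerate(row)
    }
    # Any occupied cell other than the anchor itself blocks the spill.
    blocked = [cell for cell in targets if cell != anchor and cell in grid]
    if blocked:
        raise SpillError(f"spill range blocked at {sorted(blocked)}")
    new_grid = dict(grid)
    new_grid.update(targets)
    return new_grid


# A formula in cell (0, 0) that computes a 2x2 array fills four cells.
grid = spill({}, (0, 0), [[1, 2], [3, 4]])
```

Returning a new grid rather than mutating the old one keeps the sketch easy to reason about; a real recalculation engine would have to track dependencies and undo a spill when the anchor formula changes.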

The Calc Intelligence team and Excel colleagues sharing a moment at Microsoft Research Cambridge, UK.

There is more to come: we have a working version of full, higher-order, lexically scoped lambdas (and let-expressions) in Excel’s formula language, and we are prototyping sheet-defined functions and full nesting of array values.
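
For readers less familiar with the terminology, the following Python sketch illustrates what “higher-order” and “lexically scoped” mean. It is an analogy only and says nothing about the syntax of the Excel formula language; the helpers `let`, `map_array`, and `make_scaler` are hypothetical names introduced for this example.

```python
from typing import Callable, List


def let(value, body):
    """A let-expression: bind `value` to the parameter of `body`, then evaluate it."""
    return body(value)


def map_array(f: Callable[[float], float], xs: List[float]) -> List[float]:
    """Higher-order: the function `f` is passed in as an ordinary argument."""
    return [f(x) for x in xs]


def make_scaler(rate: float) -> Callable[[float], float]:
    # Lexical scoping: the lambda refers to the `rate` that was in scope
    # where the lambda was written, not to whatever `rate` means at the call site.
    return lambda x: x * (1 + rate)


add_half = make_scaler(0.5)
rate = 99.0  # an unrelated binding; it does not affect `add_half`
scaled = map_array(add_half, [10.0, 20.0])

# Name an intermediate result once, then reuse it.
doubled_total = let(sum(scaled), lambda s: s + s)
```

Here `scaled` is `[15.0, 30.0]` and `doubled_total` is `90.0`, regardless of the later assignment to `rate`.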

New directions

Our Calc Intelligence project works at the boundary between programming language research and human experience and design. This interdisciplinary work has thrown up interesting new ideas that will extend the reach and ambition of what end users can achieve with Excel, including:

  • Calc View. Traditional programming sees programs principally as textual code, with the data being delayed until runtime. In contrast, spreadsheet programming focuses on the data and its visualisation, with the code (the formulas) being almost entirely hidden. What if we were to provide both views at once? The grid would sit side-by-side with a textual “calculation view” that could give far more insight into the computations that are going on, illuminated by comments, layout, and all the other mechanisms that programmers use to make their code robust, modular, and long-lived. See Calculation View: multiple-representation editing in spreadsheets (VLHCC 2018).
  • Elastic Sheet Defined Functions. End users will initially define new user-defined functions on fixed-size arrays, but we would like to generalise them to work on arbitrary-sized arrays. You could come up with ad-hoc ways to do this for special cases, but could we imagine a principled generalisation of a single example to arbitrary-sized inputs? We did. See Elastic Sheet-Defined Functions: Generalising Spreadsheet Functions to Variable-Size Input Arrays (in submission).
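
The problem behind elastic sheet-defined functions can be illustrated with a small Python sketch. The first function stands in for a function drawn on a fixed three-row worksheet range; the second is a hand-written generalisation to any number of rows. Both functions and their names are hypothetical, and the generalisation here is done by hand for one special case: it is not the algorithm from the cited paper, which aims to derive such a generalisation automatically and in a principled way.

```python
from typing import Sequence


def fixed_total(prices: Sequence[float], quantities: Sequence[float]) -> float:
    """Defined on exactly three rows, like a function laid out on a fixed range."""
    if len(prices) != 3 or len(quantities) != 3:
        raise ValueError("fixed-size function expects exactly 3 rows")
    # One "cell" per row, each repeating the same formula pattern.
    line1 = prices[0] * quantities[0]
    line2 = prices[1] * quantities[1]
    line3 = prices[2] * quantities[2]
    return line1 + line2 + line3


def elastic_total(prices: Sequence[float], quantities: Sequence[float]) -> float:
    """Hand-generalised version: the repeated row formula becomes a loop,
    so the function accepts any number of rows."""
    if len(prices) != len(quantities):
        raise ValueError("inputs must have the same number of rows")
    return sum(p * q for p, q in zip(prices, quantities))


three_rows = fixed_total([1, 2, 3], [4, 5, 6])
four_rows = elastic_total([1, 2, 3, 4], [1, 1, 1, 1])
```

On three-row inputs the two functions agree; only the elastic one accepts the four-row input.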

In each case, the needs of end users have driven us in intriguing (and often technically demanding) new research directions. Our internal prototypes and strong cross-company relationships provide wonderful intellectual opportunities for research here in Cambridge.

Alongside our long-term partnership with the Excel team, we have close connections with researchers at other Microsoft Research labs who work on spreadsheets. For example, research by our colleagues at Microsoft Research Asia in Beijing on extracting insights from multi-dimensional datasets has influenced the Ideas feature in Excel, while recent research from Microsoft Research in Redmond, Washington, shows how deep learning can find errors in spreadsheets.

The Office team is thinking creatively and broadly about how Office apps can evolve to embrace the opportunities that surround us. There is no better time to be working in Microsoft Research alongside the product teams to change the future!
