Automating String Processing in Spreadsheets using Input-Output Examples
We describe the design of a string programming/expression language that supports restricted forms of regular expressions, conditionals, and loops. The language is expressive enough to represent a wide variety of string manipulation tasks that end-users struggle with. We describe an algorithm, based on several novel concepts, for synthesizing a desired program in this language from input-output examples. The synthesis algorithm is very efficient, taking a fraction of a second on various benchmark examples. It is also interactive and has several desirable features: it ranks multiple solutions and converges quickly, it can detect noise in the user input, and it supports an active interaction model in which the user is prompted to provide outputs for inputs that may have multiple computational interpretations.
The algorithm has been implemented as an interactive add-in for the Microsoft Excel spreadsheet system. The prototype tool has met the golden test: it has synthesized part of itself, and it has been used to solve problems beyond the authors' imagination.
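To make the idea of synthesis from input-output examples concrete, here is a minimal sketch in Python. It is not the paper's algorithm or language: it assumes a toy language with a single substring operator whose two endpoints are either constant offsets or boundaries of a regular-expression token, and it enumerates candidate programs by brute force. The token set and the names `TOKENS`, `eval_pos`, `run`, `pos_exprs`, `cost`, `synthesize`, and `ambiguous_inputs` are hypothetical and exist only for illustration.

```python
import re
from itertools import product

# Hypothetical token classes for this sketch only.
TOKENS = {"digits": r"\d+", "letters": r"[A-Za-z]+", "space": r"\s+"}


def eval_pos(pos, s):
    """Evaluate a position expression on s; return an index, or None if undefined."""
    if pos[0] == "const":
        # Non-negative constants count from the left, negative ones from the right.
        k = pos[1] if pos[1] >= 0 else len(s) + pos[1] + 1
        return k if 0 <= k <= len(s) else None
    _, name, occ, side = pos
    matches = list(re.finditer(TOKENS[name], s))
    if not -len(matches) <= occ < len(matches):
        return None
    return matches[occ].start() if side == "start" else matches[occ].end()


def run(prog, s):
    """A program is a (start, end) pair of position expressions; it returns a substring."""
    start, end = eval_pos(prog[0], s), eval_pos(prog[1], s)
    if start is None or end is None or start > end:
        return None
    return s[start:end]


def pos_exprs(s):
    """Map every index of s to the set of position expressions that evaluate to it."""
    exprs = {k: {("const", k), ("const", k - len(s) - 1)} for k in range(len(s) + 1)}
    for name, pattern in TOKENS.items():
        matches = list(re.finditer(pattern, s))
        for i, m in enumerate(matches):
            for occ in (i, i - len(matches)):  # i-th from the left and from the right
                exprs[m.start()].add(("tok", name, occ, "start"))
                exprs[m.end()].add(("tok", name, occ, "end"))
    return exprs


def cost(prog):
    """Toy ranking: prefer token-based positions over constant offsets."""
    return sum(p[0] == "const" for p in prog)


def synthesize(examples):
    """Return all programs consistent with every example, best-ranked first."""
    (inp, out), rest = examples[0], examples[1:]
    exprs = pos_exprs(inp)
    candidates = []
    i = inp.find(out)
    while out and i != -1:  # every occurrence of the output in the first input
        candidates.extend(product(exprs[i], exprs[i + len(out)]))
        i = inp.find(out, i + 1)
    consistent = [p for p in candidates if all(run(p, x) == y for x, y in rest)]
    return sorted(consistent, key=lambda p: (cost(p), repr(p)))


def ambiguous_inputs(programs, inputs):
    """Inputs on which the consistent programs disagree; candidates for asking the user."""
    return [s for s in inputs if len({run(p, s) for p in programs}) > 1]


examples = [("John Smith 425", "425"), ("Ann Lee 77", "77")]
programs = synthesize(examples)
print(run(programs[0], "Eve Adams 9"))
print(ambiguous_inputs(programs, ["Eve Adams 9", "Room 12 Floor 3"]))
```

In this sketch, the sorted list stands in for ranking, an empty result from `synthesize` signals examples that no program in the toy language can explain (a crude stand-in for noise detection), and `ambiguous_inputs` mirrors the active interaction model by flagging inputs where the remaining candidate programs produce different outputs. Explicit enumeration only works because the toy language is tiny; it is a simplification chosen for readability.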
On the occasion of receiving the most influential test-of-time paper award for his POPL 2011 paper (which describes the technology behind the popular Flash Fill feature in Excel), Sumit shares stories about his inspiration for the problem definition in the paper, the solution strategy, and the impact that followed. These stories span a decade before and after the paper, which was the most important turning point in his research philosophy and career. They reflect lessons that are as much cultural as technical.
Sumit acknowledges five groups of people in this 5-minute award acceptance talk for his "most influential" POPL 2011 paper, which describes the technology behind the popular Flash Fill feature in Excel. He shares inside stories of the role these people played in the invention and its impact.