Analyzing the Entire Program: Applying Natural Language Processing to Software Engineering

  • Michael D. Ernst | University of Washington

A powerful, but limited, way to view software is as source code alone. Mathematical techniques, such as abstract interpretation and model checking, can indicate whether the program satisfies a formal specification. But where does the formal specification come from? A program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program. Researchers are beginning to make progress toward this vision. In this talk, I will discuss four initial results that find bugs and generate code by making use of variable names, error messages, procedure documentation, and user questions.

Series: Microsoft Research Talks