Deep Program Understanding

Established: June 1, 2015


We aim to teach machines to understand complex algorithms, combining methods from the programming languages and the machine learning communities.

Learning Algorithms

A core problem of machine learning is to learn algorithms that explain observed behavior. This can take several forms, such as program synthesis from examples, in which an interpretable program matching given input/output pairs has to be produced; or alternatively programming by demonstration, in which a system has to learn to mimic sequences of actions.
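To make program synthesis from examples concrete, here is a minimal sketch that enumerates programs in a toy language and returns the first one consistent with all given input/output pairs. The language, its primitives (`inc`, `double`, `square`, `neg`) and the function names are hypothetical and chosen only for illustration. Brute-force enumeration stands in for the learned search that a real system would use.

```python
from itertools import product

# Hypothetical toy DSL: every primitive maps an integer to an integer.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply the named primitives in `program` to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_len=3):
    """Return the shortest primitive sequence matching all (input, output)
    pairs, or None if no program of length <= max_len fits."""
    for length in range(max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None


# Input/output pairs generated by the hidden function f(x) = 2 * (x + 1).
examples = [(1, 4), (2, 6), (5, 12)]
print(synthesize(examples))
```

The result is interpretable because it is a readable sequence of named operations rather than an opaque model. The search space grows exponentially with program length, which is one reason to guide the search with learned models.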


Learning Software Engineering

Building “smart” software engineering tools requires learning from existing code, documentation and online resources (e.g. StackOverflow). Using machine learning techniques, we can capture developer intent from both natural language (e.g., a query such as “Read lines from File”) and patterns in source code (e.g., “The current project is in C#, and there is a variable called fileName”), and thus enable a more productive workflow. The Bing Developer Assistant is a first prototype in this line of work, and more research in this area is under way.


Learning to Analyse Programs

A central problem in program verification is the generation of program invariants, concise descriptions of which program states are reachable. Classical techniques such as abstract interpretation or counterexample-guided abstraction refinement synthesise invariants by static analysis (i.e., without executing the program). An alternative is to use existing program test suites to observe what “typical” program runs look like, and to construct a likely program invariant from these observations. The structure of the problem (“given a set of observations, predict an invariant”) lends itself well to the use of machine learning techniques.
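The “given a set of observations, predict an invariant” setting can be sketched in a few lines. The example below records program states at the head of a simple loop and keeps every candidate property that holds in all of them. This is plain template filtering, in the spirit of dynamic invariant detectors, not a machine learning method. The candidate templates, the instrumented loop and all function names are assumptions made for illustration.

```python
from itertools import combinations


def observe_loop(n):
    """Run a loop that sums 0..n-1 and record the state at each loop head."""
    states = []
    i, total = 0, 0
    while True:
        states.append({"i": i, "n": n, "total": total})
        if i >= n:
            break
        total += i
        i += 1
    return states


def likely_invariants(states):
    """Return the candidate properties that hold in every observed state.

    Candidates come from a small, hypothetical template set:
    v >= 0, a <= b, b <= a, and a == b over the observed variables.
    """
    if not states:
        return []
    names = sorted(states[0])
    candidates = []
    for v in names:
        candidates.append((f"{v} >= 0", lambda s, v=v: s[v] >= 0))
    for a, b in combinations(names, 2):
        candidates.append((f"{a} <= {b}", lambda s, a=a, b=b: s[a] <= s[b]))
        candidates.append((f"{b} <= {a}", lambda s, a=a, b=b: s[b] <= s[a]))
        candidates.append((f"{a} == {b}", lambda s, a=a, b=b: s[a] == s[b]))
    return [text for text, holds in candidates if all(holds(s) for s in states)]


# Observations gathered from a few "typical" runs, as a test suite might provide.
observations = [s for n in (0, 3, 5) for s in observe_loop(n)]
for invariant in likely_invariants(observations):
    print(invariant)
```

The surviving properties are only likely invariants: they hold on the observed runs but are not proven for all executions, so they would still need to be checked by a verifier.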