Microsoft Research Blog

Microsoft Research Blog

The Microsoft Research blog provides in-depth views and perspectives from our researchers, scientists and engineers, plus information about noteworthy events and conferences, scholarships, and fellowships designed for academic and scientific communities.

Neural-Guided Deductive Search: A best of both worlds approach to program synthesis

April 27, 2018 | By Alex Polozov, Researcher

Program synthesis — automatically generating a program that satisfies a given specification — is a major challenge in AI. In addition to changing the way we design software, it has the potential to revolutionize task automation. End users without programming skills can easily provide input-output examples of the desired program behavior. The Flash Fill feature in Microsoft Excel, a particularly successful application of this technology, demonstrates that a single example is often sufficient to generate the right program.

Figure 1. Flash Fill automatically completes a string transformation task from just a single example.

Designing program synthesis systems is challenging. You need to ensure that the synthesized program satisfies the provided examples. You also need to ensure its generalization; the program must produce the desired outputs on unseen inputs. Many possible programs are consistent with a single provided example and picking the most generalizable one is an important ML challenge. Finally, a user-facing synthesis system should be fast.

State-of-the-art program synthesis research follows one of two main directions: symbolic or neural. Symbolic systems use logical reasoning and a set of domain-specific rules to search for a program that satisfies the provided examples. They are designed to produce a satisfying program by construction. However, designing such rules involves significant engineering effort and the resulting search process may be exponentially slow.

In contrast, neural systems rely on end-to-end differentiable and trainable networks to generate the program. While much easier to implement and train, such networks require a lot of data to learn program synthesis. Because the data is usually randomly generated, neural systems often fail to capture real-world patterns and as a result produce unnatural programs that generalize poorly. Moreover, the purely statistical nature of neural networks does not provide any correctness guarantees on the synthesized program.

At the 6th International Conference on Learning Representations (ICLR), researchers from Microsoft Research will present neural-guided deductive search (NGDS), which combines the best of both AI approaches. It builds on top of the deductive search process, employed by symbolic synthesis systems, but augments it with neural guidance – a model that predicts, for each branching decision a priori, the generalization score of the best program that will be produced from that branch.

Consider, for example, the task in Figure 1. The most general program for it performs a concatenation of three string subexpressions: the first character of the first word, a constant string “. ”, and the last word.

During the search, a synthesis system first decides whether the top-level operator in the right program is a concatenation or a primitive subexpression (substring or constant string). If it decides the top-level operator is Concatenate, the system further reduces the provided input-output examples to the necessary input-output examples for the two subexpressions of Concatenate. These logical decisions introduce branches in the search process, most of which produce programs that satisfy the example but do not generalize to other inputs. Such branches can be eliminated a priori using our neural guidance.

Figure 2. A fragment of the deductive search process looking for the most generalizable program that satisfies the given input-output example. At each branching point in the search tree, the current state is fed into a neural model, which estimates the quality of the best program that is likely to be produced from each branch (shown as a green distribution curve; higher shapes correspond to more promising branches.)

Importantly, the sub-problems in the search process are independent; we can reason about a satisfying program for the sub-problem without factoring in the bigger problem from which it was deduced. This allows us to produce a lot of training data by recording all intermediate decisions in the search. From mere 385 string transformation tasks we generate over 400,000 real-world training examples.

We evaluated NGDS on a variety of real-life string transformation tasks and compared it to state-of-the-art program synthesis systems: PROSE (purely symbolic), RobustFill (purely neural), and DeepCoder (a neuro-symbolic hybrid). While NGDS and PROSE were given a single input-output example, we provided RobustFill and DeepCoder with extra examples because they are not explicitly designed for program generalization. In our experiments, NGDS achieved a 67% speed-up over the baseline PROSE system (improving performance of some tasks up to 12-fold), while retaining the same accuracy of task completion. The baseline approaches could not maintain an equal accuracy/performance balance, failing in one or both.

Figure 3. Accuracy and average speed-up of NGDS vs. baseline methods: PROSE, DeepCoder with 1-3 examples (DC 1-3), and RobustFill with 1-3 examples (RF 1-3).

We believe that new advances in program synthesis can be achieved with a combination of neural and symbolic methods. Logical reasoning and perfect program generalization augment the capabilities of neural systems, combining the strengths of both AI approaches. Our neural-guided deductive search research represents a pioneering achievement in neural-symbolic program synthesis. We are looking forward to expanding it to more challenging application domains.

Related:

Up Next

Human-computer interaction, Programming languages and software engineering

Neural Program Synthesis and the Quest to Democratize Programming with Dr. Rishabh Singh

Episode 10, January 31, 2018 - We can program computers to do almost anything. But what about programming computers to… program computers? That’s a task that Dr. Rishabh Singh, and the team in the Cognition group at Microsoft Research, are tackling with Neural Program Synthesis, also known as artificial programming.

Microsoft blog editor

Artificial intelligence

New Meta-learning Techniques for Neural Program Induction

Much research in AI lately focuses on extending the capabilities of deep learning architectures: moving beyond simple classification and pattern recognition into the realm of learning algorithmic tasks, such as inductive programming. Building on our past work in neural program synthesis for learning string transformations in a functional language, our most recent work explores the […]

Rishabh Singh

Researcher

program tree

Artificial intelligence, Programming languages and software engineering

Deep Learning for Program Synthesis

By Rishabh Singh, Jacob Devlin, Abdelrahman Mohamed, and Pushmeet Kohli, Microsoft Research Despite the many advances in computing over the past decades, the actual process of writing computer software has not fundamentally changed — a programmer must manually code the exact algorithmic logic of a program in a step-by-step manner using a specialized programming language. Although programming languages […]

Microsoft blog editor