Learning Syntactic Program Transformations from Examples

  • Reudismam Rolim,
  • Gustavo Soares,
  • Loris D’Antoni,
  • Oleksandr Polozov,
  • Rohit Gheyi,
  • Ryo Suzuki,
  • Björn Hartmann

ICSE 2017

Automatic program transformation tools can be
valuable for programmers to help them with refactoring tasks,
and for Computer Science students in the form of tutoring systems
that suggest repairs to programming assignments. However,
manually creating catalogs of transformations is complex and
time-consuming. In this paper, we present REFAZER, a technique
for automatically learning program transformations. REFAZER
builds on the observation that code edits performed by developers
can be used as input-output examples for learning program
transformations. Example edits may share the same structure
but involve different variables and subexpressions, which must
be generalized in a transformation at the right level of abstraction.
To learn transformations, REFAZER leverages state-of-the-art
programming-by-example methodology using the following key
components: (a) a novel domain-specific language (DSL) for describing
program transformations, (b) domain-specific deductive
algorithms for efficiently synthesizing transformations in the DSL,
and (c) functions for ranking the synthesized transformations.
We instantiate and evaluate REFAZER in two domains. First,
given examples of code edits used by students to fix incorrect
programming assignment submissions, we learn program transformations
that can fix other students’ submissions with similar
faults. In our evaluation conducted on 4 programming tasks
performed by 720 students, our technique helped to fix incorrect
submissions for 87% of the students. In the second domain, we
use repetitive code edits applied by developers to the same project
to synthesize a program transformation that applies these edits
to other locations in the code. In our evaluation conducted on 56
scenarios of repetitive edits taken from three large C# open-source
projects, REFAZER learns the intended program transformation
in 84% of the cases using only 2.9 examples on average.
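
The core idea in the abstract, treating a handful of before/after edits as input-output examples and generalizing over the parts that differ, can be illustrated with a small sketch. This is not REFAZER's DSL or its synthesis algorithm: it swaps in plain anti-unification over Python ASTs as a simplified stand-in, it handles single expressions only, and it has no ranking step. The names `learn`, `apply_rule`, `generalize`, `match`, `fill`, and `rewrite` are hypothetical helpers introduced here for illustration. It assumes Python 3.9 or later for `ast.unparse`.

```python
import ast

def to_tree(node):
    """Turn an ast node into nested tuples: (type_name, child, ...)."""
    if isinstance(node, ast.AST):
        return (type(node).__name__,) + tuple(
            to_tree(getattr(node, field)) for field in node._fields)
    if isinstance(node, list):
        return ("list",) + tuple(to_tree(item) for item in node)
    return node  # primitive leaf such as an identifier or a constant

def to_node(tree):
    """Inverse of to_tree: rebuild ast nodes from nested tuples."""
    if isinstance(tree, tuple):
        if tree[0] == "list":
            return [to_node(item) for item in tree[1:]]
        cls = getattr(ast, tree[0])
        return cls(**dict(zip(cls._fields, map(to_node, tree[1:]))))
    return tree

def generalize(trees, holes):
    """Anti-unify trees: keep shared structure, put a hole where they differ.
    The same tuple of differing subtrees always maps to the same hole, so a
    hole in the 'before' pattern lines up with its use in the 'after' template."""
    first = trees[0]
    if all(t == first for t in trees):
        return first
    if all(isinstance(t, tuple) and t[0] == first[0] and len(t) == len(first)
           for t in trees):
        children = zip(*(t[1:] for t in trees))
        return (first[0],) + tuple(generalize(kids, holes) for kids in children)
    return holes.setdefault(tuple(trees), ("?", len(holes)))

def match(pattern, tree, binding):
    """Match a pattern against a tree, recording what each hole captures."""
    if isinstance(pattern, tuple) and pattern[0] == "?":
        if pattern[1] in binding:
            return binding[pattern[1]] == tree
        binding[pattern[1]] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return (pattern[0] == tree[0] and len(pattern) == len(tree)
                and all(match(p, t, binding)
                        for p, t in zip(pattern[1:], tree[1:])))
    return pattern == tree

def fill(template, binding):
    """Instantiate a template by substituting the captured subtrees."""
    if isinstance(template, tuple):
        if template[0] == "?":
            return binding[template[1]]
        return (template[0],) + tuple(fill(t, binding) for t in template[1:])
    return template

def rewrite(tree, pattern, template):
    """Apply the rule top-down at every subtree that matches the pattern."""
    binding = {}
    if match(pattern, tree, binding):
        return fill(template, binding)
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(rewrite(t, pattern, template) for t in tree[1:])
    return tree

def parse_expr(source):
    return to_tree(ast.parse(source, mode="eval").body)

def learn(examples):
    """Learn a (pattern, template) rule from (before, after) expression pairs."""
    holes = {}
    pattern = generalize([parse_expr(before) for before, _ in examples], holes)
    template = generalize([parse_expr(after) for _, after in examples], holes)
    return pattern, template

def apply_rule(rule, source):
    """Rewrite every matching location in a new expression."""
    return ast.unparse(to_node(rewrite(parse_expr(source), *rule)))

# Two example edits with the same structure but different subexpressions.
rule = learn([("x == None", "x is None"),
              ("y.z == None", "y.z is None")])
print(apply_rule(rule, "foo(a) == None and b == None"))
```

Given the two example edits, the differing left-hand sides (`x` and `y.z`) collapse into one hole shared by the pattern and the template, so the learned rule should rewrite both comparisons in the unseen expression to use `is None`. A single example would produce no holes at all and would only ever match `x == None` literally, which is one way to see why choosing the right level of abstraction matters.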