Abstract

An open problem in program verification is to verify properties of computer programs that manipulate memory on the heap. A key challenge is to find formal descriptions of the data structures that are instantiated, which are used as input to a proof procedure that verifies the program. Standard approaches to the problem are to severely restrict the space of data structures that can be recognized (e.g., just lists), or to use search guided by hand specified heuristics. In this work, we explore a machine learning-based approach, where we execute the program and then learn to map the state of heap memory (represented as a labelled directed graph) to a logical description of the instantiated data structures. We formulate the learning problem as one of mapping from graphs to an element of a grammar over data structure descriptions. We report preliminary empirical results showing this to be a promising new direction.