In a world where programming is largely based on using APIs, semantic code search emerges as a way to effectively learn how such APIs should be used. Towards this end, we present a formal framework for static speciﬁcation mining that is able to handle code snippets and incomplete programs. Our framework analyzes code snippets and extract partial temporal specifications. Technically, partial temporal speciﬁcations are represented as symbolic automata – automata where transitions may be labeled by variables, and a variable can be substituted by a letter, a word, or a regular language. With the help of symbolic automata, the use of the API is extracted from each snippet of code, and the many separate examples are consolidated to create a fuller usage scenario database that can be queried. We have implemented our approach in a tool called PRIME and applied it to analyze and consolidate thousands of snippets per tested API.
This talk is based on work with Alon Mishne, Sharon Shoham, Eran Yahav, and Hongseok Yang.