Symbolic reasoning of large programs is bound to be imprecise. How to deal with this imprecision is a fundamental problem in program analysis. Imprecision forces approximation. Traditional static program verification builds “may” over-approximations of the program behaviors to check universal “for-all-paths” properties, while automatic test generation requires “must” under-approximations to check existential “for-some-path” properties.

In this paper, we introduce a new approach to test generation where tests are derived from validity proofs of first-order logic formulas, rather than satisfying assignments of quantifier-free first-order logic formulas as usual. Two key ingredients of thishigher-order test generation_ are to (1) represent complex/unknown program functions/instructions causing imprecision in symbolic execution by uninterpreted functions, and (2) record uninterpreted function samples capturing input-output pairs observed at execution time for those functions. We show that higher-order test generation generalizes and is more precise than simplifying complex symbolic expressions by using their concrete runtime values as done in DART. We present several program examples where our approach can exercise program paths and find bugs missed by previous techniques. We discuss the implementability and applications of this approach. We also explain why dynamic test generation is more powerful than static test generation.