Probabilistic programs use the familiar notation of programming languages to specify probabilistic models. Suppose we are interested in estimating the distribution or expectation of a return expression r of a probabilistic program P. We are interested in slicing P to obtain a simpler program SLI(P) that retains only those parts of P that are relevant to estimating r and elides those parts of P that are not. We desire that the SLI transformation be both correct and efficient. By correct, we mean that P and SLI(P) have identical estimates on r. By efficient, we mean that estimation over SLI(P) is as fast as possible.
We show that the usual notions of program slicing, which traverse control and data dependencies backward from the return expression r, are unsatisfactory for probabilistic programs, since they produce incorrect results on some programs and inefficient results on others. Our key insight is that, in addition to the usual notions of control dependence and data dependence that are used to slice non-probabilistic programs, a new kind of dependence called observe dependence arises naturally due to observe statements in probabilistic programs.
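To make the incorrectness concrete, the following is a minimal sketch on a hypothetical two-coin program; it is an illustration only, not an example or an algorithm from our tools. The program draws x and y from fair coins, executes observe(x || y), and returns x. No statement that computes x depends on the observe statement, so a backward traversal from the return expression discards it. The helper posterior_x and its use_observe flag are names assumed for this sketch; the flag switches between the full program and the naive slice, and the estimate is computed exactly by enumeration.

```python
from fractions import Fraction
from itertools import product


def posterior_x(use_observe):
    """Exact P(x = True) for: x ~ coin; y ~ coin; [observe(x or y);] return x."""
    half = Fraction(1, 2)
    accepted = Fraction(0)        # total probability of runs that pass the observe
    accepted_and_x = Fraction(0)  # probability of passing runs in which x is True
    for x, y in product([False, True], repeat=2):
        weight = half * half      # two independent fair coin flips
        if use_observe and not (x or y):
            continue              # observe(x or y) rejects this run
        accepted += weight
        if x:
            accepted_and_x += weight
    return accepted_and_x / accepted  # normalize over the accepted runs


full_program = posterior_x(use_observe=True)   # original program P
naive_slice = posterior_x(use_observe=False)   # slice that drops the observe
print(full_program, naive_slice)
```

The two estimates differ, so the naive slice is not correct in the sense defined above: the observe statement, and through it the variable y, must be retained even though the return expression does not depend on them through control or data dependence alone.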
We propose a new definition of SLI(P) which is both correct and efficient for probabilistic programs, by including observe dependence in addition to control and data dependences for computing slices. We prove correctness mathematically, and we demonstrate efficiency empirically. We show that by applying the SLI transformation as a pre-pass, we can improve the efficiency of probabilistic inference, not only in our own inference tool R2, but also in other systems for performing inference such as Church and Infer.NET.