I think hard about abstractions that help programmers easily express complex problems yet are sufficiently constrained such that we can build runtimes to efficiently implement those abstractions.  Broadly speaking, I build abstractions and runtimes for (i) probabilistic programming and approximate computing and (ii) parallel programming.



Established: January 5, 2016

Uncertainty is a C# library that uses LINQ to let developers easily express probabilistic computations and then inference over those computations. See our recorded Research In Focus talk from the Microsoft Faculty Summit this past year for more information. Uncertain is on GitHub: (https://github.com/klipto/uncertainty)


Established: October 17, 2014

Parasail is a novel approach to parallelizing a large class of seemingly sequential applications wherein dependencies are, at runtime, treated as symbolic values. The efficiency of parallelization, then, depends on the efficiency of the symbolic computation, an active area of research in static analysis, verification, and partial evaluation. This is exciting as advances in these fields can translate to novel parallel algorithms for sequential computation.


Established: January 30, 2014

Crowd-sourcing is increasingly being used for providing answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, provide little that would help the survey-maker or pollster to obtain statistically significant results devoid of even the obvious selection biases. InterPoll is a platform for programming of crowd-sourced polls. Polls are expressed as embedded LINQ queries, whose results are provided to the developer. InterPoll…









Other projects:

  • Probabilistic Programming in General Purpose Programs: Applications that sense and reason about the complexity of the world use estimates. Mobile phone applications estimate location with GPS sensors, search estimates information needs from search terms, machine learning estimates hidden parameters from data, and approximate hardware estimates precise hardware to improve energy efficiency. The difference between an estimate and its true value is uncertainty.  This work builds novel tools and runtimes to help developers reason about uncertain data.  Uncertain is a generic type which lifts operations on T to distributions over T.  Its abstractions are designed to help novice developers incorporate, compute, and reason about uncertain data in their programs without requiring a PhD degree in statistics.  Likewise, probabilistic assertions let programmers specify quality constraints in programs as probabilistic correctness properties and our runtime and compiler efficiently verifies whether those probabilistic assertions are likely to hold.
  • Data Parallel Finite State Machines and Dynamic Programming: Hardware is parallel. While desktops, tablets, and even phones have vector instructions, multiple cores, and GPUs, many important algorithms are unable to exploit all of this parallelism.  In this work, we introduce convergence, a new method for breaking data dependencies in loops and demonstrate how it makes FSMs and a class of dynamic programming algorithms data parallel.  We frame both FSMs and dynamic programming algorithms as the repeated application of matrix-vector and matrix-matrix products in an appropriate semring.  Convergence is an empirical property of repeated matrix-matrix products wherein the rank of the resulting matrix after a small sequence of matrix-matrix products tends to be small. Our results demonstrate multi-factor speedups on data parallel hardware.