# R2: A Probabilistic Programming System

Established: July 16, 2013

# What is R2?

R2 is a probabilistic programming system that uses powerful techniques from program analysis and verification for efficient Markov chain Monte Carlo (MCMC) inference. The language used to describe probabilistic models in R2 is based on C#. R2 compiles the given model into executable code that generates samples from the posterior distribution. The inference algorithm currently implemented in R2 is a variation of the Metropolis-Hastings sampling algorithm.
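R2's internals are not shown on this page; as illustrative background for the description above, a minimal random-walk Metropolis-Hastings sampler for a one-dimensional target density might look like the following sketch (the names and structure are ours, not R2's):

```python
import math
import random

def metropolis_hastings(log_density, x0, num_samples, proposal_std=1.0, seed=0):
    """Random-walk Metropolis-Hastings: propose a Gaussian step and
    accept it with probability min(1, p(x') / p(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(num_samples):
        x_prop = x + rng.gauss(0.0, proposal_std)
        # The proposal is symmetric, so the acceptance ratio reduces
        # to the ratio of target densities.
        log_alpha = log_density(x_prop) - log_density(x)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = x_prop
        samples.append(x)
    return samples

# Example: sample from a standard normal (log density up to a constant).
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, num_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The estimated mean and variance should be close to 0 and 1, the moments of the standard normal target.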

# Getting R2

#### Examples

Let us see how to use R2 to perform inference on a probabilistic model by means of an example: the famous Monty Hall problem, which can be stated as follows:

> Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He says to you, “Do you want to pick door #2?” Is it to your advantage to switch your choice of doors?
>
> – Marilyn vos Savant, Parade Magazine, 1990


We need to find the probabilities of the contestant winning the car when they keep their original choice and when they switch. For this, we construct a model, which can be represented in R2 by the program listed in the Code section below.

First, we initialize the game by choosing the door behind which the car is placed; goats are behind the other two doors. This is done by randomly picking a door for the car: since each door is equally likely to be picked, we draw a sample from a Categorical distribution over 3 values, each with probability 1/3.

Next, we model the contestant picking a door. Again, each door has equal probability of being picked, so this too is a draw from a Categorical distribution over 3 values, each with probability 1/3.
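A categorical draw like the ones above can be implemented by inverting the cumulative distribution of the probability vector. A sketch (ours, not R2's implementation):

```python
import random

def sample_categorical(probabilities, rng=random):
    """Return index i with probability probabilities[i] by walking
    the cumulative distribution until it exceeds a uniform draw."""
    u = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return i
    return len(probabilities) - 1  # guard against floating-point round-off

# With equal probabilities, each of the three doors is equally likely:
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(30000):
    counts[sample_categorical([1/3, 1/3, 1/3], rng)] += 1
```

Each count should come out close to 10000, i.e. one third of the draws per door.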

Then, we make the host open one of the other two doors that has a goat behind it. If the contestant has chosen the door hiding the car, the host picks one of the other two doors with equal probability. Otherwise, only one of the remaining doors hides a goat, and the host opens that door.

Finally, we check whether the contestant wins the car by keeping their choice or by switching. These outcomes are stored in the boolean variables chosenDoorContainsCar and otherDoorContainsCar, both of which are returned in a tuple.
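Because this model has no observations, it can also be checked by plain forward simulation. The following hypothetical Python translation mirrors the structure of the R2 program and estimates both winning probabilities by simple Monte Carlo:

```python
import random

def monty_hall_trial(rng):
    """One round of the game; returns (win by staying, win by switching)."""
    car_door = rng.randrange(3)     # door hiding the car
    chosen_door = rng.randrange(3)  # contestant's initial pick
    # Host opens a goat door other than the contestant's choice.
    if car_door == chosen_door:
        opened_door = rng.choice([d for d in range(3) if d != car_door])
    else:
        opened_door = next(d for d in range(3)
                           if d != car_door and d != chosen_door)
    # The one door that is neither chosen nor opened.
    other_door = next(d for d in range(3)
                      if d != chosen_door and d != opened_door)
    return chosen_door == car_door, other_door == car_door

rng = random.Random(0)
trials = [monty_hall_trial(rng) for _ in range(100000)]
p_stay = sum(stay for stay, _ in trials) / len(trials)
p_switch = sum(switch for _, switch in trials) / len(trials)
```

The estimates should come out near 1/3 for staying and 2/3 for switching, matching the output R2 produces below.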

Now, let us use R2 to perform inference on this model. In a command prompt, type the following:

`> R2 MontyHall.cs /numSamples:1000000`

R2 generates samples from the given model and computes the mean and variance of the variables of interest from the generated samples. In this case, R2 prints the mean and variance of each element of the returned tuple. You should see output similar to the following:

```
0 Mean:     0.332854
0 Variance: 0.222062
1 Mean:     0.667146
1 Variance: 0.222062
```

As is clear from the above output, the contestant has a higher chance of winning the car if they change their choice. (Each returned value is 0/1-valued, so its variance is p(1 − p); with p ≈ 1/3 this gives (1/3)(2/3) ≈ 0.2222, matching the printed variances.)

More examples can be found in the examples/ directory of the R2 installation location.

For more information on using R2, please refer to the R2 user guide in the docs/ directory.

#### Limitations

R2 currently has the following limitations which we are working towards fixing in future releases:

1. The Metropolis-Hastings sampling algorithm is the only inference algorithm currently available.
2. The convergence characteristics of the Metropolis-Hastings algorithm depend on the proposal distribution used. R2 tunes the proposal distribution with a heuristic; there is currently no facility for the user to tune it manually.
3. Multi-variate distributions are not supported.
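To illustrate limitation 2, one common tuning heuristic (not necessarily the one R2 uses) adapts the proposal scale during burn-in toward a target acceptance rate, shrinking the step size when too few moves are accepted and widening it when too many are:

```python
import math
import random

def tune_proposal_std(log_density, x0, target_accept=0.44,
                      batches=50, batch_size=100, seed=0):
    """Adapt a random-walk proposal's standard deviation during burn-in:
    after each batch, scale the step size up if the observed acceptance
    rate exceeds the target and down otherwise."""
    rng = random.Random(seed)
    x, std = x0, 1.0
    for _ in range(batches):
        accepted = 0
        for _ in range(batch_size):
            x_prop = x + rng.gauss(0.0, std)
            log_alpha = log_density(x_prop) - log_density(x)
            if rng.random() < math.exp(min(0.0, log_alpha)):
                x = x_prop
                accepted += 1
        rate = accepted / batch_size
        std *= 1.1 if rate > target_accept else 1 / 1.1
    return std

# A narrow target forces the tuner to shrink the step size; a wide one
# forces it to grow.
narrow = tune_proposal_std(lambda x: -0.5 * (x / 0.05) ** 2, x0=0.0)
wide = tune_proposal_std(lambda x: -0.5 * (x / 10.0) ** 2, x0=0.0)
```

The tuned step size for the narrow Gaussian target ends up well below the starting value of 1.0, and the one for the wide target well above it.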

#### Collaborators

• Johannes Borgström (Uppsala University)
• Andy Gordon (MSR Cambridge)
• Gil Hur (Seoul National University)

#### Interns

• Arun Tejasvi Chaganty (Stanford University)
• Guillaume Claret (ENS Paris)
• Vijay D’Silva (UC Berkeley)
• Christian von Essen (Verimag)
• Sulekha Kulkarni (Georgia Tech)
• Sherjil Ozair (IIT Delhi)
• Robert J. Simmons (CMU)
• Samin Ishtiaq (MSR Cambridge)
• Claudio Russo (MSR Cambridge)

#### Contact

If you face any issues using R2, or have questions/comments regarding it, please feel free to contact us at: r2pp AT Microsoft DOT com.

### People

• Sriram Rajamani (Senior Researcher; Managing Director, Microsoft Research India Lab)

## Publications

#### 2015

##### A Provably Correct Sampler for Probabilistic Programs
Chung-Kil Hur, Aditya Nori, Sriram Rajamani, in Foundations of Software Technology and Theoretical Computer Science (FSTTCS), Leibniz International Proceedings in Informatics, December 1, 2015.
##### Efficient Synthesis of Probabilistic Programs
Aditya Nori, Sherjil Ozair, Sriram Rajamani, Deepak Vijaykeerthy, in Programming Language Design and Implementation (PLDI), ACM, June 1, 2015.

#### 2014

##### R2: An Efficient MCMC Sampler for Probabilistic Programs
Aditya Nori, Chung-Kil Hur, Sriram Rajamani, in AAAI Conference on Artificial Intelligence (AAAI), AAAI, July 1, 2014.
##### Slicing Probabilistic Programs
Chung-Kil Hur, Aditya Nori, Sriram Rajamani, in Programming Language Design and Implementation (PLDI), ACM, June 1, 2014.
##### Probabilistic Programming
Andy Gordon, Thomas A. Henzinger, Aditya Nori, Sriram Rajamani, in International Conference on Software Engineering (ICSE Future of Software Engineering), IEEE, May 1, 2014.

#### 2013

##### Semantics Sensitive Sampling for Probabilistic Programs
Aditya Nori, Chung-Kil Hur, Sriram Rajamani, October 1, 2013.
##### Bayesian Inference Using Data Flow Analysis
Guillaume Claret, Sriram Rajamani, Aditya Nori, Andy Gordon, Johannes Borgström, in ESEC/FSE 2013 Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, ACM, August 1, 2013.
##### Efficiently Sampling Probabilistic Programs via Program Analysis
Arun T. Chaganty, Aditya Nori, Sriram Rajamani, in Artificial Intelligence and Statistics (AISTATS), April 1, 2013.
##### A Model-Learner Pattern for Bayesian Reasoning
Andrew D. Gordon, Mihhail Aizatulin, Guillaume Claret, Thore Graepel, Aditya Nori, Sriram Rajamani, Claudio Russo, Johannes Borgström, in Principles of Programming Languages (POPL), ACM, January 1, 2013.
##### Probabilistic Inference Using Program Analysis
Andy Gordon, Aditya Nori, Sriram Rajamani, January 1, 2013.

## Code

```csharp
using System;
using MicrosoftResearch.R2Lib;
using MicrosoftResearch.R2Lib.Distributions;

class MontyHall
{
    public Tuple<bool, bool> Infer()
    {
        double[] probabilities = new double[] { 1.0 / 3, 1.0 / 3, 1.0 / 3 };

        // Set up the game: pick the door hiding the car uniformly at random.
        int carDoorNum = Categorical.Sample(probabilities);

        // Contestant picks a door uniformly at random.
        int chosenDoorNum = Categorical.Sample(probabilities);

        // Host opens a goat door other than the contestant's choice.
        int[] possibleOpenDoors;
        int openedDoorNum = -1;
        int i, j;
        if (carDoorNum == chosenDoorNum)
        {
            // Both remaining doors hide goats; host picks one at random.
            possibleOpenDoors = new int[2];
            j = 0;
            for (i = 0; i < 3; ++i)
            {
                if (i != carDoorNum)
                {
                    possibleOpenDoors[j] = i;
                    ++j;
                }
            }

            j = Categorical.Sample(new double[] { 0.5, 0.5 });
            openedDoorNum = possibleOpenDoors[j];
        }
        else
        {
            // Only one remaining door hides a goat; the host must open it.
            for (i = 0; i < 3; ++i)
            {
                if ((i != carDoorNum) && (i != chosenDoorNum))
                {
                    openedDoorNum = i;
                }
            }
        }

        // Does the contestant win without switching?
        bool chosenDoorContainsCar = (chosenDoorNum == carDoorNum);

        // Does the contestant win by switching to the other unopened door?
        bool otherDoorContainsCar = false;
        for (i = 0; i < 3; ++i)
        {
            if ((i != chosenDoorNum) && (i != openedDoorNum))
            {
                otherDoorContainsCar = (i == carDoorNum);
            }
        }

        return new Tuple<bool, bool>(chosenDoorContainsCar,
                                     otherDoorContainsCar);
    }
}
```