Multiworld Testing

Established: November 1, 2013

Exponentially better than A/B testing. Multiworld Testing (MWT) is the capability to test and optimize over K policies (context-based decision rules) using an amount of data and computation that scales logarithmically in K, without necessarily knowing these policies before or during data collection. MWT can answer exponentially more detailed questions than traditional A/B testing. The underlying machine learning methodology draws on research on “contextual bandits” and “counterfactual evaluation”.
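To make "counterfactual evaluation" concrete, here is a minimal sketch of inverse propensity scoring (IPS), a standard estimator from the contextual bandit literature. It is an illustration under stated assumptions, not the Decision Service's code: the log format, the feature names, and the three policies are all hypothetical. The point is that one exploration log can score many policies, including ones written after the data was collected.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical log record: (context, action taken, observed reward,
# probability with which that action was chosen at logging time).
LogRecord = Tuple[Dict[str, str], str, float, float]
Policy = Callable[[Dict[str, str]], str]


def ips_estimate(log: List[LogRecord], policy: Policy) -> float:
    """Inverse propensity scoring estimate of a policy's average reward."""
    total = 0.0
    for context, action, reward, prob in log:
        # Only records where the policy agrees with the logged action count,
        # reweighted by 1/prob to correct for how often that action was explored.
        if policy(context) == action:
            total += reward / prob
    return total / len(log)


# Made-up exploration data: two actions, each chosen with probability 0.5.
log: List[LogRecord] = [
    ({"age": "young"}, "sports", 1.0, 0.5),
    ({"age": "young"}, "politics", 0.0, 0.5),
    ({"age": "old"}, "sports", 0.0, 0.5),
    ({"age": "old"}, "politics", 1.0, 0.5),
]

# Policies defined after the fact; none was needed during data collection.
policies: Dict[str, Policy] = {
    "always_sports": lambda ctx: "sports",
    "always_politics": lambda ctx: "politics",
    "by_age": lambda ctx: "sports" if ctx["age"] == "young" else "politics",
}

estimates = {name: ips_estimate(log, p) for name, p in policies.items()}
```

On this toy log the context-dependent policy scores 1.0 while each constant policy scores 0.5. Every additional policy reuses the same records, which is what allows evaluation over a large policy class without collecting new data for each one.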

A system for interactive learning. We implement MWT as MWT Decision Service, a machine learning system for making context-based decisions. The system supports the full cycle from exploration to logging to training policies to deploying them in production. Built as a cloud service, the system is widely applicable, modular, and easy to use. This is an ongoing project, released internally in Jun’15 and announced externally in Jul’16. The system is already deployed very successfully with MSN 

Multiworld Testing Decision Service

A typical example. Suppose one wants to optimize clicks on suggested news stories. To discover what works, one needs to explore over the possible news stories. Further, if the suggested news story can be chosen depending on the visitor’s profile, then one needs to explore over the possible “policies” that map profiles to news stories (and there are exponentially more “policies” than news stories!). Traditional machine learning fails at this because it does not explore, whereas the Decision Service explores continuously and optimizes decisions using this exploration data.
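The exploration step in this example can be sketched with epsilon-greedy, one simple exploration scheme among several. This is an assumption-laden illustration rather than the Decision Service API: the function name, the `ACTIONS` list, the `default_policy` rule, and the value of epsilon are all hypothetical. What matters is that each decision is logged together with the probability of the chosen action, since later evaluation and training depend on it.

```python
import random
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]

ACTIONS = ["sports", "politics"]  # hypothetical news categories


def default_policy(context: Context) -> str:
    """Hypothetical current policy mapping a visitor profile to a story."""
    return "sports" if context["age"] == "young" else "politics"


def explore_epsilon_greedy(
    context: Context,
    actions: List[str],
    policy: Callable[[Context], str],
    epsilon: float,
    rng: random.Random,
) -> Tuple[str, float]:
    """Choose an action and return it with the probability it was chosen.

    Assumes the policy always returns one of the listed actions.
    """
    preferred = policy(context)
    if rng.random() < epsilon:
        action = rng.choice(actions)  # explore uniformly at random
    else:
        action = preferred  # follow the current policy
    # Every action gets epsilon / len(actions) from the uniform draw;
    # the preferred action additionally gets the remaining 1 - epsilon.
    prob = epsilon / len(actions)
    if action == preferred:
        prob += 1.0 - epsilon
    return action, prob


rng = random.Random(0)
log = []
for age in ["young", "old", "young"]:
    ctx = {"age": age}
    action, prob = explore_epsilon_greedy(ctx, ACTIONS, default_policy, 0.2, rng)
    # The reward (click or no click) would be joined to this record later.
    log.append((ctx, action, prob))
```

Because every action keeps a nonzero probability, the resulting log can be used to evaluate policies other than the one that was running when the data was collected.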

Team. We are a diverse group of researchers working on all aspects of MWT, spanning algorithms, machine learning, systems, and economics, and covering the entire range from theory to experiments to practical deployments. Most of us are located at Microsoft Research NYC. We can be contacted at



MWT Decision Service
Jul 2016: external announcement.
Jun 2015: internal release.

MWT Exploration library
A library for MWT, structurally compatible with learning algorithms in Vowpal Wabbit.
Nov 2014: external release.

MWT white paper (rev. March 2016)
Jul 2016: rev2 released
Sep 2015: released externally
Jun 2015: released internally

Deployment: personalized news on MSN
Deployed on 100% of the traffic; 25% lift in clicks.
Innovation Award from Microsoft’s Universal Storefronts.



  • Jul 2016: Decision Service announced externally.
  • Jul 2016: MWT white paper (rev2) released.
  • Mar 2016: a demo at MSR TechFest 2016 (Redmond, MSFT-only).
  • Mar 2016: Innovation Award from Microsoft’s Universal Storefronts for the MSN deployment.
  • Jan 2016: Decision Service for personalized news on MSN: deployed on 100% of the traffic.
  • Nov 2015: Mini-course (Redmond, WA, MSFT-only) [internal link].
  • Nov 2015: Mini-course (Cambridge, UK, MSFT-only) [internal link].
  • Sep 2015: Decision Service for personalized news on MSN: first test flights.
  • Sep 2015: MWT white paper released externally.
  • Jun 2015: MWT Decision Service (v1) released internally.
  • Mar 2015: Workshop on Interactive Machine Learning (Redmond, MSFT-only) [internal link].
  • Mar 2015: a demo at MSR TechFest 2015 (Redmond, MSFT-only).
  • Nov 2014: MWT exploration library released.
  • Oct 2014: Mini-course (Redmond, MSFT-only) [internal link].
  • Oct 2014: Tutorial at “Practice of ML Conf.” (Redmond, MSFT-only).
  • Mar 2014: Workshop (Redmond, MSFT-only) [internal link].
  • Mar 2014: a demo and a lecture at MSR TechFest 2014 (Redmond, MSFT-only).


Background & details

MWT white paper (pdf) – Background for potential users of the Decision Service: machine learning methodology and system design, and how to make them fit YOUR application. Also covers past deployments and experimental evaluation. For a broad technical audience, both in product groups and in research.

Slide deck on MWT and the Decision Service.

Decision Service Wiki: tutorials, guides and references.

Tutorials and lectures:

Most relevant papers:

Background reading: