I am a member of the Data Management, Exploration and Mining group in Microsoft Research Redmond. I work on various aspects of database systems, mostly related to data integration and transaction processing.

Currently, I’m working on Orleans, a distributed systems programming framework that was released as open source in January 2015. I gave a keynote about it at DISC 2014; slides are posted here.

My early research was primarily on transaction processing and, after a long hiatus, I resumed working in this area in 2006 as a co-designer of the database engine for SQL Azure. I then focused on building Hyder, a prototype transactional indexed-record manager that scales out without partitioning. Sudipto Das and I recently surveyed techniques for multi-master replication. I have also published two books on transaction processing:

Principles of Transaction Processing (2nd edition), coauthored with Eric Newcomer, was published in June 2009 by Morgan Kaufmann Publishers, a division of Elsevier. It has been translated into Chinese, Japanese, and Korean.
Concurrency Control and Recovery in Database Systems, coauthored with Vassos Hadzilacos and Nathan Goodman, is downloadable for free from here.

I also work on data integration problems. From 2000 to 2011 I led the Model Management Project, whose goal was to make database systems easier to use for model-driven applications, such as design tools, message translators, and database translators. I also worked on object-to-relational mapping, especially in support of the ADO.NET Entity Framework. Over the years, this work has been done in close collaboration with Sergey Melnik (now at Google), James Terwilliger (Microsoft), Eli Cortez (Microsoft), Suad Alagic, Alon Halevy (Google), Jayant Madhavan (Google), Renée Miller (Univ. of Toronto), Peter Mork (Noblis), Rachel Pottinger (Univ. of British Columbia), Christoph Quix (Technical Univ. of Aachen), Erhard Rahm (Univ. of Leipzig), Adi Unnithan (Microsoft), and many great interns.

I’ve published many research papers on transaction processing, data integration, and other aspects of database management. You can find a nearly complete list at the DBLP Computer Science Bibliography.


Concept Expansion

Established: November 10, 2014

Given a concept name and seed entities, return the entities and tables that belong to this concept. Sway Presentation

Rethinking Eventual Consistency

Established: July 31, 2013

The past five years have seen a resurgence of work on replicated, distributed database systems, to meet the demands of intermittently-connected clients and disaster-tolerant database systems that span data centers. Each product or prototype uses a weakened definition of replica-consistency or isolation, and in some cases new mechanisms, to obtain improvements in partition-tolerance, availability, and performance. We have developed a framework for defining and comparing weaker consistency and isolation properties. We show how these weaker…

Hyder, a transactional indexed-record manager for shared flash

Established: February 8, 2013

Hyder is a transactional indexed-record manager for shared flash. That is, it supports operations on indexed records and transaction operations that bracket the record operations. It is designed to run on a cluster of servers that have shared access to a large pool of network-addressable storage, which stores the indexed records as a multiversion log-structured database. Hyder's main feature is that it scales out without partitioning the database or application. In Hyder, the database is…

Horton – Querying Large Distributed Graphs

Established: October 14, 2010

Horton was a research project to enable querying large distributed graphs. It consists of a graph library built on top of Orleans that targets hosting large graphs in a data center. The library provides a querying interface to search the graph for matching paths.

Academic visitors and collaborators:

  • Dr. Mohamed Mokbel, UMN, visiting researcher in 2012
  • Dr. Sherif Sakr, UNSW & NICTA, visiting researcher in 2011
  • Isabelle Stanton, Berkeley, summer intern in 2011
  • Juan Mendivelso, National…

Orleans – Virtual Actors

Established: October 14, 2010

Project "Orleans" invented the Virtual Actor abstraction, which provides a straightforward approach to building distributed interactive applications, without the need to learn complex programming patterns for handling concurrency, fault tolerance, and resource management. Orleans applications scale up automatically and are meant to be deployed in the cloud. Orleans has been used heavily by a number of high-scale cloud services at Microsoft, starting with cloud services for the Halo franchise, which have been running in production in Microsoft Azure since 2011. The core…

Model Management

Established: November 5, 2001

The goal of model management is to develop a generic infrastructure that offers an order-of-magnitude productivity improvement to builders of model-driven applications, such as database tools, application design tools, message translators, and customizable commercial applications. We have worked on both abstract operators that manipulate models and mappings and on practical applications of this technology. Abstract operators include the following:

  • Match – returns correspondences between elements of two given schemas
  • Merge – returns an integration of…







Mapping XML to a Wide Sparse Table
Peter Carlin, Dimitrije Filipovic, Michael Rys, Nikita Shamgunov, James Terwilliger, Liang Jeff Chen, Phil Bernstein, Milos Todic, Sasa Tomasevic, Dragan Tomic, in IEEE Transactions on Knowledge and Data Engineering, January 1, 2013


Mapping XML to a Wide Sparse Table
Peter Carlin, Dimitrije Filipovic, Michael Rys, Nikita Shamgunov, James Terwilliger, Milos Todic, Sasa Tomasevic, Dragan Tomic, Liang Jeff Chen, Phil Bernstein, International Conference on Data Engineering, January 1, 2012










The Lowell Database Research Self Assessment
Stan Zdonik, Jennifer Widom, Gerhard Weikum, Jeff Ullman, Rick Snodgrass, Mike Stonebraker, Avi Silberschatz, Timos Sellis, Hans Schek, Jeff Naughton, David Maier, Serge Abiteboul, Rakesh Agrawal, Phil Bernstein, Mike Carey, Stefano Ceri, Bruce Croft, David DeWitt, Mike Franklin, Hector Garcia Molina, Dieter Gawlick, Jim Gray, Laura Haas, Alon Halevy, Joe Hellerstein, Yannis Ioannidis, Martin Kersten, Michael Pazzani, Mike Lesk, Association for Computing Machinery, Inc., June 1, 2003








Programmability at Cloud Scale


July 24, 2015


Gul Agha, Judith Bishop, Philip Bernstein, and Sergey Bykov


Microsoft Research, University of Illinois at Urbana-Champaign, Microsoft


AstroInformatics 2012: Day 1 Session 4


September 10, 2012


Alexander Szalay, Phil Bernstein, Nigel Sharp, and Daniel Katz


Johns Hopkins University, Microsoft Research, National Science Foundation, Office of Cyberinfrastructure


Professional Activities

Concurrency Control and Recovery in Database Systems

Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman

This page offers a free download of the above book in PDF format. You can read or print it using Adobe’s Acrobat Reader, subject to the restrictions on the copyright page, which is the second page of the Preface.

The easy way to obtain the book is to download the zip file of the entire book from here (22.9 MB), then extract it to a folder with WinZip. The Contents.pdf file in the root folder offers convenient access to the rest of the book, and the page numbers in the Index are linked to the proper pages. If you take this route, you can ignore the other directions below.

Download Individual Book Sections

If you have a low-bandwidth connection, you can download individual pieces and assemble them on your computer. This requires more work: for the links between files to work properly, you must place the .PDF and index files (used for linking between files) in the proper folder hierarchy on your computer. The proper hierarchy is as follows:

  • Root Level
    • contents.pdf
    • Images folder
      • Index Folder (expand all subfolders under Index here)
      • appendix.pdf, biblio.pdf, Chapters 1-8.pdf, glossary.pdf, index.log, index.pdf, index.pdx, and preface.pdf

Below are all the files, which you can download one by one. All of these files are governed by the copyright notice in the preface.

Zip file of the index folder (520 KB)

contents.pdf (115 KB)

index.pdf (986 KB)

preface.pdf (575 KB)

chapter1.pdf (1.68 MB)

chapter2.pdf (1.31 MB)

chapter3.pdf (4.6 MB)

chapter4.pdf (2.25 MB)

chapter5.pdf (1.75 MB)

chapter6.pdf (3.67 MB)

chapter7.pdf (3.36 MB)

chapter8.pdf (3.47 MB)

appendix.pdf (295 KB)

glossary.pdf (1.27 MB)

bibliography.pdf (1.84 MB)