Current-generation Web applications run at scales of hundreds of millions of users, each with thousands of objects, requiring petabytes of data storage even excluding large objects such as images and video. Such applications demand unprecedented scales of storage and extremely high availability, coupled with low latency for users across the world. Architecting such systems requires replicating data across globally distributed data centers. Such replication poses consistency challenges that application developers must deal with. Several data storage systems have been developed in recent years to address these challenges.
In this set of two talks, we start with an overview of the basic concepts of distributed transactions, atomic commit, and concurrency control with replication. After a brief overview of distributed file systems, we address the main topic of distributed data storage systems. Brewer’s CAP theorem and its variants have forced systems to consider different tradeoffs between availability/latency and consistency. We first study the architecture of distributed data storage systems that give greater importance to consistency of individual data items, focusing on three widely used, path-breaking systems: Bigtable, PNUTS, and Megastore. We then study systems that give greater importance to availability than to consistency, starting with a discussion of weak consistency and then illustrating practical issues in dealing with weak consistency, such as detection and resolution of inconsistencies, using Amazon Dynamo as an example. We conclude with directions for current and future work.