Microsoft Cluster Service (MSCS) extends the Windows NT operating system to support high-availability services. The goal is to offer an execution environment in which off-the-shelf server applications can continue to operate, even in the presence of node failures. Later versions of MSCS will provide scalability via a node and application management system that allows applications to scale to hundreds of nodes. This paper provides a detailed description of the MSCS architecture and the design decisions that have driven the implementation of the service. It also describes how some major applications use MSCS features, as well as the features added to make it easier to implement and manage fault-tolerant applications on MSCS.