Single Instance Storage in Windows 2000

Bill Bolosky, Scott Corbin, David Goebel, John (JD) Douceur

Proceedings of the 4th USENIX Windows Systems Symposium

Published by USENIX

Certain applications, such as Windows 2000’s Remote Install service, can result in a set of files in which many different files have the same content. Storing these files separately in a traditional file system wastes both disk space and main-memory file cache space. Hard or symbolic links would eliminate the excess resource requirements, but would change the semantics of having separate files, in that updates to one “copy” of a file would be visible to users of another “copy.” We describe the Single Instance Store (SIS), a component within Windows® 2000 that implements links with the semantics of copies for files stored on a Windows 2000 NTFS volume. SIS uses copy-on-close to implement the copy semantics of its links. SIS is structured as a file system filter driver that implements links and a user-level service that detects duplicate files and reports them to the filter for conversion into links. Because SIS links are semantically identical to separate files, SIS creates them automatically when it detects files with duplicate contents. This paper describes the design and implementation of SIS in detail, briefly presents measurements of a remote install server showing a 58% disk space savings from using SIS, and discusses other possible uses of SIS.
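The duplicate-detection role of the user-level service can be illustrated with a minimal sketch. It is not the SIS implementation: the abstract does not say how duplicates are found, so the approach here (group files by size, then by a SHA-256 digest of their contents) is an assumption, and the names `file_digest` and `find_duplicate_groups` are hypothetical. A real service would also confirm candidates byte for byte and hand each group to the filter driver for conversion into links, which this sketch does not attempt.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(root):
    """Return sorted groups of paths under root whose contents match.

    Assumption for illustration: equal size plus equal SHA-256 digest is
    treated as "same content". Symbolic links are skipped.
    """
    # Pass 1: group by size, so files with a unique size are never hashed.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: within each size group, group by content digest.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)

    # Keep only groups with at least two members; these are the candidates
    # that would be reported for conversion into links.
    return sorted(sorted(group) for group in by_digest.values() if len(group) > 1)
```

Calling `find_duplicate_groups` on a directory tree returns a list of path lists, one per set of identical files. Grouping by size first is a design choice to avoid reading files that cannot have a duplicate.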