Library-based record&replay tools that reside in the same address space as the application are convenient for debugging, because they don’t require patching the kernel or installing virtual machines, and can often achieve good performance. Current tools, however, have limited replay faithfulness, because they interfere with the application’s state and because they have a fixed record&replay interface (often system and libc calls), which cannot enclose all nondeterminism.

R2 adopts operating system kernel ideas and defines a syscall-upcall interface within the application’s address space. The interface strictly isolates the application from the record&replay library and guarantees replay faithfulness; transfers across the interface are explicit and data movement around arguments is carefully managed. In addition, R2 enables developers to customize calls of the interface to cover nondeterminism, annotate them with simple keywords, and generate code automatically from templates for record&replay.
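The idea of recording and replaying calls at an interface can be illustrated with a minimal sketch. This is not R2's implementation: the names `ReplayLog`, `make_stub`, and `application` are hypothetical, the log is kept in memory, and upcalls, address-space isolation, and multithreading are all ignored. It shows only that a stub around a nondeterministic call can log the result during recording and return the logged result during replay without invoking the real function.

```python
import random
from typing import Any, Callable, List, Tuple


class ReplayLog:
    """Hypothetical in-memory log of (call name, result) pairs."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, Any]] = []
        self.cursor = 0  # index of the next entry to consume during replay


def make_stub(name: str, real_fn: Callable[..., Any],
              log: ReplayLog, mode: str) -> Callable[..., Any]:
    """Wrap real_fn so its results are logged ("record") or replayed ("replay")."""

    def stub(*args: Any) -> Any:
        if mode == "record":
            # Recording: run the real call and append its result to the log.
            result = real_fn(*args)
            log.entries.append((name, result))
            return result
        # Replay: never call real_fn; feed back the logged result instead.
        if log.cursor >= len(log.entries):
            raise RuntimeError("replay ran past the end of the log")
        logged_name, result = log.entries[log.cursor]
        log.cursor += 1
        if logged_name != name:
            raise RuntimeError(f"replay diverged: expected {logged_name}, got {name}")
        return result

    return stub


def application(read_random: Callable[[int, int], int]) -> List[int]:
    """Toy application whose only nondeterminism goes through the stub."""
    return [read_random(0, 1000) for _ in range(3)]


log = ReplayLog()
recorded = application(make_stub("randint", random.randint, log, "record"))
replayed = application(make_stub("randint", random.randint, log, "replay"))
```

In this sketch the replayed run observes the same values as the recorded run because every nondeterministic input crosses the stub. Choosing which functions form that boundary is the customization the paragraph above describes.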

By doing so, developers can further optimize recording performance and avoid manually coding hundreds of stubs for record&replay. We have implemented R2 on Windows; by using annotations it generates stubs for more than 1300 functions automatically. In the past two years it has helped to debug two comprehensive distributed systems incubated in our lab. To the best of our knowledge, it is also the first library-based record&replay tool that can replay multithreaded web and database servers. Experiments show that its recording overhead for Apache is less than 10%, that customizing the interface for SQLite can sometimes reduce the log size by up to 99%, and that using optimization annotations for BitTorrent and MPI applications achieves log size reductions ranging from 13.7% to 99.4%.