A Gray Box Approach For High-Fidelity, High-Speed Time-Travel Debugging

  • John Vilk ,
  • James Mickens ,
  • Mark Marron

MSR-TR-2016-7 |

Time-travel debugging (TTD) lets developers step backward as well as forward through a program’s execution. TTD is a powerful mechanism for diagnosing bugs, but previous approaches suffer from poor performance due to checkpoint and logging overhead, or poor fidelity because important information like GUI state is not tracked.

In this paper, we describe how to provide high performance and high-fidelity TTD to programs written in managed languages. Previous high-performance debuggers treat components external to the program like the GUI as black boxes, but that is not sufficient for high fidelity time-travel. Instead, we advocate for a gray-box approach that keeps these components live and in sync with the program during time-travel. The key insight is that managed runtime APIs expose most of the functionality required to do this; where it does not, we extend the runtime with a small number of non-intrusive interrogative interfaces. To demonstrate the power of our gray-box approach, we implement ReJS, a time-traveling debugger for web applications. ReJS imposes imperceptible tracing overhead, and its logs typically grow less than 1 KB/s. As a result, ReJS is performant enough to be deployed in the wild; real client machines can ship buggy execution traces across the wide area to developer-side machines for debugging.