Application Recovery: Advances Toward an Elusive Goal

  • David Lomet

High Performance Transaction Systems (HPTS)

Persistent savepoints are the model that has usually dominated previous thinking on application recovery. “A persistent savepoint is a savepoint where the state of the transaction’s resources is made stable …, and enough control state is saved to stable storage so that on recovery the application can pick up from this point in its execution. … If the system should fail and subsequently recover, it can restart any transaction at … its last persistent savepoint operation. It doesn’t have to run an exception handler because the transaction has its state and can simply pick up where it left off.” {BeNe97}
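The behavior described in the quotation can be illustrated with a minimal sketch. It is not taken from {BeNe97}; the names save_work, load_savepoint, and run, the JSON file format, and the "crash_before" parameter are all hypothetical, chosen only to show an application that makes its control state stable after each step and, after a failure, picks up at its last savepoint rather than starting over.

```python
import json
import os
import tempfile


def save_work(path, state):
    """Hypothetical persistent savepoint: make `state` stable at `path`.

    The state is written to a temporary file, forced to disk, and then
    renamed over the old savepoint so a crash never leaves a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def load_savepoint(path):
    """Return the last savepoint, or a fresh initial state if none exists."""
    if not os.path.exists(path):
        return {"next_step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, items, crash_before=None):
    """Sum `items`, taking a savepoint after every step.

    `crash_before` (an assumption for demonstration) simulates a failure
    just before the given step index.
    """
    state = load_savepoint(path)
    for i in range(state["next_step"], len(items)):
        if crash_before == i:
            raise RuntimeError("simulated failure")
        state["total"] += items[i]
        state["next_step"] = i + 1
        save_work(path, state)
    return state["total"]
```

Calling run a second time with the same path after a simulated failure resumes at the saved step; no exception handler in the application has to rebuild the lost context.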

The value of persistent savepoints is more than reducing lost work.  Persistent savepoints simplify application programming compared to more explicit methods for coping with failures. “… the code is not only shorter than the [prior] solution… but simpler.  … everything related to the maintenance of persistent context is now taken care of by the Save_Work function, whereas the [prior] solution had to do the maintenance all by itself…” {GrRe93}

Traditionally, a savepoint has been viewed as the capture of an application’s state in stable storage at the time a savepoint operation is executed. However, this is not necessary, any more than it is necessary for database recovery to materialize the final states of all pages of all committed transactions in stable storage. Instead, we recover via replay from the log. “Each resource manager participating in the transaction with the persistent savepoint is brought up in the state as of the most recent persistent savepoint. For that to work, the run-time system of the programming language has to be a resource manager, too; consequently, it also recovers its state to the most recent persistent savepoint. Its state includes the local variables, the call stack, and the program counter, the heap, the I/O streams, and so forth.” {GrRe93}
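The contrast between capturing state and replaying a log can be sketched as follows. This is an illustration under stated assumptions, not the mechanism of the paper: the ReplayLog class, the record format with "key" and "delta" fields, and the apply_record and recover functions are hypothetical. The point is only that the application state is never written to stable storage; it is reconstructed by re-applying logged records through a deterministic transition function.

```python
import json
import os


class ReplayLog:
    """Hypothetical append-only log of JSON records, one per line."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Force each record to disk before returning, so that an
        # acknowledged record survives a crash.
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def records(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f if line.strip()]


def apply_record(state, record):
    """Deterministic state transition: add `delta` to the counter at `key`."""
    new_state = dict(state)
    new_state[record["key"]] = new_state.get(record["key"], 0) + record["delta"]
    return new_state


def recover(log):
    """Rebuild the state from scratch by replaying every logged record."""
    state = {}
    for record in log.records():
        state = apply_record(state, record)
    return state
```

Replay yields the same state only because apply_record is deterministic. A practical log would also need to detect a torn final record, for example with a checksum, which this sketch omits.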