Proactive Recovery in a Byzantine-Fault-Tolerant System

  • Miguel Castro,
  • Barbara Liskov

Symposium on Operating Systems Design and Implementation (OSDI'00)

This paper describes an asynchronous state-machine replication system that tolerates Byzantine faults, which can be caused by malicious attacks or software errors. Our system is the first to recover Byzantine-faulty replicas proactively, and it performs well because it uses symmetric rather than public-key cryptography for authentication. The recovery mechanism allows us to tolerate any number of faults over the lifetime of the system, provided fewer than 1/3 of the replicas become faulty within a window of vulnerability that is small under normal conditions. The window may increase under a denial-of-service attack, but we can detect and respond to such attacks. The paper presents results of experiments showing that overall performance is good and that even a small window of vulnerability has little impact on service latency.
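The "fewer than 1/3" condition can be made concrete with a little arithmetic. The sketch below is only an illustration of that bound, not code from the paper; the function names (max_faulty, min_replicas, within_bound) are hypothetical and chosen here for clarity. It assumes n is the total number of replicas and f is the number that become faulty within a single window of vulnerability.

```python
def max_faulty(n: int) -> int:
    """Largest f satisfying f < n/3, which is the same as 3f + 1 <= n."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


def min_replicas(f: int) -> int:
    """Smallest n for which f faulty replicas is still fewer than 1/3 of n."""
    if f < 0:
        raise ValueError("fault count cannot be negative")
    return 3 * f + 1


def within_bound(n: int, faulty_in_window: int) -> bool:
    """True if the faults seen in one window stay strictly below n/3."""
    # Integer comparison avoids floating-point division.
    return 3 * faulty_in_window < n


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(f"n={n}: tolerates up to {max_faulty(n)} faulty replica(s) per window")
```

Under these assumptions, four replicas tolerate one fault per window, while three replicas tolerate none, since one faulty replica out of three is not fewer than 1/3.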