Heuristics for Reducing the Number of XOR’s in Erasure Coding Systems

  • Jim Plank | University of Tennessee

As a community, storage systems practitioners are just beginning to scratch the surface with regard to leveraging all that erasure codes have to offer. The higher RAID levels (4-6, 10 and 01) use very limited erasure codes to protect systems from small numbers of failures. More advanced systems are being built to satisfy the goals of large-scale storage, archival storage and widescale data distribution, and these systems make use of erasure codes that are far richer than those required by RAID.

The best performing erasure codes rely on the bitwise exclusive-or operation (XOR) and express their encoding and decoding operations using binary (bit) matrices. There is a great deal of flexibility when generating a schedule of XOR operations from a bit matrix, and it is an open problem to generate optimally performing schedules from a matrix.

This paper explores this open problem, focusing on techniques to convert encoding and decoding bit matrices into schedules of XOR operations that perform well. We develop two heuristics, named Uber- CSHR and X-Sets, and evaluate them on a variety of realistic erasure- coding settings. The results show that these techniques significantly improve the performance of encoding and decoding compared to the current state of the art.

Speaker Details

Jim Plank is a Professor in the EECS department at the University of Tennessee. He received his BS from Yale in 1988, and his PhD from Princeton in 1993. He has been at the University of Tennessee since 1993. Professor Plank’s research has spanned many areas of Fault-Tolerance, including checkpointing systems, wide-area storage systems, and most recently erasure coding for storage systems. Professor Plank has been an associate editor of IEEE Transactions on Parallel and Distributed systems, and as chaired conferences in network storage and applications. He has left a legacy of publicly available software that includes;

  • Jgraph – a graph-plotting package for PostScript
  • Ickp – a checkpointing utility for the Intel Paragon
  • Libckpt – a checkpointing utility for Unix
  • IBP – Network storage depots and their client code
  • LoRS – Tools for aggregation of network storage depots
  • Galois – A package for performing fast Galois Field arithmetic
  • Jerasure – A package for erasure coding, including Reed-Solomon codes, RAID-6, and general bit-matrix codes.
    • Portrait of Jeff Running

      Jeff Running