Full Circle: Simulating Linux Clusters on Linux Clusters

  • Luís Ceze
  • George Almasi
  • Patrick J. Bohrer
  • José R. Brunheroto
  • Călin Caşcaval
  • José G. Castaños
  • Derek Lieber
  • Xavier Martorell
  • José E. Moreira
  • Alda Sanomiya
  • Eugen Schenfeld

CWCE (Linux Clusters Institute International Conference on Linux Clusters)

BGLsim is a complete system simulator for parallel machines. It is currently being used in hardware validation and software development for the BlueGene/L cellular architecture machine. BGLsim is capable of functionally simulating multiple nodes of this machine operating in parallel. It simulates instruction execution in each node and the communication between nodes. BGLsim allows us to develop, test, and run exactly the same code that will be used in the real system. Using BGLsim, we can gather data that helps us debug and enhance software (including parallel software) and evaluate hardware. To illustrate the capabilities of BGLsim, we describe experiments running the NAS Parallel Benchmark IS on a simulated BlueGene/L machine. BGLsim is itself a parallel application that runs on Linux clusters. It executes fast enough to run complete operating systems and complex MPI codes.
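As a rough illustration of what functional multi-node simulation involves, the toy sketch below steps several simulated nodes in round-robin order, executing one instruction per node per round and delivering messages through per-node inboxes. It is not BGLsim's design: the three-instruction set (`add`, `send`, `recv`) and the names `Node`, `step`, `run`, and `ring_programs` are hypothetical, chosen only to show instruction execution and inter-node communication being simulated together.

```python
from collections import deque


class Node:
    """One simulated node: an accumulator, a program counter, and an inbox."""

    def __init__(self, node_id, program):
        self.node_id = node_id
        self.program = program  # list of (opcode, argument) pairs
        self.pc = 0
        self.acc = 0
        self.inbox = deque()

    def done(self):
        return self.pc >= len(self.program)


def step(node, nodes):
    """Execute at most one instruction; return True if the node advanced."""
    if node.done():
        return False
    op, arg = node.program[node.pc]
    if op == "add":
        node.acc += arg
    elif op == "send":
        # Deliver the accumulator value to the destination node's inbox.
        nodes[arg].inbox.append(node.acc)
    elif op == "recv":
        if not node.inbox:
            return False  # blocked until a message arrives
        node.acc += node.inbox.popleft()
    else:
        raise ValueError(f"unknown opcode: {op}")
    node.pc += 1
    return True


def run(nodes):
    """Step all nodes round-robin until every program has finished."""
    while not all(n.done() for n in nodes):
        progressed = [step(n, nodes) for n in nodes]
        if not any(progressed):
            raise RuntimeError("deadlock: every unfinished node is blocked")
    return [n.acc for n in nodes]


def ring_programs(count):
    """Each node adds its rank + 1, sends to its right neighbour, then receives."""
    return [
        [("add", i + 1), ("send", (i + 1) % count), ("recv", None)]
        for i in range(count)
    ]


nodes = [Node(i, program) for i, program in enumerate(ring_programs(4))]
result = run(nodes)
print(result)  # [5, 3, 5, 7]
```

Each node ends with its own value plus the value sent by its left neighbour in the ring. A real full-system simulator models an actual instruction set and the machine's network hardware; the sketch only captures the interleaving of per-node execution with message delivery.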