When researchers at Rutgers wanted to explore massive computing scenarios with real-world significance for fields such as financial services and pharmaceuticals, they turned to Windows® HPC Server 2008. In one research project, the software proved at least 30 percent faster than Linux. The Windows HPC Server cluster was deployed in one day and works like the Windows operating system with which users and administrators are already familiar.
Manish Parashar has his head in the clouds—Internet clouds.
Parashar, a professor of computer science and engineering at Rutgers, The State University of New Jersey, is interested in how massive applications can exist in data centers and readily pull computing resources from elsewhere on the Internet as needed. When he’s not directing research on that subject, he might be investigating how to transport large amounts of data and manipulate that data while it flows between two points. Or, he might be exploring how to make software applications that manage themselves, detecting and correcting anomalies without the need for human intervention.
What all these subjects have in common is the need for massive computing resources. At most universities, computing resources at that scale have traditionally been run on the Linux operating system. In fact, when Parashar became the Founding Director of the Center for Autonomic Computing at Rutgers in April 2008, Linux was the operating system of choice there.
But it soon became clear that Linux alone would not serve the Center’s interests. Its corporate partners—major corporations in the financial services and pharmaceuticals industries, for example—were less interested in the application of Linux to their massive computing challenges than they were in using operating systems that could be deployed and managed within their existing environments with minimal incremental cost.
Nor were corporate sponsors the only ones who would benefit from non-Linux environments. Parashar’s students were studying both high-performance computing systems and the application of those systems to computing challenges. Systems with which they were already familiar would enable them to focus on issues specific to high-performance computing, rather than on how to manage an operating system.
By the end of 2008, Parashar added another computing environment to his center, one based on Windows® Compute Cluster Server 2003 and, now, its successor, the Windows HPC Server 2008 operating system.
"[M]ajor industries … want high-performance computing solutions that they can … apply to their own environments. Windows HPC Server 2008 is the operating system that makes this possible."
Manish Parashar, Professor, Computer Science and Engineering, Rutgers
Parashar, his colleagues, and students have rapidly increased their use of Windows HPC Server 2008. An initial 32-core cluster with a mix of dual- and quad-core computers is used primarily for teaching. A 64-core cluster with a similar mix of machines has been the research mainstay, and that cluster is now being supplemented by a 256-core cluster that will make it possible for Parashar and his colleagues to conduct larger volumes of research, as well as new research on larger problems and issues of scalability.
For example, Parashar and his colleagues are evaluating the rapid parallel processing of large amounts of data, such as the molecular data that pharmaceutical companies analyze in their research. A popular software framework for that parallel processing is Hadoop, an implementation of the MapReduce programming model. Parashar and his students Hyunjoo Kim and S. Chaudhari wanted to know whether Hadoop is the most efficient way to analyze these data sets.
By moving the processing from Linux to Windows HPC Server, where a more efficient implementation of MapReduce based on the CometCloud framework can be applied, Parashar has measured performance improvements ranging from 30 to 250 percent, depending on the data structures being analyzed. One analysis on Windows HPC Server, for example, took 34.67 seconds, compared with 2 minutes 24 seconds for a Hadoop run on Linux.
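The MapReduce model behind both Hadoop and the CometCloud-based runs follows the same three steps: a map phase emits key-value pairs from each input record, a shuffle groups values by key, and a reduce phase aggregates each group. The following is a minimal, illustrative Python sketch of that pattern; it is not the CometCloud or Hadoop code, and the molecular-formula example data is hypothetical.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record, emitting (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group all values by key, as the MapReduce shuffle step does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical example: count how many formulas mention each element
# symbol in a small batch of molecular formula strings.
records = ["C6 H12 O6", "H2 O", "C O2"]
mapper = lambda rec: ((tok.rstrip("0123456789"), 1) for tok in rec.split())
reducer = lambda key, values: sum(values)

counts = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
print(counts)  # {'C': 2, 'H': 2, 'O': 3}
```

In a real cluster framework, the map and reduce calls run in parallel across nodes and the shuffle moves data over the network; the performance differences Parashar describes come from how efficiently that communication and data transport is implemented on each operating system.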
“We found we could run large-scale analyses with less overhead and faster performance using Windows HPC Server 2008,” says Parashar. “Faster performance means researchers have more time to analyze and understand their data.” He cites differences between the two operating systems, such as the ways they provide access to their remote direct memory access (RDMA)-based low-overhead communication and data transport mechanisms, as contributing to the results.
“We’re conducting real-world research of interest to major industries, and they want high-performance computing solutions that they can more readily, effectively, and cost-efficiently apply to their own environments,” says Parashar. “Windows HPC Server 2008 is the operating system that makes this possible.”
Parashar cites the popularity of other versions of Windows—from Windows desktop operating systems to Windows Server 2008—in those environments. “Windows is something that corporate IT managers and information workers already know how to use,” he says. “That familiarity means they already know how to use Windows HPC Server 2008. It makes for a smoother transition to high-performance computing, one that allows workers to focus on their real work.”
Parashar says he sees the familiarity benefits of Windows HPC Server with his own students. “They come in knowing Windows, especially the students who may not be computer science majors and are coming in from different fields,” he says. “They use the same IDEs [integrated development environments] with which they’re already familiar. The barrier to entry is much lower.”
That same familiarity makes deploying and managing the Windows HPC Server cluster easier as well, according to Jim Housell, Unit Specialist at Rutgers. “When we deployed the first cluster, I just extrapolated from my previous experience with Windows Server,” he says. “The process was fast, and I had everything set up within a day—without having to look at any documentation. Now, if I want to reboot or reprovision a cluster node, I can take care of it with a couple of clicks. It definitely reduces the overhead of administration.”