4-page Case Study - Posted 11/29/2006
Early Detection of Cancer One Step Closer to Solution with Microsoft, Dell and Intel
The Melbourne Branch of the Ludwig Institute for Cancer Research is investigating the use of mass spectrometry to identify unique protein markers secreted by cancerous colon tumours. If successful, this research could make early detection of colon cancer as simple as having a blood test and improve survival rates of sufferers dramatically. Researchers working across several labs were using isolated computing systems that struggled to process the huge amounts of raw data generated by the mass spectrometers. Microsoft worked with partners Dell and Intel® to develop a high-performance computing cluster. The scalable solution has significantly reduced data processing times and simplified and centralised management of the Institute’s computing environment.
Situation
The Ludwig Institute for Cancer Research (LICR) was founded in 1971 and is made up of a worldwide network of nine branches in seven countries dedicated to basic and clinical research. The Melbourne Branch for Tumor Biology, in Victoria, Australia, was established in 1980 in collaboration with the Royal Melbourne Hospital and the University of Melbourne. Each branch of LICR has its own area of specialization. The Melbourne Branch concentrates on tumor biology, specifically that of colon cancer, one of the most common forms of cancer in the Western world. LICR created the Proteomics Facility and Proteomics Research Laboratory. These laboratories focus on analytical biochemistry, technical developments in protein separation and characterization and high-sensitivity extensive proteomics. Proteomics is the study of proteins, particularly their identification, interactions and functions. It is central to allowing researchers to discover biomarkers – substances whose presence in an organism may indicate a particular disease.
“LICR researchers are working to understand the biology of tumors within the gut and how protein markers secreted by tumors can be used as an early detection mechanism for colon cancer,” says Dr Robert Moritz, Manager of the Proteomics Facility at LICR and Director of the Australian Proteomics Computational Facility. “Being able to detect tumors before they become life threatening would significantly increase survival rates for those with this type of cancer.”
Currently, common tests for colon cancer, such as the colonoscopy, are invasive and expensive and have their own inherent health risks, making them unsuitable for population-based screening. In contrast, if LICR researchers develop a test for the early detection of colon cancer using protein markers, future tests for this type of cancer could be as simple as having a blood test.
The Melbourne Branch has more than 20 proteomics researchers in various laboratories, each working on different aspects of proteomics. They use mass spectrometers to generate the data used to identify protein markers. These instruments were collecting large amounts of data that were processed in isolation, with each researcher relying on the computing power available at individual laboratories. Smaller computing installations struggled under the volume of raw data generated.
“Mass spectrometers can generate between 10,000 and 15,000 mass spectra per hour, which then needs to be converted into protein identifications,” says Moritz. “Smaller, lab-based computers were finding it hard to keep up with the dataflow.”
Solution
In 2005, LICR asked Microsoft Global Alliance Partner, Dell, to propose a solution that would deliver greater computing power to researchers working on multiple projects. The solution needed to meet the following criteria:
- Increased processing power of all computers in the cluster.
- Easy-to-use interface for researchers.
- Simple, centralized management of a large cluster system.
- Compatibility with algorithms used for proteomics research.
- Scalability for future growth.
- Cost competitiveness.
- Ability to integrate with existing technology infrastructure.
- Ability for researchers to share results throughout LICR’s global network.
Dell suggested building a high-performance computing cluster (HPCC) that would pool each laboratory’s resources to create a system with far greater processing power that could be accessed remotely by all researchers. Dell joined forces with Microsoft and Global Alliance Partner, Intel, to carry out the project.
“Forming a team of three global companies with the strength and reputation of Microsoft, Dell and Intel allowed us to create a seamless solution for the customer,” says Simon Johnson, Director of Enterprise Business at Dell. “There were no missing pieces for the customer to worry about. We were able to leverage each company’s areas of expertise to create a true best-of-breed solution.
“The Ludwig Institute’s comfort level with using Microsoft technology was central to the project,” Johnson continues. “Microsoft’s commitment to this sector proved it not only had the necessary products and expertise, but that it was committed to the solution. In a project like this, the willingness of the vendors to provide ongoing support and input is critical.”
LICR joined the Microsoft® Windows® Compute Cluster Server 2003 Rapid Deployment Program. Microsoft gave LICR access to the Beta 2 version of the software, which was released in November 2005. LICR’s proteomics staff visited Microsoft headquarters in Redmond, USA, to contribute to discussions about the product. The project began as a proof of concept, funded jointly by Dell and Intel. “Combining Dell and Intel’s deep pool of technical resources and expertise to build the proof of concept helped the Ludwig Institute quickly identify the benefits and challenges of this project,” says Ivan Chan, Intel’s Dell Regional Account Manager for Asia Pacific. “Ensuring hardware and software work together harmoniously to deliver the best result for the customer is the real strength of the relationship.” The proof of concept was completed in late 2005 and comprised a 16-node computing cluster. This cluster consisted of two Dell PowerEdge 2850 servers as the head nodes and a 16-node Dell 1855 Blade Server cluster, all running Windows Compute Cluster Server 2003. The head nodes carry most of the information for the system and the databases. All compute nodes were connected to the head nodes with Dell PowerConnect networking switches. The cluster used 36 dual-core Intel® Xeon® processors. “The environment was simple to configure and, once the system was powered up, it was ready to use within 40 minutes,” says Moritz. “It was also seamless to integrate other technologies such as Microsoft® Active Directory® into the environment.”
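The processor count quoted for the proof of concept is consistent with two processor sockets per server. That is an inference from the figures, not something the case study states. A minimal Python sketch of the arithmetic follows; the `ClusterSpec` class and its field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a cluster, used only for this arithmetic."""
    head_nodes: int
    compute_nodes: int
    processors_per_server: int  # assumption: inferred, not stated in the case study
    cores_per_processor: int    # 2 for the dual-core processors described

    @property
    def servers(self) -> int:
        # Head nodes and compute nodes are all physical servers.
        return self.head_nodes + self.compute_nodes

    @property
    def processors(self) -> int:
        return self.servers * self.processors_per_server

    @property
    def cores(self) -> int:
        return self.processors * self.cores_per_processor


# Proof of concept: 2 head nodes plus 16 blade compute nodes.
poc = ClusterSpec(head_nodes=2, compute_nodes=16,
                  processors_per_server=2, cores_per_processor=2)

print(poc.servers, poc.processors, poc.cores)  # 18 36 72
```

Under that two-socket assumption, the 18 servers account for the 36 dual-core processors mentioned, or 72 cores in total.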
The ease of use of Microsoft Windows Compute Cluster Server 2003 made it the obvious choice for the Ludwig Institute. “The user interface and structure of the Microsoft Windows Compute Cluster Server make managing a large, high-performance computing cluster far less daunting than with other operating systems,” says Moritz. “It integrated with our existing technology infrastructure and we were also secure in the knowledge that we were drawing on Microsoft’s powerful technology and comprehensive support.” LICR was also impressed with the power of Dell’s hardware. “Dell has done a lot of work with our affiliates at the University of Melbourne, and we were confident the company’s robust and stable servers were what we were looking for,” Moritz continued. The dual-core technology in Intel Xeon processors improves processing power by providing increased flexibility in allocating processor resources. “Intel’s dual-core architecture enables us to process more data more quickly,” Moritz says. “The more samples we can test, the closer our researchers come to identifying the unique protein markers for colon cancer.” When discussing their work with other proteomics researchers around Australia, LICR researchers discovered that researchers in other laboratories at different institutions were faced with the same problems as the LICR team. “Researchers at geographically dispersed labs needed more computing power to enable them to do their jobs efficiently,” Moritz said.
In 2005, LICR formed a consortium with 21 other Australian proteomics research centers to discuss the creation of the Australian Proteomics Computational Facility (APCF). The idea was to take the knowledge gained from Microsoft, Dell and Intel during the LICR project and expand this on a national scale. The proof of concept gave LICR confidence in the system’s capabilities and convinced it to go ahead with a full-scale project. Acting as a national test bed, LICR rolled out the solution, using a configuration with one head node to provide a cluster for production and the other head node for testing and development. The cluster connects researchers from at least 20 different locations across Australia, from Perth to Brisbane and beyond, through a single administrative interface.
In January 2006, the APCF was awarded an enabling research grant of A$2 million by the National Health and Medical Research Council to create a high-performance computing platform to assist other research centers with high-throughput proteomic analysis.
The APCF will process the raw mass spectrometry data using algorithms such as Mascot from Matrix Science, Phenyx from GeneBio and Sequest from Thermo, as well as many open source programs. These algorithms transform raw molecular data into a form that biologists can understand and use to identify which proteins are present when normal cells are in the early stages of mutating into cancerous cells.
LICR is also developing its own algorithms using Microsoft® Visual Studio® .NET. “As my experience is in biology, not computers, a good compiler and an intuitive development environment are important to me, as is integration between tools such as Microsoft Visual Studio .NET and Microsoft® SQL Server™,” says Eugene Kapp, a computational biologist at the Proteomics Laboratory, Ludwig Institute for Cancer Research.
Using Microsoft Visual Studio .NET, Kapp has developed a framework for testing the Laboratory’s multi-threaded search algorithm, called DIGGER, which uses mass spectrometry data to accurately identify proteins. He is now developing a 64-bit version of this algorithm, which will enable it to hold extremely large genomic sequences in its memory and handle the massive datasets generated by the newest mass spectrometers. “From a development point of view, I don’t see any products that rival Microsoft Visual Studio .NET for ease of use, especially debugging applications,” says Kapp.
Benefits
Faster Processing Times
The high-performance computing cluster processes data at least 20 times faster than the isolated systems researchers previously worked on. In turn, this has increased the productivity and output of researchers. The servers in the HPCC can work on several individual tasks or combine to complete one large job. “The system’s increased processing power has dramatically reduced protein matching search times, allowing us to process more data per day and to send results back to biologists at a much faster rate,” says Moritz. “This lets biologists formulate new experimental approaches.”
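One way to picture servers combining on one large job is to divide a batch of spectra as evenly as possible across the compute nodes. The Python sketch below is an illustration only. `split_spectra` is a hypothetical helper, not part of Windows Compute Cluster Server 2003 or of any search program named in this study. The batch size reuses the upper hourly figure Moritz quotes, and the node count is that of the proof of concept.

```python
def split_spectra(total_spectra: int, nodes: int) -> list[int]:
    """Return how many spectra each node receives in an even split.

    Hypothetical helper for illustration; a real scheduler would also
    weigh node load, job priority and data locality.
    """
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    if total_spectra < 0:
        raise ValueError("total_spectra must not be negative")
    base, extra = divmod(total_spectra, nodes)
    # The first `extra` nodes take one additional spectrum each,
    # so no two nodes differ by more than one.
    return [base + 1 if i < extra else base for i in range(nodes)]


# One hour of output at the upper quoted rate, spread over 16 compute nodes.
chunks = split_spectra(15_000, 16)
print(chunks[0], chunks[-1], sum(chunks))  # 938 937 15000
```

Each node then searches roughly 940 spectra instead of one machine searching all 15,000, which is the intuition behind the shorter search times described above.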
The Strength of a Team
Investing in a solution based on technology from Microsoft, Dell and Intel means LICR and the APCF can rely on world-leading support and service. “The Ludwig Institute is benefiting from the extensive resources of three global companies,” says Chan from Intel. “We can call on expertise from people anywhere in the world – wherever the best person for the job happens to be.”
“The combination of the skills and expertise of Microsoft, Dell and Intel transformed this project from a theoretical concept into a reality that is changing the face of proteomics research in Australia,” says Moritz. “The products from the three companies work together seamlessly.”
Integrates with Existing Infrastructure
Microsoft Windows Compute Cluster Server 2003 integrates with the Melbourne Branch’s existing IT environment, including Microsoft Active Directory and Terminal Services for remote access. “It’s quite common for this sort of computing facility to be isolated from the rest of the technology environment, especially if it’s based on Linux,” says Chris Green, Technical Specialist, Microsoft. “But with Microsoft, the computing cluster can be incorporated into the existing Active Directory environment. This means organizations don’t have to set up separate infrastructure or user accounts for cluster access and can even incorporate other Microsoft technology to manage or administer the environment as required.”
Cost-Effective Computing
The solution is based on a pay-as-you-grow pricing structure, enabling LICR to expand the solution as it needs and as funds become available. Microsoft, Dell and Intel worked together to design and deploy a solution that uses cutting-edge technologies for a very low price to the customer. Because LICR could leverage its existing skill sets in Microsoft technologies, it was able to implement the operating system for the APCF – what is, in effect, a large business system of the kind that would typically be out of reach for an organization of its size. “The solution had a fairly small learning curve, so we didn’t need to increase staff numbers or conduct extensive training to manage the system,” says Moritz. “We have also reduced our support and maintenance costs.”
Centralized Management
Microsoft Windows Compute Cluster Server 2003 features a centralized management interface that allows the entire HPCC to be managed from one location. This makes implementation easier and simplifies ongoing management. Centralized management will be especially important as the solution expands to include more researchers in diverse locations. The system’s familiar and intuitive interface also makes it easy for researchers to submit a job to the cluster and to monitor its progress.
Future Plans
Moritz says LICR is now working towards expanding the solution to give proteomics researchers across Australia access to the HPCC at the APCF. This would involve growing the computing cluster to 128 nodes (with 256 dual-core processors), making a total of 135 servers including the head nodes. All Australian proteomics researchers, working in areas as diverse as agriculture, health and microbiology, would then have access to the system. Depending on the outcome of this phase, APCF is even considering extending the solution internationally to researchers in New Zealand and across Asia.
Microsoft Windows Server System
Microsoft® Windows Server System™ is a line of integrated and manageable server software designed to reduce the complexity and cost of IT. Windows Server System enables you to spend less time and budget on managing your systems so that you can focus your resources on other priorities for you and your business. For more information about Windows Server System, go to: www.microsoft.com/windowsserversystem
For More Information
For more information about Microsoft products and services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (877) 568-2495. Customers who are deaf or hard-of-hearing can reach Microsoft text telephone (TTY/TDD) services at (800) 892-5234 in the United States or (905) 568-9641 in Canada. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information using the World Wide Web, go to: www.microsoft.com
For more information about Dell products and services, visit the Web site at: www.dell.com
For more information about Intel products and services, visit the Web site at: www.intel.com
For more information about the Ludwig Institute for Cancer Research products and services, call (613) 9341 3155 or visit the Web site at: www.ludwig.edu.au