{"id":968301,"date":"2023-09-19T09:00:00","date_gmt":"2023-09-19T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies\/"},"modified":"2023-09-28T08:01:37","modified_gmt":"2023-09-28T15:01:37","slug":"announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies\/","title":{"rendered":"Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1024x576.jpg\" alt=\"DeepSpeed4Science Initiative - graphic with 6 icons\" class=\"wp-image-968769\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-655x368.jpg 
655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><em><strong>Editor\u2019s note, Sept. 28, 2023 \u2013\u00a0<\/strong>The founding collaborators list was updated to correct omissions and the scientific foundation model graph was updated to correct information.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"introduction\">Introduction&nbsp;<\/h2>\n\n\n\n<p>In the next decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. 
In line with Microsoft\u2019s mission to empower every person and every organization on the planet to achieve more, the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.deepspeed.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> team at Microsoft is responding to this opportunity by launching a new initiative called <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/deepspeed4science.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed4Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today\u2019s biggest science mysteries.<\/p>\n\n\n\n<p>The <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.deepspeed.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> system is an industry-leading open-source AI system framework, developed by Microsoft, that enables unprecedented scale and speed for deep learning training and inference on a wide range of AI hardware. Figure 1 demonstrates our basic approach to this new initiative. By leveraging DeepSpeed\u2019s current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity, beyond the common technical approaches used for accelerating generic large language models (LLMs). We work closely with internal and external teams who own AI-driven science models that represent key science missions to identify and address general domain-specific AI system challenges. 
This includes climate science, drug design, biological understanding, molecular dynamics simulation, cancer diagnosis and surveillance, catalyst\/material discovery, and other domains.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2336\" height=\"1549\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1.png\" alt=\"Figure 1: A three-tier diagram describing, from bottom to top, our basic approach for executing the DeepSpeed4Science initiative. The bottom section represents the current three pillars of the DeepSpeed framework: training, inference and compression. The middle layer, which is what this particular blog is about, is a new set of AI system technologies beyond generic large language model support, tailored for accelerating scientific discoveries and addressing their complexity. The top layer represents general AI-driven science models across different domains, which can be supported by DeepSpeed4Science.\" class=\"wp-image-968442\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1.png 2336w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-300x199.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-1024x679.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-768x509.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-1536x1019.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-2048x1358.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-1-240x159.png 240w\" sizes=\"auto, (max-width: 2336px) 100vw, 2336px\" \/><figcaption class=\"wp-element-caption\">Figure 1: DeepSpeed4Science approach: developing a new set of AI system technologies that are beyond generic large language model support, tailored for accelerating scientific discoveries and addressing their complexity.<\/figcaption><\/figure>\n\n\n\n<p>Our long-term vision is to develop DeepSpeed4Science into a new platform and a unified repository for sharing advanced AI system technologies that support scientific discoveries. DeepSpeed4Science is designed to be inclusive, echoing Microsoft\u2019s <a href=\"https:\/\/www.microsoft.com\/en-us\/ai\/ai-for-good\" target=\"_blank\" rel=\"noreferrer noopener\">AI for Good<\/a> commitment. That is reflected in the initiative\u2019s support for a diverse group of signature science models, representing some of the most critical AI for science investments. In this blog, we showcase how DeepSpeed4Science helps address two of their critical system challenges in structural biology research: (1) eliminating memory explosion problems for scaling <em>Evoformer-centric<\/em> protein-structure prediction models, and (2) enabling very-long sequence support for better understanding the evolutionary landscape of pandemic-causing viruses.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"our-launch-and-key-collaborators\">Our launch and key collaborators&nbsp;<\/h2>\n\n\n\n<p>The new system technologies enabled by DeepSpeed4Science can empower AI-driven scientific discoveries using signature models that represent a wide spectrum of efforts pushing the boundaries of science. 
Currently, DeepSpeed4Science is honored to support several key science models from <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-ai4science\/\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Research AI4Science<\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.msn.com\/en-us\/weather\/forecast\/in-Seattle,WA?loc=eyJsIjoiU2VhdHRsZSIsInIiOiJXQSIsInIyIjoiS2luZyBDby4iLCJjIjoiVW5pdGVkIFN0YXRlcyIsImkiOiJVUyIsImciOiJlbi11cyIsIngiOiItMTIyLjMzOCIsInkiOiI0Ny42MTMifQ%3D%3D&ocid=ansmsnweather\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft WebXT\/Bing<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.energy.gov\/national-laboratories\" target=\"_blank\" rel=\"noopener noreferrer\">U.S. DoE National Labs<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"current-microsoft-internal-partnerships\">Current Microsoft internal partnerships<\/h3>\n\n\n\n<p><strong>Scientific Foundation Model (SFM), Microsoft Research AI4Science<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2169\" height=\"754\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008.png\" alt=\"Graph depicting the Scientific Foundation Model (SFM), Microsoft Research AI4Science\" class=\"wp-image-971256\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008.png 2169w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-300x104.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-1024x356.png 1024w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-768x267.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-1536x534.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-2048x712.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/image008-240x83.png 240w\" sizes=\"auto, (max-width: 2169px) 100vw, 2169px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-2-2.gif\" alt=\"Figure 2: This figure contains two pieces. The top piece represents the general methodology of building this scientific foundation model (SFM). The bottom section is a GIF that illustrates one important approach that has been developed by Microsoft on protein structure prediction through Distributional Graphormer. Unlike other protein prediction methods on the market, Distributional Graphormer recognizes that molecules are not rigid; rather, they are dynamic and can adopt different structures with different probabilities at equilibrium. Distributional Graphormer is the first computational method that can predict the equilibrium distribution of molecules by advanced generative AI technology.\" class=\"wp-image-968448\"\/><figcaption class=\"wp-element-caption\">Figure 2: Scientific foundation model (SFM) and its current exploration: Distributional Graphormer.<\/figcaption><\/figure>\n\n\n\n<p>The scientific foundation model (SFM) aims to create a unified large-scale foundation model to empower natural scientific discovery by supporting diverse inputs, multiple scientific domains (e.g., drugs, materials, biology, health, etc.) and computational tasks. 
The DeepSpeed4Science partnership will provide new training and inference technologies to empower the SFM team\u2019s continuous research, including Microsoft\u2019s new generative AI methods such as <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/distributional-graphormer-toward-equilibrium-distribution-prediction-for-molecular-systems\/\">Distributional Graphormer<\/a>.<\/p>\n\n\n\n<p><strong>ClimaX, MSR AI4Science<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1536\" height=\"680\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3.png\" alt=\"Figure 3: The diagram of a foundation model for weather modeling is shown here. Our changing climate is producing more frequent extreme weather events. To mitigate the negative effects, it is increasingly important to predict where these events will occur. ClimaX is the first foundation model designed to perform a wide variety of weather and climate modeling tasks. 
It can absorb many different datasets with different variables and resolutions, potentially improving weather forecasting.\" class=\"wp-image-968451\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3-300x133.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3-1024x453.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3-768x340.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-3-240x106.png 240w\" sizes=\"auto, (max-width: 1536px) 100vw, 1536px\" \/><figcaption class=\"wp-element-caption\">Figure 3: ClimaX is the first foundation model designed to perform a wide variety of weather and climate modeling tasks.<\/figcaption><\/figure>\n\n\n\n<p>Our changing climate is producing more frequent extreme weather events. To mitigate the negative effects, it is increasingly important to predict where these events will occur. <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/autonomous-systems-group-robotics\/articles\/introducing-climax-the-first-foundation-model-for-weather-and-climate\/\" target=\"_blank\" rel=\"noreferrer noopener\">ClimaX<\/a> is the first foundation model designed to perform a wide variety of weather and climate modeling tasks. It can absorb many different datasets with different variables and resolutions, potentially improving weather forecasting. 
DeepSpeed4Science is creating new system support and acceleration strategies for ClimaX to efficiently pretrain and finetune bigger foundation models while handling very large high-resolution image data (e.g., tens to hundreds of petabytes) with long sequences.<\/p>\n\n\n\n<p><strong>AI Powered Ab Initio Molecular Dynamics (AI<sup>2<\/sup>MD), MSR AI4Science<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"705\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-4.gif\" alt=\"Figure 4: This animated figure illustrates one million steps of a molecular dynamics simulation, e.g., RBD-protein interacts with protein inhibitor. Simulations like this are efficient enough to generate trajectories long enough to observe chemically significant events.\" class=\"wp-image-968454\"\/><figcaption class=\"wp-element-caption\">Figure 4: One million steps of molecular dynamics simulation: RBD-protein interacts with protein inhibitor.<\/figcaption><\/figure>\n\n\n\n<p>This project simulates the dynamics of large (million-atom) molecular systems with near ab initio accuracy using <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/ai2bmd-efficient-characterization-of-protein-dynamics-with-ab-initio-accuracy\/\">AI-powered force field models<\/a> while maintaining the efficiency and scalability of classical molecular dynamics. The simulations are efficient enough to generate trajectories long enough to observe chemically significant events. Typically, millions or even billions of inference steps are required for this process. This poses a significant challenge in optimizing the inference speed of graph neural network (GNN) + LLM models, for which DeepSpeed4Science will provide new acceleration strategies.<\/p>\n\n\n\n<p><strong>Weather from Microsoft Start, Microsoft WebXT\/Bing<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"552\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-5.gif\" alt=\"Figure 5: This figure shows the Microsoft Start precipitation nowcast application on Bing, i.e., every 4 minutes for the next 4 hours. Weather from Microsoft Start provides precise weather information to help users make better decisions for their lifestyles, health, jobs and activities \u2013 including accurate 10-day global weather forecasts updated multiple times every hour.\" class=\"wp-image-968439\"\/><figcaption class=\"wp-element-caption\">Figure 5: Microsoft Start precipitation nowcast (every 4 minutes for the next 4 hours).<\/figcaption><\/figure>\n\n\n\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.msn.com\/en-us\/weather\" target=\"_blank\" rel=\"noopener noreferrer\">Weather from Microsoft Start<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> provides precise weather information to <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/blogs.windows.com\/windowsexperience\/2022\/08\/31\/microsoft-joins-noaas-weather-ready-nation-ambassador-initiative-to-help-improve-americas-readiness-and-response-to-weather-events\/\" target=\"_blank\" rel=\"noopener noreferrer\">help users make better decisions for their lifestyles, health, jobs and activities<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> \u2013 including accurate 10-day global weather forecasts updated multiple times every hour. Previously, Weather 
from Microsoft Start benefited from DeepSpeed technologies to accelerate their multi-GPU training environments. Currently, DeepSpeed4Science is working with the WebXT weather team to further enhance Microsoft Weather services with cutting-edge features and improvements.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"current-external-collaborators\">Current external collaborators&nbsp;<\/h3>\n\n\n\n<p>DeepSpeed4Science\u2019s journey started with two pioneering LLM-based AI models for structural biology research: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/openfold.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">OpenFold<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> from Columbia University, an open-sourced high-fidelity protein structure prediction model; and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/ramanathanlab\/genslm\" target=\"_blank\" rel=\"noopener noreferrer\">GenSLMs<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> from <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.anl.gov\/\" target=\"_blank\" rel=\"noopener noreferrer\">Argonne National Laboratory<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, an <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.acm.org\/media-center\/2022\/november\/gordon-bell-special-prize-covid-research-2022\" target=\"_blank\" rel=\"noopener noreferrer\">award-winning genome-scale language model<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> for learning the evolutionary landscape of SARS-CoV-2 (COVID-19) genomes. As the featured showcases for this release, they represent two common AI system challenges facing today\u2019s AI-driven structural biology research. 
We will discuss how DeepSpeed4Science empowered their scientific discovery in the next section.<\/p>\n\n\n\n<p>Additionally, DeepSpeed4Science has recently expanded its scope to support a more diverse range of science models. For example, in our work with Argonne on training a trillion-parameter science model on the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.anl.gov\/aurora\" target=\"_blank\" rel=\"noopener noreferrer\">Aurora Exascale system<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, DeepSpeed4Science technologies will help them reach the performance requirements and scalability needed for this critical mission. Furthermore, by collaborating with <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/ai-roadmap.ornl.gov\/\" target=\"_blank\" rel=\"noopener noreferrer\">Oak Ridge National Lab<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.cancer.gov\/\" target=\"_blank\" rel=\"noopener noreferrer\">National Cancer Institute (NCI)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> on cancer surveillance, DeepSpeed4Science will help enable high-fidelity extraction and classification of information from unstructured clinical texts for the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.olcf.ornl.gov\/tag\/mossaic\/\" target=\"_blank\" rel=\"noopener noreferrer\">MOSSAIC project<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. DeepSpeed4Science technologies will also be adopted by <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.bnl.gov\/world\/\" target=\"_blank\" rel=\"noopener noreferrer\">Brookhaven National Laboratory<span class=\"sr-only\"> 
(opens in new tab)<\/span><\/a> to support the development of a large digital twin model for clean energy research by using LLMs to produce more realistic simulation data. You can find more detailed information about our external collaborators and their science missions at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/deepspeed4science.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed4Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"partnership-showcases\">Partnership showcases&nbsp;<\/h2>\n\n\n\n<p><strong>Showcase (I): DeepSpeed4Science eliminates memory explosion problems for scaling <em>Evoformer-centric<\/em> structural biology models via <em>DS4Sci_EvoformerAttention<\/em><\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img decoding=\"async\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-6-1.png\" alt=\"Figure 6: The top figure illustrates the prediction demonstration from AlphaFold2 and OpenFold against the baseline experimental result. OpenFold is a community reproduction of DeepMind\u2019s AlphaFold2 that makes it possible to train or finetune AlphaFold2 on new datasets. Researchers have used it to retrain AlphaFold2 from scratch to produce new sets of model parameters, studied the early training phase of AlphaFold2 (shown as the bottom figure), and developed new protein folding systems. 
The bottom figure demonstrates OpenFold's predictions for PDB chain 7B3A_A as the model trains.\" class=\"wp-image-968409\" style=\"height:300px\" height=\"300\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-6-1.png 631w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-6-1-300x207.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-6-1-240x165.png 240w\" sizes=\"(max-width: 631px) 100vw, 631px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1500\" height=\"718\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure6-2.gif\" alt=\"Figure 6: The top figure illustrates the prediction demonstration from AlphaFold2 and OpenFold against the baseline experimental result. OpenFold is a community reproduction of DeepMind\u2019s AlphaFold2 that makes it possible to train or finetune AlphaFold2 on new datasets. Researchers have used it to retrain AlphaFold2 from scratch to produce new sets of model parameters, studied the early training phase of AlphaFold2 (shown as the bottom figure), and developed new protein folding systems. 
The bottom figure demonstrates OpenFold's predictions for PDB chain 7B3A_A as the model trains.\" class=\"wp-image-968415\"\/><figcaption class=\"wp-element-caption\">Figure 6: OpenFold predictions for PDB chain 7B3A_A as the model trains.<\/figcaption><\/figure>\n\n\n\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/aqlaboratory\/openfold\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>OpenFold<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a> is a community reproduction of <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/alphafold.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepMind\u2019s AlphaFold2<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> that makes it possible to train or finetune AlphaFold2 on new datasets. Researchers have used it to retrain AlphaFold2 from scratch to produce new sets of model parameters, studied the early training phase of AlphaFold2 (Figure 6), and developed new protein folding systems.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img decoding=\"async\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-7.jpg\" alt=\"Figure 7: It shows the peak memory requirement for training variants of the multiple sequence alignment (MSA) attention kernels (with bias) with the maximum possible training sample dimension in OpenFold. (Left) The original OpenFold implementation with EvoformerAttention used in AlphaFold2. The memory explosion problems in training\/inference for these types of protein structure prediction models are common. Particularly, state-of-the-art FlashAttention cannot effectively support such science attention variants. 
(Right) A new solution from DeepSpeed4Science called DS4Sci_EvoformerAttention significantly reduces OpenFold\u2019s peak memory requirement for training by 13X without accuracy loss.\" class=\"wp-image-968418\" style=\"width:400px\" width=\"400\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-7.jpg 634w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-7-300x191.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-7-240x153.jpg 240w\" sizes=\"(max-width: 634px) 100vw, 634px\" \/><figcaption class=\"wp-element-caption\">Figure 7: Peak memory requirement for training variants of the multiple sequence alignment (MSA) attention kernels (with bias) with the maximum possible training sample dimension in OpenFold. (Left) The original OpenFold implementation with EvoformerAttention used in AlphaFold2. The memory explosion problems in training\/inference for these types of protein structure prediction models are common. Particularly, state-of-the-art FlashAttention cannot effectively support such science attention variants. (Right) A new solution from DeepSpeed4Science called DS4Sci_EvoformerAttention significantly reduces OpenFold\u2019s peak memory requirement for training by 13X without accuracy loss.<\/figcaption><\/figure>\n\n\n\n<p>While OpenFold does include performance and memory optimizations using state-of-the-art system technologies, training AlphaFold2 from scratch is still computationally expensive. The model at the current stage is small in absolute terms, with just 93 million parameters, but it contains several custom attention variants that manifest unusually large activations. 
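To get a sense of how large these activations become, a back-of-the-envelope estimate helps. The sketch below sizes an MSA-attention logit tensor of shape [msa_depth, heads, seq_len, seq_len] in half precision; the dimensions are illustrative assumptions for this sketch, not the exact AlphaFold2\/OpenFold training configuration.

```python
# Rough size of one MSA-attention logit tensor in half precision.
# All dimensions below are illustrative assumptions, not the exact
# AlphaFold2/OpenFold training configuration.
msa_depth = 5120    # number of aligned sequences attended over
heads = 8           # attention heads
seq_len = 384       # residues per cropped sequence
bytes_per_fp16 = 2

logit_bytes = msa_depth * heads * seq_len * seq_len * bytes_per_fp16
print(f"{logit_bytes / 1e9:.1f} GB")  # prints "12.1 GB"
```

With these assumed dimensions, a single activation of this shape is already on the order of 12 GB, before counting the rest of the model's activations, parameters, and optimizer state.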
During the \u201cfinetuning\u201d phase of a standard AlphaFold2 training run, the logit tensor produced in just one of these variants&#8211;one designed to attend over the deep protein MSAs fed to the model as input&#8211;is in excess of 12GB in half precision alone, dwarfing the peak memory requirements of comparably sized language models. Even with techniques like activation checkpointing and DeepSpeed ZeRO optimizations, this memory explosion problem heavily constrains the sequence lengths and MSA depths on which the model can be trained. Furthermore, approximation strategies can significantly affect model accuracy and convergence while still resulting in memory explosion, shown as the left (orange) bar in Figure 7.<\/p>\n\n\n\n<p>To address this common system challenge in structural biology research (e.g., protein structure prediction and equilibrium distribution prediction), DeepSpeed4Science has designed customized exact attention kernels for the attention variants (i.e., <em>EvoformerAttention<\/em>) that widely appear in this category of science models. Specifically, a set of highly memory-efficient DS4Sci_EvoformerAttention kernels, enabled by sophisticated fusion\/tiling strategies and on-the-fly memory reduction methods, has been created for the broader community as high-quality machine learning primitives. Incorporated into OpenFold, they provide a substantial speedup during training and dramatically reduce the model\u2019s peak memory requirement for training and inference. This allows OpenFold to experiment with bigger and more complex models and longer sequences, and to be trained on a wider spectrum of hardware. 
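The core memory-reduction idea behind such tiled kernels can be illustrated with a minimal single-head sketch in plain NumPy (not the actual fused CUDA implementation): computing biased attention one block of queries at a time means the full logit matrix never has to be materialized at once.

```python
import numpy as np

def naive_attention(q, k, v, bias):
    # Materializes the full [n_q, n_k] logit tensor -- the memory bottleneck.
    logits = q @ k.T + bias
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def tiled_attention(q, k, v, bias, block=32):
    # Processes queries block by block, so only a [block, n_k] slice of the
    # logits exists at any moment -- the tiling idea behind memory-efficient
    # attention kernels.
    out = np.empty_like(q)
    for start in range(0, q.shape[0], block):
        blk = slice(start, start + block)
        logits = q[blk] @ k.T + bias[blk]
        w = np.exp(logits - logits.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[blk] = w @ v
    return out

rng = np.random.default_rng(0)
n_res, d = 128, 16
q, k, v = (rng.standard_normal((n_res, d)) for _ in range(3))
bias = rng.standard_normal((n_res, n_res))  # e.g., a pairwise bias term
assert np.allclose(naive_attention(q, k, v, bias),
                   tiled_attention(q, k, v, bias))
```

Production kernels additionally fuse these steps on the GPU and tile over keys with an online softmax (as FlashAttention does), but the peak-memory arithmetic is the same: activation size scales with the block size rather than with the full sequence length squared.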
Detailed information about this technology can be found at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/deepspeed4science.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed4Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<p><strong>Showcase (II): DeepSpeed4Science enables very-long sequence support via both system-level and algorithmic approaches for genome-scale foundation models (e.g., GenSLMs)<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1600\" height=\"900\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed-Figure-8.gif\" alt=\"Figure 8. The dynamic figure depicts GenSLMs, the 2022 ACM Gordon Bell winning COVID model (a 25B\/33B dense model based on GPT-NeoX). It is used to learn the latent space that describes biologically meaningful properties for SARS-CoV-2 genomes. This GIF visualizes an important protein family, malate dehydrogenase, showing a projection of the latent space colored by important features such as sequence length and GC content (the ratio of guanine and cytosine to adenine and thymine in a DNA sequence, which measures the ability of a DNA strand to withstand heat).\" class=\"wp-image-968424\"\/><figcaption class=\"wp-element-caption\">Figure 8: GenSLMs: 2022 ACM Gordon Bell Winning COVID Model (a 25B\/33B dense model based on GPT-NeoX). It is used to learn the latent space that describes biologically meaningful properties for SARS-CoV-2 genomes. This GIF visualizes an important protein family, malate dehydrogenase, showing a projection of the latent space colored by important features such as sequence length and GC content (the ratio of guanine and cytosine to adenine and thymine in a DNA sequence. 
It measures the ability of a DNA strand to withstand heat).<\/figcaption><\/figure>\n\n\n\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/ramanathanlab\/genslm\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>GenSLMs<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, a 2022 <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.acm.org\/media-center\/2022\/november\/gordon-bell-special-prize-covid-research-2022\" target=\"_blank\" rel=\"noopener noreferrer\">ACM Gordon Bell award<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>-winning genome-scale language model from Argonne National Lab, can learn the evolutionary landscape of SARS-CoV-2 (COVID-19) genomes by adapting large language models (LLMs) for genomic data. It is designed to transform how new and emergent variants of pandemic-causing viruses, especially SARS-CoV-2, are identified and classified. GenSLMs represent one of the first whole-genome-scale foundation models that can generalize to other prediction tasks. A good understanding of the latent space can help GenSLMs tackle new domains beyond just viral sequences and expand their ability to model bacterial pathogens and even eukaryotic organisms, e.g., to capture properties such as function, pathway membership, and evolutionary relationships. To achieve this scientific goal, GenSLMs and similar models require very <em>long sequence<\/em> support for both training and inference that is beyond generic LLMs\u2019 long-sequence strategies like <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/2307.08691\" target=\"_blank\" rel=\"noopener noreferrer\">FlashAttention<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. 
Through DeepSpeed4Science\u2019s new designs, scientists can now build and train models with significantly longer context windows, allowing them to explore relationships that were previously inaccessible.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1160\" height=\"291\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9.png\" alt=\"DeepSpeed - Figure 9. The two figures show the maximum sequence lengths of GenSLM models (25 billion parameters and 33 billion parameters) supported by different frameworks at different scales. The hardware profiled here are NVIDIA DGX nodes with eight 40G A100 GPUs per node.\" class=\"wp-image-968397\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9.png 1160w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9-300x75.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9-1024x257.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9-768x193.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/Figure-9-240x60.png 240w\" sizes=\"auto, (max-width: 1160px) 100vw, 1160px\" \/><figcaption class=\"wp-element-caption\">Figure 9: Maximum sequence lengths of GenSLM models supported by different frameworks at different scales. 
The hardware profiled here consists of NVIDIA DGX nodes with eight 40GB A100 GPUs per node.<\/figcaption><\/figure>\n\n\n\n<p>Specifically, at the system level, we release the newest <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/Megatron-DeepSpeed\/tree\/main\/examples_deepspeed\/deepspeed4science\/megatron_long_seq_support\" target=\"_blank\" rel=\"noopener noreferrer\">Megatron-DeepSpeed<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> framework for very-long sequence support along with <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/Megatron-DeepSpeed\/tree\/main\/examples_deepspeed\/deepspeed4science\/megatron_long_seq_support\">other new optimizations<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Scientists can now train their large science models like GenSLMs with much longer sequences via a synergistic combination of our newly added memory optimization techniques for attention masks and position embeddings, tensor parallelism, pipeline parallelism, sequence parallelism, ZeRO-style data parallelism, and model state offloading. Figure 9 demonstrates that our new release extends the maximum sequence length for GenSLMs\u2019 25B and 33B models by up to <em>12X<\/em> and <em>14X<\/em>, respectively, over the previous Megatron-DeepSpeed. In terms of supported sequence lengths, this new framework also significantly outperforms NVIDIA\u2019s Megatron-LM by up to 9.8X and 9.1X for the 25B and 33B models, respectively. For example, GenSLMs\u2019 25B model can now be trained with a 512K-nucleotide sequence, compared to the Argonne team\u2019s original 42K sequence length on 64 GPUs. This drastically improves model quality and broadens the scope of scientific discovery, with <em>no accuracy loss<\/em>. 
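<\/p>\n\n\n\n<p>The intuition behind these sequence-length gains can be sketched with a toy memory model. Assume activation memory per GPU grows linearly with the local sequence length, so sequence parallelism of degree <em>sp<\/em> leaves each GPU holding only <em>s\/sp<\/em> tokens of a length-<em>s<\/em> sequence. The constants and helper below are hypothetical, for illustration only; they are not Megatron-DeepSpeed\u2019s actual memory accounting:<\/p>

```python
def max_seq_len(gpu_mem_bytes: int, bytes_per_token: int, sp_degree: int) -> int:
    """Longest trainable sequence when each of sp_degree GPUs holds an
    equal shard of the activations (toy linear memory model)."""
    return (gpu_mem_bytes // bytes_per_token) * sp_degree

GPU_MEM = 30_000_000_000      # hypothetical usable bytes on a 40GB A100
BYTES_PER_TOKEN = 2_500_000   # hypothetical activation bytes per token

print(max_seq_len(GPU_MEM, BYTES_PER_TOKEN, sp_degree=1))   # 12000
print(max_seq_len(GPU_MEM, BYTES_PER_TOKEN, sp_degree=16))  # 192000
```

<p>Under this toy model, the maximum trainable sequence length scales linearly with the sequence-parallel degree; the actual release compounds this with the memory optimizations and other parallelism dimensions listed above, which is how jumps like 42K to 512K nucleotides become possible.<\/p>\n\n\n\n<p>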
Additional support for domain scientists who prefer algorithmic strategies like relative position embedding techniques is also integrated in this <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/deepspeed4science.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">new release<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"summary-and-roadmap\">Summary and roadmap&nbsp;<\/h2>\n\n\n\n<p>We are very proud and excited to announce the DeepSpeed4Science initiative along with several R&D highlights and achievements. Starting today, we will host our new initiative at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/deepspeed4science.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed4Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, including information about our external colleagues, and current and future DeepSpeed4Science technology releases. One of our high-level goals is to generalize AI system technologies that broadly address the major system pain points for large-scale scientific discoveries. We hope scientists around the world will enjoy the new capabilities unlocked by DeepSpeed4Science through open-sourced software. We are looking forward to better understanding the AI system design challenges that block your discovery progress. We sincerely welcome your participation to help us build a promising AI4Science future. Please email us at <a href=\"mailto:deepspeed-info@microsoft.com\" target=\"_blank\" rel=\"noreferrer noopener\">deepspeed-info@microsoft.com<\/a>. 
We encourage you to report issues, contribute PRs, and join discussions on our\u202f<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/DeepSpeed\/\" target=\"_blank\" rel=\"noopener noreferrer\">DeepSpeed GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u202fpage.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"mailto:deepspeed-info@microsoft.com\">DeepSpeed<\/a><\/div>\n\n\n\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"mailto:deepspeed-info@microsoft.com\">Contact us<\/a><\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"acknowledgements\">Acknowledgements&nbsp;<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"core-deepspeed4science-team\">Core DeepSpeed4Science Team:&nbsp;&nbsp;<\/h3>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/leonsong\/\">Shuaiwen Leon Song<\/a>&nbsp;(DeepSpeed4Science lead), <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/minjiaz\/\">Minjia Zhang<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/conglli\/\">Conglong Li<\/a>, Shiyang Chen, Chengming Zhang, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xiaoxiawu\/\">Xiaoxia (Shirley) Wu<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mtanaka\/\">Masahiro Tanaka<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mcai\/\">Martin Cai<\/a>, Adam Graham, Charlie Zhou, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/yuxhe\/\">Yuxiong He<\/a>&nbsp;(DeepSpeed team lead)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"our-founding-collaborators-in-alphabetical-order\">Our Founding Collaborators (in alphabetical 
order):<\/h3>\n\n\n\n<p><strong>Argonne National Lab team<\/strong>: Rick Stevens, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Thomas Gibbs, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin<\/p>\n\n\n\n<p><strong>AMD<\/strong>: Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz<\/p>\n\n\n\n<p><strong>Brookhaven National Lab team<\/strong>: Adolfy Hoisie, Shinjae Yoo, Yihui Ren&nbsp;<\/p>\n\n\n\n<p><strong>Columbia University <\/strong><strong>OpenFold<\/strong><strong> team<\/strong>: Mohammed AlQuraishi, Gustaf Ahdritz&nbsp;<\/p>\n\n\n\n<p><strong>Microsoft Research AI4Science team: <\/strong>Christopher Bishop, Bonnie Kruft, Max Welling, Tie-Yan Liu, Christian Bodnar, Johannes Brandstetter, Wessel Bruinsma, Chan Cao, Yuan-Jyue Chen, Peggy Dai, Patrick Garvan, Liang He, Elizabeth Heider, PiPi Hu, Peiran Jin, Fusong Ju, Yatao Li, Chang Liu, Renqian Luo, Qi Meng, Frank Noe, Tao Qin, Janwei Zhu, Bin Shao, Yu Shi, Wenlei Shi, Gregor Simm, Megan Stanley, Lixin Sun, Yue Wang, Tong Wang, Zun Wang, Lijun Wu, Yingce Xia, Leo Xia, Shufang Xie, Shuxin Zheng, Jianwei Zhu<\/p>\n\n\n\n<p><strong>NVIDIA<\/strong>: Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar<\/p>\n\n\n\n<p><strong>Oak Ridge National Lab team<\/strong>: Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano (Max) Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu<\/p>\n\n\n\n<p><strong>Princeton University<\/strong>: William Tang, Kyle 
Felker, Alexey Svyatkovskiy (Microsoft liaison)&nbsp;<\/p>\n\n\n\n<p><strong>Rutgers University: <\/strong>Hang Liu<\/p>\n\n\n\n<p><strong>WebXT<\/strong><strong> Weather team<\/strong>: Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Editor\u2019s note, Sept. 28, 2023 \u2013\u00a0The founding collaborators list was updated to correct omissions and the scientific foundation model graph was updated to correct information. In the next decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing [&hellip;]<\/p>\n","protected":false},"author":42735,"featured_media":968769,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Shuaiwen Leon Song","user_id":"42567"},{"type":"user_nicename","value":"Bonnie Kruft","user_id":"41919"},{"type":"user_nicename","value":"Minjia Zhang","user_id":"36335"},{"type":"user_nicename","value":"Conglong Li","user_id":"39318"},{"type":"user_nicename","value":"Martin Cai","user_id":"32856"},{"type":"user_nicename","value":"Yuxiong 
He","user_id":"35084"}],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[264846],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-968301","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[851467],"msr_impact_theme":["Computing foundations"],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[788837,678390],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Bonnie Kruft","user_id":41919,"display_name":"Bonnie Kruft","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/bonniekruft\/\" aria-label=\"Visit the profile page for Bonnie Kruft\">Bonnie Kruft<\/a>","is_active":false,"last_first":"Kruft, Bonnie","people_section":0,"alias":"bonniekruft"},{"type":"user_nicename","value":"Martin Cai","user_id":32856,"display_name":"Martin Cai","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mcai\/\" aria-label=\"Visit the profile page for Martin Cai\">Martin Cai<\/a>","is_active":false,"last_first":"Cai, Martin","people_section":0,"alias":"mcai"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"DeepSpeed4Science Initiative - graphic with 6 icons\" decoding=\"async\" loading=\"lazy\" 
srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/09\/DeepSpeed4Science-BlogHeroFeature-no-text-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"September 19, 2023","formattedExcerpt":"Editor\u2019s note, Sept. 28, 2023 \u2013\u00a0The founding collaborators list was updated to correct omissions and the scientific foundation model graph was updated to correct information. 
In the next decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/968301","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/42735"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=968301"}],"version-history":[{"count":40,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/968301\/revisions"}],"predecessor-version":[{"id":971271,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/968301\/revisions\/971271"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/968769"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=968301"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=968301"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=968301"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=968301"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=968301"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event
-type?post=968301"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=968301"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=968301"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=968301"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=968301"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=968301"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}