{"id":945276,"date":"2023-07-07T09:01:22","date_gmt":"2023-07-07T16:01:22","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=945276"},"modified":"2024-06-10T10:01:29","modified_gmt":"2024-06-10T17:01:29","slug":"distributional-graphormer-toward-equilibrium-distribution-prediction-for-molecular-systems","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/distributional-graphormer-toward-equilibrium-distribution-prediction-for-molecular-systems\/","title":{"rendered":"Distributional Graphormer: Toward equilibrium distribution prediction for molecular systems"},"content":{"rendered":"\n<figure class=\"wp-block-image aligncenter size-full is-style-spectrum\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"300\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/diglogo-graphormer-animated-logo-800x300-1.gif\" alt=\"Distributional Graphormer (DiG) animated logo\" class=\"wp-image-953460\"\/><\/figure>\n\n\n\n<p>Structure prediction is a fundamental problem in molecular science because the structure of a molecule determines its properties and functions. In recent years, deep learning methods have made remarkable progress and impact on predicting molecular structures, especially for protein molecules. 
Deep learning methods, such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/alphafold.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">AlphaFold<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.ipd.uw.edu\/2021\/07\/rosettafold-accurate-protein-structure-prediction-accessible-to-all\/\" target=\"_blank\" rel=\"noopener noreferrer\">RoseTTAFold<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, have achieved unprecedented accuracy in predicting the most probable structures for proteins from their amino acid sequences and have been hailed as a game changer in molecular science. However, these methods provide only a single snapshot of a protein structure, and a single structure cannot tell the complete story of how a molecule works.<\/p>\n\n\n\n<p>Proteins are not rigid objects; they are dynamic molecules that can adopt different structures with specific probabilities at equilibrium. Identifying these structures and their probabilities is essential to understanding protein properties and functions, how they interact with other proteins, and the statistical mechanics and thermodynamics of molecular systems. Traditional methods for obtaining these equilibrium distributions, such as molecular dynamics simulations or Monte Carlo sampling (which uses repeated random sampling from a distribution to obtain numerical estimates), are often computationally expensive and may even become intractable for complex molecules. 
Therefore, there is a pressing need for novel computational approaches that can accurately and efficiently predict the equilibrium distributions of molecular structures from basic descriptors.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"350\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a.jpg\" alt=\"A schematic diagram illustrating the goal of Distributional Graphormer (DiG). A molecular system is represented by a basic descriptor D, such as the amino acid sequence for a protein. DiG transforms D into a structural ensemble S, which consists of multiple possible conformations and their probabilities. S is expected to follow the equilibrium distribution of the molecular system. A legend shows an example of D and S for the adenylate kinase protein.\" class=\"wp-image-945444\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a-300x75.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a-1024x256.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a-768x192.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig1a-240x60.jpg 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 1. The goal of Distributional Graphormer (DiG). 
DiG takes the basic descriptor, D, of a molecular system, such as the amino acid sequence for a protein, as input to predict the structures and their probabilities following the equilibrium distribution.<\/figcaption><\/figure>\n\n\n\n<p>In this blog post, we introduce <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/towards-predicting-equilibrium-distributions-for-molecular-systems-with-deep-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\">Distributional Graphormer (DiG)<\/a>, a new deep learning framework for predicting protein structures according to their equilibrium distribution. It aims to address this fundamental challenge and open new opportunities for molecular science. DiG marks a significant advance from single-structure prediction to structure ensemble modeling with equilibrium distributions. Its distribution prediction capability bridges the gap between the microscopic structures and the macroscopic properties of molecular systems, which are governed by statistical mechanics and thermodynamics. Nevertheless, this is a tremendous challenge, as it requires modeling complex distributions in high-dimensional space to capture the probabilities of different molecular states.<\/p>\n\n\n\n<p>DiG approaches distribution prediction by building on our previous work, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/graphormer\/\" target=\"_blank\" rel=\"noreferrer noopener\">Graphormer<\/a>, a general-purpose graph transformer that can effectively model molecular structures. 
Graphormer has shown excellent performance in molecular science research, demonstrated by applications in quantum chemistry and molecular dynamics simulations, as reported in our previous blog posts (see <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-asia\/articles\/transformer-stands-out-as-the-best-graph-learner-researchers-from-microsoft-research-asia-wins-the-kdd-cups-2021-graph-prediction-track\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a> and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-asia\/articles\/graphormer-wins-the-open-catalyst-challenge-and-upgrades-to-ai-for-molecular-simulation-toolkit\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a> for more details). Now, we have advanced Graphormer to create DiG, which has a new and powerful capability: using deep neural networks to directly predict target distribution from basic descriptors of molecules.<\/p>\n\n\n\n<div style=\"height:15px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>DiG tackles this challenging problem. It is based on the idea of simulated annealing, a classic method in thermodynamics and optimization, which has also motivated the recent development of diffusion models that achieved remarkable breakthroughs in AI-generated content (AIGC). Simulated annealing produces a complex distribution by gradually refining a simple distribution through the simulation of an annealing process, allowing it to explore and settle in the most probable states. DiG mimics this process in a deep learning framework for molecular systems. AIGC models are often based on the idea of diffusion models, which are inspired by statistical mechanics and thermodynamics.<\/p>\n\n\n\n<p>DiG is also based on the idea of diffusion models, but we bring this idea back to thermodynamics research, creating a closed loop of inspiration and innovation. 
We imagine scientists someday will be able to use DiG like an AIGC model for drawing, inputting a simple description, such as an amino acid sequence, and then using DiG to quickly generate realistic and diverse protein structures that follow equilibrium distribution. This will greatly enhance scientists\u2019 productivity and creativity, enabling novel discoveries and applications in fields such as drug design, materials science, and catalysis.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"how-does-dig-work\">How does DiG work?<\/h2>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"500\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5.jpg\" alt=\"A schematic diagram illustrating the design and backbone architecture of DiG. The diagram shows a molecular system with two possible conformations as an example. The top row shows the energy function of the molecular system as a curve, with two local minima corresponding to the two conformations. The bottom row shows the probability distribution of the molecular system as a bar chart, with two peaks corresponding to the two conformations. The diagram also shows a diffusion process that transforms the probability distribution from a simple uniform one to the equilibrium one that matches the energy function. The diffusion process consists of several intermediate time steps, labeled as i=0,1,\u2026,T. At each time step, a deep-learning model, Graphormer, is used to construct a forward diffusion step that converts the distribution at the previous time step to the next one, indicated by blue arrows. The Graphormer model is learned to match the distribution at each time step to a predefined backward diffusion step that converts the equilibrium distribution to the simple one, indicated by orange arrows. 
The backward diffusion step is computed by adding Gaussian noise to the equilibrium distribution and normalizing it. The learning of the Graphormer model is supervised by both the samples and the energy function of the molecular system. The samples are obtained from a large-scale molecular simulation dataset that provides the initial samples and the corresponding energy labels. The energy function is used to calculate the energy scores for the generated samples and guide the diffusion process towards the equilibrium distribution. The diagram also shows a physics-informed diffusion pre-training (PIDP) method that is developed to pre-train DiG with only energy functions as inputs, without the data dependency. The PIDP method uses a contrastive loss function to minimize the distance between the energy scores and the probabilities of the generated samples at each time step. The PIDP method can enhance the generalization of DiG to molecular systems that are not in the dataset.\" class=\"wp-image-945450\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5-300x107.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5-1024x366.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5-768x274.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG_Jun2023_fig5-240x86.jpg 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 2. DiG\u2019s design and backbone architecture.<\/figcaption><\/figure>\n\n\n\n<p>DiG is based on the idea of diffusion by transforming a simple distribution to a complex distribution using Graphormer. 
The simple distribution can be a standard Gaussian, and the complex distribution can be the equilibrium distribution of molecular structures. The transformation proceeds step by step, and the whole process mimics simulated annealing.<\/p>\n\n\n\n<p>DiG can be trained using different types of data or information. It can use the energy functions of molecular systems to guide the transformation, minimizing the discrepancy between the energy-based probabilities and the probabilities predicted by DiG. This approach leverages prior knowledge of the system and trains DiG without a stringent dependency on data. Alternatively, DiG can use simulated structure data, such as molecular dynamics trajectories, to learn the distribution by maximizing the likelihood of the data under the DiG model.<\/p>\n\n\n\n<p>DiG generalizes across many molecular systems about as well as deep learning-based structure prediction methods do. This is because DiG inherits the advantages of advanced deep learning architectures like Graphormer and applies them to the new and challenging task of distribution prediction. Once trained, DiG can generate molecular structures by reversing the transformation process, starting from a simple distribution and applying neural networks in reverse order. DiG can also provide a probability estimate for each generated structure by computing the change of probability along the transformation process. 
DiG is a flexible and general framework that can handle different types of molecular systems and descriptors.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"results\">Results<\/h2>\n\n\n\n<p>We demonstrate DiG\u2019s performance and potential through several molecular sampling tasks covering a broad range of molecular systems, such as proteins, protein-ligand complexes, and catalyst-adsorbate systems. Our results show that DiG not only generates realistic and diverse molecular structures with high efficiency and low computational costs, but it also provides estimations of state densities, which are crucial for computing macroscopic properties using statistical mechanics. Accordingly, DiG presents a significant advancement in statistically understanding microscopic molecules and predicting their macroscopic properties, creating many exciting research opportunities in molecular science.<\/p>\n\n\n\n<p>One major application of DiG is to sample protein conformations, which are indispensable to understanding their properties and functions. Proteins are dynamic molecules that can adopt diverse structures with different probabilities at equilibrium, and these structures are often related to their biological functions and interactions with other molecules. However, predicting the equilibrium distribution of protein conformations is a long-standing and challenging problem due to the complex and high-dimensional energy landscape that governs probability distribution in the conformation space. 
In contrast to expensive and inefficient molecular dynamics simulations or Monte Carlo sampling methods, DiG generates diverse and functionally relevant protein structures from amino acid sequences at high speed and significantly reduced cost.<\/p>\n\n\n\n<figure class=\"wp-block-video aligncenter\"><video height=\"1600\" style=\"aspect-ratio: 2560 \/ 1600;\" width=\"2560\" controls src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/2023-06-18-22-51-41_1.mp4\"><\/video><figcaption class=\"wp-element-caption\">Figure 3. This illustration shows DiG\u2019s performance when generating multiple conformations of proteins. On the left, DiG-generated structures of the main protease of the SARS-CoV-2 virus are projected into a 2D space spanned by two TICA coordinates. On the right, structures generated by DiG (thin ribbons) are compared with experimentally determined structures (cylindrical figures) in each case.<\/figcaption><\/figure>\n\n\n\n<p>DiG can generate multiple conformations from the same protein sequence. The left side of Figure 3 shows DiG-generated structures of the main protease of the SARS-CoV-2 virus compared with molecular dynamics (MD) simulations and AlphaFold prediction results. The contours (shown as lines) in the 2D space reveal three clusters sampled by extensive MD simulations. DiG generates highly similar structures in clusters II and III, while structures in cluster I are undersampled. In the right panel, DiG-generated structures are aligned to experimental structures for four proteins, each with two distinguishable conformations corresponding to unique functional states. In the upper left, the adenylate kinase protein has open and closed states, both well sampled by DiG. Similarly, for the drug transport protein LmrP, DiG also generates structures for both states. 
Here, note that the closed state is experimentally determined (in the lower-right corner, with PDB ID 6t1z), while the other is the AlphaFold predicted model that is consistent with experimental data. In the case of human B-Raf kinase, the major structural difference is localized in the A-loop region and a nearby helix, which are well captured by DiG. The D-ribose binding protein has two separated domains, which can be packed into two distinct conformations. DiG perfectly generated the straight-up conformation, but it is less accurate in predicting the twisted conformation. Nonetheless, besides the straight-up conformation, DiG generated some conformations that appear to be intermediate states.<\/p>\n\n\n\n<p>Another application of DiG is to sample catalyst-adsorbate systems, which are central to heterogeneous catalysis. Identifying active adsorption sites and stable adsorbate configurations is crucial for understanding and designing catalysts, but it is also quite challenging due to the complex surface-molecular interactions. Traditional methods, such as density functional theory (DFT) calculations and molecular dynamics simulations, are time-consuming and costly, especially for large and complex surfaces. DiG predicts adsorption sites and configurations, as well as their probabilities, from the substrate and adsorbate descriptors. DiG can handle various types of adsorbates, such as single atoms or molecules being adsorbed onto different types of substrates, such as metals or alloys.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"829\" height=\"535\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/DiG_Jul2023_fig4.png\" alt=\"Figure 4. Adsorption prediction results of single C, H, and O atoms on catalyst surfaces. 
The predicted probability distribution on catalyst surface is compared to the interaction energy between the adsorbate molecules and the catalyst in the middle and bottom rows.\" class=\"wp-image-954159\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/DiG_Jul2023_fig4.png 829w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/DiG_Jul2023_fig4-300x194.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/DiG_Jul2023_fig4-768x496.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/07\/DiG_Jul2023_fig4-240x155.png 240w\" sizes=\"auto, (max-width: 829px) 100vw, 829px\" \/><figcaption class=\"wp-element-caption\">Figure 4. Adsorption prediction results of single C, H, and O atoms on catalyst surfaces. The predicted probability distribution on catalyst surface is compared to the interaction energy between the adsorbate molecules and the catalyst in the middle and bottom rows.<\/figcaption><\/figure>\n\n\n\n<p>Applying DiG, we predicted the adsorption sites for a variety of catalyst-adsorbate systems and compared these predicted probabilities with energies obtained from DFT calculations. We found that DiG could find all the stable adsorption sites and generate adsorbate configurations that are similar to the DFT results with high efficiency and at a low cost. DiG estimates the probabilities of different adsorption configurations, in good agreement with DFT energies.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion-1\">Conclusion<\/h2>\n\n\n\n<p>In this blog, we introduced DiG, a deep learning framework that aims to predict the distribution of molecular structures. 
DiG is a significant advance from single-structure prediction toward ensemble modeling with equilibrium distributions, laying a cornerstone for connecting microscopic structures to macroscopic properties under deep learning frameworks.<\/p>\n\n\n\n<p>DiG involves key ML innovations that lead to expressive generative models, which have been shown to have the capacity to sample multimodal distributions within a given class of molecules. We have demonstrated the flexibility of this approach on different classes of molecular systems (including proteins, protein-ligand complexes, and catalyst-adsorbate systems), and we have shown that individual structures generated in this way are chemically realistic. Consequently, DiG enables the development of ML systems that can sample equilibrium distributions of molecules given appropriate training data.<\/p>\n\n\n\n<p>However, we acknowledge that considerably more research is needed to obtain efficient and reliable predictions of equilibrium distributions for arbitrary molecules. We hope that DiG inspires additional research and innovation in this direction, and we look forward to more exciting results and impact from DiG and other related methods in the future.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Distributional Graphormer, Microsoft\u2019s new deep learning framework for predicting the equilibrium distribution of molecular structures, can generate realistic and diverse molecular structures with high efficiency and low 
cost.<\/p>\n","protected":false},"author":42735,"featured_media":945435,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[264846,261673],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-945276","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199560,851467],"msr_impact_theme":["Computing foundations","Health"],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[788837],"related-events":[],"related-researchers":[{"type":"guest","value":"peiran-jin","user_id":"945291","display_name":"Peiran Jin","author_link":"Peiran Jin","is_active":true,"last_first":"Jin, Peiran","people_section":0,"alias":"peiran-jin"},{"type":"user_nicename","value":"Frank No\u00e9","user_id":42216,"display_name":"Frank No\u00e9","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/franknoe\/\" aria-label=\"Visit the profile page for Frank No\u00e9\">Frank No\u00e9<\/a>","is_active":false,"last_first":"No\u00e9, Frank","people_section":0,"alias":"franknoe"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-960x540.png\" 
class=\"img-object-cover\" alt=\"A schematic diagram illustrating the goal of Distributional Graphormer (DiG). A molecular system is represented by a basic descriptor D, such as the amino acid sequence for a protein. DiG transforms D into a structural ensemble S, which consists of multiple possible conformations and their probabilities. S is expected to follow the equilibrium distribution of the molecular system. A legend shows a example of D and S for Adenylate kinase protein.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-240x135.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/DiG-msr-blog-hero-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"July 7, 
2023","formattedExcerpt":"Distributional Graphormer, Microsoft\u2019s new deep learning framework for predicting the equilibrium distribution of molecular structures, can generate realistic and diverse molecular structures with high efficiency and low cost.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/945276","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/42735"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=945276"}],"version-history":[{"count":23,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/945276\/revisions"}],"predecessor-version":[{"id":994491,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/945276\/revisions\/994491"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/945435"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=945276"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=945276"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=945276"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=945276"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=945276"},{"taxonomy":"msr
-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=945276"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=945276"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=945276"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=945276"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=945276"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=945276"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}