{"id":1099176,"date":"2024-11-04T09:00:00","date_gmt":"2024-11-04T17:00:00","guid":{"rendered":""},"modified":"2024-11-14T20:54:22","modified_gmt":"2024-11-15T04:54:22","slug":"microsoft-at-sosp-2024-innovations-in-systems-research","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-at-sosp-2024-innovations-in-systems-research\/","title":{"rendered":"Microsoft at SOSP 2024: Innovations in systems research"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1.jpg\" alt=\"SOSP 2024 on a blue and green gradient background\" class=\"wp-image-1099248\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<p>Microsoft is proud to sponsor the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/event\/sosp-2024\/\">30<sup>th<\/sup> Symposium on Operating Systems Principles<\/a> (SOSP 2024), highlighting its commitment to advancing computing systems research. In an age where digital infrastructure underpins nearly every facet of modern life, SOSP serves as an important forum for showcasing the technologies that shape our interconnected world. Organized annually by the Association for Computing Machinery (ACM), the symposium brings together experts to explore innovations in operating systems, distributed systems, and systems software.<\/p>\n\n\n\n<p>With seven accepted papers, including \u201c<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/verus-a-practical-foundation-for-systems-verification\/\">Verus: A Practical Foundation for Systems Verification<\/a>,\u201d which won the Distinguished Artifact Award, as well as two workshops, and a tutorial, Microsoft researchers are presenting groundbreaking work that strengthens the security, efficiency, and scalability of cloud computing and distributed systems. This work not only contributes to theoretical knowledge but also address real-world challenges, helping ensure that as computing systems grow more complex, they remain sustainable, reliable, and secure.<\/p>\n\n\n\n<p>Continue reading to learn more about Microsoft\u2019s contributions to SOSP 2024, including breakthroughs that tackle the evolving demands of modern computing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"papers\">Papers<\/h2>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"efficient-reproduction-of-fault-induced-failures-in-distributed-systems-with-feedback-driven-fault-injectionjia-pan-haoze-wu-tanakorn-leesatapornwongsa-suman-nath-peng-huang\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/feedback-driven-fault-injection-efficiently-reproducing-fault-induced-failures\/\">Efficient Reproduction of Fault-Induced Failures in Distributed Systems with Feedback-Driven Fault Injection<\/a><\/h3>\n\n\n\n<p><em>Jia Pan, Haoze Wu, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/taleesat\/\"><em>Tanakorn Leesatapornwongsa<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sumann\/\"><em>Suman Nath<\/em><\/a><em>, Peng Huang<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"774\" height=\"645\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Efficient-Reproduction_AH.jpg\" alt=\"Diagram of Anduril\u2019s architecture and workflow. Anduril contains two major components, Instrumenter and Explorer. Instrumenter receives the system bytecode and the failure log file from production. It performs static analysis and computes a causal graph consisting of program points potentially related to the failure symptom. In addition, it inserts code snippets into the system for (1) injecting a fault to throw a desired exception, and (2) logging additional information to facilitate the feedback algorithms. Note that Anduril does not instrument the production system. Its input failure log is from the uninstrumented system. \" class=\"wp-image-1099191\" style=\"width:548px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Efficient-Reproduction_AH.jpg 774w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Efficient-Reproduction_AH-300x250.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Efficient-Reproduction_AH-768x640.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Efficient-Reproduction_AH-216x180.jpg 216w\" sizes=\"auto, (max-width: 774px) 100vw, 774px\" \/><\/figure>\n\n\n\n<p>This research introduces Anduril, a fault injection technique that quickly reproduces specific fault-induced failures in production systems. Anduril uses static causal analysis and a novel feedback-driven algorithm to quickly search the fault space for the failure\u2019s cause and timing. Evaluation on 22 real-world fault-induced failures from five large-scale distributed systems demonstrate FIR\u2019s ability to reproduce all failures by identifying and injecting the root-cause faults at the right time.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"if-at-first-you-don-t-succeed-try-try-again-insights-and-llm-informed-tooling-for-detecting-retry-bugs-in-software-systems\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/if-at-first-you-dont-succeed-try-try-again-insights-and-llm-informed-tooling-for-detecting-retry-bugs-in-software-systems\/\">If At First You Don\u2019t Succeed, Try, Try, Again\u2026? Insights and LLM-informed Tooling for Detecting Retry Bugs in Software Systems<\/a><\/h3>\n\n\n\n<p><em>Bogdan Alexandru Stoica<\/em>, <em>Utsav Sethi<\/em>, <em>Yiming Su<\/em>, <em>Cyrus Zhou<\/em>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shanlu\"><em>Shan Lu<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jonathanmace\/\"><em>Jonathan Mace<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/madanm\/\"><em>Madan Musuvathi<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sumann\/\"><em>Suman Nath<\/em><\/a><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"804\" height=\"230\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH.jpg\" alt=\"A workflow that repurposes existing unit tests to expose retry-related bugs in system software. \" class=\"wp-image-1099194\" style=\"width:700px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH.jpg 804w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH-300x86.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH-768x220.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH-800x230.jpg 800w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/If-at-First-You-Dont-Succeed_AH-240x69.jpg 240w\" sizes=\"auto, (max-width: 804px) 100vw, 804px\" \/><\/figure>\n\n\n\n<p><em>Retry<\/em>\u2014the re-execution command used when a task fails\u2014is commonly employed to build resilient software systems, but implementing and testing it in modern systems is challenging. Based on real-world <em>retry<\/em> issues, the authors introduce a suite of static and dynamic techniques to detect <em>retry<\/em> problems. They found that the ad-hoc nature of <em>retry<\/em> implementation complicates traditional program analysis but that large language models (LLMs) can address these issues effectively. The research also demonstrates that repurposing unit tests, combined with fault injection, can reveal various <em>retry<\/em> issues.<\/p>\n\n\n\n<p>Listen to research manager\u202fShan Lu and PhD candidate Bogdan Stoica discuss this work in a recent <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/abstracts-november-4-2024\/\" target=\"_blank\" rel=\"noreferrer noopener\">podcast episode<\/a>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"scaling-deep-learning-computation-over-the-inter-core-connected-intelligence-processor-with-t10yiqi-liu-yuqi-xue-yu-cheng-lingxiao-ma-ziming-miao-jilong-xue-jian-huang\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/scaling-deep-learning-computation-over-the-inter-core-connected-intelligence-processor\/\">Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10<\/a><\/h3>\n\n\n\n<p><em>Yiqi Liu, Yuqi Xue, Yu Cheng, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/lingm\/\"><em>Lingxiao Ma<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/zimiao\/\"><em>Ziming Miao<\/em><\/a><em>, Jilong Xue, Jian Huang<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1165\" height=\"180\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH.jpg\" alt=\"An example that maps a MatMul operator to two cores with the compute-shift style execution. Both (b) and (c) are valid compute-shift execution plans, but with different tradeoffs between memory footprint and communication overhead.\" class=\"wp-image-1099197\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH.jpg 1165w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH-300x46.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH-1024x158.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH-768x119.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Scaling-Deep-Learning-Computation_AH-240x37.jpg 240w\" sizes=\"auto, (max-width: 1165px) 100vw, 1165px\" \/><\/figure>\n\n\n\n<p>Despite advances in AI chips that enable high-bandwidth and low-latency inter-core memory access, deep learning compilers lack support for scalable inter-core connections, limiting their potential. To address this, the authors introduce T10, an end-to-end deep learning compiler to take advantage of inter-core communication and distributed on-chip memory. T10 introduces a distributed tensor abstraction, rTensor, and maps the computation and communication of tensor operators with a generalized compute-shift pattern to cores. T10 optimizes on-chip memory consumption and inter-core communication overhead, selecting the best execution plan.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"silvanforge-a-schedule-guided-retargetable-compiler-for-decision-tree-inference\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/silvanforge-a-schedule-guided-retargetable-compiler-for-decision-tree-inference\/\"><u>SilvanForge: A Schedule-Guided Retargetable Compiler for Decision Tree Inference<\/u><\/a><\/h3>\n\n\n\n<p><em>Ashwin Prasad<\/em>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/srajendra\"><em>Sampath Rajendra<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/krajan\/\"><em>Kaushik Rajan<\/em><\/a><em>, R Govindarajan, Uday Bondhugula<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1780\" height=\"871\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH.jpg\" alt=\"The figure provides an overview of schedule-guided compilation with SilvanForge. Compilation occurs through multiple intermediate representations (IR), with optimizations at each level and transitions from one level to the next guided by a schedule.\" class=\"wp-image-1099203\" style=\"width:646px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH.jpg 1780w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH-300x147.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH-1024x501.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH-768x376.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH-1536x752.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SilvanForge_AH-240x117.jpg 240w\" sizes=\"auto, (max-width: 1780px) 100vw, 1780px\" \/><\/figure>\n\n\n\n<p>SilvanForge is a schedule-guided retargetable compiler for decision tree-based models that explores various optimization options and automatically generates high-performance inference routines for CPUs and GPUs. It consists of two core components: a scheduling language to efficiently explore the optimization space, and a retargetable compiler that generates code for any specified schedule. By utilizing different data layouts, loop structures, and caching strategies, SilvanForge achieves portable performance across multiple hardware targets.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"uncovering-nested-data-parallelism-and-data-reuse-in-dnn-computation-with-fractaltensor\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/uncovering-nested-data-parallelism-and-data-reuse-in-dnn-computation-with-fractaltensor\/\">Uncovering Nested Data Parallelism and Data Reuse in DNN Computation with FractalTensor<\/a><\/h3>\n\n\n\n<p><em>Siran Liu, Chengxiang Qi, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/yincao\/\"><em>Ying Cao<\/em><\/a><em>, Chao Yang, Weifang Hu, Xuanhua Shi, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/fanyang\/\"><em>Fan Yang<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/maoyang\/\"><em>Mao Yang<\/em><\/a><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"932\" height=\"434\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Slide1_AH.jpg\" alt=\"FractalTensor enhances DAG programming for DNNs with a list-based ADT, allowing innovative DNN algorithmic ideas to be naturally expressed, their parallelism analyzed by the compiler, and subsequently translated into tile processing.\" class=\"wp-image-1099206\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Slide1_AH.jpg 932w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Slide1_AH-300x140.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Slide1_AH-768x358.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Slide1_AH-240x112.jpg 240w\" sizes=\"auto, (max-width: 932px) 100vw, 932px\" \/><\/figure>\n\n\n\n<p>Deep neural networks (DNNs) often use highly optimized tensor operators to speed up computation, but these operators are typically defined empirically, limiting cross-operator analysis and optimization. FractalTensor addresses this by introducing a nested list-based abstract data type (ADT), where each element is either a tensor with a static shape or another FractalTensor. DNNs are then defined using high-order compute operators like map\/reduce\/scan and data access operators like window\/stride, explicitly exposing nested data parallelism and fine-grained access patterns. This approach enables entire program analysis and optimization. This paper will only be available in the SOSP 2024 proceedings.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"unearthing-semantic-checks-for-cloud-infrastructure-as-code-programs\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/unearthing-semantic-checks-for-cloud-infrastructure-as-code-programs\">Unearthing Semantic Checks for Cloud Infrastructure-as-Code Programs<\/a><\/h3>\n\n\n\n<p><em>Yiming Qiu, Patrick Tser Jern Kon, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/rybecket\/\"><em>Ryan Beckett<\/em><\/a><em>, Ang Chen<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"634\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-1024x634.jpg\" alt=\"The overall workflow of Zodiac. It starts by ingesting IaC repositories crawled from online sources. Based on a curated set of check templates using a semantic knowl\u0002edge base (KB) and a specification language, it generates a set of hypothesized checks. Next, it performs statistical filtering and interpolation to reduce false positives and fill in missing details with the help of LLMs. The hypothesized checks are then fed into the validation phase. For each such check, Zodiac identifies conforming instances that could be used as positive test cases, and it further mutates them to obtain corresponding negative test cases. A check is validated if the positive test case succeeds to deploy but its negative counterpart does not. To further resolve conflicts across different checks, Zodiac plans the order of negative test case generation and deployment via a validation scheduler.\" class=\"wp-image-1099209\" style=\"width:530px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-1024x634.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-300x186.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-768x476.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-1536x951.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH-240x149.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Unearthing-Semantic-Checks-for-Clouds_AH.jpg 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>This research introduces Zodiac, a tool that uses semantic-guided mining and deployment-based validation pipelines to automatically uncover cloud IaC semantic checks. When applied to Microsoft Azure resources, Zodiac identified over 400 semantic checks that would cause deployment failures if violated, demonstrating its ability to detect cloud requirements that are difficult for state-of-the-art IaC tools to find.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"paper-and-tutorial\">Paper and tutorial<\/h2>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"verus-a-practical-foundation-for-systems-verification\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/verus-a-practical-foundation-for-systems-verification\/\">Verus: A Practical Foundation for Systems Verification<\/a><\/h3>\n\n\n\n<p>Distinguished Artifact Award<br><em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrishaw\/\">Chris Hawblitzel<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/lorch\/\">Jay Lorch<\/a><\/em><\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"995\" height=\"515\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Verus_AH.jpg\" alt=\"Overview diagram of Verus. Verus offers powerful automated and semi-automated reasoning techniques that apply to the full system stack, as well as reasoning techniques tuned to specific stack levels. \" class=\"wp-image-1099212\" style=\"width:606px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Verus_AH.jpg 995w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Verus_AH-300x155.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Verus_AH-768x398.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Verus_AH-240x124.jpg 240w\" sizes=\"auto, (max-width: 995px) 100vw, 995px\" \/><\/figure>\n\n\n\n<p>This work presents an updated version of Verus, a tool that accelerates and simplifies formal verification of system software. It builds on previous advances for faster and more cost-effective verification of complex properties in realistic systems and now verifies code up to 61 times faster than existing methods. It has been evaluated on various systems encompassing 6,100 lines of code and 31,000 lines of proof. Verus is ready for broader adoption by developers using Rust who want to create more robust systems.<\/p>\n\n\n\n<p>Listen to principal researchers Chris Hawblitzel and Jay Lorch discuss this work in a recent <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/abstracts-november-5-2024\/\">podcast episode<\/a>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"workshops\">Workshops<\/h2>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"hot-topics-in-system-infrastructure\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/hotinfra24.github.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hot Topics in System Infrastructure<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/h3>\n\n\n\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.linkedin.com\/in\/%C3%AD%C3%B1igo-goiri-025a6525\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em>Inigo Goiri<\/em><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><em>, <\/em><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/pzardoshti\"><em>Pantea Zardoshti<\/em><\/a><\/p>\n\n\n\n<p>Researchers and engineers share recent findings and experiences while exploring new challenges and opportunities in building next-generation infrastructures, including AI, sustainable datacenters, and edge and cloud computing. Topics span the entire system stack, focusing on design and implementation, hardware architecture, operating systems, runtimes, and applications.<\/p>\n\n\n\n<h3 class=\"wp-block-heading h4\" id=\"practical-adoption-challenges-of-ml-for-systems\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/sites.google.com\/view\/pacmi\/home\" target=\"_blank\" rel=\"noopener noreferrer\">Practical Adoption Challenges of ML for Systems<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/h3>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chetanb\/\"><em>Chetan Bansal<\/em><\/a><\/p>\n\n\n\n<p>Despite significant progress in machine learning, deploying it in computer systems is rare due to non-machine learning challenges like feature stability, reliability, and availability. This workshop brings together researchers from academia and industry communities to foster communication and collaboration in addressing these practical issues and aligning research with real-world system deployment needs.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dots\"\/>\n\n\n\n<p>Discover how interdisciplinary systems research is <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/event\/sosp-2024\/systems-at-microsoft\/\">driving innovation at Microsoft<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Building resilient systems, scaling deep learning computation, and reproducing failures in production are just some of the ways Microsoft researchers are advancing the state of the art in computer systems research at SOSP 2024.<\/p>\n","protected":false},"author":43518,"featured_media":1099248,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Tanakorn Leesatapornwongsa","user_id":"38754"},{"type":"user_nicename","value":"Suman Nath","user_id":"33753"},{"type":"user_nicename","value":"Shan Lu","user_id":"43215"},{"type":"user_nicename","value":"Jonathan Mace","user_id":"42546"},{"type":"user_nicename","value":"Madan Musuvathi","user_id":"32766"},{"type":"user_nicename","value":"Lingxiao Ma","user_id":"39769"},{"type":"user_nicename","value":"Ziming Miao","user_id":"42249"},{"type":"user_nicename","value":"Sampath Rajendra","user_id":"43107"},{"type":"user_nicename","value":"Kaushik Rajan","user_id":"32574"},{"type":"user_nicename","value":"Ying Cao","user_id":"37571"},{"type":"user_nicename","value":"Fan Yang","user_id":"31782"},{"type":"user_nicename","value":"Mao Yang","user_id":"32798"},{"type":"user_nicename","value":"Ryan Beckett","user_id":"37775"},{"type":"user_nicename","value":"Chris Hawblitzel","user_id":"31425"},{"type":"user_nicename","value":"Jay Lorch","user_id":"32732"},{"type":"user_nicename","value":"Pantea Zardoshti","user_id":"40717"},{"type":"user_nicename","value":"Chetan Bansal","user_id":"31394"}],"msr_hide_image_in_river":null,"footnotes":""},"categories":[1],"tags":[],"research-area":[13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[269148,243984,269142],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1099176","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-systems-and-networking","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-blog-homepage-featured","msr-post-option-include-in-river"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144927,920058],"related-projects":[554055],"related-events":[1073397],"related-researchers":[{"type":"user_nicename","value":"Tanakorn Leesatapornwongsa","user_id":38754,"display_name":"Tanakorn Leesatapornwongsa","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/taleesat\/\" aria-label=\"Visit the profile page for Tanakorn Leesatapornwongsa\">Tanakorn Leesatapornwongsa<\/a>","is_active":false,"last_first":"Leesatapornwongsa, Tanakorn","people_section":0,"alias":"taleesat"},{"type":"user_nicename","value":"Suman Nath","user_id":33753,"display_name":"Suman Nath","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sumann\/\" aria-label=\"Visit the profile page for Suman Nath\">Suman Nath<\/a>","is_active":false,"last_first":"Nath, Suman","people_section":0,"alias":"sumann"},{"type":"user_nicename","value":"Shan Lu","user_id":43215,"display_name":"Shan Lu","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shanlu\/\" aria-label=\"Visit the profile page for Shan Lu\">Shan Lu<\/a>","is_active":false,"last_first":"Lu, Shan","people_section":0,"alias":"shanlu"},{"type":"user_nicename","value":"Jonathan Mace","user_id":42546,"display_name":"Jonathan Mace","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jonathanmace\/\" aria-label=\"Visit the profile page for Jonathan Mace\">Jonathan Mace<\/a>","is_active":false,"last_first":"Mace, Jonathan","people_section":0,"alias":"jonathanmace"},{"type":"user_nicename","value":"Madan Musuvathi","user_id":32766,"display_name":"Madan Musuvathi","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/madanm\/\" aria-label=\"Visit the profile page for Madan Musuvathi\">Madan Musuvathi<\/a>","is_active":false,"last_first":"Musuvathi, Madan","people_section":0,"alias":"madanm"},{"type":"user_nicename","value":"Sampath Rajendra","user_id":43107,"display_name":"Sampath Rajendra","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/srajendra\/\" aria-label=\"Visit the profile page for Sampath Rajendra\">Sampath Rajendra<\/a>","is_active":false,"last_first":"Rajendra, Sampath","people_section":0,"alias":"srajendra"},{"type":"user_nicename","value":"Kaushik Rajan","user_id":32574,"display_name":"Kaushik Rajan","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/krajan\/\" aria-label=\"Visit the profile page for Kaushik Rajan\">Kaushik Rajan<\/a>","is_active":false,"last_first":"Rajan, Kaushik","people_section":0,"alias":"krajan"},{"type":"user_nicename","value":"Fan Yang","user_id":31782,"display_name":"Fan Yang","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/fanyang\/\" aria-label=\"Visit the profile page for Fan Yang\">Fan Yang<\/a>","is_active":false,"last_first":"Yang, Fan","people_section":0,"alias":"fanyang"},{"type":"user_nicename","value":"Ryan Beckett","user_id":37775,"display_name":"Ryan Beckett","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/rybecket\/\" aria-label=\"Visit the profile page for Ryan Beckett\">Ryan Beckett<\/a>","is_active":false,"last_first":"Beckett, Ryan","people_section":0,"alias":"rybecket"},{"type":"user_nicename","value":"Chris Hawblitzel","user_id":31425,"display_name":"Chris Hawblitzel","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrishaw\/\" aria-label=\"Visit the profile page for Chris Hawblitzel\">Chris Hawblitzel<\/a>","is_active":false,"last_first":"Hawblitzel, Chris","people_section":0,"alias":"chrishaw"},{"type":"user_nicename","value":"Jay Lorch","user_id":32732,"display_name":"Jay Lorch","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/lorch\/\" aria-label=\"Visit the profile page for Jay Lorch\">Jay Lorch<\/a>","is_active":false,"last_first":"Lorch, Jay","people_section":0,"alias":"lorch"},{"type":"user_nicename","value":"Chetan Bansal","user_id":31394,"display_name":"Chetan Bansal","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chetanb\/\" aria-label=\"Visit the profile page for Chetan Bansal\">Chetan Bansal<\/a>","is_active":false,"last_first":"Bansal, Chetan","people_section":0,"alias":"chetanb"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"SOSP 2024 on a blue and green gradient background\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/SOSP-2024-Conference-round-up-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"November 4, 2024","formattedExcerpt":"Building resilient systems, scaling deep learning computation, and reproducing failures in production are just some of the ways Microsoft researchers are advancing the state of the art in computer systems research at SOSP 2024.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1099176","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/43518"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1099176"}],"version-history":[{"count":30,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1099176\/revisions"}],"predecessor-version":[{"id":1101951,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1099176\/revisions\/1101951"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1099248"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1099176"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1099176"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1099176"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1099176"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1099176"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1099176"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1099176"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1099176"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1099176"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1099176"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1099176"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}