{"id":1098390,"date":"2024-11-04T09:30:00","date_gmt":"2024-11-04T17:30:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1098390"},"modified":"2024-12-05T08:10:05","modified_gmt":"2024-12-05T16:10:05","slug":"abstracts-november-4-2024","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/abstracts-november-4-2024\/","title":{"rendered":"Abstracts: November 4, 2024"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788.jpg\" alt=\"Outlined illustrations of Shan Lu and Bogdan Stoica for the Microsoft Research Podcast.\" class=\"wp-image-1098543\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-240x135.jpg 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/10\/Shan-and-Bogdan_Abstracts_Hero_Feature_No_Text_1400x788-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n<div class=\"wp-block-msr-podcast-container my-4\">\n\t<iframe loading=\"lazy\" src=\"https:\/\/player.blubrry.com\/?podcast_id=138107027&modern=1\" class=\"podcast-player\" frameborder=\"0\" height=\"164px\" width=\"100%\" scrolling=\"no\" title=\"Podcast Player\"><\/iframe>\n<\/div>\n\n\n\n<p>Members of the research community at Microsoft work continuously to advance their respective fields.&nbsp;<em>Abstracts<\/em>&nbsp;brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.<\/p>\n\n\n\n<p>In this episode, Senior Principal Research Manager&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shanlu\/\">Shan Lu<\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/bastoica.github.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">Bogdan Stoica<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, a PhD candidate at the University of Chicago, join host Gretchen Huizinga to discuss \u201c<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/if-at-first-you-dont-succeed-try-try-again-insights-and-llm-informed-tooling-for-detecting-retry-bugs-in-software-systems\/?msockid=35739e94ab6c69d41b738b93aa076831\">If At First You Don\u2019t Succeed, Try, Try, Again &#8230; ? 
Insights and LLM-informed Tooling for Detecting Retry Bugs in Software Systems<\/a>.\u201d In the paper, which was accepted at this year\u2019s Symposium on Operating Systems Principles, or SOSP, Lu, Stoica, and their coauthors examine typical retry issues and present techniques that leverage traditional program analysis and large language models to help detect them.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/if-at-first-you-dont-succeed-try-try-again-insights-and-llm-informed-tooling-for-detecting-retry-bugs-in-software-systems\/\">Read the paper<\/a><\/div>\n<\/div>\n\n\n\n<section class=\"wp-block-msr-subscribe-to-podcast subscribe-to-podcast\">\n\t<div class=\"subscribe-to-podcast__inner border-top border-bottom border-width-2\">\n\t\t<h2 class=\"h5 subscribe-to-podcast__heading\">\n\t\t\tSubscribe to the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/podcast\">Microsoft Research Podcast<\/a>:\t\t<\/h2>\n\t\t<ul class=\"subscribe-to-podcast__list list-unstyled\">\n\t\t\t\t\t\t\t<li class=\"subscribe-to-podcast__list-item\">\n\t\t\t\t\t<a class=\"subscribe-to-podcast__link\" href=\"https:\/\/itunes.apple.com\/us\/podcast\/microsoft-research-a-podcast\/id1318021537?mt=2\" target=\"_blank\" rel=\"noreferrer noopener\">\n\t\t\t\t\t\t<svg class=\"subscribe-to-podcast__svg\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" fill=\"black\" viewBox=\"0 0 32 32\">  <path d=\"M7.12 0c-3.937-0.011-7.131 3.183-7.12 7.12v17.76c-0.011 3.937 3.183 7.131 7.12 7.12h17.76c3.937 0.011 7.131-3.183 7.12-7.12v-17.76c0.011-3.937-3.183-7.131-7.12-7.12zM15.817 3.421c3.115 0 5.932 1.204 8.079 3.453 1.631 1.693 2.547 3.489 3.016 5.855 0.161 0.787 0.161 2.932 0.009 3.817-0.5 2.817-2.041 5.339-4.317 7.063-0.812 0.615-2.797 1.683-3.115 1.683-0.12 0-0.129-0.12-0.077-0.615 
0.099-0.792 0.192-0.953 0.64-1.141 0.713-0.296 1.932-1.167 2.677-1.911 1.301-1.303 2.229-2.932 2.677-4.719 0.281-1.1 0.244-3.543-0.063-4.672-0.969-3.595-3.907-6.385-7.5-7.136-1.041-0.213-2.943-0.213-4 0-3.636 0.751-6.647 3.683-7.563 7.371-0.245 1.004-0.245 3.448 0 4.448 0.609 2.443 2.188 4.681 4.255 6.015 0.407 0.271 0.896 0.547 1.1 0.631 0.447 0.192 0.547 0.355 0.629 1.14 0.052 0.485 0.041 0.62-0.072 0.62-0.073 0-0.62-0.235-1.199-0.511l-0.052-0.041c-3.297-1.62-5.407-4.364-6.177-8.016-0.187-0.943-0.224-3.187-0.036-4.052 0.479-2.323 1.396-4.135 2.921-5.739 2.199-2.319 5.027-3.543 8.172-3.543zM16 7.172c0.541 0.005 1.068 0.052 1.473 0.14 3.715 0.828 6.344 4.543 5.833 8.229-0.203 1.489-0.713 2.709-1.619 3.844-0.448 0.573-1.537 1.532-1.729 1.532-0.032 0-0.063-0.365-0.063-0.803v-0.808l0.552-0.661c2.093-2.505 1.943-6.005-0.339-8.296-0.885-0.896-1.912-1.423-3.235-1.661-0.853-0.161-1.031-0.161-1.927-0.011-1.364 0.219-2.417 0.744-3.355 1.672-2.291 2.271-2.443 5.791-0.348 8.296l0.552 0.661v0.813c0 0.448-0.037 0.807-0.084 0.807-0.036 0-0.349-0.213-0.683-0.479l-0.047-0.016c-1.109-0.885-2.088-2.453-2.495-3.995-0.244-0.932-0.244-2.697 0.011-3.625 0.672-2.505 2.521-4.448 5.079-5.359 0.547-0.193 1.509-0.297 2.416-0.281zM15.823 11.156c0.417 0 0.828 0.084 1.131 0.24 0.645 0.339 1.183 0.989 1.385 1.677 0.62 2.104-1.609 3.948-3.631 3.005h-0.015c-0.953-0.443-1.464-1.276-1.475-2.36 0-0.979 0.541-1.828 1.484-2.328 0.297-0.156 0.709-0.235 1.125-0.235zM15.812 17.464c1.319-0.005 2.271 0.463 2.625 1.291 0.265 0.62 0.167 2.573-0.292 5.735-0.307 2.208-0.479 2.765-0.905 3.141-0.589 0.52-1.417 0.667-2.209 0.385h-0.004c-0.953-0.344-1.157-0.808-1.553-3.527-0.452-3.161-0.552-5.115-0.285-5.735 0.348-0.823 1.296-1.285 2.624-1.291z\"\/><\/svg>\n\t\t\t\t\t\t<span class=\"subscribe-to-podcast__link-text\">Apple Podcasts<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/li>\n\t\t\t\n\t\t\t\t\t\t\t<li class=\"subscribe-to-podcast__list-item\">\n\t\t\t\t\t<a class=\"subscribe-to-podcast__link\" 
href=\"https:\/\/subscribebyemail.com\/www.blubrry.com\/feeds\/microsoftresearch.xml\" target=\"_blank\" rel=\"noreferrer noopener\">\n\t\t\t\t\t\t<svg class=\"subscribe-to-podcast__svg\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" fill=\"none\" viewBox=\"0 0 32 32\"><path fill=\"currentColor\" d=\"M6.4 6a2.392 2.392 0 00-2.372 2.119L16 15.6l11.972-7.481A2.392 2.392 0 0025.6 6H6.4zM4 10.502V22.8a2.4 2.4 0 002.4 2.4h19.2a2.4 2.4 0 002.4-2.4V10.502l-11.365 7.102a1.2 1.2 0 01-1.27 0L4 10.502z\"\/><\/svg>\n\t\t\t\t\t\t<span class=\"subscribe-to-podcast__link-text\">Email<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/li>\n\t\t\t\n\t\t\t\t\t\t\t<li class=\"subscribe-to-podcast__list-item\">\n\t\t\t\t\t<a class=\"subscribe-to-podcast__link\" href=\"https:\/\/subscribeonandroid.com\/www.blubrry.com\/feeds\/microsoftresearch.xml\" target=\"_blank\" rel=\"noreferrer noopener\">\n\t\t\t\t\t\t<svg class=\"subscribe-to-podcast__svg\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" fill=\"none\" viewBox=\"0 0 32 32\"><path fill=\"currentColor\" d=\"M12.414 4.02c-.062.012-.126.023-.18.06a.489.489 0 00-.12.675L13.149 6.3c-1.6.847-2.792 2.255-3.18 3.944h13.257c-.388-1.69-1.58-3.097-3.179-3.944l1.035-1.545a.489.489 0 00-.12-.675.492.492 0 00-.675.135l-1.14 1.68a7.423 7.423 0 00-2.55-.45c-.899 0-1.758.161-2.549.45l-1.14-1.68a.482.482 0 00-.494-.195zm1.545 3.824a.72.72 0 110 1.44.72.72 0 010-1.44zm5.278 0a.719.719 0 110 1.44.719.719 0 110-1.44zM8.44 11.204A1.44 1.44 0 007 12.644v6.718c0 .795.645 1.44 1.44 1.44.168 0 .33-.036.48-.09v-9.418a1.406 1.406 0 00-.48-.09zm1.44 0V21.76c0 .793.646 1.44 1.44 1.44h10.557c.793 0 1.44-.647 1.44-1.44V11.204H9.878zm14.876 0c-.169 0-.33.035-.48.09v9.418c.15.052.311.09.48.09a1.44 1.44 0 001.44-1.44v-6.719a1.44 1.44 0 00-1.44-1.44zM11.8 24.16v1.92a1.92 1.92 0 003.84 0v-1.92h-3.84zm5.759 0v1.92a1.92 1.92 0 003.84 0v-1.92h-3.84z\"\/><\/svg>\n\t\t\t\t\t\t<span 
class=\"subscribe-to-podcast__link-text\">Android<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/li>\n\t\t\t\n\t\t\t\t\t\t\t<li class=\"subscribe-to-podcast__list-item\">\n\t\t\t\t\t<a class=\"subscribe-to-podcast__link\" href=\"https:\/\/open.spotify.com\/show\/4ndjUXyL0hH1FXHgwIiTWU\" target=\"_blank\" rel=\"noreferrer noopener\">\n\t\t\t\t\t\t<svg class=\"subscribe-to-podcast__svg\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" fill=\"none\" viewBox=\"0 0 32 32\"><path fill=\"currentColor\" d=\"M16 4C9.383 4 4 9.383 4 16s5.383 12 12 12 12-5.383 12-12S22.617 4 16 4zm5.08 17.394a.781.781 0 01-1.086.217c-1.29-.86-3.477-1.434-5.303-1.434-1.937.002-3.389.477-3.403.482a.782.782 0 11-.494-1.484c.068-.023 1.71-.56 3.897-.562 1.826 0 4.365.492 6.171 1.696.36.24.457.725.217 1.085zm1.56-3.202a.895.895 0 01-1.234.286c-2.338-1.457-4.742-1.766-6.812-1.747-2.338.02-4.207.466-4.239.476a.895.895 0 11-.488-1.723c.145-.041 2.01-.5 4.564-.521 2.329-.02 5.23.318 7.923 1.995.419.26.547.814.286 1.234zm1.556-3.745a1.043 1.043 0 01-1.428.371c-2.725-1.6-6.039-1.94-8.339-1.942h-.033c-2.781 0-4.923.489-4.944.494a1.044 1.044 0 01-.474-2.031c.096-.023 2.385-.55 5.418-.55h.036c2.558.004 6.264.393 9.393 2.23.497.292.663.931.371 1.428z\"\/><\/svg>\n\t\t\t\t\t\t<span class=\"subscribe-to-podcast__link-text\">Spotify<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/li>\n\t\t\t\n\t\t\t\t\t\t\t<li class=\"subscribe-to-podcast__list-item\">\n\t\t\t\t\t<a class=\"subscribe-to-podcast__link\" href=\"https:\/\/www.blubrry.com\/feeds\/microsoftresearch.xml\" target=\"_blank\" rel=\"noreferrer noopener\">\n\t\t\t\t\t\t<svg class=\"subscribe-to-podcast__svg\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" fill=\"none\" viewBox=\"0 0 32 32\"><path fill=\"currentColor\" d=\"M6.667 4a2.676 2.676 0 00-2.612 2.13v.003c-.036.172-.055.35-.055.534v18.666c0 .183.019.362.055.534v.003a2.676 2.676 0 002.076 2.075h.002c.172.036.35.055.534.055h18.666A2.676 2.676 0 0028 25.333V6.667a2.676 2.676 0 00-2.13-2.612h-.003A2.623 2.623 0 0025.333 4H6.667zM8 
8h1.333C17.42 8 24 14.58 24 22.667V24h-2.667v-1.333c0-6.618-5.382-12-12-12H8V8zm0 5.333h1.333c5.146 0 9.334 4.188 9.334 9.334V24H16v-1.333A6.674 6.674 0 009.333 16H8v-2.667zM10 20a2 2 0 11-.001 4.001A2 2 0 0110 20z\"\/><\/svg>\n\t\t\t\t\t\t<span class=\"subscribe-to-podcast__link-text\">RSS Feed<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/li>\n\t\t\t\t\t<\/ul>\n\t<\/div>\n<\/section>\n\n\n<div class=\"wp-block-msr-show-more\">\n\t<div class=\"bg-neutral-100 p-5\">\n\t\t<div class=\"show-more-show-less\">\n\t\t\t<div>\n\t\t\t\t<span>\n\t\t\t\t\t\n\n<h2 class=\"wp-block-heading\" id=\"transcript-1\">Transcript<\/h2>\n\n\n\n<p>[MUSIC]<\/p>\n\n\n\n<p><strong>GRETCHEN HUIZINGA:<\/strong> Welcome to <em>Abstracts<\/em>, a Microsoft Research Podcast that puts the spotlight on world-class research in brief. I\u2019m Dr. Gretchen Huizinga. In this series, members of the research community at Microsoft give us a quick snapshot\u2014or a <em>podcast abstract<\/em>\u2014of their new and noteworthy papers.<\/p>\n\n\n\n<p>[MUSIC FADES]<\/p>\n\n\n\n<p>Today I&#8217;m talking to Dr. Shan Lu, a senior principal research manager at Microsoft Research, and Bogdan Stoica, also known as Bo, a doctoral candidate in computer science at the University of Chicago. Shan and Bogdan are coauthors of a paper called \u201cIf at First You Don&#8217;t Succeed, Try, Try, Again \u2026? Insights and LLM-informed Tooling for Detecting Retry Bugs in Software Systems.\u201d And this paper was presented at this year&#8217;s Symposium on Operating Systems Principles, or SOSP. Shan and Bo, thanks for joining us on <em>Abstracts<\/em> today!<\/p>\n\n\n\n\t\t\t\t<\/span>\n\t\t\t\t<span id=\"show-more-show-less-toggle-1\" class=\"show-more-show-less-toggleable-content\">\n\t\t\t\t\t\n\n\n\n<p><strong>SHAN LU: <\/strong>Thank you.<\/p>\n\n\n\n<p><strong>BOGDAN STOICA: <\/strong>Thanks for having us.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Shan, let&#8217;s kick things off with you. 
Give us a brief overview of your paper. What problem or issue does it address, and why should we care about it?<\/p>\n\n\n\n<p><strong>LU:<\/strong> Yeah, so basically from the title, we are looking at retry bugs in software systems. So what retry means is that people may not realize for big software like the ones that run in Microsoft, all kinds of unexpected failures\u2014software failure, hardware failure\u2014may happen. So just to make our software system robust, there&#8217;s often a <em>retry<\/em> mechanism built in. So if something unexpected happens, a task, a request, a job will be re-executed. And what this paper talks about is, it&#8217;s actually very difficult to implement this retry mechanism correctly. So in this paper, we do a study to understand what are typical retry problems and we offer a solution to detecting these problems.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Bo, this clearly isn&#8217;t a new problem. What research does your paper build on, and how does <em>your<\/em> research challenge or add to it?<\/p>\n\n\n\n<p><strong>STOICA:<\/strong> Right, so retry is a well-known mechanism and is widely used. And retry bugs, in particular, have been identified in other papers as root causes for all sorts of failures but never have been studied as a standalone class of bugs. And what I mean by that, nobody looked into, why is it so difficult to implement retry? What are the symptoms that occur when you <em>don&#8217;t<\/em> implement retry correctly? What are the causes of why developers struggle to implement retry correctly? We built on a few key bug-finding ideas that have been looked at by other papers but never in this context. We use fault injection. We repurpose existing unit tests to trigger this type of bugs as opposed to asking developers to write specialized tests to trigger retry bugs. So we\u2019re, kind of, making the developer&#8217;s job easier in a sense. 
And in this pipeline, we also rely on large language models to augment the program and the code analysis that goes behind the fault injection and the reutilization of existing tests.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Have large language models not been utilized much in this arena?<\/p>\n\n\n\n<p><strong>LU:<\/strong> I want to say that, you know, actually this work was started about two years ago. And at that time, large language model was really in its infancy and people just started exploring what large language model can help us in terms of improving software reliability. And our group, and together with, you know, actually same set of authors from Microsoft Research, we actually did some of the first things in a <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/hotgpt-how-to-make-software-documentation-more-useful-with-a-large-language-model\/\">workshop paper<\/a> just to see what kind of things that we were able to do before like, you know, finding bugs can now be replicated by using large language model.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> OK \u2026<\/p>\n\n\n\n<p><strong>LU: <\/strong>But at that time, we were not very happy because, you know, just use large language model to do something people were able to do using traditional program analysis, I mean, it seems cool, right, but does not add new functionality. So I would say what is new, at least when we started this project, is we were really thinking, hey, are there anything, right, are there some program analysis, are there some bug finding that we were not able to do using traditional program analysis but actually can be <em>enabled<\/em> by large language model.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Gotcha \u2026<\/p>\n\n\n\n<p><strong>LU: <\/strong>And so that was at, you know, what I feel like was novel at least, you know, when we worked on this. But of course, you know, large language model is a field that is moving so fast. 
People are, you know, finding new ways to using it every day. So yeah.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Right. Well, in your paper, you say that retry functionality is commonly <em>under<\/em>tested and thus prone to problems slipping into production. Why would it be undertested if it&#8217;s such a problem?<\/p>\n\n\n\n<p><strong>STOICA:<\/strong> So testing retry is difficult because what you need is to simulate the systemwide conditions that lead to retry. That often means simulating external transient errors that might happen on the system that runs your application. And to do this during testing and capture this in a small unit test is difficult.<\/p>\n\n\n\n<p><strong>LU:<\/strong> I think, actually, Bogdan said this very well. It&#8217;s like, why do we need a retry? It&#8217;s, like, when unexpected failure happen, right. And this is, like, something like Bogdan mentioned, like external transient error such as my network card suddenly does not work, right. And this may occur, you know, only for, say, one second and then it goes back on. But this one second may cause some job to fail and need retry. So during normal testing, these kind of unexpected things rarely, rarely happen, if at all, and it&#8217;s also difficult to simulate. That&#8217;s why it&#8217;s just not well tested.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Well, Shan, let&#8217;s talk about methodology. Talk a bit about how you tackled this work and why you chose the approach you did for this particular problem.<\/p>\n\n\n\n<p><strong>LU:<\/strong> Yeah, so I think this work includes two parts. One is a systematic study. We study several big open-source systems to see whether there are retry-related problems in this real system. Of course there are. And then we did a very systematic categorization to understand the common characteristics. And the second part is about, you know, detecting. 
And in terms of method, we have used, particularly in the detecting part, we actually used a hybrid of techniques of traditional static program analysis. We used this large language model-enabled program analysis. In this case, imagine we just asked a large language model saying, hey, tell us, are there any retry implemented in this code? If there is, where it is, right. And then we also use, as Bogdan mentioned, we repurposed unit test to help us to execute, you know, the part of code that large language model tell us there may be a retry. And in addition to that, we also used fault injection, which means we simulate those transient, external, environmental failures such as network failures that very rarely would occur by itself.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Well, Bo, I love the part in every paper where the researchers say, \u201cAnd what we found was &#8230;\u201d So tell us, what did you find?<\/p>\n\n\n\n<p><strong>STOICA:<\/strong> Well, we found that implementing retry is difficult and complex! Not only find new bugs because, yes, that was kind of the end goal of the paper but also try to understand why these bugs are happening. As Shan mentioned, we started this project with a bug study. We looked at retry bugs across eight to 10 applications that are widely popular, widely used, and that the community is actively contributing to them. And the experiences of both users and developers, if we can condense that\u2014what do you think about retries?\u2014is that, yeah, they&#8217;re frustrated because it&#8217;s a simple mechanism, but there&#8217;s so many pitfalls that you have to be aware of. So I think that&#8217;s the biggest takeaway. Another takeaway is that when I was thinking about bug-finding tools, I was having this somewhat myopic view of, you know, you instrument at the program statement level, you figure out relationships between different lines of code and anti-patterns, and then you build your tools to find those anti-patterns. 
Well, with retry, this kind of gets thrown out the window because retry is a mechanism. It&#8217;s not just one line of code. It is multiple lines of code that span multiple functions, multiple methods, and multiple files. And you need to think about retry holistically to find these issues. And that&#8217;s one of the reasons we used large language models, because traditional static analysis or traditional program analysis cannot capture this. And, you know, large language models turns out to be actually great at this task, and we try to harness the, I would say, <em>fuzzy code comprehension<\/em> capabilities of large language models to help us find retry bugs.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Well, Shan, research findings are important, but real-world impact is the ultimate goal here. So who will this research help most and why?<\/p>\n\n\n\n<p><strong>LU:<\/strong> Yeah, that&#8217;s a great question. I would consider several groups of people. One is hopefully, you know, people who actually build, design real systems will find our study interesting. I hope it will resonate with them about those difficulties in implementing retry because we studied a set of systems and there was a little bit of comparison about how different retry mechanisms are actually used in different systems. And you can actually see that, you know, this different mechanism, you know, they have pros and cons, and we have a little bit of, you know, suggestion about what might be good practice. That&#8217;s the first group. The second group is, our tool actually did find, I would say, a relatively large number of retry problems in the latest version of every system we tried, and we find these problems, right, by repurposing existing unit tests. So I hope our tool will be used, you know, in the field by, you know, being maybe integrated with future unit testing so that our future system will become more robust. 
And I guess the third type of, you know, audience I feel like may benefit by reading our work, knowing our work: the people who are thinking about how to use large language model. And as I mentioned, I think a takeaway is large language model can repeat, can replace some of things we were able to do using traditional program analysis <em>and<\/em> it can do more, right, for those fuzzy code comprehension\u2013related things. Because for traditional program analysis, we need to precisely describe what I want. Like, oh, I need a loop. I need a WRITE statement, right. For large language model, it&#8217;s imprecise by nature, and that imprecision sometimes actually match with the type of things we&#8217;re looking for.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Interesting. Well, both of you have just, sort of, addressed nuggets of this research. And so the question that I normally ask now is, if there&#8217;s one thing you want our listeners to take away from the work, what would it be? So let&#8217;s give it a try and say, OK, in a sentence or less, if I&#8217;m reading this paper and it matters to me, what&#8217;s my big takeaway? What is my big \u201caha\u201d that this research helps me with?<\/p>\n\n\n\n<p><strong>STOICA:<\/strong> So the biggest takeaway of this paper is not to be afraid to integrate large language models in your bug-finding or testing pipelines. And I&#8217;m saying this knowing full well how imprecise large language models can be. But as long as you can trust but verify, as long as you have a way of checking what these models are outputting, you can effectively insert them into your testing framework. And I think this paper is showing one use case and bring us closer to, you know, having it integrated more ubiquitously.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Well, Shan, let&#8217;s finish up with ongoing research challenges and open questions in this field. I think you&#8217;ve both alluded to the difficulties that you face. 
Tell us what&#8217;s up next on your research agenda in this field.<\/p>\n\n\n\n<p><strong>LU:<\/strong> Yeah, so for me, personally, I mean, I learned a lot from this project and particularly this idea of leveraging large language model but also as a way to validate its result. I&#8217;m actually working on how to leverage large language model to verify the correctness of code, code that may be generated by large language model itself. So it&#8217;s not exactly, you know, a follow-up of this work, but I would say at idea, you know, philosophical level, it is something that is along this line of, you know, leverage large language model, leverage its creativity, leverage its \u2026 sometimes, you know \u2026 leverage its imprecision but has a way, you know, to control it, to verify it. That&#8217;s what I&#8217;m working on now.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Yeah \u2026 Bo, you&#8217;re finishing up your doctorate. What&#8217;s next on your agenda?<\/p>\n\n\n\n<p><strong>STOICA:<\/strong> So we&#8217;re thinking of, as Shan mentioned, exploring what large language models can do in this bug-finding\/testing arena further and harvesting their imprecision. I think there are a lot of great problems that traditional code analysis has tried to tackle, but it was difficult. So in that regard, we&#8217;re looking at performance issues and how large language models can help identify and diagnose those issues because my PhD was mostly focused, up until this point, on correctness. And I think performance inefficiencies are such a wider field and with a lot of exciting problems. 
And they do have this inherent imprecision and fuzziness to them that also large language models have, so I hope that combining the two imprecisions maybe gives us something a little bit more precise.<\/p>\n\n\n\n<p><strong>HUIZINGA:<\/strong> Well, this is important research and very, very interesting.<\/p>\n\n\n\n<p>[MUSIC]<\/p>\n\n\n\n<p>Shan Lu, Bogdan Stoica, thanks for joining us today. And to our listeners, thanks for tuning in. If you&#8217;re interested in learning more about this paper, you can find a link at aka.ms\/abstracts. And you can also find it on the SOSP website. See you next time on <em>Abstracts<\/em>!<\/p>\n\n\n\n<p>[MUSIC FADES]<\/p>\n\n\t\t\t\t<\/span>\n\t\t\t<\/div>\n\t\t\t<button\n\t\t\t\tclass=\"action-trigger glyph-prepend mt-2 mb-0 show-more-show-less-toggle\"\n\t\t\t\taria-expanded=\"false\"\n\t\t\t\tdata-show-less-text=\"Show less\"\n\t\t\t\ttype=\"button\"\n\t\t\t\taria-controls=\"show-more-show-less-toggle-1\"\n\t\t\t\taria-label=\"Show more content\"\n\t\t\t\tdata-alternate-aria-label=\"Show less content\">\n\t\t\t\tShow more\t\t\t<\/button>\n\t\t<\/div>\n\t<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In their 2024 SOSP paper, researchers explore a common\u2014though often undertested\u2014software system issue: retry bugs. 
Research manager\u00a0Shan Lu and PhD candidate Bogdan Stoica share how they\u2019re combining traditional program analysis and LLMs to address the challenge.<\/p>\n","protected":false},"author":43518,"featured_media":1109838,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"https:\/\/player.blubrry.com\/id\/138107027","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_hide_image_in_river":null,"footnotes":""},"categories":[240054],"tags":[],"research-area":[13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[269148,269142,243990],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[268128],"class_list":["post-1098390","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-msr-podcast","msr-research-area-systems-and-networking","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-include-in-river","msr-post-option-podcast-featured","msr-podcast-series-abstracts"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"https:\/\/player.blubrry.com\/id\/138107027","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144927,920058],"related-projects":[],"related-events":[1073397],"related-researchers":[{"type":"guest","value":"gretchen-huizinga-2","user_id":"444834","display_name":"Gretchen Huizinga","author_link":"<a href=\"https:\/\/www.linkedin.com\/in\/gretchen-huizinga-phd-3a4b2921?trk=people-guest_people_search-card\" aria-label=\"Visit the profile page for Gretchen Huizinga\">Gretchen Huizinga<\/a>","is_active":true,"last_first":"Huizinga, 
Gretchen","people_section":0,"alias":"gretchen-huizinga-2"},{"type":"user_nicename","value":"Shan Lu","user_id":43215,"display_name":"Shan Lu","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shanlu\/\" aria-label=\"Visit the profile page for Shan Lu\">Shan Lu<\/a>","is_active":false,"last_first":"Lu, Shan","people_section":0,"alias":"shanlu"},{"type":"guest","value":"bogdan-stoica","user_id":"1098405","display_name":"Bogdan Stoica","author_link":"<a href=\"https:\/\/bastoica.github.io\/\" aria-label=\"Visit the profile page for Bogdan Stoica\">Bogdan Stoica<\/a>","is_active":true,"last_first":"Stoica, Bogdan","people_section":0,"alias":"bogdan-stoica"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-960x540.jpg\" class=\"img-object-cover\" alt=\"Outlined illustrations of Shan Lu and Bogdan Stoica for the Microsoft Research Podcast.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-1066x600.jpg 1066w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/11\/Shan-and-Bogdan_Abstracts_Hero_Feature_River_No_Text_1400x788.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"<a href=\"https:\/\/www.linkedin.com\/in\/gretchen-huizinga-phd-3a4b2921?trk=people-guest_people_search-card\" title=\"Go to researcher profile for Gretchen Huizinga\" aria-label=\"Go to researcher profile for Gretchen Huizinga\" data-bi-type=\"byline author\" data-bi-cN=\"Gretchen Huizinga\">Gretchen Huizinga<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shanlu\/\" title=\"Go to researcher profile for Shan Lu\" aria-label=\"Go to researcher profile for Shan Lu\" data-bi-type=\"byline author\" data-bi-cN=\"Shan Lu\">Shan Lu<\/a>, and <a href=\"https:\/\/bastoica.github.io\/\" title=\"Go to researcher profile for Bogdan Stoica\" aria-label=\"Go to researcher profile for Bogdan Stoica\" data-bi-type=\"byline author\" data-bi-cN=\"Bogdan Stoica\">Bogdan Stoica<\/a>","formattedDate":"November 4, 2024","formattedExcerpt":"In their 2024 SOSP paper, researchers explore a common\u2014though often undertested\u2014software system issue: retry bugs. 
Research manager\u00a0Shan Lu and PhD candidate Bogdan Stoica share how they\u2019re combining traditional program analysis and LLMs to address the challenge.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1098390","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/43518"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1098390"}],"version-history":[{"count":18,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1098390\/revisions"}],"predecessor-version":[{"id":1109841,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1098390\/revisions\/1109841"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1109838"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1098390"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1098390"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1098390"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1098390"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1098390"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp
-json\/wp\/v2\/msr-event-type?post=1098390"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1098390"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1098390"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1098390"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1098390"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1098390"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}