{"id":253667,"date":"2016-06-21T00:00:00","date_gmt":"2016-06-13T07:00:02","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=253667"},"modified":"2016-07-15T15:32:06","modified_gmt":"2016-07-15T22:32:06","slug":"what-new-bugs-live-in-the-cloud-and-how-to-exterminate-them","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/what-new-bugs-live-in-the-cloud-and-how-to-exterminate-them\/","title":{"rendered":"What New Bugs Live in the Cloud? (and How to Exterminate Them)"},"content":{"rendered":"<p>As more data and computation move from local to cloud settings, datacenter distributed systems have become a dominant backbone for many modern applications. However, the complexity of the cloud hardware and software ecosystem has outpaced existing testing, debugging, and verification tools. In this talk, I will describe three new classes of bugs that appear in cloud-scale distributed systems: distributed concurrency bugs (with multiple failures), scalability bugs, and non-deterministic performance bugs. (1) A distributed concurrency bug is a concurrency bug in distributed systems that is caused by distributed events (message arrivals, local computation, fault\/reboot) that can occur in a non-deterministic order. (2) A scalability bug is a latent bug that is scale-dependent: it typically surfaces in large-scale deployments (100+ nodes), but not necessarily in small\/medium-scale deployments. (3) A non-deterministic performance bug is a performance fault that only appears in specific topological scenarios (e.g., specific task placements and locations of slow hardware). 
I will present our work on combating these new classes of bugs, including semantic-aware model checking (SAMC), a taxonomy of distributed concurrency bugs (TaxDC), scalability checks (SCk), performance verification (SPV), and path-based speculative execution (PBSE).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As more data and computation move from local to cloud settings, datacenter distributed systems have become a dominant backbone for many modern applications. However, the complexity of the cloud hardware and software ecosystem has outpaced existing testing, debugging, and verification tools. In this talk, I will describe three new classes of bugs that appear in cloud-scale [&hellip;]<\/p>\n","protected":false},"featured_media":257712,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[13547],"msr-video-type":[206954],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-253667","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-research-area-systems-and-networking","msr-video-type-microsoft-research-talks","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/dhxJ7k2o_m4","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/253667","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/253667\/revisions"}],"wp:featuredmedia":[{"embeddab
le":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/257712"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=253667"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=253667"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=253667"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=253667"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=253667"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=253667"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=253667"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=253667"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=253667"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=253667"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}