{"id":184239,"date":"2005-03-31T00:00:00","date_gmt":"2009-10-31T13:31:25","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/load-management-and-fault-tolerance-in-a-distributed-stream-processing-system\/"},"modified":"2016-09-09T09:47:36","modified_gmt":"2016-09-09T16:47:36","slug":"load-management-and-fault-tolerance-in-a-distributed-stream-processing-system","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/load-management-and-fault-tolerance-in-a-distributed-stream-processing-system\/","title":{"rendered":"Load Management and Fault-Tolerance in a Distributed Stream Processing System"},"content":{"rendered":"<div class=\"asset-content\">\n<p>Recently, a new class of data management applications has emerged in areas such as sensor-based environmental monitoring, financial services, network monitoring, and military applications. These &#8220;stream processing applications&#8221; require low-latency processing of large-volume data streams. Because traditional database management systems are ill-suited for high-volume, low-latency stream processing, new systems, called stream processing engines (SPEs), have been developed.<\/p>\n<p>In this talk, we present the software architecture and algorithms in Borealis, one of the first distributed stream processing engines. We discuss how our system meets two important challenges: (1) distributed load management, and (2), fault-tolerant operation in the face of node failures, network failures, and network partitions.<\/p>\n<p>We present a mechanism that enables autonomous participants to collaboratively handle load. Our approach is based on contracts that participants negotiate offline. At runtime, participants move load only to partners with whom they have a contract and pay each other the contracted price, making the mechanism lightweight. We show that our approach provides incentives that foster participation and leads to good system-wide load balance properties.<\/p>\n<p>For fault-tolerance, we present a replication-based scheme that masks most node and network failures. When network partitions occur, our approach addresses the traditional availability-consistency trade-off by striving to minimize inconsistencies, while ensuring that the system meets the desired availability specified by the application or user.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently, a new class of data management applications has emerged in areas such as sensor-based environmental monitoring, financial services, network monitoring, and military applications. These &#8220;stream processing applications&#8221; require low-latency processing of large-volume data streams. Because traditional database management systems are ill-suited for high-volume, low-latency stream processing, new systems, called stream processing engines (SPEs), have [&hellip;]<\/p>\n","protected":false},"featured_media":195363,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-184239","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/DStryxHz3C8","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184239","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/184239\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/195363"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=184239"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=184239"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=184239"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=184239"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=184239"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=184239"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=184239"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=184239"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=184239"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=184239"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}