{"id":167751,"date":"2015-02-01T00:00:00","date_gmt":"2015-02-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/min-max-hypergraph-partitioning\/"},"modified":"2018-10-16T21:28:59","modified_gmt":"2018-10-17T04:28:59","slug":"min-max-hypergraph-partitioning","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/min-max-hypergraph-partitioning\/","title":{"rendered":"Min-Max Hypergraph Partitioning"},"content":{"rendered":"<div class=\"asset-content\">\n<p>In many applications, the structure of data can be represented by a hyper-graph, where the data items are vertices, and the associations among items are represented by hyper-edges. Equivalently, we are given as input a bipartite graph with two kinds of vertices: items, and associations (which we refer to as topics). We consider the problem of partitioning the set of items into a given number of partitions, such that the maximum number of topics covered by a partition is minimized.<\/p>\n<p>This is a natural clustering problem, with various applications such as partitioning of information objects like documents, images, and videos, or allocation of system resources in data centers in the context of distributed streaming processing platforms, or graph and machine learning computation platforms.<\/p>\n<p>In this paper, we introduce an approximation algorithm for the offline version of the problem, which is guaranteed to yield a good approximation for every input instance where a solution of small cost exists. Further, we consider the online version of the problem, in which items arrive online and each item must be assigned irrevocably to one of the partitions at the time of its arrival.<\/p>\n<p>We show that a simple greedy online assignment of items is able to recover a hidden co-clustering of vertices under a natural set of recovery conditions. We also report on extensive experimental results, covering both synthetically generated and real-world input bipartite graphs, which demonstrate that the greedy online assignment of items consistently yields superior performance when compared with alternative approaches.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In many applications, the structure of data can be represented by a hyper-graph, where the data items are vertices, and the associations among items are represented by hyper-edges. Equivalently, we are given as input a bipartite graph with two kinds of vertices: items, and associations (which we refer to as topics). We consider the problem [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"Microsoft Research","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"MSR-TR-2015-15","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"Jennifer Iglesias","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2015-02-01","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":2015,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13561,13556,13547],"msr-publication-type":[193718],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-167751","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"Microsoft Research","msr_edition":"","msr_affiliation":"","msr_published_date":"2015-02-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"MSR-TR-2015-15","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"204492","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"MSR-TR-2015-15%20-%20v2.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/MSR-TR-2015-1520-20v2.pdf","id":204492,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":204492,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/MSR-TR-2015-1520-20v2.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"daalista","user_id":31496,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=daalista"},{"type":"text","value":"Jennifer Iglesias","user_id":0,"rest_url":false},{"type":"user_nicename","value":"milanv","user_id":32922,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=milanv"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[171443],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"techreport","related_content":{"projects":[{"ID":171443,"post_title":"Fennel","post_name":"fennel-2","post_type":"msr-project","post_date":"2015-03-16 05:09:47","post_modified":"2016-04-08 20:28:30","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/fennel-2\/","post_excerpt":"This is a project looking into design and evaluation of efficient and deployable algorithms for assignment of complex workloads to resources in modern cloud service platforms. Alumni Jennifer Iglesias (Carnegie Mellon University; summer internship 2014) Florian Bourse (Ecole Normale Superieure, Paris; summer internship 2013) Bo Zong (University of California Santa Barbara; summer internship 2013)","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171443"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/167751","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/167751\/revisions"}],"predecessor-version":[{"id":536273,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/167751\/revisions\/536273"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=167751"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=167751"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=167751"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=167751"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=167751"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=167751"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=167751"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=167751"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=167751"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=167751"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=167751"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=167751"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=167751"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}