{"id":168643,"date":"2015-07-01T00:00:00","date_gmt":"2015-07-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/yinyang-k-means-a-drop-in-replacement-of-the-classic-k-means-with-consistent-speedup\/"},"modified":"2018-10-16T20:37:26","modified_gmt":"2018-10-17T03:37:26","slug":"yinyang-k-means-a-drop-in-replacement-of-the-classic-k-means-with-consistent-speedup","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/yinyang-k-means-a-drop-in-replacement-of-the-classic-k-means-with-consistent-speedup\/","title":{"rendered":"Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup"},"content":{"rendered":"<p>This paper presents Yinyang K-means, a new algorithm for K-means clustering. By clustering the centers in the initial stage, and leveraging ef\ufb01ciently maintained lower and upper bounds between a point and centers, it more effectively avoids unnecessary distance calculations than prior algorithms. It signi\ufb01cantly outperforms prior K-means algorithms consistently across all experimented data sets, cluster numbers, and machine con\ufb01gurations. The consistent, superior performance\u2014plus its simplicity, user-control of overheads, and guarantee in producing the same clustering results as the standard K-means\u2014makes Yinyang K-means a drop-in replacement of the classic K-means with an order of magnitude higher performance.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This paper presents Yinyang K-means, a new algorithm for K-means clustering. By clustering the centers in the initial stage, and leveraging ef\ufb01ciently maintained lower and upper bounds between a point and centers, it more effectively avoids unnecessary distance calculations than prior algorithms. It signi\ufb01cantly outperforms prior K-means algorithms consistently across all experimented data sets, cluster [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"579\u2013587","msr_page_range_start":"579","msr_page_range_end":"587","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"Yufei Ding, Yue Zhao, Xipeng Shen, Madanlal Musuvathi","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2015-07-01","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"http:\/\/jmlr.org\/proceedings\/papers\/v37\/ding15.html","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":2015,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13560,13547],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-168643","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-programming-languages-software-engineering","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015","msr_affiliation":"","msr_published_date":"2015-07-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"579\u2013587","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"204276","msr_publicationurl":"http:\/\/jmlr.org\/proceedings\/papers\/v37\/ding15.html","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"ding15.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/ding15.pdf","id":204276,"label_id":0},{"type":"url","title":"http:\/\/jmlr.org\/proceedings\/papers\/v37\/ding15.html","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":0,"url":"http:\/\/jmlr.org\/proceedings\/papers\/v37\/ding15.html"},{"id":204276,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/ding15.pdf"}],"msr-author-ordering":[{"type":"text","value":"Yufei Ding","user_id":0,"rest_url":false},{"type":"text","value":"Yue Zhao","user_id":0,"rest_url":false},{"type":"text","value":"Xipeng Shen","user_id":0,"rest_url":false},{"type":"text","value":"Madanlal Musuvathi","user_id":0,"rest_url":false},{"type":"user_nicename","value":"toddm","user_id":34235,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=toddm"},{"type":"user_nicename","value":"madanm","user_id":32766,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=madanm"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[171416,292964],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":171416,"post_title":"Parasail","post_name":"parasail","post_type":"msr-project","post_date":"2014-10-17 13:27:33","post_modified":"2021-03-09 12:33:18","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/parasail\/","post_excerpt":"Project Parasail\u00a0is a novel approach to parallelizing a large class of seemingly sequential applications wherein dependencies are, at runtime, treated as symbolic values. The efficiency of parallelization, then, depends on the efficiency of the symbolic computation, an active area of research in static analysis, verification, and partial evaluation. This is exciting as advances in these fields can translate to novel parallel algorithms for sequential computation. Here's an example of how it works. Imagine an algorithm&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171416"}]}},{"ID":292964,"post_title":"Project Parade","post_name":"project-parade","post_type":"msr-project","post_date":"2016-09-14 15:28:56","post_modified":"2020-03-13 17:42:45","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-parade\/","post_excerpt":"Project Parade\u00a0is a novel approach to parallelizing a large class of seemingly sequential applications wherein dependencies are, at runtime, treated as symbolic values.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/292964"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168643","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168643\/revisions"}],"predecessor-version":[{"id":408515,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168643\/revisions\/408515"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=168643"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=168643"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=168643"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=168643"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=168643"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=168643"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=168643"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=168643"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=168643"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=168643"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=168643"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=168643"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=168643"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}