{"id":396287,"date":"2017-07-04T14:17:34","date_gmt":"2017-07-04T21:17:34","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=396287"},"modified":"2018-10-16T20:00:09","modified_gmt":"2018-10-17T03:00:09","slug":"compare-automatic-identification-comparable-entities-web","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/compare-automatic-identification-comparable-entities-web\/","title":{"rendered":"How do they Compare? Automatic Identification of Comparable Entities on the Web"},"content":{"rendered":"<p>People love comparing things: from home mortgages and digital cameras to travel destinations and political philosophies. Today, we are mostly limited to browsing documents after issuing comparative queries to Web search engines, such as \u201c15-year vs. 30-year mortgage\u201d, \u201cNikon D90 \/ Canon 40D\u201d, \u201cOahu or Maui\u201d, and \u201ccommunism vs. fascism\u201d. There is an opportunity to improve the search experience by automatically offering comparisons to users. In this paper, we propose a first step towards this goal of comparative analysis by mining a broad class of comparable entities from search query logs and a large Web crawl. Example comparables that we extract include medicines, appliances, electronics, vacation destinations, and many more. We present an extensive empirical analysis showing that our methods generate comparables with high precision and recall, and showing that Web search query logs are a superior source for mining such entities as compared to Web pages, typically used for extraction tasks. 
We further compare the performance of our methods with \u201crelated entities\u201d reported by Google Sets, and show a gain of 39% in average precision and a gain of 30% in NDCG.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>People love comparing things: from home mortgages and digital cameras to travel destinations and political philosophies. Today, we are mostly limited to browsing documents after issuing comparative queries to Web search engines, such as \u201c15-year vs. 30-year mortgage\u201d, \u201cNikon D90 \/ Canon 40D\u201d, \u201cOahu or Maui\u201d, and \u201ccommunism vs. fascism\u201d. There is an opportunity to improve the search experience by automatically [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"IEEE","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"IEEE Conference on Information Reuse and Integration (IEEE-IRI-11). Las Vegas, NV.","msr_editors":"","msr_how_published":"","msr_isbn":"978-1-4577-0966-1\/11","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"228-233","msr_page_range_start":"228","msr_page_range_end":"233","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"IEEE Conference on Information Reuse and Integration (IEEE-IRI-11). 
Las Vegas, NV.","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2011-08-03","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13555],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-396287","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_publishername":"IEEE","msr_edition":"IEEE Conference on Information Reuse and Integration (IEEE-IRI-11). 
Las Vegas, NV.","msr_affiliation":"","msr_published_date":"2011-08-03","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"228-233","msr_chapter":"","msr_isbn":"978-1-4577-0966-1\/11","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"396293","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"iri11","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/07\/iri11.pdf","id":396293,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Alpa 
Jain","user_id":0,"rest_url":false},{"type":"user_nicename","value":"ppantel","user_id":33275,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=ppantel"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/396287","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/396287\/revisions"}],"predecessor-version":[{"id":396290,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/396287\/revisions\/396290"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=396287"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=396287"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=396287"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=396287"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=396287"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=396287"},{"taxonomy":"msr-locale","embeddable":true,
"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=396287"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=396287"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=396287"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=396287"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=396287"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=396287"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=396287"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}