{"id":2143,"date":"2011-12-19T09:22:00","date_gmt":"2011-12-19T09:22:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/msr_er\/2011\/12\/19\/coping-with-data-deluge\/"},"modified":"2016-07-20T07:33:18","modified_gmt":"2016-07-20T14:33:18","slug":"coping-with-data-deluge","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/coping-with-data-deluge\/","title":{"rendered":"Coping with Data Deluge"},"content":{"rendered":"<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Overwhelmed by data? You&rsquo;re not alone.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Data mining has become one of the most critical research processes in this era of data-intensive science. There are, however, many areas of science where the usefulness of data mining is limited by the massive nature of the datasets. Consequently, scientists are desperately looking for new tools that can dig into the data faster and deeper. In the rapidly developing field of synoptic sky surveys, for example, transient signals from a variety of interesting astrophysical phenomena must be detected and characterized in (near) real-time. 
The resulting wealth of data is invaluable to researchers seeking new discoveries, but they need better computational methods to help them manage and analyze so much data.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\"><img decoding=\"async\" style=\"border: 0px currentColor;\" title=\"Coping with Data Deluge\" alt=\"Coping with Data Deluge\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/MSDNBlogsFS\/prod.evol.blogs.msdn.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/01\/32\/81\/8877.data_deluge.jpg\" original-url=\"http:\/\/blogs.msdn.com\/resized-image.ashx\/__size\/486x337\/__key\/communityserver-blogs-components-weblogfiles\/00-00-01-32-81\/8877.data_5F00_deluge.jpg\" \/><\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">It was in response to such needs that Caltech&rsquo;s <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.kiss.caltech.edu\/\" target=\"_blank\">Keck Institute for Space Studies<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> sponsored a workshop, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.astro.caltech.edu\/digging\/\" target=\"_blank\">Digging Faster and Deeper: Algorithms for Computationally Limited Problems in Time-Domain Astronomy<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, from December 12 to 13. Bringing together more than 50 distinguished participants, the workshop focused on some of the unresolved data mining issues for future studies in time-domain astronomy and related fields.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">I was privileged to give two talks during day two of the workshop. 
In &ldquo;Discovery of Hidden Patterns in Data through Interactive Search,&rdquo; I presented the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/projects\/eif\/\" target=\"_blank\">Environmental Informatics Framework<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (EIF), a strategy and technology platform that the Microsoft Research Connections <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/focus\/e3\/default.aspx\" target=\"_blank\">Earth, Energy, and Environment<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> group developed to help advance data exploration in environmental research. I demonstrated <a href=\"http:\/\/www.microsoft.com\/silverlight\/pivotviewer\/\" target=\"_blank\">Microsoft PivotViewer<\/a>, a faceted search technology included in EIF that enables users to visually and interactively search and discover hidden patterns in massive data or image sets.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">I was pleased to receive positive feedback from attendees about the work that Microsoft Research is doing for data-intensive science. As one participant wrote to me in an email, &ldquo;I have to admit that I wasn&rsquo;t aware of the work that Microsoft Research was doing, but I was very impressed with what I saw yesterday. 
The work you&rsquo;ve been doing on data visualization can only be described as stunning!&rdquo;<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">In &ldquo;Building a Better Scientist,&rdquo; my second talk of the day, I discussed how the fourth paradigm for data-intensive scientific discovery is changing the way scientists conduct research, and is, therefore, creating a need for a new generation of scientists with advanced computational mindsets. The presentation stimulated passionate discussions, and, as event chair George Djorgovski pointed out, it is a topic closely related to how fast and deep we can go with our data.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">&mdash;<em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/yanxu\/\" target=\"_blank\">Yan Xu<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Senior Research Program Manager, Microsoft Research Connections<\/em><\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\"><strong>Learn More<\/strong><\/span><\/p>\n<ul>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.kiss.caltech.edu\/\" target=\"_blank\">Keck Institute for Space Studies<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.astro.caltech.edu\/digging\/\" target=\"_blank\">Digging Faster and Deeper: Algorithms for Computationally Limited Problems in Time-Domain Astronomy<span class=\"sr-only\"> (opens in new 
tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/projects\/eif\/\" target=\"_blank\">Microsoft Environmental Informatics Framework<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (EIF)<\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/focus\/e3\/default.aspx\" target=\"_blank\">Earth, Energy, and Environment at Microsoft Research Connections<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a href=\"http:\/\/www.microsoft.com\/silverlight\/pivotviewer\/\" target=\"_blank\">Microsoft PivotViewer<\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/fourthparadigm\/default.aspx\" target=\"_blank\">The Fourth Paradigm: Data-Intensive Scientific Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Overwhelmed by data? You&rsquo;re not alone. Data mining has become one of the most critical research processes in this era of data-intensive science. There are, however, many areas of science where the usefulness of data mining is limited by the massive nature of the datasets. 
Consequently, scientists are desperately looking for new tools that can [&hellip;]<\/p>\n","protected":false},"author":32627,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[194889,195112,186854,186454,195275,195490,195491,195612,195680,196111,196428,193504,196439,197137,197458,197857],"research-area":[],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-2143","post","type-post","status-publish","format-standard","hentry","category-research-blog","tag-caltech","tag-computational-methods","tag-data-mining","tag-data-visualization","tag-data-intensive-science","tag-environmental-informatics-framework-eif","tag-environmental-research","tag-fourth-paradigm","tag-george-djorgovski","tag-keck-institute-for-space-studies","tag-microsoft-pivotviewer","tag-microsoft-research","tag-microsoft-research-connections","tag-search-technology","tag-time-domain-astronomy","tag-yan-xu","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"December 19, 2011","formattedExcerpt":"Overwhelmed by data? You&rsquo;re not alone. Data mining has become one of the most critical research processes in this era of data-intensive science. 
There are, however, many areas of science where the usefulness of data mining is limited by the massive nature of the datasets.&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/2143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=2143"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/2143\/revisions"}],"predecessor-version":[{"id":262152,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/2143\/revisions\/262152"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=2143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=2143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=2143"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=2143"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=2143"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=2143"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=2143"},{"
taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=2143"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=2143"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=2143"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=2143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}