{"id":1703,"date":"2012-10-08T09:00:00","date_gmt":"2012-10-08T09:00:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/msr_er\/2012\/10\/08\/big-data-blows-into-the-windy-city\/"},"modified":"2016-07-20T07:32:31","modified_gmt":"2016-07-20T14:32:31","slug":"big-data-blows-into-the-windy-city","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/big-data-blows-into-the-windy-city\/","title":{"rendered":"Big Data Blows into the Windy City"},"content":{"rendered":"<p><span style=\"font-family: verdana,geneva; font-size: medium;\"><img decoding=\"async\" style=\"border: 0px currentColor; margin-right: auto; margin-left: auto; display: block;\" title=\"Ensuring data discoverability, accessibility, and consumability\" alt=\"Ensuring data discoverability, accessibility, and consumability\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/MSDNBlogsFS\/prod.evol.blogs.msdn.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/01\/32\/81\/3857.Big_Data_blog.jpg\" original-url=\"http:\/\/blogs.msdn.com\/resized-image.ashx\/__size\/496x307\/__key\/communityserver-blogs-components-weblogfiles\/00-00-01-32-81\/3857.Big_5F00_Data_5F00_blog.jpg\" \/><\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">This week, the annual <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/events\/escience2012\/\" target=\"_blank\">Microsoft eScience Workshop<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> is being held in Chicago (the &ldquo;Windy City&rdquo;), providing an unparalleled opportunity for domain scientists, researchers, and technologists to discuss the benefits and difficulties of incorporating more computing and information technology into the scientific process. 
Over the years, the eScience workshop has provided a forum where scientists could voice their data and technology challenges and get input from those who&rsquo;ve confronted similar issues. <\/span><br \/>&nbsp;<br \/><span style=\"font-family: verdana,geneva; font-size: medium;\">Front and center this year are topics related to Big Data&mdash;be it the management of the rising data flood, the analysis of the data tsunami, or even the visualization of the data explosion. In addition, this year&#8217;s workshop explores questions about how to train and develop data scientists, and how citizen scientists can play a role in gaining insights from the vast amounts of information.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Many of these topics are examined in the book, <em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/fourthparadigm\/default.aspx\" target=\"_blank\">The Fourth Paradigm: Data-Intensive Scientific Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em>, which is an excellent resource for these discussions. And, as evidenced in that book, the Big Data &ldquo;opportunity&rdquo; has actually been building for some time&mdash;but now it has reached the tipping point in terms of awareness across more science domains. The commoditization of devices, sensors, storage, and connectivity&mdash;paired with technologies like cloud computing&mdash;has made the idea of capturing and maintaining all data in those science domains a plausible reality. As a result, scientists are thinking about what can be done, rather than lamenting what could be done if only they had the research infrastructure. 
<\/span><br \/>&nbsp;<br \/><span style=\"font-family: verdana,geneva; font-size: medium;\">In preparing for this year&rsquo;s event, I looked back at the very <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/events\/scidata04\/agenda.aspx\" target=\"_blank\">first Microsoft eScience Workshop<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, held in 2004. I revisited <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/skydrive.live.com\/view.aspx?resid=84D3927C45742C81!1470&cid=84d3927c45742c81\" target=\"_blank\">Jim Gray&rsquo;s keynote<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and put together this six-slide composite of the main challenges Jim identified back then. As you&rsquo;ll notice, while some progress has been made, many of those challenges are still being addressed. For instance, global federation has remained a key issue for distributed and disparate databases. Do you move all the data to one location? Or do you ensure that the data owners continue to curate the data and safeguard the quality of the datasets? The approach taken by SkyQuery has advanced federation by demonstrating how multiple datasets can be queried seamlessly and by implementing novel approaches, such as spatial join queries. 
If you want more details, check out the paper, <em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/arxiv.org\/abs\/cs.DB\/0211023\" target=\"_blank\">SkyQuery: A WebService Approach to Federate Databases<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em>.&nbsp;<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-family: verdana,geneva; font-size: medium;\"><img decoding=\"async\" style=\"border: 0px currentColor;\" title=\"Six-slide composite of the main challenges that Jim Gray identified at the first Microsoft eScience Workshop in 2004\" alt=\"Six-slide composite of the main challenges that Jim Gray identified at the first Microsoft eScience Workshop in 2004\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/MSDNBlogsFS\/prod.evol.blogs.msdn.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/01\/32\/81\/0724.Big_Data_blog.jpg\" original-url=\"http:\/\/blogs.msdn.com\/resized-image.ashx\/__size\/496x0\/__key\/communityserver-blogs-components-weblogfiles\/00-00-01-32-81\/0724.Big_5F00_Data_5F00_blog.jpg\" \/><br \/><span style=\"color: #888888; font-size: small;\">Six-slide composite of the main challenges that Jim Gray identified at the first Microsoft eScience Workshop in 2004<\/span><\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">To truly tackle these data challenges, scientific datasets need the following attributes: discoverability, accessibility, and consumability. If a dataset doesn&#8217;t have all three, it might as well be kept in a file cabinet. 
There has been much work done lately on discoverability: for example, the emergence of different &ldquo;data.gov&rdquo; domain science catalogs&mdash;and even commercial ones like the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/datamarket.azure.com\/\" target=\"_blank\">Windows Azure Marketplace<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. The &ldquo;Open Data for Open Science&rdquo; session at this year&rsquo;s eScience Workshop explores how to address some of these challenges from the science side and looks at how simple, Internet-based protocols, such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.odata.org\/\" target=\"_blank\">OData<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (the Open Data Protocol), can help ensure that the end-user scientist can use the data. <\/span><br \/>&nbsp;<br \/><span style=\"font-family: verdana,geneva; font-size: medium;\">The Monday evening event at the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.adlerplanetarium.org\/\" target=\"_blank\">Adler Planetarium<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> showcases how scientific data and information can be communicated to the public, through amazing 3-D tours powered by Microsoft Research <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.worldwidetelescope.org\/\" target=\"_blank\">WorldWide Telescope<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (WWT) and brought to life in the planetarium&rsquo;s <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" 
href=\"http:\/\/www.adlerplanetarium.org\/experience\/shows\/theaters\/graingerskytheater\" target=\"_blank\">Grainger Sky Theater<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Microsoft researcher <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/jfay\/\" target=\"_blank\">Jonathan Fay<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, architect of WWT, has been working with the Adler to ensure that tours that were originally developed to be shown in the planetarium can be taken home and experienced later. An example of the great work from the Adler is the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.adlerplanetarium.org\/experience\/shows\/welcome\" target=\"_blank\"><em>Welcome to the Universe<\/em> show<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and the WWT tour narrated by astronomer <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.adlerplanetarium.org\/researchcollections\/researchers#msr\" target=\"_blank\">Mark SubbaRao<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. You can <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.worldwidetelescope.org\/webclient\/default.aspx?tour=http:\/\/svl.adlerplanetarium.org\/WTTU\/WTTU.wtt\" target=\"_blank\">play the tour<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> in your browser. 
You can find more tours powered by WorldWide Telescope at the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.layerscape.org\/Home\/Index\" target=\"_blank\">Layerscape website<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. <\/span><br \/>&nbsp;<br \/><span style=\"font-family: verdana,geneva; font-size: medium;\">Whether you&#8217;re attending the Microsoft eScience Workshop or just wishing you could, I encourage you to dive into these Big Data challenges.<\/span><\/p>\n<p><em><span style=\"font-family: verdana,geneva; font-size: medium;\">&mdash;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/danf\/\" target=\"_blank\">Dan Fay<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Director, Earth, Energy, and Environment; Microsoft Research Connections<\/span><\/em><\/p>\n<p><strong><span style=\"font-family: verdana,geneva; font-size: medium;\">Learn More<\/span><\/strong><\/p>\n<ul>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/events\/escience2012\/\" target=\"_blank\">Microsoft eScience Workshop 2012<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/fourthparadigm\/default.aspx\" target=\"_blank\">The Fourth Paradigm: Data-Intensive Scientific Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; 
font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/download.microsoft.com\/download\/6\/9\/8\/69832FF7-D30C-42FC-B36C-712BF4066BCD\/Science@Microsoft_InteractivePDF.pdf\" target=\"_blank\"><em>Science@Microsoft&mdash;The Fourth Paradigm in Practice <\/em>Book<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;(PDF, 10 MB)<\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a href=\"http:\/\/www.microsoft.com\/en-us\/researchconnections\/science\/stories\/default.aspx\" target=\"_blank\">Science@Microsoft Stories<\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/datamarket.azure.com\/\" target=\"_blank\">Windows Azure Marketplace<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><span style=\"font-family: verdana,geneva; font-size: small;\"><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.odata.org\/\" target=\"_blank\">Open Data Protocol (OData)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.worldwidetelescope.org\/\" target=\"_blank\">WorldWide Telescope<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" 
href=\"http:\/\/www.layerscape.org\/Home\/Index\" target=\"_blank\">Layerscape<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/collaboration\/focus\/escience\/default.aspx\" target=\"_blank\">eScience at Microsoft Research Connections<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This week, the annual Microsoft eScience Workshop is being held in Chicago (the &ldquo;Windy City&rdquo;), providing an unparalleled opportunity for domain scientists, researchers, and technologists to discuss the benefits and difficulties of incorporating more computing and information technology into the scientific process. Over the years, the eScience workshop has provided a forum where scientists could 
[&hellip;]<\/p>\n","protected":false},"author":32627,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[194588,187154,186831,195240,195253,195255,195257,186454,195275,187066,195527,193592,193594,196408,196439,193560,196726,197123,197225,197427,197766,187311],"research-area":[],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1703","post","type-post","status-publish","format-standard","hentry","category-research-blog","tag-adler-planetarium","tag-astronomy","tag-big-data","tag-dan-fay","tag-data-analysis","tag-data-curation","tag-data-management","tag-data-visualization","tag-data-intensive-science","tag-escience","tag-events","tag-jim-gray","tag-layerscape","tag-microsoft-escience-workshop","tag-microsoft-research-connections","tag-odata","tag-open-data-protocol","tag-scientific-datasets","tag-skyquery","tag-the-fourth-paradigm","tag-windows-azure-data-marketplace","tag-worldwide-telescope","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"October 8, 2012","formattedExcerpt":"This week, the annual Microsoft eScience Workshop is being held in Chicago (the &ldquo;Windy City&rdquo;), providing an unparalleled opportunity for domain scientists, researchers, and technologists to discuss the benefits and 
difficulties of incorporating more computing and information technology into the scientific process. Over the years,&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1703","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1703"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1703\/revisions"}],"predecessor-version":[{"id":261846,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1703\/revisions\/261846"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1703"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1703"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1703"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1703"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1703"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1703"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1703"},{"taxonomy":"
msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1703"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1703"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1703"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1703"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}