{"id":213,"date":"2014-09-05T10:47:00","date_gmt":"2014-09-05T10:47:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/msr_er\/2014\/09\/05\/big-data-tamed-with-the-cloud\/"},"modified":"2016-07-20T07:29:45","modified_gmt":"2016-07-20T14:29:45","slug":"big-data-tamed-with-the-cloud","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/big-data-tamed-with-the-cloud\/","title":{"rendered":"Big data tamed with the cloud"},"content":{"rendered":"<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Big data: it&rsquo;s the hot topic these days, promising breakthroughs in just about every field, from medicine to marketing to machine learning and more. But for many of us, the problems of managing big data hit home when we confront the welter of digital photos and videos we have recorded with our smartphones and cameras. Multiply this by the number of people doing this around the world and it is a big problem. On the surface, it does not seem like an endeavor on the order of treating cancer (more on that later), but it is a colossal headache to organize, classify, search, and retrieve our multimedia content&mdash;and designing systems to do this at scale effectively is a huge challenge.<\/span><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" title=\"Big data management for pictures and video\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/MSDNBlogsFS\/prod.evol.blogs.msdn.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/01\/32\/81\/2543.BigData_496x310.png\" original-url=\"http:\/\/blogs.msdn.com\/resized-image.ashx\/__size\/496x0\/__key\/communityserver-blogs-components-weblogfiles\/00-00-01-32-81\/2543.BigData_5F00_496x310.png\" alt=\"Big data management for pictures and video\" \/><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Thankfully, Professor <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
rel=\"noopener noreferrer\" href=\"http:\/\/dbis.cs.unibas.ch\/team\/heiko-schuldt\/dbis_staff_view\" target=\"_blank\">Heiko Schuldt<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/dbis.cs.unibas.ch\/team\/ivan-giangreco\/dbis_staff_view\" target=\"_blank\">Ivan Giangreco<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/dbis.cs.unibas.ch\/\" target=\"_blank\">Databases and Information Systems (DBIS) Group<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at the University of Basel are working on a project to do just that, and a whole lot more. Their integrated system harnesses the power of the cloud, through Microsoft Azure, to understand and sort through the terabytes of data that make up multimedia content to find and return like objects.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">The Basel team&rsquo;s system combines the power of relational databases with the adaptability of information retrieval systems. The Basel system can handle and store any type of multimedia data, along with its features. When an algorithm for feature extraction is defined, the system automatically executes the extraction, storage, and indexing of both the feature data and the object itself. This approach efficiently carries out Boolean queries as well as searches that rank images by their feature similarity scores. 
In addition, it provides novel query paradigms and interfaces; for example, you can sketch an image or parts thereof and find images that are similar to your sketch.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">It&#8217;s exciting to see how this work has progressed since the Basel researchers attended our first European Microsoft Azure for Research training workshop at ETH Zurich last November. They successfully applied for an Azure Award, which got them up and running on the cloud within a few weeks. This allowed the team to quickly develop and deploy their system in a scalable way. Microsoft Azure is ideal as a fast, distributed storage and computing fabric for running the Basel team&rsquo;s project, whose MapReduce-style program can grow as millions of images are added to the system. By moving to the cloud, the Basel researchers have been able to develop, deploy, and demonstrate the system, testing their ideas at scale on the 14 million images that comprise the ImageNet database. They presented this work at the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.ieeebigdata.org\/2014\/\" target=\"_blank\">IEEE International Congress on Big Data (BigData 2014)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Professor Schuldt explains how Azure has helped him with his research. &#8220;In large-scale image retrieval, both effectiveness and efficiency are essential requirements. 
Thanks to Microsoft&rsquo;s support and the use of the Azure cloud, we have been able to successfully address the retrieval efficiency so that we can concentrate further on retrieval effectiveness, especially by developing novel search paradigms and user interfaces based, for instance, on gestures or sketches.&#8221;<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">The Basel researchers are looking forward to tackling the even bigger <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/projects\/clickture\/\" target=\"_blank\">Bing Clickture<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> dataset, which contains 40 million images. They also plan to test the system on video content, in what they&rsquo;re calling the IMOTION project, which will &ldquo;multiply the challenges in terms of retrieval efficiency,&rdquo; notes Professor Schuldt. Their next paper was presented at the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/sigir.org\/sigir2014\/\" target=\"_blank\">37th International ACM-SIGIR Conference on Research and Development in Information Retrieval<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and we&#8217;re looking forward to seeing how the team continues to <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/dbis.cs.unibas.ch\/publications\/2014\/adam-sigir-2014\/dbis_publication_view\" target=\"_blank\">push the boundaries of big data by using Microsoft Azure<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">Now back to that earlier comment about treating cancer. 
Approaches similar to those used by the Basel team&rsquo;s project might, in fact, someday help us to better understand and treat cancer. The underlying computer science and cloud technologies could be used, for example, for <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/projects\/medimaging\/\" target=\"_blank\">managing and analyzing MRI scans of tumors<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\">The Basel team&rsquo;s project is just one example of how easy it is to get up and running on the cloud and accelerate your research&mdash;especially by taking advantage of the Microsoft Azure for Research initiative, which offers not only training but also substantial grants of Azure storage and compute resources for qualified projects. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.azure4research.com\/\" target=\"_blank\">Read about the initiative<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-US\/projects\/azure\/awards.aspx\" target=\"_blank\">requests for proposals<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Who knows? 
Maybe your project will be the next big thing in big data.<\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\"><em>&mdash;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/kenjitak\/\" target=\"_blank\">Kenji Takeda<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Solutions Architect and Technical Manager, Microsoft Research<\/em><\/span><\/p>\n<p><span style=\"font-family: verdana,geneva; font-size: medium;\"><strong>Learn more<\/strong><\/span><\/p>\n<ul>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/dbis.cs.unibas.ch\/\" target=\"_blank\">Databases and Information Systems (DBIS) Group<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at the University of Basel<\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"https:\/\/imotion-project.eu\/projects-adam\/\" target=\"_blank\">IMOTION<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> project (University of Basel and partners)<\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" href=\"http:\/\/www.azure4research.com\/\" target=\"_blank\">Microsoft Azure for Research<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" 
href=\"http:\/\/research.microsoft.com\/en-US\/projects\/azure\/awards.aspx\" target=\"_blank\">Microsoft Azure for Research Award program<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/span><\/li>\n<li><span style=\"font-family: verdana,geneva; font-size: small;\"><a href=\"http:\/\/www.microsoft.com\/windowsazure\/windowsazure\/\" target=\"_blank\">Microsoft Azure<\/a><\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Big data: it&rsquo;s the hot topic these days, promising breakthroughs in just about every field, from medicine to marketing to machine learning and more. But for many of us, the problems of managing big data hit home when we confront the welter of digital photos and videos we have recorded with our smartphones and cameras. [&hellip;]<\/p>\n","protected":false},"author":32627,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[194750,193658,186831,186889,195258,195282,195784,195993,196117,193659,196575,197018,197574],"research-area":[],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-213","post","type-post","status-publish","format-standard","hentry","category-research-blog","tag-azure-award-program","tag-azure-for-research","tag-big-data","tag-cloud-computing","tag-data-management-and-retrieval","tag-databases-and-information-systems-group","tag-heiko-schuldt","tag-ivan-giangreco","tag-kenji-takeda","tag-microsoft-azure","tag-multimedia","tag-rfp","tag-university-of-basel","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_resea
rch_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"September 5, 2014","formattedExcerpt":"Big data: it&rsquo;s the hot topic these days, promising breakthroughs in just about every field, from medicine to marketing to machine learning and more. But for many of us, the problems of managing big data hit home when we confront the welter of digital photos&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/213","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=213"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/213\/revisions"}],"predecessor-version":[{"id":260961,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/213\/revisions\/260961"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=213"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=213"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=213"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-ar
ea?post=213"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=213"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=213"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=213"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=213"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=213"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=213"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=213"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}