{"id":265,"date":"2013-04-02T09:00:00","date_gmt":"2013-04-02T09:00:00","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/inside_microsoft_research\/2013\/04\/02\/exploring-the-biases-of-big-data\/"},"modified":"2016-07-20T07:31:52","modified_gmt":"2016-07-20T14:31:52","slug":"exploring-the-biases-of-big-data","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/exploring-the-biases-of-big-data\/","title":{"rendered":"Exploring the Biases of Big Data"},"content":{"rendered":"<p class=\"posted-by\">Posted by <span class=\"author\">Rob Knies<\/span><\/p>\n<p class=\"posted-by\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/0385.Crawford.jpg\" original-url=\"http:\/\/blogs.technet.com\/cfs-file.ashx\/__key\/communityserver-blogs-components-weblogfiles\/00-00-00-90-35\/0385.Crawford.jpg\"><img decoding=\"async\" style=\"margin: 10px; border: 0px currentColor; float: left;\" title=\"Kate Crawford\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/0385.Crawford.jpg\" original-url=\"http:\/\/blogs.technet.com\/resized-image.ashx\/__size\/550x0\/__key\/communityserver-blogs-components-weblogfiles\/00-00-00-90-35\/0385.Crawford.jpg\" alt=\"Kate Crawford\" width=\"166\" \/><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p class=\"posted-by\"><span class=\"author\">On Feb. 28, at the Santa Clara (Calif.) 
Convention Center, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Kate Crawford\" href=\"http:\/\/www.katecrawford.net\/\" target=\"_blank\">Kate Crawford<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, principal researcher at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Microsoft Research New England\" href=\"http:\/\/research.microsoft.com\/en-us\/labs\/newengland\/default.aspx\" target=\"_blank\">Microsoft Research New England<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, took the stage during the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Strata Conference\" href=\"http:\/\/strataconf.com\/strata2013\/public\/content\/home\" target=\"_blank\">Strata Conference<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> to deliver an illuminating, 17-minute talk entitled Algorithmic Illusions: Hidden Biases of Big Data.<\/p>\n<p>During that presentation, she cautioned that data and collections of data are not objective. 
They are created and shaped by human beings, and understanding the unavoidable hidden biases people bring to data collection and analysis can be as significant as the data themselves.<\/p>\n<p>Now, on the heels of that appearance, Crawford is bringing a similar message to a different audience, that of the <em>Harvard Business Review<\/em>, which has just published her contributed article, <em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"The Hidden Biases in Big Data\" href=\"http:\/\/blogs.hbr.org\/cs\/2013\/04\/the_hidden_biases_in_big_data.html\" target=\"_blank\">The Hidden Biases in Big Data<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em>, that underscores the concepts she discussed during Strata 2013.<\/p>\n<p>In this brief but compelling article, Crawford poses questions that need to be asked when examining a big data set: &ldquo;&hellip; which people are excluded? Which places are less visible? What happens if you live in the shadow of big data sets?&rdquo;<\/p>\n<p>I had to know more, so I contacted her. The first thing I wondered about was how she began investigating the biases inherent in what is increasingly being invoked by the term &ldquo;big data.&rdquo; As it turns out, her investigation had been prompted by one of the world&rsquo;s biggest natural disasters in recent memory.<\/p>\n<p>&ldquo;I became fascinated with the insights and limits of big data when I started working with large sets of social-media data,&rdquo; Crawford says, &ldquo;most particularly while working on crisis communications projects back in 2010 and 2011, when Australia was experiencing the worst flooding on record. 
Collaborating with a team of social scientists, we were tracking tweets about the floods, seeking to understand communications patterns.<\/p>\n<p>&ldquo;People were using Twitter to share information and to squash rumors and to thank emergency services. The majority of tweets came from Queensland&rsquo;s capital city of Brisbane, but the most substantial damage and loss of life was in smaller towns and rural areas.&rdquo;<\/p>\n<p>The preponderance of city dwellers&rsquo; tweets meant that their observations were overshadowing the experiences of those most directly affected. The data set reflected a sort of bias.<\/p>\n<p> <iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/irP5RCdpilc\" frameborder=\"0\" width=\"560\" height=\"315\"><\/iframe> <\/p>\n<p>&ldquo;It couldn&rsquo;t give us insight about the experiences in areas where people were cut off from telecommunications and power&mdash;or simply not using Twitter,&rdquo; adds Crawford, also a visiting professor at the MIT Center for Civic Media.<\/p>\n<p>&ldquo;Then, in mid-2011, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"danah boyd\" href=\"http:\/\/www.danah.org\/\" target=\"_blank\">danah boyd<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and I co-authored <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"a paper for the Oxford Internet Institute&rsquo;s conference\" href=\"http:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=1926431\" target=\"_blank\">a paper for the Oxford Internet Institute&rsquo;s conference<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> that articulated some of our concerns about big data and social media. 
There was very little around at the time that was asking critical questions of how big data was being used.&rdquo;<\/p>\n<p>The specific example of the Queensland floods would seem to point to the Twitter platform as the culprit, but that&rsquo;s not necessarily the case.<\/p>\n<p>&ldquo;Social-media data is one small slice of all the data that is out there,&rdquo; Crawford says. &ldquo;The same can be said for sensor data. But these are just examples of a bigger problem: Data sets from any source will have gaps and problems. There is no such thing as a data set that is untouched by human design: We decide what counts as data and what does not. Or, as Lisa Gitelman [media historian at New York University] has described, &lsquo;Data need to be imagined as data to exist.&rsquo;<\/p>\n<p>&ldquo;Big data is still subjective. It is still informed by disciplinary perspectives and the ever-changing histories of knowledge. Regardless of where the data come from, it&rsquo;s useful to ask about the grounding assumptions, the methods, and the possible errors.&rdquo;<\/p>\n<p>Crawford offers a novel mechanism for enhancing the value of big data.<\/p>\n<p>&ldquo;Multidimensional data&mdash;data with depth, as I call it&mdash;can come from using mixed research methodologies: combining big-data analytics with small data studies that bring out the depth, nuance, and context that big data often misses. Small data can also produce rich insights and different perspectives that are left out or are unreachable by big-data studies.<\/p>\n<p>&ldquo;But above all, social-science approaches help us to ask productive questions about data to prevent us from falling victim to our own cognitive biases that often suggest answers we expect or lead us to results we wish to find.&rdquo;<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Posted by Rob Knies On Feb. 28, at the Santa Clara (Calif.) 
Convention Center, Kate Crawford, principal researcher at Microsoft Research New England, took the stage during the Strata Conference to deliver an illuminating, 17-minute talk entitled Algorithmic Illusions: Hidden Biases of Big Data. During that presentation, she cautioned that data and collections of data [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[200349,186831,194869,195244,201599,201803,202315,202457,202769,196654,203187,203441,203657,203881,204055,204203,194370],"research-area":[],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-265","post","type-post","status-publish","format-standard","hentry","category-research-blog","tag-algorithmic-illusions-hidden-biases-of-big-data","tag-big-data","tag-brisbane","tag-danah-boyd","tag-floods","tag-harvard-business-review","tag-kate-crawford","tag-lisa-gitelman","tag-microsoft-research-new-england","tag-new-york-university","tag-oxford-internet-institute","tag-queensland","tag-santa-clara-convention-center","tag-six-provocations-for-big-data","tag-strata-conference","tag-the-hidden-biases-in-big-data","tag-twitter","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"April 2, 2013","formattedExcerpt":"Posted 
by Rob Knies On Feb. 28, at the Santa Clara (Calif.) Convention Center, Kate Crawford, principal researcher at Microsoft Research New England, took the stage during the Strata Conference to deliver an illuminating, 17-minute talk entitled Algorithmic Illusions: Hidden Biases of Big Data. During that&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/265","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=265"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/265\/revisions"}],"predecessor-version":[{"id":261627,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/265\/revisions\/261627"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=265"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=265"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=265"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=265"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=265"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/resear
ch\/wp-json\/wp\/v2\/msr-locale?post=265"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=265"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=265"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=265"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}