{"id":646776,"date":"2020-03-31T10:25:51","date_gmt":"2020-03-31T17:25:51","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=646776"},"modified":"2020-04-23T15:57:28","modified_gmt":"2020-04-23T22:57:28","slug":"alt-text-that-informs-meeting-the-needs-of-people-who-are-blind-or-low-vision","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/alt-text-that-informs-meeting-the-needs-of-people-who-are-blind-or-low-vision\/","title":{"rendered":"Alt text that informs: Meeting the needs of people who are blind or low vision"},"content":{"rendered":"<div id=\"attachment_646782\" style=\"width: 1438px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-646782\" class=\"wp-image-646782 size-full\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/Person-Shoes-Tree-Paper-Blog-Post-hero.png\" alt=\"Screenshots from six categories of websites used in the study: a news site (showing a CNN article), a social networking site (showing a personal image and funny caption), a shopping site (showing a shirt sold on Amazon), an employment site (for jobs at Nebraska Wesleyan University), an advertisement (showing a woman applying makeup), and a blog (showing a DIY tutorial for crafts).\" width=\"1428\" height=\"450\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/Person-Shoes-Tree-Paper-Blog-Post-hero.png 1428w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/Person-Shoes-Tree-Paper-Blog-Post-hero-300x95.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/Person-Shoes-Tree-Paper-Blog-Post-hero-1024x323.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/Person-Shoes-Tree-Paper-Blog-Post-hero-768x242.png 768w\" sizes=\"auto, (max-width: 1428px) 100vw, 1428px\" \/><p id=\"caption-attachment-646782\" 
class=\"wp-caption-text\">In a study about the kinds of details people who are blind or have low vision want when encountering digital images, researchers used several different types of sources as prompts. The above are examples of images participants encountered when browsing some of those sources. Participants reported desiring image descriptions that varied in detail depending on the context in which an image appeared.<\/p><\/div>\n<p>Image descriptions are vital to making digital content fully accessible to people who are blind or have low vision. However, as past research has shown, content authors often leave out these descriptions\u2014also known as \u201calternative text\u201d or \u201calt text\u201d\u2014for images <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/caption-crawler-enabling-reusable-alternative-text-descriptions-using-reverse-image-search-2\/\">on the web<\/a> and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/its-almost-like-theyre-trying-to-hide-it-how-user-provided-image-descriptions-have-failed-to-make-twiter-accessible\/\">on social media<\/a>, making AI-based vision-to-language services that automatically generate image descriptions particularly important. To learn more about the design requirements for AI systems to create useful descriptions, our research team conducted interviews with 28 people who are blind or have low vision about the types of details they want when encountering digital images and identified ways in which those wants vary depending on where an image is encountered.<\/p>\n<p>Our findings are presented in the paper <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/person-shoes-tree-is-the-person-naked-what-people-with-vision-impairments-want-in-image-descriptions\/\">\u201c&#8217;Person, Shoes, Tree. 
Is the Person Naked?&#8217; What People with Vision Impairments Want in Image Descriptions,&#8221;<\/a>\u00a0which was accepted at the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/chi2020.acm.org\/\">ACM CHI Conference on Human Factors in Computing Systems (CHI 2020)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. The work is part of the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-ability-initiative-a-collaborative-quest-to-innovate-in-image-captioning-for-people-who-are-blind-or-with-low-vision\/\">Microsoft Ability Initiative<\/a>, a two-year collaboration launched last year between Microsoft researchers in accessibility and computer vision and faculty and students in AI and human-computer interaction at The University of Texas at Austin. Funded by Microsoft Research and the <a href=\"https:\/\/www.microsoft.com\/en-us\/ai\/ai-for-accessibility\">AI for Accessibility program<\/a>, the initiative aims to increase the utility and usability of automatically generated image descriptions for people who are blind, a population that relies on screen reader technology to access computing devices. Screen readers present textual content as audio or Braille output and can only present digital images if the images have an accompanying alt text description.<\/p>\n<p>The paper\u2019s title stems from one participant\u2019s observation regarding the inadequacy of current AI-based image descriptions. 
In the tag-based approach used by Facebook, for example, images are labeled with statements such as \u201cThis image may contain: a person, shoes, a tree.\u201d In describing their encounters with such labels, the participant wondered in a tongue-in-cheek way whether the person in the image was really wearing only shoes, highlighting the lack of critical details in today\u2019s AI-generated descriptions.<\/p>\n<p>\u201cIn this work, our first priority was to learn from people who are blind or have low vision about their daily experiences with image descriptions,\u201d said our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.linkedin.com\/in\/abigale-stangl-ph-d-36443810\/\">lead author on the paper, postdoctoral researcher Abigale Stangl from UT Austin<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. \u201cThe AI for Accessibility program enabled us to take the time to hear from teachers, lawyers, music producers, students, politicians, job seekers, and others and then communicate our empirical findings in a way that can improve computer vision algorithms.\u201d<\/p>\n<h3>Same image, different details<\/h3>\n<p>While earlier studies have shown that the usefulness of image descriptions depends on the digital context in which the image appears, our study gives insight into how image description wants vary based on context across seven different sources: news websites, social networking sites\/platforms, e-commerce websites, employer\/employment websites, online dating websites\/platforms, productivity applications, and e-publications.<\/p>\n<p>\u201cThough it is well understood that image descriptions are important to convey the purpose of an image, this research showed us that people who are blind or have low vision want image descriptions that are responsive to where they encounter the image,\u201d said Stangl. 
\u201cIn other words, people want different content for the same image depending on where they find it.\u201d<\/p>\n<p>For example, if a photo of a person appeared in a news story, people might want a description that includes details about the setting of the image to give a sense of place. But if a photo of a person appeared on a social media or dating website, people might want increased details about that person\u2019s appearance, including some details that may be subjective and\/or sensitive, such as race, perceived gender, and attractiveness. One participant mentioned that knowing the race and gender of people in photos of board members on an employer\/employment website might help them understand whether the company values a diverse workplace. These latter examples illustrate practical and ethical challenges for emerging AI systems, such as whether AI systems can\u2014or should\u2014be trained to provide subjective judgments or information about sensitive demographic attributes.<\/p>\n<table style=\"border-spacing: inherit; border-collapse: collapse; width: 100%; padding: 6px; text-align: center; border-bottom: 1px solid #000000;\">\n<tbody>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>News<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>Social Networking<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>eCommerce<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>Employment<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>Dating<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>Productivity<\/strong><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><strong>E-Publication<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"align: middle; padding: 6px; 
border-bottom: 1px solid #000000; background-color: #dddddd;\" colspan=\"8\"><strong>Event\/Scene<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">People Present<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Text<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Activity<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Interaction<\/td>\n<td style=\"padding: 
6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Landmarks<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Building Features<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Weather<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; 
border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Lighting<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"align: middle; padding: 6px; border-bottom: 1px solid #000000; background-color: #dddddd;\" colspan=\"8\"><strong>People<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Text<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Salient Objects<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; 
border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Activity<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Gender<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Race\/Diversity<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; 
border-bottom: 1px solid #000000;\">Name of Person<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Celebrity Name<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Expression<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Attire\/Clean<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; 
border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Body Shape\/Size<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Pets<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Hair Color<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid 
#000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Hair Style<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Eye Color<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Unique Physical<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Tattoos<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid 
#000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"align: middle; padding: 6px; border-bottom: 1px solid #000000; background-color: #dddddd;\" colspan=\"8\"><strong>Object<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Text<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Name<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Form<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid 
#000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Fit<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Color<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Overall Style<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td 
style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Material<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Logos\/Symbols<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Damage<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">Unique Features<\/td>\n<td 
style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\">x<\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<td style=\"padding: 6px; border-bottom: 1px solid #000000;\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: center;\"><sub>The above table is a cross-source analysis of participants\u2019 description preferences, indicated with an <em>x<\/em>, for seven source types. The preferences are grouped by common focuses of image composition\u2014event\/scene, people, and objects\u2014and a type of preference was included if at least one participant indicated it was important to include.<\/sub><\/p>\n<p>As part of our research, we created a tabular reference guide identifying the types of details people find useful across sources. These findings can inform the design of future AI-based captioning tools. For instance, the details of interest to our participants suggest new categories of metadata that should be produced by crowd workers to feed into improved machine learning models. Our work also highlights the opportunity for creating custom vision-to-language models that are dependent on the context in which an image appears.<\/p>\n<h3>Accelerating advancement<\/h3>\n<p>The Microsoft Ability Initiative with UT Austin is part of the Microsoft Research <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/ability\/\">Ability team\u2019s<\/a> ongoing focus on <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/image-accessibility\/#!publications\">image accessibility<\/a>. 
In addition to this research, we\u2019re jointly hosting an <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/vizwiz.org\/tasks-and-datasets\/image-captioning\/\">Image Captioning Data Challenge<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/vizwiz.org\/workshops\/2020-workshop\/\">VizWiz Grand Challenge Workshop<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> as part of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/cvpr2020.thecvf.com\/\">International Conference on Computer Vision and Pattern Recognition (CVPR 2020)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, which is scheduled to take place in Seattle in June.<\/p>\n<p>\u201cWe are excited that this event will promote greater interaction between the diverse groups of researchers and practitioners working on image description technologies,\u201d said <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.ischool.utexas.edu\/people\/people-details?PersonID=305\">co-author and UT Austin professor Danna Gurari<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. \u201cUltimately, we expect this work will accelerate the conversion of cutting-edge research into market products that better empower people who are blind or have low vision to independently address their daily visual challenges.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Image descriptions are vital to making digital content fully accessible to people who are blind or have low vision. 
However, as past research has shown, content authors often leave out these descriptions\u2014also known as \u201calternative text\u201d or \u201calt text\u201d\u2014for images on the web and on social media, making AI-based vision-to-language services that automatically generate image [&hellip;]<\/p>\n","protected":false},"author":38838,"featured_media":646953,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Meredith Ringel Morris","user_id":"32884"}],"msr_hide_image_in_river":0,"footnotes":""},"categories":[1],"tags":[],"research-area":[13554],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-646776","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-human-computer-interaction","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[283244],"related-projects":[638784],"related-events":[641571],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-960x540.png\" class=\"img-object-cover\" alt=\"Screenshots from five categories of websites used in the study: a news site (showing a CNN article), a social networking site (showing a personal image and funny caption), an employment site (for jobs at Nebraska Wesleyan University), an 
advertisement (showing a woman applying makeup), and a blog (showing a DIY tutorial for crafts).\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/1400x788_No_logo_Generic15363_Featured-Image.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"Meredith Ringel Morris","formattedDate":"March 31, 2020","formattedExcerpt":"Image descriptions are vital to making digital content fully accessible to people who are blind or have low vision. 
However, as past research has shown, content authors often leave out these descriptions\u2014also known as \u201calternative text\u201d or \u201calt text\u201d\u2014for images on the web and on&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/646776","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/38838"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=646776"}],"version-history":[{"count":24,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/646776\/revisions"}],"predecessor-version":[{"id":646950,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/646776\/revisions\/646950"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/646953"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=646776"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=646776"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=646776"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=646776"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=646776"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.mic
rosoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=646776"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=646776"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=646776"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=646776"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=646776"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=646776"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}