{"id":932478,"date":"2023-06-06T10:00:55","date_gmt":"2023-06-06T17:00:55","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=932478"},"modified":"2025-01-09T08:34:03","modified_gmt":"2025-01-09T16:34:03","slug":"asl-citizen","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/","title":{"rendered":"ASL Citizen"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"3840\" height=\"1920\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred.png\" class=\"attachment-full size-full\" alt=\"A grid of screenshots from the ASL Citizen dataset, showing different people performing different signs in American Sign Language.\" style=\"object-position: 44% 41%\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred.png 3840w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-300x150.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-1024x512.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-768x384.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-1536x768.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-2048x1024.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/ASLCitizen_grid_blurred-240x120.png 240w\" sizes=\"auto, (max-width: 3840px) 100vw, 3840px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"asl-citizen\">ASL Citizen<\/h1>\n\n\n\n<p>A Community-sourced Dataset for Advancing Isolated Sign Language Recognition<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"yt-consent-placeholder\" role=\"region\" aria-label=\"Video playback requires cookie consent\" data-video-id=\"EohCVAZ1UwE\" data-poster=\"https:\/\/img.youtube.com\/vi\/EohCVAZ1UwE\/maxresdefault.jpg\"><iframe aria-hidden=\"true\" tabindex=\"-1\" title=\"Overview\" width=\"500\" height=\"281\" data-src=\"https:\/\/www.youtube-nocookie.com\/embed\/EohCVAZ1UwE?feature=oembed&rel=0&enablejsapi=1\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><div class=\"yt-consent-placeholder__overlay\"><button class=\"yt-consent-placeholder__play\"><svg width=\"42\" height=\"42\" 
Signed languages are the primary languages of about [70 million D/deaf people worldwide](http://wfdeaf.org/our-work/). Despite their importance, existing information and communication technologies are designed primarily for written or spoken language. Automated solutions might help address these accessibility gaps, but the state of sign language modeling is far behind that of spoken language modeling, largely due to a lack of appropriate training data. Prior datasets suffer from a combination of small size, limited diversity, unrealistic recording settings, poor labels, and lack of consent for video use.

To help advance the state of sign language modeling, we created ASL Citizen, the first crowdsourced isolated sign language dataset, containing about 84k videos of 2.7k distinct signs from American Sign Language (ASL). It is the largest Isolated Sign Language Recognition (ISLR) dataset to date. Beyond its size, and unlike prior datasets, it contains everyday signers in everyday recording scenarios, and was collected with consent from each contributor under IRB approval. Deaf research team members were involved throughout.

This dataset is released alongside [our paper](https://arxiv.org/abs/2304.05934), which reframes ISLR as a dictionary retrieval task and establishes state-of-the-art baselines. In dictionary retrieval, someone sees or thinks of a sign that they would like to look up; they repeat the sign in front of an everyday (RGB) camera; and an ISLR algorithm returns a list of signs closest to the demonstrated sign. The results list may be accompanied by sign definitions in text or sign language video. This framing grounds ISLR research in a meaningful, real-world application. Our baselines leverage existing appearance- and pose-based techniques, and with our dataset improve the state of the art in ISLR from about 32% to 62% accuracy.
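For readers curious what a pose-based featurization looks like in practice, here is a minimal sketch of per-frame keypoint extraction with MediaPipe Holistic. It is an illustration only, not the paper's exact pipeline; the video path and downstream use are placeholders.

```python
# Sketch: per-frame pose and hand keypoint extraction, one example of the kind
# of pose-based featurization used by ISLR baselines. Illustrative only.
import cv2
import mediapipe as mp
import numpy as np

def _points(landmarks, n):
    """Landmark list -> (n, 3) array of (x, y, z); zeros if nothing detected."""
    if landmarks is None:
        return np.zeros((n, 3))
    return np.array([[p.x, p.y, p.z] for p in landmarks.landmark])

def extract_keypoints(video_path):
    """Return (frames, 75, 3): 33 body + 21 left-hand + 21 right-hand landmarks."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            result = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(np.concatenate([
                _points(result.pose_landmarks, 33),
                _points(result.left_hand_landmarks, 21),
                _points(result.right_hand_landmarks, 21),
            ]))
    cap.release()
    return np.stack(frames)
```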
This project was conducted at Microsoft Research with collaborators at multiple organizations.

- Microsoft: Danielle Bragg (PI), Mary Bellard, Hal Daumé III, Alex Lu, Vanessa Milan, Fyodor Minakov, Paul Oka, Philip Rosenfield, Chinmay Singh, William Thies
- Boston University: Lauren Berger, Naomi Caselli, Miriam Goldberg, Hannah Goldblatt, Kriston Pumphrey
- University of Washington: Aashaka Desai, Richard Ladner
- Rochester Institute of Technology: Abraham Glasser

**Dataset License:** Please see the supporting tab. If you are interested in commercial use, please contact [ASL_Citizen@microsoft.com](mailto:ASL_Citizen@microsoft.com).

**Dataset Download:**

To download via web interface, please visit: [Download ASL Citizen from the Official Microsoft Download Center](https://www.microsoft.com/en-us/download/details.aspx?id=105253)

To download via command line, please execute: `wget https://download.microsoft.com/download/b/8/8/b88c0bae-e6c1-43e1-8726-98cf5af36ca4/ASL_Citizen.zip`

**Open-source Repo:** [https://github.com/microsoft/ASL-citizen-code](https://github.com/microsoft/ASL-citizen-code)

**Citation:** If you use this dataset in your work, please cite [our paper](https://arxiv.org/abs/2304.05934).

```bibtex
@article{desai2023asl,
  title={ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition},
  author={Desai, Aashaka and Berger, Lauren and Minakov, Fyodor O and Milan, Vanessa and Singh, Chinmay and Pumphrey, Kriston and Ladner, Richard E and Daum{\'e} III, Hal and Lu, Alex X and Caselli, Naomi and Bragg, Danielle},
  journal={arXiv preprint arXiv:2304.05934},
  year={2023}
}
```

**Acknowledgements:** We are deeply grateful to all community members who participated in this dataset project.

**Video:** [Deaf Culture](https://www.youtube.com/watch?v=9B3-lABCvgY)
Deaf and hard-of-hearing (DHH) individuals communicate in many ways, including through sign language. Along with being an accessible modality for DHH individuals, signed languages are culturally significant, forming a cornerstone of shared experience, cultural identity, and institutions for Deaf communities worldwide.

American Sign Language (ASL) is the primary sign language used in North America (and several other parts of the world). Like other signed languages, signs in ASL are composed of phonological elements including handshapes, hand and body movements, hand location, and facial expressions. The vocabulary is large, and complex rules govern how signs are put together to make sentences. These rules for combining signs — the phonology, morphology, and syntax — are rich, and substantially different from those of English. Sign execution also varies among signers, across contexts, and through regions and dialects. ASL is a complete, natural language in its own right.

Signed languages play a central role in Deaf cultures and identity. Groups of people who primarily communicate in a signed language form distinct cultures as sociolinguistic minorities within the broader hearing majority. Within these communities, Deafness is a proud cultural identity. Despite the richness of signed languages and Deaf cultures, Deaf communities have a history of marginalization and oppression by the hearing majority, and many harmful misconceptions and biases about signed languages and Deaf people persist within hearing communities. For example, education systems have suppressed the use of signed languages, to the detriment of many deaf students.

Given this context, it is particularly important that sign language technologies be developed in partnership with Deaf communities and with an understanding of Deaf culture and signed languages. To this end, we involved Deaf collaborators in key roles at every step of this project, including conception, recruitment, participation, analysis, and dissemination.
We encourage those using this dataset to educate themselves about Deaf culture and American Sign Language, in order to conduct research and build systems that are useful to Deaf community members while minimizing harms.

As an entry point to more information on Deaf cultures and sign languages, please check out the following resources:

- [The Deaf Community: An Introduction – National Deaf Center](https://nationaldeafcenter.org/resource-items/deaf-community-introduction/)
- [Community and Culture Frequently Asked Questions – National Association of the Deaf](https://www.nad.org/resources/american-sign-language/community-and-culture-frequently-asked-questions/)
- [World Federation of the Deaf](https://wfdeaf.org/)
- [Museum of Deaf History, Arts & Culture](https://www.museumofdeaf.org/)

**Video:** [Dataset Description](https://www.youtube.com/watch?v=2I7EU6uHmrI)

ASL Citizen is the first crowdsourced isolated sign language video dataset. It contains about 84k video recordings of 2.7k isolated signs from American Sign Language (ASL), and is about four times larger than prior single-sign datasets.
Our videos were recorded by 52 Deaf or hard-of-hearing signers through our novel sign language crowdsourcing platform, first proposed in our [prior work](https://dl.acm.org/doi/abs/10.1145/3555627) exploring crowdsourcing for sign language video collection, and enhanced here for scalability. Because the data is crowdsourced, it contains examples of everyday signers in everyday environments, providing more representative data for machine learning models intended to generalize in real-world settings, such as Isolated Sign Language Recognition (ISLR) models deployed for applications like sign language dictionary retrieval.

On the platform, signers contributed videos for two simultaneous purposes: 1) to help build a community-sourced dictionary, and 2) to contribute to a dataset for research purposes. This dual-purpose design creates a direct community resource while also enabling longer-term research. On the website, participants were prompted by a series of videos of isolated signs (demonstrated by a highly proficient ASL model, whom we refer to as the "seed signer"), and recorded their own version of each. This setup allowed us to capture a range of backgrounds and lighting conditions, similar to those present in everyday dictionary lookup queries. Our task design also let us automatically label each video with the prompted sign, largely eliminating the post-hoc labelling challenges that have limited large sign language data collections in the past. Finally, this setup enabled participants to engage in a full consent process, unlike past scraped sign language video collections. Table 1 below provides a comparison to past datasets.

The sign vocabulary we use is taken from [ASL-LEX](https://asl-lex.org/download.html), which also provides detailed linguistic analysis of each sign. The signs in our corpus map onto the signs in the ASL-LEX linguistic corpus via a unique identifier token (referred to as the sign "Code" in ASL-LEX resources).
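As an illustration of that mapping, here is a minimal pandas sketch joining dataset glosses to ASL-LEX linguistic features. The file paths and column names are assumptions for illustration; consult the released metadata and the ASL-LEX download for the actual schemas.

```python
# Sketch: join ASL Citizen gloss labels to ASL-LEX linguistic features.
# File paths and column names are hypothetical -- check the actual schemas.
import pandas as pd

videos = pd.read_csv("ASL_Citizen/splits/train.csv")  # assumed columns: Participant ID, Video file, Gloss, ASL-LEX Code
asllex = pd.read_csv("asl-lex/signdata.csv")          # ASL-LEX download; assumed to carry a "Code" column

merged = videos.merge(asllex, left_on="ASL-LEX Code", right_on="Code", how="left")
print(merged.head())  # each video row now carries the sign's phonological features
```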
| Dataset | Vocab size | Videos | Videos/sign | Signers | Collection | Consent |
|---|---|---|---|---|---|---|
| RWTH-BOSTON-50 | 50 | 483 | 9.7 | 3 Deaf | Lab | ✓ |
| Purdue RVL-SLLL | 39 | 546 | 14.0 | 14 Deaf | Lab | ✓ |
| Boston ASLLVD | 2,742 | 9,794 | 3.6 | 6 Deaf | Lab | ✓ |
| WLASL-2000 | 2,000 | 21,083 | 10.5 | 119 Unknown | Scraped | ✗ |
| **ASL Citizen** | 2,731 | 83,399 | 30.5 | 52 Deaf/HH | Crowd | ✓ |

*Table 1: Properties of existing ISLR datasets for ASL compared to our new dataset (ASL Citizen, last row).*

We release our dataset along with training, validation, and test splits (shown in Table 2 below). Each dataset participant has been assigned to exactly one split. These signer-independent splits provide a setup for methods development that aligns more closely with real-world recognition applications, where users querying a model are unlikely to have appeared in the training data. The distribution of signers across splits was chosen to balance the female-male gender ratio. We also provide generalized user demographics and gloss metadata to support further analysis. Please note that the seed signer is included in the dataset as P52.

| | Train | Val | Test |
|---|---|---|---|
| Users | 35 | 6 | 11 |
| Videos | 40,154 | 10,304 | 32,941 |
| User distribution | 60% F | 83% F | 55% F |
| Video distribution | 54% F | 71% F | 55% F |

*Table 2: Description of our splits: train, validation (val), and test. Splits are user-independent and designed to have roughly comparable gender breakdowns.*
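To make the signer-independent property concrete, here is a small sketch that loads the released split files and checks that no participant appears in more than one split. The split file names and the participant column are assumptions based on the description above; adjust them to the released file layout.

```python
# Sketch: verify that the train/val/test splits are signer-independent.
# Split file names and the participant column are assumed -- adjust as needed.
import pandas as pd

splits = {name: pd.read_csv(f"ASL_Citizen/splits/{name}.csv")
          for name in ("train", "val", "test")}
signers = {name: set(df["Participant ID"]) for name, df in splits.items()}

# No signer should appear in two splits.
assert signers["train"].isdisjoint(signers["val"])
assert signers["train"].isdisjoint(signers["test"])
assert signers["val"].isdisjoint(signers["test"])

for name, df in splits.items():
    print(f"{name}: {len(df)} videos from {len(signers[name])} signers")
```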
The video counts displayed in publications and on this page may differ slightly. This is due to small dataset updates made during preparation and analysis (e.g., removing a small number of videos that displayed an error message during webcam recording). All collection and release procedures were reviewed and approved by our ethics review board and IRB.

| Version | # Videos | # Signs | # Signers | Date, Publication(s) |
|---|---|---|---|---|
| Version 0.9 | 83,912 | 2,731 | 52 | April 2023, arXiv initial publication |
| Version 1.0 | 83,399 | 2,731 | 52 | June 2023 |

*Table 3: List of dataset versions.*

For additional details on the dataset, please see the datasheet in the supporting tab, and check out our publications.

**Video:** [Recommended Use](https://www.youtube.com/watch?v=6ffm3mjwmZw)

This dataset was designed primarily for work on isolated sign language recognition (ISLR), and within that space we recommend using it for the task of dictionary retrieval. We define video-based dictionary retrieval as: given a video of a person demonstrating a single sign through a webcam, the system retrieves a ranked list of dictionary entries that match that sign. This task is useful for creating reliable ASL-to-ASL or ASL-to-English dictionaries, which are essential tools for language learners and users.
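To make the task concrete, here is a minimal sketch of the retrieval step: rank dictionary signs by the similarity between an embedding of the query video and per-sign reference embeddings. The embedding function is a placeholder for whatever ISLR model is trained on the dataset, not a specific released component.

```python
# Sketch: dictionary retrieval as nearest-neighbor ranking over sign embeddings.
# `embed_video` is a placeholder for a trained ISLR model's feature extractor.
import numpy as np

def retrieve(query_video, gloss_embeddings, embed_video, top_k=10):
    """Return the top_k glosses whose reference embeddings are closest to the query.

    gloss_embeddings: dict mapping gloss -> reference embedding (np.ndarray).
    """
    q = embed_video(query_video)
    q = q / np.linalg.norm(q)
    scores = {
        gloss: float(np.dot(q, e / np.linalg.norm(e)))  # cosine similarity
        for gloss, e in gloss_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```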
We caution against using this dataset to model continuous signing by tokenizing sequences of signs that might map to our vocabulary. Continuous signing (full sentences and longer signed content) introduces many grammatical and structural complexities not present in isolated signs: signs may be modulated, co-articulation effects arise, and context changes the meaning of signs. Many of these complexities are also absent from spoken/written languages. At a minimum, this dataset would need to be used in conjunction with other datasets and/or domain knowledge about sign language in order to tackle continuous recognition or translation.

We ask that this dataset be used with an aim of making the world more equitable and just for deaf people, and with a commitment to do no harm. In that spirit, this dataset should not be used to develop technology that purports to replace sign language interpreters, fluent signing educators, and/or other hard-won accommodations for deaf people. We also ask that users of this dataset make no attempt to identify participants, or to use the dataset for applications that might exploit participant identity or appearance, including (but not limited to) facial recognition, deepfakes, or identification of sensitive attributes like race.

For whichever application you choose, we recommend using this data with meaningful involvement from Deaf community members at every step. As we describe in our linked paper, research and development of sign language technologies that involves Deaf community members in leadership roles with decision-making authority increases the quality of the work, and can help ensure that technologies are relevant and wanted. Historically, projects developed without meaningful Deaf involvement have not been well received and have damaged relationships between technologists and Deaf communities.

Please see the links below for an entry point to more information about respectful sign language technology development.

- [Disability Dongle](https://blog.castac.org/2022/04/disability-dongle/) – Liz Jackson, Alex Haagaard, Rua Williams
- [SignAloud Open Letter](http://depts.washington.edu/asluw/SignAloud-openletter.pdf) – Lance Forshay, Kristi Winter, Emily M. Bender
- [Is "good enough" good enough? Ethical and responsible development of sign language technologies](https://aclanthology.org/2021.mtsummit-at4ssl.2/) – Maartje De Meulder
- [Nothing About Us Without Us: Disability Oppression and Empowerment](https://books.google.com/books?id=ohqff8DBt9gC) – James I. Charlton
- [The FATE Landscape of Sign Language AI Datasets: An Interdisciplinary Perspective](https://www.microsoft.com/en-us/research/publication/the-fate-landscape-of-sign-language-ai-datasets-an-interdisciplinary-perspective/) – Danielle Bragg, Naomi Caselli, Julie A. Hochgesang, Matt Huenerfauth, Leah Katz-Hernandez, Oscar Koller, Raja Kushalnagar, Christian Vogler, Richard E. Ladner

Dataset contributors added videos for two simultaneous purposes: 1) to create a dataset to help advance research, as described in the other tabs, and 2) to contribute to a community-sourced dictionary showcasing the signing community's diversity. By creating this dictionary resource, we provide immediate and direct benefits to the signing community, while also pursuing longer-term benefits derived from research.

Please check out the community-sourced dictionary here: [https://community.aslgames.org/](https://community.aslgames.org/)

## MICROSOFT RESEARCH LICENSE TERMS

**IF YOU LIVE IN THE UNITED STATES, PLEASE READ THE "BINDING ARBITRATION AND CLASS ACTION WAIVER" SECTION BELOW. IT AFFECTS HOW DISPUTES ARE RESOLVED.**

These license terms are an agreement between you and Microsoft Corporation (or one of its affiliates). They apply to the source code, object code, or data (collectively "Materials") that accompany this license. IF YOU COMPLY WITH THESE LICENSE TERMS, YOU HAVE THE RIGHTS BELOW. BY USING THE MATERIALS, YOU ACCEPT THESE TERMS.

**1) INSTALLATION AND USE RIGHTS to the Materials.**

Subject to the terms of this agreement, you have the below rights, if applicable, to use the Materials solely for non-commercial, non-revenue-generating, research purposes:

**a) Source Code.** If source code is included, you may use and modify the source code, but you may not distribute the source code.

**b) Object Code.** If object code is included, you may use the object code, but you may not distribute the object code.
**c) Data.** If data is included, you may use and modify the data, but your use and modification must be consistent with the consent under which the data was provided and/or gathered, and you may not distribute the data or your modifications to the data.

**2) SCOPE OF LICENSE.** The Materials are licensed, not sold. Microsoft reserves all other rights. Unless applicable law gives you more rights despite this limitation, you will not (and have no right to):

a) work around any technical limitations in the Materials that only allow you to use it in certain ways;

b) reverse engineer, decompile or disassemble the Materials;

c) remove, minimize, block, or modify any notices of Microsoft or its suppliers in the Materials;

d) use the Materials in any way that is against the law or to create or propagate malware; or

e) share, publish, distribute or lend the Materials, provide the Materials as a stand-alone hosted solution for others to use, or transfer the Materials or this agreement to any third party.

**3) PERSONAL DATA.** If the data (set forth in Section 1(c) above) includes or is found to include any data that enables any ability to identify an individual ("Personal Data"), you will not use such Personal Data for any purpose other than was authorized and consented to by the data subject/research participant. You will not use Personal Data to contact any person. You will keep Personal Data in strict confidence. You will not share any Personal Data that is collected or in your possession with any third party for any reason and as required under the original consent agreement. Further, you will destroy the Personal Data and any backup or copies immediately upon the completion of your research.

**4) LICENSE TO MICROSOFT.** Notwithstanding the limitations in Section 1, you may distribute your modifications back to Microsoft, and if you do provide Microsoft with modifications of the Materials, you hereby grant Microsoft, without any restrictions or limitations, a non-exclusive, perpetual, irrevocable, royalty-free, assignable and sub-licensable license, to reproduce, publicly perform or display, install, use, modify, post, distribute, make and have made, sell and transfer such modifications and derivatives for any purpose.

**5) PUBLICATION.** You may publish (or present papers or articles) on your results from using the Materials, provided that no material or substantial portion of the Materials is included in any such publication or presentation.

**6) FEEDBACK.** Any feedback about the Materials provided by you to us is voluntarily given, and Microsoft shall be free to use the feedback as it sees fit without obligation or restriction of any kind, even if the feedback is designated by you as confidential. Such feedback shall be considered a contribution and licensed to Microsoft under the terms of Section 4 above.
**7) EXPORT RESTRICTIONS.** You must comply with all domestic and international export laws and regulations that apply to the Materials, which include restrictions on destinations, end users, and end use. For further information on export restrictions, visit aka.ms/exporting.

**8) SUPPORT SERVICES.** Microsoft is not obligated under this agreement to provide any support services for the Materials. Any support provided is "as is", "with all faults", and without warranty of any kind.

**9) BINDING ARBITRATION AND CLASS ACTION WAIVER. This Section applies if you live in (or, if a business, your principal place of business is in) the United States.** If you and Microsoft have a dispute, you and Microsoft agree to try for 60 days to resolve it informally. If you and Microsoft can't, you and Microsoft agree to **binding individual arbitration before the American Arbitration Association** under the Federal Arbitration Act ("FAA"), and **not to sue in court in front of a judge or jury**. Instead, a neutral arbitrator will decide. **Class action lawsuits, class-wide arbitrations, private attorney-general actions,** and any other proceeding where someone acts in a representative capacity **are not allowed**; nor is combining individual proceedings without the consent of all parties. The complete Arbitration Agreement contains more terms and is at aka.ms/arb-agreement-1. You and Microsoft agree to these terms.

**10) ENTIRE AGREEMENT.** This agreement, and any other terms Microsoft may provide for supplements, updates, or third-party applications, is the entire agreement for the Materials.

**11) APPLICABLE LAW AND PLACE TO RESOLVE DISPUTES.** If you acquired the Materials in the United States or Canada, the laws of the state or province where you live (or, if a business, where your principal place of business is located) govern the interpretation of this agreement, claims for its breach, and all other claims (including consumer protection, unfair competition, and tort claims), regardless of conflict of laws principles, except that the FAA governs everything related to arbitration. If you acquired the Materials in any other country, its laws apply, except that the FAA governs everything related to arbitration. If U.S. federal jurisdiction exists, you and Microsoft consent to exclusive jurisdiction and venue in the federal court in King County, Washington for all disputes heard in court (excluding arbitration). If not, you and Microsoft consent to exclusive jurisdiction and venue in the Superior Court of King County, Washington for all disputes heard in court (excluding arbitration).

**12) CONSUMER RIGHTS; REGIONAL VARIATIONS.** This agreement describes certain legal rights. You may have other rights, including consumer rights, under the laws of your state, province, or country. Separate and apart from your relationship with Microsoft, you may also have rights with respect to the party from which you acquired the Materials. This agreement does not change those other rights if the laws of your state, province, or country do not permit it to do so.
For example, if you acquired the Materials in one of the below regions, or mandatory country law applies, then the following provisions apply to you:

**a) Australia.** You have statutory guarantees under the Australian Consumer Law, and nothing in this agreement is intended to affect those rights.

**b) Canada.** If you acquired this software in Canada, you may stop receiving updates by turning off the automatic update feature, disconnecting your device from the Internet (if and when you re-connect to the Internet, however, the Materials will resume checking for and installing updates), or uninstalling the Materials. The product documentation, if any, may also specify how to turn off updates for your specific device or software.

**c) Germany and Austria.**

**i. Warranty.** The properly licensed software will perform substantially as described in any Microsoft materials that accompany the Materials. However, Microsoft gives no contractual guarantee in relation to the licensed software.

**ii. Limitation of Liability.** In case of intentional conduct, gross negligence, claims based on the Product Liability Act, as well as in case of death or personal or physical injury, Microsoft is liable according to the statutory law.

Subject to the foregoing clause (ii), Microsoft will only be liable for slight negligence if Microsoft is in breach of such material contractual obligations, the fulfillment of which facilitates the due performance of this agreement, the breach of which would endanger the purpose of this agreement, and the compliance with which a party may constantly trust in (so-called "cardinal obligations"). In other cases of slight negligence, Microsoft will not be liable for slight negligence.

**13) DISCLAIMER OF WARRANTY.** THE MATERIALS ARE LICENSED "AS IS." YOU BEAR THE RISK OF USING THEM. MICROSOFT GIVES NO EXPRESS WARRANTIES, GUARANTEES, OR CONDITIONS. TO THE EXTENT PERMITTED UNDER APPLICABLE LAWS, MICROSOFT EXCLUDES ALL IMPLIED WARRANTIES, INCLUDING MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT.

**14) LIMITATION ON AND EXCLUSION OF DAMAGES.** IF YOU HAVE ANY BASIS FOR RECOVERING DAMAGES DESPITE THE PRECEDING DISCLAIMER OF WARRANTY, YOU CAN RECOVER FROM MICROSOFT AND ITS SUPPLIERS ONLY DIRECT DAMAGES UP TO U.S. $5.00.
YOU CANNOT RECOVER ANY OTHER DAMAGES, INCLUDING CONSEQUENTIAL, LOST PROFITS, SPECIAL, INDIRECT OR INCIDENTAL DAMAGES.

This limitation applies to (a) anything related to the Materials, services, content (including code) on third-party Internet sites, or third-party applications; and (b) claims for breach of contract, warranty, guarantee, or condition; strict liability, negligence, or other tort; or any other claim; in each case to the extent permitted by applicable law.

It also applies even if Microsoft knew or should have known about the possibility of the damages. The above limitation or exclusion may not apply to you because your state, province, or country may not allow the exclusion or limitation of incidental, consequential, or other damages.

## Motivation

**For what purpose was the dataset created?** Was there a specific task in mind? Was there a specific gap that needed to be filled? Please provide a description.

The dataset was created to help enable research on isolated sign language recognition (ISLR), i.e., recognizing individual signs from video clips, and sign language modeling more generally.

Specifically, we frame ISLR as a dictionary retrieval task: given a self-recorded video of a user performing a single sign, we aim to retrieve the correct sign from a sign language dictionary. This dataset was created with this framing in mind, with the intent of grounding ISLR research in a practical application useful to the Deaf community. While we believe this dataset is suited for methods development for ISLR in general, it specifically contains signs in American Sign Language (ASL).

In designing our collection mechanism, we sought to address limitations of prior ISLR datasets. Previous datasets have been limited in terms of number of videos, vocabulary size (i.e., number of signs contained), real-world recording settings, presence and reliability of labels, Deaf representation, and/or number of contributors. Some past datasets (in particular scraped datasets) have also included videos without explicit consent from the video creators or the signers in the videos.

**Who created this dataset (e.g., which team, research group) and on behalf of which entity (e.g., company, institution, organization)?**

This dataset was created by Microsoft Research in collaboration with Boston University. Each organization's involvement in collection is detailed below.
Microsoft: platform design; platform engineering; primary ethics board (IRB) review of collection procedures (review of record); additional compliance review of and guidance for the platform (e.g., privacy, security); platform maintenance and debugging; technical support for participants; hosting of collection infrastructure (website, database, videos, backups); funding for participant compensation; data processing and cleaning; ethics and compliance board review of the dataset release (e.g., privacy, data cleaning, metadata); hosting of released assets (dataset, code, other supplementary materials).

Boston University: platform feedback; seed sign recordings; secondary IRB review of collection procedures; participant recruitment; answering or redirecting participant questions; procurement and distribution of participant compensation.

**Who funded the creation of the dataset?** If there is an associated grant, please provide the name of the grantor and the grant name and number.

Microsoft primarily funded the creation of this dataset. Microsoft funded building and maintaining the collection platform, data storage and processing, participant compensation, and all time spent on the Microsoft activities listed above.

Boston University funded all time spent on the Boston University activities listed above. Support was provided in part by National Science Foundation grants BCS-1625954 and BCS-1918556 to Karen Emmorey and Zed Sehyr, BCS-1918252 and BCS-1625793 to Naomi Caselli, and BCS-1625761 and BCS-1918261 to Ariel Cohen-Goldberg. Additional funding came from the National Institutes of Health National Institute on Deafness and Other Communication Disorders and the Office of Behavioral and Social Science Research under Award Number 1R01DC018279.

**Any other comments?**

None.

## Composition

**What do the instances that comprise the dataset represent (e.g., documents, photos, people, countries)?** Are there multiple types of instances (e.g., movies, users, and ratings; people and interactions between them; nodes and edges)? Please provide a description.

The instances are self-recorded videos of participants performing individual signs in ASL. Examples of still frames from this dataset are shown in our paper publication. The distribution of video lengths is shown in Figure 1.

**How many instances are there in total (of each type, if appropriate)?**

There are 83,399 video instances. In total, this data represents videos from 52 participants over a vocabulary of 2,731 signs in ASL. On average, there are 30.5 videos for each sign and 1,604 videos per participant. The distribution of videos per participant is bimodal because we compensated participants for up to 3,000 videos: for 22 participants, the dataset includes 2,992 ± 16 videos; for the remaining 30 participants, it includes an average of 586 videos. (These video counts are for dataset Version 1.0, our first publicly released version of the dataset, after processing and cleaning.)

**Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set?** If the dataset is a sample, then what is the larger set? Is the sample representative of the larger set (e.g., geographic coverage)? If so, please describe how this representativeness was validated/verified.
If it is not representative of the larger set, please describe why not (e.g., to cover a more diverse range of instances, because instances were withheld or unavailable).

The dataset contains a sample of single-sign videos, covering many fundamental ASL signs, demonstrated by a sample of Deaf and hard-of-hearing community members in everyday environments.

The ASL vocabulary was taken from ASL-LEX [2], a linguistically analyzed corpus of ASL vocabulary covering many fundamental ASL signs. Specifically, our dataset contains 2,731 distinct signs (or glosses).

We chose to adopt this vocabulary set because the detailed linguistic analysis of each sign complements the video set we provide, and allows for a richer set of uses for the videos. Due to ASL-LEX corpus updates over the course of data collection, six glosses in our dataset do not have corresponding linguistic information.

The videos themselves contain a sample of the ASL community executing these signs to webcams in home environments. This type of crowdsourced collection has the benefit of not restricting geographic proximity (thus potentially expanding diversity), and of capturing signers in their natural environments. Still, we recruited largely from our own Deaf community networks using snowball sampling. This type of convenience sampling can result in biases; for example, our sample of videos contains a high proportion of people who self-identified as female, compared to the population of ASL users at large.

**What data does each instance consist of? "Raw" data (e.g., unprocessed text or images) or features?** In either case, please provide a description.

Each instance consists of a video file in .mp4 format. Each instance also has an associated gloss (or English transliteration), which was the target value for the signer. These glosses are consistent with a previous lexical database, ASL-LEX [2], and thus can be mapped onto the standardized identifiers and phonological properties for the signs provided in that lexical database. Finally, each instance is associated with an anonymous user identifier, identifying which of the 52 participants performed the sign.
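For orientation, here is a minimal sketch of loading one instance (an .mp4 clip) together with its gloss and signer ID. The file layout and column names are assumptions for illustration; check the released dataset layout for the actual names.

```python
# Sketch: load one instance (an .mp4 clip) and its gloss label.
# Paths and column names are hypothetical -- check the released dataset layout.
import csv
import cv2

with open("ASL_Citizen/splits/train.csv", newline="") as f:
    row = next(csv.DictReader(f))  # first instance in the train split

cap = cv2.VideoCapture(f"ASL_Citizen/videos/{row['Video file']}")
frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)  # BGR uint8 arrays, shape (H, W, 3)
cap.release()

print(row["Gloss"], row["Participant ID"], len(frames))  # label, signer, clip length
```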
**Is there a label or target associated with each instance?** If so, please provide a description.

The label is the English gloss associated with each sign, as described above.

**Is any information missing from individual instances?** If so, please provide a description, explaining why this information is missing (e.g., because it was unavailable). This does not include intentionally removed information, but might include, e.g., redacted text.

All instances have complete information. However, we note that some users have blank fields in their demographic metadata. This is intentional: providing this information was entirely voluntary, and some users did not provide some fields.

**Are relationships between individual instances made explicit (e.g., users' movie ratings, social network links)?** If so, please describe how these relationships are made explicit.

Yes. Each instance is tagged with a user ID identifying which user performed the sign. This user ID can be further associated with a separate metadata file containing demographic information on each of the users, such as the self-identified gender of the signer. We do not analyze this demographic information in our manuscript, but provide it because it could be useful for studying fairness (and other research).

**Are there recommended data splits (e.g., training, development/validation, testing)?** If so, please provide a description of these splits, explaining the rationale behind them.

Yes. Instances are labeled as either "train" (training set), "val" (validation set), or "test" (test set), containing 40,154, 10,304, and 32,941 videos respectively. The data splits are stratified by user such that each user is unseen in the other data splits. These splits align with our dictionary retrieval task, because we expect users querying the dictionary to be unseen during training and model selection. Some participants contributed data over the entire vocabulary, while others contributed data only for a subset. To balance our test set metrics across the vocabulary, we assigned 11 participants who contributed 3,000 ± 5 videos (i.e., the maximum number of videos participants would be compensated for) to the test set. The other 11 participants who contributed 3,000 ± 5 videos, along with the 30 participants who contributed a smaller number of videos, were split between the train and val sets. We tried to balance gender identities across splits, as seen in Figure 2. Other than these factors, participants were randomly assigned to splits.

**Are there any errors, sources of noise, or redundancies in the dataset?** If so, please provide a description.

We implemented some filters for blank videos, videos not containing people, and videos without signing, as described above. However, these filters are basic, and may not have caught all videos with technical problems. Additionally, since these videos are self-recorded, not all users may perform the same sign for a gloss, since the same English gloss can sometimes refer to multiple signs (e.g., when there are regional variations of a sign, or when the English gloss is a homonym). We limited this issue by collecting data in an ASL-first process, where users watched a video of a sign prompt rather than reading an English gloss prompt, but in some instances users may still not follow the seed signer (e.g., when their regional variation of a sign is not documented in the dictionary, or when the seed signer uses an outdated sign for signs that rapidly evolve). It is also possible that contributors made mistakes in signing, or submitted erroneous videos that our filters did not catch.

**Is the dataset self-contained, or does it link to or otherwise rely on external resources (e.g., websites, tweets, other datasets)?** If it links to or relies on external resources, a) are there guarantees that they will exist, and remain constant, over time; b) are there official archival versions of the complete dataset (i.e., including the external resources as they existed at the time the dataset was created); c) are there any restrictions (e.g., licenses, fees) associated with any of the external resources that might apply to a future user?
Please provide descriptions of all external resources and any restrictions associated with them, as well as links or other access points, as appropriate.

The dataset is self-contained, but the gloss labels can optionally be mapped to ASL-LEX, which provides detailed linguistic analysis of each sign in the vocabulary we used. The linguistic analysis download can be found at [https://asl-lex.org/download.html](https://asl-lex.org/download.html), and information about funding and licenses for ASL-LEX can be found at [https://asl-lex.org/about.html](https://asl-lex.org/about.html).

**Does the dataset contain data that might be considered confidential (e.g., data that is protected by legal privilege or by doctor-patient confidentiality, data that includes the content of individuals' non-public communications)?** If so, please provide a description.

While the videos contain recordings of people signing, all contributors consented to participate in this dataset and agreed to terms of use for our web platform. The consent process provided detailed information about the project's purpose, and explained that the dataset would be released to the public for research purposes.

**Does the dataset contain data that, if viewed directly, might be offensive, insulting, threatening, or might otherwise cause anxiety?** If so, please describe why.

Generally, no. The videos reflect signs a viewer would be exposed to in everyday conversational ASL, taken from an established corpus of vocabulary. However, some of the vocabulary may refer to content that some find offensive (e.g., a person's private parts). In addition, because this database is a "snapshot" of the language at the time of curation, some signs may be outdated and refer to stereotypes (e.g., around identity) that have been phased out as the language has evolved and continues to evolve.

We also believe the chance of erroneous offensive content is extremely low. We recruited from trusted groups; manually vetted the first and last video submitted by each user on each date of submission to verify good-faith effort; passed all videos through libraries to detect and blur the appearance of third parties; and finally did a manual review of all videos. We conducted our review and cleaning iteratively, under close guidance from Microsoft's Ethics and Compliance team. We did not identify any offensive content in any of our reviews. All video blurring and omissions were done out of an abundance of care for our dataset participants (e.g., to remove a third party or personal content).
\n\n\n\n<p><strong>Does the dataset relate to people?<\/strong> If not, you may skip the remaining questions in this section.<\/p>\n\n\n\n<p>Yes.<\/p>\n\n\n\n<p><strong>Does the dataset identify any subpopulations (e.g., by age, gender)?<\/strong> If so, please describe how these subpopulations are identified and provide a description of their respective distributions within the dataset.<\/p>\n\n\n\n<p>No. We release general aggregated demographics as part of our paper publication, but do not release individual demographics, to help protect participant privacy. These aggregated demographics span gender, age, region, and years of ASL experience. Providing demographic data on the collection platform was fully voluntary (i.e. not required and not tied to compensation) and self-reported.<\/p>\n\n\n\n<p><strong>Is it possible to identify individuals (i.e., one or more natural persons), either directly or indirectly (i.e., in combination with other data) from the dataset?<\/strong> If so, please describe how.<\/p>\n\n\n\n<p>Yes. The videos contain uncensored faces and are generally filmed in the users&#8217; home environments. We chose not to censor user faces because facial expressions are critical linguistic components of ASL. Users provided consent for dataset release, and were able to delete videos or opt out of the dataset at any time prior to release.<\/p>\n\n\n\n<p><strong>Does the dataset contain data that might be considered sensitive in any way (e.g., data that reveals racial or ethnic origins, sexual orientations, religious beliefs, political opinions or union memberships, or locations; financial or health data; biometric or genetic data; forms of government identification, such as social security numbers; criminal history)?<\/strong> If so, please provide a description.<\/p>\n\n\n\n<p>Not directly, but some sensitive attributes about participants might be guessable from the videos (e.g. race, or relation to the Deaf community).<\/p>\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"collection-process\">Collection Process<\/h2>\n\n\n\n<p><strong>How was the data associated with each instance acquired?<\/strong> Was the data directly observable (e.g., raw text, movie ratings), reported by subjects (e.g., survey responses), or indirectly inferred\/derived from other data (e.g., part-of-speech tags, model-based guesses for age or language)? If data was reported by subjects or indirectly inferred\/derived from other data, was the data validated\/verified? If so, please describe how.<\/p>\n\n\n\n<p>Videos were self-recorded and contributed by participants through a crowdsourcing web platform. We built on a platform described in [1], with optimizations to support scale. Demographics could optionally be entered into the platform as part of a user profile. Please see the Supplementary Materials in our paper publication for a detailed description of the optimized design components and rationale.<\/p>\n\n\n\n<p><strong>What mechanisms or procedures were used to collect the data (e.g., hardware apparatus or sensor, manual human curation, software program, software API)?<\/strong> How were these mechanisms or procedures validated?<\/p>\n\n\n\n<p>Users contributed videos through a web platform that accessed the user&#8217;s webcam to facilitate recording within the website itself. Contributors used their own hardware for recording (e.g., webcams). This setup mirrors the everyday recording setup that future dictionary users would have when demonstrating a sign to look it up.<\/p>
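\n\n\n\n<p>For context, capturing a query with an everyday RGB camera requires no special hardware. The sketch below records a short clip with OpenCV; it is our illustration of this kind of setup, not the platform&#8217;s actual implementation, which recorded in the browser.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Sketch: record ~3 seconds (90 frames at 30 fps) from the default webcam.\nimport cv2\n\ncap = cv2.VideoCapture(0)\nwriter = cv2.VideoWriter('query.mp4', cv2.VideoWriter_fourcc(*'mp4v'),\n                         30.0, (640, 480))\n\nfor _ in range(90):\n    ok, frame = cap.read()\n    if not ok:\n        break\n    writer.write(cv2.resize(frame, (640, 480)))\n\ncap.release()\nwriter.release()<\/code><\/pre>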
\n\n\n\n<p><strong>If the dataset is a sample from a larger set, what was the sampling strategy (e.g., deterministic, probabilistic with specific sampling probabilities)?<\/strong><\/p>\n\n\n\n<p>N\/A<\/p>\n\n\n\n<p><strong>Who was involved in the data collection process (e.g., students, crowdworkers, contractors) and how were they compensated (e.g., how much were crowdworkers paid)?<\/strong><\/p>\n\n\n\n<p>A team of researchers, engineers, and a designer were involved in the data collection. The team designed, built, and maintained the platform, and also managed recruitment, participant engagement, and compensation. The team was compensated through salary, stipend, or contract payment.<\/p>\n\n\n\n<p>Data contributors were also compensated monetarily. The seed signer was paid to record the seed sign videos. The rest of the data was crowdsourced. For every 300 signs recorded, these participants received a $30 Amazon gift card, for up to 3,000 signs.<\/p>\n\n\n\n<p><strong>Over what timeframe was the data collected? Does this timeframe match the creation timeframe of the data associated with the instances (e.g., recent crawl of old news articles)?<\/strong> If not, please describe the timeframe in which the data associated with the instances was created.<\/p>\n\n\n\n<p>Collection of the seed sign videos ran from April to May 2021. Collection of the community replications ran from July 2021 to April 2022.<\/p>\n\n\n\n<p><strong>Were any ethical review processes conducted (e.g., by an institutional review board)?<\/strong> If so, please provide a description of these review processes, including the outcomes, as well as a link or other access point to any supporting documentation.<\/p>\n\n\n\n<p>Yes. The data collection was reviewed by the two collaborating institutions&#8217; Institutional Review Boards (IRBs) &#8212; Microsoft (primary, IRB of record #418) and Boston University. The platform itself and the dataset release also underwent additional Ethics and Compliance reviews by Microsoft.<\/p>\n\n\n\n<p><strong>Does the dataset relate to people?<\/strong> If not, you may skip the remaining questions in this section.<\/p>\n\n\n\n<p>Yes.<\/p>\n\n\n\n<p><strong>Did you collect the data from the individuals in question directly, or obtain it via third parties or other sources (e.g., websites)?<\/strong><\/p>\n\n\n\n<p>Videos were contributed by the individuals directly.<\/p>\n\n\n\n<p><strong>Were the individuals in question notified about the data collection?<\/strong> If so, please describe (or show with screenshots or other information) how notice was provided, and provide a link or other access point to, or otherwise reproduce, the exact language of the notification itself.<\/p>\n\n\n\n<p>Yes. When participants first visited our web platform, they engaged in a consent process, which provided detailed information about the procedures, benefits and risks, use of personal information, and other details about the project. In addition, the web platform provided an information page that explained the purpose of the project, a list of team members, and contact information. 
For the exact consent text, please visit <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/consent-form\/\">https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/consent-form\/<\/a>.<\/p>\n\n\n\n<p>In addition to the procedures described in the consent form, participants were prompted with instructions as they viewed prompt signs and recorded their own versions. Screenshots that include the task instructions are provided in Fig. 1 of [1].<\/p>\n\n\n\n<p><strong>Did the individuals in question consent to the collection and use of their data?<\/strong> If so, please describe (or show with screenshots or other information) how consent was requested and provided, and provide a link or other access point to, or otherwise reproduce, the exact language to which the individuals consented.<\/p>\n\n\n\n<p>Yes. (See answer and links above.)<\/p>\n\n\n\n<p><strong>If consent was obtained, were the consenting individuals provided with a mechanism to revoke their consent in the future or for certain uses?<\/strong> If so, please provide a description, as well as a link or other access point to the mechanism (if appropriate).<\/p>\n\n\n\n<p>Yes. Users could re-record videos, delete their videos from the collection, or withdraw from the dataset entirely at any time before public release.<\/p>\n\n\n\n<p><strong>Has an analysis of the potential impact of the dataset and its use on data subjects (e.g., a data protection impact analysis) been conducted?<\/strong> If so, please provide a description of this analysis, including the outcomes, as well as a link or other access point to any supporting documentation.<\/p>\n\n\n\n<p>Yes. A Data Protection Impact Analysis (DPIA) was conducted, including a detailed inventory of the data types collected and stored and the associated retention policy, and was successfully reviewed by Microsoft.<\/p>\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"preprocessing-cleaning-labelling\">Preprocessing\/cleaning\/labelling<\/h2>\n\n\n\n<p><strong>Was any preprocessing\/cleaning\/labeling of the data done (e.g., discretization or bucketing, tokenization, part-of-speech tagging, SIFT feature extraction, removal of instances, processing of missing values)?<\/strong> If so, please provide a description. If not, you may skip the remainder of the questions in this section.<\/p>\n\n\n\n<p>Yes. First, we automatically removed empty videos: those under 150 KB in size, or in which YOLOv3 [4] did not detect a person (~50 videos). We manually reviewed the first and last videos recorded by each participant on each day, plus random samples throughout, checking against a list of sensitive content provided by our ethics and compliance board. Three types of personal content were identified for redaction: another person, certificates, and religious symbols. To protect third parties, we used YOLOv3 to detect whether multiple people were present. For these videos and others with identified personal content, we blurred the background using MediaPipe holistic user segmentation. For one user, we blurred only a subset of pixels, since the personal content was reliably limited to a small area. We also removed the videos of one user, who recorded many videos without a sign in them, as well as videos showing a written error message. Finally, we manually re-reviewed all videos.<\/p>\n\n\n\n<p>In our reviews, we did not identify any inappropriate content or bad-faith efforts. In total, we blurred the background of 268 videos where a second person was detected automatically, 293 additional videos with sensitive content, and 32 additional videos with a person missed by the automatic detection. We blurred a small fixed range of pixels for 2,933 videos, and omitted 513 videos where the blurring was insufficient or where an error message (resulting from the data collection platform) appeared.<\/p>
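\n\n\n\n<p>The background blurring step can be approximated with off-the-shelf tools. The sketch below uses MediaPipe&#8217;s selfie-segmentation solution as a stand-in for the holistic user segmentation described above; treat it as an illustration of the approach rather than the exact pipeline we ran.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Sketch: keep the signer sharp and blur everything else in a frame.\nimport cv2\nimport mediapipe as mp\nimport numpy as np\n\nsegmenter = mp.solutions.selfie_segmentation.SelfieSegmentation(model_selection=1)\n\ndef blur_background(frame_bgr):\n    # Person-probability mask from MediaPipe (expects RGB input).\n    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)\n    person = segmenter.process(rgb).segmentation_mask &gt; 0.5\n    blurred = cv2.GaussianBlur(frame_bgr, (55, 55), 0)\n    # Where the mask says 'person', keep the original pixels.\n    return np.where(person[..., None], frame_bgr, blurred)<\/code><\/pre>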
\n\n\n\n<p><strong>Was the \u201craw\u201d data saved in addition to the preprocessed\/cleaned\/labeled data (e.g., to support unanticipated future uses)?<\/strong> If so, please provide a link or other access point to the \u201craw\u201d data.<\/p>\n\n\n\n<p>No, not publicly.<\/p>\n\n\n\n<p><strong>Is the software used to preprocess\/clean\/label the instances available?<\/strong> If so, please provide a link or other access point.<\/p>\n\n\n\n<p>No, but these procedures are easily reproducible using public software.<\/p>\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"uses\">Uses<\/h2>\n\n\n\n<p><strong>Has the dataset been used for any tasks already?<\/strong> If so, please provide a description.<\/p>\n\n\n\n<p>Yes. We provide supervised classification baselines in the manuscript, and show how these classifiers can be used to solve the dictionary retrieval problem.<\/p>\n\n\n\n<p><strong>Is there a repository that links to any or all papers or systems that use the dataset?<\/strong> If so, please provide a link or other access point.<\/p>\n\n\n\n<p>Yes. The link is available on our project page at <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/<\/a>.<\/p>\n\n\n\n<p><strong>What (other) tasks could the dataset be used for?<\/strong><\/p>\n\n\n\n<p>Many methods beyond supervised classification can be used to address the dictionary retrieval framing, including unsupervised learning, identification of linguistic features, and domain adaptation. To enable these approaches, we provide a larger-than-typical test set. Beyond our dictionary retrieval framing, this dataset could serve a number of purposes both within sign language computing and outside it, including pretraining for continuous sign language recognition, or motion tracking.<\/p>\n\n\n\n<p><strong>Is there anything about the composition of the dataset or the way it was collected and preprocessed\/cleaned\/labeled that might impact future uses?<\/strong> For example, is there anything that a future user might need to know to avoid uses that could result in unfair treatment of individuals or groups (e.g., stereotyping, quality of service issues) or other undesirable harms (e.g., financial harms, legal risks)? If so, please provide a description. Is there anything a future user could do to mitigate these undesirable harms?<\/p>\n\n\n\n<p>Our dataset collection centers a sociolinguistic minority and disability community (the Deaf community) that is already subject to misconceptions, stereotypes, and marginalization. Sign language is a critical cultural component of this community and must be handled respectfully. Some machine learning efforts on sign language proceed without recognition of these existing inequities and cultural practices, and promote harmful misconceptions (e.g. that sign languages are simple, or just signed versions of English), use offensive language or stereotypes (e.g. 
outdated terminology like &#8220;hearing impaired&#8221; or &#8220;deaf and dumb&#8221;), or simply exploit the language as a commodity or &#8220;toy problem&#8221; without engaging with the community. Some practices that we outline in our paper can avoid these harms: these include involving Deaf collaborators and community input in the work, and ensuring that they are compensated; using a critical problem framing that centers useful and culturally respectful applications (e.g. our dictionary retrieval framing); and ensuring that Deaf scholars are cited and their perspectives, concerns, and priorities are integrated into the design of machine learning algorithms.<\/p>\n\n\n\n<p><strong>Are there tasks for which the dataset should not be used?<\/strong> If so, please provide a description.<\/p>\n\n\n\n<p>We recommend using this data with meaningful involvement from Deaf community members in leadership roles with decision-making authority at every step, from conception to execution. As we describe in our linked paper, research and development of sign language technologies that involves Deaf community members increases the quality of the work, and can help to ensure technologies are relevant and wanted. Historically, projects developed without meaningful Deaf involvement have not been well received [3] and have damaged relationships between technologists and deaf communities.<\/p>\n\n\n\n<p>We ask that this dataset be used with an aim of making the world more equitable and just for deaf people, and with a commitment to &#8220;do no harm&#8221;. In that spirit, this dataset should not be used to develop technology that purports to replace sign language interpreters, fluent signing educators, and\/or other hard-won accommodations for deaf people.<\/p>\n\n\n\n<p>This dataset was designed primarily for work on isolated sign recognition; signing in continuous sentences\u2014like what is needed for translating between ASL and English\u2014is very different. In particular, continuous sign recognition cannot be accomplished by identifying a sequence of signs from a standard dictionary (e.g. by matching to the signs in our dataset), due to grammatical and structural differences in continuous signing arising from sign modulation, co-articulation effects, and contextual changes in the meaning of signs. At a minimum, this dataset would need to be used in conjunction with other datasets and\/or domain knowledge about sign language in order to tackle continuous recognition or translation.<\/p>\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"distribution\">Distribution<\/h2>\n\n\n\n<p><strong>Will the dataset be distributed to third parties outside of the entity (e.g., company, institution, organization) on behalf of which the dataset was created?<\/strong> If so, please provide a description.<\/p>\n\n\n\n<p>Yes. 
This dataset is released publicly, to help advance research on isolated sign language recognition.<\/p>\n\n\n\n<p><strong>How will the dataset be distributed (e.g., tarball on website, API, GitHub)?<\/strong> Does the dataset have a digital object identifier (DOI)?<\/p>\n\n\n\n<p>The dataset is publicly available for download through the Microsoft Download Center.<\/p>\n\n\n\n<p>To download via web interface, please visit: <a href=\"https:\/\/www.microsoft.com\/en-us\/download\/details.aspx?id=105253\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.microsoft.com\/en-us\/download\/details.aspx?id=105253<\/a><\/p>\n\n\n\n<p>To download via command line, please execute: <code>wget https:\/\/download.microsoft.com\/download\/b\/8\/8\/b88c0bae-e6c1-43e1-8726-98cf5af36ca4\/ASL_Citizen.zip<\/code><\/p>\n\n\n\n<p><strong>When will the dataset be distributed?<\/strong><\/p>\n\n\n\n<p>The dataset was released on 06\/12\/2023.<\/p>\n\n\n\n<p><strong>Will the dataset be distributed under a copyright or other intellectual property (IP) license, and\/or under applicable terms of use (ToU)?<\/strong> If so, please describe this license and\/or ToU, and provide a link or other access point to, or otherwise reproduce, any relevant licensing terms or ToU, as well as any fees associated with these restrictions.<\/p>\n\n\n\n<p>Yes, the dataset is published under a license that permits use for research purposes. The license is provided at <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/dataset-license\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/dataset-license\/<\/a>.<\/p>\n\n\n\n<p><strong>Have any third parties imposed IP-based or other restrictions on the data associated with the instances?<\/strong> If so, please describe these restrictions, and provide a link or other access point to, or otherwise reproduce, any relevant licensing terms, as well as any fees associated with these restrictions.<\/p>\n\n\n\n<p>No, there are no third-party restrictions on the data we release. 
However, the complementary phonological evaluations of the sign vocabulary in our dataset, previously published by ASL-LEX, are distributed under a CC BY-NC 4.0 license (see <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/asl-lex.org\/download.html\" target=\"_blank\" rel=\"noopener noreferrer\">https:\/\/asl-lex.org\/download.html<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>).<\/p>\n\n\n\n<p><strong>Do any export controls or other regulatory restrictions apply to the dataset or to individual instances?<\/strong> If so, please describe these restrictions, and provide a link or other access point to, or otherwise reproduce, any supporting documentation.<\/p>\n\n\n\n<p>No.<\/p>\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"maintenance\">Maintenance<\/h2>\n\n\n\n<p><strong>Who will be supporting\/hosting\/maintaining the dataset?<\/strong><\/p>\n\n\n\n<p>The dataset will be hosted on the Microsoft Download Center.<\/p>\n\n\n\n<p><strong>How can the owner\/curator\/manager of the dataset be contacted (e.g., email address)?<\/strong><\/p>\n\n\n\n<p>Please contact <a href=\"mailto:ASL_Citizen@microsoft.com\">ASL_Citizen@microsoft.com<\/a> with any questions.<\/p>\n\n\n\n<p><strong>Is there an erratum?<\/strong> If so, please provide a link or other access point.<\/p>\n\n\n\n<p>A public-facing website is associated with the dataset (see <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.microsoft.com\/en-us\/research\/project\/asl-citizen\/<\/a>). We will link to errata on this website if necessary.<\/p>\n\n\n\n<p><strong>Will the dataset be updated (e.g., to correct labeling errors, add new instances, delete instances)?<\/strong> If so, please describe how often, by whom, and how updates will be communicated to users (e.g., mailing list, GitHub)?<\/p>\n\n\n\n<p>If updates are necessary, we will update the dataset. We will release our dataset with a version number, to distinguish it from any future updated versions.<\/p>\n\n\n\n<p><strong>If the dataset relates to people, are there applicable limits on the retention of the data associated with the instances (e.g., were individuals in question told that their data would be retained for a fixed period of time and then deleted)?<\/strong> If so, please describe these limits and explain how they will be enforced.<\/p>\n\n\n\n<p>The dataset will be left up indefinitely, to maximize its utility to research. Participants were informed that their contributions might be released in a public dataset.<\/p>\n\n\n\n<p><strong>Will older versions of the dataset continue to be supported\/hosted\/maintained?<\/strong> If so, please describe how. If not, please describe how its obsolescence will be communicated to users.<\/p>\n\n\n\n<p>All versions of the dataset will be released with a version number on the Microsoft Download Center to enable differentiation.<\/p>\n\n\n\n<p><strong>If others want to extend\/augment\/build on\/contribute to the dataset, is there a mechanism for them to do so?<\/strong> If so, please provide a description. Will these contributions be validated\/verified? If so, please describe how. If not, why not? Is there a process for communicating\/distributing these contributions to other users? If so, please provide a description.<\/p>\n\n\n\n<p>We do not have a mechanism for others to contribute to our dataset directly. However, others could create comparable datasets by recording versions of the same signs (from ASL-LEX). Such a dataset could easily be combined with ours by indexing on the signs&#8217; unique identifiers.<\/p>
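\n\n\n\n<p>For example, because each sign in the vocabulary carries an ASL-LEX identifier, video-level metadata can be joined against the ASL-LEX download (or against a comparable dataset recorded over the same vocabulary) with an ordinary table merge. The sketch below assumes a shared &#8220;Code&#8221; key column and the file names shown; check both releases for the actual names.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Sketch: attach per-sign ASL-LEX features to videos via a shared identifier.\nimport pandas as pd\n\nvideos = pd.read_csv('train.csv')      # ASL Citizen metadata (assumed name)\nasllex = pd.read_csv('signdata.csv')   # ASL-LEX download (assumed name)\n\nmerged = videos.merge(asllex, on='Code', how='left')\nprint(merged.head())<\/code><\/pre>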
\n\n\n\n<p><strong>Any other comments?<\/strong><\/p>\n\n\n\n<p>None.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"503\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-1024x503.png\" alt=\"Histogram of video lengths. X-axis: lengths, Y-axis: video counts. The max is reached around 2.3-2.4, with about 3,500 videos.\" class=\"wp-image-949374\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-1024x503.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-300x147.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-768x377.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-1536x754.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-2048x1006.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/hist-240x118.png 240w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Figure 1: Histogram of video lengths in the ASL Citizen dataset.<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"613\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-1024x613.png\" alt=\"Histogram of Female (blue) and Male (orange) counts. X-axis: Train, Validation, Test. Y-axis: Video Count. Video count for Female is higher across all three settings. Training has the highest counts overall, then Test, then Validation.\" class=\"wp-image-949380\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-1024x613.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-300x180.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-768x460.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-1536x920.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-2048x1226.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/06\/vid_dist-240x144.png 240w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Figure 2: Distribution of videos across data splits, by gender.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"references\">References<\/h2>\n\n\n\n<p>[1] Danielle Bragg, Abraham Glasser, Fyodor Minakov, Naomi Caselli, and William Thies. Exploring Collection of Sign Language Videos through Crowdsourcing. <em>Proceedings of the ACM on Human-Computer Interaction<\/em>&nbsp;6.CSCW2 (2022): 1-24.<\/p>\n\n\n\n<p>[2] Naomi K Caselli, Zed Sevcikova Sehyr, Ariel M Cohen-Goldberg, and Karen Emmorey. ASL-LEX: A lexical database of American Sign Language. <em>Behavior Research Methods<\/em>&nbsp;49 (2017): 784-801.<\/p>\n\n\n\n<p>[3] Michael Erard. Why sign-language gloves don\u2019t help deaf people. <em>The Atlantic<\/em>&nbsp;9 (2017).<\/p>\n\n\n\n<p>[4] Joseph Redmon and Ali Farhadi. YOLOv3: An incremental improvement. 
<em>arXiv preprint arXiv:1804.02767<\/em>&nbsp;(2018).<\/p>\n\n\n\n\n\n<h2 class=\"wp-block-heading\" id=\"microsoft-research-project-participation-consent-form\">Microsoft Research Project Participation Consent Form<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"introduction\">INTRODUCTION<\/h3>\n\n\n\n<p>Thank you for deciding to volunteer in a Microsoft Corporation research project.&nbsp; You have no obligation to participate, and you may decide to terminate your participation at any time.&nbsp; You also understand that the researcher has the right to withdraw you from participation in the project at any time. Below is a description of the research project and your consent to participate.&nbsp; Read this information carefully. If you agree to participate, sign in the space provided.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"title-of-research-project\">TITLE OF RESEARCH PROJECT<\/h3>\n\n\n\n<p>ASL Dataset Community<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"principal-investigator\">Principal Investigator<\/h4>\n\n\n\n<p>Danielle Bragg<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"purpose\">PURPOSE<\/h3>\n\n\n\n<p>The purpose of this project is to collect sign language videos from volunteer contributors to advance sign language recognition, while fostering an American Sign Language (ASL) community online. The website falls under the category of &#8220;citizen science&#8221;, where people contribute for the purpose of advancing science or research. Contributors will be able to do three things: 1) record videos of themselves executing specific signs, 2) validate that other contributors executed signs correctly, and 3) explore the communal dataset.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"procedures\">PROCEDURES<\/h3>\n\n\n\n<p>During this project, the following will happen:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You will create a user profile, including a username, email address, and optional picture and demographics.<\/li>\n\n\n\n<li>You will then be able to do three main things on the website: 1) record videos of yourself signing, 2) validate that other contributors executed signs correctly, and 3) explore the communal dataset.<\/li>\n\n\n\n<li>For every 300 signs you record, we will send you a $30 Amazon gift card, for up to 3,000 signs. The gift card will be sent to the email address associated with your profile.<\/li>\n<\/ul>\n\n\n\n<p>Microsoft may document and collect information about your participation by storing your profile information, the videos you submit, your ratings of other contributors\u2019 videos, and any other interactions with the site.<\/p>\n\n\n\n<p>Approximately 60 participants will be involved in this study.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"personal-information-and-confidentiality\">PERSONAL INFORMATION AND CONFIDENTIALITY<\/h3>\n\n\n\n<p>Microsoft Research is ultimately responsible for determining the purposes and uses of your personal information.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Personal information we collect.<\/strong>&nbsp; During the project we may collect personal information about you such as image, likeness, email, age, gender, and ASL experience.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>How we use personal information.<\/strong>&nbsp; The personal information and other data collected during this project will be used primarily to perform research for purposes described in the introduction above. 
Such information and data, or the results of the research, may eventually be used to develop and improve our commercial products, services, or technologies.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Right of Publicity.<\/strong>&nbsp; By submitting video(s) of yourself, you confirm that you are the depicted person in the video(s) you submit, and you grant Microsoft an unrestricted, perpetual, worldwide, royalty-free, irrevocable license, with rights to assign and sublicense, to use your image and likeness for the research project above and in any related services, on a worldwide basis.<\/li>\n\n\n\n<li><strong>How we store and share your personal information.<\/strong>&nbsp; Your personal data will be stored for a period of up to 5 years from your last login. This project is a collaboration with Boston University, which will have access to collected data. In addition, we may release a dataset that includes videos and other demographics publicly to help advance research.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>How you can access and control your personal information.<\/strong>&nbsp; If you wish to review or copy any personal information you provided during the study, log in to your account to view, edit, or delete data from the live site. If you have any additional questions, please email the research team at: aslgames@microsoft.com.&nbsp; Please note that we will not be able to delete data that has already been shared publicly in a research dataset release. We will respond to questions or concerns within 30 days.<\/li>\n<\/ul>\n\n\n\n<p>For additional information on how Microsoft handles your personal information, please see the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/go.microsoft.com\/fwlink\/?LinkId=521839\">Microsoft Privacy Statement<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"research-results-feedback\">RESEARCH RESULTS & FEEDBACK<\/h3>\n\n\n\n<p>Microsoft will own all of the research data and analysis and other results (collectively \u201cResearch Results\u201d) generated from the information you provide and your participation in the research project. You may also provide suggestions, comments, or other feedback (\u201cFeedback\u201d) to Microsoft with respect to the research project. Feedback is entirely voluntary, and Microsoft shall be free to use, disclose, reproduce, license, or otherwise distribute and leverage the Feedback and Research Results.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"microsoft-and-confidentiality\">MICROSOFT AND CONFIDENTIALITY<\/h3>\n\n\n\n<p>The research project and information you learn by participating in the project is confidential to Microsoft.&nbsp; Sharing this confidential information with people other than those we\u2019ve identified above could negatively affect the scientific integrity of the research study and could even make it more difficult for Microsoft to develop new products based on the information obtained in this study. 
It is therefore important that you do not talk about the project outside of the study team (unless you are legally required to do so by a court or other government order).&nbsp; This does not apply if the information is general public knowledge or if you have a legal right to share the information.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"benefits-and-risks\">BENEFITS AND RISKS<\/h3>\n\n\n\n<p><strong>Benefits:<\/strong>&nbsp; The research team expects to collect videos of diverse signers from this project, which we hope will improve the accuracy of sign language recognition systems for diverse signers, for example enabling the creation of drive-through services that understand ASL. You will receive any public benefit that may come of these Research Results being shared with the greater scientific community.<\/p>\n\n\n\n<p><strong>Risks:<\/strong>&nbsp; During participation, you may experience discomfort at contributing videos of yourself. Because you will be able to view videos recorded by other contributors, it is also possible that you will view inappropriate or offensive content. To alleviate this risk, the website allows participants to flag inappropriate content, which will be reviewed by a moderator.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"future-use-of-your-identifiable-information\">FUTURE USE OF YOUR IDENTIFIABLE INFORMATION<\/h3>\n\n\n\n<p>Identifiers might be removed from your identifiable private information, and after such removal, the information could be used for future research studies or distributed to another investigator for future research studies without your (or your legally authorized representative\u2019s) additional informed consent.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"payment-for-participation\">PAYMENT FOR PARTICIPATION<\/h3>\n\n\n\n<p>For every 300 signs you record, we will send you a $30 Amazon gift card, for up to 3,000 signs. The gift card will be sent to the email address associated with your profile.<\/p>\n\n\n\n<p>Your data may be used to make new products, tests, or findings.&nbsp; These may have value and may be developed and owned by Microsoft and\/or others.&nbsp; If this happens, there are no plans to pay you.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"participation\">PARTICIPATION<\/h3>\n\n\n\n<p>Taking part in research is always a choice. If you decide to be in the study, you can change your mind at any time without affecting any rights, including payment, to which you would otherwise be entitled. 
If you decide to withdraw, you should contact the person in charge of this study, and also inform that person if you would like your personal information removed.<\/p>\n\n\n\n<p>Microsoft or the person in charge of this study may discontinue the study or your individual participation in the study at any time without your consent for reasons including:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>your failure to follow directions<\/li>\n\n\n\n<li>it is discovered that you do not meet study requirements<\/li>\n\n\n\n<li>it is in your best interest medically<\/li>\n\n\n\n<li>the study is canceled<\/li>\n\n\n\n<li>administrative reasons<\/li>\n<\/ul>\n\n\n\n<p>If you leave the study, the study staff will still be able to use the information they have already collected; however, you have the right to ask for it to be removed when you leave.<\/p>\n\n\n\n<p>You will be informed of any significant new findings that develop during the course of this study that might affect your willingness to continue participating.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"contact-information\">CONTACT INFORMATION<\/h3>\n\n\n\n<p>Should you have any questions concerning this project, please contact the research team at aslgames@microsoft.com.<\/p>\n\n\n\n<p>Should you have any questions about your rights as a research subject, please contact Microsoft Research Ethics Program Feedback at MSRStudyfeedback@microsoft.com.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"consent\">CONSENT<\/h3>\n\n\n\n<p>By clicking CONTINUE, you confirm that the study was explained to you, you had a chance to ask questions before beginning the study, and all your questions were answered satisfactorily. At any time, you may ask other questions. By clicking CONTINUE, you voluntarily consent to participate, and you do not give up any legal rights you have as a study participant.<\/p>\n\n\n\n<p>Please confirm your consent by clicking CONTINUE. If you wish, you may now save a copy of this consent form for future reference. On behalf of Microsoft, we thank you for your contribution and look forward to your research session.<\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>A Community-sourced Dataset for Advancing Isolated Sign Language Recognition Signed languages are the primary languages of about 70 million D\/deaf people worldwide (opens in new tab). 
Despite their importance, existing information and communication technologies are primarily designed for written or spoken language.&nbsp;Though automated solutions&nbsp;might help address such&nbsp;accessibility gaps, the state of sign language modeling is [&hellip;]<\/p>\n","protected":false},"featured_media":945612,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":true,"_classifai_error":"","footnotes":""},"research-area":[13556,13562,13563,13545,13554,13555],"msr-locale":[268875],"msr-impact-theme":[261667,261670],"msr-pillar":[],"class_list":["post-932478","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-research-area-data-platform-analytics","msr-research-area-human-language-technologies","msr-research-area-human-computer-interaction","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"2018-07-01","related-publications":[952002,1105437],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[955086,987693],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Alex Lu","user_id":41036,"people_section":"Related people","alias":"lualex"},{"type":"user_nicename","display_name":"Chinmay Singh","user_id":36750,"people_section":"Related people","alias":"chsingh"},{"type":"user_nicename","display_name":"Philip Rosenfield","user_id":37562,"people_section":"Related people","alias":"phrosenf"}],"msr_research_lab":[199563,199571],"msr_impact_theme":["Empowerment","Resilience"],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/932478","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":108,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/932478\/revisions"}],"predecessor-version":[{"id":1116720,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/932478\/revisions\/1116720"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/945612"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=932478"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=932478"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=932478"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=932478"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=932478"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}