Special Session Description
Speech technologies exist for many high-resource languages, and attempts are being made to reach the next billion users by building resources and systems for many more languages. In the past, the speech community has focused mainly on building monolingual systems that process speech in a single language. Multilingual communities pose special challenges for the design and development of speech processing systems. One of these challenges is code-switching, which is the switching between two or more languages at the conversation, utterance, and sometimes even word level.
In addition to conversational speech, code-switching is now found in text in social media, instant messaging and blogs in multilingual communities. Monolingual natural language and speech systems fail when they encounter code-switched speech and text. There is also a lack of linguistic data and resources for code-switched speech and text, although one or more of the languages being mixed could be high-resource.
Code-switching poses various interesting challenges for the speech community, such as language modeling for mixed languages, acoustic modeling of mixed-language speech, pronunciation modeling, and language identification from speech. The special session will include oral presentations and a panel discussion. Please see the Special Session schedule tab for more details. We expect participants from academia and industry, spanning a wide variety of language pairs and data sets. We also expect discussions on how to create speech and language resources for code-switching and on the sharing of data.
Topics of interest for this special session will include but are not limited to:
Special Session Schedule
The special session will be held on 21 August 2017, distributed over two slots. All 9 papers will be presented as oral presentations. In addition, a panel at the end of the second session will discuss topics such as data and resources for code-switching research. More details about this panel discussion will be available shortly.
| Date | Time | Room | Presentation type | Paper code | Paper ID | Title | Authors |
|---|---|---|---|---|---|---|---|
| 2017-08-21 | 11:20-11:40 | F11 | Oral | Mon-SS-1-11-1 | 301 | Longitudinal Speaker Clustering and Verification Corpus with Code-Switching Frisian-Dutch Speech | Emre Yilmaz, Jelske Dijkstra, Hans Van de Velde, Frederik Kampstra, Jouke Algra, Henk Van den Heuvel, David Van Leeuwen |
| 2017-08-21 | 11:40-12:00 | F11 | Oral | Mon-SS-1-11-2 | 391 | Exploiting Untranscribed Broadcast Data for Improved Code-Switching Detection | Emre Yilmaz, Henk van den Heuvel, David van Leeuwen |
| 2017-08-21 | 12:00-12:20 | F11 | Oral | Mon-SS-1-11-3 | 1198 | Jee haan, I’d like both, por favor: Elicitation of a Code-Switched Corpus of Hindi-English and Spanish-English Human-Machine Dialog | Vikram Ramanarayanan, David Suendermann-Oeft |
| 2017-08-21 | 12:20-12:40 | F11 | Oral | Mon-SS-1-11-4 | 1244 | On building mixed lingual speech synthesis systems | SaiKrishna Rallabandi, Alan W Black |
| 2017-08-21 | 12:40-13:00 | F11 | Oral | Mon-SS-1-11-5 | 1259 | Speech Synthesis for Mixed-Language Navigation Instructions | Khyathi Chandu, Sai Krishna Rallabandi, Sunayana Sitaram, Alan W Black |
| 2017-08-21 | 14:30-14:50 | F11 | Oral | Mon-SS-1-11-6 | 1373 | Addressing Code-Switching in French/Algerian Arabic Speech | Amazouz Djegdjiga, Martine Adda-Decker, Lori Lamel |
| 2017-08-21 | 14:50-15:10 | F11 | Oral | Mon-SS-1-11-7 | 1429 | Metrics for modeling code-switching across corpora | Wally Guzman, Joseph Ricard, Jacqueline Serigos, Barbara Bullock, Almeida Jacqueline Toribio |
| 2017-08-21 | 15:10-15:30 | F11 | Oral | Mon-SS-1-11-8 | 1437 | Synthesising isiZulu-English code-switch bigrams using word embeddings | Ewald Van der westhuizen, Thomas Niesler |
| 2017-08-21 | 15:30-15:50 | F11 | Oral | Mon-SS-1-11-9 | 1663 | Crowdsourcing Universal Part-Of-Speech Tags for Code-Switching | Victor Soto, Julia Hirschberg |