{"id":642510,"date":"2020-03-24T09:03:44","date_gmt":"2020-03-24T16:03:44","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-event&#038;p=642510"},"modified":"2025-08-06T11:53:04","modified_gmt":"2025-08-06T18:53:04","slug":"icassp-2020","status":"publish","type":"msr-event","link":"https:\/\/www.microsoft.com\/en-us\/research\/event\/icassp-2020\/","title":{"rendered":"Microsoft @ ICASSP 2020"},"content":{"rendered":"\n\n<p><strong>Website:<\/strong> <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/2020.ieeeicassp.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">ICASSP 2020<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<p>Microsoft is proud to be a silver sponsor of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/2020.ieeeicassp.org\/\" target=\"_blank\" rel=\"noopener\">45<sup>th<\/sup> International Conference on Acoustics, Speech and Signal Processing (ICASSP)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<h2>Tuesday, May 5<\/h2>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>MLSP-P2: Applications in Speech and Audio<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053972\/\"><strong>Multi-Label Sound Event Retrieval Using A Deep Learning-Based Siamese Structure With A Pairwise Presence Matrix<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJianyu Fan,\u00a0<strong>Eric Nichols<\/strong>,\u00a0<strong>Daniel Tompkins<\/strong>, Ana Elisa Me\u0301ndez Me\u0301ndez, Benjamin Elizalde, Philippe Pasquier<\/p>\n<h3>11:50 \u2013 12:10 
CEST<\/h3>\n<p>SPE-L1: End-to-end Speech Recognition I: Streaming<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054098\/\"><strong>Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHirofumi Inaguma,\u00a0<strong>Yashesh Gaur<\/strong>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>SPE-P3: Machine Learning for Speech Synthesis I<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054337\/\"><strong>Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yujia Xiao<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Huaiping Ming<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. 
Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>17:30 \u2013 17:50 CEST<\/h3>\n<p>AUD-L2: Deep Learning for Source Separation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054266\/\"><strong>Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<hr \/>\n<h2>Wednesday, May 6<\/h2>\n<h3>9:00 \u2013 11:00 CEST<\/h3>\n<p>AUD-P4: Feedback, Noise, and Reverberation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053785\/\"><strong>Joint Beamforming and Reverberation Cancellation Using a Constrained Kalman Filter with Multichannel Linear Prediction<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nSahar Hashemgeloogerdi,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>AUD-P4: Feedback, Noise, and Reverberation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053025\/\"><strong>Predicting Word Error Rate for Reverberant Speech<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra 
Emmanouilidou<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ivantash\/\">Ivan Tashev<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P5: Deep Speaker Recognition Models<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053767\/\"><strong>Improving Deep CNN Networks with Long Temporal Context for Text-independent Speaker Verification<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yong Zhao<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<strong>Jian Wu<\/strong><\/p>\n<h3>9:20 \u2013 9:40 CEST<\/h3>\n<p>SPE-L6: Speech Enhancement II: Single Channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper_MS.asp?PaperNum=4448\"><strong>Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Ahmet E. 
Bulut<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>SAM-P3: Sparsity, Super-Resolution and Imaging<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1911.08015\"><strong>Low-Rank Toeplitz Matrix Estimation Via Random Ultra-Sparse Rulers<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHannah Lawrence, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jerrl\/\">Jerry Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Cameron Musco, Christopher Musco<\/p>\n<p>SPE-P8: Robust Speech Recognition<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053455\/\"><strong>A Practical Two-Stage Training Strategy for Multi-Stream End-to-End Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nRuizhi Li, Gregory Sell,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xiaofewa\/\">Xiaofei Wang<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Shinji Watanabe, Hynek Hermansky<\/p>\n<h3>16:30 \u2013 16:50 CEST<\/h3>\n<p>IFS-L2: Privacy, Biometrics and Information Security<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053729\/\"><strong>Privacy-Preserving Phishing Web Page Classification Via Fully Homomorphic Encryption<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nEdward Chou,\u00a0<strong>Arun Gururajan<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kilai\/\">Kim Laine<span class=\"sr-only\"> (opens in new 
tab)<\/span><\/a>,\u00a0<strong>Nitin Kumar Goel<\/strong>,\u00a0<strong>Anna Bertiger<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>HLT-P1: Spoken Language Understanding and Dialogue I<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053599\/\"><strong>Fast Domain Adaptation for Goal-Oriented Dialogue Using A Hybrid Generative-Retrieval Transformer<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nIgor Shalyminov,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/alsordon\/\">Alessandro Sordoni<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/adatkins\/\">Adam Atkinson<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/haschulz\/\">Hannes Schulz<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P9: End-to-end Speech Recognition III: General Topics<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054663\/\"><strong>Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHu Hu,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<hr 
\/>\n<h2>Thursday, May 7<\/h2>\n<h3>9:00 \u2013 11:00 CEST<\/h3>\n<p>HLT-P2: Speech and Language Analysis<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053167\/\"><strong>Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nDave Makhervaks,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wihintho\/\">William Hinthorn<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Andreas Stolcke<\/p>\n<h3>10:20 \u2013 10:40 CEST<\/h3>\n<p>AUD-L6: Acoustic Environments and Spatial Audio II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054091\/\"><strong>Fast Acoustic Scattering Using Convolutional Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nZiqi Fan,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/vivineet\/\">Vibhav Vineet<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/nikunjr\/\">Nikunj Raghuvanshi<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>10:40 \u2013 11:00 CEST<\/h3>\n<p>SPE-L11: Speech Separation and Extraction I: Single Channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053068\/\"><strong>An Online 
Speaker-Aware Speech Separation Approach Based on Time-Domain Representation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHui Wang, Yan Song,\u00a0<strong>Zeng-Xi Li<\/strong>, Ian McLoughlin, Li-Rong Dai<\/p>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>SPE-P12: Machine Learning for Speech Synthesis II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053704\/\"><strong>Improving LPCNET-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nMin-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Hong-Goo Kang<\/p>\n<p>SPE-P13: Speech Separation and Extraction III<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053426\/\"><strong>Continuous Speech Separation: Dataset and Analysis<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<strong>Yi Luo<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>12:10 \u2013 12:30 CEST<\/h3>\n<p>SPE-L12: Speech Separation and Extraction II: Multi-channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054177\/\"><strong>End-to-End Microphone Permutation and Number Invariant Multi-Channel Speech Separation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>, Nima Mesgarani,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>MMSP-P3:\u00a0 Multimedia Signal Processing<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053766\/\"><strong>Supervised Deep Hashing for Efficient Audio Event Retrieval<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nArindam Jati,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra Emmanouilidou<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>MMSP-P3:\u00a0 Multimedia Signal Processing<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053171\/\"><strong>Multimodal Active Speaker Detection and Virtual Cinematography for Video Conferencing<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Ross Cutler<\/strong>, Ramin Mehran, Sam Johnson,\u00a0<strong>Cha Zhang<\/strong>, Adam Kirk, Oliver Whyte, Adarsh Kowdle<\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053300\/\"><strong>L-Vector: Neural Label Embedding for Domain Adaptation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhong Meng<\/strong>, Hu Hu,\u00a0<a 
href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Changliang Liu<\/strong>,\u00a0<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Chin-Hui Lee<\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053545\/\"><strong>Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053104\/\"><strong>Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yan Huang<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>William Gale<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SS-P1: Signal Processing Education: Trends and Innovations<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" 
href=\"https:\/\/ieeexplore.ieee.org\/document\/9053380\/\"><strong>A Dataset for Measuring Reading Levels in India at Scale<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nDolly Agarwal,\u00a0<strong>Jayant Gupchup<\/strong>, Nishant Baghel<\/p>\n<h3>17:30 \u2013 17:30 CEST<\/h3>\n<p>IDSP-L2: Industry Session on Large-Scale Distributed Learning Strategies<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9052983\/\"><strong>Parallelizing Adam Optimizer with Blockwise Model-Update Filtering<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Kai Chen<\/strong>, Haisong Ding,\u00a0<strong>Qiang Huo<\/strong><\/p>\n<hr \/>\n<h2>Friday, May 8<\/h2>\n<h3>8:00 \u2013 10:00 CEST<\/h3>\n<p>IFS-P1: Information Hiding, Biometrics and Security<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053670\/\"><strong>Texception: A Character\/Word-Level Deep Learning Model for Phishing URL Detection<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Farid Tajaddodianfar<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Arun Gururajan<\/strong><\/p>\n<p>SAM-P6: Detection, Estimation and Classification<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053825\/\"><strong>Static Visual Spatial Priors For DOA Estimation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nPawel Swietojanski,\u00a0<strong>Ondrej Miksik<\/strong><\/p>\n<p>SPE-P16: Word Spotting<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053191\/\"><strong>Adaptation of RNN Transducer with Text-to-Speech Technology for Keyword Spotting<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Eva Sharma<\/strong>,\u00a0<strong>Guoli Ye<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Yao Tian<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Ed Lin<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P17: Speech Enhancement IV<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054528\/\"><strong>AV(SE) \u00b2: Audio-Visual Squeeze-Excite Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nMichael Iuzzolino,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>8:20 \u2013 8:40 CEST<\/h3>\n<p>HLT-L2: Language Modeling<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053483\/\"><strong>Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJunhao Xu,\u00a0<strong>Xie Chen<\/strong>, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Mei-Ling Meng<\/p>\n<h3>9:40 \u2013 10:00 CEST<\/h3>\n<p>MLSP-L10: Deep Neural Network Structures<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053105\/\"><strong>Neural Attentive Multiview Machines<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Oren Barkan<\/strong>,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong><\/p>\n<h3>11:45 \u2013 13:45 CEST<\/h3>\n<p>AUD-P11: Signal Enhancement and Restoration II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053649\/\"><strong>Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nLi Li,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>AUD-P11: Signal Enhancement and Restoration II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054254\/\"><strong>Weighted Speech Distortion Losses for Neural-Network-Based Real-Time Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYangyang Xia,\u00a0<strong>Sebastian Braun<\/strong>,\u00a0<strong>Chandan 
Reddy<\/strong>,\u00a0<strong>Harishchandra Dubey<\/strong>,\u00a0<strong>Ross Cutler<\/strong>,\u00a0<strong>Ivan Tashev<\/strong><\/p>\n<p>HLT-P5: Multilingual Processing of Language<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053752\/\"><strong>Addressing Accent Mismatch in Mandarin-English Code-Switching Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhili Tan<\/strong>,\u00a0<strong>Xinghua Fan<\/strong>,\u00a0<strong>Hui Zhu<\/strong>,\u00a0<strong>Ed Lin<\/strong><\/p>\n<p>IFS-P2: Anonymization, Security and Privacy<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper.asp?PaperNum=2004\"><strong>Detection of Malicious VSCRIPT Using Static and Dynamic Analysis with Recurrent Deep Learning<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Rakshit Agrawal,\u00a0<strong>Geoff McDonald<\/strong><\/p>\n<p>SPE-P19: Machine Learning for Speech Synthesis III<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1910.10909\"><strong>ESPNET-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nTomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xuta\/\">Xu Tan<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P20: Speech Recognition: Acoustic Modelling II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054387\/\"><strong>High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Eric Sun<\/strong>,\u00a0<strong>Jeremy Wong<\/strong>,\u00a0<strong>Amit Das<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>12:25 \u2013 12:45 CEST<\/h3>\n<p>SPE-L16: Speaker Diarization<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054176\/\"><strong>Speaker Diarization with Session-Level Speaker Embedding Refinement 
Using Graph Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJixuan Wang,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ranjanir\/\">Ranjani Ramamurthy<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Frank Rudzicz, Michael Brudno<\/p>\n<h3>13:05 \u2013 13:25 CEST<\/h3>\n<p>SPE-L16: Speaker Diarization<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053152\/\"><strong>A Memory Augmented Architecture for Continuous Speaker Identification in Meetings<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nNikolaos Flemotomos,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>15:15 \u2013 17:15 CEST<\/h3>\n<p>SPE-P21: Voice Conversion<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054010\/\"><strong>An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nFeng-Long Xie, Xin-Hui Li, Bo Liu, Yi-Bin Zheng, Li Meng, Li Lu,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. 
Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:15 \u2013 16:30 CEST<\/h3>\n<p>MLSP-L11: Attention Needs<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053071\/\"><strong>Attentive Item2vec: Neural Attentive User Representations<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Oren Barkan<\/strong>, Avi Caciularu,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Microsoft is proud to be a silver sponsor of the 45th International Conference on Acoustics, Speech and Signal Processing (ICASSP). Stop by our booth to chat with our experts, see demos of our latest research and find out about career opportunities with Microsoft.<\/p>\n","protected":false},"featured_media":644613,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_startdate":"2020-05-04","msr_enddate":"2020-05-08","msr_location":"Virtual","msr_expirationdate":"","msr_event_recording_link":"","msr_event_link":"","msr_event_link_redirect":false,"msr_event_time":"","msr_hide_region":true,"msr_private_event":false,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[243062,13545],"msr-region":[239178,256048],"msr-event-type":[197941],"msr-video-type":[],"msr-locale":[268875],"msr-program-audience":[],"msr-post-option":[],"msr-impact-theme":[],"class_list":["post-642510","msr-event","type-msr-event","status-publish","has-post-thumbnail","hentry","msr-research-area-audio-acoustics","msr-research-area-human-language-technologies","msr-region-europe","msr-region-global","msr-event-type-conferences","msr-locale-en_us"],"msr_about":"<!-- 
wp:msr\/event-details {\"title\":\"Microsoft @ ICASSP 2020\",\"backgroundColor\":\"grey\",\"image\":{\"id\":644613,\"url\":\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2.jpg\",\"alt\":\"\"}} \/-->\n\n<!-- wp:msr\/content-tabs --><!-- wp:msr\/content-tab {\"title\":\"About\"} --><!-- wp:freeform --><p><strong>Website:<\/strong> <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/2020.ieeeicassp.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">ICASSP 2020<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<p>Microsoft is proud to be a silver sponsor of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/2020.ieeeicassp.org\/\" target=\"_blank\" rel=\"noopener\">45<sup>th<\/sup> International Conference on Acoustics, Speech and Signal Processing (ICASSP)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- wp:msr\/content-tab {\"title\":\"Sessions\"} --><!-- wp:freeform --><h2>Tuesday, May 5<\/h2>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>MLSP-P2: Applications in Speech and Audio<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053972\/\"><strong>Multi-Label Sound Event Retrieval Using A Deep Learning-Based Siamese Structure With A Pairwise Presence Matrix<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJianyu Fan,\u00a0<strong>Eric Nichols<\/strong>,\u00a0<strong>Daniel Tompkins<\/strong>, Ana Elisa Me\u0301ndez Me\u0301ndez, Benjamin Elizalde, Philippe Pasquier<\/p>\n<h3>11:50 
\u2013 12:10 CEST<\/h3>\n<p>SPE-L1: End-to-end Speech Recognition I: Streaming<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054098\/\"><strong>Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHirofumi Inaguma,\u00a0<strong>Yashesh Gaur<\/strong>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>SPE-P3: Machine Learning for Speech Synthesis I<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054337\/\"><strong>Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yujia Xiao<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Huaiping Ming<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. 
Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>17:30 \u2013 17:50 CEST<\/h3>\n<p>AUD-L2: Deep Learning for Source Separation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054266\/\"><strong>Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<hr \/>\n<h2>Wednesday, May 6<\/h2>\n<h3>9:00 \u2013 11:00 CEST<\/h3>\n<p>AUD-P4: Feedback, Noise, and Reverberation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053785\/\"><strong>Joint Beamforming and Reverberation Cancellation Using a Constrained Kalman Filter with Multichannel Linear Prediction<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nSahar Hashemgeloogerdi,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>AUD-P4: Feedback, Noise, and Reverberation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053025\/\"><strong>Predicting Word Error Rate for Reverberant Speech<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra 
Emmanouilidou<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ivantash\/\">Ivan Tashev<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P5: Deep Speaker Recognition Models<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053767\/\"><strong>Improving Deep CNN Networks with Long Temporal Context for Text-independent Speaker Verification<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yong Zhao<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<strong>Jian Wu<\/strong><\/p>\n<h3>9:20 \u2013 9:40 CEST<\/h3>\n<p>SPE-L6: Speech Enhancement II: Single Channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper_MS.asp?PaperNum=4448\"><strong>Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Ahmet E. 
Bulut<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>SAM-P3: Sparsity, Super-Resolution and Imaging<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1911.08015\"><strong>Low-Rank Toeplitz Matrix Estimation Via Random Ultra-Sparse Rulers<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHannah Lawrence, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jerrl\/\">Jerry Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Cameron Musco, Christopher Musco<\/p>\n<p>SPE-P8: Robust Speech Recognition<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053455\/\"><strong>A Practical Two-Stage Training Strategy for Multi-Stream End-to-End Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nRuizhi Li, Gregory Sell,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xiaofewa\/\">Xiaofei Wang<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Shinji Watanabe, Hynek Hermansky<\/p>\n<h3>16:30 \u2013 16:50 CEST<\/h3>\n<p>IFS-L2: Privacy, Biometrics and Information Security<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053729\/\"><strong>Privacy-Preserving Phishing Web Page Classification Via Fully Homomorphic Encryption<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nEdward Chou,\u00a0<strong>Arun Gururajan<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kilai\/\">Kim Laine<span class=\"sr-only\"> (opens in new 
tab)<\/span><\/a>,\u00a0<strong>Nitin Kumar Goel<\/strong>,\u00a0<strong>Anna Bertiger<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>HLT-P1: Spoken Language Understanding and Dialogue I<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053599\/\"><strong>Fast Domain Adaptation for Goal-Oriented Dialogue Using A Hybrid Generative-Retrieval Transformer<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nIgor Shalyminov,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/alsordon\/\">Alessandro Sordoni<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/adatkins\/\">Adam Atkinson<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/haschulz\/\">Hannes Schulz<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P9: End-to-end Speech Recognition III: General Topics<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054663\/\"><strong>Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHu Hu,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<hr 
\/>\n<h2>Thursday, May 7<\/h2>\n<h3>9:00 \u2013 11:00 CEST<\/h3>\n<p>HLT-P2: Speech and Language Analysis<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053167\/\"><strong>Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nDave Makhervaks,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wihintho\/\">William Hinthorn<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Andreas Stolcke<\/p>\n<h3>10:20 \u2013 10:40 CEST<\/h3>\n<p>AUD-L6: Acoustic Environments and Spatial Audio II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054091\/\"><strong>Fast Acoustic Scattering Using Convolutional Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nZiqi Fan,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/vivineet\/\">Vibhav Vineet<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/nikunjr\/\">Nikunj Raghuvanshi<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>10:40 \u2013 11:00 CEST<\/h3>\n<p>SPE-L11: Speech Separation and Extraction I: Single Channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053068\/\"><strong>An Online 
Speaker-Aware Speech Separation Approach Based on Time-Domain Representation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nHui Wang, Yan Song,\u00a0<strong>Zeng-Xi Li<\/strong>, Ian McLoughlin, Li-Rong Dai<\/p>\n<h3>11:30 \u2013 13:30 CEST<\/h3>\n<p>SPE-P12: Machine Learning for Speech Synthesis II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053704\/\"><strong>Improving LPCNET-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nMin-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Hong-Goo Kang<\/p>\n<p>SPE-P13: Speech Separation and Extraction III<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053426\/\"><strong>Continuous Speech Separation: Dataset and Analysis<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<strong>Yi Luo<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>12:10 \u2013 12:30 CEST<\/h3>\n<p>SPE-L12: Speech Separation and Extraction II: Multi-channel<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054177\/\"><strong>End-to-End Microphone Permutation and Number Invariant Multi-Channel Speech Separation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>, Nima Mesgarani,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:30 \u2013 18:30 CEST<\/h3>\n<p>MMSP-P3:\u00a0 Multimedia Signal Processing<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053766\/\"><strong>Supervised Deep Hashing for Efficient Audio Event Retrieval<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nArindam Jati,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra Emmanouilidou<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>MMSP-P3:\u00a0 Multimedia Signal Processing<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053171\/\"><strong>Multimodal Active Speaker Detection and Virtual Cinematography for Video Conferencing<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Ross Cutler<\/strong>, Ramin Mehran, Sam Johnson,\u00a0<strong>Cha Zhang<\/strong>, Adam Kirk, Oliver Whyte, Adarsh Kowdle<\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053300\/\"><strong>L-Vector: Neural Label Embedding for Domain Adaptation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhong Meng<\/strong>, Hu Hu,\u00a0<a 
href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Changliang Liu<\/strong>,\u00a0<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Chin-Hui Lee<\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053545\/\"><strong>Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P15: Speech Recognition: Adaptation<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053104\/\"><strong>Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Yan Huang<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>William Gale<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SS-P1: Signal Processing Education: Trends and Innovations<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" 
href=\"https:\/\/ieeexplore.ieee.org\/document\/9053380\/\"><strong>A Dataset for Measuring Reading Levels in India at Scale<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nDolly Agarwal,\u00a0<strong>Jayant Gupchup<\/strong>, Nishant Baghel<\/p>\n<h3>17:30 \u2013 17:30 CEST<\/h3>\n<p>IDSP-L2: Industry Session on Large-Scale Distributed Learning Strategies<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9052983\/\"><strong>Parallelizing Adam Optimizer with Blockwise Model-Update Filtering<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Kai Chen<\/strong>, Haisong Ding,\u00a0<strong>Qiang Huo<\/strong><\/p>\n<hr \/>\n<h2>Friday, May 8<\/h2>\n<h3>8:00 \u2013 10:00 CEST<\/h3>\n<p>IFS-P1: Information Hiding, Biometrics and Security<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053670\/\"><strong>Texception: A Character\/Word-Level Deep Learning Model for Phishing URL Detection<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Farid Tajaddodianfar<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Arun Gururajan<\/strong><\/p>\n<p>SAM-P6: Detection, Estimation and Classification<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053825\/\"><strong>Static Visual Spatial Priors For DOA Estimation<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nPawel Swietojanski,\u00a0<strong>Ondrej Miksik<\/strong><\/p>\n<p>SPE-P16: Word Spotting<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053191\/\"><strong>Adaptation of RNN Transducer with Text-to-Speech Technology for Keyword Spotting<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Eva Sharma<\/strong>,\u00a0<strong>Guoli Ye<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Yao Tian<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Ed Lin<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P17: Speech Enhancement IV<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054528\/\"><strong>AV(SE)\u00b2: Audio-Visual Squeeze-Excite Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nMichael Iuzzolino,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>8:20 \u2013 8:40 CEST<\/h3>\n<p>HLT-L2: Language Modeling<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053483\/\"><strong>Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJunhao Xu,\u00a0<strong>Xie Chen<\/strong>, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Mei-Ling Meng<\/p>\n<h3>9:40 \u2013 10:00 CEST<\/h3>\n<p>MLSP-L10: Deep Neural Network Structures<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053105\/\"><strong>Neural Attentive Multiview Machines<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Oren Barkan<\/strong>,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong><\/p>\n<h3>11:45 \u2013 13:45 CEST<\/h3>\n<p>AUD-P11: Signal Enhancement and Restoration II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053649\/\"><strong>Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nLi Li,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>AUD-P11: Signal Enhancement and Restoration II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054254\/\"><strong>Weighted Speech Distortion Losses for Neural-Network-Based Real-Time Speech Enhancement<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nYangyang Xia,\u00a0<strong>Sebastian Braun<\/strong>,\u00a0<strong>Chandan 
Reddy<\/strong>,\u00a0<strong>Harishchandra Dubey<\/strong>,\u00a0<strong>Ross Cutler<\/strong>,\u00a0<strong>Ivan Tashev<\/strong><\/p>\n<p>HLT-P5: Multilingual Processing of Language<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053752\/\"><strong>Addressing Accent Mismatch in Mandarin-English Code-Switching Speech Recognition<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Zhili Tan<\/strong>,\u00a0<strong>Xinghua Fan<\/strong>,\u00a0<strong>Hui Zhu<\/strong>,\u00a0<strong>Ed Lin<\/strong><\/p>\n<p>IFS-P2: Anonymization, Security and Privacy<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper.asp?PaperNum=2004\"><strong>Detection of Malicious VBScript Using Static and Dynamic Analysis with Recurrent Deep Learning<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Rakshit Agrawal,\u00a0<strong>Geoff McDonald<\/strong><\/p>\n<p>SPE-P19: Machine Learning for Speech Synthesis III<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1910.10909\"><strong>ESPNET-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nTomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xuta\/\">Xu Tan<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>SPE-P20: Speech Recognition: Acoustic Modelling II<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054387\/\"><strong>High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Eric Sun<\/strong>,\u00a0<strong>Jeremy Wong<\/strong>,\u00a0<strong>Amit Das<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>12:25 \u2013 12:45 CEST<\/h3>\n<p>SPE-L16: Speaker Diarization<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054176\/\"><strong>Speaker Diarization with Session-Level Speaker Embedding Refinement 
Using Graph Neural Networks<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nJixuan Wang,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ranjanir\/\">Ranjani Ramamurthy<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Frank Rudzicz, Michael Brudno<\/p>\n<h3>13:05 \u2013 13:25 CEST<\/h3>\n<p>SPE-L16: Speaker Diarization<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053152\/\"><strong>A Memory Augmented Architecture for Continuous Speaker Identification in Meetings<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nNikolaos Flemotomos,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>15:15 \u2013 17:15 CEST<\/h3>\n<p>SPE-P21: Voice Conversion<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9054010\/\"><strong>An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\nFeng-Long Xie, Xin-Hui Li, Bo Liu, Yi-Bin Zheng, Li Meng, Li Lu,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. 
Soong<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<h3>16:15 \u2013 16:30 CEST<\/h3>\n<p>MLSP-L11: Attention Needs<br \/>\n<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" target=\"_blank\" href=\"https:\/\/ieeexplore.ieee.org\/document\/9053071\/\"><strong>Attentive Item2vec: Neural Attentive User Representations<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><br \/>\n<strong>Oren Barkan<\/strong>, Avi Caciularu,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- \/wp:msr\/content-tabs -->","tab-content":[{"id":0,"name":"About","content":"Microsoft is proud to be a silver sponsor of the <a href=\"https:\/\/2020.ieeeicassp.org\/\" target=\"_blank\" rel=\"noopener\">45<sup>th<\/sup> International Conference on Acoustics, Speech and Signal Processing (ICASSP)<\/a>."},{"id":1,"name":"Sessions","content":"<h2>Tuesday, May 5<\/h2>\r\n<h3>11:30 \u2013 13:30 CEST<\/h3>\r\nMLSP-P2: Applications in Speech and Audio\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053972\/\"><strong>Multi-Label Sound Event Retrieval Using A Deep Learning-Based Siamese Structure With A Pairwise Presence Matrix<\/strong><\/a>\r\nJianyu Fan,\u00a0<strong>Eric Nichols<\/strong>,\u00a0<strong>Daniel Tompkins<\/strong>, Ana Elisa Me\u0301ndez Me\u0301ndez, Benjamin Elizalde, Philippe Pasquier\r\n<h3>11:50 \u2013 12:10 CEST<\/h3>\r\nSPE-L1: End-to-end Speech Recognition I: Streaming\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054098\/\"><strong>Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR<\/strong><\/a>\r\nHirofumi Inaguma,\u00a0<strong>Yashesh Gaur<\/strong>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>,\u00a0<a 
href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n<h3>16:30 \u2013 18:30 CEST<\/h3>\r\nSPE-P3: Machine Learning for Speech Synthesis I\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054337\/\"><strong>Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS<\/strong><\/a>\r\n<strong>Yujia Xiao<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Huaiping Ming<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. Soong<\/a>\r\n<h3>17:30 \u2013 17:50 CEST<\/h3>\r\nAUD-L2: Deep Learning for Source Separation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054266\/\"><strong>Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation<\/strong><\/a>\r\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<\/a>\r\n\r\n<hr \/>\r\n\r\n<h2>Wednesday, May 6<\/h2>\r\n<h3>9:00 \u2013 11:00 CEST<\/h3>\r\nAUD-P4: Feedback, Noise, and Reverberation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053785\/\"><strong>Joint Beamforming and Reverberation Cancellation Using a Constrained Kalman Filter with Multichannel Linear Prediction<\/strong><\/a>\r\nSahar Hashemgeloogerdi,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<\/a>\r\n\r\nAUD-P4: Feedback, Noise, and Reverberation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053025\/\"><strong>Predicting Word Error Rate for Reverberant Speech<\/strong><\/a>\r\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra Emmanouilidou<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sebraun\/\">Sebastian Braun<\/a>,\u00a0<a 
href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ivantash\/\">Ivan Tashev<\/a>\r\n\r\nSPE-P5: Deep Speaker Recognition Models\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053767\/\"><strong>Improving Deep CNN Networks with Long Temporal Context for Text-independent Speaker Verification<\/strong><\/a>\r\n<strong>Yong Zhao<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhuo Chen<\/strong>,\u00a0<strong>Jian Wu<\/strong>\r\n<h3>9:20 \u2013 9:40 CEST<\/h3>\r\nSPE-L6: Speech Enhancement II: Single Channel\r\n<a href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper_MS.asp?PaperNum=4448\"><strong>Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks<\/strong><\/a>\r\n<strong>Ahmet E. Bulut<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<\/a>\r\n<h3>11:30 \u2013 13:30 CEST<\/h3>\r\nSAM-P3: Sparsity, Super-Resolution and Imaging\r\n<a href=\"https:\/\/arxiv.org\/abs\/1911.08015\"><strong>Low-Rank Toeplitz Matrix Estimation Via Random Ultra-Sparse Rulers<\/strong><\/a>\r\nHannah Lawrence, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jerrl\/\">Jerry Li<\/a>, Cameron Musco, Christopher Musco\r\n\r\nSPE-P8: Robust Speech Recognition\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053455\/\"><strong>A Practical Two-Stage Training Strategy for Multi-Stream End-to-End Speech Recognition<\/strong><\/a>\r\nRuizhi Li, Gregory Sell,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xiaofewa\/\">Xiaofei Wang<\/a>, Shinji Watanabe, Hynek Hermansky\r\n<h3>16:30 \u2013 16:50 CEST<\/h3>\r\nIFS-L2: Privacy, Biometrics and Information Security\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053729\/\"><strong>Privacy-Preserving Phishing Web Page Classification Via Fully Homomorphic Encryption<\/strong><\/a>\r\nEdward Chou,\u00a0<strong>Arun Gururajan<\/strong>,\u00a0<a 
href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kilai\/\">Kim Laine<\/a>,\u00a0<strong>Nitin Kumar Goel<\/strong>,\u00a0<strong>Anna Bertiger<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. Stokes<\/a>\r\n<h3>16:30 \u2013 18:30 CEST<\/h3>\r\nHLT-P1: Spoken Language Understanding and Dialogue I\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053599\/\"><strong>Fast Domain Adaptation for Goal-Oriented Dialogue Using A Hybrid Generative-Retrieval Transformer<\/strong><\/a>\r\nIgor Shalyminov,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/alsordon\/\">Alessandro Sordoni<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/adatkins\/\">Adam Atkinson<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/haschulz\/\">Hannes Schulz<\/a>\r\n\r\nSPE-P9: End-to-end Speech Recognition III: General Topics\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054663\/\"><strong>Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition<\/strong><\/a>\r\nHu Hu,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n\r\n<hr \/>\r\n\r\n<h2>Thursday, May 7<\/h2>\r\n<h3>9:00 \u2013 11:00 CEST<\/h3>\r\nHLT-P2: Speech and Language Analysis\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053167\/\"><strong>Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings<\/strong><\/a>\r\nDave Makhervaks,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wihintho\/\">William Hinthorn<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<\/a>, Andreas Stolcke\r\n<h3>10:20 \u2013 10:40 CEST<\/h3>\r\nAUD-L6: Acoustic 
Environments and Spatial Audio II\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054091\/\"><strong>Fast Acoustic Scattering Using Convolutional Neural Networks<\/strong><\/a>\r\nZiqi Fan,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/vivineet\/\">Vibhav Vineet<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hagamper\/\">Hannes Gamper<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/nikunjr\/\">Nikunj Raghuvanshi<\/a>\r\n<h3>10:40 \u2013 11:00 CEST<\/h3>\r\nSPE-L11: Speech Separation and Extraction I: Single Channel\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053068\/\"><strong>An Online Speaker-Aware Speech Separation Approach Based on Time-Domain Representation<\/strong><\/a>\r\nHui Wang, Yan Song,\u00a0<strong>Zeng-Xi Li<\/strong>, Ian McLoughlin, Li-Rong Dai\r\n<h3>11:30 \u2013 13:30 CEST<\/h3>\r\nSPE-P12: Machine Learning for Speech Synthesis II\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053704\/\"><strong>Improving LPCNET-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network<\/strong><\/a>\r\nMin-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank Soong<\/a>, Hong-Goo Kang\r\n\r\nSPE-P13: Speech Separation and Extraction III\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053426\/\"><strong>Continuous Speech Separation: Dataset and Analysis<\/strong><\/a>\r\n<strong>Zhuo Chen<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<\/a>,\u00a0<strong>Liang Lu<\/strong>,\u00a0<strong>Tianyan Zhou<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<strong>Yi Luo<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>\r\n<h3>12:10 \u2013 12:30 CEST<\/h3>\r\nSPE-L12: Speech Separation 
and Extraction II: Multi-channel\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054177\/\"><strong>End-to-End Microphone Permutation and Number Invariant Multi-Channel Speech Separation<\/strong><\/a>\r\nYi Luo,\u00a0<strong>Zhuo Chen<\/strong>, Nima Mesgarani,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tayoshio\/\">Takuya Yoshioka<\/a>\r\n<h3>16:30 \u2013 18:30 CEST<\/h3>\r\nMMSP-P3:\u00a0 Multimedia Signal Processing\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053766\/\"><strong>Supervised Deep Hashing for Efficient Audio Event Retrieval<\/strong><\/a>\r\nArindam Jati,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/diemmano\/\">Dimitra Emmanouilidou<\/a>\r\n\r\nMMSP-P3:\u00a0 Multimedia Signal Processing\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053171\/\"><strong>Multimodal Active Speaker Detection and Virtual Cinematography for Video Conferencing<\/strong><\/a>\r\n<strong>Ross Cutler<\/strong>, Ramin Mehran, Sam Johnson,\u00a0<strong>Cha Zhang<\/strong>, Adam Kirk, Oliver Whyte, Adarsh Kowdle\r\n\r\nSPE-P15: Speech Recognition: Adaptation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053300\/\"><strong>L-Vector: Neural Label Embedding for Domain Adaptation<\/strong><\/a>\r\n<strong>Zhong Meng<\/strong>, Hu Hu,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>,\u00a0<strong>Changliang Liu<\/strong>,\u00a0<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>, Chin-Hui Lee\r\n\r\nSPE-P15: Speech Recognition: Adaptation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053545\/\"><strong>Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems<\/strong><\/a>\r\n<strong>Yan Huang<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n\r\nSPE-P15: Speech Recognition: 
Adaptation\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053104\/\"><strong>Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation<\/strong><\/a>\r\n<strong>Yan Huang<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>William Gale<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n\r\nSS-P1: Signal Processing Education: Trends and Innovations\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053380\/\"><strong>A Dataset for Measuring Reading Levels in India at Scale<\/strong><\/a>\r\nDolly Agarwal,\u00a0<strong>Jayant Gupchup<\/strong>, Nishant Baghel\r\n<h3>17:30 \u2013 17:30 CEST<\/h3>\r\nIDSP-L2: Industry Session on Large-Scale Distributed Learning Strategies\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9052983\/\"><strong>Parallelizing Adam Optimizer with Blockwise Model-Update Filtering<\/strong><\/a>\r\n<strong>Kai Chen<\/strong>, Haisong Ding,\u00a0<strong>Qiang Huo<\/strong>\r\n\r\n<hr \/>\r\n\r\n<h2>Friday, May 8<\/h2>\r\n<h3>8:00 \u2013 10:00 CEST<\/h3>\r\nIFS-P1: Information Hiding, Biometrics and Security\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053670\/\"><strong>Texception: A Character\/Word-Level Deep Learning Model for Phishing URL Detection<\/strong><\/a>\r\n<strong>Farid Tajaddodianfar<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<\/a>,\u00a0<strong>Arun Gururajan<\/strong>\r\n\r\nSAM-P6: Detection, Estimation and Classification\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053825\/\"><strong>Static Visual Spatial Priors For DOA Estimation<\/strong><\/a>\r\nPawel Swietojanski,\u00a0<strong>Ondrej Miksik<\/strong>\r\n\r\nSPE-P16: Word Spotting\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053191\/\"><strong>Adaptation of RNN Transducer with Text-to-Speech Technology for Keyword Spotting<\/strong><\/a>\r\n<strong>Eva Sharma<\/strong>,\u00a0<strong>Guoli Ye<\/strong>,\u00a0<strong>Wenning Wei<\/strong>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Yao Tian<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<strong>Lei He<\/strong>,\u00a0<strong>Ed Lin<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n\r\nSPE-P17: Speech Enhancement IV\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054528\/\"><strong>AV(SE)\u00b2: Audio-Visual Squeeze-Excite Speech Enhancement<\/strong><\/a>\r\nMichael Iuzzolino,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<\/a>\r\n<h3>8:20 \u2013 8:40 CEST<\/h3>\r\nHLT-L2: Language Modeling\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053483\/\"><strong>Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers<\/strong><\/a>\r\nJunhao Xu,\u00a0<strong>Xie Chen<\/strong>, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Mei-Ling Meng\r\n<h3>9:40 \u2013 10:00 CEST<\/h3>\r\nMLSP-L10: Deep Neural Network Structures\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053105\/\"><strong>Neural Attentive Multiview Machines<\/strong><\/a>\r\n<strong>Oren Barkan<\/strong>,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong>\r\n<h3>11:45 \u2013 13:45 CEST<\/h3>\r\nAUD-P11: Signal Enhancement and Restoration II\r\n<a 
href=\"https:\/\/ieeexplore.ieee.org\/document\/9053649\/\"><strong>Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement<\/strong><\/a>\r\nLi Li,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kazukoi\/\">Kazuhito Koishida<\/a>\r\n\r\nAUD-P11: Signal Enhancement and Restoration II\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054254\/\"><strong>Weighted Speech Distortion Losses for Neural-Network-Based Real-Time Speech Enhancement<\/strong><\/a>\r\nYangyang Xia,\u00a0<strong>Sebastian Braun<\/strong>,\u00a0<strong>Chandan Reddy<\/strong>,\u00a0<strong>Harishchandra Dubey<\/strong>,\u00a0<strong>Ross Cutler<\/strong>,\u00a0<strong>Ivan Tashev<\/strong>\r\n\r\nHLT-P5: Multilingual Processing of Language\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053752\/\"><strong>Addressing Accent Mismatch in Mandarin-English Code-Switching Speech Recognition<\/strong><\/a>\r\n<strong>Zhili Tan<\/strong>,\u00a0<strong>Xinghua Fan<\/strong>,\u00a0<strong>Hui Zhu<\/strong>,\u00a0<strong>Ed Lin<\/strong>\r\n\r\nIFS-P2: Anonymization, Security and Privacy\r\n<a href=\"https:\/\/cmsworkshops.com\/ICASSP2020\/Papers\/ViewPaper.asp?PaperNum=2004\"><strong>Detection of Malicious VBScript Using Static and Dynamic Analysis with Recurrent Deep Learning<\/strong><\/a>\r\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jstokes\/\">Jack W. 
Stokes<\/a>, Rakshit Agrawal,\u00a0<strong>Geoff McDonald<\/strong>\r\n\r\nSPE-P19: Machine Learning for Speech Synthesis III\r\n<a href=\"https:\/\/arxiv.org\/abs\/1910.10909\"><strong>ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit<\/strong><\/a>\r\nTomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/xuta\/\">Xu Tan<\/a>\r\n\r\nSPE-P20: Speech Recognition: Acoustic Modelling II\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054387\/\"><strong>High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model<\/strong><\/a>\r\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jinyli\/\">Jinyu Li<\/a>,\u00a0<strong>Rui Zhao<\/strong>,\u00a0<strong>Eric Sun<\/strong>,\u00a0<strong>Jeremy Wong<\/strong>,\u00a0<strong>Amit Das<\/strong>,\u00a0<strong>Zhong Meng<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ygong\/\">Yifan Gong<\/a>\r\n<h3>12:25 \u2013 12:45 CEST<\/h3>\r\nSPE-L16: Speaker Diarization\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054176\/\"><strong>Speaker Diarization with Session-Level Speaker Embedding Refinement Using Graph Neural Networks<\/strong><\/a>\r\nJixuan Wang,\u00a0<strong>Xiong Xiao<\/strong>,\u00a0<strong>Jian Wu<\/strong>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/ranjanir\/\">Ranjani Ramamurthy<\/a>, Frank Rudzicz, Michael Brudno\r\n<h3>13:05 \u2013 13:25 CEST<\/h3>\r\nSPE-L16: Speaker Diarization\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053152\/\"><strong>A Memory Augmented Architecture for Continuous Speaker Identification in Meetings<\/strong><\/a>\r\nNikolaos Flemotomos,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/didimit\/\">Dimitrios Dimitriadis<\/a>\r\n<h3>15:15 \u2013 17:15 
CEST<\/h3>\r\nSPE-P21: Voice Conversion\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9054010\/\"><strong>An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data<\/strong><\/a>\r\nFeng-Long Xie, Xin-Hui Li, Bo Liu, Yi-Bin Zheng, Li Meng, Li Lu,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/frankkps\/\">Frank K. Soong<\/a>\r\n<h3>16:15 \u2013 16:30 CEST<\/h3>\r\nMLSP-L11: Attention Needs\r\n<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9053071\/\"><strong>Attentive Item2vec: Neural Attentive User Representations<\/strong><\/a>\r\n<strong>Oren Barkan<\/strong>, Avi Caciularu,\u00a0<strong>Ori Katz<\/strong>,\u00a0<strong>Noam Koenigstein<\/strong>"}],"msr_startdate":"2020-05-04","msr_enddate":"2020-05-08","msr_event_time":"","msr_location":"Virtual","msr_event_link":"","msr_event_recording_link":"","msr_startdate_formatted":"May 4, 2020","msr_register_text":"Watch now","msr_cta_link":"","msr_cta_text":"","msr_cta_bi_name":"","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-960x540.jpg\" class=\"img-object-cover\" alt=\"Microsoft at ICASSP 2020 in Barcelona, Spain\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-640x360.jpg 640w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2020\/03\/ICASSP_header_1920x720_v2-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","event_excerpt":"Microsoft is proud to be a silver sponsor of the 45th International Conference on Acoustics, Speech and Signal Processing (ICASSP). Stop by our booth to chat with our experts, see demos of our latest research and find out about career opportunities with Microsoft.","msr_research_lab":[],"related-researchers":[],"msr_impact_theme":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-opportunities":[],"related-publications":[658509,658848,705277,810856],"related-videos":[],"related-posts":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/642510","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-event"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/642510\/revisions"}],"predecessor-version":[{"id":1146977,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/642510\/revisions\/1146977"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/644613"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=642510"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=642510"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=642510"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2
\/msr-event-type?post=642510"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=642510"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=642510"},{"taxonomy":"msr-program-audience","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-program-audience?post=642510"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=642510"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=642510"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}