Microsoft @ ICASSP 2019

About

Microsoft is excited to be a Silver sponsor of the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held May 12–17, 2019, in Brighton, UK. Stop by our booth to chat with our experts, see demos of our latest research, and find out more about career opportunities with Microsoft.

Session chairs

Frank K. Soong
Ivan Tashev
Jinyu Li
David Wipf

Microsoft attendees

Amy Siebenthaler
Andreas Stolcke
Anthony Stark
Dimitra Emmanouilidou
Dimitrios Dimitriadis
Eric Sun
Fei Zuo
Frank K. Soong
Hamid Palangi
Hannes Gamper
Ivan Tashev
Jack Stokes
Jian Wu
Jianfeng Gao
Jinyu Li
Kazuhito Koishida
Kshitiz Kumar
Lei He
Michael Levit
Mortaza Doulaty
Nanshan Zeng
Nikunj Raghuvanshi
Oren Barkan
Sarangarajan Parthasarathy
Sebastian Braun
Shifeng Pan
Shuayb Zarar
Sungjin Lee
Tasos Anastasakos
Xiaoyang Chen
Xuedong Huang
Yan Huang
Yao Tian
Yashesh Gaur
Yifan Gong
Yong Zhao

Accepted Papers

A Pitch-Aware Approach to Single-Channel Speech Separation
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Music Source Separation and Spatial Audio | Poster Area E

Ke Wang, Frank Soong, Lei Xie

A Sparsity Measure for Echo Density Growth in General Environments
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Acoustic Environments and Music Analysis | Poster Area D

Helena Peic Tukuljac, Ville Pulkki, Hannes Gamper, Keith Godin, Ivan Tashev, Nikunj Raghuvanshi

Blind Room Volume Estimation from Single-Channel Noisy Speech
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Acoustic Environments and Music Analysis | Poster Area D

Andrea Genovese, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan Tashev

Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Music Source Separation and Spatial Audio | Poster Area E

Christoph Hold, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan Tashev

Static and Dynamic State Predictions for Acoustic Model Combination
Tuesday, May 14, 2019 | 1:30 PM–3:30 PM | Deep Learning Applications I | Auditorium 2

Kshitiz Kumar, Yifan Gong


Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C

Max W.Y. Lam, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng

Investigation of Sampling Techniques for Maximum Entropy Language Modeling Training
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C

Xie Chen, Jun Zhang, Tasos Anastasakos, Fil Alleva

Recurrent Neural Network Language Model Training Using Natural Gradient
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Language Modeling, ASR and Punctuation Prediction | Poster Area C

Jianwei Yu, Max W.Y. Lam, Xie Chen, Shoukang Hu, Songxiang Liu, Xixin Wu, Xunying Liu, Helen Meng

Towards Code-Switching ASR for End-to-End CTC Models
Tuesday, May 14, 2019 | 5:30 PM–7:30 PM | Multi-lingual Speech Recognition | Poster Area A

Ke Li, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong


Adversarial Speaker Verification
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Features and Robustness for Speaker Identification | Poster Area B

Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong

Attention in Recurrent Neural Networks for Ransomware Detection
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Deep Learning III | Poster Area G

Rakshit Agrawal, Jack W. Stokes, Karthik Selvaraj, Mady Marinescu

Encrypted Speech Recognition Using Deep Polynomial Networks
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Novel Architectures and Training Strategies for ASR | Auditorium 1

Shixiong Zhang, Yifan Gong, Dong Yu

Single-Channel Speech Extraction Using Speaker Inventory and Attention Network
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Source Separation and Speech Enhancement I | Meeting Room 1

Xiong Xiao, Zhuo Chen, Takuya Yoshioka, Hakan Erdogan, Changliang Liu, Dimitrios Dimitriadis, Jasha Droppo, Yifan Gong

Universal Acoustic Modeling Using Neural Mixture Models
Wednesday, May 15, 2019 | 8:30 AM–10:30 AM | Novel Architectures and Training Strategies for ASR | Auditorium 1

Amit Das, Jinyu Li, Changliang Liu, Yifan Gong


Adversarial Speaker Adaptation
Wednesday, May 15, 2019 | 1:30 PM–3:30 PM | Feature Learning and Adaptation for ASR | Auditorium 1

Zhong Meng, Jinyu Li, Yifan Gong

Detecting Cyber Attacks Using Anomaly Detection with Explanations and Expert Feedback
Wednesday, May 15, 2019 | 1:30 PM–3:30 PM | Learning Theory and Methods I | Auditorium 2

Md Amran Siddiqui, Jack W. Stokes, Christian Seifert, Evan Argyle, Robert McCann, Joshua Neil, Justin Carroll


Directional Interference Suppression Using a Spatial Relative Transfer Function Feature
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Quality Measures and Sensor Array Processing | Poster Area D

Sebastian Braun, Ivan Tashev

NN-Based Ordinal Regression for Assessing Fluency of ESL Speech
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Training Regimes for Emotion and Sentiment Analysis | Poster Area C

Shaoguang Mao, Zhiyong Wu, Jingshuai Jiang, Peiyun Liu, Frank Soong

Non-Intrusive Speech Quality Assessment Using Neural Networks
Wednesday, May 15, 2019 | 4:00 PM–6:00 PM | Quality Measures and Sensor Array Processing | Poster Area D

Anderson R. Avila, Hannes Gamper, Chandan Reddy, Ross Cutler, Ivan Tashev, Johannes Gehrke


Conditional Teacher-Student Learning
Thursday, May 16, 2019 | 8:00 AM–10:00 AM | ASR Training Strategies and Toolkits | Poster Area A

Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong

Decoding Homomorphically Encrypted FLAC Audio Without Decryption
Thursday, May 16, 2019 | 8:00 AM–10:00 AM | Audio Security and Source Separation | Poster Area D

Yuanyuan Tang, Bin Zhu, Xiaojing Ma, P. Takis Mathiopoulos, Xia Xie, Hong Huang


Improving Layer Trajectory LSTM with Future Context Frames
Thursday, May 16, 2019 | 1:00 PM–3:00 PM | New Features, Models and Representations/Audio Visual ASR | Poster Area A

Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong


Contextual Out-of-Domain Utterance Handling with Counterfeit Data Augmentation
Thursday, May 16, 2019 | 3:30 PM–5:30 PM | Dialogue | Syndicate 1

Sungjin Lee, Igor Shalyminov

Dilated Residual Network with Multi-Head Self-Attention for Speech Emotion Recognition
Thursday, May 16, 2019 | 3:30 PM–5:30 PM | Architectures for Emotion and Sentiment Analysis | Poster Area B

Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng


Attentive Adversarial Learning for Domain-Invariant Training
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Robust Speech Recognition | Poster Area A

Zhong Meng, Jinyu Li, Yifan Gong

Speech Super Resolution Generative Adversarial Network
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Audio and Speech Applications | Poster Area G

Sefik Emre Eskimez, Kazuhito Koishida

Word Characters and Phone Pronunciation Embedding for ASR Confidence Classifier
Thursday, May 16, 2019 | 6:00 PM–8:00 PM | Signal Processing for Emerging and Practical Applications | Poster Area E

Session Chair: Ivan Tashev
Kshitiz Kumar, Tasos Anastasakos, Yifan Gong


Acoustic and Lexical Sentiment Analysis for Customer Service Calls
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Using Multiple Perspectives in Emotion and Sentiment Analysis | Syndicate 3

Bryan Li, Dimitrios Dimitriadis, Andreas Stolcke

Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Artificial Intelligence Based Human-Machine Conversation Technology for Interactive Education | Syndicate 1

Session Chairs: Yao Qian, Helen Meng, Frank K. Soong
Jingyong Hou, Pengcheng Guo, Sining Sun, Frank K. Soong, Wenping Hu, Lei Xie

Learning Latent Representations for Style Control and Transfer in End-to-End Speech Synthesis
Friday, May 17, 2019 | 8:30 AM–10:30 AM | Speech Synthesis II | Poster Area B

Ya-Jie Zhang, Shifeng Pan, Lei He, Zhen-Hua Ling


Low-Latency Speaker-Independent Continuous Speech Separation
Friday, May 17, 2019 | 1:30 PM–3:30 PM | Speech Separation, Enhancement and Denoising | Poster Area A

Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis


Cross Modal Audio Search and Retrieval with Joint Embeddings Based on Text and Audio
Friday, May 17, 2019 | 4:00 PM–6:00 PM | Multimedia Analysis | Poster Area C

Benjamin Elizalde, Shuayb Zarar, Bhiksha Raj