
Mike Seltzer

Principal Researcher


I have been a Researcher in the Speech Technology Group at Microsoft Research since October 2003. I did my graduate work in the Department of Electrical and Computer Engineering at Carnegie Mellon University, receiving my M.S. in 2000 and my Ph.D. in 2003. While at CMU, I was a member of the Robust Speech Recognition group, led by my advisor, Professor Richard Stern. My dissertation research focused on improving recognition accuracy in hands-free environments using microphone arrays. From 1996 to 1998, I worked at Teradyne, Inc. as an Applications Engineer. Teradyne makes automatic test equipment (ATE) for the semiconductor industry. I received my B.S. from Brown University in 1996. At Brown, I worked in the Laboratory for Engineering Man/Machine Systems (LEMS) with Professor Harvey Silverman on a huge microphone array project that was called, in fact, the Huge Microphone Array.

My research interests include speech recognition in adverse environments, acoustic modeling, microphone array processing, speech enhancement, and machine learning for speech and audio applications.


Dialog and Conversational Systems Research

Established: March 14, 2014

Conversational systems interact with people through language to assist, enable, or entertain. Research at Microsoft spans dialogs that use language exclusively or in conjunction with additional modalities like gesture; where language is spoken or in text; and in a variety of settings, such as conversational systems in apps or devices, and situated interactions in the real world.

Projects: Spoken Language Understanding

Acoustic Modeling

Established: January 29, 2004

Acoustic modeling of speech typically refers to the process of establishing statistical representations for the feature vector sequences computed from the speech waveform. The hidden Markov model (HMM) is one of the most common types of acoustic model. Other acoustic models include segmental models, super-segmental models (including hidden dynamic models), neural networks, maximum entropy models, and (hidden) conditional random fields. Acoustic modeling also encompasses "pronunciation modeling", which describes how a sequence or multi-sequences of fundamental speech units (such as phones or…

Noise Robust Speech Recognition

Established: February 19, 2002

Techniques to improve the robustness of automatic speech recognition systems to noise and channel mismatches

Robustness of ASR Technology to Background Noise

You have probably seen that most people using speech dictation software wear a close-talking microphone. So, why has senior researcher Li Deng been trying to get rid of close-talking microphones? Close-talking microphones pick up relatively little background noise, and speech recognition systems can obtain decent accuracy with them. If you are…