Frank Seide

Principal Researcher


Frank Seide is a Senior Researcher and Research Manager at Microsoft Research Asia, Beijing, responsible for research efforts on the transcription of phone calls and voicemail and on content-based indexing of video and audio.



Established: October 13, 2008

The Microsoft Audio Video Indexing Service (MAVIS) uses state-of-the-art speech recognition technology developed at Microsoft Research to enable searching of audio and video files that contain speech. Additionally, MAVIS automatically generates closed captions and keywords, which can increase the accessibility and discoverability of audio and video files with speech content. MAVIS is available as a cloud service running on the Windows Azure platform. MAVIS is now available programmatically through Azure Media Services, where it is referred to as the "Azure Media Services Indexer" (Indexer).…




An Introduction to Computational Networks and the Computational Network Toolkit
Dong Yu, Adam Eversole, Mike Seltzer, Kaisheng Yao, Oleksii Kuchaiev, Yu Zhang, Frank Seide, Zhiheng Huang, Brian Guenter, Huaming Wang, Jasha Droppo, Geoffrey Zweig, Chris Rossbach, Jie Gao, Andreas Stolcke, Jon Currey, Malcolm Slaney, Guoguo Chen, Amit Agarwal, Chris Basoglu, Marko Padmilac, Alexey Kamenev, Vladimir Ivanov, Scott Cypher, Hari Parthasarathi, Bhaskar Mitra, Baolin Peng, Xuedong Huang, Microsoft Research, October 1, 2014











Frank was born in Hamburg, Germany. In 1993, he received a Master's degree in electrical engineering from the University of Technology of Hamburg-Harburg. His research interests are in automatic speech recognition and audio analysis, with a current focus on modeling and algorithms for large-vocabulary conversational speech recognition, spoken-dialogue systems, and audio search.

From 1993 to 1997, Frank worked in the speech research group of Philips Research in Aachen, Germany, on spoken-dialogue systems. He then transferred to Taiwan as one of the founding members of Philips Research East-Asia, Taipei, to lead a research project on Mandarin speech recognition.

In June 2001, he joined the speech group at Microsoft Research Asia, initially as a Researcher, then from 2003 as Project Leader for offline speech applications, and from October 2006 as Research Manager. Since September 2010, he has held the title of Senior Researcher.

Research Interests

  • Current area of interest: Speech as Content
  • Stuff I worked on in the past
    • Using GPGPU for speech recognition
    • Music processing: Query by Humming, Music Steering (demo video)
    • Computer Auditory Scene Analysis (CASA)
    • Dialogue Systems