About

I am a Principal Lead Scientist in the Applied Sciences Group in the Experiences + Devices organization. I have been with Microsoft since 2000. My areas of interest are signal processing and machine learning for audio, speech, computer vision, and other sensor data.

Past projects

  • Audio and voice compression: Bitrate/bandwidth scalable codec, MELP codec at 1.2kbps, and Windows Media Audio and Voice codec
  • Audio matching: Voice note application and music recognition service
  • Microphone array processing: Beamforming and sound source localization
  • Audio/voice detection and recognition: Keyword spotting and speaker identification
  • Speech enhancement: Audio/visual fusion and bandwidth expansion

Education

  • B.S. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1994
  • M.S. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1995
  • Ph.D. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1998. Dissertation title: Speech Coding Based on Mel-Generalized Cepstral Analysis
  • Postdoctoral researcher at the Signal Compression Lab at the University of California, Santa Barbara, 1998-2000

Other

Journal Papers

  • S.M. Eskimez, K. Koishida, and Z. Duan, “Adversarial training for speech super-resolution,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 2, pp. 347-358, May 2019.
  • C. Zhang, K. Koishida, and J. Hansen, “Text independent speaker verification based on triplet convolutional neural network embeddings,” IEEE/ACM Trans. Audio, Speech, and Language Processing, vol. 26, no. 9, pp. 1633–1644, Sept. 2018.
  • K. Koishida, K. Tokuda, T. Masuko, and T. Kobayashi, “Vector quantization of speech spectral parameters using statistics of static and dynamic features,” IEICE Trans. Inf. & Syst., vol. E84-D, no. 10, pp. 1427-1434, Oct. 2001.
  • K. Koishida, G. Hirabayashi, K. Tokuda, and T. Kobayashi, “A 16kbit/s wideband CELP-based speech coder using mel-generalized cepstral analysis,” IEICE Trans. Inf. & Syst., vol. E83-D, no. 4, pp. 876-883, Apr. 2000.
  • K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, “CELP speech coding based on mel-generalized cepstral analysis,” IEICE Trans., vol. J81-A, no. 2, pp. 252-260, Feb. 1998 (in Japanese).
  • K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, “Spectral representation of speech based on mel-generalized cepstral coefficients and its properties,” IEICE Trans., vol. J80-A, no. 11, pp. 1999-2006, Nov. 1997 (in Japanese).

Selected Conference Papers

  • S.M. Eskimez and K. Koishida, “Speech super resolution generative adversarial network,” in Proc. ICASSP, 2019, pp. 3717-3721.
  • C. Zhang and K. Koishida, “End-to-end text-independent speaker verification with flexibility in utterance duration,” in Proc. IEEE Automatic Speech Recognition and Understanding Workshop, 2017, pp. 584–590.
  • C. Zhang and K. Koishida, “End-to-end text-independent speaker verification with triplet loss on short utterances,” in Proc. INTERSPEECH, 2017, pp. 1487–1491.
  • S. Mehrotra, W. Chen, K. Koishida, and N. Thumpudi, “Hybrid low bitrate audio coding using adaptive gain shape vector quantization,” in Proc. IEEE 10th Workshop on Multimedia Signal Processing, 2008, pp. 927-932.
  • T. Wang, K. Koishida, V. Cuperman, A. Gersho, and J.S. Collura, “A 1200/2400 bps coding suite based on MELP,” in Proc. IEEE Workshop on Speech Coding, 2002, pp. 90-92.
  • K. Koishida, V. Cuperman, and A. Gersho, “A 16kbit/s bandwidth scalable audio coder based on the G.729 standard,” in Proc. ICASSP, 2000, pp. 1149-1152.
  • T. Wang, K. Koishida, V. Cuperman, A. Gersho, and J.S. Collura, “A 1200 bps speech coder based on MELP,” in Proc. ICASSP, 2000, pp. 1375-1378.
  • K. Koishida, J. Linden, V. Cuperman, and A. Gersho, “Enhancing MPEG-4 CELP by jointly optimized inter/intra-frame LSP predictors,” in Proc. IEEE Workshop on Speech Coding, 2000, pp. 90-92.
  • K. Koishida, G. Hirabayashi, K. Tokuda, and T. Kobayashi, “A 16kbit/s wideband CELP coder using mel-generalized cepstral analysis and its subjective evaluation,” in Proc. ICSLP, 1998, vol. 6, pp. 2583-2586.
  • K. Koishida, G. Hirabayashi, K. Tokuda, and T. Kobayashi, “A wideband CELP speech coder at 16 kbit/s based on mel-generalized cepstral analysis,” in Proc. ICASSP, 1998, vol. 1, pp. 161-164.
  • K. Koishida, K. Tokuda, T. Masuko, and T. Kobayashi, “Spectral quantization using statistics of static and dynamic features,” in Proc. IEEE Workshop on Speech Coding, 1997, pp. 19-20.
  • K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, “Efficient encoding of mel-generalized cepstrum for CELP coders,” in Proc. ICASSP, 1997, vol. 2, pp. 1355-1358.
  • K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, “CELP coding system based on mel-generalized cepstral analysis,” in Proc. ICSLP, 1996, vol. 1, pp. 314-317.
  • K. Koishida, K. Tokuda, T. Kobayashi, and S. Imai, “CELP coding system based on mel-cepstral analysis,” in Proc. ICASSP, 1995, vol. 1, pp. 33-317.

Ph.D. Thesis

  • “Low bit rate speech coding based on mel-generalized cepstral analysis,” Tokyo Institute of Technology, 1998.