Work in this area seeks to use computational tools to enable musical creativity, in particular to give novices a variety of new approaches to experience musical creativity. Signal processing and machine learning systems are combined with insights from traditional music creation processes to develop new tools and new paradigms for music.

Projects

Songsmith

Songsmith generates musical accompaniment to match a singer’s voice. Just choose a musical style, sing into your PC’s microphone, and Songsmith will…
I’m Sumit Basu, a Principal Researcher in the Medical Devices Group at Microsoft Research, Redmond. My research focus is on developing interactive, machine-learning-based power tools to help users understand and extract answers from complex data: physiological signals, teaching materials and textbooks, computer systems, auditory signals like speech or music, scientific data, document collections, or the web. These power tools sometimes work by observing users as they perform a task, then assisting them in their efforts once the system understands what’s going on; in other cases (as in teaching), they provide inputs to the user and adaptively refine their strategy based on what works best. The interactive aspect comes from having humans in a tight loop with the learning algorithm: instead of starting from a big batch of labeled data, interactive learning involves a delicate dance between the human and the algorithm to achieve sufficient performance with a minimum of operator effort.
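As a minimal illustration of this kind of tight loop (a toy sketch only — the classifier, uncertainty measure, and stand-in “human” below are illustrative, not any of the actual systems described here), an algorithm can repeatedly ask the human to label just the one example it is least certain about, rather than requiring a big batch of labels up front:

```python
# Toy active-learning loop: query the human only on the example
# the current model is least certain about. All names here are
# hypothetical; the "oracle" stands in for the human labeler.
import random

def train(labeled):
    """Toy 1-D classifier: threshold at the midpoint of class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def uncertainty(x, threshold):
    """Closer to the decision boundary = less certain."""
    return -abs(x - threshold)

def oracle(x):
    """Stand-in for the human: the true concept is x >= 0.5."""
    return 1 if x >= 0.5 else 0

random.seed(0)
pool = [random.random() for _ in range(200)]   # unlabeled pool
labeled = [(0.05, 0), (0.95, 1)]               # two seed labels

for _ in range(10):                            # 10 queries instead of 200 labels
    t = train(labeled)
    query = max(pool, key=lambda x: uncertainty(x, t))
    pool.remove(query)
    labeled.append((query, oracle(query)))

print(f"learned threshold: {train(labeled):.3f}")
```

The point of the sketch is the interaction pattern, not the model: each round, the human labels exactly one example, chosen by the algorithm to be maximally informative.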
These days, I’m particularly interested in how we can use such technologies to detect, analyze, and derive insights from physiological signals with the goal of helping patients monitor and improve their cardiovascular health. This is a deep and complex area, involving problems in signal processing, signal quality estimation, real-time classification, and data mining, as well as fundamental aspects of cardiovascular physiology. If you’re a bright graduate student interested in such problems and curious about internship opportunities, drop me a line!
Established: November 23, 2016
Overview

This page contains supplementary material for our AAAI 2010 paper, “User-Specific Learning for Recognizing a Singer’s Intended Pitch”. The full citation for our paper follows, along with a link to the paper itself:

Guillory A, Basu S, and Morris D. User-Specific Learning for Recognizing a Singer’s Intended Pitch. Proceedings of AAAI 2010, July 2010.

For more information about this work, contact Dan Morris (firstname.lastname@example.org) or Sumit Basu (email@example.com).

Abstract

We consider the problem of…
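For context, a generic, user-independent baseline for this problem simply snaps a detected fundamental frequency to the nearest equal-tempered note (this sketch is illustrative only and is not the paper’s model — the paper’s point is that learning per-user can improve on such a fixed rule):

```python
# Baseline sketch: map a detected fundamental frequency to the
# nearest MIDI note number using the standard equal-temperament
# formula (A4 = 440 Hz = MIDI note 69). Not the paper's method.
import math

def freq_to_midi(f_hz):
    """Nearest MIDI note number for a frequency in Hz."""
    return round(69 + 12 * math.log2(f_hz / 440.0))

print(freq_to_midi(440.0))   # → 69 (A4)
print(freq_to_midi(261.6))   # → 60 (C4)
```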
We present data-driven methods for supporting musical creativity by capturing the statistics of a musical database. Specifically, we introduce a system that supports users in exploring the high-dimensional space of musical chord sequences by parameterizing the variation among chord sequences in popular music. We provide a novel user interface that exposes these learned parameters as control axes, and we propose two automatic approaches for defining these axes. One approach is based on a novel clustering…
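One way such control axes might be derived — a minimal sketch under assumed details, not the paper’s actual algorithm — is to represent each chord sequence by its chord-transition (bigram) counts and then cluster those vectors, with the resulting centroids serving as directions of variation:

```python
# Illustrative sketch: chord sequences -> transition-count vectors
# -> k-means centroids as candidate "control axes". The chord
# vocabulary, songs, and k are all made up for the example.
import random
from collections import Counter

CHORDS = ["C", "F", "G", "Am"]

def transition_vector(seq):
    """Represent a chord sequence by its ordered chord-pair counts."""
    counts = Counter(zip(seq, seq[1:]))
    return [counts[(a, b)] for a in CHORDS for b in CHORDS]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; each centroid summarizes one cluster of sequences."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: dist2(p, centers[i]))].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

songs = [
    ["C", "G", "Am", "F"] * 4,    # common pop loop
    ["C", "F", "C", "G"] * 4,     # simpler loop
    ["Am", "F", "C", "G"] * 4,    # minor-flavored loop
]
vectors = [transition_vector(s) for s in songs]
axes = kmeans(vectors, k=2)
print(len(axes), len(axes[0]))   # → 2 16 (two centroids in 16-dim transition space)
```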
Established: November 23, 2016
Want to Give MySong a Try?

MySong is now Songsmith; you can download a free trial (or the full version, if you’re a teacher who wants to use it in your classroom) at: http://research.microsoft.com/songsmith Happy Songsmith’ing!

Like to Write Music?

Most folks never get a chance to answer this question, since writing music takes years of experience... if you don’t play an instrument or spend lots of time around music, you’ll probably never get to…
Established: November 8, 2010
Sho is an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (in .NET) to enable fast and flexible prototyping. The environment includes powerful and efficient libraries for linear algebra and data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development. Sho is available under the following licenses:…
Established: February 18, 2008
While typical news-aggregation sites do a good job of clustering news stories according to topic, they leave the reader without information about which stories figure prominently in political discourse. BLEWS uses political blogs to categorize news stories according to their reception in the conservative and liberal blogospheres. It visualizes information about which stories are linked to from conservative and liberal blogs, and it indicates the level of emotional charge in the discussion of the news…
Speaker Identification using a Microphone Array and a Joint HMM with Speech Spectrum and Angle of Arrival. Jack W. Stokes, John Platt, Sumit Basu. Institute of Electrical and Electronics Engineers, Inc., July 1, 2006.
Smart Headphones: Enhancing Auditory Awareness Through Robust Speech Detection and Source Localization. Sumit Basu, Brian Clarkson, Alex Pentland. In Proceedings of the Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP '01), Salt Lake City, UT, May 2001.
Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine. Sumit Basu. Perceptual Computing Section, The MIT Media Laboratory, December 1999.
Vision Steered Beam-forming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE). Michael A. Casey, William G. Gardner, Sumit Basu. In Proceedings of the Audio Engineering Society 99th Conference (AES '95), New York, New York, November 1995.
July 21, 2016
University of Washington
July 29, 2014
Anoop Gupta, Sumit Basu, and Rakesh Agrawal
July 19, 2011
Jan Vitek and Sumit Basu
Purdue University, Microsoft Research
June 6, 2008
Danyel Fisher, Douglas Downey, Chris Quirk, Scott Drellishak, Kelly O'Hara, Emily M. Bender, Sumit Basu, Matthew Hurst, Arnd Christian König, Michael Gamon, Chris Brockett, Dmitriy Belenko, Bill Dolan, Jianfeng Gao, and Lucy Vanderwende
Powergrading Short Answer Grading Corpus
A brief history
I received my BS (1995), MEng (1997), and PhD (2002) from MIT, all in Electrical Engineering and Computer Science. I did my graduate work at the Media Lab with Professor Alex Pentland. My doctoral thesis, “Conversational Scene Analysis,” examined how machine learning and signal processing techniques could be used to understand the structure of conversational interactions from auditory signals without recognizing words. The common thread through all of my work to date has been the combination of human interaction and machine learning; fortunately, there is an endless array of application areas of this ilk, especially if one is flexible in one’s definition of interaction.
I recently joined the Medical Devices Group at Microsoft Research. We’ll have much more to say about the exciting project we’re working on very soon.
Our new paper, “Deep Questions without Deep Understanding”, on a new technique for generating high-level (deep) questions from large spans of text (i.e., entire Wikipedia sections, as opposed to individual sentences), will appear at ACL 2015 in July.
ML for Physiological Signals: using machine learning to detect, analyze, and derive insights from physiological signals to help patients monitor and improve their cardiovascular health.
Teaching with Machine Learning: using machine learning to help students and teachers of all ages, across all types of educational goals, achieve their objectives more effectively and efficiently.
Sho: a powerful interactive environment for scientific computing and prototyping based on IronPython. Find out more and download it here. Also check out this code for getting real-time skeleton data from Kinect in Sho.
Songsmith: a songwriting tool that takes melodies and helps develop accompaniments for them. Based on this research with Dan Morris, it’s now a product (with much help from the MSR Advanced Development Team). Check it out and download the trial here. It’s also now free to many educational institutions via MSDN Academic Alliance and the Innovative Teachers’ Network.
Music Analysis/Synthesis: using machine learning to help users understand, manipulate, and create music
Systems and Machine Learning: using machine learning to address problems in computer systems
Conversational Scene Analysis: seeking structure and content from conversational patterns
Sponsorship Chair, Intelligent User Interfaces 2015 (IUI’15).
PC Member, Learning at Scale 2015 (L@S 2015).
Co-Chair (with Jonathan Huang and Kalyan Veeramachaneni), DDE 2013: Workshop on Data-Driven Education at NIPS 2013.
PC Member, AAAI 2011 NECTAR Track.
Co-Chair (with Ashish Kapoor), Workshop on Analysis and Design of Algorithms for Interactive Machine Learning (ADA-IML’09) at NIPS 2009.
PC Member, IJCAI’09.
Co-Chair (with Archana Ganapathi, Emre Kiciman, and Fei Sha), MLSys’07: Workshop on Statistical Learning Techniques for Systems Problems at NIPS 2007.
PC Member, Systems and Machine Learning Workshop (SysML 2007) at NSDI 2007.
For fall quarter 2007, Emre Kiciman and I taught a graduate course on Systems Applications of Machine Learning (cse599n) at the University of Washington.
In other quarters, I co-taught the Markovia Seminar (cse590mv) on Machine Learning with Tanzeem Choudhury at the University of Washington.