Sumit Basu

Senior Principal Researcher

About

I’m Sumit Basu, a Senior Principal Researcher in the Biomedical Computing Group in the Health Futures organization at Microsoft Research, Redmond. My current research focus is on investigating whether the benefits of generative large language models (LLMs), which have proved so powerful in text, code, and image domains, might also extend to the languages of biology: DNA, RNA, proteins, and beyond. While we are actively measuring the performance of these models against benchmark tasks, we are most passionate about seeking out how these models may make a real impact on the most challenging problems in biology, the keys to diagnosing and treating disease.

From a broader view, my research over the last three decades has been centered on developing interactive, machine-learning-based power tools to assist users in understanding and extracting answers from complex data – genomic signals, physiological signals, teaching material/textbooks, computer systems, auditory signals, scientific data, document collections, the web, and more. These power tools sometimes work by observing a user as they perform a task, then assisting them in their efforts once the tool understands what’s going on; in other cases they analyze data on a user’s behalf to provide candidate insights, then adaptively refine their strategy based on the user’s feedback. The interactive aspect comes from having humans in a tight loop with the learning algorithm, a delicate dance between human and machine. Ultimately, rather than replacing humans, this approach seeks to amplify human capabilities, empowering all of us to reach further and faster towards new discoveries.

If you’re a bright graduate student or researcher interested in such problems and curious about internship/full-time opportunities, please drop me a line or check our current job listings in Health Futures here.