Kaushik Chakrabarti is a senior researcher in the Data Management, Exploration and Mining group at Microsoft Research. His research interests include database management, information retrieval, and data, text and web mining. He has published 50+ research papers in these areas and holds 30+ patents. Two of his papers have won best paper awards (ACM SIGMOD 2001 and VLDB 2000). His work has shipped in several Microsoft products including Bing, Cortana, Office 365 and Dynamics. He regularly serves on the program committees of top-tier conferences like ACM SIGMOD, ICDE, WWW, ICDM and ICME, and as a reviewer for reputed journals like ACM TODS, VLDB Journal and TKDE. In the past, he has served as an associate editor of TKDE, a member of the editorial board of the Distributed and Parallel Databases Journal, PC vice-chair of the ICDE 2012 conference and program co-chair of DBRank 2011 (at VLDB 2011). Kaushik received his Bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur and his M.S. and Ph.D. degrees in Computer Science from the University of Illinois at Urbana-Champaign.

Research Interests

Kaushik’s interests span many aspects of data management, information retrieval and data mining. He is specifically interested in the following topics:

  • New ways to search and explore datasets on the web and inside the enterprise
  • Extracting structured data from unstructured or semi-structured information
  • Scalable data analytics in emerging platforms like MapReduce and graph engines
  • Automatic ways to visualize data

Recent News

(6/2016) Algorithmically generated table answers ship in Bing! This is based on our research on web tables. Bing now shows a table as an answer to a query with list or superlative intent. Try a search like drugs for high cholesterol, largest cities in the world, longest life expectancy countries, top computer science schools, richest county in usa, renaissance painters from italy or mlb stadiums on Bing and see table answers! We are working on a V2 to dramatically increase the coverage!

(8/2015) Synonym technology ships in Bing’s entity-linking technology which in turn powers several applications like Bing Snapp, Ask Cortana, and Bing Knowledge Widget!

(4/2015) Algorithmically generated table captions ship in Bing! This is based on the research on web tables we have been doing over the last few years. Try a search like highest mountains in usa, list of futurama characters, airports in florida or breaking bad episodes on Bing and see table captions! The table caption is shown as part of the snippet of the top (sometimes second or third) algo result and typically complements the information shown in the vertical answer (e.g., carousel) shown on top.

(12/2013) I have been invited to serve as area chair for the Applications and Experiences (DB Track) area in CIKM 2014.

(10/2013) Synonyms technology ships in Dynamics AX for Retail to enhance e-commerce search

(7/2013) Synonym API becomes part of Bing Dev Center

(7/2013) Microsoft announced that PowerQuery will be part of Office 365

(3/2013) Our work on web table search shipped in Excel Power Query. This work was done in close collaboration with SQL Server and Excel groups. Please try the add-in out and give us feedback.

(10/2012) Our synonym work was released externally via the Synonym API. Please try it out and give us feedback!

(7/2012) I have been invited to give a keynote talk at the Workshop on Entity-oriented and Semantic Search (JIWES) at SIGIR 2012. Here is the title and abstract.

(12/2011) I have been invited to serve as the area chair for the “Query Processing and Optimization” track in ICDE 2012

(8/2011) I have been invited to serve as a workshop co-chair for VLDB 2014, to be held in beautiful Hangzhou, China.

(7/2011) The Distributed and Parallel Databases Journal (Springer’s international journal on database management and information retrieval) is planning a special issue on ranking in databases. I am the editor of this special issue. If you are working on ranking in databases, please consider submitting your work to this special issue. The deadline for the paper submission is October 7, 2011. The call for papers can be found on the journal web site: http://www.springer.com/journal/10619

(4/2011) The fifth international workshop on ranking in databases (DBRank) 2011 will be held in conjunction with VLDB 2011 in Seattle, WA, USA. Davide Martinenghi and I are the program co-chairs. If you are working on ranking, please consider submitting a paper to DBRank 2011. The deadline is June 7, 2011.



Concept Expansion

Established: November 10, 2014

Given a concept name and seed entities, return entities and tables in this concept. Sway Presentation

Query Result Navigation

Established: January 9, 2014

Exploratory queries on a database often return too few or too many results (e.g., a home search query on a database of available homes). In such cases, the user faces the challenges of (i) navigating through too many results and/or (ii) refining the query. This project focuses on innovative ways to help users when they face these challenges. Specifically, we explored two ways to help the user. First, to rank the query results…

Synonym Mining

Established: January 7, 2014

The same entity is often referred to in a variety of ways. For example, the camera Canon 600d is also referred to as "canon rebel t3i", the celebrity Jennifer Lopez is also referred to as "jlo" and Seattle Tacoma International Airport is also referred to as "sea tac". These are known as synonyms. Without knowledge of synonyms, many applications like e-commerce search will fail to return relevant results. We leverage the data assets amassed by…

Web Data Extraction and Search

Established: February 9, 2013

The goal of this project is to extract structured data on the web (like HTML tables, lists, spreadsheets, etc.) and make it accessible/searchable on Bing and Office 365. Some of the technical challenges: Table classification and understanding: The vast majority of HTML tables are used for formatting/layout purposes; they do not contain any useful content. How do we automatically filter out such tables? Furthermore, there are various types of tables like relational tables (each row…

Entity Search and Query Portals

Established: March 20, 2011

The goal of entity search is to return entities (e.g., people, products, locations) relevant to a keyword query. The goal of Query Portals is to go one step further and return not only the names of relevant entities but a rich set of information associated with each entity. Often, users issuing keyword searches are not looking for documents but for entities residing in a structured database. Consider a user searching for products (product search), people (expert search/celebrity…

Data Exploration

Established: June 8, 2004

This is a project area rather than a specific project. This project area focuses on novel ways to query, browse, extract, explore, mine and manage various kinds of data residing within the enterprise and on the web: structured data in relational databases, tabular data embedded in web pages, enterprise documents and spreadsheets as well as unstructured data in query logs, text documents and social media. Our research is relevant to both enterprise and consumer scenarios…

  • Ph.D., Computer Science, University of Illinois at Urbana-Champaign, 2001.
  • M.S., Computer Science, University of Illinois at Urbana-Champaign, 1999.
  • B.Tech., Computer Science and Engineering, Indian Institute of Technology, Kharagpur, 1996.

Recent Professional Activities

Best Paper Awards

  • Our paper "Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases" was awarded Best Paper of SIGMOD 2001
  • Our paper "Approximate Query Processing Using Wavelets" was awarded “Best of VLDB 2000” (along with 4 other papers) and invited to the “Best of VLDB 2000” issue of VLDB Journal


All my work is in collaboration with the terrific engineers and researchers in the Data Management, Exploration and Mining Group. I also collaborate with various business groups within Microsoft to transfer our research work to Microsoft’s products and services. Finally, I had the good fortune to work with an amazing set of students over the last decade:

  • Seung-won Hwang (2003, 2004)
  • Dong Xin (2005, 2006)
  • Tianyi Wu (2008)
  • Mei Hui (2009)
  • Senjuti Basu Roy (2010)
  • Bahman Bahmani (2010, mentored by Dong Xin)
  • Chi Wang (2011, 2012, 2013)
  • Ndapa Nakashole (2011)
  • Mohamed Yakout (2011, mentored by Kris Ganjam)
  • Manish Gupta (2011, mentored by Tao Cheng)
  • Meihui Zhang (2012)
  • Bilyana Taneva (2012, mentored by Tao Cheng)
  • Jeffrey Jestes (2012, mentored by Kris Ganjam)
  • Yanyan Shen (2013)
  • Mohan Yang (2013, mentored by Bolin Ding)
  • Fotis Psallidas (2014, 2015)
  • Vasilis Verroios (2015)
  • Amit Chavan (2016)