Machine Learning, Natural Language Systems and Applications

Understanding, analyzing and processing data is at the core of building most modern large-scale systems and intelligent applications. We work on various aspects of data-driven algorithms, including fast machine learning algorithms that can learn from large-scale and noisy data, learning from natural language and multilingual data, and applications of these algorithms to areas such as Web Search, Computational Advertising, Recommendation Systems, Natural Language Systems and Music Retrieval. Some of the current focus areas of the group are:

  • Learning from a large number of data points, including distributed machine learning, single-box learning at scale, and streaming algorithms.
  • Learning from a large number of dimensions, including high-dimensional statistics, low-rank and sparse learning, dimensionality reduction and feature selection.
  • Learning from a large number of tasks, including extreme classification and bandit learning with a large number of arms.
  • Hashing and dimensionality reduction techniques for multilingual technologies, including cross-language text search/classification, transliteration, spelling normalization and fuzzy search of entities.
  • IR and NLP techniques for code-mixing and script-mixing.
  • Text mining techniques for harvesting experiential knowledge from natural language data.
  • Music retrieval and recommendation.

Find out more about our research on: