Small Variance Asymptotics, Bayesian Nonparametrics, and k-means
- Brian Kulis | Ohio State University
Bayesian approaches to clustering permit great flexibility: existing models can handle cases where the number of clusters is not known upfront, or where one wants to share clusters across multiple data sets. Despite this flexibility, simpler methods such as k-means remain the preferred choice in many applications due to their simplicity and scalability.
One way to view k-means from a probabilistic perspective is as the limiting case of a mixture-of-Gaussians model in which the covariance of each cluster tends to zero. This talk explores similar asymptotics over a rich class of Bayesian nonparametric models, leading to several new algorithms that combine the simplicity of k-means with the flexibility of Bayesian nonparametrics. The methods discussed include:
- a k-means-like algorithm, based on asymptotics of the Dirichlet process mixture model, that does not fix the number of clusters upfront (sketched below);
- an algorithm for clustering multiple data sets, based on the hierarchical Dirichlet process;
- an overlapping-clustering algorithm based on asymptotics of the beta process;
- a k-means-like topic modeling algorithm arising from asymptotics over a Bayesian nonparametric hierarchical multinomial mixture.
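To make the zero-covariance limit concrete, here is a brief sketch of the standard small-variance argument; the notation below is introduced for illustration and is not taken from the talk. For a mixture of $k$ spherical Gaussians with shared covariance $\sigma^2 I$, the negative log-density of a point $x$ under component $c$ is $\frac{1}{2\sigma^2}\lVert x - \mu_c \rVert^2$ plus terms that do not depend on the assignment. As $\sigma^2 \to 0$, the posterior over assignments concentrates on the nearest mean, and maximizing the likelihood reduces to minimizing the k-means objective:

$$\sum_{i=1}^{n} \min_{1 \le c \le k} \lVert x_i - \mu_c \rVert^2 .$$

Applying the same limit to a Dirichlet process mixture, as in the first method above, yields a k-means-style objective in which the number of clusters $k$ is itself a variable, penalized at a rate $\lambda > 0$:

$$\sum_{c=1}^{k} \sum_{i : z_i = c} \lVert x_i - \mu_c \rVert^2 + \lambda k ,$$

so a new cluster is opened only when doing so reduces the total distortion by more than $\lambda$.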
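An algorithm that locally minimizes this penalized objective alternates hard assignments and mean updates in the style of Lloyd's algorithm, but starts a new cluster whenever a point is farther than $\lambda$ (in squared distance) from every existing mean. Below is a minimal NumPy sketch of this idea; the function name `dp_means`, the initialization, and the stopping rule are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def dp_means(X, lam, max_iters=100):
    """k-means-like clustering with a penalty `lam` per cluster;
    the number of clusters is chosen by the data, not fixed upfront."""
    n, _ = X.shape
    mus = [X.mean(axis=0)]            # start with a single global cluster
    z = np.zeros(n, dtype=int)        # cluster assignment of each point
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            d2 = np.array([np.sum((X[i] - mu) ** 2) for mu in mus])
            c = int(d2.argmin())
            if d2[c] > lam:           # too far from every mean: open a new cluster
                mus.append(X[i].copy())
                c = len(mus) - 1
            if z[i] != c:
                z[i] = c
                changed = True
        # recompute means; drop any cluster that lost all of its points
        keep = [c for c in range(len(mus)) if np.any(z == c)]
        mus = [X[z == c].mean(axis=0) for c in keep]
        z = np.searchsorted(keep, z)  # compact assignment labels
        if not changed:
            break
    return z, np.vstack(mus)
```

The penalty plays the role that $k$ plays in k-means: a larger `lam` makes opening a new cluster more expensive and therefore yields fewer clusters.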
Speaker Details
I am an assistant professor in the CSE department at Ohio State University.
Previously, I spent three years as a postdoc at UC Berkeley EECS (Computer Science Division), and was also affiliated with ICSI, where I had the good fortune to work with Trevor Darrell, Stuart Russell, Michael Jordan, and Peter Bartlett. Broadly speaking, I am interested in all aspects of machine learning, with an emphasis on applications to computer vision. Most of my research focuses on making it easier to analyze and search complex, large-scale data. A major focus is on large-scale optimization for core problems in machine learning such as metric learning, content-based search, clustering, and online learning. I am increasingly interested in large-scale graphical models, Bayesian inference, and Bayesian nonparametrics.
I finished my Ph.D. in computer science in November 2008, supervised by Inderjit Dhillon in the computer science department at the University of Texas at Austin. I did my undergrad in computer science and mathematics at Cornell University. I have also worked with John Platt and Arun Surendran at Microsoft Research on large-scale optimization, and as an undergraduate, I worked with John Hopcroft on tracking topics in networked data over time. During the Fall 2007 semester, I was a research fellow at the Institute for Pure and Applied Mathematics at UCLA.