Fitting a Graph to Vector Data


July 13, 2011


Jonathan Kelner




In this talk, I will set forth a general approach to many of the major problems in Machine Learning, including classification, regression and clustering, based on ideas from spectral graph theory. Applying this approach will yield a simple and clean methodology that gives very good solutions to these problems, outperforming the state-of-the-art on many of the standard data sets.
In recent years, a number of researchers have gained insight by fitting graphs to their data and then using these graphs to solve various problems. They have employed simply defined graphs that are easy to compute, associating a vertex of the graph with each data vector, and then connecting vertices whose vectors are sufficiently close, sometimes with weights depending on the distance. Not surprisingly, different results are obtained by the use of different graphs, and researchers have studied how to combine different graphs in a way that tends to give heavier weight to the better graphs.
I will examine what can be gained by choosing the graphs with more care. To this end, I will pose the question,
“What is the right graph to fit to a set of vectors in Rn?”
I will then propose an answer based on the graph Laplacian and a discrete analogue of the harmonic energy of a manifold, and I will discuss how to surmount the algorithmic challenges that arise in computing the optimal graphs according to this measure.
These graphs have several interesting theoretical properties and relate to a variety of other concepts in the machine learning literature, including manifold learning and nonlinear dimensionality reduction. In addition, I will show how they can be viewed as a generalization of both k-means clustering and singular value decompositions.
Joint work with Daniel Spielman and Samuel Daitch.


Jonathan Kelner

Jonathan Kelner is an Assistant Professor of Applied Mathematics in the MIT Department of Mathematics and a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research focuses on the application of techniques from pure mathematics to the solution of fundamental problems in algorithms and complexity theory.
He was an undergraduate at Harvard University, and he received his Ph.D. in Computer Science from MIT in 2006. Before joining the MIT faculty, he spent a year as a member of the Institute for Advanced Study. He has received a variety of awards for his work, including an NSF CAREER Award, an Alfred P. Sloan Research Fellowship, the NEC Award for Research in Computers and Communication, the Sprowls Doctoral Dissertation Award, the Best Student Paper Award at STOC 2004, the Best Paper Awards at SIGMETRICS 2011 and STOC 2011, and the Harold E. Edgerton Faculty Achievement Award.