New England Machine Learning Day 2013


The second New England Machine Learning Day will be held May 1, 2013, from 10:00 AM to 5:00 PM at Microsoft Research New England, One Memorial Drive, Cambridge, MA 02142. The event will bring together local academics and researchers in machine learning and its applications.

10:00‑10:05, Jennifer Chayes (MSR)
Opening remarks

10:10‑10:40, Sham Kakade (MSR)
Learning latent structure in documents, social networks, and more…

In many applications, we face the challenge of modeling the interactions between multiple observations and hidden causes; such problems range from document retrieval, where we seek to model the underlying topics, to community detection in social networks. The (unsupervised) learning problem is to accurately estimate the model (e.g. the hidden topics, the underlying clusters, or the hidden communities in a social network) with only samples of the observed variables. In practice, many of these models are fit with local search heuristics. This talk will overview how simple and scalable linear algebra approaches provide closed form estimation methods for a wide class of these models, including Gaussian mixture models, hidden Markov models, topic models (including latent Dirichlet allocation), and mixed membership models for communities in social networks.

10:45‑11:15, Stefanie Tellex (Brown)
Learning Word Meanings for Human-Robot Interaction

As robots become more powerful and autonomous, it is critical to develop ways for untrained users to quickly and easily tell them what to do. Natural language is a powerful and flexible modality for conveying complex requests, but in order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. I will present approaches to learning these grounded meaning representations from a corpus of natural language sentences paired with a robot’s perceptual model of the environment. The robot can use these learned models to recognize events, follow commands, ask questions, and request help.

11:20‑11:50, Pablo Parrilo (MIT)
From Sparsity to Rank, and Beyond: algebra, geometry, and convexity

Optimization problems involving sparse vectors or low-rank matrices are of great importance in applied mathematics and engineering. They provide a rich and fruitful interaction between algebraic-geometric concepts and convex optimization, with strong synergies with popular techniques like L1 and nuclear norm minimization. In this lecture we will provide a gentle introduction to this exciting research area, highlighting key algebraic-geometric ideas as well as a survey of recent developments, including extensions to very general families of parsimonious models such as sums of a few permutation matrices, low-rank tensors, orthogonal matrices, and atomic measures, as well as the corresponding structure-inducing norms. Based on joint work with Venkat Chandrasekaran, Maryam Fazel, Ben Recht, Sujay Sanghavi, and Alan Willsky.

Posters and lunch

1:45‑2:15, Erik Sudderth (Brown)
Toward Reliable Bayesian Nonparametric Learning

Applications of Bayesian nonparametrics increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure. While basic inference algorithms such as the Gibbs sampler are easily generalized to such models, in practice they can fail in subtle and hard-to-diagnose ways. We explore this issue via variants of a simple and popular nonparametric Bayesian model, the hierarchical Dirichlet process. By optimizing variational learning objectives in non-traditional ways, we build improved models of text, image, and social network data.

2:20‑2:50, Ryan Adams (Harvard)
Practical Bayesian Optimization of Machine Learning Algorithms

Machine learning algorithms frequently involve careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a “black art” requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. I will describe my recent work on solving this problem with Bayesian nonparametrics, using Gaussian processes. This approach of “Bayesian optimization” models the generalization performance as an unknown objective function with a GP prior. I will discuss new algorithms that account for variable cost in function evaluation and take advantage of parallelism in evaluation. These new algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation for text analysis, structured SVMs for protein motif finding, and convolutional neural networks for visual object recognition.

Coffee break

3:20‑3:50, Hanna Wallach (UMass Amherst)
Machine Learning for Complex Social Processes

From the activities of the US Patent Office or the National Institutes of Health to communications between scientists or political legislators, complex social processes (groups of people interacting with each other in order to achieve specific and sometimes contradictory goals) underlie almost all human endeavor. In order to draw thorough, data-driven conclusions about complex social processes, researchers and decision-makers need new quantitative tools for exploring, explaining, and making predictions using massive collections of interaction data. In this talk, I will discuss the development of machine learning methods for modeling interaction data. I will concentrate on exploratory analysis of communication networks, specifically the discovery and visualization of topic-specific subnetworks in email data sets. I will present a new Bayesian latent variable model of network structure and content and explain how this model can be used to analyze intra-governmental email networks.

3:55‑4:25, Cynthia Rudin (MIT)
ML for the Future: Healthcare, Energy, and the Internet

I will overview recent applications of ML to some of society’s critical domains, including healthcare, energy grid reliability, and information retrieval. Specifically:
1) Stroke risk prediction in medical patients, using ML techniques for interpretable predictive modeling.
2) Energy grid reliability in New York City, using point process models.
3) Growing a list using the Internet, using clustering techniques.
These examples show how applications can drive the development of effective new ML techniques.
Collaborators: Ben Letham, Seyda Ertekin, Tyler McCormick, David Madigan, and Katherine Heller

4:30‑5:00, Antonio Torralba (MIT)
Who is to blame in object detection failures?

Posters


1. Priors for Diversity in Generative Latent Variable Models by James Zou and Ryan Adams.

2. Generalized Random Utility Models by Hossein Azari, David C. Parkes, and Lirong Xia.

3. Approximate Inference in Collective Graphical Models by Daniel Sheldon, Tao Sun, Akshat Kumar, and Thomas G. Dietterich.

4. Discovering Structure in Spiking Networks by Scott Linderman and Ryan Adams.

5. Poisson Statistics and the Future of Internet Marketing by Delaram Motamedvaziri, Mohammad Hossein Rohban, Venkatesh Saligrama, and David Castanon.

6. Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events by Lisa Friedland, David Jensen, and Michael Lavine.

7. An Impossibility Result for High Dimensional Supervised Learning by M. H. Rohban, P. Ishwar, B. Orten, W. C. Karl, and V. Saligrama.

8. Localizing 3D Cuboids in Single-view Images by Jianxiong Xiao, Bryan C. Russell, and Antonio Torralba.

10. Accelerating Inference: Towards a Full Language, Compiler and Hardware Stack by Lyric Labs – Analog Devices.

11. Efficient Nearest-Neighbor Search in the Probability Simplex by Kriste Krstovski, David A. Smith, Hanna M. Wallach, Andrew McGregor, and Michael J. Kurtz.

12. Image Caption Generation by Rebecca Mason.

13. The Gesture Recognition Toolkit by Nicholas Gillian and Joseph Paradiso.

14. The incidental parameter problem in network analysis for neural spiking data by Dahlia Nadkarni and Matthew Harrison.

15. Knowledge Mining Blood Pressure Data with Dynamic Bayesian Network Modeling by Alex Waldin, Kalyan Veeramachaneni, and Una-May O’Reilly.

16. The network you keep: Graphlet-Based discrimination of persons of interest by Saber Shokat Fadaee, Javed A. Aslam, Nikos Passas, and Ravi Sundaram.

17. Probabilistic reasoning about human edits in information integration by Michael Wick, Ari Kobren, and Andrew McCallum.

18. Spectral Discovery of Clinical Autism Phenotypes with Subspace Regularization by Finale Doshi-Velez, Deniz Oktay, Ben Mayne, and Isaac Kohane.

19. Predicting Age Distribution—A Generative Bayesian Model by Huseyin Oktay, Aykut Firat, and David Jensen.

20. An Improved Message-Passing Algorithm Incorporating Certainty Information by Nate Derbinsky, José Bento Ayres Pereira, Veit Elser, and Jonathan S. Yedidia.

21. A New Geometric Approach to Latent Topic Modeling and Discovery by Weicong Ding, Mohammad H. Rohban, Prakash Ishwar, and Venkatesh Saligrama.

22. Coco-Q: Learning in Stochastic Games with Side Payments by Eric Sodomka, Elizabeth Hilliard, Amy Greenwald, and Michael Littman.

23. Modeling Clinical Prognosis by Learning Interpretable Representations from Massive Health Data by Rohit Joshi and Peter Szolovits.

24. An Efficient Atomic Norm Minimization Approach to Identification of Low Order Models by Burak Yilmaz, Constantino Lagoa, and Mario Sznaier.

25. Agglomerative Clustering of Bagged Data Using Joint Distributions by David Arbour, James Atwood, Ahmed El-Kishky, and David Jensen.

26. Hankel Based Maximum Margin Classifiers: A Connection Between Machine Learning and Wiener Systems Identification by Fei Xiong, Yongfang Cheng, Octavia Camps, Mario Sznaier, and Constantino Lagoa.

27. Fitting Large-Scale GLMs with Implicit Updates by Panos Toulis, Jason Rennie, and Edo Airoldi.

28. Automatic delineation of radiosensitive structures in CT images using statistical appearance models and level sets by Karl D. Fritscher and Gregory Sharp.

29. Topic-Partitioned Multinetwork Embeddings by Peter Krafft, Juston Moore, Bruce Desmarais, and Hanna Wallach.

30. Evaluating Crowdsourcing Participants in the Absence of Ground-Truth by Ramanathan Subramanian, Romer Rosales, Glenn Fung, and Jennifer Dy.

31. Nonparametric Mixture of Gaussian Processes with Constraints by James C. Ross.

32. Sparse Signal Processing with Linear and Non-Linear Observations: A Unified Shannon Theoretic Approach by Cem Aksoylar, George Atia, and Venkatesh Saligrama.

33. More Efficient Dual Decomposition for Corpus Wide Inference by Alexandre Passos, David Belanger, Sebastian Riedel, and Andrew McCallum.

34. Learning with Irregularly Sampled Time Series Data by Steve Cheng-Xian Li and Benjamin M. Marlin.

35. Batch-iFDD for Representation Expansion in Large MDPs by Alborz Geramifard, Tom Walsh, Nicholas Roy, and Jonathan How.

36. Leveraging Hierarchical Structure in Diagnostic Codes for Predicting Incident Heart Failure by Anima Singh and John Guttag.

37. Layered Model for Video Analysis by Deqing Sun, Jonas Wulff, Erik B. Sudderth, Hanspeter Pfister, and Michael J. Black.

38. FlexGP: a Divide and Conquer Approach to Machine Learning on the Cloud by Kalyan Veeramachaneni, Owen Derby, Dylan Sherry, and Una-May O’Reilly.

39. Density Estimation and Anomaly Detection Using the Relevance Vector Machine by Jose Lopez.

40. Reasoning about Independence in Probabilistic Models of Relational Data by Marc Maier, Katerina Marazopoulou, and David Jensen.

41. On a Particle-Stabilized Wang-Landau Algorithm by Luke Bornn, Pierre Jacob, Arnaud Doucet, and Pierre Del Moral.

42. Posterior Consistency for the Number of Components in a Finite Mixture by Jeffrey W. Miller and Matthew T. Harrison.


Edo Airoldi, Harvard
Tommi Jaakkola, MIT
Adam Tauman Kalai, Microsoft Research, Chair
Andrew McCallum, UMass Amherst