Machine Learning and Optimization

Established: November 30, 2011

The Machine Learning and Optimization Group of Microsoft Research pushes the state of the art in machine learning. Our work spans the space from theoretical foundations of machine learning, to new machine learning systems and algorithms, to helping our partner product groups apply machine learning to large and complex tasks.

Machine Learning Seminar

Date | Location | Speaker | Title
5/6/2014 | 99/1927 | Guy Lebanon | Local Low-Rank Matrix Approximation
4/22/2014 | 99/1927 | Junming Yin | TBD
3/26/2014 | 99/1927 | Scott Sanner | Data-driven Decision-making
3/24/2014 | 99/1927 | Jonathan Huang | Data Driven Student Feedback For MOOCs: Global Scale Education for the 21st century
2/18/2014 | 99/1927 | Csaba Szepesvari | Sparse Stochastic Bandits
2/6/2014 | 99/1915 | Eric Xing | On The Algorithmic and System Interface of BIG LEARNING
1/31/2014 | 99/1927 | Bin Yu | Modeling Visual Cortex V4 in Naturalistic Conditions with Invariant and Sparse Image Representations
1/21/2014 | 99/1915 | Sebastian Bubeck | The linear bandit problem
1/14/2014 | 99/1915 | Robert Schapire | Explaining AdaBoost


Date | Location | Speaker | Title
4/22/2013 | 99/1927 | Quoc Le | Scaling Deep Learning to 10,000 Cores and Beyond
3/27/2013 | 99/1919 | Abhishek Kumar | Algorithms for Near-Separable Nonnegative Matrix Factorization
3/26/2013 | 99/1927 | Dong Yu | Deep Neural Network for Speech Recognition – Insights and Advances
3/13/2013 | 99/1915 | Manik Varma | Multi-Label Learning with Millions of Labels for Query Recommendation
1/29/2013 | 99/1919 | Qiang Liu | Belief Propagation Algorithms for Crowdsourcing


Date | Location | Speaker | Title
10/25/2011 | 99/1927 | Daniel Hsu | Efficient algorithms for high-dimensional bandit problems
11/8/2011 | 99/1927 | Ran Gilad-Bachrach | The Median Hypothesis
11/16/2011 | 99/4800 | Marina Meila | Consensus finding, exponential models, and infinite rankings
11/22/2011 | 99/1927 | Andre Martins | Structured Prediction in NLP: Dual Decomposition and Structured Sparsity
12/6/2011 | 99/1927 | Alekh Agarwal | Learning and stochastic optimization with non-i.i.d. data
1/3/2012 | 99/1927 | Li Deng | Deep learning for Information Processing
1/17/2012 | 99/1927 | David McAllester | Generalization Bounds and Consistency for Latent-Structural Probit and Ramp Loss
1/31/2012 | 99/1927 | Murali Haran | Gaussian processes for inference with implicit likelihoods
2/8/2012 | 99/1927 | Qiaozhu Mei | The Foreseer: Integrative Retrieval and Mining of information in Online Communities
2/14/2012 | 99/1927 | Chong Wang | Hierarchical Bayesian modeling: efficient inference and applications
3/13/2012 | 99/1927 | Xi Chen | Optimization for General Structured Sparse Learning
3/20/2012 | 99/1927 | Jonathan Goldstein | Temporal Analytics on Big Data for Web Advertising
3/21/2012 | 99/1915 | Anima Anandkumar | High-Dimensional Estimation via Graphical Approaches: Methods and Guarantees
3/27/2012 | 99/1927 | Antony Joseph | Achieving information-theoretic limits in high-dimensional regression
4/3/2012 | 99/1927 | Hau-tieng Wu | Vector Diffusion Maps, Connection Laplacian and their applications
4/10/2012 | 99/1927 | Christopher Ré | Going Hogwild!: Parallelizing Incremental Gradient Methods and Matrix Mean Inequalities
4/11/2012 | 99/1927 | Lihong Li | Machine Learning in the Bandit Setting: Algorithms, Evaluation, and Case Studies
5/1/2012 | 99/1927 | Christian Shelton | The case for continuous time
5/4/2012 | 99/1927 | Ben Recht | The Convex Geometry of Inverse Problems
5/8/2012 | 99/1927 | Yucheng Low | GraphLab2: Distributed Graph-Parallel Computation on Natural Graphs
5/14/2012 | 99/1927 | Lise Getoor | Collective Graph Identification
5/15/2012 | 99/1927 | John Langford | A Reliable Effective Terascale Linear Learning System
6/12/2012 | 99/1927 | Kilian Weinberger | mSDA: A fast and easy-to-use way to improve bag-of-words features
6/19/2012 | 99/1927 | Miro Dudik | Tractable market making in combinatorial prediction markets
7/3/2012 | 99/1927 | Abhradeep Guha Thakurta | Differentially Private Learning on Large, Online and High-dimensional Data
7/18/2012 | 99/3042 | James Bergstra | Grid Search is a Bad Hyper-parameter Optimization Algorithm
8/7/2012 | 99/1927 | Karthik Sridharan | TBD
8/20/2012 | 99/1927 | Saeed Amizadeh | Variational Dual-Tree Framework for Large-Scale Transition Matrix Approximation







Memory Limited, Streaming PCA
Ioannis Mitliagkas, Constantine Caramanis, Prateek Jain, in Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, December 1, 2013.








Provable Non-convex Optimization for Machine Learning Problems

Established: April 4, 2014

In this work, we explore theoretical properties of simple non-convex optimization methods for problems that feature prominently in several important areas, such as recommendation systems, compressive sensing, and computer vision.

Talks:
Provable Non-convex Optimization for Machine Learning. Summer School on Non-convex Optimization, IIT Bombay, 2015.
Iterative Hard Thresholding for Sparse/Low-rank Linear Regression. INRIA, France, 2015.
Iterative Hard Thresholding for Robust Regression. ITW, 2015.
Provable Alternating Minimization methods for…


Established: July 12, 2012

Many networked applications that run in the background on a mobile device incur a significant energy drain when they use the cellular radio interface for communication. This is mainly due to the radio tail: the cellular radio remains in a high-energy state for up to 20 s after each communication spurt. To cut energy consumption, many recent devices employ fast dormancy, a feature that forces the client radio to quickly go into a low…