Tutorials Session A – Deep Mathematical Properties of Submodularity with Applications to Machine Learning

Date

January 8, 2014

Speaker

Jeff A. Bilmes

Affiliation

University of Washington

Overview

Submodular functions have received significant attention in the mathematics community owing to their natural and wide-ranging applicability. Submodularity has a very simple definition, which belies a treasure trove of consequent mathematical richness. This tutorial will attempt to convey some of this richness.
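For readers who have not seen it, the simple definition mentioned above can be written out as follows. The notation (ground set V, set function f, subsets A and B, element v) is chosen here for illustration and is not taken from the tutorial itself.

```latex
% A set function f : 2^V \to \mathbb{R} is submodular if, for all A, B \subseteq V,
\[
  f(A) + f(B) \;\ge\; f(A \cup B) + f(A \cap B).
\]
% Equivalent "diminishing returns" form: for all A \subseteq B \subseteq V and v \in V \setminus B,
\[
  f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B).
\]
```

The second form says that the gain from adding an element can only shrink as the set it is added to grows.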

We will start by defining submodularity and polymatroidality, then survey a surprisingly diverse set of functions that are submodular and operations that (sometimes remarkably) preserve submodularity. Next, we’ll define the submodular polytope and its relationship to the greedy algorithm, which exactly and efficiently solves certain linear programs with an exponential number of constraints. We will see how submodularity shares certain properties with convexity (efficient minimization, discrete separation, subdifferentials, lattices and sub-lattices, and the convexity of the Lovász extension), shares others with concavity (via its definition, submodularity via concave functions, superdifferentials), and has some that resemble neither (simultaneous sub- and super-differentials, efficient approximate maximization). The Lovász extension will be given particular attention due to its growing use for structured convex norms and surrogates in relaxation methods. We will survey both constrained and unconstrained submodular optimization (including the minimum norm point algorithm), discussing what is currently known about hardness (both upper and lower bounds), and also when algorithms or instances are practical.
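As a rough illustration of two ideas named above, the sketch below checks submodularity by brute force on a tiny ground set and evaluates the Lovász extension using the greedy (sorted) ordering. It is a minimal sketch, not material from the tutorial: the helper names coverage, is_submodular, and lovasz_extension are hypothetical, the example function is a simple set-coverage function, and the code assumes f is normalized so that f of the empty set is 0.

```python
from itertools import combinations


def coverage(sets):
    """Return f(A) = number of items covered by the sets indexed by A.

    Coverage functions are a standard example of submodular functions.
    """
    def f(A):
        covered = set()
        for i in A:
            covered |= sets[i]
        return len(covered)
    return f


def is_submodular(f, ground):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B).

    Exponential in len(ground); only meant for tiny illustrative examples.
    """
    subsets = [frozenset(c)
               for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    return all(f(A) + f(B) >= f(A | B) + f(A & B)
               for A in subsets for B in subsets)


def lovasz_extension(f, w):
    """Evaluate the Lovasz extension of f at w (dict: element -> weight).

    Sort elements by decreasing weight, then sum each weight times the
    marginal gain of adding that element to the growing prefix set.
    Assumes f(empty set) == 0.
    """
    order = sorted(w, key=w.get, reverse=True)
    value, prefix, prev = 0.0, set(), f(set())
    for e in order:
        prefix.add(e)
        cur = f(prefix)
        value += w[e] * (cur - prev)  # weight times marginal gain
        prev = cur
    return value
```

On the indicator vector of a set A (weight 1 for elements of A, 0 otherwise), lovasz_extension returns f(A), which is the sense in which it extends f from sets to real vectors.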

As to applications, it is interesting that a submodular function itself can often be seen as a parameter to instantiate a machine-learning instance. This includes active/semi-supervised learning, structured sparsity-inducing norms, combinatorial independence and generalized entropy, and rank-order based divergences. Other examples include feature selection, data subset (or core set) selection, inference in graphical models with high tree-width and global potentials in computer vision, and influence determination in social networks.

Speakers

Jeff A. Bilmes

Jeff A. Bilmes is a professor in the Department of Electrical Engineering at the University of Washington, Seattle, and an adjunct professor in Computer Science & Engineering and the Department of Linguistics. He received his Ph.D. in computer science from the University of California, Berkeley. He is a 2001 NSF CAREER award winner, a 2002 CRA Digital Government Fellow, a 2008 NAE Gilbreth Lectureship award recipient, and a 2012/2013 ISCA Distinguished Lecturer. His primary interests lie in signal processing for pattern classification, speech recognition, language processing, bioinformatics, machine learning, graphical models, submodularity in combinatorial optimization and machine learning, active and semi-supervised learning, computer vision, and audio/music processing. Beginning in 2003, Prof. Bilmes was among the first to apply submodularity in machine learning.