Noisy, Missing and Corrupted Data

  • Constantine Caramanis | The University of Texas at Austin

Models for sparse regression typically assume that the covariates are fully observed and free of noise. Particularly in high-dimensional applications, this is often not the case. In this talk, we develop efficient algorithms in the style of orthogonal matching pursuit (OMP) to deal with precisely this setting. Our algorithms are as efficient as OMP, and improve on the best-known results for missing and noisy data in regression, both in the high-dimensional setting where we seek to recover a sparse vector from only a few measurements, and in the classical low-dimensional setting where we recover an unstructured regressor. In the high-dimensional setting, our support-recovery algorithm does not even require knowledge of the noise statistics. Along the way, we also obtain improved performance guarantees for OMP for the standard sparse regression problem with Gaussian noise. Time permitting, we present some results on the more difficult setting of arbitrarily corrupted covariates.
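For readers unfamiliar with OMP, the following is a minimal sketch of the standard algorithm that the abstract uses as its baseline. It is not the modified method presented in the talk: it assumes clean, fully observed covariates. The function name `omp` and the parameters `X`, `y`, and `k` are illustrative choices, not taken from the talk.

```python
import numpy as np

def omp(X, y, k):
    """Standard orthogonal matching pursuit (illustrative sketch).

    Greedily selects k columns of X to explain y, refitting by least
    squares after each selection. Assumes clean, fully observed X.
    """
    n, p = X.shape
    support = []                 # indices of the selected columns
    residual = y.astype(float)   # part of y not yet explained (a copy)
    coef = np.zeros(0)
    for _ in range(k):
        # Score each column by its absolute correlation with the residual.
        scores = np.abs(X.T @ residual)
        if support:
            scores[support] = -np.inf  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit on all selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    beta = np.zeros(p)
    beta[support] = coef
    return beta, sorted(support)
```

Calling `omp(X, y, k)` returns the estimated coefficient vector and the recovered support. With noisy or missing entries in `X`, the correlation and refitting steps above become biased, which is the difficulty the talk addresses.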

Speaker Details

I am an Assistant Professor in the ECE department of The University of Texas at Austin. I received a PhD in EECS from The Massachusetts Institute of Technology, in the Laboratory for Information and Decision Systems (LIDS), and an A.B. in Mathematics from Harvard University. I received the NSF CAREER award in 2011.

My current research interests focus on decision-making in large-scale complex systems. Specifically, I am interested in robust and adaptable optimization, high-dimensional statistics and machine learning, and applications to large-scale networks, including wireless networks, transportation networks, and energy networks. I also work on applications of machine learning and optimization to computer-aided design.