Manifold learning and missing data recovery through unsupervised regression

  • Miguel A. Carreira-Perpinan ,
  • Zhengdong Lu

ICDM |

We propose an algorithm that, given a high-dimensional dataset with missing values, achieves the distinct goals of learning a nonlinear low-dimensional representation of the data (the dimensionality reduction problem) and reconstructing the missing high-dimensional data (the matrix completion, or imputation, problem). The algorithm follows the Dimensionality Reduction by Unsupervised Regression approach, where one alternately optimizes over the latent coordinates given the reconstruction and projection mappings, and vice versa; but here we also optimize over the missing data, using an efficient, globally convergent Gauss-Newton scheme. We also show how to project or reconstruct test data with missing values. We achieve impressive reconstructions while learning good latent representations in image restoration with 50% missing pixels.