Probabilistic Latent Variable Decompositions for Image and Audio Analysis
- Bhiksha Raj and Paris Smaragdis | Mitsubishi Electric Research Labs (MERL)
In this talk we present a model that decomposes probability densities into sets of shift-invariant components. We will show that this model is well suited to audio and image problems and demonstrate its application to complex data sets. We will also explore its relationship to well-known decomposition methods (such as PCA, ICA, NMF, and PARAFAC) and argue for its appropriateness when dealing with density-like representations. Finally, we will present extensions that allow the model to perform sparse coding, discover Markovian components, and model a priori dispositions and temporal relationships through standard statistical models.
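As a rough illustration of what decomposing a density into latent components can look like, the sketch below fits the simplest case, without shift invariance: a normalized nonnegative matrix is treated as a joint density P(x, y) and approximated as a sum over a latent variable z of P(z) P(x|z) P(y|z), using expectation-maximization. This is not the speakers' implementation. The function name `plca`, its parameters, and the random test matrix are assumptions made for this example only.

```python
import numpy as np

def plca(V, n_components, n_iter=200, seed=0):
    """EM fit of P(x, y) ~= sum_z P(z) P(x|z) P(y|z) for a nonnegative matrix V.

    Hypothetical helper for illustration; not taken from the talk.
    """
    rng = np.random.default_rng(seed)
    P = V / V.sum()                      # treat the data as a joint density P(x, y)
    n_x, n_y = P.shape
    eps = 1e-12                          # guards against division by zero

    # Random initial factors; each column is normalized to be a distribution.
    pz = np.full(n_components, 1.0 / n_components)
    px_z = rng.random((n_x, n_components))
    px_z /= px_z.sum(axis=0)
    py_z = rng.random((n_y, n_components))
    py_z /= py_z.sum(axis=0)

    for _ in range(n_iter):
        # E-step: posterior P(z | x, y), shape (n_x, n_y, n_components).
        joint = pz[None, None, :] * px_z[:, None, :] * py_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)

        # M-step: weight the posterior by the observed density, then renormalize.
        w = P[:, :, None] * post
        pz = w.sum(axis=(0, 1))
        px_z = w.sum(axis=1) / (pz + eps)
        py_z = w.sum(axis=0) / (pz + eps)
        pz = pz / pz.sum()

    return pz, px_z, py_z

# Usage: decompose a synthetic rank-2 nonnegative matrix and rebuild the density.
demo_rng = np.random.default_rng(1)
V = demo_rng.random((6, 2)) @ demo_rng.random((2, 8))
pz, px_z, py_z = plca(V, n_components=2)
approx = (px_z * pz) @ py_z.T            # model estimate of P(x, y)
```

Every factor here is itself a probability distribution, so the estimate stays nonnegative and sums to one. That is one reason a model of this kind may suit density-like inputs such as magnitude spectrograms or image intensities.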
Speaker Details
Bhiksha Raj is a Staff Scientist at MERL. He received his Ph.D. from Carnegie Mellon University (CMU) in May 2000. Dr. Raj works mainly on algorithmic aspects of speech recognition, with special emphasis on improving the robustness of speech recognition systems to environmental noise. His latest work has been on sound source separation and the use of novel secondary sensors for speech processing.
Paris Smaragdis joined MERL in late 2002 as a research scientist. His main interests are auditory scene analysis and self-organizing computational perception. Before coming to MERL he was a postdoctoral associate at MIT, where he also obtained his Ph.D. in perceptual computing. His most recent work has been on sound source separation, multimodal statistics, and audio classification.
Jeff Running