Markov Topic Models

Chong Wang; Bo Thiesson; Chris Meek; David Blei

D. van Dyk and M. Welling (Eds.), Proceedings of The Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS) 2009, JMLR: W&CP 5, April 2009

Published by Journal of Machine Learning Research

We develop Markov topic models (MTMs), a novel family of generative probabilistic models that can learn topics simultaneously from multiple corpora, such as papers from different conferences. We apply Gaussian (Markov) random fields to model the correlations between the corpora. MTMs capture both the internal topic structure within each corpus and the relationships between topics across corpora. We derive an efficient estimation procedure based on variational expectation-maximization. We study the performance of our models on a corpus of abstracts from six different computer science conferences. Our analysis yields qualitative discoveries that are not possible with traditional topic models, as well as improved quantitative performance over the state of the art.