Mining Latent Associations of Objects Using a Typed Mixture Model – A Case Study on Expert/Expertise Mining
This paper studies the problem of discovering latent associations among objects in text documents. Specifically, given two sets of objects and several types of co-occurrence data for those objects in texts, we aim to discover the hidden, or latent, associative relationships between the two sets. Existing methods are not directly applicable because they cannot take all of these co-occurrence types into account. For example, the Separable Mixture Model (SMM), a probabilistic mixture model proposed by Hofmann, can use only one type of co-occurrence to mine latent associations. This paper proposes a more general probabilistic mixture model, the Typed Separable Mixture Model (TSMM), which can use all types of co-occurrence within a single framework. Experimental results on the expert/expertise mining task show that TSMM significantly outperforms SMM.