Bayesian Combination of Crowd-Based Tweet Sentiment Analysis Judgments

Human Computer Interaction International Conference

Crowdsourcing at Scale 2013 Workshop Shared Task Challenge (Joint winners)

In this paper we describe the probabilistic model that we used in the CrowdScale – Shared Task Challenge 2013 to process the CrowdFlower dataset, a collection of crowdsourced text sentiment judgments. Specifically, the dataset contains 569,786 sentiment judgments, collected from 1,960 judges, for 98,979 tweets about the weather. The challenge is to compute the most reliable estimate of the true sentiment of each tweet from its judgments, taking into account the possible noise and biases of the judges as well as properties of the tweet text. To address this challenge, we developed a Bayesian model that infers the true sentiment of each tweet by combining signals from both the crowd labels and the words in the tweets. The model represents the reliability of each judge with a confusion matrix, and the likelihood of each dictionary word belonging to a given sentiment class with a mixture of bag-of-words models. The two components are combined to learn the latent true tweet sentiments. We discuss our scalable implementation of the model in the Infer.NET framework and present preliminary results showing that the model outperforms the majority voting baseline.
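To make the combination of judge confusion matrices and word likelihoods concrete, the following is a minimal Python sketch and not the Infer.NET implementation described above. It assumes that all parameters are already known and fixed, whereas the model in the paper infers them jointly with the tweet sentiments. It also uses only two sentiment classes for brevity. The function names, judge identifiers, word, and every numeric value are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Baseline: return the most frequently reported label."""
    return Counter(labels).most_common(1)[0][0]


def posterior_sentiment(judgments, words, confusion, word_probs, prior):
    """Posterior over the true class of one tweet, with fixed parameters.

    judgments  : list of (judge_id, reported_label) pairs
    words      : list of words in the tweet
    confusion  : confusion[judge][true][reported] = P(reported | true, judge)
    word_probs : word_probs[true][word] = P(word | true)
    prior      : prior[true] = P(true)
    """
    log_post = []
    for c in range(len(prior)):
        lp = math.log(prior[c])
        # Crowd signal: each judgment is weighted by that judge's confusion matrix.
        for judge, label in judgments:
            lp += math.log(confusion[judge][c][label])
        # Text signal: bag-of-words likelihood of the tweet under class c.
        for w in words:
            lp += math.log(word_probs[c][w])
        log_post.append(lp)
    # Normalise in log space for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Hypothetical example: class 0 = negative, class 1 = positive.
confusion = {
    "a": [[0.90, 0.10], [0.10, 0.90]],  # assumed reliable judge
    "b": [[0.55, 0.45], [0.45, 0.55]],  # assumed near-random judge
    "c": [[0.55, 0.45], [0.45, 0.55]],  # assumed near-random judge
}
word_probs = [{"sunny": 0.2}, {"sunny": 0.6}]  # assumed P(word | class)
prior = [0.5, 0.5]

judgments = [("a", 1), ("b", 0), ("c", 0)]
words = ["sunny"]

baseline = majority_vote([label for _, label in judgments])
posterior = posterior_sentiment(judgments, words, confusion, word_probs, prior)
estimate = max(range(len(posterior)), key=posterior.__getitem__)
```

In this toy setting the majority vote picks class 0 because two of the three judges reported it, while the posterior favours class 1 because the one assumed-reliable judge and the word evidence outweigh the two near-random judges. This illustrates why modelling judge reliability and tweet text can disagree with, and potentially improve on, simple vote counting.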