Large-scale real-time social media analytics provides a novel set of conditions for the construction of predictive models. Individual users serve as training and test instances, and their associated content (“lexical features”) and context (“network features”) become available incrementally over time as the users converse on discussion forums. We propose several approaches to handling this dynamic data for predicting latent user properties, ranging from traditional batch training and testing, to incremental bootstrapping, to active learning via interactive rationale crowdsourcing.
We also study the relationships between a variety of predicted user properties, opinions, and emotions on a large sample of users in an online social network. We first correlate user demographics and personality with the emotional profile emanating from user tweets. We then analyze the relationships between predicted user properties and user-environment emotional contrast, estimated over various neighborhoods including friends, retweeted users, and mentioned users. Finally, we analyze and compare the predictive power of latent user properties, emotions, and interests for automatically inferring showing-off and self-promoting behaviors projected in online social networks.