Track: Data Mining
Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs
- Qiaozhu Mei (University of Illinois at Urbana-Champaign)
- Xu Ling (University of Illinois at Urbana-Champaign)
- Matthew Wondra (University of Illinois at Urbana-Champaign)
- Hang Su (Vanderbilt University)
- ChengXiang Zhai (University of Illinois at Urbana-Champaign)
In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model that captures the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments. It can also provide general sentiment models that are applicable to any ad hoc topic. With a specifically designed hidden Markov model (HMM) structure, the sentiment models and topic models estimated with TSM can be used to extract topic life cycles and sentiment dynamics. Experiments on different Weblog datasets show that this approach is effective for modeling topic facets and sentiments and for extracting their dynamics from Weblog collections. The TSM model is quite general: it can be applied to any text collection with a mixture of topics and sentiments, and thus has many potential applications, such as search result summarization, opinion tracking, and user behavior prediction.