AAAI Publications, Third International AAAI Conference on Weblogs and Social Media

Detecting Topic Drift with Compound Topic Models
Dan Knights, Michael C. Mozer, Nicolas Nicolov

Last modified: 2009-07-07


The Latent Dirichlet Allocation topic model of Blei, Ng, and Jordan (2003) is well established as an effective approach to recovering meaningful topics of conversation from a set of documents. However, a useful analysis of user-generated content is concerned not only with the recovery of topics from a static data set, but also with the evolution of topics over time. We employ a compound topic model (CTM) to track topics across two distinct data sets (i.e., past and present) and to visualize trends in topics over time; we evaluate several metrics for detecting a change in the distribution of topics within a time window; and we illustrate how our approach discovers emerging conversation topics related to current events in real data sets.
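
As an illustration of what a change-detection metric over topic distributions might look like, the following is a minimal sketch, not the method used in the paper. The abstract does not name the metrics evaluated; Jensen-Shannon divergence is chosen here purely as an assumption, and the function names, the per-topic count inputs, and the threshold value are all hypothetical.

```python
import math

def normalize(counts):
    """Turn per-topic document counts for one time window into a distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("window contains no topic assignments")
    return [c / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def drift_detected(past_counts, present_counts, threshold=0.1):
    """Flag drift when the two windows' topic distributions diverge.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    p = normalize(past_counts)
    q = normalize(present_counts)
    return js_divergence(p, q) > threshold

# Hypothetical topic counts for a past window and a present window.
past = [90, 10, 0]
present = [10, 10, 80]
print(drift_detected(past, present))
```

A symmetric, bounded measure is convenient in this setting because a topic that is absent from one window (an emerging topic, for example) does not make the score infinite, as it would with plain KL divergence.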
