Shu Huang, Qiankun Zhao, Prasenjit Mitra, C. Lee Giles
In this paper, we propose a novel approach to expand queries by exploring both location information and topic information of the queries. Users at different locations tend to have different vocabularies, while the different expressions coming from different vocabularies may relate to the same topics. Thus these expressions are identified as location sensitive and can be used for query expansion. We propose a hierarchical query expansion model, which employs a two-level SVM classification model to classify queries as location sensitive or location non-sensitive, where the former are further classified into same location sensitive and different location sensitive. For the location sensitive queries, we propose an LDA based topic-level query similarity measure to rank the list of similar queries. Experiments with 2G raw log data from CiteSeer and Excite show that our hierarchical classification model predicts the query location sensitivity with more than 80% precision and that the final search result is significantly better than existing query expansion methods.
Subjects: 1. Applications; 1.10 Information Retrieval
Submitted: Apr 15, 2008