TableRank: A Ranking Algorithm for Table Search and Retrieval

Ying Liu, Kun Bai, Prasenjit Mitra, C. Lee Giles

Tables are ubiquitous in web pages and scientific documents. With the explosive growth of the web, tables have become a valuable information repository, and searching them effectively and efficiently has become a challenge. Existing search engines do not provide satisfactory results for two main reasons: their ranking schemes are inadequate for table search, and automatic table understanding and extraction are difficult in general. In this work, we design and evaluate a novel table ranking algorithm, TableRank, to improve the performance of our table search engine, TableSeer. Given a keyword-based table query, TableRank enables TableSeer to return the most relevant tables by tailoring the classic vector space model. TableRank adopts an innovative term weighting scheme that aggregates multiple weighting factors from three levels: term, table, and document. The experimental results show that our table search engine outperforms existing search engines on table search. In addition, incorporating multiple weighting factors significantly improves the ranking results.
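
As a rough illustration of ranking tables with a vector space model whose term weights combine factors from several levels, consider the minimal sketch below. It is not the TableRank algorithm itself: the abstract does not give the weighting formulas, so the smoothed IDF (term level), the per-field boosts in `FIELD_BOOST` (table level), and the `doc_boost` multiplier (document level) are all hypothetical choices made only for illustration, as are the function names and the table dictionary layout.

```python
import math
from collections import Counter

# Hypothetical table-level factor: terms in captions or headers count more
# than terms in ordinary cells. These values are assumptions, not from the paper.
FIELD_BOOST = {"caption": 2.0, "header": 1.5, "cell": 1.0}


def table_vector(table, idf):
    """Build a weighted term vector for one table.

    table["fields"] maps a field name to a list of tokens. Each occurrence
    contributes idf(term) scaled by the boost of the field it appears in.
    """
    vec = Counter()
    for field, tokens in table["fields"].items():
        boost = FIELD_BOOST.get(field, 1.0)
        for tok in tokens:
            vec[tok] += boost * idf.get(tok, 0.0)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tables(query_tokens, tables):
    """Return table ids ordered from most to least relevant to the query."""
    n = len(tables)

    # Term-level factor: smoothed inverse frequency over the table collection.
    df = Counter()
    for t in tables:
        df.update({tok for toks in t["fields"].values() for tok in toks})
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

    qvec = Counter()
    for tok in query_tokens:
        qvec[tok] += idf.get(tok, 0.0)

    # Document-level factor: a hypothetical per-document multiplier
    # (for example, a quality or freshness score); defaults to 1.0.
    scored = [
        (cosine(qvec, table_vector(t, idf)) * t.get("doc_boost", 1.0), t["id"])
        for t in tables
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [table_id for _, table_id in scored]
```

Calling `rank_tables(["protein", "binding"], tables)` with a list of such table dictionaries returns their ids in ranked order. Applying the field boost inside the vector and the document factor outside the cosine is a design choice of this sketch: a uniform multiplier inside the vector would be cancelled by cosine normalization.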

Subjects: 1.10 Information Retrieval; 11. Knowledge Representation

Submitted: Apr 19, 2007

This page is copyrighted by AAAI. All rights reserved.