Published:
May 2003
Proceedings:
Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference (FLAIRS 2003)
Volume
Issue:
Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference (FLAIRS 2003)
Track:
All Papers
Downloads:
Abstract:
We present a new keyword extraction algorithm that applies to a single document without using a corpus. Frequent terms are extracted first, then a set of cooccurrence between each term and the frequent terms, i.e., occurrences in the same sentences, is generated. Co-occurrence distribution shows importance of a term in the document as follows. If probability distribution of co-occurrence between term a and the frequent terms is biased to a particular subset of frequent terms, then term a is likely to be a keyword. The degree of biases of distribution is measured by the χ2-measure. Our algorithm shows comparable performance to tfidf without using a corpus.
FLAIRS
Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference (FLAIRS 2003)
ISBN 978-1-57735-177-1
Published by The AAAI Press, Menlo Park, California.