Clustering and Approximate Identification of Frequent Item Sets

Selim Mimaroglu, Dan Simovici

We propose an algorithm that computes an approximation of the set of frequent item sets using a bit-sequence representation of the associations between items and transactions. The algorithm is obtained by modifying a hierarchical agglomerative clustering algorithm and takes advantage of the speed of bit operations. It is substantially faster than standard implementations of the Apriori technique and, under certain conditions, recovers most of the frequent item sets.
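
As an illustration of the bit-sequence idea only, the sketch below stores one bit sequence per item (bit t is set when transaction t contains the item) and counts the support of an item set with a bitwise AND followed by a population count. The greedy merging loop is a simplified stand-in for an agglomerative procedure and is an assumption made for illustration, not the algorithm of the paper; the names `item_bitsets`, `support`, `greedy_merge`, and the toy transactions are hypothetical.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def item_bitsets(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bits = {}
    for t, items in enumerate(transactions):
        for item in items:
            bits[item] = bits.get(item, 0) | (1 << t)
    return bits


def support(itemset, bits, n_transactions):
    """Count the transactions that contain every item of itemset."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= bits.get(item, 0)     # keep only transactions that contain item
    return popcount(acc)


def greedy_merge(bits, min_support):
    """Illustrative agglomerative pass (an assumption, not the paper's method).

    Starts from frequent single items and repeatedly merges the pair of
    clusters whose ANDed bit sequences have the largest support, as long as
    that support reaches min_support. Returns every item set formed.
    """
    clusters = {frozenset([i]): b for i, b in bits.items()
                if popcount(b) >= min_support}
    found = set(clusters)
    while True:
        best = None
        keys = sorted(clusters, key=sorted)  # deterministic iteration order
        for i, x in enumerate(keys):
            for y in keys[i + 1:]:
                s = popcount(clusters[x] & clusters[y])
                if s >= min_support and (best is None or s > best[0]):
                    best = (s, x, y)
        if best is None:
            return found
        _, x, y = best
        clusters[x | y] = clusters.pop(x) & clusters.pop(y)
        found.add(x | y)


# Toy data, invented for this example.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
bits = item_bitsets(transactions)
frequent = greedy_merge(bits, min_support=3)
```

On this toy input the greedy pass finds the three single items and {a, b}, but not {a, c} or {b, c}, although both also have support 3. That gap is the kind of loss an approximate method trades for avoiding Apriori-style candidate enumeration.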

Subjects: 12. Machine Learning and Discovery; 11. Knowledge Representation

Submitted: Jan 31, 2007
