Aggregation of Imprecise and Uncertain Information for Knowledge Discovery in Databases

Sally McClean, Bryan Scotney and Mary Shapcott

We consider the problem of aggregation for uncertain and imprecise data. For such data, we define aggregation operators and use them to provide information on properties and patterns of data attributes. The aggregates that we define use the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple data values. We are thus able to provide a probability distribution for the domain values of an attribute or group of attributes using imperfect data. Information stored in a database is often subject to uncertainty and imprecision. An extended relational data model has previously been proposed for such data; it allows us to quantify our uncertainty and imprecision about attribute values by representing each value as a probability distribution. Our aggregation operators are defined on this data model. The provision of such operators is a central requirement in furnishing a database with the capability to perform the operations necessary for Knowledge Discovery in Databases.
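
As an illustration of the idea only, the following minimal sketch assumes that every tuple already stores a complete probability distribution over the same attribute domain. Under that assumption, the distribution q that minimizes the summed Kullback-Leibler divergence D(p || q) from the tuple distributions p is their arithmetic mean. The function names, the example data, and the restriction to fully specified distributions are assumptions made for this sketch; the operators defined in the paper may differ, in particular in how they treat partially specified values.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def aggregate(tuples):
    """Hypothetical aggregate: the arithmetic mean of the tuple distributions.

    For fully specified distributions this mean minimizes
    sum_r D(p_r || q) over all distributions q on the same domain.
    """
    n = len(tuples)
    k = len(tuples[0])
    return [sum(t[i] for t in tuples) / n for i in range(k)]

def total_divergence(tuples, q):
    """Summed divergence from each tuple distribution to a candidate q."""
    return sum(kl_divergence(p, q) for p in tuples)

# Illustrative data: one attribute with three domain values.
tuples = [
    [1.0, 0.0, 0.0],  # precise value: all mass on the first domain value
    [0.5, 0.5, 0.0],  # imprecise value: one of the first two, equally likely
    [0.2, 0.3, 0.5],  # uncertain value: full distribution over the domain
]

agg = aggregate(tuples)
uniform = [1.0 / 3] * 3

print("aggregate:", [round(x, 4) for x in agg])
print("divergence to aggregate:", round(total_divergence(tuples, agg), 4))
print("divergence to uniform:  ", round(total_divergence(tuples, uniform), 4))
```

Comparing the summed divergence for the aggregate with that for another candidate, such as the uniform distribution, gives a quick check that the mean is at least as close to the tuple data in this sense.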
