Using Hierarchies, Aggregates and Statistical Models to Discover Knowledge from Distributed Databases

Rónán Páircéir, Sally McClean and Bryan Scotney

Data warehouses and statistical databases (Shoshani 1997) contain both numerical attributes (measures) and categorical attributes (dimensions). These data are often stored in a relational database with an associated hierarchical structure. To date, few algorithms explicitly exploit this hierarchical structure when carrying out knowledge discovery on such data. We examine several aspects of knowledge discovery from a set of databases distributed over the internet, including: discovery of statistical relationships, rules and exceptions from hierarchically structured data that may contain heterogeneous and non-independent instances; use of aggregates as a set of sufficient statistics in place of base data for efficient model computation; use of a relational database system to compute those sufficient statistics efficiently; and use of statistical metadata to aid distributed data integration and knowledge discovery.
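The idea of using aggregates as sufficient statistics can be illustrated with a minimal sketch. It is not the authors' method: it assumes a simple linear regression as a stand-in statistical model, two in-memory SQLite databases standing in for distributed sites, and a hypothetical table `sales` with a dimension `region` and measures `x` and `y`. Each site computes its aggregates with a SQL GROUP BY, only those aggregates are combined, and the model is fitted without moving any base data.

```python
import sqlite3

# Hypothetical schema and data: one dimension (region), two measures (x, y).
def make_site(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, x REAL, y REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn

# The database engine computes the sufficient statistics for a
# least-squares line at the region level of the hierarchy.
AGG_SQL = """
SELECT region, COUNT(*), SUM(x), SUM(y), SUM(x * x), SUM(x * y)
FROM sales
GROUP BY region
"""

def site_aggregates(conn):
    # One row of aggregates per region; the base rows never leave the site.
    return conn.execute(AGG_SQL).fetchall()

def roll_up(aggregate_rows):
    # Sums and counts are additive, so region-level aggregates from all
    # sites roll up to the top of the hierarchy by simple addition.
    stats = [row[1:] for row in aggregate_rows]  # drop the region label
    return tuple(sum(column) for column in zip(*stats))

def fit_line(n, sx, sy, sxx, sxy):
    # Ordinary least squares for y = intercept + slope * x,
    # expressed purely in terms of the aggregates.
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

site_a = make_site([("north", 1.0, 3.0), ("north", 2.0, 5.0)])
site_b = make_site([("south", 3.0, 7.0), ("south", 4.0, 9.0)])

all_rows = site_aggregates(site_a) + site_aggregates(site_b)
slope, intercept = fit_line(*roll_up(all_rows))
print(slope, intercept)
```

Because the aggregates are additive, the same roll-up works at any level of the dimension hierarchy, and the amount of data exchanged depends on the number of groups rather than the number of base records.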

This page is copyrighted by AAAI. All rights reserved.