Data Surveying: Foundations of an Inductive Query Language

Authors

Arno Siebes

CWI

Database Research Group

The Netherlands

Abstract:

Data mining systems have to evolve from a set of specialised routines to more generally applicable inductive query languages to satisfy industry’s need for strategic information. This paper introduces such an inductive query language called Data Surveying. Data Surveying is the discovery of interesting subsets of the database. Groups of customers whose behaviour deviates from average customer behaviour are examples of such interesting subsets. A user specifies what makes a subset interesting through a survey task. The wide applicability of this scheme is illustrated by a variety of examples. To implement an inductive query language system, the "what" (the kind of strategic information sought) has to be made independent from the "how" (how this strategic information is discovered). In other words, the discovery algorithms have to be task independent. In this paper, operators on the search space are introduced to achieve this independence. The discovery algorithms are defined relative to these operators. To enforce efficient discovery, the notion of polynomial convergence is defined for these algorithms. Domain knowledge plays an important role in the specification of both the survey task and the operators.
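The separation of "what" from "how" described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or notation: the names `deviation_task`, `make_refine`, `cover`, and `survey`, the attribute = value description format, the beam search strategy, and the toy customer data are all assumptions made for illustration. The survey task is passed in as a quality function, the search-space operator as a refinement function, and the discovery routine only ever calls those two.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Row = Dict[str, object]
# A subset description: a conjunction of attribute = value tests (assumed format).
Description = FrozenSet[Tuple[str, object]]


def cover(db: List[Row], desc: Description) -> List[Row]:
    """Return the rows of the database that satisfy every test in desc."""
    return [r for r in db if all(r.get(a) == v for a, v in desc)]


def deviation_task(target: str, min_size: int) -> Callable:
    """Hypothetical survey task: a subset is interesting when its mean
    target value deviates from the mean over the whole database."""
    def quality(db: List[Row], subset: List[Row]) -> float:
        if len(subset) < min_size:
            return 0.0
        overall = sum(r[target] for r in db) / len(db)
        local = sum(r[target] for r in subset) / len(subset)
        return abs(local - overall)
    return quality


def make_refine(attributes: List[str]) -> Callable:
    """Hypothetical search-space operator: specialise a description by
    adding one test on an attribute it does not yet mention."""
    def refine(db: List[Row], desc: Description) -> Set[Description]:
        used = {a for a, _ in desc}
        out = set()
        for a in attributes:
            if a in used:
                continue
            for v in {r[a] for r in db}:
                out.add(desc | {(a, v)})
        return out
    return refine


def survey(db, quality, operator, beam_width=3, depth=2):
    """Task-independent beam search: it sees only quality and operator."""
    beam = [frozenset()]
    best_score, best_desc = 0.0, frozenset()
    for _ in range(depth):
        candidates = {d for b in beam for d in operator(db, b)}
        # Sort by descending score; break ties on the sorted tests so the
        # result does not depend on set iteration order.
        scored = sorted(
            ((quality(db, cover(db, d)), d) for d in candidates),
            key=lambda t: (-t[0], sorted(t[1])),
        )
        if not scored:
            break
        if scored[0][0] > best_score:
            best_score, best_desc = scored[0]
        beam = [d for _, d in scored[:beam_width]]
    return best_score, best_desc


# Toy customer data, invented for this sketch.
customers = [
    {"region": "north", "segment": "retail", "spend": 10},
    {"region": "north", "segment": "retail", "spend": 12},
    {"region": "south", "segment": "retail", "spend": 11},
    {"region": "south", "segment": "business", "spend": 40},
    {"region": "south", "segment": "business", "spend": 44},
    {"region": "north", "segment": "business", "spend": 15},
]

score, description = survey(
    customers,
    quality=deviation_task("spend", min_size=2),
    operator=make_refine(["region", "segment"]),
)
print(score, sorted(description))
```

Swapping in a different quality function changes the kind of subset sought without touching `survey`, which is the independence the abstract argues for. The sketch makes no attempt to model the paper's notion of polynomial convergence.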
