The Process of Knowledge Discovery in Databases: A First Sketch

Authors

Ronald J. Brachman

Tej Anand

Track:

Foundational Issues and Core Problems

Downloads:

Abstract:

The general idea of discovering knowledge in large amounts of data is both appealing and intuitive. Typically we focus our attention on learning algorithms, which provide the core capability of generalizing from large numbers of small, very specific facts to useful high-level rules; these learning techniques seem to hold the most excitement and perhaps the most substantive scientific content in the knowledge discovery in databases (KDD) enterprise. However, when we engage in real-world discovery tasks, we find that they can be extremely complex, and that induction of rules is only one small part of the overall process. While others have written overviews of "the concept of KDD, and even provided block diagrams for "knowledge discovery systems," no one has begun to identify all of the building blocks in a realistic KDD process. This is what we attempt to do here. Besides bringing into the discussion several parts of the process that have received inadequate attention in the KDD community, a careful elucidation of the steps in a realistic knowledge discovery process can provide a framework for comparison of different technologies and tools that are almost impossible to compare without a clean model.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.