Abstract:
When mining frequent Datalog queries, many queries cover the same examples; that is, they are equivalent and hence redundant. These equivalences can arise from the data set or from regularities specified in the background theory. To avoid generating redundant clauses, we introduce several types of condensed representations. More specifically, we introduce delta-free and closed clauses, which are defined with respect to the data set, and semantically free and closed clauses, which take a logical background theory into account. A novel algorithm that employs these representations is also presented and experimentally evaluated on a number of benchmark problems in inductive logic programming.
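
To make the notion of redundancy concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes a heavily simplified, propositional setting: a query is a set of literals, an example is a set of ground facts, and a query covers an example when all its literals occur in it (no variables, no theta-subsumption, no background theory). The names `EXAMPLES`, `cover`, and `condensed` are hypothetical. Queries with identical covers are grouped into one equivalence class; the minimal members of a class play the role of free queries (exact equivalence only, which we take to correspond to the delta = 0 case), and the maximal members play the role of closed queries.

```python
from itertools import combinations

# Hypothetical toy data set (assumption): each example is a set of ground literals.
EXAMPLES = {
    "e1": {"a", "b", "c"},
    "e2": {"a", "b"},
    "e3": {"a", "c"},
    "e4": {"b"},
}


def cover(query, examples=EXAMPLES):
    """Return the ids of the examples that contain every literal of the query."""
    return frozenset(eid for eid, facts in examples.items() if query <= facts)


def condensed(literals, min_support=1, examples=EXAMPLES):
    """Group frequent queries by cover and report the free and closed ones.

    Returns a dict mapping each cover to a pair (free, closed), where
    `free` lists the minimal queries and `closed` the maximal queries
    of the equivalence class that shares this cover.
    """
    classes = {}
    # Enumerate every subset of the literals, smallest first (toy-sized only).
    for size in range(len(literals) + 1):
        for combo in combinations(sorted(literals), size):
            query = frozenset(combo)
            covered = cover(query, examples)
            if len(covered) >= min_support:
                classes.setdefault(covered, []).append(query)

    result = {}
    for covered, queries in classes.items():
        # Free: no proper sub-query in the same class (same cover).
        free = [q for q in queries if not any(p < q for p in queries)]
        # Closed: no proper super-query in the same class (same cover).
        closed = [q for q in queries if not any(q < p for p in queries)]
        result[covered] = (free, closed)
    return result


if __name__ == "__main__":
    for covered, (free, closed) in condensed({"a", "b", "c"}).items():
        print(sorted(covered), [sorted(q) for q in free], [sorted(q) for q in closed])
```

In this toy data set the queries {c} and {a, c} cover the same two examples, so only one of them needs to be reported: {c} is the free representative and {a, c} the closed one. The exhaustive enumeration is exponential in the number of literals and is meant only to illustrate the definitions.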