The Truth Is in There—Rule Extraction from Opaque Models Using Genetic Programming

Ulf Johansson, Rikard K&oulm;nig, and Lars Niklasson

A common problem whenusing complicated models for prediction and classification is that the complexity of the model entails that it is hard, or impossible, to interpret. For some scenarios this might not be a limitation, since the priority is the accuracy of the model. In other situations the limitations might be severe, since additional aspects are important to consider; e.g. comprehensibility or scalability of the model. In this study we show how the gap between accuracy and other aspects can be bridged by using a rule extraction method (termed G-REX) based on genetic programming. The extraction method is evaluated against the five criteria accuracy, comprehensibility, fidelity, scalability and generality. It is also shown how G-REX can create novel representation languages; here regression trees and fuzzy rules. The problem used is a data-mining problem from the marketing domain where the impact of advertising is predicted from investment plans. Several experiments, covering both regression and classification tasks, are evaluated. Results show that G-REX in general is capable of extracting both accurate and comprehensible representations, thus allowing high performance also in domains where comprehensibility is of essence.


This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.