������������������������������������������������������������������������A common problem whenusing complicated models for prediction and classification is that the complexity of the model entails that it is hard, or impossible, to interpret. For some scenarios this might not be a limitation, since the priority is the accuracy of the model. In other situations the limitations might be severe, since additional aspects are important to consider; e.g. comprehensibility or scalability of the model. In this study we show how the gap between accuracy and other aspects can be bridged by using a rule extraction method (termed G-REX) based on genetic programming. The extraction method is evaluated against the five criteria accuracy, comprehensibility, fidelity, scalability and generality. It is also shown how G-REX can create novel representation languages; here regression trees and fuzzy rules. The problem used is a data-mining problem from the marketing domain where the impact of advertising is predicted from investment plans. Several experiments, covering both regression and classification tasks, are evaluated. Results show that G-REX in general is capable of extracting both accurate and comprehensible representations, thus allowing high performance also in domains where comprehensibility is of essence.
Published Date: May 2004
Registration: ISBN 978-1-57735-201-3
Copyright: Published by The AAAI Press, Menlo Park, California.