Paper #22

 

A. Appice, M. Ceci, D. Malerba, "Mining Model Trees with Regression and Splitting Nodes"

Keywords: predictive data mining, model trees, regression tasks

 

Model trees are tree-based models that associate leaves with multiple linear models and are used to solve prediction problems in which the response variable is numeric. In this paper a method for mining model trees is presented. Its main characteristic is the construction of trees with two types of nodes: regression nodes, which perform only straight-line regression, and splitting nodes, which partition the feature space. The multiple linear model associated with each leaf is then built stepwise by combining the straight-line regressions found along the path from the root to the leaf. In this way, internal regression nodes contribute to the definition of multiple models and capture global effects, while straight-line regressions at leaves can only capture local effects. The proposed method has been implemented in the system SMOTI and evaluated on artificially generated datasets and on benchmark datasets used in studies of regression and model trees. The first set of results shows that SMOTI outperforms the state-of-the-art model tree induction system M5, while the second set of results does not allow us to draw statistically significant conclusions. However, model trees induced by SMOTI are generally easy to interpret, and their analysis may reveal interesting patterns in the data.
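The stepwise construction of a multiple linear model from straight-line regressions can be illustrated with a minimal sketch. It is not SMOTI's implementation: it assumes a path with only two regression nodes and no splitting nodes, and it assumes that each regression node removes the linear effect of its variable from both the response and the remaining predictor before the next straight-line regression is fitted. The function names (`fit_line`, `stepwise_two_variable_model`) and the toy data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares straight line y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def stepwise_two_variable_model(x1, x2, y):
    """Combine two straight-line regressions into y ~ c0 + c1*x1 + c2*x2.

    Assumed procedure (an illustration, not taken from the abstract):
      1. regress y on x1;
      2. regress x2 on x1, so that x2 can be cleaned of the effect of x1;
      3. regress the residual of y on the residual of x2;
      4. substitute back to express the result in the original variables.
    """
    a0, b0 = fit_line(x1, y)    # first regression node: y on x1
    a1, b1 = fit_line(x1, x2)   # effect of x1 on the remaining predictor
    y_res = [yi - (a0 + b0 * xi) for xi, yi in zip(x1, y)]
    x2_res = [zi - (a1 + b1 * xi) for xi, zi in zip(x1, x2)]
    a2, b2 = fit_line(x2_res, y_res)  # second regression node, on residuals
    intercept = a0 + a2 - b2 * a1
    coef_x1 = b0 - b2 * b1
    coef_x2 = b2
    return intercept, coef_x1, coef_x2


# Toy, noise-free data with correlated predictors: y = 1 + 2*x1 + 3*x2.
x1 = [i / 10 for i in range(20)]
x2 = [0.5 * x1[i] + ((7 * i) % 11) / 11 for i in range(20)]
y = [1.0 + 2.0 * u + 3.0 * v for u, v in zip(x1, x2)]

intercept, coef_x1, coef_x2 = stepwise_two_variable_model(x1, x2, y)
print(intercept, coef_x1, coef_x2)
```

In this sketch the first regression acts on all of the data, which is the sense in which a regression node near the root captures a global effect. In a full model tree, a splitting node between the two regressions would restrict the second fit to a subset of the examples, making it a local effect.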