Localised splitting criteria for classification and regression trees
Bremner, Alexandra (2004) Localised splitting criteria for classification and regression trees. PhD thesis, Murdoch University.
This thesis presents a modification of existing entropy-based splitting criteria for classification and regression trees. Trees are typically grown using splitting criteria that choose optimal splits without taking future splits into account. This thesis examines localised splitting criteria that are based on local averaging in regression trees or local proportions in classification trees. The use of a localised criterion is motivated by the fact that future splits result in leaves that contain local observations, and hence local deviances provide a better approximation of the deviance of the fully grown tree. While most recent research has focussed on tree-averaging techniques that are aimed at taking a moderately successful splitting criterion and improving its predictive power, this thesis concentrates on improving the splitting criterion.
Use of a localised splitting criterion captures local structures and enables later splits to capitalise on the placement of earlier splits when growing a tree. Using the localised splitting criterion results in much simpler trees for pure interaction data (data with no main effects) and can produce trees with fewer errors and lower residual mean deviances than those produced using a global splitting criterion when applied to real data sets with strong interaction effects.
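The contrast between a global and a localised criterion on pure interaction data can be illustrated with a small sketch. The abstract does not state the exact form of the thesis's criterion, so everything below is an assumption made for illustration: the "local" deviance is taken to be the squared residual from a fixed-radius local mean computed within each side of a candidate split, and the bandwidth `h`, the grid data, and all function names are hypothetical.

```python
def make_xor_grid(m=8):
    """Points (x1, x2, y) on an m-by-m grid with a pure XOR response (no main effects)."""
    pts = []
    for i in range(m):
        for j in range(m):
            x1, x2 = (i + 0.5) / m, (j + 0.5) / m
            y = 1.0 if (x1 > 0.5) != (x2 > 0.5) else 0.0
            pts.append((x1, x2, y))
    return pts


def global_deviance(side):
    """Sum of squared residuals from the single mean of one side of a split."""
    if not side:
        return 0.0
    mean = sum(p[2] for p in side) / len(side)
    return sum((p[2] - mean) ** 2 for p in side)


def local_deviance(side, h=0.15):
    """Sum of squared residuals from a local mean (ASSUMED form, not the thesis's).

    Each point's local mean averages the responses of all points on the same
    side of the split that lie within Euclidean distance h (including itself).
    """
    total = 0.0
    for p in side:
        nbrs = [q[2] for q in side
                if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= h * h]
        local_mean = sum(nbrs) / len(nbrs)
        total += (p[2] - local_mean) ** 2
    return total


def split_scores(pts, criterion, m=8):
    """Score every candidate threshold on x1 by the summed deviance of both sides."""
    scores = {}
    for j in range(1, m):
        s = j / m
        left = [p for p in pts if p[0] < s]
        right = [p for p in pts if p[0] >= s]
        scores[s] = criterion(left) + criterion(right)
    return scores


pts = make_xor_grid()
global_scores = split_scores(pts, global_deviance)
local_scores = split_scores(pts, local_deviance)
best_local = min(local_scores, key=local_scores.get)
```

On this balanced XOR grid every candidate split leaves both sides with a mean of 0.5, so `global_scores` is flat and gives no reason to prefer any threshold. Under the assumed local form, `local_scores` is smallest at the threshold 0.5, because a split there stops neighbourhoods from straddling the discontinuity in `x1`.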
The superiority of the localised splitting criterion can persist when multiple trees are grown and averaged using simple methods. Although a single tree grown using the localised splitting criterion can outperform tree averaging using the global criterion, improvements in predictive performance are generally achieved by utilising the localised splitting criterion's property of detecting local discontinuities and averaging over sets of trees grown by placing splits where the deviance is locally minimal. Predictive performance improves further when the degree of localisation of the splitting criterion is randomly selected and weighted randomisation is used with locally minimal deviances to produce sets of trees to average over. Although state-of-the-art methods quickly average very large numbers of trees, thus making the performance of the splitting criterion less critical, predictive performance when the localised criterion is used in bagging indicates that different splitting methods warrant investigation.
The localised splitting criterion is most useful for growing one tree or a small number of trees to examine structure in the data. Structurally different trees can be obtained by simply splitting the data where the localised splitting criterion is locally optimal.
Publication Type: Thesis (PhD)
Murdoch Affiliation: School of Engineering Science