Fast growing and interpretable oblique trees via logistic regression models
The classification tree is an attractive method for classification because its predictions are more transparent than those of most other classifiers. The most widely accepted approaches to tree growth use axis-parallel splits to partition continuous attributes. Since the interpretability of a tree diminishes as it grows larger, researchers have sought ways of growing trees with oblique splits, which are better able to partition observations. The focus of this thesis is to grow oblique trees in a fast and deterministic manner and to propose ways of making them more interpretable. Finding good oblique splits is a computationally difficult task. Various authors have proposed doing this either by performing stochastic searches or by solving problems that effectively produce oblique splits at each stage of tree growth. A new approach to finding such splits is proposed that restricts attention to a small but comprehensive set of splits. Empirical evidence shows that good oblique splits are found in most cases and that, when observations come from a small number of classes, oblique trees can be grown in a matter of seconds. Since interpretability is the main strength of classification trees, it is important that the oblique trees grown are themselves interpretable. Because the proposed approach finds oblique splits using logistic regression, well-founded variable selection techniques can be introduced to classification trees. This allows concise oblique splits to be found at each stage of tree growth, so that more interpretable oblique trees can be grown directly. In addition, cost-complexity pruning ideas developed for axis-parallel trees are adapted to make oblique trees more interpretable. A major practical component of this thesis is the *oblique.tree* package for R, which allows casual users to experiment with oblique trees in a way that was not possible before.
Main Author: | Truong, A |
---|---|
Other Authors: | Ripley, B |
Format: | Thesis |
Language: | English |
Published: | 2009 |
Subjects: | Computationally-intensive statistics; Mathematics; Pattern recognition (statistics) |
Institution: | University of Oxford |
Identifier: | oxford-uuid:e0de0156-da01-4781-85c5-8213f5004f10 |
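
The abstract notes that each oblique split is obtained from a logistic regression model, so a split is a threshold on a linear combination of the continuous attributes rather than on a single attribute. The following is a minimal sketch of that idea in base R, not code from the thesis or its *oblique.tree* package; the iris data set, the two-class subset, and the cut-off at zero are illustrative assumptions.

```r
## Sketch: a logistic regression fit on two classes defines an oblique split,
## i.e. a threshold on a linear combination of attributes.
## The iris data, the two chosen classes and the cut-off at 0 are
## illustrative assumptions, not choices made in the thesis.

two.class <- subset(iris, Species %in% c("versicolor", "virginica"))
two.class$Species <- droplevels(two.class$Species)

## Fit a logistic regression on the continuous attributes.
fit <- glm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
           data = two.class, family = binomial)

## The fitted linear predictor defines the oblique split: observations with a
## non-negative linear predictor go to one child node, the rest to the other.
side <- ifelse(predict(fit, type = "link") >= 0, "right", "left")
table(side, two.class$Species)
```

In the thesis this kind of fit is embedded in a full tree-growing routine, with variable selection to keep each split concise and cost-complexity pruning applied afterwards; the sketch above only shows how a single split can arise from one logistic regression fit.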