Node harvest

When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret....


Bibliographic Details
Main Author:	Meinshausen, N
Format:	Journal article
Language:	English
Published:	2009

_version_	1797080698177191936
author	Meinshausen, N
collection	OXFORD
description	When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret. Tree ensembles like Random Forests usually provide more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as 'black box' predictors. Node harvest tries to reconcile the two aims of interpretability and predictive accuracy by combining positive aspects of trees and tree ensembles. Results are very sparse and interpretable, and predictive accuracy is extremely competitive, especially for low signal-to-noise data. The procedure is simple: an initial set of a few thousand nodes is generated randomly. If a new observation falls into just a single node, its prediction is the mean response of all training observations within this node, identical to a tree-like prediction. A new observation typically falls into several nodes, and its prediction is then the weighted average of the mean responses across all these nodes. The only role of node harvest is to 'pick' the right nodes from the initial large ensemble of nodes by choosing node weights, which in the proposed algorithm amounts to a quadratic programming problem with linear inequality constraints. The solution is sparse in the sense that only very few nodes are selected with a nonzero weight. This sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to select a tuning parameter for optimal predictive accuracy. Node harvest can handle mixed data and missing values and is shown to be simple to interpret and competitive in predictive accuracy on a variety of data sets.
first_indexed	2024-03-07T01:03:50Z
format	Journal article
id	oxford-uuid:8aa1b518-7dce-47e6-9db1-b975d518bdbe
institution	University of Oxford
language	English
last_indexed	2024-03-07T01:03:50Z
publishDate	2009
record_format	dspace
title	Node harvest
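
The prediction rule stated in the description can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Node` class, the `contains` membership rule, the helper names, and the example weights are all hypothetical, and the weights are simply given here rather than obtained from the quadratic program the description mentions. Normalizing by the total weight of the nodes that contain the observation is one reading of "weighted average" and is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Node:
    """Hypothetical node: a membership rule, a mean response, and a weight."""
    contains: Callable[[Sequence[float]], bool]
    mean_response: float
    weight: float  # assumed nonnegative; chosen by a fitting step not shown here


def node_mean(contains, X, y) -> float:
    """Mean response of all training observations that fall into the node."""
    inside = [yi for xi, yi in zip(X, y) if contains(xi)]
    return sum(inside) / len(inside)


def predict(nodes: List[Node], x: Sequence[float]) -> float:
    """Weighted average of the mean responses of the nodes that contain x."""
    active = [n for n in nodes if n.weight > 0 and n.contains(x)]
    total = sum(n.weight for n in active)
    if total == 0:
        raise ValueError("observation falls into no node with nonzero weight")
    return sum(n.weight * n.mean_response for n in active) / total


# Toy training data with one predictor (made up for illustration).
X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 1.0, 3.0, 3.0]

# Three hand-written nodes; the weights below are illustrative, not fitted.
rules = [
    (lambda x: True, 0.25),          # root node containing every observation
    (lambda x: x[0] <= 2.0, 0.75),   # left node
    (lambda x: x[0] > 2.0, 0.75),    # right node
]
nodes = [Node(rule, node_mean(rule, X, y), w) for rule, w in rules]

left_prediction = predict(nodes, [1.5])    # root and left node are active
right_prediction = predict(nodes, [3.5])   # root and right node are active
single_node_prediction = predict(nodes[:1], [1.5])  # only the root is active
```

When an observation falls into a single node, `predict` returns that node's mean response, matching the tree-like case in the description; with several active nodes it blends their means according to the weights.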


Similar Items