Interpretability With Accurate Small Models
Models often need to be constrained to a certain size for them to be considered interpretable. For example, a decision tree of depth 5 is much easier to understand than one of depth 50. Limiting model size, however, often reduces accuracy. We suggest a practical technique that minimizes this trade-off between interpretability and classification accuracy.
Main Authors: | Abhishek Ghose, Balaraman Ravindran |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-02-01 |
Series: | Frontiers in Artificial Intelligence |
Subjects: | ML; interpretable machine learning; Bayesian optimization; infinite mixture models; density estimation |
Online Access: | https://www.frontiersin.org/article/10.3389/frai.2020.00003/full |
_version_ | 1818498540794544128 |
---|---|
author | Abhishek Ghose; Balaraman Ravindran |
author_facet | Abhishek Ghose; Balaraman Ravindran |
author_sort | Abhishek Ghose |
collection | DOAJ |
description | Models often need to be constrained to a certain size for them to be considered interpretable. For example, a decision tree of depth 5 is much easier to understand than one of depth 50. Limiting model size, however, often reduces accuracy. We suggest a practical technique that minimizes this trade-off between interpretability and classification accuracy. This enables an arbitrary learning algorithm to produce highly accurate small-sized models. Our technique identifies the training data distribution to learn from that leads to the highest accuracy for a model of a given size. We represent the training distribution as a combination of sampling schemes. Each scheme is defined by a parameterized probability mass function applied to the segmentation produced by a decision tree. An Infinite Mixture Model with Beta components is used to represent a combination of such schemes. The mixture model parameters are learned using Bayesian Optimization. Under simplistic assumptions, we would need to optimize for O(d) variables for a distribution over a d-dimensional input space, which is cumbersome for most real-world data. However, we show that our technique significantly reduces this number to a fixed set of eight variables at the cost of relatively cheap preprocessing. The proposed technique is flexible: it is model-agnostic, i.e., it may be applied to the learning algorithm for any model family, and it admits a general notion of model size. We demonstrate its effectiveness using multiple real-world datasets to construct decision trees, linear probability models and gradient boosted models with different sizes. We observe significant improvements in the F1-score in most instances, exceeding an improvement of 100% in some cases. |
first_indexed | 2024-12-10T20:17:02Z |
format | Article |
id | doaj.art-b07998e388be4cde9a0e1982f2a0d943 |
institution | Directory Open Access Journal |
issn | 2624-8212 |
language | English |
last_indexed | 2024-12-10T20:17:02Z |
publishDate | 2020-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Artificial Intelligence |
spelling | doaj.art-b07998e388be4cde9a0e1982f2a0d943 | 2022-12-22T01:35:10Z | eng | Frontiers Media S.A. | Frontiers in Artificial Intelligence | 2624-8212 | 2020-02-01 | 3 | 10.3389/frai.2020.00003 | 507097 | Interpretability With Accurate Small Models | Abhishek Ghose; Balaraman Ravindran | Department of Computer Science and Engineering, IIT Madras, Chennai, India; Department of Computer Science and Engineering, Robert Bosch Centre for Data Science and AI, IIT Madras, Chennai, India | Models often need to be constrained to a certain size for them to be considered interpretable. For example, a decision tree of depth 5 is much easier to understand than one of depth 50. Limiting model size, however, often reduces accuracy. We suggest a practical technique that minimizes this trade-off between interpretability and classification accuracy. This enables an arbitrary learning algorithm to produce highly accurate small-sized models. Our technique identifies the training data distribution to learn from that leads to the highest accuracy for a model of a given size. We represent the training distribution as a combination of sampling schemes. Each scheme is defined by a parameterized probability mass function applied to the segmentation produced by a decision tree. An Infinite Mixture Model with Beta components is used to represent a combination of such schemes. The mixture model parameters are learned using Bayesian Optimization. Under simplistic assumptions, we would need to optimize for O(d) variables for a distribution over a d-dimensional input space, which is cumbersome for most real-world data. However, we show that our technique significantly reduces this number to a fixed set of eight variables at the cost of relatively cheap preprocessing. The proposed technique is flexible: it is model-agnostic, i.e., it may be applied to the learning algorithm for any model family, and it admits a general notion of model size. We demonstrate its effectiveness using multiple real-world datasets to construct decision trees, linear probability models and gradient boosted models with different sizes. We observe significant improvements in the F1-score in most instances, exceeding an improvement of 100% in some cases. | https://www.frontiersin.org/article/10.3389/frai.2020.00003/full | ML; interpretable machine learning; Bayesian optimization; infinite mixture models; density estimation |
spellingShingle | Abhishek Ghose; Balaraman Ravindran | Interpretability With Accurate Small Models | Frontiers in Artificial Intelligence | ML; interpretable machine learning; Bayesian optimization; infinite mixture models; density estimation
title | Interpretability With Accurate Small Models |
title_full | Interpretability With Accurate Small Models |
title_fullStr | Interpretability With Accurate Small Models |
title_full_unstemmed | Interpretability With Accurate Small Models |
title_short | Interpretability With Accurate Small Models |
title_sort | interpretability with accurate small models |
topic | ML; interpretable machine learning; Bayesian optimization; infinite mixture models; density estimation
url | https://www.frontiersin.org/article/10.3389/frai.2020.00003/full |
work_keys_str_mv | AT abhishekghose interpretabilitywithaccuratesmallmodels AT balaramanravindran interpretabilitywithaccuratesmallmodels |
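The core idea in the description field above can be sketched in code: segment the input space with a deep decision tree, place a parameterized (here, Beta-shaped) sampling distribution over the segments, and search for the distribution under which a size-constrained model scores best. This is a minimal illustrative sketch, not the authors' implementation: plain random search stands in for the paper's Bayesian Optimization over its Infinite Mixture Model, and all function and variable names (`sample_weights`, `segmenter`, `SMALL_DEPTH`) are invented for this example.

```python
import numpy as np
from scipy.stats import beta
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Segment the input space with a deep tree; each training point
# falls into exactly one leaf of this segmentation.
segmenter = DecisionTreeClassifier(max_depth=10, random_state=0).fit(X_tr, y_tr)
leaf_ids = segmenter.apply(X_tr)
leaves = np.unique(leaf_ids)

def sample_weights(a, b):
    """Parameterized pmf over leaves: weight each leaf by a Beta(a, b)
    density evaluated at its normalized rank, then spread that leaf's
    mass evenly over the training points it contains."""
    ranks = (np.arange(len(leaves)) + 0.5) / len(leaves)
    leaf_mass = beta.pdf(ranks, a, b)
    leaf_mass /= leaf_mass.sum()
    w = np.zeros(len(y_tr))
    for leaf, mass in zip(leaves, leaf_mass):
        idx = leaf_ids == leaf
        w[idx] = mass / idx.sum()
    return w / w.sum()

SMALL_DEPTH = 3  # the interpretable model-size budget

# Baseline: the small model trained on the original distribution.
baseline = DecisionTreeClassifier(max_depth=SMALL_DEPTH, random_state=0)
baseline.fit(X_tr, y_tr)
best_f1 = f1_score(y_val, baseline.predict(X_val))

# Search the scheme parameters (a, b); Bayesian Optimization would
# replace this random-search loop in the paper's approach.
for _ in range(30):
    a, b = rng.uniform(0.5, 5.0, size=2)
    p = sample_weights(a, b)
    idx = rng.choice(len(y_tr), size=len(y_tr), replace=True, p=p)
    model = DecisionTreeClassifier(max_depth=SMALL_DEPTH, random_state=0)
    model.fit(X_tr[idx], y_tr[idx])
    best_f1 = max(best_f1, f1_score(y_val, model.predict(X_val)))

print(round(best_f1, 3))
```

Because the search keeps the best of the baseline and all resampled candidates, the small model's validation F1 can only match or improve on training from the raw distribution, which is the trade-off-minimizing behavior the abstract describes.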