Sparse regression over clusters: SparClur

Abstract: Prediction tasks in personalized medicine require models that combine accuracy and interpretability. We propose an integer optimization approach for building sparse regression models with enforced coordination, using data partitioned among leaves in a prediction tree. We show...


Bibliographic Details
Main Authors: Bertsimas, Dimitris; Dunn, Jack; Kapelevich, Lea; Zhang, Rebecca
Other Authors: Sloan School of Management
Format: Article
Language: English
Published: Springer Berlin Heidelberg 2022
Online Access: https://hdl.handle.net/1721.1/140530
collection MIT
description Abstract: Prediction tasks in personalized medicine require models that combine accuracy and interpretability. We propose an integer optimization approach for building sparse regression models with enforced coordination, using data partitioned among leaves in a prediction tree. We show that the method recovers the true underlying relationship between observations and target variables in large-scale synthetic data in seconds. We apply our method to several real-world medical prediction problems and observe that the additional structure imposed provides a substantial gain in interpretability, at a low cost to accuracy.
first_indexed 2024-09-23T14:23:58Z
format Article
id mit-1721.1/140530
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T14:23:58Z
publishDate 2022
publisher Springer Berlin Heidelberg
record_format dspace
spelling mit-1721.1/140530
2023-02-06T20:43:29Z
Sparse regression over clusters: SparClur
Bertsimas, Dimitris
Dunn, Jack
Kapelevich, Lea
Zhang, Rebecca
Sloan School of Management
Massachusetts Institute of Technology. Operations Research Center
2022-02-18T16:25:26Z
2022-02-18T16:25:26Z
2021-07-08
2022-02-17T04:18:16Z
Article
http://purl.org/eprint/type/JournalArticle
https://hdl.handle.net/1721.1/140530
Bertsimas, Dimitris, Dunn, Jack, Kapelevich, Lea and Zhang, Rebecca. 2021. "Sparse regression over clusters: SparClur."
en
https://doi.org/10.1007/s11590-021-01770-9
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature
application/pdf
Springer Berlin Heidelberg
url https://hdl.handle.net/1721.1/140530