Faster and easier: cross-validation and model robustness checks

Bibliographic Details
Main Author: Stephenson, William T.
Other Authors: Broderick, Tamara
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/143247
collection MIT
description Machine learning and statistical methods are increasingly used in high-stakes applications – for instance, in policing crime, making predictions about the atmosphere, or providing medical care. We want to assess the extent to which we can trust our methods, though, before we use them in such applications. There exist assessment tools, such as cross-validation (CV) and robustness checks, that help us understand exactly how trustworthy our methods are. In both cases (CV and robustness checks), a typical workflow follows the pattern of “change the dataset or method, and then rerun the analysis.” However, this workflow (1) requires users to specify the set of relevant changes, and (2) requires a computer to repeatedly refit the model. For methods involving large and complex models, (1) is expensive in terms of user time, and (2) is expensive in terms of compute time. So CV, which requires (2), and robustness checks, which often require both (1) and (2), see little use in the large and complex models that need them the most. In this thesis, we address these challenges by developing model evaluation tools that are fast in terms of both compute and user time. We develop tools to approximate CV when it is most computationally expensive: in high-dimensional and complex, structured models. But approximating CV implicitly relies on the quality of CV itself. We show theory and empirics calling into question the reliability of the use of CV for quickly and automatically tuning model hyperparameters – even in cases where the behavior of CV is thought to be relatively well understood. On the front of robustness checks, we note that a common workflow in Bayesian prior robustness requires users to manually specify a set of alternative reasonable priors, a task that can be time-consuming and difficult. We develop automatic tools to search for a prediction-changing alternative prior for Gaussian processes, saving users from having to manually specify the set of alternative priors.
first_indexed 2024-09-23T15:52:16Z
format Thesis
id mit-1721.1/143247
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T15:52:16Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/143247 2022-06-16T03:06:27Z Faster and easier: cross-validation and model robustness checks Stephenson, William T. Broderick, Tamara Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2022-06-15T13:07:06Z 2022-06-15T13:07:06Z 2022-02 2022-03-04T20:47:50.655Z Thesis https://hdl.handle.net/1721.1/143247 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology