New interpretable machine learning techniques and an application to stroke prediction in atrial fibrillation patients

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Bibliographic Details
Main Author: Yang, Hongyu, Ph. D., Massachusetts Institute of Technology.
Other Authors: Cynthia Rudin.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2020
Subjects: Electrical Engineering and Computer Science.
Online Access: https://hdl.handle.net/1721.1/124095
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Notes: Cataloged from PDF version of thesis. Includes bibliographical references (pages 117-125).
Physical Description: 125 pages (application/pdf)
Date: 2020-03-09T18:53:17Z
Identifier: 1142633819
Rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source, but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract:
Building interpretable and accurate models is attracting growing interest in the machine learning community. In this thesis, we developed an interpretable machine learning algorithm called SBRL, and we built an interpretable and statistically more accurate model for predicting strokes in patients with atrial fibrillation (AF) who have no prior history of stroke and who are not taking anticoagulants.

The first part of the thesis presents an interpretable machine learning algorithm that can be used as an alternative to the decision tree algorithm. Our algorithm builds an optimized rule list model from data by maximizing the posterior probability of a natural hierarchical generative model. The model takes the form of chained IF-THEN clauses, which are simple enough for a human to follow and to derive the prediction by hand. We developed two theoretical bounds for the algorithm: one on the length of the optimal rule list, and the other an upper bound on the posterior probability of the optimized rule list given its prefixes. We thoroughly tested our algorithm against other interpretable and non-interpretable machine learning algorithms across multiple public datasets, in terms of interpretability, computational speed, and accuracy. Our algorithm strikes a balance among these metrics.

The second part of the thesis presents how we used the ATRIA2-CVRN study cohort to build a stroke prediction model that is as simple as, but statistically significantly more accurate than, the stroke models in wide use, such as the CHA₂DS₂-VASc and ATRIA scores, for patients in AF who are not taking anticoagulants. We focused on the more challenging problem of primary prevention. We assessed the strengths of predictors and identified informative predictors not used in existing stroke models. We created a univariate stroke model using the most informative predictor, age, and achieved statistically significantly better performance than CHA₂DS₂-VASc and performance similar to ATRIA. We used various machine learning models to test the limit of the information that can be extracted from the data. We built a linear model with optimized integer coefficients using RiskSLIM. We used SBRL to generate simple yet accurate representations for high-risk patients who should be recommended anticoagulants.
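
The chained IF-THEN form described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the rule conditions, the feature names (`age`, `hypertension`, `diabetes`), and the risk values are invented, are not taken from the thesis, and carry no clinical meaning. The sketch shows only how a finished rule list is evaluated by taking the first matching rule; it does not show how SBRL learns the list.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, float]
# Each rule: (readable description, condition, predicted risk).
Rule = Tuple[str, Callable[[Patient], bool], float]

# Hypothetical rules and risk values, invented for illustration only.
RULE_LIST: List[Rule] = [
    ("age >= 85", lambda p: p["age"] >= 85, 0.30),
    ("age >= 75 and hypertension",
     lambda p: p["age"] >= 75 and p["hypertension"] == 1, 0.20),
    ("diabetes", lambda p: p["diabetes"] == 1, 0.10),
]
DEFAULT_RISK = 0.02  # hypothetical value used when no rule matches


def predict(patient: Patient) -> Tuple[str, float]:
    """Return the first matching rule and its risk (IF / ELSE IF / ELSE)."""
    for description, condition, risk in RULE_LIST:
        if condition(patient):
            return description, risk
    return "default", DEFAULT_RISK


if __name__ == "__main__":
    example = {"age": 78, "hypertension": 1, "diabetes": 1}
    rule, risk = predict(example)
    print(f"matched rule: {rule}; predicted risk: {risk}")
```

Because rules are checked in order, the example patient matches the second rule even though the third would also apply. That ordering is what lets a reader trace a prediction by hand.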