Knowledge Distillation for Interpretable Clinical Time Series Outcome Prediction

A common machine learning task in healthcare is to predict a patient’s final outcome given their history of vitals and treatments. For example, sepsis is a life-threatening condition that arises when the body has an extreme response to an infection. Treating sepsis is a complicated process, and we are interested in predicting a sepsis patient’s final outcome. Neural networks are powerful models for making accurate predictions of such outcomes, but a major drawback is that they are not interpretable. For these models and algorithms to be used in the real world, they must predict treatment outcomes accurately while also making their predictions understandable.

In this thesis, we use knowledge distillation, a technique in which a model with high predictive power (the "teacher model") is used to train a model with other desirable traits, such as interpretability (the "student model"). For our teacher model, we use an LSTM, a type of recurrent neural network, to predict mortality for sepsis patients given their recent history of vital signs and treatments. For our student model, we use an autoregressive hidden Markov model (AR-HMM) to learn interpretable hidden states. To incorporate the teacher’s knowledge into the student, we use a similarity-based constraint. We evaluate a method from previous work that learns the hidden states with variational inference, and we develop and evaluate an alternative approach based on the expectation-maximization (EM) algorithm. We then analyze the interpretability of the learned states. Our results show that, although there is room for improvement in maintaining the model’s generative performance once the similarity constraint is added, the EM approach successfully incorporates the constraint, achieving predictive power comparable to the teacher model together with better interpretability.
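The distillation setup described in the abstract lends itself to a short sketch. Below is a minimal, illustrative PyTorch version, assuming a standard supervised pipeline: an LSTM teacher produces a mortality risk and a hidden representation, and a similarity-based constraint encourages the student's representation to preserve the teacher's view of which patients look alike. The names (TeacherLSTM, similarity_constraint, lam) and the exact form of the constraint are assumptions for illustration, not code or equations from the thesis, and the student's AR-HMM objective is abbreviated to a comment.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TeacherLSTM(nn.Module):
    """LSTM teacher: maps a vitals/treatment sequence to a mortality risk."""
    def __init__(self, n_features, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                       # x: (batch, time, n_features)
        h, _ = self.lstm(x)
        return self.head(h[:, -1]), h[:, -1]    # mortality logit, final hidden state

def pairwise_similarity(z):
    """Cosine-similarity matrix between the representations in a batch."""
    z = F.normalize(z, dim=-1)
    return z @ z.T

def similarity_constraint(student_repr, teacher_repr):
    """Penalize disagreement between student and teacher similarity structure."""
    return F.mse_loss(pairwise_similarity(student_repr),
                      pairwise_similarity(teacher_repr))

# Inside a training loop, one might combine the student's own objective
# (e.g. the AR-HMM log-likelihood, optimized by EM or variational inference)
# with the distillation term, weighted by a hypothetical coefficient lam:
#     loss = -student_log_likelihood + lam * similarity_constraint(s_repr, t_repr)

Matching similarity structure rather than raw logits is one way such a constraint can let the student keep its own generative parameterization while still inheriting the teacher's notion of patient similarity; how the thesis formulates the constraint exactly is not reproduced here.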

Bibliographic Details
Main Author: Wong, Anna
Other Authors: Mark, Roger G.; Lehman, Li-wei
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/151355
Rights: In Copyright - Educational Use Permitted; copyright retained by author(s) (https://rightsstatements.org/page/InC-EDU/1.0/)