Improving clinical risk-stratification tools : instance-transfer for selecting relevant training data

Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

Bibliographic Details
Main Author: Gong, Jen J. (Jen Jian)
Other Authors: John V. Guttag.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2014
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/91090
author Gong, Jen J. (Jen Jian)
author2 John V. Guttag.
collection MIT
first_indexed 2024-09-23T14:45:22Z
format Thesis
id mit-1721.1/91090
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T14:45:22Z
publishDate 2014
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/91090 2019-04-10T21:57:06Z
Improving clinical risk-stratification tools : instance-transfer for selecting relevant training data
Instance-transfer for selecting relevant training data
Gong, Jen J. (Jen Jian)
John V. Guttag.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Electrical Engineering and Computer Science.
Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
52
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 66-71).
One of the primary problems in constructing risk-stratification models for medical applications is that the data are often noisy and incomplete and suffer from high class imbalance. This problem becomes more severe when the total amount of data relevant to the task of interest is small. We address this problem in the context of risk-stratifying patients receiving isolated surgical aortic valve replacements (isolated AVR) for the adverse outcomes of operative mortality and stroke. We work with data from two hospitals (Hospital 1 and Hospital 2) in the Society of Thoracic Surgeons (STS) Adult Cardiac Surgery Database. Because the data available for our application of interest (target data) are limited, developing an accurate model using only these data is infeasible. Instead, we investigate transfer learning approaches that use data from other cardiac surgery procedures as well as from other institutions (source data). We first evaluate the effectiveness of leveraging information across procedures within a single hospital. We achieve significant improvements over baseline: at Hospital 1, the average AUC for operative mortality increased from 0.58 to 0.70. However, not all source examples are equally useful.
Next, we evaluate the effectiveness of leveraging data across hospitals. We show that leveraging information across hospitals has variable utility: although it can result in worse performance (average AUC for stroke at Hospital 1 dropped from 0.61 to 0.56), it can also lead to significant improvements (average AUC for operative mortality at Hospital 1 increased from 0.70 to 0.72). Finally, we present an automated approach to leveraging the available source data. We investigate how performance changes when source data are removed according to their distance from the mean of the target data, and we propose an instance-weighting scheme based on these distances. This automated instance-weighting approach can achieve small but significant improvements over using all of the data without weights (average AUC for operative mortality at Hospital 1 increased from 0.72 to 0.73). Research on these methods can have an important impact on the development of clinical risk-stratification tools targeted towards specific patient populations.
by Jen J. Gong.
S.M. in Computer Science and Engineering
2014-10-21T17:25:32Z
2014
Thesis
http://hdl.handle.net/1721.1/91090
892724540
eng
M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.
http://dspace.mit.edu/handle/1721.1/7582
71 pages
application/pdf
Massachusetts Institute of Technology
title Improving clinical risk-stratification tools : instance-transfer for selecting relevant training data
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/91090