Improving clinical risk-stratification tools : instance-transfer for selecting relevant training data

Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

Bibliographic Details
Main Author:	Gong, Jen J. (Jen Jian)
Other Authors:	John V. Guttag.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2014
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/91090

collection	MIT
id	mit-1721.1/91090
institution	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/91090 2019-04-10T21:57:06Z Improving clinical risk-stratification tools : instance-transfer for selecting relevant training data Instance-transfer for selecting relevant training data Gong, Jen J. (Jen Jian) John V. Guttag. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. 52 Cataloged from PDF version of thesis. Includes bibliographical references (pages 66-71).

Abstract: One of the primary problems in constructing risk-stratification models for medical applications is that the data are often noisy, incomplete, and suffer from high class-imbalance. This problem becomes more severe when the total amount of data relevant to the task of interest is small. We address this problem in the context of risk-stratifying patients receiving isolated surgical aortic valve replacements (isolated AVR) for the adverse outcomes of operative mortality and stroke. We work with data from two hospitals (Hospital 1 and Hospital 2) in the Society of Thoracic Surgeons (STS) Adult Cardiac Surgery Database. Because the data available for our application of interest (target data) are limited, developing an accurate model using only these data is infeasible. Instead, we investigate transfer learning approaches to utilize data from other cardiac surgery procedures as well as from other institutions (source data).

We first evaluate the effectiveness of leveraging information across procedures within a single hospital. We achieve significant improvements over baseline: at Hospital 1, the average AUC for operative mortality increased from 0.58 to 0.70. However, not all source examples are equally useful. Next, we evaluate the effectiveness of leveraging data across hospitals. We show that leveraging information across hospitals has variable utility; although it can result in worse performance (average AUC for stroke at Hospital 1 dropped from 0.61 to 0.56), it can also lead to significant improvements (average AUC for operative mortality at Hospital 1 increased from 0.70 to 0.72).

Finally, we present an automated approach to leveraging the available source data. We investigate how removing source data based on how far they are from the mean of the target data affects performance. We propose an instance-weighting scheme based on these distances. This automated instance-weighting approach can achieve small, but significant improvements over using all of the data without weights (average AUC for operative mortality at Hospital 1 increased from 0.72 to 0.73). Research on these methods can have an important impact on the development of clinical risk-stratification tools targeted towards specific patient populations.

by Jen J. Gong. S.M. in Computer Science and Engineering 2014-10-21T17:25:32Z 2014 Thesis http://hdl.handle.net/1721.1/91090 892724540 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 71 pages application/pdf Massachusetts Institute of Technology
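The abstract above describes two uses of the distance between each source example and the mean of the target data: dropping far-away source examples, and weighting source examples by distance. The following is a minimal sketch of that idea only, not the thesis's implementation. The record does not state the distance metric or the weighting function, so Euclidean distance and the weight 1 / (1 + d) are assumptions, and all function names and the toy data are hypothetical.

```python
import numpy as np


def distances_to_target_mean(source_X, target_X):
    """Euclidean distance (an assumed metric) from each source row to the target mean."""
    target_mean = target_X.mean(axis=0)
    return np.linalg.norm(source_X - target_mean, axis=1)


def keep_nearest(source_X, target_X, keep_fraction):
    """Boolean mask keeping the keep_fraction of source rows closest to the target mean."""
    d = distances_to_target_mean(source_X, target_X)
    n_keep = int(np.ceil(keep_fraction * len(d)))
    mask = np.zeros(len(d), dtype=bool)
    mask[np.argsort(d)[:n_keep]] = True
    return mask


def distance_weights(source_X, target_X):
    """Hypothetical weighting 1 / (1 + d): nearer source rows get larger weights."""
    d = distances_to_target_mean(source_X, target_X)
    return 1.0 / (1.0 + d)


# Toy data (made up for illustration): the target mean is (1, 0).
target_X = np.array([[0.0, 0.0], [2.0, 0.0]])
source_X = np.array([[1.0, 0.0], [1.0, 3.0], [5.0, 0.0]])

weights = distance_weights(source_X, target_X)           # distances are 0, 3, 4
mask = keep_nearest(source_X, target_X, keep_fraction=0.5)
```

Weights of this kind could then be passed to a classifier that accepts per-example weights during training; which classifier and which weighting function the thesis actually uses is not stated in this record.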


Similar Items