Patient contrastive learning: A performant, expressive, and practical approach to electrocardiogram modeling

Supervised machine learning applications in health care are often limited due to a scarcity of labeled training data. To mitigate the effect of small sample size, we introduce a pre-training approach, Patient Contras...


Bibliographic Details
Main Authors:	Diamant, Nathaniel; Reinertsen, Erik; Song, Steven; Aguirre, Aaron D; Stultz, Collin M; Batra, Puneet
Other Authors:	Massachusetts Institute of Technology. Research Laboratory of Electronics
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2022
Online Access:	https://hdl.handle.net/1721.1/143901

_version_	1826214546606391296
author	Diamant, Nathaniel; Reinertsen, Erik; Song, Steven; Aguirre, Aaron D; Stultz, Collin M; Batra, Puneet
author2	Massachusetts Institute of Technology. Research Laboratory of Electronics
author_facet	Massachusetts Institute of Technology. Research Laboratory of Electronics Diamant, Nathaniel Reinertsen, Erik Song, Steven Aguirre, Aaron D Stultz, Collin M Batra, Puneet
author_sort	Diamant, Nathaniel
collection	MIT
description	Supervised machine learning applications in health care are often limited by a scarcity of labeled training data. To mitigate the effect of small sample size, we introduce a pre-training approach, Patient Contrastive Learning of Representations (PCLR), which creates latent representations of electrocardiograms (ECGs) from a large number of unlabeled examples using contrastive learning. The resulting representations are expressive, performant, and practical across a wide spectrum of clinical tasks. We develop PCLR using over 3.2 million 12-lead ECGs from a large health care system and demonstrate that training linear models on PCLR representations achieves a 51% performance increase, on average, over six training set sizes and four tasks (sex classification, age regression, and the detection of left ventricular hypertrophy and atrial fibrillation), relative to training neural network models from scratch. We also compare PCLR to three other ECG pre-training approaches (supervised pre-training, unsupervised pre-training with an autoencoder, and pre-training using a contrastive multi ECG-segment approach) and show significant performance benefits in three out of four tasks. We find an average performance benefit of 47% over the other models and an average performance benefit of 9% compared to the best model for each task. We release PCLR to enable others to extract ECG representations at https://github.com/broadinstitute/ml4h/tree/master/model_zoo/PCLR.
first_indexed	2024-09-23T16:07:31Z
format	Article
id	mit-1721.1/143901
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T16:07:31Z
publishDate	2022
publisher	Public Library of Science (PLoS)
record_format	dspace
spelling	mit-1721.1/143901 2023-01-18T20:31:47Z Patient contrastive learning: A performant, expressive, and practical approach to electrocardiogram modeling Diamant, Nathaniel; Reinertsen, Erik; Song, Steven; Aguirre, Aaron D; Stultz, Collin M; Batra, Puneet Massachusetts Institute of Technology. Research Laboratory of Electronics Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Harvard University--MIT Division of Health Sciences and Technology 2022-07-20T17:33:16Z 2022-07-20T17:33:16Z 2022 2022-07-20T17:03:37Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/143901 Diamant, Nathaniel, Reinertsen, Erik, Song, Steven, Aguirre, Aaron D, Stultz, Collin M et al. 2022. "Patient contrastive learning: A performant, expressive, and practical approach to electrocardiogram modeling." PLoS Computational Biology, 18 (2). en 10.1371/JOURNAL.PCBI.1009862 PLoS Computational Biology Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf Public Library of Science (PLoS) PLoS
spellingShingle	Diamant, Nathaniel Reinertsen, Erik Song, Steven Aguirre, Aaron D Stultz, Collin M Batra, Puneet Patient contrastive learning: A performant, expressive, and practical approach to electrocardiogram modeling
title	Patient contrastive learning: A performant, expressive, and practical approach to electrocardiogram modeling
title_sort	patient contrastive learning a performant expressive and practical approach to electrocardiogram modeling
url	https://hdl.handle.net/1721.1/143901

