Latent Patient Cluster Discovery for Robust Future Forecasting and New-Patient Generalization.

Commonly referred to as predictive modeling, the use of machine learning and statistical methods to improve healthcare outcomes has recently gained traction in biomedical informatics research. Given the vast opportunities enabled by large Electronic Health Records (EHR) data and powerful resources f...


Bibliographic Details
Main Authors:	Ting Qian, Aaron J Masino
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2016-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC5026362?pdf=render

_version_	1819100464231219200
author	Ting Qian Aaron J Masino
collection	DOAJ
description	Commonly referred to as predictive modeling, the use of machine learning and statistical methods to improve healthcare outcomes has recently gained traction in biomedical informatics research. Given the vast opportunities enabled by large Electronic Health Records (EHR) data and powerful resources for conducting predictive modeling, we argue that it is nevertheless crucial to first carefully examine the prediction task and then choose predictive methods accordingly. Specifically, we argue that there are at least three distinct prediction tasks that are often conflated in biomedical research: 1) data imputation, where a model fills in the missing values in a dataset, 2) future forecasting, where a model projects the development of a medical condition for a known patient based on existing observations, and 3) new-patient generalization, where a model transfers the knowledge learned from previously observed patients to newly encountered ones. Importantly, the latter two tasks (future forecasting and new-patient generalization) tend to be more difficult than data imputation because they require predictions to be made on potentially out-of-sample data (i.e., data following a different predictable pattern from what has been learned by the model). Using hearing loss progression as an example, we investigate three regression models and show that the modeling of latent clusters is a robust method for addressing the more challenging prediction scenarios. Overall, our findings suggest that there exist significant differences between various kinds of prediction tasks and that it is important to evaluate the merits of a predictive model relative to the specific purpose of a prediction task.
first_indexed	2024-12-22T01:03:11Z
format	Article
id	doaj.art-60abf109c36c400d8d05f87ba417b366
institution	Directory Open Access Journal
issn	1932-6203
language	English
last_indexed	2024-12-22T01:03:11Z
publishDate	2016-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj.art-60abf109c36c400d8d05f87ba417b366 | 2022-12-21T18:44:09Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2016-01-01 | 11 | 9 | e0162812 | 10.1371/journal.pone.0162812 | Latent Patient Cluster Discovery for Robust Future Forecasting and New-Patient Generalization. | Ting Qian | Aaron J Masino | http://europepmc.org/articles/PMC5026362?pdf=render
title	Latent Patient Cluster Discovery for Robust Future Forecasting and New-Patient Generalization.
url	http://europepmc.org/articles/PMC5026362?pdf=render
work_keys_str_mv	AT tingqian latentpatientclusterdiscoveryforrobustfutureforecastingandnewpatientgeneralization AT aaronjmasino latentpatientclusterdiscoveryforrobustfutureforecastingandnewpatientgeneralization
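
The three prediction tasks named in the description differ mainly in which observations are held out for evaluation. The following is a minimal sketch of that distinction, not the authors' code: it assumes a hypothetical table with one row per patient and one column per visit, and the function name `make_splits` and the parameter `holdout_frac` are illustrative only.

```python
import numpy as np


def make_splits(n_patients, n_visits, holdout_frac=0.2, seed=0):
    """Return boolean test masks of shape (n_patients, n_visits), one per task.

    True marks a cell that is hidden from the model and used for evaluation.
    The patient-by-visit layout is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    shape = (n_patients, n_visits)

    # 1) Data imputation: hide cells scattered at random across the table.
    imputation = rng.random(shape) < holdout_frac

    # 2) Future forecasting: hide the latest visits of every known patient.
    n_future = max(1, int(round(holdout_frac * n_visits)))
    forecasting = np.zeros(shape, dtype=bool)
    forecasting[:, n_visits - n_future:] = True

    # 3) New-patient generalization: hide every visit of some patients.
    n_new = max(1, int(round(holdout_frac * n_patients)))
    new_ids = rng.choice(n_patients, size=n_new, replace=False)
    new_patient = np.zeros(shape, dtype=bool)
    new_patient[new_ids, :] = True

    return {
        "imputation": imputation,
        "forecasting": forecasting,
        "new_patient": new_patient,
    }


if __name__ == "__main__":
    masks = make_splits(n_patients=10, n_visits=5)
    for task, mask in masks.items():
        print(task, "held-out cells:", int(mask.sum()))
```

Under this layout, only the imputation mask leaves every patient and every time range partly visible to the model; the other two masks withhold whole columns or whole rows, which is one way to picture why the description calls them potentially out-of-sample.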


Similar Items