Latent topic discovery of clinical concepts from hospital discharge summaries of a heterogeneous patient cohort

Bibliographic Details
Main Authors: Saeed, Mohammed; Lehman, Li-Wei; Long, William F.; Mark, Roger G.
Other Authors: Institute for Medical Engineering and Science
Format: Article
Published: Institute of Electrical and Electronics Engineers (IEEE) 2017
Online Access: http://hdl.handle.net/1721.1/112805
https://orcid.org/0000-0002-6318-2978
Description
Summary: Patients in critical care often exhibit complex disease patterns. A fundamental challenge in clinical research is to identify clinical features that may be characteristic of adverse patient outcomes. In this work, we propose a data-driven approach for phenotype discovery of patients in critical care. We used the Hierarchical Dirichlet Process (HDP), a non-parametric topic modeling technique, to automatically discover the latent "topic" structure of diseases, symptoms, and findings documented in hospital discharge summaries. We show that the latent topic structure can be used to reveal phenotypic patterns of diseases and symptoms shared across subgroups of a patient cohort, and may have prognostic value in stratifying patients' mortality risk after hospital discharge. Using discharge summaries of a large patient cohort from the MIMIC II database, we evaluate the clinical utility of the discovered topic structure in identifying patients who are at high risk of mortality within one year of hospital discharge. We demonstrate that the learned topic structure has statistically significant associations with mortality after hospital discharge, and may provide valuable insights for defining new feature sets for predicting patient outcomes.
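
Illustrative note: the summary reports a statistically significant association between learned topics and mortality after discharge, but this record does not say which test was used. The Python sketch below shows one common way such an association could be checked, assuming each patient has been reduced to a yes/no flag for a single topic. The Pearson chi-square test, the function names `count_table` and `chi_square_2x2`, and the toy cohort are all assumptions made for illustration; none of them come from the article or from the MIMIC II database.

```python
import math


def count_table(records):
    """Count a 2x2 table from (has_topic, died_within_year) boolean pairs.

    Returns (a, b, c, d):
      a = topic present, died      b = topic present, survived
      c = topic absent, died       d = topic absent, survived
    """
    a = b = c = d = 0
    for has_topic, died in records:
        if has_topic and died:
            a += 1
        elif has_topic:
            b += 1
        elif died:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one patient")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square tail probability has the
    # closed form erfc(sqrt(x / 2)), so no statistics library is needed.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up toy cohort (NOT real patient data): 200 hypothetical patients.
records = (
    [(True, True)] * 30
    + [(True, False)] * 70
    + [(False, True)] * 10
    + [(False, False)] * 90
)

table = count_table(records)
stat, p_value = chi_square_2x2(*table)
print(table, stat, p_value)
```

In this toy cohort, 30% of patients with the topic die within a year versus 10% without it, and the test statistic is 12.5. A real analysis would also need to account for testing many topics at once and for confounders such as age and illness severity.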