Digital medicine and the curse of dimensionality
Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among othe...
Main Authors: | Visar Berisha, Chelsea Krantsevich, P. Richard Hahn, Shira Hahn, Gautam Dasarathy, Pavan Turaga, Julie Liss |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2021-10-01 |
Series: | npj Digital Medicine |
Online Access: | https://doi.org/10.1038/s41746-021-00521-5 |
_version_ | 1827614699446861824 |
---|---|
author | Visar Berisha, Chelsea Krantsevich, P. Richard Hahn, Shira Hahn, Gautam Dasarathy, Pavan Turaga, Julie Liss |
author_sort | Visar Berisha |
collection | DOAJ |
description | Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers. |
first_indexed | 2024-03-09T08:58:31Z |
format | Article |
id | doaj.art-c8a58a5ae25641eaac4a654dd1212a4d |
institution | Directory of Open Access Journals |
issn | 2398-6352 |
language | English |
last_indexed | 2024-03-09T08:58:31Z |
publishDate | 2021-10-01 |
publisher | Nature Portfolio |
record_format | Article |
series | npj Digital Medicine |
spelling | doaj.art-c8a58a5ae25641eaac4a654dd1212a4d; 2023-12-02T12:23:00Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2021-10-01; 4118; 10.1038/s41746-021-00521-5; Digital medicine and the curse of dimensionality; Authors: Visar Berisha (0), Chelsea Krantsevich (1), P. Richard Hahn (2), Shira Hahn (3), Gautam Dasarathy (4), Pavan Turaga (5), Julie Liss (6); Affiliations, in record order: School of Electrical Computer and Energy Engineering, Arizona State University; Aural Analytics; School of Mathematical and Statistical Sciences, Arizona State University; College of Health Solutions, Arizona State University; School of Electrical Computer and Energy Engineering, Arizona State University; School of Electrical Computer and Energy Engineering, Arizona State University; College of Health Solutions, Arizona State University; https://doi.org/10.1038/s41746-021-00521-5 |
title | Digital medicine and the curse of dimensionality |
url | https://doi.org/10.1038/s41746-021-00521-5 |