Digital medicine and the curse of dimensionality

Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. ...

Bibliographic Details
Main Authors: Visar Berisha, Chelsea Krantsevich, P. Richard Hahn, Shira Hahn, Gautam Dasarathy, Pavan Turaga, Julie Liss
Format: Article
Language: English
Published: Nature Portfolio 2021-10-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-021-00521-5
_version_ 1827614699446861824
author Visar Berisha
Chelsea Krantsevich
P. Richard Hahn
Shira Hahn
Gautam Dasarathy
Pavan Turaga
Julie Liss
collection DOAJ
description Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
first_indexed 2024-03-09T08:58:31Z
format Article
id doaj.art-c8a58a5ae25641eaac4a654dd1212a4d
institution Directory of Open Access Journals
issn 2398-6352
language English
last_indexed 2024-03-09T08:58:31Z
publishDate 2021-10-01
publisher Nature Portfolio
record_format Article
series npj Digital Medicine
spelling doaj.art-c8a58a5ae25641eaac4a654dd1212a4d
2023-12-02T12:23:00Z
eng
Nature Portfolio
npj Digital Medicine
2398-6352
2021-10-01
4118
10.1038/s41746-021-00521-5
Digital medicine and the curse of dimensionality
Visar Berisha (0): School of Electrical Computer and Energy Engineering, Arizona State University
Chelsea Krantsevich (1): Aural Analytics
P. Richard Hahn (2): School of Mathematical and Statistical Sciences, Arizona State University
Shira Hahn (3): College of Health Solutions, Arizona State University
Gautam Dasarathy (4): School of Electrical Computer and Energy Engineering, Arizona State University
Pavan Turaga (5): School of Electrical Computer and Energy Engineering, Arizona State University
Julie Liss (6): College of Health Solutions, Arizona State University
https://doi.org/10.1038/s41746-021-00521-5
title Digital medicine and the curse of dimensionality
url https://doi.org/10.1038/s41746-021-00521-5