Predicting human health from biofluid-based metabolomics using machine learning


Bibliographic details
Main authors: Evans, Ethan D; Duvallet, Claire; Chu, Nathaniel D; Oberst, Michael K; Murphy, Michael A; Rockafellow, Isaac; Sontag, David; Alm, Eric J
Other authors: Massachusetts Institute of Technology. Department of Biological Engineering
Format: Article
Language: English
Published: Springer Science and Business Media LLC 2021
Links: https://hdl.handle.net/1721.1/133758
author Evans, Ethan D
Duvallet, Claire
Chu, Nathaniel D
Oberst, Michael K
Murphy, Michael A
Rockafellow, Isaac
Sontag, David
Alm, Eric J
author2 Massachusetts Institute of Technology. Department of Biological Engineering
author_sort Evans, Ethan D
collection MIT
description © 2020, The Author(s). Biofluid-based metabolomics has the potential to provide highly accurate, minimally invasive diagnostics. Metabolomics studies using mass spectrometry typically reduce the high-dimensional data to only a small number of statistically significant features, that are often chemically identified—where each feature corresponds to a mass-to-charge ratio, retention time, and intensity. This practice may remove a substantial amount of predictive signal. To test the utility of the complete feature set, we train machine learning models for health state-prediction in 35 human metabolomics studies, representing 148 individual data sets. Models trained with all features outperform those using only significant features and frequently provide high predictive performance across nine health state categories, despite disparate experimental and disease contexts. Using only non-significant features it is still often possible to train models and achieve high predictive performance, suggesting useful predictive signal. This work highlights the potential for health state diagnostics using all metabolomics features with data-driven analysis.
first_indexed 2024-09-23T16:14:28Z
format Article
id mit-1721.1/133758
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T16:14:28Z
publishDate 2021
publisher Springer Science and Business Media LLC
record_format dspace
spelling mit-1721.1/133758
2023-01-20T21:34:24Z
Massachusetts Institute of Technology. Department of Biological Engineering
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
2021-10-27T19:56:29Z
2020
2021-02-02T17:20:14Z
Article
http://purl.org/eprint/type/JournalArticle
https://hdl.handle.net/1721.1/133758
en
10.1038/s41598-020-74823-1
Scientific Reports
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/
application/pdf
Springer Science and Business Media LLC
title Predicting human health from biofluid-based metabolomics using machine learning
url https://hdl.handle.net/1721.1/133758