Predicting human health from biofluid-based metabolomics using machine learning
© 2020, The Author(s). Biofluid-based metabolomics has the potential to provide highly accurate, minimally invasive diagnostics. Metabolomics studies using mass spectrometry typically reduce the high-dimensional data to only a small number of statistically significant features, that are often chemic...
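Main authors: | Evans, Ethan D; Duvallet, Claire; Chu, Nathaniel D; Oberst, Michael K; Murphy, Michael A; Rockafellow, Isaac; Sontag, David; Alm, Eric J |
---|---|
Other authors: | |
Format: | Article |
Language: | English |
Published: | Springer Science and Business Media LLC, 2021 |
Links: | https://hdl.handle.net/1721.1/133758 |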
_version_ | 1826214987680448512 |
---|---|
author | Evans, Ethan D Duvallet, Claire Chu, Nathaniel D Oberst, Michael K Murphy, Michael A Rockafellow, Isaac Sontag, David Alm, Eric J |
author2 | Massachusetts Institute of Technology. Department of Biological Engineering |
author_sort | Evans, Ethan D |
collection | MIT |
description | © 2020, The Author(s). Biofluid-based metabolomics has the potential to provide highly accurate, minimally invasive diagnostics. Metabolomics studies using mass spectrometry typically reduce the high-dimensional data to only a small number of statistically significant features, that are often chemically identified—where each feature corresponds to a mass-to-charge ratio, retention time, and intensity. This practice may remove a substantial amount of predictive signal. To test the utility of the complete feature set, we train machine learning models for health state-prediction in 35 human metabolomics studies, representing 148 individual data sets. Models trained with all features outperform those using only significant features and frequently provide high predictive performance across nine health state categories, despite disparate experimental and disease contexts. Using only non-significant features it is still often possible to train models and achieve high predictive performance, suggesting useful predictive signal. This work highlights the potential for health state diagnostics using all metabolomics features with data-driven analysis. |
first_indexed | 2024-09-23T16:14:28Z |
format | Article |
id | mit-1721.1/133758 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T16:14:28Z |
publishDate | 2021 |
publisher | Springer Science and Business Media LLC |
record_format | dspace |
spelling | mit-1721.1/133758; 2023-01-20T21:34:24Z; Massachusetts Institute of Technology. Department of Biological Engineering; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; 2021-10-27T19:56:29Z; 2021-10-27T19:56:29Z; 2020; 2021-02-02T17:20:14Z; Article; http://purl.org/eprint/type/JournalArticle; https://hdl.handle.net/1721.1/133758; en; 10.1038/s41598-020-74823-1; Scientific Reports; Creative Commons Attribution 4.0 International license; https://creativecommons.org/licenses/by/4.0/; application/pdf; Springer Science and Business Media LLC |
title | Predicting human health from biofluid-based metabolomics using machine learning |
url | https://hdl.handle.net/1721.1/133758 |