A Comprehensive Evaluation of Metabolomics Data Preprocessing Methods for Deep Learning

Machine learning has advanced greatly over the past decade, owing to algorithmic innovations, hardware acceleration, and benchmark datasets for training in domains such as computer vision, natural-language processing, and, more recently, the life sciences. In particular, the subfield of machin...

Bibliographic Details
Main Authors: Krzysztof Jan Abram, Douglas McCloskey
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Metabolites
Subjects: metabolomics, deep learning, preprocessing
Online Access: https://www.mdpi.com/2218-1989/12/3/202
author Krzysztof Jan Abram
Douglas McCloskey
collection DOAJ
description Machine learning has advanced greatly over the past decade, owing to algorithmic innovations, hardware acceleration, and benchmark datasets for training in domains such as computer vision, natural-language processing, and, more recently, the life sciences. In particular, the subfield of machine learning known as deep learning has found applications in genomics, proteomics, and metabolomics. However, a thorough assessment of how the data preprocessing methods required for the analysis of life science data affect the performance of deep learning is lacking. This work helps fill that gap by assessing the impact of commonly used and newly developed methods in metabolomics data preprocessing workflows that span from raw data to processed data. The results of these analyses are summarized as a set of best practices that researchers can use as a starting point for downstream classification and reconstruction tasks using deep learning.
first_indexed 2024-03-09T13:21:42Z
format Article
id doaj.art-bf458fd5ccd24fe2b44da03596c239fe
institution Directory Open Access Journal
issn 2218-1989
language English
last_indexed 2024-03-09T13:21:42Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Metabolites
spelling doaj.art-bf458fd5ccd24fe2b44da03596c239fe
2023-11-30T21:29:16Z
eng
MDPI AG
Metabolites
2218-1989
2022-02-01
Volume 12, Issue 3, Article 202
10.3390/metabo12030202
A Comprehensive Evaluation of Metabolomics Data Preprocessing Methods for Deep Learning
Krzysztof Jan Abram (0), Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
Douglas McCloskey (1), Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
https://www.mdpi.com/2218-1989/12/3/202
metabolomics
deep learning
preprocessing
title A Comprehensive Evaluation of Metabolomics Data Preprocessing Methods for Deep Learning
topic metabolomics
deep learning
preprocessing
url https://www.mdpi.com/2218-1989/12/3/202