MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry

Abstract Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite the advancements in untargeted liquid chromatography-mass spectrometry (LC–MS) to achieve a high-throughput profile of metabolites from complex biological resources, only a small frac...

Bibliographic Details
Main Authors: Mahnoor Zulfiqar, Luiz Gadelha, Christoph Steinbeck, Maria Sorokina, Kristian Peters
Format: Article
Language: English
Published: BMC 2023-03-01
Series: Journal of Cheminformatics
Subjects: Untargeted metabolomics; Workflow; Tandem mass spectrometry; FAIR; Metabolite annotation
Online Access: https://doi.org/10.1186/s13321-023-00695-y
collection DOAJ
description Abstract Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite advances in untargeted liquid chromatography-mass spectrometry (LC–MS) that enable high-throughput profiling of metabolites from complex biological resources, only a small fraction of these metabolites can be annotated with confidence. Many novel computational methods and tools, such as in silico generated spectra and molecular networking, have been developed to enable chemical structure annotation of known and unknown compounds. Here, we present an automated and reproducible Metabolome Annotation Workflow (MAW) for untargeted metabolomics data that facilitates and automates this complex annotation by combining tandem mass spectrometry (MS2) input data pre-processing, spectral and compound database matching with computational classification, and in silico annotation. MAW takes LC–MS2 spectra as input and generates a list of putative candidates from spectral and compound databases. The databases are integrated via the R package Spectra and the metabolite annotation tool SIRIUS as part of the R segment of the workflow (MAW-R). The final candidate selection is performed using the cheminformatics tool RDKit in the Python segment (MAW-Py). Furthermore, each feature is assigned a chemical structure and can be imported into a chemical structure similarity network. MAW follows the FAIR (Findable, Accessible, Interoperable, Reusable) principles and is available as the Docker images maw-r and maw-py. The source code and documentation are available on GitHub (https://github.com/zmahnoor14/MAW). The performance of MAW is evaluated on two case studies. MAW can improve candidate ranking by integrating spectral databases with annotation tools like SIRIUS, which contributes to an efficient candidate selection procedure. The results from MAW are also reproducible and traceable, in compliance with the FAIR guidelines. Taken together, MAW could greatly facilitate automated metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery.
first_indexed 2024-04-09T22:40:50Z
format Article
id doaj.art-bafe384e71184496a59ed71326baf3e5
institution Directory Open Access Journal
issn 1758-2946
spelling doaj.art-bafe384e71184496a59ed71326baf3e5
2023-03-22T12:13:22Z
eng
BMC
Journal of Cheminformatics
1758-2946
2023-03-01
Volume 15, Issue 1, Pages 1-17
10.1186/s13321-023-00695-y
MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry
Mahnoor Zulfiqar (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University)
Luiz Gadelha (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University)
Christoph Steinbeck (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University)
Maria Sorokina (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University)
Kristian Peters (iDiv - German Centre for Integrative Biodiversity Research, Halle-Jena-Leipzig)
https://doi.org/10.1186/s13321-023-00695-y
Untargeted metabolomics
Workflow
Tandem mass spectrometry
FAIR
Metabolite annotation
topic Untargeted metabolomics
Workflow
Tandem mass spectrometry
FAIR
Metabolite annotation