MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry
Abstract Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite the advancements in untargeted liquid chromatography-mass spectrometry (LC–MS) to achieve a high-throughput profile of metabolites from complex biological resources, only a small frac...
Main Authors: | Mahnoor Zulfiqar, Luiz Gadelha, Christoph Steinbeck, Maria Sorokina, Kristian Peters |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2023-03-01 |
Series: | Journal of Cheminformatics |
Subjects: | Untargeted metabolomics, Workflow, Tandem mass spectrometry, FAIR, Metabolite annotation |
Online Access: | https://doi.org/10.1186/s13321-023-00695-y |
_version_ | 1827982928450158592 |
---|---|
author | Mahnoor Zulfiqar; Luiz Gadelha; Christoph Steinbeck; Maria Sorokina; Kristian Peters |
author_sort | Mahnoor Zulfiqar |
collection | DOAJ |
description | Abstract Mapping the chemical space of compounds to chemical structures remains a challenge in metabolomics. Despite advances in untargeted liquid chromatography-mass spectrometry (LC-MS) that enable high-throughput profiling of metabolites from complex biological resources, only a small fraction of these metabolites can be annotated with confidence. Many novel computational methods and tools, such as in silico generated spectra and molecular networking, have been developed to enable chemical structure annotation of known and unknown compounds. Here, we present an automated and reproducible Metabolome Annotation Workflow (MAW) for untargeted metabolomics data that further facilitates and automates this complex annotation by combining tandem mass spectrometry (MS2) input data pre-processing, spectral and compound database matching with computational classification, and in silico annotation. MAW takes LC-MS2 spectra as input and generates a list of putative candidates from spectral and compound databases. The databases are integrated via the R package Spectra and the metabolite annotation tool SIRIUS as part of the R segment of the workflow (MAW-R). The final candidate selection is performed using the cheminformatics tool RDKit in the Python segment (MAW-Py). Furthermore, each feature is assigned a chemical structure and can be imported into a chemical structure similarity network. MAW follows the FAIR (Findable, Accessible, Interoperable, Reusable) principles and has been made available as the Docker images maw-r and maw-py. The source code and documentation are available on GitHub ( https://github.com/zmahnoor14/MAW ). The performance of MAW is evaluated on two case studies. MAW can improve candidate ranking by integrating spectral databases with annotation tools like SIRIUS, which contributes to an efficient candidate selection procedure. The results from MAW are also reproducible and traceable, in compliance with the FAIR guidelines. Taken together, MAW could greatly facilitate automated metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery. |
first_indexed | 2024-04-09T22:40:50Z |
format | Article |
id | doaj.art-bafe384e71184496a59ed71326baf3e5 |
institution | Directory of Open Access Journals |
issn | 1758-2946 |
language | English |
last_indexed | 2024-04-09T22:40:50Z |
publishDate | 2023-03-01 |
publisher | BMC |
record_format | Article |
series | Journal of Cheminformatics |
spelling | doaj.art-bafe384e71184496a59ed71326baf3e5; 2023-03-22T12:13:22Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2023-03-01; volume 15, issue 1, pages 1-17; 10.1186/s13321-023-00695-y; MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry; Mahnoor Zulfiqar (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University); Luiz Gadelha (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University); Christoph Steinbeck (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University); Maria Sorokina (Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University); Kristian Peters (iDiv - German Centre for Integrative Biodiversity Research, Halle-Jena-Leipzig); https://doi.org/10.1186/s13321-023-00695-y; Untargeted metabolomics; Workflow; Tandem mass spectrometry; FAIR; Metabolite annotation |
title | MAW: the reproducible Metabolome Annotation Workflow for untargeted tandem mass spectrometry |
topic | Untargeted metabolomics; Workflow; Tandem mass spectrometry; FAIR; Metabolite annotation |
url | https://doi.org/10.1186/s13321-023-00695-y |