CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification

Bibliographic Details
Main Authors: Yannick Djoumbou-Feunang, Allison Pon, Naama Karu, Jiamin Zheng, Carin Li, David Arndt, Maheswor Gautam, Felicity Allen, David S. Wishart
Format: Article
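Language: English
Published: MDPI AG 2019-04-01
Series: Metabolites
Subjects: mass spectrometry; liquid chromatography; MS spectral prediction; metabolite identification; structure-based chemical classification; rule-based fragmentation; combinatorial fragmentation
Online Access: https://www.mdpi.com/2218-1989/9/4/72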
author Yannick Djoumbou-Feunang
Allison Pon
Naama Karu
Jiamin Zheng
Carin Li
David Arndt
Maheswor Gautam
Felicity Allen
David S. Wishart
collection DOAJ
description Metabolite identification for untargeted metabolomics is often hampered by the lack of experimentally collected reference spectra from tandem mass spectrometry (MS/MS). To circumvent this problem, Competitive Fragmentation Modeling-ID (CFM-ID) was developed to accurately predict electrospray ionization-MS/MS (ESI-MS/MS) spectra from chemical structures and to aid in compound identification via MS/MS spectral matching. While earlier versions of CFM-ID performed very well overall, their performance in predicting the MS/MS spectra of certain classes of compounds, including many lipids, was quite poor. Furthermore, CFM-ID’s compound identification capabilities were limited because it neither used experimentally available MS/MS spectra nor exploited metadata in its spectral matching algorithm. Here, we describe significant improvements to CFM-ID’s performance and speed. These include (1) the implementation of a rule-based fragmentation approach for lipid MS/MS spectral prediction, which greatly improves the speed and accuracy of CFM-ID; (2) the inclusion of experimental MS/MS spectra and other metadata to enhance CFM-ID’s compound identification abilities; (3) the development of new scoring functions that improve CFM-ID’s accuracy by 21.1%; and (4) the implementation of a chemical classification algorithm that correctly classifies unknown chemicals (based on their MS/MS spectra) in >80% of cases. This improved version, called CFM-ID 3.0, is freely available as a web server. Its source code is also accessible online.
first_indexed 2024-12-14T14:18:28Z
format Article
id doaj.art-615de7509d534f90ba45aa205d3a2b56
institution Directory Open Access Journal
issn 2218-1989
language English
last_indexed 2024-12-14T14:18:28Z
publishDate 2019-04-01
publisher MDPI AG
record_format Article
series Metabolites
spelling doaj.art-615de7509d534f90ba45aa205d3a2b56; 2022-12-21T22:58:09Z; eng; MDPI AG; Metabolites; 2218-1989; 2019-04-01; 9; 4; 72; 10.3390/metabo9040072; metabo9040072
CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification
Yannick Djoumbou-Feunang (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
Allison Pon (OMx Personal Health Analytics, Edmonton, AB T5J 1B9, Canada)
Naama Karu (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
Jiamin Zheng (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
Carin Li (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
David Arndt (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
Maheswor Gautam (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
Felicity Allen (Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK)
David S. Wishart (Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada)
https://www.mdpi.com/2218-1989/9/4/72
mass spectrometry; liquid chromatography; MS spectral prediction; metabolite identification; structure-based chemical classification; rule-based fragmentation; combinatorial fragmentation
title CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification
topic mass spectrometry
liquid chromatography
MS spectral prediction
metabolite identification
structure-based chemical classification
rule-based fragmentation
combinatorial fragmentation
url https://www.mdpi.com/2218-1989/9/4/72
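
The description above refers to compound identification via MS/MS spectral matching. As a rough illustration of the general idea only, the sketch below ranks candidate compounds by cosine similarity between a query peak list and each candidate's reference (predicted or experimental) peak list. This is a generic, textbook-style score and is not CFM-ID 3.0's scoring function, which this record does not describe. The function names, the 0.01 Da tolerance, and the peak values are all hypothetical.

```python
from math import sqrt

# A spectrum is a list of (m/z, intensity) tuples. All values are illustrative.

def match_peaks(query, reference, tol=0.01):
    """Greedily pair each query peak with the closest unused reference peak
    whose m/z lies within `tol` (assumed tolerance, in Da)."""
    pairs = []
    used = set()
    for q_mz, q_int in query:
        best = None
        for i, (r_mz, _) in enumerate(reference):
            if i in used or abs(q_mz - r_mz) > tol:
                continue
            if best is None or abs(q_mz - r_mz) < abs(q_mz - reference[best][0]):
                best = i
        if best is not None:
            used.add(best)
            pairs.append((q_int, reference[best][1]))
    return pairs

def cosine_score(query, reference, tol=0.01):
    """Cosine similarity over matched peaks; unmatched peaks lower the score
    because they still contribute to the norms."""
    pairs = match_peaks(query, reference, tol)
    dot = sum(q * r for q, r in pairs)
    norm_q = sqrt(sum(i * i for _, i in query))
    norm_r = sqrt(sum(i * i for _, i in reference))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)

def rank_candidates(query, candidates, tol=0.01):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, cosine_score(query, spec, tol))
              for name, spec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    query = [(100.0, 50.0), (150.0, 100.0)]
    candidates = {
        "candidate_a": [(300.0, 10.0)],
        "candidate_b": [(100.005, 50.0), (150.0, 100.0)],
    }
    for name, score in rank_candidates(query, candidates):
        print(name, round(score, 3))
```

A score of this kind only compares peak lists; the metadata and experimental spectra mentioned in the description would have to enter through additional terms that are not modeled here.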