Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS

LC/ESI/HRMS is increasingly employed for monitoring chemical pollutants in water samples, with non-targeted analysis becoming more common. Unfortunately, due to the lack of analytical standards, non-targeted analysis is mostly qualitative. To remedy this, models have been developed to evaluate the r...


Bibliographic Details
Main Authors: Emma Palm, Anneli Kruve
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Molecules
Subjects: random forest; non-target analysis; suspect screening; quantification
Online Access: https://www.mdpi.com/1420-3049/27/3/1013
_version_ 1797485969956405248
author Emma Palm
Anneli Kruve
author_facet Emma Palm
Anneli Kruve
author_sort Emma Palm
collection DOAJ
description LC/ESI/HRMS is increasingly employed for monitoring chemical pollutants in water samples, with non-targeted analysis becoming more common. Unfortunately, due to the lack of analytical standards, non-targeted analysis is mostly qualitative. To remedy this, models have been developed to predict the response of compounds from their structure, which can then be used for quantification in non-targeted analysis. Still, these models rely on tentatively known structures, whereas for most detected compounds only a list of structural candidates, or sometimes only the exact mass and retention time, is identified. In this study, a quantification approach was developed in which LC/ESI/HRMS descriptors are used to quantify compounds even if the structure is unknown. The approach was developed based on 92 compounds analyzed in parallel in both positive and negative ESI mode with mobile phases at pH 2.7, 8.0, and 10.0. The developed approach was compared with two baseline approaches: one assuming equal response factors for all compounds and one using the response factor of the closest eluting standard. The former gave a mean prediction error of a factor of 29, while the latter gave a mean prediction error of a factor of 1300. In the machine learning-based quantification approach developed here, the corresponding prediction error was a factor of 10. Furthermore, the approach was validated by analyzing two blind samples containing 48 compounds spiked into tap water and ultrapure water. The obtained mean prediction error was lower than a factor of 6.0 for both samples. The errors were found to be comparable to those of approaches using structural information.
first_indexed 2024-03-09T23:27:25Z
format Article
id doaj.art-89ec82cb5f114fc490a8e55cee5e7bd1
institution Directory of Open Access Journals
issn 1420-3049
language English
last_indexed 2024-03-09T23:27:25Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Molecules
spelling doaj.art-89ec82cb5f114fc490a8e55cee5e7bd1; 2023-11-23T17:16:17Z; eng; MDPI AG; Molecules; 1420-3049; 2022-02-01; 27; 3; 1013; 10.3390/molecules27031013; Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS; Emma Palm (0); Anneli Kruve (1); (0) Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18 Stockholm, Sweden; (1) Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18 Stockholm, Sweden; https://www.mdpi.com/1420-3049/27/3/1013; random forest; non-target analysis; suspect screening; quantification
spellingShingle Emma Palm
Anneli Kruve
Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
Molecules
random forest
non-target analysis
suspect screening
quantification
title Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
title_full Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
title_fullStr Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
title_full_unstemmed Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
title_short Machine Learning for Absolute Quantification of Unidentified Compounds in Non-Targeted LC/HRMS
title_sort machine learning for absolute quantification of unidentified compounds in non targeted lc hrms
topic random forest
non-target analysis
suspect screening
quantification
url https://www.mdpi.com/1420-3049/27/3/1013
work_keys_str_mv AT emmapalm machinelearningforabsolutequantificationofunidentifiedcompoundsinnontargetedlchrms
AT annelikruve machinelearningforabsolutequantificationofunidentifiedcompoundsinnontargetedlchrms
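
The two baseline approaches named in the description, and the "factor of" error it reports, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the response factor is taken as signal divided by concentration, the shared response factor is taken as the mean over the standards, the fold error is taken as the larger of predicted/actual and actual/predicted, and all numbers and function names are hypothetical.

```python
from statistics import mean


def fold_error(predicted, actual):
    """Symmetric fold error (assumed definition): always >= 1, 1 means exact."""
    ratio = predicted / actual
    return max(ratio, 1.0 / ratio)


def response_factor(standard):
    """Assumed definition: peak area per unit concentration."""
    return standard["area"] / standard["conc"]


def predict_equal_rf(area, standards):
    """Baseline 1: every compound shares one response factor.

    Assumption: the shared value is the mean over the standards.
    """
    shared_rf = mean(response_factor(s) for s in standards)
    return area / shared_rf


def predict_closest_eluting(area, rt, standards):
    """Baseline 2: borrow the response factor of the standard whose
    retention time is nearest to that of the unknown."""
    nearest = min(standards, key=lambda s: abs(s["rt"] - rt))
    return area / response_factor(nearest)


# Hypothetical standards: retention time (min), peak area, concentration.
standards = [
    {"rt": 2.0, "area": 100.0, "conc": 1.0},
    {"rt": 8.0, "area": 900.0, "conc": 1.0},
]

# Hypothetical unknown with a known spiked concentration for checking.
unknown = {"rt": 7.5, "area": 450.0, "true_conc": 0.5}

pred_equal = predict_equal_rf(unknown["area"], standards)
pred_closest = predict_closest_eluting(unknown["area"], unknown["rt"], standards)

print("equal-RF fold error:", fold_error(pred_equal, unknown["true_conc"]))
print("closest-eluting fold error:", fold_error(pred_closest, unknown["true_conc"]))
```

A fold error is symmetric, so a twofold overestimate and a twofold underestimate both count as a factor of 2. How the article averages these errors across compounds is not stated in this record, so no averaging is shown here.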