Performance and robustness of small molecule retention time prediction with molecular graph neural networks in industrial drug discovery campaigns
Abstract: This study explores how machine learning can be used to predict chromatographic retention times (RT) for the analysis of small molecules, with the objective of identifying a machine-learning framework with the robustness required to support a chemical synthesis production platform. We used...
Main Authors: | Daniel Vik, David Pii, Chirag Mudaliar, Mads Nørregaard-Madsen, Aleksejs Kontijevskis |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-04-01 |
Series: | Scientific Reports |
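Subjects: | Chromatography; Machine-learning; Retention time; Small molecule; Applied artificial intelligence; Pharmaceuticals |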
Online Access: | https://doi.org/10.1038/s41598-024-59620-4 |
_version_ | 1797199475949699072 |
---|---|
author | Daniel Vik; David Pii; Chirag Mudaliar; Mads Nørregaard-Madsen; Aleksejs Kontijevskis |
author_sort | Daniel Vik |
collection | DOAJ |
description | Abstract: This study explores how machine learning can be used to predict chromatographic retention times (RT) for the analysis of small molecules, with the objective of identifying a machine-learning framework with the robustness required to support a chemical synthesis production platform. We used internally generated data from high-throughput parallel synthesis in the context of pharmaceutical drug discovery projects. We tested machine-learning models from the following frameworks: XGBoost, ChemProp, and DeepChem, using a dataset of 7552 small molecules. Our findings show that two specific models, AttentiveFP and ChemProp, predicted RT more accurately than XGBoost and a regular neural network. We also assessed how well these models performed over time and found that molecular graph neural networks consistently gave accurate predictions for new chemical series. In addition, when we applied ChemProp to the publicly available METLIN SMRT dataset, it achieved an average error of 38.70 s. These results highlight the efficacy of molecular graph neural networks, especially ChemProp, in diverse RT prediction scenarios, thereby enhancing the efficiency of chromatographic analysis. |
first_indexed | 2024-04-24T07:16:21Z |
format | Article |
id | doaj.art-7809f1803283419293563e65a4899082 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-24T07:16:21Z |
publishDate | 2024-04-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-7809f1803283419293563e65a4899082; 2024-04-21T11:18:27Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-04-01; 14118; 10.1038/s41598-024-59620-4; Performance and robustness of small molecule retention time prediction with molecular graph neural networks in industrial drug discovery campaigns; Daniel Vik, David Pii, Chirag Mudaliar, Mads Nørregaard-Madsen, Aleksejs Kontijevskis (all Amgen Research Copenhagen, Amgen Inc.) |
title | Performance and robustness of small molecule retention time prediction with molecular graph neural networks in industrial drug discovery campaigns |
topic | Chromatography; Machine-learning; Retention time; Small molecule; Applied artificial intelligence; Pharmaceuticals |
url | https://doi.org/10.1038/s41598-024-59620-4 |