Limits of Prediction for Machine Learning in Drug Discovery
In drug discovery, molecules are optimized towards desired properties. In this context, machine learning is used for extrapolation in drug discovery projects. The limits of extrapolation for regression models are known. However, a systematic analysis of the effectiveness of extrapolation in drug dis...
Main Authors: | Modest von Korff, Thomas Sander |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-03-01 |
Series: | Frontiers in Pharmacology |
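Subjects: | machine learning; drug discovery; extrapolation; data set; PLS (partial least square); Gaussian regression |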
Online Access: | https://www.frontiersin.org/articles/10.3389/fphar.2022.832120/full |
_version_ | 1819104517865603072 |
---|---|
author | Modest von Korff; Thomas Sander |
collection | DOAJ |
description | In drug discovery, molecules are optimized towards desired properties. In this context, machine learning is used for extrapolation in drug discovery projects. The limits of extrapolation for regression models are known. However, a systematic analysis of the effectiveness of extrapolation in drug discovery has not yet been performed. In response, this study examined the capabilities of six machine learning algorithms to extrapolate from 243 datasets. The response values calculated from the molecules in the datasets were molecular weight, cLogP, and the number of sp3-atoms. Three experimental setups were chosen for the response values. Shuffled data were used for interpolation, whereas data for extrapolation were sorted from high to low values, and vice versa. Extrapolation with sorted data resulted in much larger prediction errors than extrapolation with shuffled data. Additionally, this study demonstrated that linear machine learning methods are preferable for extrapolation. |
first_indexed | 2024-12-22T02:07:37Z |
format | Article |
id | doaj.art-44bd4713f73f48beafaa5ea79fbbba05 |
institution | Directory Open Access Journal |
issn | 1663-9812 |
language | English |
last_indexed | 2024-12-22T02:07:37Z |
publishDate | 2022-03-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Pharmacology |
title | Limits of Prediction for Machine Learning in Drug Discovery |
topic | machine learning; drug discovery; extrapolation; data set; PLS (partial least square); Gaussian regression |
url | https://www.frontiersin.org/articles/10.3389/fphar.2022.832120/full |
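
The description above contrasts shuffled splits (interpolation) with splits sorted by response value (extrapolation). The following is a minimal sketch of that contrast, not the authors' protocol: the synthetic one-dimensional data, the 80/20 cut, and the two toy models (a least-squares line and a 1-nearest-neighbour regressor) are all assumptions chosen for illustration. Because the synthetic data are linear by construction, the sketch favors the linear model by design and only illustrates the mechanism, not the study's results.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for a single descriptor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_1nn(train, x):
    """Return the response of the training point whose descriptor is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def rmse(predicted, observed):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def evaluate(train, test):
    """Fit both toy models on train and report their RMSE on test."""
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    observed = [y for _, y in test]
    return {
        "linear": rmse([slope * x + intercept for x, _ in test], observed),
        "1-NN": rmse([predict_1nn(train, x) for x, _ in test], observed),
    }


# Hypothetical data: one descriptor x and a noisy linear response y.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)))

cut = 160  # assumed 80/20 split

# Interpolation setup: shuffle, so test responses fall inside the training range.
shuffled = data[:]
rng.shuffle(shuffled)
interp = evaluate(shuffled[:cut], shuffled[cut:])

# Extrapolation setup: sort by response from low to high, so every test
# response lies above the largest response seen during training.
by_response = sorted(data, key=lambda pair: pair[1])
extrap = evaluate(by_response[:cut], by_response[cut:])

print("interpolation RMSE:", interp)
print("extrapolation RMSE:", extrap)
```

In this sketch the nearest-neighbour model can never predict a response larger than the largest one in its training set, which is why its error grows on the sorted split, while the fitted line can continue beyond the training range.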