Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies

Abstract We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our a...


Bibliographic Details
Main Authors: Christoph A. Bauer, Gisbert Schneider, Andreas H. Göller
Format: Article
Language: English
Published: BMC 2019-09-01
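Series: Journal of Cheminformatics
Subjects: Computational chemistry, Density functional theory, Hydrogen bond strength, Free energy prediction, Cheminformatics
Online Access: http://link.springer.com/article/10.1186/s13321-019-0381-4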
author Christoph A. Bauer
Gisbert Schneider
Andreas H. Göller
collection DOAJ
description Abstract We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation with the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record, with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 kJ mol⁻¹ (acceptors) and 2.3 kJ mol⁻¹ (donors) on experimental test sets. This performance is comparable with that of previous models trained on experimental hydrogen bonding free energies, indicating that molecular QC data can serve as a substitute for experiment. This could ultimately allow QC to fully replace wet-lab chemistry for HBA/HBD strength determination. As a possible chemical application of our ML models, we highlight our predicted HBA and HBD strengths as possible descriptors in two case studies on trends in intramolecular hydrogen bonding.
first_indexed 2024-12-24T00:12:59Z
format Article
id doaj.art-c4975b89104542f6805c3c5c9670ec43
institution Directory of Open Access Journals
issn 1758-2946
language English
last_indexed 2024-12-24T00:12:59Z
publishDate 2019-09-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-c4975b89104542f6805c3c5c9670ec43
2022-12-21T17:24:50Z
eng
BMC
Journal of Cheminformatics
1758-2946
2019-09-01
Vol. 11, Iss. 1, pp. 1-16
10.1186/s13321-019-0381-4
Christoph A. Bauer (Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH))
Gisbert Schneider (Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH))
Andreas H. Göller (Bayer AG, Pharmaceuticals, R&D)
title Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies
topic Computational chemistry
Density functional theory
Hydrogen bond strength
Free energy prediction
Cheminformatics
url http://link.springer.com/article/10.1186/s13321-019-0381-4