Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies
Abstract We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our a...
Main Authors: | Christoph A. Bauer, Gisbert Schneider, Andreas H. Göller |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-09-01 |
Series: | Journal of Cheminformatics |
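Subjects: | Computational chemistry, Density functional theory, Hydrogen bond strength, Free energy prediction, Cheminformatics |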
Online Access: | http://link.springer.com/article/10.1186/s13321-019-0381-4 |
_version_ | 1819278499702112256 |
---|---|
author | Christoph A. Bauer Gisbert Schneider Andreas H. Göller |
author_sort | Christoph A. Bauer |
collection | DOAJ |
description | Abstract We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation with the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record, with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 kJ mol−1 (acceptors) and 2.3 kJ mol−1 (donors) on experimental test sets. This performance is comparable to that of previous models trained on experimental hydrogen bonding free energies, indicating that molecular QC data can serve as a substitute for experiment. In principle, QC could therefore fully replace wet-lab chemistry for determining HBA/HBD strengths. As a possible chemical application of our ML models, we highlight our predicted HBA and HBD strengths as possible descriptors in two case studies on trends in intramolecular hydrogen bonding. |
first_indexed | 2024-12-24T00:12:59Z |
format | Article |
id | doaj.art-c4975b89104542f6805c3c5c9670ec43 |
institution | Directory Open Access Journal |
issn | 1758-2946 |
language | English |
last_indexed | 2024-12-24T00:12:59Z |
publishDate | 2019-09-01 |
publisher | BMC |
record_format | Article |
series | Journal of Cheminformatics |
spelling | doaj.art-c4975b89104542f6805c3c5c9670ec43; 2022-12-21T17:24:50Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2019-09-01; 111116; 10.1186/s13321-019-0381-4; Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies; Christoph A. Bauer (Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH)); Gisbert Schneider (Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH)); Andreas H. Göller (Bayer AG, Pharmaceuticals, R&D); Computational chemistry; Density functional theory; Hydrogen bond strength; Free energy prediction; Cheminformatics |
title | Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies |
topic | Computational chemistry Density functional theory Hydrogen bond strength Free energy prediction Cheminformatics |