A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility

Abstract Efficient and accurate prediction of molecular properties, such as lipophilicity and solubility, is highly desirable for rational compound design in chemical and pharmaceutical industries. To this end, we build and apply a graph-neural-network framework called self-attention-based message-p...


Bibliographic Details
Main Authors: Bowen Tang, Skyler T. Kramer, Meijuan Fang, Yingkun Qiu, Zhen Wu, Dong Xu
Format: Article
Language: English
Published: BMC 2020-02-01
Series: Journal of Cheminformatics
Subjects: Message passing network, Attention mechanism, Deep learning, Lipophilicity, Aqueous solubility
Online Access: http://link.springer.com/article/10.1186/s13321-020-0414-z
author Bowen Tang
Skyler T. Kramer
Meijuan Fang
Yingkun Qiu
Zhen Wu
Dong Xu
collection DOAJ
description Efficient and accurate prediction of molecular properties, such as lipophilicity and solubility, is highly desirable for rational compound design in the chemical and pharmaceutical industries. To this end, we build and apply a graph-neural-network framework called the self-attention-based message-passing neural network (SAMPN) to study the relationship between chemical properties and structures in an interpretable way. The main advantages of SAMPN are that it directly uses chemical graphs and breaks the black-box mold of many machine/deep learning methods. Specifically, its attention mechanism indicates the degree to which each atom of the molecule contributes to the property of interest, and these results are easily visualized. Further, SAMPN outperforms random forests and the deep learning framework MPN from DeepChem. In addition, another formulation of SAMPN (Multi-SAMPN) can simultaneously predict multiple chemical properties with higher accuracy and efficiency than other models that predict one specific chemical property. Moreover, SAMPN can generate chemically visible and interpretable results, which can help researchers discover new pharmaceuticals and materials. The source code of the SAMPN prediction pipeline is freely available on GitHub (https://github.com/tbwxmu/SAMPN).
first_indexed 2024-12-10T05:45:11Z
format Article
id doaj.art-2e341dc48d624ee0b01f065af9d41781
institution Directory Open Access Journal
issn 1758-2946
language English
last_indexed 2024-12-10T05:45:11Z
publishDate 2020-02-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-2e341dc48d624ee0b01f065af9d41781
2022-12-22T02:00:10Z
eng
BMC
Journal of Cheminformatics
1758-2946
2020-02-01
12119
10.1186/s13321-020-0414-z
A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility
Bowen Tang (0): Fujian Provincial Key Laboratory of Innovative Drug Target Research, School of Pharmaceutical Sciences, Xiamen University
Skyler T. Kramer (1): Department of Electrical Engineering and Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri
Meijuan Fang (2): Fujian Provincial Key Laboratory of Innovative Drug Target Research, School of Pharmaceutical Sciences, Xiamen University
Yingkun Qiu (3): Fujian Provincial Key Laboratory of Innovative Drug Target Research, School of Pharmaceutical Sciences, Xiamen University
Zhen Wu (4): Fujian Provincial Key Laboratory of Innovative Drug Target Research, School of Pharmaceutical Sciences, Xiamen University
Dong Xu (5): Department of Electrical Engineering and Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri
http://link.springer.com/article/10.1186/s13321-020-0414-z
Message passing network
Attention mechanism
Deep learning
Lipophilicity
Aqueous solubility
title A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility
topic Message passing network
Attention mechanism
Deep learning
Lipophilicity
Aqueous solubility
url http://link.springer.com/article/10.1186/s13321-020-0414-z