Exploring Deep Learning for Metalloporphyrins: Databases, Molecular Representations, and Model Architectures

Bibliographic Details
Main Authors: An Su, Chengwei Zhang, Yuan-Bin She, Yun-Fang Yang
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Catalysts
Online Access: https://www.mdpi.com/2073-4344/12/11/1485
Description
Summary: Metalloporphyrins have been studied as biomimetic catalysts for more than 120 years, and the large body of data accumulated over that time provides a solid foundation for deep learning to discover chemical trends and structure–function relationships. In this study, key components of deep learning for metalloporphyrins, including databases, molecular representations, and model architectures, were systematically investigated. A protocol for constructing canonical SMILES for metalloporphyrins was proposed and then used to represent the two-dimensional structures of over 10,000 metalloporphyrins in an existing computational database. Subsequently, several state-of-the-art chemical deep learning models, including graph neural network-based and natural language processing-based models, were employed to predict the energy gaps of metalloporphyrins. Two models showed satisfactory predictive performance (R² 0.94) with canonical SMILES as the only source of structural information. In addition, an unsupervised visualization algorithm was used to interpret the molecular features learned by the deep learning models.
ISSN: 2073-4344