Investigation of the structure-odor relationship using a Transformer model

Abstract The relationships between molecular structures and their properties are subtle and complex, and the properties of odor are no exception. Molecules with similar structures, such as a molecule and its optical isomer, may have completely different odors, whereas molecules with completely disti...

Bibliographic Details
Main Authors: Xiaofan Zheng, Yoichi Tomiura, Kenshi Hayashi
Format: Article
Language: English
Published: BMC 2022-12-01
Series: Journal of Cheminformatics
Subjects: Molecular structure-odor relation; Transformer model; Odor descriptor
Online Access: https://doi.org/10.1186/s13321-022-00671-y
author Xiaofan Zheng
Yoichi Tomiura
Kenshi Hayashi
collection DOAJ
description Abstract The relationships between molecular structures and their properties are subtle and complex, and the properties of odor are no exception. Molecules with similar structures, such as a molecule and its optical isomer, may have completely different odors, whereas molecules with completely distinct structures may have similar odors. Many works have attempted to explain the molecular structure-odor relationship from chemical and data-driven perspectives. The Transformer model is widely used in natural language processing and computer vision, and the attention mechanism included in the Transformer model can identify relationships between inputs and outputs. In this paper, we describe the construction of a Transformer model for predicting molecular properties and interpreting the prediction results. The SMILES data of 100,000 molecules are collected and used to predict the existence of molecular substructures, and our proposed model achieves an F1 value of 0.98. The attention matrix is visualized to investigate the substructure annotation performance of the attention mechanism, and we find that certain atoms in the target substructures are accurately annotated. Finally, we collect 4462 molecules and their odor descriptors and use the proposed model to infer 98 odor descriptors, obtaining an average F1 value of 0.33. For the 19 odor descriptors that achieved F1 values greater than 0.45, we also attempt to summarize the relationship between the molecular substructures and odor quality through the attention matrix.
first_indexed 2024-04-11T04:05:38Z
format Article
id doaj.art-cb9c5dd07a0f44299685d66716b1ba98
institution Directory Open Access Journal
issn 1758-2946
language English
last_indexed 2024-04-11T04:05:38Z
publishDate 2022-12-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-cb9c5dd07a0f44299685d66716b1ba98 2023-01-01T12:26:05Z eng BMC Journal of Cheminformatics 1758-2946 2022-12-01 141116 10.1186/s13321-022-00671-y
Investigation of the structure-odor relationship using a Transformer model
Xiaofan Zheng (0), Yoichi Tomiura (1), Kenshi Hayashi (2)
0: Graduate School of Information Science and Electrical Engineering, Department of Informatics, Kyushu University
1: Graduate School of Information Science and Electrical Engineering, Department of Informatics, Kyushu University
2: Graduate School of Information Science and Electrical Engineering, Department of Electronics, Kyushu University
title Investigation of the structure-odor relationship using a Transformer model
topic Molecular structure-odor relation
Transformer model
Odor descriptor
url https://doi.org/10.1186/s13321-022-00671-y