An Inverse QSAR Method Based on Linear Regression and Integer Programming

Background: Drug design is one of the important applications of biological science. Computer-aided drug design based on the inverse quantitative structure-activity relationship (inverse QSAR), which aims to infer chemical compounds from given chemical activities and con...

Bibliographic Details
Main Authors: Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu
Format: Article
Language: English
Published: IMR Press 2022-06-01
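Series: Frontiers in Bioscience-Landmark
Subjects: machine learning; linear regression; integer programming; chemoinformatics; materials informatics; qsar/qspr; molecular design
Online Access: https://www.imrpress.com/journal/FBL/27/6/10.31083/j.fbl2706188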
_version_ 1818255717755256832
author Jianshen Zhu
Naveed Ahmed Azam
Kazuya Haraguchi
Liang Zhao
Hiroshi Nagamochi
Tatsuya Akutsu
author_sort Jianshen Zhu
collection DOAJ
description Background: Drug design is one of the important applications of biological science. Computer-aided drug design based on the inverse quantitative structure-activity relationship (inverse QSAR), which aims to infer chemical compounds from given chemical activities and constraints, has been studied extensively. However, most existing methods do not guarantee exact or optimal solutions. Method: Recently, a novel framework based on artificial neural networks (ANNs) and mixed integer linear programming (MILP) has been proposed for designing chemical structures. This framework consists of two phases: first, an ANN is used to construct a prediction function; then an MILP formulated on the trained ANN, together with a graph search algorithm, is used to infer desired chemical structures. In this paper, we use linear regression instead of ANNs to construct the prediction function. To this end, we derive a novel MILP formulation that simulates the computation of a prediction function obtained by linear regression. Results: For the first phase, we performed computational experiments on 18 chemical properties, and the proposed method achieved good prediction accuracy for a relatively large number of them in comparison with the ANNs used in our previous work. For the second phase, we performed computational experiments on five chemical properties, and the method could infer chemical structures with up to around 50 non-hydrogen atoms. Conclusions: The combination of linear regression and integer programming is a potentially useful approach to computational molecular design.
first_indexed 2024-12-12T17:16:18Z
format Article
id doaj.art-c4d5b2d8c8264ab3baac7b57322ebbd0
institution Directory Open Access Journal
issn 2768-6701
language English
last_indexed 2024-12-12T17:16:18Z
publishDate 2022-06-01
publisher IMR Press
record_format Article
series Frontiers in Bioscience-Landmark
spelling doaj.art-c4d5b2d8c8264ab3baac7b57322ebbd0
2022-12-22T00:17:46Z
eng
IMR Press
Frontiers in Bioscience-Landmark
2768-6701
2022-06-01
27
6
188
10.31083/j.fbl2706188
S2768-6701(22)00551-2
An Inverse QSAR Method Based on Linear Regression and Integer Programming
Jianshen Zhu, Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan
Naveed Ahmed Azam, Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan
Kazuya Haraguchi, Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan
Liang Zhao, Graduate School of Advanced Integrated Studies in Human Survivability (Shishu-Kan), Kyoto University, 606-8306 Kyoto, Japan
Hiroshi Nagamochi, Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan
Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University, 611-0011 Uji, Japan
title An Inverse QSAR Method Based on Linear Regression and Integer Programming
topic machine learning
linear regression
integer programming
chemoinformatics
materials informatics
qsar/qspr
molecular design
url https://www.imrpress.com/journal/FBL/27/6/10.31083/j.fbl2706188