Combined substituent number utilized machine learning for the development of antimicrobial agent

Abstract: Machine learning has the potential to improve the development of antimicrobial agents. For practical use of machine learning, it is important to convert molecular information into an appropriate descriptor, because an overly informative descriptor requ...


Bibliographic Details
Main Authors: Keitaro Yamauchi, Hirotaka Nakatsuji, Takaaki Kamishima, Yoshitaka Koseki, Masaki Kubo, Hitoshi Kasai
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-53888-2
Collection: DOAJ
Description: Machine learning has the potential to improve the development of antimicrobial agents. For practical use of machine learning, molecular information must be converted into an appropriate descriptor: an overly informative descriptor requires enormous computation time and many experiments to gather data, whereas a less informative descriptor raises problems of validity. In this study, we used a descriptor focused only on substituents. The type and position of the substituents on molecules with a 4-quinolone structure (11,879 compounds) were converted to the combined substituent number (CSN). Although the CSN does not include information on the detailed structure, physical properties, or quantum chemistry of the molecules, the prediction model constructed by machine learning on the CSN gave a sufficient coefficient of determination (0.719 for the training dataset and 0.519 for the validation dataset). In addition, the CSN makes it easy to construct a library of unknown molecules with a relatively consistent structure by recombining substituents (32,079,318 compounds) and to screen them. The validity of the prediction model was also confirmed by growth inhibition experiments on E. coli using the model-suggested molecules and commercially available antimicrobial agents.
First indexed: 2024-03-07T15:01:23Z
ID: doaj.art-74a2197075c64f0589e3ed0f3423dd8c
Institution: Directory Open Access Journal
ISSN: 2045-2322
Last indexed: 2024-03-07T15:01:23Z
Record timestamp: 2024-03-05T19:06:37Z
Volume: 14, Issue: 1
Author affiliations:
Keitaro Yamauchi: Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University
Hirotaka Nakatsuji: Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University
Takaaki Kamishima: East Tokyo Laboratory, Genesis Research Institute, Inc.
Yoshitaka Koseki: Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University
Masaki Kubo: Department of Chemical Engineering, Graduate School of Engineering, Tohoku University
Hitoshi Kasai: Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University