Combined substituent number utilized machine learning for the development of antimicrobial agent
Abstract: Machine learning has the potential to improve the development of antimicrobial agents. For practical use of machine learning, it is important to convert molecular information into an appropriate descriptor, because an overly informative descriptor requ...
Main Authors: | Keitaro Yamauchi, Hirotaka Nakatsuji, Takaaki Kamishima, Yoshitaka Koseki, Masaki Kubo, Hitoshi Kasai |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-02-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-024-53888-2 |
_version_ | 1827327515082883072 |
---|---|
author | Keitaro Yamauchi, Hirotaka Nakatsuji, Takaaki Kamishima, Yoshitaka Koseki, Masaki Kubo, Hitoshi Kasai |
author_sort | Keitaro Yamauchi |
collection | DOAJ |
description | Abstract: Machine learning has the potential to improve the development of antimicrobial agents. For practical use of machine learning, it is important to convert molecular information into an appropriate descriptor, because an overly informative descriptor requires enormous computation time and many experiments to gather data, whereas a less informative descriptor raises problems of validity. In this study, we used a descriptor focused only on substituents. The type and position of the substituents on molecules with a 4-quinolone structure (11,879 compounds) were converted to the combined substituent number (CSN). Although the CSN does not include information on the detailed structure, physical properties, or quantum chemistry of the molecules, the prediction model constructed by machine learning on the CSN showed a sufficient coefficient of determination (0.719 for the training dataset and 0.519 for the validation dataset). In addition, the CSN makes it easy to construct a library of unknown molecules with a relatively consistent structure by recombining substituents (32,079,318 compounds) and to screen them. The validity of the prediction model was also confirmed by growth inhibition experiments on E. coli using the model-suggested molecules and commercially available antimicrobial agents. |
first_indexed | 2024-03-07T15:01:23Z |
format | Article |
id | doaj.art-74a2197075c64f0589e3ed0f3423dd8c |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-07T15:01:23Z |
publishDate | 2024-02-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-74a2197075c64f0589e3ed0f3423dd8c; 2024-03-05T19:06:37Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-02-01; 14117; 10.1038/s41598-024-53888-2; Combined substituent number utilized machine learning for the development of antimicrobial agent; Keitaro Yamauchi (0), Hirotaka Nakatsuji (1), Takaaki Kamishima (2), Yoshitaka Koseki (3), Masaki Kubo (4), Hitoshi Kasai (5); (0) Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University; (1) Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University; (2) East Tokyo Laboratory, Genesis Research Institute, Inc.; (3) Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University; (4) Department of Chemical Engineering, Graduate School of Engineering, Tohoku University; (5) Institute of Multidisciplinary Research for Advanced Materials (IMRAM), Tohoku University; https://doi.org/10.1038/s41598-024-53888-2 |
title | Combined substituent number utilized machine learning for the development of antimicrobial agent |
url | https://doi.org/10.1038/s41598-024-53888-2 |
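
The abstract in this record describes encoding the type and position of substituents on a shared scaffold as numbers, then building a candidate library by recombining known substituents. The sketch below illustrates that general idea only. The position names, substituent lists, and integer coding are hypothetical placeholders, not the paper's actual CSN definition or data.

```python
from itertools import product
from math import prod

# Hypothetical substituent vocabulary per scaffold position.
# Illustrative only: these positions and groups are NOT taken from the paper.
SUBSTITUENTS = {
    "R1": ["H", "ethyl", "cyclopropyl"],
    "R6": ["H", "F"],
    "R7": ["H", "piperazinyl", "methyl"],
}

# Assumed coding scheme: each substituent gets an integer index per position.
CODES = {
    pos: {name: i for i, name in enumerate(names)}
    for pos, names in SUBSTITUENTS.items()
}


def encode(molecule):
    """Map {position: substituent} to a tuple of integer codes, in position order."""
    return tuple(CODES[pos][molecule[pos]] for pos in SUBSTITUENTS)


def enumerate_library():
    """Yield every recombination of the known substituents as a code tuple."""
    ranges = [range(len(names)) for names in SUBSTITUENTS.values()]
    yield from product(*ranges)


def library_size():
    """Number of recombinations: the product of the per-position vocabulary sizes."""
    return prod(len(names) for names in SUBSTITUENTS.values())


known = {"R1": "cyclopropyl", "R6": "F", "R7": "piperazinyl"}
print(encode(known))   # (2, 1, 1)
print(library_size())  # 18
```

Because the library size is the product of the per-position vocabulary sizes, even modest substituent lists at several positions multiply quickly, which is consistent with a library far larger than the set of known compounds. Each code tuple from `enumerate_library()` could then be passed to a trained regression model for screening.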