A machine learning model trained on a high-throughput antibacterial screen increases the hit rate of drug discovery.

Bibliographic Details
Main Authors: A S M Zisanur Rahman, Chengyou Liu, Hunter Sturm, Andrew M Hogan, Rebecca Davis, Pingzhao Hu, Silvia T Cardona
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-10-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1010613
collection DOAJ
description Screening for novel antibacterial compounds in small molecule libraries has a low success rate. We applied machine learning (ML)-based virtual screening for antibacterial activity and evaluated its predictive power by experimental validation. We first binarized 29,537 compounds according to their growth inhibitory activity (hit rate 0.87%) against the antibiotic-resistant bacterium Burkholderia cenocepacia and described their molecular features with a directed message-passing neural network (D-MPNN). Then, we used the data to train an ML model that achieved a receiver operating characteristic (ROC) score of 0.823 on the test set. Finally, we predicted antibacterial activity in virtual libraries corresponding to 1,614 compounds from the Food and Drug Administration (FDA)-approved list and 224,205 natural products. Hit rates of 26% and 12%, respectively, were obtained when we tested the top-ranked predicted compounds for growth inhibitory activity against B. cenocepacia, which represents at least a 14-fold increase from the previous hit rate. In addition, more than 51% of the predicted antibacterial natural compounds inhibited ESKAPE pathogens, showing that predictions extend beyond the organism-specific dataset to a broad range of bacteria. Overall, the developed ML approach can be used for compound prioritization before screening, increasing the typical hit rate of drug discovery.
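The fold increase quoted in the description can be checked with simple arithmetic. The sketch below is only an illustration in Python, not the authors' code: the function name `fold_enrichment` and the dictionary labels are hypothetical, and the inputs are the rounded percentages given in the abstract, so the results are approximate.

```python
def fold_enrichment(hit_rate_pct: float, baseline_pct: float) -> float:
    """Return how many times higher a hit rate is than a baseline hit rate."""
    if baseline_pct <= 0:
        raise ValueError("baseline hit rate must be positive")
    return hit_rate_pct / baseline_pct


# Values taken from the abstract (percentages, already rounded there).
BASELINE_PCT = 0.87  # hit rate of the original 29,537-compound screen
VALIDATED_PCT = {
    "FDA-approved compounds": 26.0,  # top-ranked predictions that were tested
    "natural products": 12.0,
}

# Fold increase of each validated hit rate over the original screen.
enrichment = {
    library: fold_enrichment(rate, BASELINE_PCT)
    for library, rate in VALIDATED_PCT.items()
}

for library, fold in enrichment.items():
    print(f"{library}: about {fold:.1f}-fold over the original screen")
```

With these rounded inputs the lower of the two ratios comes out close to 14, in line with the "at least a 14-fold increase" stated in the description.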
first_indexed 2024-04-11T23:08:29Z
format Article
id doaj.art-5a6fd3b2e7f448648e12f1623ece1423
institution Directory Open Access Journal
issn 1553-734X
1553-7358
language English
last_indexed 2024-04-11T23:08:29Z
spelling doaj.art-5a6fd3b2e7f448648e12f1623ece1423 | 2022-12-22T03:57:55Z | eng | Public Library of Science (PLoS) | PLoS Computational Biology | 1553-734X, 1553-7358 | 2022-10-01 | vol. 18, no. 10, e1010613 | 10.1371/journal.pcbi.1010613