Drug target prediction through deep learning functional representation of gene signatures


Bibliographic Details
Main Authors: Hao Chen, Frederick J. King, Bin Zhou, Yu Wang, Carter J. Canedy, Joel Hayashi, Yang Zhong, Max W. Chang, Lars Pache, Julian L. Wong, Yong Jia, John Joslin, Tao Jiang, Christopher Benner, Sumit K. Chanda, Yingyao Zhou
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-46089-y
collection DOAJ
description Abstract: Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input gene signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICS datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures by projecting them onto their biological functions instead of their identities, similar to how the word2vec technique works in natural language processing. We build the Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model and demonstrate that its application to the Broad Institute’s L1000 datasets results in more effective compound-target predictions than models based on gene identities alone. By integrating additional pharmacological activity data sources, FRoGS significantly increases the number of high-quality compound-target predictions relative to existing approaches, and many of these predictions are supported by in silico and/or experimental evidence. These results underscore the general utility of FRoGS in machine learning-based bioinformatics applications. Prediction networks pre-equipped with knowledge of gene functions may help uncover new relationships among gene signatures acquired by large-scale OMICS studies on compounds, cell types, disease models, and patient cohorts.
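The contrast drawn in the abstract, between matching gene identities and comparing gene functions, can be illustrated with a toy sketch. This is not the FRoGS model: the gene names, the 3-dimensional vectors, and the helper functions below are all hypothetical, and simple averaging of hand-made vectors stands in for the learned deep learning representation. It only shows why two signatures with no shared genes can still look similar once each gene is mapped to a functional embedding.

```python
import math

# Hypothetical functional embeddings (made-up values, not from FRoGS).
# Genes A1-A4 are assumed to share one function; B1-B2 share another.
TOY_EMBEDDINGS = {
    "GENE_A1": [0.90, 0.10, 0.00],
    "GENE_A2": [0.80, 0.20, 0.10],
    "GENE_A3": [0.85, 0.15, 0.05],
    "GENE_A4": [0.90, 0.05, 0.10],
    "GENE_B1": [0.00, 0.10, 0.90],
    "GENE_B2": [0.10, 0.00, 0.80],
}


def jaccard(sig1, sig2):
    """Identity-based similarity: shared genes divided by all genes."""
    union = sig1 | sig2
    return len(sig1 & sig2) / len(union) if union else 0.0


def signature_vector(signature):
    """Average the embeddings of the genes in a signature."""
    vectors = [TOY_EMBEDDINGS[gene] for gene in signature]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def functional_similarity(sig1, sig2):
    """Function-based similarity: cosine of the averaged embeddings."""
    return cosine(signature_vector(sig1), signature_vector(sig2))


sig_x = {"GENE_A1", "GENE_A2"}  # shares no gene with sig_y
sig_y = {"GENE_A3", "GENE_A4"}  # same assumed function as sig_x
sig_z = {"GENE_B1", "GENE_B2"}  # different assumed function

print("identity overlap x vs y:", jaccard(sig_x, sig_y))
print("functional similarity x vs y:", round(functional_similarity(sig_x, sig_y), 3))
print("functional similarity x vs z:", round(functional_similarity(sig_x, sig_z), 3))
```

In this toy setup the identity-based score for `sig_x` and `sig_y` is zero because they share no genes, while the function-based score is high because their genes were assigned nearby vectors. A real application would replace `TOY_EMBEDDINGS` with embeddings learned from gene function data.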
first_indexed 2024-03-07T14:52:23Z
id doaj.art-3b4f238e44764f15bb18667e531420c4
institution Directory Open Access Journal
issn 2041-1723
last_indexed 2024-03-07T14:52:23Z
volume 15
issue 1
pages 1-15
record_updated 2024-03-05T19:37:51Z
affiliations Hao Chen: Novartis Biomedical Research
Frederick J. King: Novartis Biomedical Research
Bin Zhou: Novartis Biomedical Research
Yu Wang: Novartis Biomedical Research
Carter J. Canedy: Novartis Biomedical Research
Joel Hayashi: Novartis Biomedical Research
Yang Zhong: Novartis Biomedical Research
Max W. Chang: Department of Medicine, University of California, San Diego
Lars Pache: NCI Designated Cancer Center, Sanford Burnham Prebys Medical Discovery Institute
Julian L. Wong: Novartis Biomedical Research
Yong Jia: Novartis Biomedical Research
John Joslin: Novartis Biomedical Research
Tao Jiang: Department of Computer Science and Engineering, University of California, Riverside
Christopher Benner: Department of Medicine, University of California, San Diego
Sumit K. Chanda: Department of Immunology and Microbiology, Scripps Research
Yingyao Zhou: Novartis Biomedical Research