Discovering Cerebral Ischemic Stroke Associated Genes Based on Network Representation Learning

Cerebral ischemic stroke (IS) is a complex disease caused by multiple factors, including vascular risk factors, genetic factors, and environmental factors, which makes discovering the corresponding disease-related genes more difficult. Identifying the genes associated with IS is critical for under...

Bibliographic Details
Main Authors: Haijie Liu, Liping Hou, Shanhu Xu, He Li, Xiuju Chen, Juan Gao, Ziwen Wang, Bo Han, Xiaoli Liu, Shu Wan
Format: Article
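Language: English
Published: Frontiers Media S.A. 2021-09-01
Series: Frontiers in Genetics
Subjects: cerebral ischemic stroke; network embedding; disease gene prediction; PPI network; network representation learning
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.728333/full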
_version_ 1818606778692141056
author Haijie Liu
Liping Hou
Shanhu Xu
He Li
Xiuju Chen
Juan Gao
Ziwen Wang
Bo Han
Xiaoli Liu
Shu Wan
author_sort Haijie Liu
collection DOAJ
description Cerebral ischemic stroke (IS) is a complex disease caused by multiple factors, including vascular risk factors, genetic factors, and environmental factors, which makes discovering the corresponding disease-related genes more difficult. Identifying the genes associated with IS is critical for understanding the biological mechanism of IS and would significantly benefit the diagnosis and clinical treatment of cerebral IS. However, existing methods for predicting IS-related genes are mainly based on the guilt-by-association (GBA) hypothesis and cannot capture the global structural information of the whole protein–protein interaction (PPI) network. Inspired by the success of network representation learning (NRL) in network analysis, we apply NRL to the discovery of disease-related genes and propose a framework to identify the disease-related genes of cerebral IS. The framework contains three main parts: capturing the topological information of the PPI network with NRL, denoising the gene features with a stacked autoencoder (SAE), and optimizing a support vector machine (SVM) classifier to identify IS-related genes. Our framework yields more accurate results than existing methods for IS-related gene prediction. The case study also shows that the proposed method can identify IS-related genes.
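The guilt-by-association idea mentioned in the description can be illustrated with a minimal sketch. This is not the authors' method or any published baseline; it is a simple neighbor-voting score on a toy network, and the function names (`build_graph`, `gba_scores`) and gene labels (`S1`, `A`, ...) are hypothetical placeholders. It shows why a purely local score may miss genes that are close to known disease genes but not directly connected to them.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def gba_scores(graph, seed_genes):
    """Neighbor voting: fraction of a gene's direct neighbors that are seeds."""
    scores = {}
    for gene, neighbors in graph.items():
        if gene in seed_genes:
            continue  # rank only candidate genes, not the known ones
        hits = sum(1 for n in neighbors if n in seed_genes)
        scores[gene] = hits / len(neighbors)
    return scores


# Toy PPI network with placeholder gene names (illustrative, not real data).
edges = [("S1", "A"), ("S2", "A"), ("A", "B"), ("B", "C"), ("S1", "S2")]
scores = gba_scores(build_graph(edges), seed_genes={"S1", "S2"})

for gene in sorted(scores):
    print(gene, round(scores[gene], 3))
```

In this toy network, gene A touches both seed genes and scores highest, while B and C score zero even though B is only two steps from the seeds. A method that embeds the whole network, as the description proposes with NRL, is intended to retain that kind of longer-range information.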
first_indexed 2024-12-16T14:16:16Z
format Article
id doaj.art-f1159f795d424dcc82acefb24ede06c7
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-16T14:16:16Z
publishDate 2021-09-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-f1159f795d424dcc82acefb24ede06c7
Timestamp: 2022-12-21T22:28:36Z
Language: eng
Publisher: Frontiers Media S.A.
Journal: Frontiers in Genetics
ISSN: 1664-8021
Date: 2021-09-01
Volume: 12
DOI: 10.3389/fgene.2021.728333
Article number: 728333
Title: Discovering Cerebral Ischemic Stroke Associated Genes Based on Network Representation Learning
Haijie Liu (Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China)
Liping Hou (Department of Clinical Laboratory, General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China)
Shanhu Xu (Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China)
He Li (Department of Automation, College of Information Science and Engineering, Tianjin Tianshi College, Tianjin, China)
Xiuju Chen (Department of Neurology, Tianjin Nankai Hospital, Tianjin, China)
Juan Gao (Department of Neurology, Baoding No. 1 Central Hospital, Baoding, China)
Ziwen Wang (Graduate School of Chengde Medical College, Chengde, China)
Bo Han (Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China)
Xiaoli Liu (Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China)
Shu Wan (Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou, China)
URL: https://www.frontiersin.org/articles/10.3389/fgene.2021.728333/full
Keywords: cerebral ischemic stroke; network embedding; disease gene prediction; PPI network; network representation learning
title Discovering Cerebral Ischemic Stroke Associated Genes Based on Network Representation Learning
topic cerebral ischemic stroke
network embedding
disease gene prediction
PPI network
network representation learning
url https://www.frontiersin.org/articles/10.3389/fgene.2021.728333/full