Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data
Abstract Background Clinically, behavioral, cognitive, and mental functions are affected during neurodegenerative disease progression. To date, the molecular pathogenesis of these complex diseases is still unclear. With the rapid development of sequencing technologies, it is possible to delicately...
Main Authors: | Xue Jiang, Miao Chen, Weichen Song, Guan Ning Lin |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2021-08-01 |
Series: | BMC Medical Genomics |
Subjects: | Biomarkers corresponding to clinical phenotypes; Label propagation clustering; Feature selection |
Online Access: | https://doi.org/10.1186/s12920-021-00985-0 |
_version_ | 1819115226368311296 |
---|---|
author | Xue Jiang; Miao Chen; Weichen Song; Guan Ning Lin |
author_facet | Xue Jiang; Miao Chen; Weichen Song; Guan Ning Lin |
author_sort | Xue Jiang |
collection | DOAJ |
description | Abstract Background: Clinically, behavioral, cognitive, and mental functions are affected during neurodegenerative disease progression. To date, the molecular pathogenesis of these complex diseases is still unclear. With the rapid development of sequencing technologies, it is possible to decode in fine detail the molecular mechanisms corresponding to different clinical phenotypes at the genome-wide transcriptomic level using computational methods. Our previous studies have shown that it is difficult to distinguish disease genes from non-disease genes. Therefore, to precisely explore the molecular pathogenesis underlying complex clinical phenotypes, it is better to identify biomarkers corresponding to different disease stages or clinical phenotypes. In this study, we designed a label propagation-based semi-supervised feature selection approach (LPFS) to prioritize disease-associated genes corresponding to different disease stages or clinical phenotypes. Methods: In this study, we put label propagation clustering and feature selection into one framework for the first time and proposed a label propagation-based semi-supervised feature selection approach. LPFS prioritizes disease genes related to different disease stages or phenotypes by alternately iterating label propagation clustering on a sample network and feature selection on gene expression profiles. GO and KEGG pathway enrichment analyses and gene functional analysis were then carried out to explore the molecular mechanisms of specific disease phenotypes, and thus to decode the changes in individual behavioral and mental characteristics during neurodegenerative disease progression. Results: Extensive experiments were conducted to verify the performance of LPFS on Huntington’s disease gene expression data. The experimental results showed that LPFS performs better than state-of-the-art methods. GO and KEGG enrichment analysis of key gene sets showed that the TGF-beta signaling pathway, cytokine-cytokine receptor interaction, immune response, and inflammatory response were gradually affected during Huntington’s disease progression. In addition, we found that the expression of SLC4A11, ZFP474, AMBP, TOP2A, PBK, CCDC33, APSL, DLGAP5, and Al662270 changed markedly as the disease developed. Conclusions: In this study, we designed a label propagation-based semi-supervised feature selection model to precisely select key genes for different disease phenotypes. We applied the model to gene expression data from Huntington’s disease mice to decode the mechanisms of the disease. We found that many cell types, including astrocytes, microglia, and GABAergic neurons, could be involved in the pathological process. |
first_indexed | 2024-12-22T04:57:49Z |
format | Article |
id | doaj.art-d5dcf21af2f34ef1a2177982c1470d5b |
institution | Directory Open Access Journal |
issn | 1755-8794 |
language | English |
last_indexed | 2024-12-22T04:57:49Z |
publishDate | 2021-08-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Genomics |
spelling | doaj.art-d5dcf21af2f34ef1a2177982c1470d5b; 2022-12-21T18:38:20Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2021-08-01; 14S1111; 10.1186/s12920-021-00985-0; Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data; Xue Jiang (0), Miao Chen (1), Weichen Song (2), Guan Ning Lin (3), each affiliated with: Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University; https://doi.org/10.1186/s12920-021-00985-0; Biomarkers corresponding to clinical phenotypes; Label propagation clustering; Feature selection |
spellingShingle | Xue Jiang; Miao Chen; Weichen Song; Guan Ning Lin; Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data; BMC Medical Genomics; Biomarkers corresponding to clinical phenotypes; Label propagation clustering; Feature selection |
title | Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data |
title_full | Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data |
title_fullStr | Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data |
title_full_unstemmed | Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data |
title_short | Label propagation-based semi-supervised feature selection on decoding clinical phenotypes with RNA-seq data |
title_sort | label propagation based semi supervised feature selection on decoding clinical phenotypes with rna seq data |
topic | Biomarkers corresponding to clinical phenotypes; Label propagation clustering; Feature selection |
url | https://doi.org/10.1186/s12920-021-00985-0 |
work_keys_str_mv | AT xuejiang labelpropagationbasedsemisupervisedfeatureselectionondecodingclinicalphenotypeswithrnaseqdata AT miaochen labelpropagationbasedsemisupervisedfeatureselectionondecodingclinicalphenotypeswithrnaseqdata AT weichensong labelpropagationbasedsemisupervisedfeatureselectionondecodingclinicalphenotypeswithrnaseqdata AT guanninglin labelpropagationbasedsemisupervisedfeatureselectionondecodingclinicalphenotypeswithrnaseqdata |
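
The Methods paragraph of the abstract describes LPFS as alternating between label propagation clustering on a sample network and feature selection on gene expression profiles. The record does not give the objective function or update rules, so the following is only a minimal sketch of that alternating pattern under stated assumptions, not the authors' implementation. The Gaussian k-nearest-neighbour sample graph, the label-spreading update, the Fisher-style gene score, and every function and parameter name (`knn_affinity`, `propagate_labels`, `score_features`, `lpfs_sketch`, `alpha`, `n_keep`) are stand-ins chosen for illustration.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric k-nearest-neighbour Gaussian affinity between samples (rows of X)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Assumption: use the median squared distance as the kernel bandwidth.
    sigma2 = np.median(d2[d2 > 0]) if np.any(d2 > 0) else 1.0
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)
    # Keep only each sample's k strongest neighbours, then symmetrise.
    idx = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    mask |= mask.T
    return W * mask


def propagate_labels(W, y, n_classes, alpha=0.9, n_iter=50):
    """Spread known labels over the sample graph; y uses -1 for unlabeled samples."""
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0
    S = W / np.sqrt(np.outer(deg, deg))  # symmetrically normalised affinity
    labeled = y >= 0
    Y0 = np.zeros((W.shape[0], n_classes))
    Y0[labeled, y[labeled]] = 1.0
    F = Y0.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y0
    pred = F.argmax(axis=1)
    pred[labeled] = y[labeled]  # known labels stay fixed
    return pred


def score_features(X, labels, n_classes):
    """Fisher-style score per gene: between-class over within-class scatter."""
    mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in range(n_classes):
        Xc = X[labels == c]
        if len(Xc) == 0:
            continue
        between += len(Xc) * (Xc.mean(axis=0) - mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)


def lpfs_sketch(X, y, n_classes, n_keep=10, n_rounds=5, k=5):
    """Alternate label propagation and gene selection until the gene set is stable."""
    y = np.asarray(y)
    selected = np.arange(X.shape[1])
    for _ in range(n_rounds):
        W = knn_affinity(X[:, selected], k=k)        # sample network on current genes
        labels = propagate_labels(W, y, n_classes)   # cluster or stage assignment
        scores = score_features(X, labels, n_classes)
        new_selected = np.sort(np.argsort(-scores)[:n_keep])
        if np.array_equal(new_selected, selected):
            break
        selected = new_selected
    return selected, labels
```

Here `X` is assumed to be a samples-by-genes expression matrix and `y` an integer vector holding a stage or phenotype index for the few labeled samples and -1 elsewhere. Calling `lpfs_sketch(X, y, n_classes=2, n_keep=5)` returns the indices of the retained genes and the propagated sample labels. Rebuilding the sample graph from the currently selected genes in each round is what couples the two steps in this sketch.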