Discovery of stroke-related blood biomarkers from gene expression network models
Abstract Background Identifying molecular biomarkers characteristic of ischemic stroke has the potential to aid in distinguishing stroke cases from stroke mimicking symptoms, as well as advancing the understanding of the physiological changes that underlie the body’s response to stroke. This study u...
Main Authors: | Konstantinos Theofilatos, Aigli Korfiati, Seferina Mavroudi, Matthew C. Cowperthwaite, Max Shpak |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-08-01 |
Series: | BMC Medical Genomics |
Subjects: | Stroke, Gene expression, Gene networks, Biomarkers |
Online Access: | http://link.springer.com/article/10.1186/s12920-019-0566-8 |
_version_ | 1818455651526901760 |
---|---|
author | Konstantinos Theofilatos Aigli Korfiati Seferina Mavroudi Matthew C. Cowperthwaite Max Shpak |
author_facet | Konstantinos Theofilatos Aigli Korfiati Seferina Mavroudi Matthew C. Cowperthwaite Max Shpak |
author_sort | Konstantinos Theofilatos |
collection | DOAJ |
description | Abstract. Background: Identifying molecular biomarkers characteristic of ischemic stroke has the potential to aid in distinguishing stroke cases from stroke-mimicking symptoms, as well as advancing the understanding of the physiological changes that underlie the body’s response to stroke. This study uses machine learning-based analysis of gene co-expression to identify transcription patterns characteristic of patients with acute ischemic stroke. Methods: Mutual information values for the expression levels of 13,243 quantified transcripts were computed for blood samples from 82 stroke patients and 68 controls to construct separate gene co-expression networks for stroke and control samples. PageRank centrality scores were computed for every gene; a gene’s significance in the network was assessed according to the difference in its PageRank centrality between the stroke and control networks. A hybrid genetic algorithm – support vector machine learning tool was used to classify samples based on gene centrality in order to identify an optimal set of predictor genes for stroke while minimizing the number of genes in the model. Results: A predictive model with 89.6% accuracy was identified using 6 network-central and differentially expressed genes (ID3, MBTPS1, NOG, SFXN2, BMX, SLC22A1), characterized by large differences in association network connectivity between stroke and control samples. In contrast, classification models based solely on individual genes identified by significant fold-changes in expression level provided lower predictive accuracies: < 71% for any single gene, and even models with larger numbers (10–25) of gene transcript biomarkers gave lower predictive accuracies (≤ 82%) than the 6-gene network-based signature. Computational miRNA:mRNA target prediction analysis revealed 8 differentially expressed microRNAs (miRNAs) that are significantly associated with at least 2 of the 6 network-central genes. Conclusions: Network-based models have the potential to identify a more statistically robust pattern of gene expression typical of acute ischemic stroke and to generate hypotheses about possible interactions among functionally relevant genes, leading to the identification of more informative biomarkers. |
first_indexed | 2024-12-14T22:14:10Z |
format | Article |
id | doaj.art-981f8adb2b58456cab70b087a1c49086 |
institution | Directory Open Access Journal |
issn | 1755-8794 |
language | English |
last_indexed | 2024-12-14T22:14:10Z |
publishDate | 2019-08-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Genomics |
spelling | doaj.art-981f8adb2b58456cab70b087a1c49086; 2022-12-21T22:45:41Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2019-08-01; 121115; 10.1186/s12920-019-0566-8; Discovery of stroke-related blood biomarkers from gene expression network models; Konstantinos Theofilatos (0), Aigli Korfiati (1), Seferina Mavroudi (2), Matthew C. Cowperthwaite (3), Max Shpak (4); (0) InSyBio: Intelligent Systems Biology; (1) InSyBio: Intelligent Systems Biology; (2) InSyBio: Intelligent Systems Biology; (3) St. David’s Medical Center; (4) Center for Systems and Synthetic Biology, University of Texas at Austin; http://link.springer.com/article/10.1186/s12920-019-0566-8; Stroke; Gene expression; Gene networks; Biomarkers |
spellingShingle | Konstantinos Theofilatos Aigli Korfiati Seferina Mavroudi Matthew C. Cowperthwaite Max Shpak Discovery of stroke-related blood biomarkers from gene expression network models BMC Medical Genomics Stroke Gene expression Gene networks Biomarkers |
title | Discovery of stroke-related blood biomarkers from gene expression network models |
title_full | Discovery of stroke-related blood biomarkers from gene expression network models |
title_fullStr | Discovery of stroke-related blood biomarkers from gene expression network models |
title_full_unstemmed | Discovery of stroke-related blood biomarkers from gene expression network models |
title_short | Discovery of stroke-related blood biomarkers from gene expression network models |
title_sort | discovery of stroke related blood biomarkers from gene expression network models |
topic | Stroke Gene expression Gene networks Biomarkers |
url | http://link.springer.com/article/10.1186/s12920-019-0566-8 |
work_keys_str_mv | AT konstantinostheofilatos discoveryofstrokerelatedbloodbiomarkersfromgeneexpressionnetworkmodels AT aiglikorfiati discoveryofstrokerelatedbloodbiomarkersfromgeneexpressionnetworkmodels AT seferinamavroudi discoveryofstrokerelatedbloodbiomarkersfromgeneexpressionnetworkmodels AT matthewccowperthwaite discoveryofstrokerelatedbloodbiomarkersfromgeneexpressionnetworkmodels AT maxshpak discoveryofstrokerelatedbloodbiomarkersfromgeneexpressionnetworkmodels |
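
The network step summarized in the description field (pairwise mutual information, a co-expression network per group, then per-gene differences in PageRank centrality) can be illustrated with a minimal sketch. This is not the authors' implementation: the equal-width binning, the bin count, the damping factor, and every function name below are assumptions made for illustration, and the genetic algorithm and support vector machine classification step is not shown.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Equal-width binning of one gene's expression values (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information (in nats) between two discretized vectors."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = sum((c / n) * math.log(c * n / (px[a] * py[b]))
             for (a, b), c in pxy.items())
    return max(0.0, mi)  # guard against tiny negative rounding errors


def mi_network(expr, bins=3):
    """Build a symmetric weighted network from {gene: [expression values]}."""
    genes = sorted(expr)
    binned = {g: discretize(expr[g], bins) for g in genes}
    weights = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            mi = mutual_information(binned[g], binned[h])
            weights[g][h] = weights[h][g] = mi
    return weights


def pagerank(weights, damping=0.85, iters=100):
    """Weighted PageRank by power iteration; ranks sum to 1."""
    genes = sorted(weights)
    n = len(genes)
    rank = {g: 1.0 / n for g in genes}
    strength = {g: sum(weights[g].values()) for g in genes}
    for _ in range(iters):
        new = {g: (1.0 - damping) / n for g in genes}
        dangling = 0.0
        for g in genes:
            if strength[g] == 0:
                dangling += rank[g]  # no edges: spread this rank uniformly
            else:
                for h, w in weights[g].items():
                    new[h] += damping * rank[g] * w / strength[g]
        for g in genes:
            new[g] += damping * dangling / n
        rank = new
    return rank


def centrality_shift(stroke_expr, control_expr):
    """Per-gene PageRank difference: stroke network minus control network."""
    pr_stroke = pagerank(mi_network(stroke_expr))
    pr_control = pagerank(mi_network(control_expr))
    return {g: pr_stroke[g] - pr_control[g] for g in pr_stroke}
```

Under these assumptions, genes with the largest absolute shift would be the candidates passed on to a feature-selection and classification stage. On real expression data, a continuous mutual information estimator and an edge threshold would likely be preferable to the coarse binning used here.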